Monday 1 April 2024

Standard states and solution thermodynamics

Readers of this blog know that, on more than one occasion, I have denounced the ligand efficiency metric as physically meaningless on the grounds that perception of efficiency varies with the concentration value that defines the standard state. As I argue in NoLE this is clearly thermodynamic nonsense (Pauli might even have suggested that it wasn’t even wrong) and the equivalent cheminformatic argument is that perception shouldn’t change when you use a different unit to express a quantity.

A change in perception resulting from using a different standard concentration can also be a problem when analysing thermodynamic signatures. One particular absurdity is that binding can be switched from enthalpy-driven to entropy-driven simply by using a different concentration to define the standard state. This statement in the W2014 article unintentionally highlights the issue:

Consequently, we define the dimensionless ratio (ΔH + TΔS)/ΔG as the Enthalpy–Entropy Index (IE–E) and use it here to indicate the enthalpy content of binding. Its advantageous feature is that it is normalised by the free energy ΔG (= ΔH  – TΔS), and so it can be used to compare compounds with millimolar to nanomolar binding affinities during the course of a hit-to-lead optimisation.

I do indeed think that it makes a lot of sense to use (ΔH + TΔS) and ΔG as parameters for exploring thermodynamic signatures. However, the dimensionless ratio of the two quantities is physically meaningless because of its dependence on the concentration used to define the standard state (this dependence stems from the fact that ΔS depends on the standard concentration while ΔH is invariant to change in the standard concentration).

One article that I’ve been particularly critical of in the past is “The role of ligand efficiency metrics in drug discovery” NRDD 13:105-121 (2014) DOI. Specifically, I have expressed concerns about this sentence in Box 1 (Ligand efficiency metrics) of the article:

Assuming standard conditions of aqueous solution at 300K, neutral pH and remaining concentrations of 1M, –2.303RTlog(Kd/C°) approximates to –1.37 × log(Kd) kcal/mol.

I do need to mention a potential source of confusion when analysing Kd values. In biochemistry, biophysics and drug discovery Kd values are conventionally quoted as dimensioned quantities in units of concentration. However, Kd values may also be quoted as dimensionless ratios and, in these cases, the Kd value depends on the concentration used to define the standard state. There appears to be an error in the quoted sentence since the approximation eliminates the standard concentration C° from the expression, leaving the logarithm of a dimensioned Kd.
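The arithmetic behind the quoted approximation is easy to check and here is a short Python sketch (my own, not taken from the NRDD article; the Kd value is invented) showing that 2.303RT only comes out at about 1.37 kcal/mol when T = 300 K, and that the logarithm is only dimensionless if Kd is first divided by a standard concentration (taken here as 1 M):

```python
from math import log10

R = 1.987e-3                       # gas constant in kcal/(mol*K)
C0 = 1.0                           # standard concentration in mol/L
Kd = 1.0e-8                        # example dissociation constant in mol/L

for T in (298.15, 300.0, 310.0):   # 'room temperature', 300 K and body temperature
    factor = 2.303 * R * T         # this is the '1.37' in the Box 1 expression
    dG = -factor * log10(Kd / C0)  # dividing by C0 is what makes the log dimensionless
    print(f"T = {T:6.2f} K: 2.303RT = {factor:5.3f} kcal/mol, "
          f"-2.303RT*log10(Kd/C0) = {dG:5.2f} kcal/mol")
```

At 310 K the conversion factor comes out closer to 1.42 kcal/mol, which is presumably why the approximation is stated at 300 K rather than at assay temperature.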

I should say that I’ve always been a bit nervous about denouncing the approximation as an error because the authors are all renowned thought leaders in the drug discovery field. Furthermore, the journal impact factor of NRDD is a significant multiple of my underwhelming h-index and any error of such apparent grossness would surely have been detected during the rigorous peer review process applied by this elite journal. It turns out that my nervousness was indeed well placed and, when calculated at 300 K, the product RT actually serves as an annihilation operator that eliminates the dimensionality associated with Kd. This also explains why a temperature of 300 K must be used when calculating the ligand efficiency even though biochemical assays are usually run at human body temperature (310 K). 

I became convinced of the validity of the above approximation recently after examining a manuscript by the world-renowned expert on tetrodotoxin pharmacology, Prof. Angelique Bouchard-Duvalier of the Port-au-Prince Institute of Biogerontology, who is currently on secondment to the Budapest Enthalpomics Group (BEG). The manuscript has not yet been made publicly available although I was able to access it with the help of my associate ‘Anastasia Nikolaeva’ (she decamped last year from Tel Aviv to Uzbekistan and, to Derek’s likely disapproval, is currently running an open access journal out of a van in Samarkand). There is no doubt that this genuinely disruptive study will comprehensively reshape the generative AI landscape, enabling drug discovery scientists, for the very first time, to rationally design novel clinical candidates using only gene sequences as input.

Prof. Bouchard-Duvalier’s seminal study clearly demonstrates that it is indeed possible to eliminate the need to define standard states for the thermodynamic analysis of liquid solutions, provided that the appropriate temperature is used. The math is truly formidable (my rudimentary understanding of Haitian patois didn’t help either) and involves first projecting the atomic isothermal compressibility matrix into the quadrupole-normalized polarizability tensor before applying the Barone-Samedi transformation, followed by hepatic eigenvalue extraction using the algorithm introduced by E. V. Tooms (a reclusive Baltimore resident better known for his research in analytic topology). ‘Anastasia Nikolaeva’ was also able to ‘liberate’ a prepared press release in which a beaming BEG director Prof. Kígyó Olaj explains that, “possibilities are limitless now that we have eliminated the standard state from solution thermodynamics and thereby consigned the tedious and needlessly restrictive Second Law to the dustbin of history." 

Wednesday 27 March 2024

Leadeth me unto Truth and delivereth me from those who have already found it

A theory has only the alternative of being true or false.
A model has a third possibility: it may be true, but irrelevant.
With apologies to Manfred Eigen (1927 - 2019)
******************

I've just returned to Cheshire from the Caribbean and, to kick off blogging for 2024, I'll share a photo of the orchids at Berwick-on-Sea on the north coast of Trinidad.


Encountering words like “truth” and “beauty” (here's a good example) in the titles of scientific articles always sets off warning bells for me and I’ll kick off blogging for 2024 with a look at FM2024 (Structure is beauty, but not always truth) that was recently published in Cell (and has already been reviewed by Derek). The authors have highlighted important issues: we typically use single conformations of targets in design and the experimentally-determined structures used for design may differ substantially from the structures of targets as they exist in vivo. These points do need to be stressed given the expanding range of modalities being exploited by drug designers and the increasing use of AI/ML in drug design. That said, it’s my view that the authors have allowed themselves to become prisoners of their article’s title. Specifically, I see “beauty” as a complete red herring and suggest that it would have been much better to have discussed structure in terms of accuracy and relevance rather than truth. Here’s the abstract for FM2024:

Structural biology, as powerful as it is, can be misleading. We highlight four fundamental challenges: interpreting raw experimental data; accounting for motion; addressing the misleading nature of in vitro structures; and unraveling interactions between drugs and “anti-targets.” Overcoming these challenges will amplify the impact of structural biology on drug discovery.

I'll start by taking a look at the introduction and my view is that the authors do need to be much clearer about what they mean by “this hydrogen bond is better than that one” when using terms like “ground truth”. For example, we can infer that the geometry of one target-ligand hydrogen bond is closer to optimal than the geometry of another target-ligand hydrogen bond. However, the energetic cost of breaking a target-ligand hydrogen bond is not something that can generally be measured and, as noted in NoLE, the contribution of an intermolecular contact to affinity is not actually an experimental observable. Ligands associate with their targets (and anti-targets) in aqueous media and this means that intermolecular contacts, for example between polar and non-polar atoms, can destabilize the target-ligand complex without being inherently repulsive. What I’m getting at here is that structures of ligand-target complexes are relatively simple and well-defined entities within the broader context of drug discovery and yet it doesn’t appear useful to discuss them in terms of truth.

The remainder of the post follows the FM2024 section headings.    

A structure is a model, not experimental reality

The term “structure” can have a number of different meanings in structure-based drug design. First, drug targets (and anti-targets) have structures that exist regardless of whether they have been experimentally determined. Second, models are built for drug targets by fitting nuclear coordinates to experimental data such as electron density (these are often referred to as experimental structures although they should strictly be called models because they are abstractions of the experimental data). Third, the structure could have been predicted using computational tools such as AlphaFold2 (here's an article, cited by FM2024, on why we still need experimentally-determined structures). 

In the abstract the authors identify “interpreting raw experimental data” as one of “four fundamental challenges”. However, the actual focus of this section appears to be evaluation of predicted structures rather than interpretation of raw experimental data.  While I’m sure that we can find better ways to interpret raw experimental data, and indeed to evaluate predicted structures, I don’t see either as representing a fundamental challenge. 

Representing wiggling and jiggling is hard

My view is that it’s actually the ensemble of conformations rather than the wiggling and jiggling that we need to represent. Simulation of the wiggling and jiggling is one way to generate an ensemble of conformations but it’s not the only way (nor is it necessarily the best way).  That said, it's a lot easier to sell protein motion to venture capitalists than it is to sell ensembles of conformations.

The authors state:

Analogous to how structure-based drug design is great for optimizing “surface complementarity” and electrostatics, future protein modeling approaches will unlock ensemble-based drug design with an ability to predictably tune new and important aspects of design, including entropic contributions [7] and residence times [8] of bound ligands.

The term “entropic contributions” does come across as arm-waving (especially in a drug design context) and my view is that entropy should be seen as an effect rather than a cause. Thermodynamic signatures for binding are certainly of scientific interest but I would argue that they are essentially irrelevant to drug design (it can be instructive to consider how patients might sense the benefits of enthalpically-driven drug binding). The case for increasing residence time might not be quite as solid as many believe it to be (see the F2018 study and this blog post).

In vitro can be deceiving

The authors identify “addressing the misleading nature of in vitro structures” as a fundamental challenge and they state:

While purifying a protein out of its cellular context can be enabling for in vitro drug discovery, it can also provide a false impression. Recombinant expression can lead to missing post-translational modifications (e.g., phosphorylation or glycosylation) that are critical to understanding the function of a protein.

To this I’d add that we often don’t use the full-length proteins in design and recombinant proteins may have been engineered to make them easier to crystallize or more robust for soaking experiments. Furthermore, target engagement may require the drug to interact with two or more proteins (see HC2017) which will probably be more amenable individually to structure determination than their complex. I fully agree that it is important for drug designers to be aware that the experimentally-determined structures that they're using differ from the structures of the targets as they exist in vivo.  However, I don't believe that it makes any sense to talk about “the misleading nature of in vitro structures” (or indeed about “in vitro drug discovery”) because target structures are never experimentally determined in vivo and are only misleading to the extent that users overinterpret them. As a more general point, users of experimental data do need to be very careful about describing the experimental data that they’re using as “misleading” or "deceiving".

When we use structures to represent targets the issue is much less about the truth of the structures that we’re using and much more about their relevance to the targets that we’re trying to represent. This is not just an issue for structural biology and we might, for example, use the catalytic domain of an enzyme as a model for the full-length protein when running biochemical assays. We have to make assumptions in these situations and we also need to check that these assumptions are reasonable. For example, we might examine the structure-activity relationship in a cell-based assay for consistency with the structure-activity relationship that we’ve observed in the enzyme inhibition assay. It's also worth pointing out that what we observe in cells is usually a coarse approximation to what actually happens in vivo and we can't even measure the intracellular concentration of a drug in vivo.  

Drugs mingle with many different receptors

Drugs do indeed mingle with many receptors in vivo but it’s important to be aware that the consequences of this mingling depend on the drug concentration (a spatiotemporal quantity) at the site of action.  Drug discovery scientists use the term exposure when talking about drug concentration at the site of action and one underappreciated challenge in drug design is that intracellular drug concentration cannot generally be measured in vivo (here’s an open access article that I recommend to everybody working in drug discovery). I argue in NoLE that controllability of exposure should be seen as a drug design objective although the current impossibility of measuring intracellular concentration means that we can only assess how effectively the objective has been achieved in an indirect manner. Alternatively, drug design can be seen in terms of minimization of the dose at which therapeutically beneficial effects can be observed.

One assumption often made in drug design is that the drug concentration at the site of action is equal to the unbound concentration in plasma and this assumption is referred to as the free drug hypothesis (FDH) although the term “free drug theory” is also used. The basis for the FDH is the assumption that the drug can move freely between plasma and the target compartment. In reality the drug concentration at the site of action will generally lag behind its unbound plasma concentration and the lag time is inversely related to the ease with which the drug permeates through the barriers which separate the target from the plasma. There are a couple of scenarios under which you can’t assume that the drug concentration in the target compartment will be the same as its unbound plasma concentration. The first of these is when active transport is significant and this is a scenario with which drug designers tackling targets within the central nervous system (CNS) are familiar. The second scenario is that there is an ionizable functional group (as is the case for amines) in the molecular structure of the drug and the pH at the site of action differs significantly from plasma pH (as is the case for lysosomes).
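To illustrate the second scenario, here is a minimal Python sketch of the ion trapping arithmetic for a monoprotic base (my own illustration with invented pKa and pH values; it assumes the simplest form of the pH-partition hypothesis in which only the neutral form crosses the membrane and equilibrates between compartments):

```python
def total_over_neutral(pka: float, ph: float) -> float:
    """Total/neutral concentration ratio for a monoprotic base at a given pH
    (Henderson-Hasselbalch: [BH+]/[B] = 10**(pKa - pH))."""
    return 1.0 + 10.0 ** (pka - ph)

def trapping_ratio(pka: float, ph_compartment: float, ph_plasma: float = 7.4) -> float:
    """Ratio of total drug concentrations (compartment/plasma) when only the
    neutral species equilibrates across the membrane."""
    return total_over_neutral(pka, ph_compartment) / total_over_neutral(pka, ph_plasma)

# Illustrative basic amine (pKa 9.0) and a lysosomal pH of 5.0 (assumed values)
print(f"lysosome/plasma total concentration ratio: {trapping_ratio(9.0, 5.0):.0f}")
```

With these numbers the total concentration in the acidic compartment ends up a couple of hundred times higher than in plasma even though the neutral form is at the same concentration on both sides of the membrane.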

There are two general types of undesirable outcome that can result when a drug encounters receptors with which it mingles.  First, the receptor is an anti-target and the encounter results in binding of the drug, leading to toxicity (patients are harmed).  Second, the receptor is a metabolic enzyme or a transporter and the encounter leads to the drug either being turned over or pumped from where it needs to function (patients do not benefit from the treatment).

I've inserted some comments (italicised in red) into the following quoted text: 

The sad reality that all drug discoverers must face is that however well designed we may believe our compounds to be, they will find ways to interact with many other proteins or nucleic acids in the body and interfere with the normal functions of those biomolecules. While occasionally, the ability of a medicine to bind to multiple biomolecules will increase a drug’s efficacy, such polypharmacology is far more likely to produce undesirable effects. These undesirable outcomes take two forms. Obviously, the direct binding to an anti-target can lead to a bewildering range of toxicities, many of which render the drug too hazardous for any use. [While there are well-known anti-targets such as hERG that must be avoided, my understanding is that those responsible for drug safety generally prefer not to see any off-target activity given the difficulties in prediction of toxicity. Here are a couple of relevant articles (B2012 | J2020) and a link to some information about in vitro safety pharmacology profiling panels from Eurofins.] More subtly, the binding to anti-targets reduces the ability of the drug to reach the desired target. A drug that largely avoids binding to anti-targets will partition more effectively through the body, enabling it to accumulate at high enough concentrations in the disease-relevant tissue to effectively modulate the function of the target. [I consider it unlikely that binding to an anti-target could account for a significant proportion of the dose. In any case, I’d expect binding of a drug to anti-targets to cause unacceptable toxicity long before it results in sequestration of a significant proportion of the dose.] 

A particular challenge results from the interaction of drugs with the enzymes, transporters, channels, and receptors that are largely responsible for controlling the metabolism and pharmacokinetic properties (DMPK) of those drugs—their absorption, distribution, metabolism, and elimination. Drugs often bind to plasma proteins, preventing them from reaching the intended tissues; [A degree of binding to plasma proteins is not a problem and, in the case of warfarin, is probably essential for the safe use of the drug.] they can block or be substrates for all manner of pumps and transporters, changing their distribution through the body; [Transporters can indeed prevent drugs from getting to their sites of action at therapeutically effective concentrations and limited brain exposure resulting from active efflux is a common issue for CNS drug discovery programs (see H2012 and R2015). I am not aware of any transporters that are definitely considered to be anti-targets from the safety perspective (I'm happy to be corrected on this point) and inhibition of efflux pumps is a recognized tactic (see T2021 and H2020) in drug discovery.] xenobiotic sensors such as PXR that turn on transcriptional programs recognizing foreign substances; and they often block enzymes like cytochrome P450s, thereby changing their own metabolism and that of other medicines. [Inhibition of CYPs is generally considered undesirable from the safety perspective because of the potential for drug-drug interactions (see H2020). That said, the CYP3A inhibitor ritonavir (see CG2003) is used in the COVID-19 treatment Paxlovid to slow metabolism of the SARS-CoV-2 main protease inhibitor nirmatrelvir.]  They are themselves substrates for P450s and other metabolizing enzymes and, once altered, can no longer carry out their assigned, life-saving function. [Medicinal chemists are well aware of the challenges presented by drug-metabolizing enzymes although it must be stressed that any drug that was cleared too slowly would be considered to be an unacceptable safety risk.] 

Taken together, we refer to these DMPK-related proteins, somewhat tongue-in-cheek, as the “avoidome” (Figure 2). [It is unclear why the authors have chosen to only include DMPK-related proteins in the avoidome (hERG is not a DMPK-related protein but is an anti-target that every drug discovery scientist would wish to avoid blocking). For reasons outlined in the previous paragraph I would actually argue against the inclusion of DMPK-related proteins in the avoidome.]  Unfortunately, the structures of the vast majority of avoidome targets have not yet been determined. Further, many of these proteins are complex machines that contain multiple domains and exhibit considerable structural dynamism. Their binding pockets can be quite large and promiscuous, favoring distinct binding modes for even closely related compounds. [It is not clear whether this assertion is based on experimental observations.] As a consequence, multiple structures spanning a range of bound ligands and protein conformational states will be required to fully understand how best to prevent drugs from engaging these problematic anti-targets.  

We believe the structural biology community should “embrace the avoidome” with the same enthusiasm that structure-based design has been applied to intended targets. [My view is that the authors need to clearly articulate their reasons for only including DMPK-related proteins in the avoidome before seeking to direct the activities of the structural biology community. I presume that the Target 2035 initiative, which aims “to create by year 2035 chemogenomic libraries, chemical probes, and/or biological probes for the entire human proteome”, will also cover anti-targets. Having chemical and/or biological probes available for anti-targets should lead to better understanding of toxicity in humans.] The structures of these proteins will shed considerable light on human biology and represent exciting opportunities to demonstrate the power of cutting-edge structural techniques. [Experimental structures of target-ligand complexes do indeed provide valuable direct evidence that a ligand is binding to a protein but the structures themselves are not particularly informative from the perspective of understanding human biology. It is actually high-quality chemical probes that are needed to shed light on human biology and here’s a link to the Chemical Probes Portal. Structures at atomic resolution for protein-ligand complexes are certainly useful for chemical probe design but are not strictly necessary for effective use of chemical probes.]  Crucially, a detailed understanding of the ways that drugs engage with avoidome targets would significantly expedite drug discovery.  [Experimentally-determined structures of anti-targets complexed with ligands are certainly informative when elucidating structure-activity relationships for binding to anti-targets. However, structural information of this nature is much less directly useful for addressing problems such as metabolic lability and active efflux.] This information holds the potential to achieve a profound impact on the discovery of new and enhanced medicines.

Conclusion

The authors assert: 

In drug discovery, truth is a molecule that transforms the practice of medicine. [I disagree with this assertion. In drug discovery truth may also be a compound that, despite an excellent pharmacokinetic profile, chokes comprehensively in phase 2.]

It's been a long post and this is a good place to leave things. While the authors have raised some valid points I found the 'Drugs mingle with many different receptors' section to be rather confused and I don't think that the drug discovery and structural biology communities are in desperate need of yet another 'ome' word. I hope that my review of FM2024 will be useful for readers of the article while providing helpful feedback for the authors and for the Editors of Cell. 

Sunday 31 December 2023

Chemical con artists foil drug discovery

One piece of general advice that I offer to fellow scientists is to not let the fact that an article has been published in Nature (or any other ‘elite’ journal for that matter) cause you to switch off your critical thinking skills while reading it and the BW2014 article (Chemistry: Chemical con artists foil drug discovery) that I’ll be reviewing in this post is an excellent case in point. My main criticism of BW2014 is that the rhetoric is not supported by data and I’ve always seen the article as something of a propaganda piece.

One observation that I’ll make before starting my review of BW2014 is that what lawyers would call ‘standard of proof’ varies according to whether you’re saying something good about a compound or something bad. For example, I would expect a competent peer reviewer to insist on measured IC50 values if I had described compounds as inhibitors of an enzyme in a manuscript. However, it appears to be acceptable, even in top journals, to describe compounds as PAINS without having to provide any experimental evidence that they actually exhibit some type of nuisance behavior (let alone pan-assay interference). I see a tendency in the ‘compound quality’ field for opinions to be stated as facts and reading some of the relevant literature leaves me with the impression that some in the field have lost the ability to distinguish what they know from what they believe. 

BW2014 has been heavily cited in the drug discovery literature (it was cited as the first reference in the ACS assay interference editorial which I reviewed in K2017) despite providing little in the way of practical advice for dealing with nuisance behavior. BW2014 appears to exert a particularly strong influence on the Chemical Probes Community having been cited by the A2015, BW2017, AW2022 and A2022 articles as well as in the Toxicophores and PAINS Alerts section of the Chemical Probes Portal. Given the commitment of the Chemical Probes Community to open science, their enthusiasm for the PAINS substructure model introduced in BH2010 (New Substructure Filters for Removal of Pan Assay Interference Compounds (PAINS) from Screening Libraries and for Their Exclusion in Bioassays) is somewhat perplexing since neither the assay data nor the associated chemical structures were disclosed. My advice to the Chemical Probes Community is to let go of PAINS filters. 

Before discussing BW2014, I’ll say a bit about high-throughput screening (HTS) which emerged three decades ago as a lead discovery paradigm. From the early days of HTS it was clear, at least to those who were analyzing the output from the screens, that not every hit smelt of roses.  Here’s what I wrote in K2017:

Although poor physicochemical properties were partially blamed (3) for the unattractive nature and promiscuous behavior of many HTS hits, it was also recognized that some of the problems were likely to be due to the presence of particular substructures in the molecular structures of offending compounds. In particular, medicinal chemists working up HTS results became wary of compounds whose molecular structures suggested reactivity, instability, accessible redox chemistry or strong absorption in the visible spectrum as well as solutions that were brightly colored. While it has always been relatively easy to opine that a molecular structure ‘looks ugly’, it is much more difficult to demonstrate that a compound is actually behaving badly in an assay.

It has long been recognized that it is prudent to treat frequent-hitters (compounds that hit in multiple assays) with caution when analysing HTS output. In K2017 I discussed two general types of behavior that can cause compounds to hit in multiple assays: Type 1 (assay result gives an incorrect indication of the extent to which the compound affects target function) and Type 2 (compound acts on target by undesirable mechanism of action (MoA)). Type 1 behavior is typically the result of interference with the assay read-out and the hits in question can be accurately described as ‘false positives’ because the effects on the target are not real. Type 1 behaviour should be regarded as a problem with the assay (rather than with the compound) and, provided that the activity of a compound has been established using a read-out for which interference is not a problem, interference with other read-outs is irrelevant. In contrast, Type 2 behavior should be regarded as a problem with the compound (rather than with the assay) and an undesirable MoA should always be a show-stopper.

Interference with read-out and undesirable MoAs can both cause compounds to hit in multiple assays. However, these two types of bad behavior can still cause big problems whether or not the compounds are observed to be frequent-hitters. Interference with read-out and undesirable MoAs are very different problems in drug discovery and the failure to recognize this point is a serious deficiency that is shared by BW2014 and BH2010.

Although I’ve criticized the use of PAINS filters there is no suggestion that compounds matching PAINS substructures are necessarily benign (many of the PAINS substructures look distinctly unwholesome to me). I have no problem whatsoever with people expressing opinions as to the suitability of compounds for screening provided that the opinions are not presented as facts. In my view the chemical con-artistry of PAINS filters is not that benign compounds have been denounced but the implication that PAINS filters are based on relevant experimental data.

Given that the PAINS filters form the basis of a cheminformatic model that is touted for prediction of pan-assay interference, one could be forgiven for thinking that the model had been trained using experimental observations of pan-assay interference. This is not so, however, and the data that form the basis of the PAINS filter model actually consist of the output of six assays that each use the AlphaScreen read-out. As noted in K2017, a panel of six assays using the same read-out would appear to be a suboptimal design of an experiment to observe pan assay interference. Putting this in perspective, P2006 (An Empirical Process for the Design of High-Throughput Screening Deck Filters) which was based on analysis of the output from 362 assays had actually been published four years before BH2010.

After a bit of a preamble, I need to get back to reviewing BW2014 and my view is that readers of the article who didn’t know better could easily conclude that drug discovery scientists were completely unaware of the problems associated with misleading HTS assay results before the re-branding of frequent-hitters as PAINS in BH2010. Given that M2003 had been published over a decade previously, I was rather surprised that BW2014 had not cited a single article about how colloidal aggregation can foil drug discovery. Furthermore, it had been known (see FS2006) for years before the publication of BH2010 that the importance of colloidal aggregation could be assessed by running assays in the presence of detergent.

I'll be commenting directly on the text of BW2014 for the remainder of the post (my comments are italicized in red).

Most PAINS function as reactive chemicals rather than discriminating drugs. [It is unclear here whether “PAINS” refers to compounds that have been shown by experiment to exhibit pan-assay interference or simply compounds that share structural features with compounds (chemical structures not disclosed) claimed to be frequent-hitters in the BH2010 assay panel. In any case, sweeping generalizations like this do need to be backed with evidence. I do not consider it valid to present observations of frequent-hitter behavior as evidence that compounds are functioning as reactive chemicals in assays.] They give false readouts in a variety of ways. Some are fluorescent or strongly coloured. In certain assays, they give a positive signal even when no protein is present. [The BW2014 authors appear to be confusing physical phenomena such as fluorescence with chemical reactivity.]

Some of the compounds that should ring the most warning bells are toxoflavin and polyhydroxylated natural phytochemicals such as curcumin, EGCG (epigallocatechin gallate), genistein and resveratrol. These, their analogues and similar natural products persist in being followed up as drug leads and used as ‘positive’ controls even though their promiscuous actions are well-documented (8,9). [Toxoflavin is not mentioned in either Ref8 or Ref9 although T2004 would have been a relevant reference for this compound. Ref8 only discusses curcumin and I do not consider that the article documents the promiscuous actions of this compound.  Proper documentation of the promiscuity of a compound would require details of the targets that were hit, the targets that were not hit and the concentration(s) at which the compound was assayed. The effects of curcumin, EGCG (epigallocatechin gallate), genistein and resveratrol on four membrane proteins were reported in Ref9 and these effects would raise doubts about activity for any of these compounds (or their close structural analogs) that had been observed in a cell-based assay. However, I don’t consider that it would be valid to use the results given in Ref9 to cast doubt on biological activity measured in an assay that was not cell-based.] 

Rhodanines exemplify the extent of the problem. [Rhodanines are specifically discussed in K2017 in which I suggest that the most plausible explanation for the frequent-hitter behavior observed for rhodanines in the BH2010 panel of six AlphaScreen assays is that the singly-connected sulfur reacts with singlet oxygen (this reactivity has been reported for compounds with thiocarbonyl groups in their molecular structures).] A literature search reveals 2,132 rhodanines reported as having biological activity in 410 papers, from some 290 organizations of which only 24 are commercial companies. [What would the literature search have revealed if the target substructure had been ‘benzene ring’ rather than ‘rhodanine’? As discussed in this post the B2023 study presented the diversity of targets hit by compounds incorporating a fused tetrahydroquinoline in their molecular structures as ‘evidence’ for pan-assay interference by compounds based on this scaffold.] The academic publications generally paint rhodanines as promising for therapeutic development. In a rare example of good practice, one of these publications (10) (by the drug company Bristol-Myers Squibb) warns researchers that these types of compound undergo light-induced reactions that irreversibly modify proteins. [The C2001 study (Photochemically enhanced binding of small molecules to the tumor necrosis factor receptor-1 inhibits the binding of TNF-α) is actually a more relevant reference since it focuses on the nature of the photochemically enhanced binding. The structure of the complex of TNFRc1 with one of the compounds studied (IV703; see graphic below) showed a covalent bond between one of the carbon atoms of the pendant nitrophenyl and the backbone amide nitrogen of A62. The structure of the IV703–TNFRc1 complex shows that a covalent bond involving the pendant aromatic ring must also be considered a distinct possibility for the rhodanines reported in Ref10 and C2001.] It is hard to imagine how such a mechanism could be optimized to produce a drug or tool. Yet this paper is almost never cited by publications that assume that rhodanines are behaving in a drug-like manner. [It would be prudent to cite M2012 (Privileged Scaffolds or Promiscuous Binders: A Comparative Study on Rhodanines and Related Heterocycles in Medicinal Chemistry) if denouncing fellow drug discovery scientists for failure to cite Ref10.]

In a move partially implemented to help editors and manuscript reviewers to rid the literature of PAINS (among other things), the Journal of Medicinal Chemistry encourages the inclusion of computer-readable molecular structures in the supporting information of submitted manuscripts, easing the use of automated filters to identify compounds’ liabilities. [I would be extremely surprised if ridding the literature of PAINS was considered by the JMC Editors when they decided to implement a requirement that authors include computer-readable molecular structures in the supporting information of submitted manuscripts. In any case, claims such as this do need to be supported by evidence.]  We encourage other journals to do the same. We also suggest that authors who have reported PAINS as potential tool compounds follow up their original reports with studies confirming the subversive action of these molecules. [I’ve always found this statement bizarre since the BW2014 authors appear to be suggesting that authors who have reported PAINS as potential tool compounds should confirm something that they have not observed and which may not even have occurred. When using the term “PAINS” do the BW2014 authors mean compounds that have actually been shown to exhibit pan-assay interference or compounds that share structural features with compounds that were claimed to exhibit frequent-hitter behavior in the BH2010 assay panel? Would interference with the AlphaScreen read-out by a singlet oxygen quencher be regarded as a subversive action by a molecule in situations when a read-out other than AlphaScreen had been used?] Labelling these compounds clearly should decrease futile attempts to optimize them and discourage chemical vendors from selling them to biologists as valid tools. [The real problem here is compounds being sold as tools in the absence of the measured data that is needed to support the use of the compounds for this purpose. Matches with PAINS substructures would not rule out the use of a compound as a tool if the appropriate package of measured data is available. In contrast, a compound that does not match any PAINS substructures cannot be regarded as an acceptable tool if the appropriate package of measured data is not available. Put more bluntly, you’re hardly going to be able to generate the package of measured data if the compound is as bad as PAINS filter advocates say it is.]

Box: PAINS-proof drug discovery

Check the literature. [It’s always a good idea to check the literature but the failure of the BW2014 authors to cite a single colloidal aggregation article such as M2003 suggests that perhaps they should be following this advice rather than giving it. My view is that the literature on scavenging and quenching of singlet oxygen was treated in a cursory manner in BH2010 (see earlier comment in connection with rhodanines).]  Search by both chemical similarity and substructure to see if a hit interacts with unrelated proteins or has been implicated in non-drug-like mechanisms. [Chemical similarity and substructure search will identify analogs of hits and it is actually the exact match structural search that you need to do in order to see if a particular compound is a hit in assays against unrelated proteins.] Online services such as SciFinder, Reaxys, BadApple or PubChem can assist in the check for compounds (or classes of compound) that are notorious for interfering with assays. [I generally recommend ChEMBL as a source of bioactivity data and a search sketch follows this box.]  

Assess assays. For each hit, conduct at least one assay that detects activity with a different readout. [This will only detect problems associated with interference with read-out. As discussed in S2009 it may be possible to assess and even correct for interference with read-out without having to run an assay with a different read-out.]  Be wary of compounds that do not show activity in both assays. If possible, assess binding directly, with a technique such as surface plasmon resonance. [SPR can also provide information about MoA since association, dissociation and stoichiometry can all be observed directly using this detection technology.] 
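For what it’s worth, here’s a rough sketch of the kind of exact-match check that I have in mind, using the chembl_webresource_client package (the rhodanine SMILES is just an illustrative placeholder and the field names reflect my reading of the ChEMBL web services documentation, so treat this as a starting point rather than a polished tool):

```python
# pip install chembl_webresource_client
from chembl_webresource_client.new_client import new_client

smiles = "O=C1CSC(=S)N1"   # illustrative query structure (rhodanine core); use your hit's SMILES

# Exact-structure ('flexmatch') search for the query compound itself
molecules = new_client.molecule.filter(
    molecule_structures__canonical_smiles__flexmatch=smiles
).only(["molecule_chembl_id", "pref_name"])

for mol in molecules:
    chembl_id = mol["molecule_chembl_id"]
    # Pull reported bioactivities to see which (possibly unrelated) targets the compound hits
    activities = new_client.activity.filter(molecule_chembl_id=chembl_id).only(
        ["target_pref_name", "standard_type", "standard_value", "standard_units"]
    )
    for act in activities:
        print(chembl_id, act["target_pref_name"], act["standard_type"],
              act["standard_value"], act["standard_units"])
```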

That concludes blogging for 2023 and many thanks to anybody who has read any of the posts this year. For too many people Planet Earth is not a very nice place to be right now and my new year wish is for a kinder, happier and more peaceful world in 2024. 

Tuesday 19 December 2023

On quality criteria for covalent and degrader probes

I’ll be taking a look at H2023 (Expanding Chemical Probe Space: Quality Criteria for Covalent and Degrader Probes) in this post and this article has also been discussed In The Pipeline. I’ll primarily be discussing the quality criteria for covalent probes although I’ll also comment briefly on chemical matter criteria proposed for degrader probes. The post is intended as a contribution to the important scientific discussion that the H2023 Perspective is intended to jumpstart:

We are convinced that now is the time to initiate similar efforts to achieve a consensus about quality criteria for covalently acting and degrader probes. This Perspective is intended to jumpstart this important scientific discussion.

Covalent bond formation between ligands and targets is a drug design tactic for exploiting molecular recognition elements in targets that are difficult to make beneficial contacts with.  Cysteine SH has minimal capacity to form hydrogen bonds with polar ligand atoms and the exposed nature of catalytic cysteine SH reduces its potential to make beneficial contacts with non-polar ligand atoms. One common misconception in drug discovery is that covalent bond formation between targets and ligands is necessarily irreversible and it wasn’t clear from my reading of H2023 whether the authors were aware that covalent bond formation between targets and ligands can also be reversible. In any case, it needed to be made clear that the quality criteria proposed by the authors for covalently acting small-molecule probes only apply to probes that act irreversibly.

Irreversible covalent bond formation is typically used to target non-catalytic residues and design is a lot more complicated than for reversible covalent bond formation. First, IC50 values are time-dependent (there are two activity parameters: affinity and inactivation rate constant) which makes it much more difficult to assess selectivity or to elucidate SAR. Second, the transition state structural models required for modelling inactivation cannot be determined experimentally and therefore need to be calculated using computationally intensive quantum mechanical methods.
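To illustrate the time-dependence, here is a minimal Python sketch based on the standard two-parameter model for irreversible inactivation (the k.inact and K.i values are invented and real assays would also need to account for substrate competition and inhibitor depletion). The apparent IC50 simply tracks the length of the preincubation:

```python
from math import exp

k_inact = 0.01   # first-order inactivation rate constant in 1/s (illustrative value)
K_i = 1.0e-6     # inhibition constant in mol/L (illustrative value)

def fraction_active(conc: float, t: float) -> float:
    """Fraction of active target remaining after preincubation for t seconds at
    inhibitor concentration conc, using k_obs = k_inact*[I]/(K_i + [I])."""
    k_obs = k_inact * conc / (K_i + conc)
    return exp(-k_obs * t)

def apparent_ic50(t: float) -> float:
    """Inhibitor concentration giving 50% remaining activity after t seconds,
    found by bisection on a logarithmic concentration scale."""
    lo, hi = 1.0e-12, 1.0                # bracketing concentrations in mol/L
    for _ in range(100):
        mid = (lo * hi) ** 0.5
        if fraction_active(mid, t) > 0.5:
            lo = mid                     # not enough inhibition: raise the concentration
        else:
            hi = mid
    return mid

for minutes in (10, 30, 60):
    print(f"preincubation {minutes:3d} min: apparent IC50 = {apparent_ic50(60.0 * minutes):.1e} M")
```

With these invented parameters the apparent IC50 drops several-fold as the preincubation is extended from 10 to 60 minutes, which is why a single IC50 value tells you very little about an irreversible inhibitor.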

I’ll start my review with a couple of general comments. Intracellular concentration is a factor that is not always fully appreciated in chemical biology and I generally recommend that people writing about chemical probes demonstrate awareness of SR2019 (Intracellular and Intraorgan Concentrations of Small Molecule Drugs: Theory, Uncertainties in Infectious Diseases and Oncology, and Promise). On a more pedantic note, I cautioned against using ‘molecule’ as a synonym for ‘compound’ in my review of S2023 (Systematic literature review reveals suboptimal use of chemical probes in cell-based biomedical research) and I suggest that “covalent molecule” might be something that you don't want to see in the text of an article in a chemistry journal.

However, significant efforts need to be invested into characterizing and validating covalent molecules as a prerequisite for conclusive use in biomedical research and target validation studies.

The proposed quality criteria for covalently acting small-molecule probes are given in Figure 2 of H2023 although I’ll be commenting on the text of the article. Subscripting doesn't work well in blogger and so I'll use K.i and k.inact respectively throughout the post to denote the inhibition constant and the first order inactivation rate constant.  

I’ll start with Section 2.1 (Criteria for Assessing Potency of Covalent Probes) and my comments are italicised in red. 

When working with irreversible covalent probes, it is important to consider that target inhibition is time-dependent and therefore IC50 values, while frequently used, are a suboptimal descriptor of potency. (21) Best practice is to use k.inact (the rate of inactivation) over K.i (the affinity for the target) values instead. (22) [I recommend that values of both k.inact and K.i be reported because this enables the extent of non-covalent target engagement by the chemical probe to be assessed. Regardless of whether binding to target is covalent or non-covalent, the concentration and affinity of substrates (as well as cofactors such as ATP) need to be properly accounted for when interpreting effects of chemical probes in cell-based assays. This is a significant issue for ATP-competitive kinase inhibitors (as discussed in my review of S2023) and I recommend this tweetorial from Keith Hornberger.]

As measurement of k.inact/K.i values can be labor-intensive (or in certain cases technically impossible), IC50 values (or target engagement TE50 values) are often reported for covalent leads and used to generate structure–activity relationships (SARs). [The labor-intensive nature of the measurements is not a valid justification for a failure to measure k.inact and K.i values for a covalent chemical probe.]  Carefully designed biochemical assays used in determining IC50 values can be well-suited as surrogates for k.inact/K.i measurements. (24) [It is my understanding that the primary reason for doing this is to increase the throughput of irreversible inhibition assays for SAR optimization and I would generally be extremely wary of any IC50 value measured for an irreversible inhibitor unless it had been technically impossible to measure k.inact or K.i values for the inhibitor.]

2.2. Criteria for Assessing Covalent Probe Selectivity

We propose a selectivity factor of 30-fold in favor of the intended target of the probe compared to that of other family members or identified off-targets under comparable assay conditions. [The authors need to be clearer as to which measure of ‘activity’ they propose should be used for calculating the ratio and some justification for the ratio (why 30-fold rather than 50-fold or 25-fold?) should be given. Regardless of whether binding to target is covalent or non-covalent, the concentration and affinity of substrates (as well as cofactors such as ATP) need to be properly accounted for when assessing selectivity. It is not clear how the selectivity factor should be defined to quantify selectivity of an inhibitor that binds covalently to the target but non-covalently to off-targets. My comments on the THZ1 probe in my review of the S2023 study may be relevant.]

2.3. Chemical Matter Criteria for Covalent Probes

Ideally, the on-target activity of the covalent probe is not dominated by the reactive warhead, but the rest of the molecule provides a measurable reversible affinity for the intended target. [My view is that the reversible affinity of the probe should be greater than simply what is measurable and I suggest, with some liberal arm-waving, that a K.i cutoff of  ~100 nM might be more useful (a K.i value of 10 μM is usually measurable provided that the inhibitor is adequately soluble in assay buffer).] Seeing SARs over 1–2 log units of activity resulting from core, substitution, and warhead changes is an important quality criterion for covalent probe molecules. [The authors need to be clearer about which ‘activity’ they are referring to (differences in K.i and k.inact values between compounds are likely to be greater than the corresponding differences in k.inact/K.i values). The criterion “SAR for covalent and non-covalent interactions” shown in Figure 2 is nonsensical.]

3.3. Chemical Matter Criteria for Degrader Probes

When selecting chemical degrader probes, it is recommended that a chemist critically assesses the chemical structure of the degrader for the presence of chemical groups that impart polypharmacology or interfere with assay read-outs (PAINs motifs). (78) [I certainly agree that chemists should critically assess chemical structures of probes and, if performing a critical assessment of this nature for a degrader probe, I would be taking a look in ChEMBL to see what’s known for structurally-related compounds. I consider the risk of discarding acceptable chemical matter on the basis of matches with PAINS substructures to be low although there’s a lot more to critical assessment of chemical structures than simply checking for matches against PAINS substructures. My view is that genuine promiscuity (as opposed to frequent hitter behavior resulting from interference with read-out) cannot generally be linked to chemical groups. As noted in K2017 the PAINS substructure model introduced in BH2010 was actually trained on the output of six AlphaScreen assays and the applicability domain of the model should be regarded as prediction of frequent-hitter behavior in this assay panel rather than interference with assay read-outs (that said the most plausible explanation for frequent-hitter behavior in the PAINS assay panel is interference with the AlphaScreen read-out by compounds that quench or react with singlet oxygen). My recommendation is that chemical matter criteria for chemical probes should be specified entirely in terms of measured data and the models used to select/screen potentially acceptable chemical matter should not be included in the chemical matter criteria.] 

This is a good point to wrap up my contribution to the important scientific discussion that H2023 is intended to jumpstart. While some of what I've written might be seen as nitpicking please bear in mind that quality criteria for chemical probes need to be defined precisely in order to be useful to the chemical biology and medicinal chemistry communities.

Wednesday 6 December 2023

Are fused tetrahydroquinolines interfering with your assay?

I’ll be taking a look at B2023 (Fused Tetrahydroquinolines Are Interfering with Your Assay) in this post. The article has already been discussed in posts at Practical Fragments and In The Pipeline. In anticipation of the stock straw man counterarguments to my criticisms of PAINS filters, I must stress that there is absolutely no suggestion that compounds matching PAINS filters are necessarily benign. The authors have shown that fusion of cyclopentene at C3-C4 of the tetrahydroquinoline (THQ) ring system is associated with a risk of chemical instability and I consider this to be extremely useful information for anybody thinking about using this scaffold. However, the authors do also appear to be making a number of claims that are not supported by evidence and, in my view, have not demonstrated that the chemical instability leads to pan-assay interference or even frequent-hitter behavior.   

The term ‘PAINS’ crops up frequently in B2023 (the authors even refer to “the PAINS concept” although I think that’s pushing things a bit) and I’ll start by saying something about two general types of nuisance behavior of compounds in assays; these points are discussed in more detail in K2017 (Comment on The Ecstasy and Agony of Assay Interference Compounds). From the perspective of screening libraries of compounds for biological activity, the two types of nuisance behavior are very different problems that need to be considered very differently. One criticism that can be made of both BH2010 (original PAINS study) and BW2014 (Chemical con artists foil drug discovery) is that neither study considers the differing implications for drug discovery of these two types of nuisance behavior.

The first type of nuisance behavior in assays is interference with assay read-out and when ‘activity’ in an assay is due to assay interference hits can accurately be described as ‘false positives’ (this should be seen as a problem with the assay rather than the compound). Interference with assay read-outs is certainly irksome when you’re analysing output from screens because you don’t know if the ‘activity’ is real or not. However, if you’re able to demonstrate genuine activity for a compound using an assay with a read-out for which interference is not an issue then interference with other assay read-outs is irrelevant and would not rule out the compound as a viable starting point for further investigation. Interference with assay read-outs generally increases with the concentration of the compound in the assay (this is why biophysical methods are often favored for screening fragments) and I’ll direct readers to a helpful article by former colleagues. It’s also worth noting that interference with read-out can also lead to false negatives. 

The second type of nuisance behavior is that the compound acts on a target by an undesirable mechanism of action (MoA) and it is not accurate to describe hits behaving in this manner as ‘false positives’ because the effect on the target is real (this should be seen as a problem with the compound rather than the assay). In contrast to interference with read-out, an undesirable MoA is a show-stopper. An undesirable MoA with which many drug discovery scientists will be familiar is colloidal aggregate formation (see M2003) and the problem can be assessed by running the assay in the absence and presence of detergent (see FS2006). In some cases patterns in screening output may point to an undesirable MoA. For example, cysteine reactivity might be indicated by compounds hitting in multiple assays for inhibition of enzymes that feature cysteine in their catalytic mechanisms.

I’ll make some comments on PAINS filters before I discuss B2023 in detail and much of what I’ll be saying has already been said in K2017 and C2017 (Phantom PAINS: Problems with the Utility of Alerts for Pan-Assay INterference CompoundS) although you shouldn’t need to consult these articles in order to read the blog post unless you want to get some more detail. The PAINS filter model introduced in BH2010 consists of a number of substructures which are claimed (I say “claimed” because the assay results and associated chemical structures are proprietary) to be associated with frequent hitter behavior in a panel of six assays that all use the AlphaScreen read-out (compounds that react with or quench singlet oxygen have the potential to interfere with this read-out). I argued in K2017 that six assays, all using the same read-out, do not constitute a credible basis for the design of an experiment to detect pan-assay interference. Put another way, the narrow scope of the data used to train the PAINS filter model restricts the applicability domain of this model to prediction of frequent-hitter behavior in these six assays. The BH2010 study does not appear to present a single example of a compound that has actually been demonstrated by experiment to exhibit pan-assay interference.

The B2023 study reports that tetrahydroquinolines (THQs) fused at C3-C4 with cyclopentene (1) are unstable. This is valuable information for anybody who may have the misfortune to be working with this particular scaffold and the observed instability implies that drug discovery scientists should also be extremely wary of any biological activity reported for compounds that incorporate this scaffold. Furthermore, the authors show that the instability can be linked to the presence of the carbon-carbon double bond in the ‘third ring’ since 2, the dihydro analog of 1, appears to be stable. I would certainly mention the chemical instability reported in B2023 if reviewing a manuscript that reported biological activity for compounds based on this scaffold. However, I would not mention that BH2010 has stated that the scaffold matches the anil_alk_ene (SLN: C[1]:C:C:C[4]:C(:C:@1)NCC[9]C@4C=CC@9 ) PAINS substructure because the nuisance behavior consists of hitting frequently in a six-assay panel of questionable relevance and the PAINS filters were based on analysis of proprietary data.
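For anybody who wants to see what a PAINS substructure match actually looks like in practice (which, to be clear, is not the same thing as evidence of bad behavior in an assay), here is a minimal RDKit sketch using the PAINS catalog that ships with RDKit. The SMILES is an illustrative 5-benzylidene rhodanine rather than the B2023 scaffold and the particular filter that fires, if any, may not be the one you expect:

```python
from rdkit import Chem
from rdkit.Chem.FilterCatalog import FilterCatalog, FilterCatalogParams

# Build a catalog containing the PAINS substructure filters distributed with RDKit
params = FilterCatalogParams()
params.AddCatalog(FilterCatalogParams.FilterCatalogs.PAINS)
catalog = FilterCatalog(params)

# Illustrative query: a 5-benzylidene rhodanine (replace with the structure of interest)
mol = Chem.MolFromSmiles(r"O=C1NC(=S)S/C1=C\c1ccccc1")

matches = [entry.GetDescription() for entry in catalog.GetMatches(mol)]
if matches:
    print("PAINS filter(s) matched:", ", ".join(matches))
else:
    print("No PAINS filter matched")
```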

Although I wouldn’t have predicted the chemical instability reported for 1 by B2023, this scaffold is certainly not a structural feature that I would have taken into lead optimization with any enthusiasm (a hydrogen that is simultaneously benzylic and allylic does rather look like a free lunch for the CYPs). I would still be concerned about instability even if methylene groups were added to or deleted from the aliphatic parts of 1. I suspect that the electron-releasing nitrogen of 1 contributes to chemical instability although I don’t think that changing nitrogen for another atom type would eliminate the risk of chemical instability. Put another way, the instability observed for 1 should raise questions about the stability of a number of structurally-related scaffolds. Chemical instability is (or at least should be) a show-stopper in the context of drug discovery even if it doesn't lead to interference with assay read-out, an undesirable MoA or pan-assay interference.

I certainly consider the instability observed for 1 to be of interest and relevant to a number of structurally-related chemotypes. However, I have a number of concerns about B2023 and one specific criticism is that the authors use “tricyclic/fused THQ” throughout the text as a synonym for “tricyclic/fused THQ with a carbon-carbon double bond in the ‘third’ ring”. At best this is confusing and it could lead to groundless criticism, either publicly or in peer review, of a study that reported assay results for compounds based on the scaffold in 2. A more general point is that the authors make a number of claims that, in my view, are not adequately supported by evidence. I’ll start with the significance section and my comments are italicized in red:

Tricyclic tetrahydroquinolines (THQs) are a family of lesser studied pan-assay interference compounds (PAINS). [The authors need to provide specific examples of tricyclic THQs that have actually been shown to exhibit pan-assay interference to support this claim.] These compounds are found ubiquitously throughout commercial and academic small molecule screening libraries. [The authors do not appear to have presented evidence to support this claim and the presence of compounds in vendor catalogues does not prove that the compounds are actually being screened. In my view, the authors appear to be trying to ‘talk up’ the significance of their findings by making this statement.] Accordingly, they have been identified as hits in high-throughput screening campaigns for diverse protein targets. We demonstrate that fused THQs are reactive when stored in solution under standard laboratory conditions and caution investigators from investing additional resource into validating these nuisance compounds.

Continuing with the introduction:

Fused tetrahydroquinolines (THQs) are frequent hitters in hit discovery campaigns. [In my view the authors have not presented sufficient evidence to support this statement and I don’t consider claims made in BH2010 for frequent-hitter behavior by compounds matching the anil_alk_ene PAINS substructure to be admissible as evidence simply because they are based on proprietary data. In any case, the numbers of compounds matching the anil_alk_ene PAINS substructure and reported in BH2010 to hit in zero (17) or one (11) assays in the PAINS assay panel suggest that 28 compounds (of a total of 51 substructural matches) cannot be regarded as frequent-hitters in this assay panel.] Pan-assay interference compounds (PAINS) have been controversial in the recent literature. While some literature supports these as nuisance compounds, other papers describe PAINS as potentially valuable leads. (1 | 2 | 3 | 4) [The C2017 study referenced as 2 is actually a critique of PAINS filters and I’m assuming that the authors aren’t suggesting that it “supports these [PAINS] as nuisance compounds”. However, I would consider it a gross misrepresentation of C2017 to imply that the study describes “PAINS as potentially valuable leads”.] There have been descriptions of many different classes of PAINS that vary in their frequency of occurrence as hits in the screening literature. [In my view, the number of articles on PAINS appears to greatly exceed the number of compounds that have actually been shown to exhibit pan-assay interference.]

The number of papers that selected this scaffold during hit discovery campaigns from multiple chemical libraries supports the idea that fused THQs are frequent hitters. [Let’s take a closer look at what the authors are suggesting by considering a selection of compounds, each of which has a benzene ring in its molecular structure. Now let’s suppose that each of a large number of targets is hit by at least one of the compounds in this selection (I could easily satisfy this requirement by selecting marketed drugs with benzene rings in their molecular structures). Applying the same logic as the authors, I could use these observations to support the idea that compounds incorporating benzene rings in their molecular structures are frequent-hitters. In my view the B2023 study doesn’t appear to have presented a single example of a fused THQ that has actually been shown experimentally to exhibit frequent-hitter behavior. As mentioned earlier in this post, fewer than half of the compounds matching the anil_alk_ene PAINS substructure that were evaluated in the BH2010 assay panel can be regarded as frequent-hitters.] At first glance, these compounds appear to be valid, optimizable hits, with reasonable physicochemical properties. Although micromolar and reproducible activity has been reported for multiple THQ analogues on many protein targets, hit-to-lead optimization programs aimed at improving the initial hits (Supporting Information (SI), Table S1) have resulted in no improvement in potency or no discernible structure–activity relationships (SAR) [Achieving increased potency and establishing SARs are certainly important objectives in hit-to-lead studies. However, assertions that hit-to-lead optimizations “have resulted in no improvement in potency or no discernible structure–activity relationships” do need to be supported with appropriate discussion of specific hit-to-lead optimization studies.]
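
Going back to the frequent-hitter logic for a moment, the distinction can be made concrete with a few lines of code. The sketch below uses entirely made-up (compound, target) records, not anything from B2023 or BH2010, to show how a structural class can appear to ‘hit’ many targets while no individual member of the class qualifies as a frequent hitter.

```python
# Illustrative sketch with invented data: class-level target coverage and
# compound-level hit rates are very different quantities.
from collections import Counter

# hypothetical hit records (compound, target) for one structural class
hits = [("cmpd1", "T1"), ("cmpd2", "T2"), ("cmpd3", "T3"),
        ("cmpd4", "T4"), ("cmpd5", "T5"), ("cmpd6", "T6")]

targets_hit_by_class = {target for _, target in hits}
hits_per_compound = Counter(compound for compound, _ in hits)

print(len(targets_hit_by_class))        # 6 targets: the class looks 'promiscuous'
print(max(hits_per_compound.values()))  # 1 hit each: no compound is a frequent hitter
```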

Examples of Fused THQs as “Hits” Are Pervasive

The diversity of protein targets captured below supports the premise that the fused THQ scaffold does not yield specific hits for these proteins but that the reported activity is a result of pan-assay interference. [I could use an argument analogous to the one that I’ve just used for frequent-hitters to ‘prove’ that compounds with benzene rings in their molecular structure do not yield specific hits and that any reported activity is due to pan-assay interference. The authors do not appear to have presented a single example of a fused THQ that has been shown by experiment to exhibit pan-assay interference.]

Concluding remarks

Our review and evidence-based experiments solidify the idea that tricyclic THQs are nuisance compounds that cause pan-assay interference in the majority of screens rather than privileged structures worthy of chemical optimization. [While I certainly agree that chemical instability would constitute a nuisance, I would consider it wildly extravagant to claim that tricyclic THQs can “cause pan-assay interference” since nobody appears to have actually observed pan-assay interference for even a single tricyclic THQ.] Their widespread micromolar activities on a broad range of proteins with diverse assay readouts support our assertion that they are unlikely to be valid hits. [As stated previously, I do not consider that “widespread micromolar activities on a broad range of proteins” observed for compounds that share a particular structural feature implies that all compounds with the particular structural feature are unlikely to be valid hits.]

So that concludes my review of the B2023 study. I really liked the experimental work that revealed the instability of 1 and linked it to the presence of the double bond in the 'third' ring.  Furthermore, these experimental results would (at least for me) raise questions about the chemical stability of some scaffolds that are structurally-related to 1. However, I found the analysis of the bioactivity data reported in the literature for fused THQs to be unconvincing to the extent that it significantly weakened the B2023 study. 

Sunday 19 November 2023

On the misuse of chemical probes

It’s now time to get back to chemical probes and I’ll be taking a look at S2023 (Systematic literature review reveals suboptimal use of chemical probes in cell-based biomedical research) which has already been reviewed in blog posts from Practical Fragments, In The Pipeline and the Institute of Cancer Research. Readers of this blog are aware that PAINS filters usually crop up in posts on chemical probes but there are other things that I want to discuss and, in any case, references to PAINS in S2023 are minimal. Nevertheless, I’ll still stress that a substructural match with a PAINS filter does not constitute a valid criticism of a chemical probe (it simply means that the probe shares structural features with compounds that have been claimed to exhibit frequent-hitter behaviour in a panel of six AlphaScreen assays) and one is more likely to encounter a bunyip than a compound that has actually been shown to exhibit pan-assay interference.

The authors of S2023 claim to have revealed “suboptimal use of chemical probes in cell-based biomedical research” and I’ll start by taking a look at the abstract (my annotations are italicised in red):

Chemical probes have reached a prominent role in biomedical research, but their impact is governed by experimental design. To gain insight into the use of chemical probes, we conducted a systematic review of 662 publications, understood here as primary research articles, employing eight different chemical probes in cell-based research. [A study such as S2023 that has been claimed by its authors to be systematic does need to say something about how the eight chemical probes were selected and why the literature for this particular selection of chemical probes should be regarded as representative of chemical probes literature in general.] We summarised (i) concentration(s) at which chemical probes were used in cell-based assays, (ii) inclusion of structurally matched target-inactive control compounds and (iii) orthogonal chemical probes. Here, we show that only 4% of analysed eligible publications used chemical probes within the recommended concentration range and included inactive compounds as well as orthogonal chemical probes. [I would argue that failure to use a chemical probe within a recommended concentration range is only a valid criticism if the basis for the recommendation is clearly articulated.] These findings indicate that the best practice with chemical probes is yet to be implemented in biomedical research. [My view is that the best practice with chemical probes is yet to be defined.] To achieve this, we propose ‘the rule of two’: At least two chemical probes (either orthogonal target-engaging probes, and/or a pair of a chemical probe and matched target-inactive compound) to be employed at recommended concentrations in every study. [The authors of S2023 do seem to be moving the goalposts since they’ve criticized studies for not using structurally matched target-inactive control compounds but are saying that using an additional orthogonal target-engaging probe makes it acceptable not to use a structurally matched target-inactive control compound. This suggestion does appear to contradict the Chemical Probes Portal criteria for 'classical' modulators which do require the use of a control compound defined as having a "similar structure with similar physicochemistry, non-binding against target".]

The following sentence does set off a few warning bells for me:

The term ‘chemical probe’ distinguishes compounds used in basic and preclinical research from ‘drugs’ used in the clinic, from the terms ‘inhibitor’, ‘ligand’, ‘agonist’ or ‘antagonist’ which are molecules targeting a given protein but are insufficiently characterised, and also from the term ‘probes’ which is often referring to laboratory reagents for biophysical and imaging studies.

First, the terms 'compound' and 'molecule' are not interchangeable and I would generally recommend using 'compound' when talking about biological activity or affinity. A more serious problem is that the authors of S2023 seem to be getting into homeopathic territory by suggesting that chemical probes are not ligands and this might have caused Paul Ehrlich (who died 26 years before Kaiser Wilhelm II) to spit a few feathers. Drugs and chemical probes are ligands for their targets by virtue of binding to their targets (the term 'ligand' is derived from the Latin 'ligare' which means 'to bind' and a compound can be a ligand for one target without necessarily being a ligand for another target) while the terms 'inhibitor', 'agonist' and 'antagonist' specify the consequences of ligand binding. I was also concerned by the use of the term 'in cell concentration' in S2023 given that uncertainty in intracellular concentration is an issue when working with chemical probes (as well as in PK-PD modelling). Although my comments above could be seen as nit-picking, these are not the kind of errors that authors can afford to make if they’re going to claim that their “findings indicate that the best practice with chemical probes is yet to be implemented in biomedical research”.

Let’s take a look at the criteria by which the authors of S2023 have assessed the use of chemical probes. They assert that “Even the most selective chemical probe will become non-selective if used at a high concentration” although I think it’d be more correct to state that the functional selectivity of a probe depends on the binding affinity of the probe for the target and anti-targets as well as the concentration of the probe (at its site of action). Selectivity also depends on the concentration of anything that binds competitively with the probe and, when assessing kinase selectivity, it can be argued that assays for ATP-competitive kinase inhibitors should be run at a typical intracellular ATP concentration (here’s a recent open access review on intracellular ATP concentration). The presence of serum in cell-based assays should also be considered when setting upper concentration limits since chemical probes may bind to serum proteins such as albumin, which means that the concentration of a compound that is ‘seen’ by the cells is lower than the total concentration of the compound in the assay. In my experience, binding to albumin tends to increase with lipophilicity and is also favored by the presence of an acidic group such as carboxylate in a molecular structure.
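
To make the point about affinity, concentration and competition more concrete, here’s a back-of-envelope sketch (all the numbers are hypothetical) of fractional occupancy for a target and an anti-target, followed by the familiar Cheng-Prusoff shift in apparent potency for an ATP-competitive inhibitor assayed at an intracellular ATP concentration.

```python
# Back-of-envelope sketch with hypothetical numbers: selectivity depends on
# affinities, probe concentration and whatever competes with the probe.
def occupancy(conc_nM, kd_nM):
    """Fraction of binding sites occupied at a given free probe concentration."""
    return conc_nM / (conc_nM + kd_nM)

probe_conc_nM = 1000.0                          # 1 uM, hypothetical assay concentration
kd_target_nM, kd_antitarget_nM = 10.0, 1000.0   # hypothetical affinities

print(occupancy(probe_conc_nM, kd_target_nM))      # ~0.99 on the target
print(occupancy(probe_conc_nM, kd_antitarget_nM))  # ~0.50 on the anti-target

# Cheng-Prusoff for an ATP-competitive inhibitor: IC50(app) = Ki * (1 + [ATP]/Km)
ki_nM, atp_nM, km_atp_nM = 10.0, 1.0e6, 2.0e4      # hypothetical values (1 mM ATP, 20 uM Km)
ic50_app_nM = ki_nM * (1.0 + atp_nM / km_atp_nM)
print(ic50_app_nM)  # 510 nM: apparent potency is ~50-fold weaker at cellular ATP
```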

I’m certainly not suggesting that chemical probes be used at excessive concentrations but if you’re going to criticise other scientists for exceeding concentration thresholds then, at the very least, you do need to show that the threshold values have been derived in an objective and transparent manner. My view is that it would not be valid to criticise studies publicly (or in peer review of submitted manuscripts) simply because the studies do not comply with recommendations made by the Chemical Probes Portal. It is significant that the recommendations from different groups of chemical probe experts with respect to the maximum concentration at which UNC1999 should be used differ by almost an order of magnitude:

As the recommended maximal in-cell concentration for UNC1999 varies between the Chemical Probes Portal and the Structural Genomics Consortium sites (400 nM and 3 μM, respectively), we analysed compliance with both concentrations.

One of the eight chemical probes featured in S2023 is THZ1 which is reported to bind covalently to CDK7; the electrophilic warhead is acrylamide-based, suggesting that binding is irreversible. Chemical probes that form covalent bonds with their targets irreversibly need to be considered differently to chemical probes that engage their targets reversibly (see this article). Specifically, the degree of target engagement by a chemical probe that binds irreversibly depends on time as well as concentration (if you wait long enough then you’ll achieve 100% inhibition). This means that it’s not generally possible to quantify selectivity or to set concentration thresholds objectively for chemical probes that bind to their targets irreversibly. It’s not clear (at least to me) why an irreversible covalent inhibitor such as THZ1 was included as one of the eight chemical probes covered by the S2023 study so I checked to see what the Chemical Probes Portal had to say about THZ1 and something doesn’t look quite right. The on-target potency is given as a Kd (dissociation constant, a measure of affinity) value of 3.2 nM and the potency assay is described as “time-dependent binding established supporting covalent mechanism”. However, Kd is an equilibrium measure of affinity (and therefore not time-dependent) and my understanding is that it is generally difficult to measure Kd for irreversible covalent inhibitors, which are typically characterized by kinact (inactivation rate constant) and Ki (inhibition constant) values obtained from analysis of enzyme inhibition data. The off-target potency of THZ1 is summarized as “KiNativ profiling against 246 kinases in Loucy cells was performed showing >75% inhibition at 1 uM of: MLK3, PIP4K2C, JNK1, JNK2, JNK3, MER, TBK1, IGF1R, NEK9, PCTAIRE2, and TBK1, but in vitro binding to off-target kinases was not time dependent indicating that inhibition was not via a covalent mechanism”. The results from the assays used to measure on-target and off-target potency for THZ1 do not appear to be directly comparable.
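
A quick numerical sketch (with hypothetical kinact and Ki values of my own choosing) shows why concentration thresholds don’t map cleanly onto irreversible covalent inhibitors: the fraction of target inactivated keeps climbing with incubation time even at a concentration well below Ki.

```python
# Sketch with hypothetical kinact/Ki values: for an irreversible covalent
# inhibitor the extent of target inactivation depends on time as well as
# concentration, so a single potency number or concentration cap is ill-defined.
import math

def fraction_inactivated(conc_nM, t_s, kinact_per_s, ki_nM):
    k_obs = kinact_per_s * conc_nM / (ki_nM + conc_nM)  # pseudo-first-order rate
    return 1.0 - math.exp(-k_obs * t_s)

kinact_per_s, ki_nM = 1.0e-3, 100.0   # hypothetical values
for t_min in (10, 60, 360):
    frac = fraction_inactivated(conc_nM=10.0, t_s=t_min * 60,
                                kinact_per_s=kinact_per_s, ki_nM=ki_nM)
    print(t_min, round(frac, 2))      # ~0.05, ~0.28 and ~0.86 of target inactivated
```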

It’s now time to wrap up and I suggest that it would not be valid to criticise (either publicly or in peer review) a study simply on the grounds that it reported results of experiments in which a chemical probe was used at a concentration exceeding a recommended maximum value. The S2023 authors assert that an additional orthogonal target-engaging probe can be substituted for a matched target-inactive control compound but this appears to contradict criteria for classical modulators given by the Chemical Probes Portal.

Wednesday 27 September 2023

Five days in Vermont

A couple of months ago I enjoyed a visit to the US (my first for eight years) on which I caught up with old friends before and after a few days in Vermont (where a trip to the golf course can rapidly become a National Geographic Moment). One highlight of the trip was randomly meeting my friend and fellow blogger Ash Jogalekar for the first time in real life (we’ve actually known each other for about fifteen years) on the Boston T Red Line.  Following a couple of nights in green and leafy Belmont, I headed for the Flatlands with an old friend from my days in Minnesota for a Larry Miller group reunion outside Chicago before delivering a short harangue on polarity at Ripon College in Wisconsin. After the harangues, we enjoyed a number of most excellent Spotted Cattle (Only in Wisconsin) in Ripon. I discovered later that one of my Instagram friends is originally from nearby Green Lake and had taken classes at Ripon College while in high school. It is indeed a small world.

The five days spent discussing computer-aided drug design (CADD) in Vermont are what I’ll be covering in this post and I think it’s worth saying something about what drugs need to do in order to function safely. First, drugs need to have significant effects on therapeutic targets without having significant effects on anti-targets such as hERG or CYPs and, given the interest in new modalities, I’ll say “effects” rather than “affinity”, although Paul Ehrlich would have reminded us that drugs need to bind in order to exert effects. Second, drugs need to get to their targets at sufficiently high concentrations for their effects to be therapeutically significant (drug discovery scientists use the term ‘exposure’ when discussing drug concentration). Although it is sometimes believed that successful drugs simply reduce the numbers of patients suffering from symptoms, it has been known from the days of Paracelsus that it is actually the dose that differentiates a drug from a poison.

Drug design is often said to be multi-objective in nature although the objectives are perhaps not as numerous as many believe (this point is discussed in the introduction section of NoLE, an article that I'd recommend to insomniacs everywhere). The first objective of drug design can be stated in terms of minimization of the concentration at which a therapeutically useful effect on the target is observed (this is usually the easiest objective to define since drug design is typically directed at specific targets). The second objective of drug design can be stated in analogous terms as maximization of the concentration at which toxic effects on the anti-targets are observed (this is a more difficult objective to define because we generally know less about the anti-targets than about the targets). The third objective of drug design is to achieve controllability of exposure (this is typically the most difficult objective to define because drug concentration is a dose-dependent, spatiotemporal quantity and intracellular concentration cannot generally be measured for drugs in vivo). Drug discovery scientists, especially those with backgrounds in computational chemistry and cheminformatics, don’t always appreciate the importance of controlling exposure and the uncertainty in intracellular concentration always makes for a good stock question for speakers and panels of experts.

I posted previously on artificial intelligence (AI) in drug design and I think it’s worth highlighting a couple of common misconceptions. The first misconception is that we just need to collect enough data and the drugs will magically condense out of the data cloud that has been generated (this belief appears to have a number of adherents in Silicon Valley). The second misconception is that drug design is merely an exercise in prediction when it should really be seen in a Design of Experiments framework. It’s also worth noting that genuinely categorical data are rare in drug design and my view is that many (most?) "global" machine learning (ML) models are actually ensembles of local models (this heretical view was expressed in a 2009 article and we were making the point that what appears to be an interpolation may actually be an extrapolation). Increasingly, ML is coming to be seen as a panacea and it’s worth asking why quantitative structure activity relationship (QSAR) approaches never really made much of a splash in drug discovery.

I enjoyed catching up with old friends [ D | K | S | R/J | P/M ] as well as making some new ones [ G | B/R | L ]. However, I was disappointed that my beloved Onkel Hugo was not in attendance (I continue to be inspired by Onkel’s laser-like focus on the hydrogen bonding of the ester) and I hope that Onkel has finally forgiven me for asking (in 2008) if Austria was in Bavaria. There were many young people at the gathering in Vermont and their enthusiasm made me greatly optimistic for the future of CADD (I’m getting to the age at which it’s a relief not to be greeted with: "How nice to see you, I thought you were dead!"). Lots of energy at the posters (I learned from one that Voronoi was Ukrainian) although, if we’d been in Moscow, I’d have declined the refreshments and asked for a room on the ground floor (left photo below).  Nevertheless, the bed that folded into the wall (centre and right photos below) provided plenty of potential for hotel room misadventure without the ‘helping hands’ of NKVD personnel.

It'd been four years since CADD had been discussed at this level in Vermont so it was no surprise to see COVID-19 on the agenda. The COVID-19 pandemic led to some very interesting developments including the Covid Moonshot (a very different way of doing drug discovery and one I was happy to contribute to during my 19-month sojourn in Trinidad) and, more tangibly, Nirmatrelvir (an antiviral medicine that has been used to treat COVID-19 infections since early 2022). Looking at the molecular structure of Nirmatrelvir you might have mistaken the trifluoroacetyl for a protecting group but it’s actually an important feature (it appears to be beneficial from the permeability perspective). My view is that the alkane/water logP (alkane is a better model than octanol for the hydrocarbon core of a lipid bilayer) for a trifluoroacetamide is likely to be a couple of log units greater than for the corresponding acetamide.



I’ll take you through in some detail how the alkane/water logP difference between a trifluoroacetamide and the corresponding acetamide can be estimated because I think this has some relevance to using AI in drug discovery (I tend to approach pKa prediction in an analogous manner). Rather than trying to build an ML model for making the prediction, I’ve simply made connections between measurements for three different physicochemical properties (alkane/water logP, hydrogen bond basicity and hydrogen bond acidity), which is something that could easily be accommodated within an AI framework. I should stress that this approach can only be used because it is a difference in alkane/water logP (as opposed to absolute values) that is being predicted and these physicochemical properties can plausibly be linked to substructures.

Let’s take a look at the triptych below which, I admit, is not quite up to the standards of Hieronymus Bosch (although I hope that you find it to be a little less disturbing). The first panel shows values of polarity (q) for some hydrogen bond acceptors and donors (you can find these in Tables 2 and 3 in K2022) that have been derived from alkane/water logP measurements. You could, for example, use these polarity values to predict that reducing the polarity of an amide carbonyl oxygen to the extent that it looks like a ketone will lead to a 2.2 log unit increase in alkane/water logP. The second panel shows measured hydrogen bond basicity values for three hydrogen bond acceptors (you can find these in this freely available dataset) and the values indicate that a trifluoroacetamide is an even weaker hydrogen bond acceptor than a ketone. Assuming a linear relationship between polarity and hydrogen bond basicity, we can estimate that the trifluoroacetamide carbonyl oxygen is 2.4 log units less polar than that of the corresponding acetamide. The final panel shows measured hydrogen bond acidity values (you can find these in Table 1 of K2022) that suggest that an imide NH (q = 1.3; 0.5 log units more polar than a typical amide NH) will be slightly more polar than the trifluoroacetamide NH of Nirmatrelvir. So to estimate the difference in alkane/water logP values you just need to subtract the additional polarity of the trifluoroacetamide NH (0.5 log units) from the reduced polarity of the trifluoroacetamide carbonyl oxygen (2.4 log units) to get 1.9 log units.
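
For those who prefer to see the arithmetic laid out explicitly, here’s the estimate as a trivial script (the polarity values are simply the ones quoted above from K2022 and the linked hydrogen bond basicity/acidity data).

```python
# The back-of-envelope estimate from the paragraph above, using the quoted values.
delta_q_carbonyl_O = 2.4  # trifluoroacetamide C=O estimated to be 2.4 log units less polar than acetamide C=O
delta_q_NH = 0.5          # trifluoroacetamide NH taken to be ~0.5 log units more polar than a typical amide NH

delta_logP_alkane_water = delta_q_carbonyl_O - delta_q_NH
print(delta_logP_alkane_water)  # 1.9 log unit increase in alkane/water logP predicted for the trifluoroacetamide
```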


Chemical space is a recurring theme in drug design and its vastness, which defies human comprehension, has inspired much navel-gazing over the years (it’s actually tangible chemical space that’s relevant to drug design). In drug discovery we need to be able to navigate chemical space (ideally without having to ingest huge quantities of Spice) and, given that Ukrainian chemists have revolutionized the world's idea of tangible chemical space (and have also made it a whole lot larger), it is most appropriate to have a Ukrainian guide who is most ably assisted by a trusty Transylvanian sidekick. I see benefits from considering molecular complexity more explicitly when mapping chemical space. 
   
AI (as its evangelists keep telling us) is quite simply awesome at generating novel molecular structures although, as noted in a previous post, there’s a little bit more to drug design than simply generating novel molecular structures. Once you’ve generated a novel molecular structure you need to decide whether or not to synthesize the compound and, in AI-based drug design, molecular structures are often assessed using ML models for biological activity as well as absorption, distribution, metabolism and excretion (ADME) behaviour. It’s well known that you need a lot of data for training these ML models, but you also need to check that the compounds for which you’re making predictions lie within the chemical space occupied by the training set (one way to do this is to ensure that close structural analogs of these compounds exist in the training set, as in the sketch below) because you can’t be sure that the big data necessarily cover the regions of chemical space of interest to the drug designers using the models. A panel discussed the pressing requirement for more data although ML modellers do need to be aware that there’s a huge difference between assembling data sets for benchmarking and covering chemical space at sufficiently high resolution to enable accurate prediction for arbitrary compounds.
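
Here’s a minimal sketch of the kind of nearest-neighbour check I have in mind (my own illustration; the fingerprint choice, the toy structures and the 0.4 similarity threshold are all assumptions rather than recommendations).

```python
# Minimal applicability-domain sketch: flag a prediction as an extrapolation if
# the query has no close structural analog in the training set. The training
# SMILES, query SMILES and 0.4 threshold are illustrative only.
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem

def fingerprint(smiles):
    return AllChem.GetMorganFingerprintAsBitVect(Chem.MolFromSmiles(smiles), 2, nBits=2048)

training_fps = [fingerprint(s) for s in ("CCOc1ccccc1", "CCN(CC)CC", "c1ccncc1")]
query_fp = fingerprint("Clc1ccccc1C(N)=O")

nearest = max(DataStructs.TanimotoSimilarity(query_fp, fp) for fp in training_fps)
if nearest < 0.4:
    print(f"nearest-neighbour similarity {nearest:.2f}: treat the prediction as an extrapolation")
```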

There are other ways to think about chemical space. For example, differences in biological activity and ADME-related properties can also be seen in terms of structural relationships between compounds. These structural relationships can be defined in terms of molecular similarity (e.g., the Tanimoto coefficient for the molecular fingerprints of X and Y is 0.9) or substructure (e.g., X is the 3-chloro analog of Y). Many medicinal chemists think about structure-activity relationships (SARs) and structure-property relationships (SPRs) in terms of matched molecular pairs (MMPs: pairs of molecular structures that are linked by specific substructural relationships) and free energy perturbation (FEP) can also be seen in this framework. Strong nonadditivity and activity cliffs (large differences in activity observed for close structural analogs) are of considerable interest as SAR features in their own right and, because their prediction is so challenging, they are very useful for testing ML and physics-based models for biological activity. One reason that drug designers need to be aware of activity cliffs and nonadditivity in their project data is that these SAR features can potentially be exploited for selectivity.
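
As a simple illustration of how these ideas translate into code, the sketch below flags potential activity cliffs as pairs of compounds that are highly similar by fingerprint but differ sharply in activity (the toy data, the 0.6 similarity cut-off and the 2 log unit activity gap are my own assumptions, not community standards).

```python
# Toy sketch: flag potential activity cliffs as structurally similar pairs with
# a large activity difference. Data and thresholds are illustrative only.
from itertools import combinations
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem

data = {                       # hypothetical SMILES -> pIC50
    "c1ccc2c(c1)NCCC2": 5.0,
    "Cc1ccc2c(c1)NCCC2": 7.5,
    "CCN(CC)CC": 4.0,
}
fps = {s: AllChem.GetMorganFingerprintAsBitVect(Chem.MolFromSmiles(s), 2, nBits=2048) for s in data}

for (s1, p1), (s2, p2) in combinations(data.items(), 2):
    similarity = DataStructs.TanimotoSimilarity(fps[s1], fps[s2])
    if similarity > 0.6 and abs(p1 - p2) > 2.0:   # illustrative thresholds
        print(f"possible activity cliff: {s1} / {s2}  similarity={similarity:.2f}  dpIC50={abs(p1 - p2):.1f}")
```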
        
Cheminformatic approaches can also help you to decide how to synthesize the compounds that you (or your AI Overlords) have designed and automated synthetic route planning is a prerequisite for doing drug discovery in ‘self-driving’ laboratories. The key to success in cheminformatics is getting your data properly organized before starting analysis and the Open Reaction Database (ORD), an open-access schema and infrastructure for structuring and sharing organic reaction data, facilitates training of models. One area that I find very exciting is the use of high-throughput experimentation in the search for new synthetic reactions, which can lead to better coverage of unexplored chemical space. It’s well known in industry that process chemists typically synthesize compounds by routes that differ from those used by the medicinal chemists and data-driven multi-objective optimization of catalysts can lead to more efficient manufacturing processes (a higher conversion to the desired product also makes for a cleaner crude product).

It’s now time to wrap up what’s been a long post. Some of what is referred to as AI already appears to be useful in drug discovery (especially in the early stages) although non-AI computational inputs will continue to be significant for the foreseeable future. I see a need for cheminformatic thinking in drug discovery to shift from big data (global ML models) to focused data (generating project-specific data efficiently for building local ML models) and I also see advantages in using atom-based descriptors that are clearly linked to molecular interactions. One issue for data-driven approaches to prediction of biological activity, such as ML and QSAR modelling, is that the need for predictive capability is greatest when there's not much relevant data, and this is a scenario in which physics-based approaches have an advantage. In my view, validation of ML models is not a solved problem since clustering in chemical space can cause validation procedures to make optimistic assessments of model quality. I continue to have significant concerns about how relationships (which are not necessarily linear) between descriptors are handled in ML modelling and remain generally skeptical of claims for interpretability of ML models (as noted in NoLE, the contribution of a protein–ligand contact to affinity is not, in general, an experimental observable).
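
On the validation point, one mitigation is to group structurally related compounds before splitting the data; here’s a minimal sketch of what I mean (the toy SMILES and the choice of Bemis-Murcko scaffolds as the grouping variable are illustrative assumptions on my part, not a prescription).

```python
# Minimal sketch: group compounds by Bemis-Murcko scaffold so that close analogs
# cannot appear on both sides of a validation split and inflate apparent model
# quality. The toy SMILES and the grouping choice are illustrative only.
from rdkit.Chem.Scaffolds import MurckoScaffold
from sklearn.model_selection import GroupKFold

smiles = ["c1ccc2c(c1)NCCC2", "Cc1ccc2c(c1)NCCC2", "CCOc1ccccc1", "CCN(CC)CC"]
scaffolds = [MurckoScaffold.MurckoScaffoldSmiles(s) for s in smiles]

for train_idx, test_idx in GroupKFold(n_splits=2).split(smiles, groups=scaffolds):
    print("train:", list(train_idx), "test:", list(test_idx))  # scaffold-mates stay together
```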

Many thanks for staying with me to the end and I hope to see many of you at EuroQSAR in Barcelona next year. I'll leave you with a memory from the early days of chemical space navigation.