Monday 20 May 2024

A time and place for Nature in drug discovery?

I’ll be reviewing Y2022 (The Time and Place for Nature in Drug Discovery) in this post and stating my position on natural products in modern drug discovery is a good place to start. I certainly see value in screening natural products and natural product-like compounds (especially in phenotypic assays) and there is currently a great deal of interest in chemical probes (I’ll point you toward an article on the Target 2035 initiative and a link to the Chemical Probes Portal). In general, a natural product or natural product-like active identified by screening would either need to exhibit novel phenotypic effects or be significantly more potent than other known actives for me to be enthusiastic about following it up. I would certainly consider screening fragments that are only present in natural product structures although these would still need to comply with the criteria (typically defined in terms of properties such as molecular size, molecular complexity and lipophilicity) used to select fragments. I see significant benefits coming from the increased use of biocatalysis, both in drug discovery and for manufacturing drugs, but I don’t see these benefits as being restricted to synthesis of natural products or natural product-like compounds.

This will be a very long post (for which I make no apology) and it's a good point to say something about how the review is presented. I've used the same section headings (in bold text) as Y2022 for my commentary and quoted text has been indented (my comments on the quoted text are enclosed in square brackets and italicized in red). I'd like to raise four general points before starting my review:

  1. Proprietary data cannot accurately be described as “facts” or “evidence” and it’s not valid to claim that you’ve proven or demonstrated something on the basis of analysis of proprietary data.  
  2. If continuous data such as oral bioavailability measurements have been made categorical (e.g., high | medium | low) prior to analysis then it’s generally a safe assumption that any trends "revealed" by the analysis are weak.  
  3. If basing claims on analysis of locations or distributions within a particular chemical space it is necessary to demonstrate that the chemical space is actually relevant to the claims being made. One way to do this is to build usefully predictive models of relevant quantities such as aqueous solubility or permeability using only the dimensions of the chemical space as descriptors (see the sketch after this list).  
  4. There are generally many ways to partition a region of chemical space into subregions with different average values for a measured quantity. Although the boundaries resulting from these analyses typically appear to be well-defined (for example, as a line or curve in a 2-dimensional chemical space) it is a serious error to automatically interpret such boundaries as meaningful from a physicochemical perspective.
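To make the third point concrete, here's a minimal sketch (Python, with a hypothetical file name and hypothetical column names) of the sort of relevance check that I have in mind: if a chemical space defined by molecular weight and calculated logP is claimed to be relevant to, say, aqueous solubility then those two descriptors alone should support a usefully predictive model.

import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

# Hypothetical data set of measured aqueous solubility (logS) together with
# the two descriptors that define the chemical space in question.
df = pd.read_csv("measured_solubility.csv")
X = df[["MW", "clogP"]]   # dimensions of the chemical space as sole descriptors
y = df["logS"]

# A cross-validated R2 close to zero would indicate that location in this
# chemical space tells you little about solubility, undermining any claim
# that rests on where compounds sit in the space.
scores = cross_val_score(LinearRegression(), X, y, cv=5, scoring="r2")
print(f"cross-validated R2: {scores.mean():.2f} +/- {scores.std():.2f}")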

I have a number of concerns about the Y2022 article and I’ll focus on the more serious of these in this post. I’ll also be commenting on the Rule of 5 (Ro5; see L1997), logP/logD differences, and the drug discovery “sweet spot” reported in the HK2012 article.  My view is that a number of the assertions and recommendations made by the authors of Y2022 are not supported by the analyses or the data that they’ve presented. Specifically, the authors present results of analyses that had been performed using proprietary and undocumented models and, in my view, they have grossly over-interpreted the predictions made using the models.  At times, the authors appear to be treating natural products as if these occupy a distinct and contiguous region of chemical space (this is a pitfall into which drug-likeness advocates also frequently stumble).  The authors of Y2022 discuss physicochemical properties at considerable length without making any convincing connection between this discussion and natural products. Reading the Y2022 article, I did detect a subliminal message that natural products might be infused with vital force and wouldn’t have been surprised to see Gwyneth Paltrow as a co-author.

I’ll make some general observations before examining Y2022 in detail. If you’re going to base decisions on trends in data then you need to know how strong the trends are because this tells you how much weight to give to the trends when making your decisions. In what I’ll call the ‘compound quality’ field you’ll often encounter data presentations that make it extremely difficult to see how strong (or weak) the trends in the data actually are (see KM2013: Inflation of correlation in the pursuit of drug-likeness). Since Ro5 was introduced in 1997 (see L1997) there has been a free flow of advice from self-appointed compound quality gurus as to how compounds can be made better, more developable and more beautiful (introduction of the term “Ro5 envy” in KM2013 appeared to cause some to spit feathers). This advice frequently comes in the form of dire warnings that exceeding a threshold value of a property, such as molecular weight or predicted octanol/water partition coefficient, will increase the probability of something bad happening. It’s actually very difficult to set thresholds like these objectively and you have to consider the possibility that some of these statements of probability are merely expressions of belief (to some “there is a high probability that God exists” will sound rather more convincing than “I believe in God”).

The graphical abstract is a good place to start my review of Y2022. I don’t know whether biotransformations exist that would convert the Core Scaffold into compounds that would match the Bios Collection generalized structure but a 1,3-diene in conjugation with a tertiary nitrogen is not the sort of substructure that I would want to see in a screening active that I had been charged with optimizing.  

Abstract

The authors of Y2022 state:  

The declining natural product-likeness of licensed drugs and the consequent physicochemical implications of this trend in the context of current practices are noted. [The authors do not make a convincing connection between natural product-likeness and physicochemical properties.]  To arrest these trends, the logic of seeking new bioactive agents with enhanced natural mimicry is considered; notably that molecules constructed by proteins (enzymes) are more likely to interact with other proteins (e.g., targets and transporters), a notion validated by natural products. [I consider this claim to be extravagant and it does need to be supported by evidence. The authors’ use of “validated” reminded me of the extravagant claim made in a Future Medicinal Chemistry editorial that “ligand efficiency validated fragment-based design”. Taking the statement literally, the authors appear to be suggesting that a compound would be more likely to interact with proteins if it had been isolated from natural sources than if it had been synthesized in a laboratory (I was reminded of the "water memory" explanation for why homeopathy works). If “molecules constructed by proteins” really are more likely to interact with other proteins then they’re also more likely to interact with anti-targets like hERG and CYPs. I’m guessing that the response of medicinal chemistry teams tackling CNS targets to suggestions that they should make their compounds more like natural products so as to increase the likelihood of recognition by transporters might be to ask which natural products those offering the advice had been smoking.]

Introduction

The authors show time-dependence for the values of a number of parameters calculated for drugs in Figure 1. I see analyses like these as exercises in philately and, when I first encountered examples about two decades ago, I formed a view that some senior medicinal chemists had a bit too much time on their hands. The observation of significant time-dependency for a parameter calculated for drugs can mean one of three things. First, the parameter is irrelevant to drug discovery (however, the absence of a time-dependence shouldn't be taken as evidence that the parameter is relevant to drug discovery). Second, the old ways were best and the medicinal chemists of today have lost their way (I’m guessing this might be Jacob Rees-Mogg’s interpretation if he were a medicinal chemist). Third, the old ways no longer work so well and the medicinal chemists of today have learned new ways.

I have a number of concerns about what is shown in Figure 1 (quite aside from these concerns I would question why Figures 1b or 1c were even included in the study). The data values that have been plotted are actually mean values and, as we observed in KM2013, the presentation of mean (or median) values without showing measures of the spread in the data, such as standard deviation or inter-quartile range, makes trends look stronger than they actually are (others use the term “voodoo correlations”). This way of presenting data is specifically verboten by J Med Chem and the Author Guidelines (viewed 18-May-2024) for that journal state:

If average values are reported from computational analysis, their variance must be documented. This can be accomplished by providing the number of times calculations have been repeated, mean values, and standard deviations (or standard errors). Alternatively, median values and percentile ranges can be provided. Data might also be summarized in scatter plots or box plots.

However, the hidden variation in the response variables is not the only issue that I have with Figure 1. Let’s take a look at Figure 1a which shows “a temporal comparison of natural product likeness of approved drugs assessed by the Natural Product Scout algorithm (12) versus the year of the first disclosure of the drug” although the caption for Figure 1a is “Natural product class probability. (8)”. I think that the authors do need to explain exactly what they mean by natural product class probability because the true probability that a compound is a natural product is either 1 (it’s a natural product) or 0 (it’s not a natural product). Put another way, there are differences between natural products and Prof. Schrödinger’s unfortunate feline companion. The measure of lipophilicity shown in Figure 1c is XLogP3 although no justification is given for the selection of this particular method for lipophilicity prediction nor is any reference provided.

Before continuing with my review of Y2022 I also need to examine Ro5 and discuss the difference between logP and logD (the reasons for these digressions will hopefully become clear later). Ro5 was based on physicochemical property distributions for compounds that had been taken into phase 2 of clinical development before 1997 (the year that L1997 was published). My view is that Ro5 certainly raised awareness of the problems associated with excessive lipophilicity and molecular size (A Good Thing) but I’ve never considered Ro5 to be useful in design. Although Ro5 is accepted by many (most?) drug discovery scientists as an article of faith, some are prepared to ask awkward questions and I’ll mention the S2019 study. Let’s take a look at how Ro5 was specified in the L1997 article (the graphic is slide #17 from a presentation that I gave late last year):


Ro5 is stated in terms of likelihood of poor absorption or permeation although no measured oral absorption or permeability data are given in the L1997 study and Ro5 should therefore be regarded as a statement of belief. I realise that to make such an assertion runs the risk of an appointment with the auto-da-fé and I stress that had Ro5 been stated in terms of physicochemical and molecular property distributions I would not have made the assertion.

Medieval cartographers annotated the unknown regions of their maps with “here be dragons” and Ro5’s dragons are poor absorption and poor permeation. However, there's another issue which I touched on in HBD3:

It is significant that attempts to build global models for permeability and solubility, using only the dimensions of the chemical space in which the Ro5 is specified as descriptors, do not appear to have been successful.

What I was getting at in HBD3 is that the chemical space in which Ro5 is specified was not demonstrated to be relevant to permeability or solubility (this relates to the third of the four points that I raised at the start of the post). It must be stressed that I'm definitely not denying that relationships exist between descriptors, such as logP, used to specify Ro5 and properties such as aqueous solubility and permeability that are more directly relevant to getting drugs to where they need to be. It’s just that these relationships are weak (see TY2020) and, while we don’t know exactly how weak the relationships are, we do know that they are weak because continuous data have been binned to display them (see also KM2013 and specifically the comments on HY2010). I would generally anticipate that these relationships will be stronger within structural series but in these cases you’ll generally observe different relationships for different structural series. In practical terms this means that a logP of 5 might be manageable in one structural series while in another structural series compounds with logP greater than 3.5 prove to be inadequately soluble. As I advised in NoLE:

Drug designers should not automatically assume that conclusions drawn from analysis of large, structurally-diverse data sets are necessarily relevant to the specific drug design projects on which they are working.
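To make the within-series point concrete, here's a minimal sketch (again with a hypothetical file name and hypothetical column names) showing how a pooled correlation between logP and solubility for a structurally diverse data set can differ from the correlations observed within the individual structural series:

import pandas as pd

# Hypothetical project data with a structural series label for each compound.
df = pd.read_csv("project_data.csv")

# Pooled (structurally diverse) correlation between logP and solubility.
print(df["logP"].corr(df["logS"]))

# Within-series correlations, which will typically differ from the pooled
# value and from one another.
print(df.groupby("series").apply(lambda g: g["logP"].corr(g["logS"])))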

I also need to discuss the distinction between logP and logD since this is a source of confusion for medicinal chemists and compound quality 'experts' alike. Here’s a graphic (it’s slide #18) from the presentation that I did at SancaMedChem in 2019 (if the piranhas did venture into the non-polar phase they'd probably end up swimming backstroke):


The partition coefficient (P) is simply the ratio of the concentration of the neutral form of the compound in the organic phase (usually octanol) to the concentration of the compound in water when both phases are in equilibrium. The distribution coefficient (D) is defined analogously as the ratio of the sum of concentrations of all forms of the compound in the organic phase to the sum of concentrations of all forms of the compound in water. Values of P and D are usually quoted as their logarithms logP and logD. When interpreting logD values it is commonly assumed that only the neutral forms of compounds partition into organic phases and, if we make this assumption, the relationship between logD and logP is given by Eqn 1 (see B2017):

logD = logP − log10[1 + 10^(pH − pKa)]   (monoprotic acids)     [Eqn 1]
logD = logP − log10[1 + 10^(pKa − pH)]   (monoprotic bases)
When we perform experiments to quantify lipophilicity it is actually logD that is measured. Values of logP and logD are identical when ionization can be neglected and logP values for ionizable compounds can be obtained by examination of measured logD-pH profiles although this is rarely done. It’s usually a safe assumption that logP values used by drug discovery scientists (and quoted in medicinal chemistry publications) have been predicted and these values vary with the method used for prediction of logP. For example, L1997 states that the upper logP limit for Ro5 is 5 when logP is calculated using the ClogP method (see L1993) but 4.15 when logP is calculated using the method of Moriguchi et al. (see M1992). Values of logD that you encounter in the literature may have been calculated or measured (you might need to dig around to see if you’re dealing with real data) and it’s also important to remember that logD depends on pH. I would argue that logD is less appropriate than logP for defining compound quality metrics because excessive lipophilicity can be countered simply by increasing the extent to which compounds are ionized (I hope you can see why that would be A Bad Thing). Another way to think about this is to consider an amine with a pKa value of 8 bound to hERG at a pH value of 7. Now suppose that you can change the pKa of the amine to 11 while keeping everything else in the molecular structure constant. What effects would you expect this pKa change to have on affinity, on logD and on logP?
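Here's a minimal sketch of the thought experiment (the logP value of 3 is purely illustrative) that applies Eqn 1 to a monoprotic base:

from math import log10

def logd_monoprotic_base(logp, pka, ph):
    # logD for a monoprotic base, assuming that only the neutral form
    # partitions into the organic phase (Eqn 1).
    return logp - log10(1 + 10 ** (pka - ph))

# Raising the pKa of the amine from 8 to 11 leaves logP unchanged but
# lowers logD at pH 7 because a larger fraction of the compound is ionized.
for pka in (8.0, 11.0):
    print(pka, round(logd_monoprotic_base(logp=3.0, pka=pka, ph=7.0), 2))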

I’ll now get back to reviewing Y2022 and let’s take a look at Figure 2 which shows an adapted version of the "drug discovery sweet spot” proposed in the HK2012 study. As with Figures 1b and 1c, I would question why Figure 2 was included in the Y2022 study since the connection with natural products is tenuous. In my view the authors of the HK2012 study made a number of serious errors in their definition of the “sweet spot” and these errors have been reproduced in the Y2022 study. The authors of HK2012 claimed to have identified a “drug discovery sweet spot” in a chemical space defined by “Log P” and “Molecular mass” but they didn’t actually demonstrate that this chemical space is relevant to drug discovery (one way to demonstrate relevance is to build convincing global models for prediction of properties like permeability and aqueous solubility using only the dimensions of the chemical space as descriptors).

If claiming to have identified a drug discovery “sweet spot” it’s important that each dimension of the chemical space in which the “sweet spot” is defined corresponds to a single entity. While “Molecular mass” is unambiguous the term “Log P” does not refer to the same entity for each of the data sets from which the “sweet spot” has been derived. As noted previously ClogP (see L1993) was used to specify Ro5 while the Gleeson upper Log P limit (see G2008) and the “μM potency Log P” (see G2011) were specified respectively by values of clogP (calculated logP from ACD) and AlogP (no reference provided). In contrast the Pfizer Golden Triangle (see J2009) is specified using elogD (proprietary logD prediction method for which details were not provided).  The Waring low and high logP/logD values stated in W2010 are at least partly based on analysis of AZlogD7.4 values (proprietary logD prediction method; details not provided) reported in the WJ2007 and W2009 studies. The W2010 study states that “the optimal range of lipophilicity lies between ~ 1 and 3” but these are not the values that are depicted in Figure 2 (or indeed in the original HK2012 study). The Gleeson upper limits for Log P and Molecular Mass stated in G2008 reflect the arbitrary schemes used to bin the data and should not be regarded as objectively-determined limits for these quantities. The authors of Y2022 have superimposed ellipses for "SHMs", "Antibiotic Space?" and "bRo5 /  AbbVie MPS space for higher MW" on the HK2012 "sweet spot" in the creation of Figure 2 although it is not clear how these ellipses were constructed.

The Physicochemical Characteristics of Drugs

The authors assert:

A principle advocated by Hansch that drug molecules should be made as hydrophilic as possible without loss of efficacy (47) is commonly expressed and utilized as Lipophilic Ligand Efficiency (LLE). (48) [If actually using this principle advocated by Hansch you would optimize leads by varying hydrophilicity and observing efficacy. While LLE is one way to express Hansch’s principle it is by no means the only way and (pIC50 – 0.5 × logP) would be equally acceptable as a lipophilic ligand efficiency metric from the perspective of Hansch’s principle.] This metric, widely accepted and exploited in drug discovery as a key metric in optimization, is expressed on a log scale as activity (e.g., −log10[XC50]) [The logarithm function is not defined for dimensioned quantities such as XC50 (see M2011) and, while it may appear to be nitpicking to point it out, this is the source of the invalidity of the ligand efficiency metric as was discussed at length in NoLE.] minus a lipophilicity term (typically the Partition coefficient or log10 P or sometimes log D7.4). (49) [Although it is common to see LLE values quoted in the drug discovery literature it’s much less clear how (or even whether) the metric was actually used to make project decisions. In many studies, however, the focus is on plots of pIC50 against logP (or logD) rather than values of the metric itself. In lead optimization, medicinal chemists typically need to balance activity against properties such as permeability, aqueous solubility, metabolic stability and off-target activity. In these situations, experienced medicinal chemists typically give much more weight to structure-activity relationships (SARs) and structure-property relationships (SPRs) that they've observed within the structural series that they're optimising than to crude metrics of questionable relevance and predictivity. It is noteworthy that the authors of ref 49 use logD rather than logP to define LLE (which they call LiPE) and if you do this then you can make compounds more efficient simply by increasing the extent to which they are ionized.] The impact of lipophilicity on efficacy needs to be considered in the context that reducing lipophilicity (equating to increasing hydrophilicity) will generally increase the solubility, reduce the metabolism, and reduce the promiscuity of a given compound in a series. (50) [The relationships between these properties and lipophilicity shown in ref 50 are for structurally diverse data sets rather than for individual series. I consider the activity criterion (pIC50 > 5) used to quantify promiscuity in ref 50 to be at least an order of magnitude too permissive to be pharmaceutically relevant.]
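As a minimal illustration of the scaling point (the numbers are purely illustrative), two compounds with identical LLE values need not be ranked identically by the alternative metric:

def lle(pic50, logp):
    # Conventional lipophilic ligand efficiency: pIC50 - logP.
    return pic50 - logp

def lle_half(pic50, logp):
    # An equally Hansch-compatible alternative: pIC50 - 0.5 * logP.
    return pic50 - 0.5 * logp

# Both compounds have LLE = 5 but they differ by the alternative metric, so
# neither scaling of lipophilicity is uniquely privileged by the principle.
print(lle(8.0, 3.0), lle_half(8.0, 3.0))   # 5.0 6.5
print(lle(7.0, 2.0), lle_half(7.0, 2.0))   # 5.0 6.0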

Let’s take a look at Figure 3 in which values of “Calc Chrom Log D7.4” are plotted against “CMR”. This is what the authors say about the figure in the text of Y2022:

The distribution of marketed oral drugs in terms of their lipophilicity and size, shows a remarkably similar distribution to the set of compounds designed by Kell as a representative set of natural products to investigate carrier mechanisms (Figure 3). (64) [To state “shows a remarkably similar distribution” is arm-waving given that there are methods for assessing the similarity of two distributions in an objective manner.]

As is the case for Figure 1a, what is written in the text about Figure 3 differs significantly from the caption for this figure:

Figure 3. Natural products are found across most size lipophilicity combinations, as exemplified in a representative set designed and compiled by O’Hagan and Kell (64) superimposed on the Chrom log D7.4 vs cmr training set of compounds with >30% bioavailability. (51) [It is unclear why this training set was restricted to compounds with >30% bioavailability.  The LDF is shown in this figure with “Limits of confidence” but the level of confidence to which these limits correspond is not given.]

The first criticism that I’ll make is that the authors of Y2022 have not actually demonstrated the relevance of the chemical space specified by the axes of Figure 3 (this is the essence of the third of the four points that I raised at the start of the post and the same criticism can be made of Figure 4 and Figure 5). The authors note, with some arm-waving, that cmr “largely correlates with MW” which does rather raise the question of why they consider this particular measure of molecular size to be superior to MW for this type of analysis. The authors claim that “the GSK model based on log D7.4 vs calculated molar refraction” (it is actually molar refractivity as opposed to molar refraction that was calculated) is a useful guide to predict oral exposure. I consider this claim to be extravagant because one would need to have access to the proprietary model for calculation of Chrom Log D7.4 in order to use the model. The proprietary nature of the GSK model means that predictions made using this model cannot credibly be presented as “evidence”.

Details of the models for calculating Chrom Log D7.4 and for prediction of oral exposure are sketchy and I regard each of these proprietary models as undocumented. A linear discriminant function (LDF) model was reportedly used for prediction of oral exposure but it is unclear how the model was trained (or if it was even validated). An LDF is a classification model and it is not clear how the classes were defined for prediction of oral exposure. I’m assuming that the oral absorption classes used in the GSK oral exposure model have been defined by categorization of continuous data (I’m happy to be corrected on this point but, given that the details are sketchy, I can be forgiven for speculation) and setting thresholds like these is difficult to achieve in an objective manner. If this was indeed the case I'd assume that the threshold value used to categorize the continuous data was arbitrary (you’ll get a different LDF model if you use a different threshold to define the classes; see the sketch below). My view is that an LDF is an inappropriate way to model this type of data because the categorization of the data discards a huge amount of information.
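Here's a minimal sketch (synthetic data, so purely illustrative) of the threshold-dependence point:

import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))   # two descriptors standing in for Chrom Log D7.4 and cmr
# A continuous response (standing in for oral exposure) with added noise.
exposure = X @ np.array([1.0, -0.5]) + rng.normal(scale=1.0, size=500)

# Different (arbitrary) thresholds for defining the classes yield different
# discriminant functions, i.e. differently placed boundary lines.
for threshold in (-0.5, 0.0, 0.5):
    y = (exposure > threshold).astype(int)
    lda = LinearDiscriminantAnalysis().fit(X, y)
    print(threshold, lda.coef_.round(2), lda.intercept_.round(2))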

Here's the caption for Figure 4:

Figure 4. Proposed regions of size/lipophilicity space for an oral drug set, (51) using the effectual combination of Chrom Log D7.4 vs calculated molar refraction (cmr) as a description of chemical space. [It’s actually molar refractivity as opposed to molar refraction that was calculated. It is unclear what the authors mean by "bRo5 principles".] The highlighted regions suggest likely absorption mechanisms, based on ref (65) with compounds colored by binned NPScout probability scores. [The authors of Y2022 appear to be using a proprietary and undocumented LDF model of unknown predictivity to infer absorption mechanisms (this is what I was getting at in the fourth of the four points that I raised at the start of the post). The depiction of data shown in Figure 4 would be much more informative had compounds known (as opposed to believed) to be orally absorbed by one of these mechanisms been plotted in this chemical space.] Below the LDF line, the mean NPScout score is 0.45, (median 0.33) and above it (indicative of likely oral exposure) the mean is 0.31 and median 0.17 (p < 0.01) [It is unclear what (p < 0.01) refers to.]

Here's the caption for Figure 5: 

Figure 5. Illustration of antibiotic drug space, expressed as Calculated Chrom Log D7.4 vs cmr adapted from data in ref (65) colored by antibiotics (circles) and TB drugs (diamonds) which are sized by NP class probabilities and colored by prediction of likelihood of oral exposure (either side of the diagonal “linear discriminant function line” so to be oral, transporters a likely mechanism for the red colored compounds, which mostly have a high NPScout score). [As is the case for Figure 4, the authors of Y2022 appear to be using a proprietary and undocumented LDF model of unknown predictivity to infer absorption mechanisms. Stating that "mostly have a high NPScout score" is arm-waving.]  Vertical (cmr < 8) and horizontal lines (Chrom Log D7.4 < 2.5) together represent likely boundaries for paracellular absorption. [The basis (measured data or belief) for this assertion is unclear. The depiction of data shown in Figure 5 would have been more convincing had compounds known to be and known not to be absorbed by the paracellular route been plotted in this chemical space. While the problems of achieving good oral absorption for antibiotics should not be underestimated, I see getting compounds into cells as the bigger issue and in some cases the transporters cause active efflux (see R2021). The depiction of data shown in Figure 5 would have been much more informative had compounds known (as opposed to believed) to exhibit active influx and active efflux been plotted in this chemical space. Although Figure 5 is presented as a description of antibiotic drug space, the study (ref 65) on which Figure 5 is based is actually focused on antitubercular drug space (one of the challenges to discovery of antitubercular drugs is that Mycobacterium tuberculosis is an intracellular pathogen; see WL2012). One article that I recommend to all drug discovery scientists, especially those working on infectious diseases, is the SM2019 review on intracellular drug concentration.]

The authors suggest:

A logical extension of this hypothesis would be to consider recognition processes with natural molecules, which are likely to have discrete interactions with carrier proteins and therapeutic targets. [The authors do need to articulate what they mean by "discrete interactions" and why "natural molecules" are likely to have "discrete interactions" with carrier proteins and therapeutic targets.] Small molecule drugs are noted to be relatively promiscuous, so making interactions with several proteins is a likely event. (76) [This assertion is not supported by ref 76 which is actually a study of nuisance compounds, PAINS filters, and dark chemical matter in a proprietary compound collection. Promiscuity of a compound is typically defined by a count of the number of targets against which activity exceeds a specific threshold and promiscuity generally increases with the permissiveness of the activity threshold (it’s therefore meaningless to describe a compound as “promiscuous” without also stating the activity threshold). The activity threshold for the analysis reported in ref 76 is ≥ 50% inhibition at a concentration of 10 µM which is appropriate if you’re worried about assay interference but, in my view, is at least an order of magnitude too permissive if considering the possibility of off-target activity for a drug in vivo.]  It similarly is logical to consider that a molecule made by a recognition process in a catalytic enzyme may also interact with another protein in a similar manner. (77) [This is not quite as logical as the authors would have us believe since enzymes catalyze reactions by stabilizing transition states. A high binding affinity of an enzyme for its reaction product would generally be expected to result in inhibition of the enzyme by the reaction product.]
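A minimal sketch (the panel profile is hypothetical) of why a promiscuity count is meaningless without the activity threshold used to define it:

def promiscuity(pic50_values, threshold):
    # Number of panel targets hit at or above a pIC50 threshold.
    return sum(1 for p in pic50_values if p >= threshold)

# Hypothetical profiling panel result (pIC50 values) for a single compound:
# the promiscuity count depends strongly on how permissive the threshold is.
profile = [7.2, 6.1, 5.8, 5.3, 5.1, 4.9, 4.6]
for threshold in (5.0, 6.0, 7.0):
    print(threshold, promiscuity(profile, threshold))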

Natural Product Fragments in Fragment-Based Drug Discovery

The authors note:

Fragment-based drug discovery (FBDD) can be employed to rapidly explore large areas of chemical space for starting points of molecular design. (91 | 92 | 93) However, most FBDD libraries are composed of privileged substructures of known synthetic drugs and drug candidates and populate already well-explored areas of chemical space, (94 | 95 | 96) [I do not consider refs 94-96 to support this assertion (none of these three articles has a fragment screening library design focus and the most recent one was published in 2007).] often through the use of fragments with high sp2-character. (97)  Underexplored areas of chemical space can be rapidly explored by employing fragments derived from NPs that are already biologically prevalidated by evolution. [The authors appear to be suggesting that the physiological effects of natural products are more due to the fragments from which they have been constructed than to the way in which the fragments have been combined.]

Molecular recognition

The authors state:

That the embedded recognition of natural products for proteins correlates with recognition of the biosynthetic enzyme is an increasingly validated concept. (118 | 119 | 120) [I have no idea what “embedded recognition” means and I’m guessing that the authors might be in a similar position.] The biosynthetic imprint translates to recognition of other proteins using similar interactions. [As I’ve already noted, high binding affinity of a natural product for the enzyme that catalysed its formation would lead to inhibition of the enzyme.] For example, the analysis of protein structures of 38 biosynthetic enzymes gave 64 potential targets for 25 natural products. (121) [Concepts are usually validated with measured data and not by making predictions.]

Conclusions and Prospects for Future Development

The authors assert:

More natural molecules will increase quality through their inherently improved permeability and solubility; [At the risk of appearing pedantic, permeability and solubility are properties of compounds as opposed to molecules. That said, the authors appear to be treating “natural molecules” as occupying a distinct and contiguous region of chemical space by making this claim and it is unclear what the improvements will be relative to. The authors do not present any measured data for permeability or solubility to support their claim.] this is a case of investing time and effort in the early stages of drug discovery to reap rewards with improvements in the later stages through more predictability in trials (and thus a greater chance of success, where quality rather than speed demonstrably impacts (170)) [Many, including me, do indeed believe that investing time and effort in the early stages of drug discovery increases the chances of success in the later stages. However, I would challenge the assertion by the authors of Y2022 that ref 170 actually demonstrates this.] and more sustainable manufacturing methods driven by the transformative power of biocatalysis. (171)

So that concludes my review of Y2022 and thanks for staying with me. I'll leave you with a photo of me in the office here in Trinidad with my faithful canine companions BB and Coco providing much-needed leadership (a few minutes earlier I had patiently explained to them why ligand efficiency is complete bollocks).

Monday 1 April 2024

Standard states and solution thermodynamics

Readers of this blog know that, on more than one occasion, I have denounced the ligand efficiency metric as physically meaningless on the grounds that perception of efficiency varies with the concentration value that defines the standard state. As I argue in NoLE this is clearly thermodynamic nonsense (Pauli might even have suggested that it wasn’t even wrong) and the equivalent cheminformatic argument is that perception shouldn’t change when you use a different unit to express a quantity.

A change in perception resulting from using a different standard concentration can also be a problem when analysing thermodynamic signatures. One particular absurdity is that binding can be switched from enthalpy-driven to entropy-driven simply by using a different concentration to define the standard state. This statement in the W2014 article unintentionally highlights the issue:

Consequently, we define the dimensionless ratio (ΔH + TΔS)/ΔG as the Enthalpy–Entropy Index (IE–E) and use it here to indicate the enthalpy content of binding. Its advantageous feature is that it is normalised by the free energy ΔG (= ΔH  – TΔS), and so it can be used to compare compounds with millimolar to nanomolar binding affinities during the course of a hit-to-lead optimisation.

I do indeed think that it makes a lot of sense to use (ΔH + TΔS) and ΔG as parameters for exploring thermodynamic signatures. However, the dimensionless ratio of the two quantities is physically meaningless because of its dependence on the concentration used to define the standard state (this dependence stems from the fact that ΔS depends on the standard concentration while ΔH is invariant to change in the standard concentration).
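To spell out the dependence (a sketch of the bookkeeping, writing Kd as a dimensioned quantity and C° for the standard concentration): ΔG° = RT × ln(Kd/C°) and TΔS° = ΔH − ΔG°. Changing the standard concentration from C°(1) to C°(2) shifts ΔG° (and therefore TΔS°) by −RT × ln(C°(2)/C°(1)) while leaving ΔH untouched, so the value of (ΔH + TΔS)/ΔG necessarily changes with the choice of standard state.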

One article that I’ve been particularly critical of in the past is “The role of ligand efficiency metrics in drug discovery” NRDD 13:105-121 (2014) DOI. Specifically, I have expressed concerns about this sentence in Box 1 (Ligand efficiency metrics) of the article:

Assuming standard conditions of aqueous solution at 300K, neutral pH and remaining concentrations of 1M, –2.303RTlog(Kd/C°) approximates to –1.37 × log(Kd) kcal/mol.

I do need to mention a potential source of confusion when analysing Kd values. In biochemistry, biophysics and drug discovery Kd values are conventionally quoted as dimensioned quantities in units of concentration. However, Kd values may also be quoted as dimensionless ratios and, in these cases, the Kd value depends on the concentration used to define the standard state. There seems to be an error in that the approximation appears to eliminate the standard concentration C°, leaving the dimensioned Kd as the sole argument of the logarithm.
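To spell out the arithmetic: at T = 300 K the quantity 2.303 × R × T evaluates to 2.303 × 0.0019872 kcal mol⁻¹ K⁻¹ × 300 K ≈ 1.37 kcal/mol, so −2.303RT × log(Kd/C°) ≈ −1.37 × log(Kd/C°) kcal/mol. The C° cannot simply be dropped because the logarithm is only defined for dimensionless quantities, and writing −1.37 × log(Kd) is only meaningful if Kd is understood to have been divided by C° = 1 M.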

I should say that I’ve always been a bit nervous about denouncing the approximation as an error because the authors are all renowned thought leaders in the drug discovery field. Furthermore, the journal impact factor of NRDD is a significant multiple of my underwhelming h-index and any error of such apparent grossness would surely have been detected during the rigorous peer review process applied by this elite journal. It turns out that my nervousness was indeed well placed and, when calculated at 300 K, the product RT actually serves as an annihilation operator that eliminates the dimensionality associated with Kd. This also explains why a temperature of 300 K must be used when calculating the ligand efficiency even though biochemical assays are usually run at human body temperature (310 K). 

I became convinced of the validity of the above approximation recently after examining a manuscript by the world-renowned expert on tetrodotoxin pharmacology, Prof. Angelique Bouchard-Duvalier of the Port-au-Prince Institute of Biogerontology, who is currently on secondment to the Budapest Enthalpomics Group (BEG). The manuscript has not yet been made publicly available although I was able to access it with the help of my associate ‘Anastasia Nikolaeva’ (she decamped last year from Tel Aviv to Uzbekistan and, to Derek’s likely disapproval, is currently running an open access journal out of a van in Samarkand). There is no doubt that this genuinely disruptive study will comprehensively reshape the generative AI landscape, enabling drug discovery scientists, for the very first time, to rationally design novel clinical candidates using only gene sequences as input.

Prof. Bouchard-Duvalier’s seminal study clearly demonstrates that it is indeed possible to eliminate the need to define standard states for the thermodynamic analysis of liquid solutions, provided that the appropriate temperature is used. The math is truly formidable (my rudimentary understanding of Haitian patois didn’t help either) and involves first projecting the atomic isothermal compressibility matrix into the quadrupole-normalized polarizability tensor before applying the Barone-Samedi transformation, followed by hepatic eigenvalue extraction using the algorithm introduced by E. V. Tooms (a reclusive Baltimore resident better known for his research in analytic topology). ‘Anastasia Nikolaeva’ was also able to ‘liberate’ a prepared press release in which a beaming BEG director Prof. Kígyó Olaj explains that, “possibilities are limitless now that we have eliminated the standard state from solution thermodynamics and thereby consigned the tedious and needlessly restrictive Second Law to the dustbin of history." 

Wednesday 27 March 2024

Leadeth me unto Truth and delivereth me from those who have already found it

A theory has only the alternative of being true or false.
A model has a third possibility: it may be true, but irrelevant.
With apologies to Manfred Eigen (1927 - 2019)
******************

I've just returned to Cheshire from the Caribbean and, to kick off blogging for 2024, I'll share a photo of the orchids at Berwick-on-Sea on the north coast of Trinidad.


Encountering words like “truth” and “beauty” (here's a good example) in the titles of scientific articles always sets off warning bells for me and I’ll kick off blogging for 2024 with a look at FM2024 (Structure is beauty, but not always truth) that was recently published in Cell (and has already been reviewed by Derek). The authors have highlighted important issues: we typically use single conformations of targets in design and the experimentally-determined structures used for design may differ substantially from the structures of targets as they exist in vivo. These points do need to be stressed given the expanding range of modalities being exploited by drug designers and the increasing use of AI/ML in drug design. That said, it’s my view that the authors have allowed themselves to become prisoners of their article’s title. Specifically, I see “beauty” as a complete red herring and suggest that it would have been much better to have discussed structure in terms of accuracy and relevance rather than truth. Here’s the abstract for FM2024:

Structural biology, as powerful as it is, can be misleading. We highlight four fundamental challenges: interpreting raw experimental data; accounting for motion; addressing the misleading nature of in vitro structures; and unraveling interactions between drugs and “anti-targets.” Overcoming these challenges will amplify the impact of structural biology on drug discovery.

I'll start by taking a look at the introduction and my view is that the authors do need to be much clearer about what they mean by “this hydrogen bond is better than that one” when using terms like “ground truth”. For example, we can infer that the geometry of one target-ligand hydrogen bond is closer to optimal than the geometry of another target-ligand hydrogen bond. However, the energetic cost of breaking a target-ligand hydrogen bond is not something that can generally be measured and, as noted in NoLE, the contribution of an intermolecular contact to affinity is not actually an experimental observable. Ligands associate with their targets (and anti-targets) in aqueous media and this means that intermolecular contacts, for example between polar and non-polar atoms, can destabilize the target-ligand complex without being inherently repulsive. What I’m getting at here is that structures of ligand-target complexes are relatively simple and well-defined entities within the broader context of drug discovery and yet it doesn’t appear useful to discuss them in terms of truth.

The remainder of the post follows the FM2024 section headings.    

A structure is a model, not experimental reality

The term “structure” can have a number of different meanings in structure-based drug design. First, drug targets (and anti-targets) have structures that exist regardless of whether they have been experimentally determined. Second, models are built for drug targets by fitting nuclear coordinates to experimental data such as electron density (these are often referred to as experimental structures although they should strictly be called models because they are abstractions of the experimental data). Third, the structure could have been predicted using computational tools such as AlphaFold2 (here's an article, cited by FM2024, on why we still need experimentally-determined structures). 

In the abstract the authors identify “interpreting raw experimental data” as one of “four fundamental challenges”. However, the actual focus of this section appears to be evaluation of predicted structures rather than interpretation of raw experimental data.  While I’m sure that we can find better ways to interpret raw experimental data, and indeed to evaluate predicted structures, I don’t see either as representing a fundamental challenge. 

Representing wiggling and jiggling is hard

My view is that it’s the ensemble of conformations rather than the wiggling and jiggling that we actually need to represent. Simulation of the wiggling and jiggling is one way to generate an ensemble of conformations but it’s not the only way (nor is it necessarily the best way).  That said, it's a lot easier to sell protein motion to venture capitalists than it is to sell ensembles of conformations.

The authors state:

Analogous to how structure-based drug design is great for optimizing “surface complementarity” and electrostatics, future protein modeling approaches will unlock ensemble-based drug design with an ability to predictably tune new and important aspects of design, including entropic contributions [7] and residence times [8] of bound ligands.

The term “entropic contributions” does come across as arm-waving (especially in a drug design context) and my view is that entropy should be seen as an effect rather than a cause. Thermodynamic signatures for binding are certainly of scientific interest but I would argue that they are essentially irrelevant to drug design (it can be instructive to consider how patients might sense the benefits of enthalpically-driven drug binding). The case for increasing residence time might not be quite as solid as many believe it to be (see the F2018 study and this blog post).

In vitro can be deceiving

The authors identify “addressing the misleading nature of in vitro structures” as a fundamental challenge and they state:

While purifying a protein out of its cellular context can be enabling for in vitro drug discovery, it can also provide a false impression. Recombinant expression can lead to missing post-translational modifications (e.g., phosphorylation or glycosylation) that are critical to understanding the function of a protein.

To this I’d add that we often don’t use the full-length proteins in design and recombinant proteins may have been engineered to make them easier to crystallize or more robust for soaking experiments. Furthermore, target engagement may require the drug to interact with two or more proteins (see HC2017) which will probably be more amenable individually to structure determination than their complex. I fully agree that it is important for drug designers to be aware that the experimentally-determined structures that they're using differ from the structures of the targets as they exist in vivo.  However, I don't believe that it makes any sense to talk about “the misleading nature of in vitro structures” (or indeed about “in vitro drug discovery”) because target structures are never experimentally determined in vivo and are only misleading to the extent that users overinterpret them. As a more general point users of experimental data do need to be very careful about describing the experimental data that they’re using as “misleading” or "deceiving".

When we use structures to represent targets the issue is much less about the truth of the structures that we’re using and much more about their relevance to the targets that we’re trying to represent. This is not just an issue for structural biology and we might, for example, use the catalytic domain of an enzyme as a model for the full-length protein when running biochemical assays. We have to make assumptions in these situations and we also need to check that these assumptions are reasonable. For example, we might examine the structure-activity relationship in a cell-based assay for consistency with the structure-activity relationship that we’ve observed in the enzyme inhibition assay. It's also worth pointing out that what we observe in cells is usually a coarse approximation to what actually happens in vivo and we can't even measure the intracellular concentration of a drug in vivo.  

Drugs mingle with many different receptors

Drugs do indeed mingle with many receptors in vivo but it’s important to be aware that the consequences of this mingling depend on the drug concentration (a spatiotemporal quantity) at the site of action.  Drug discovery scientists use the term exposure when talking about drug concentration at the site of action and one underappreciated challenge in drug design is that intracellular drug concentration cannot generally be measured in vivo (here’s an open access article that I recommend to everybody working in drug discovery). I argue in NoLE that controllability of exposure should be seen as a drug design objective although the current impossibility of measuring intracellular concentration means that we can only assess how effectively the objective has been achieved in an indirect manner. Alternatively, drug design can be seen in terms of minimization of the dose at which therapeutically beneficial effects can be observed.

One assumption often made in drug design is that the drug concentration at the site of action is equal to the unbound concentration in plasma and this assumption is referred to as the free drug hypothesis (FDH) although the term “free drug theory” is also used. The basis for the FDH is the assumption that the drug can move freely between plasma and the target compartment. In reality the drug concentration at the site of action will generally lag behind its unbound plasma concentration and the lag time is inversely related to the ease with which the drug permeates through the barriers which separate the target from the plasma. There are a couple of scenarios under which you can’t assume that the drug concentration in the target compartment will be the same as its unbound plasma concentration. The first of these is when active transport is significant and this is a scenario with which drug designers tackling targets within the central nervous system (CNS) are familiar. The second scenario is that there is an ionizable functional group (as is the case for amines) in the molecular structure of the drug and the pH at the site of action differs significantly from plasma pH (as is the case for lysosomes).
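Here's a minimal sketch of the second scenario for a monoprotic base (the pKa and pH values are illustrative), assuming that only the neutral form equilibrates across the membrane:

def accumulation_ratio_base(pka, ph_inside, ph_outside):
    # Equilibrium ratio of total (neutral + ionized) concentrations for a
    # monoprotic base when only the neutral form crosses the membrane.
    return (1 + 10 ** (pka - ph_inside)) / (1 + 10 ** (pka - ph_outside))

# A basic amine (pKa 9) accumulates in an acidic lysosome (pH ~4.8) relative
# to plasma (pH 7.4) by a factor of several hundred at equilibrium.
print(round(accumulation_ratio_base(pka=9.0, ph_inside=4.8, ph_outside=7.4)))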

There are two general types of undesirable outcome that can result when a drug encounters receptors with which it mingles.  First, the receptor is an anti-target and the encounter results in binding of the drug, leading to toxicity (patients are harmed).  Second, the receptor is a metabolic enzyme or a transporter and the encounter leads to the drug either being turned over or pumped from where it needs to function (patients do not benefit from the treatment).

I've inserted some comments (italicised in red) into the following quoted text: 

The sad reality that all drug discoverers must face is that however well designed we may believe our compounds to be, they will find ways to interact with many other proteins or nucleic acids in the body and interfere with the normal functions of those biomolecules. While occasionally, the ability of a medicine to bind to multiple biomolecules will increase a drug’s efficacy, such polypharmacology is far more likely to produce undesirable effects. These undesirable outcomes take two forms. Obviously, the direct binding to an anti-target can lead to a bewildering range of toxicities, many of which render the drug too hazardous for any use. [While there are well-known anti-targets such as hERG that must be avoided, my understanding is that those responsible for drug safety generally prefer not to see any off-target activity given the difficulties in prediction of toxicity. Here are a couple of relevant articles (B2012 | J2020) and a link to some information about in vitro safety pharmacology profiling panels from Eurofins.] More subtly, the binding to anti-targets reduces the ability of the drug to reach the desired target. A drug that largely avoids binding to anti-targets will partition more effectively through the body, enabling it to accumulate at high enough concentrations in the disease-relevant tissue to effectively modulate the function of the target. [I consider it unlikely that binding to an anti-target could account for a significant proportion of the dose. In any case, I’d expect binding of a drug to anti-targets to cause unacceptable toxicity long before it results in sequestration of a significant proportion of the dose.] 

A particular challenge results from the interaction of drugs with the enzymes, transporters, channels, and receptors that are largely responsible for controlling the metabolism and pharmacokinetic properties (DMPK) of those drugs—their absorption, distribution, metabolism, and elimination. Drugs often bind to plasma proteins, preventing them from reaching the intended tissues; [A degree of binding to plasma proteins is not a problem and, in the case of warfarin, is probably essential for the safe use of the drug.] they can block or be substrates for all manner of pumps and transporters, changing their distribution through the body; [Transporters can indeed prevent drugs from getting to their sites of action at therapeutically effective concentrations and limited brain exposure resulting from active efflux is a common issue for CNS drug discovery programs (see H2012 and R2015). I am not aware of any transporters that are definitely considered to be anti-targets from the safety perspective (I'm happy to be corrected on this point) and inhibition of efflux pumps is a recognized tactic (see T2021 and H2020) in drug discovery.] xenobiotic sensors such as PXR that turn on transcriptional programs recognizing foreign substances; and they often block enzymes like cytochrome P450s, thereby changing their own metabolism and that of other medicines. [Inhibition of CYPs is generally considered undesirable from the safety perspective because of the potential for drug-drug interactions (see H2020). That said, the CYP3A inhibitor ritonavir (see CG2003) is used in the COVID-19 treatment Paxlovid to slow metabolism of the SARS-CoV-2 main protease inhibitor nirmatrelvir.]  They are themselves substrates for P450s and other metabolizing enzymes and, once altered, can no longer carry out their assigned, life-saving function. [Medicinal chemists are well aware of the challenges presented by drug-metabolizing enzymes although it must be stressed that any drug that was cleared too slowly would be considered to be an unacceptable safety risk.]

Taken together, we refer to these DMPK-related proteins, somewhat tongue-in-cheek, as the “avoidome” (Figure 2). [It is unclear why the authors have chosen to only include DMPK-related proteins in the avoidome (hERG is not a DMPK-related protein but is an anti-target that every drug discovery scientist would wish to avoid blocking). For reasons outlined in the previous paragraph I would actually argue against the inclusion of DMPK-related proteins in the avoidome.]  Unfortunately, the structures of the vast majority of avoidome targets have not yet been determined. Further, many of these proteins are complex machines that contain multiple domains and exhibit considerable structural dynamism. Their binding pockets can be quite large and promiscuous, favoring distinct binding modes for even closely related compounds. [It is not clear whether this assertion is based on experimental observations.] As a consequence, multiple structures spanning a range of bound ligands and protein conformational states will be required to fully understand how best to prevent drugs from engaging these problematic anti-targets.  

We believe the structural biology community should “embrace the avoidome” with the same enthusiasm that structure-based design has been applied to intended targets. [My view is that the authors need to clearly articulate their reasons for only including DMPK-related proteins in the avoidome before seeking to direct the activities of structural biology community. I presume that the Target 2035 initiative, which aims to “to create by year 2035 chemogenomic libraries, chemical probes, and/or biological probes for the entire human proteome”, will also cover anti-targets. Having chemical and/or biological probes available for anti-targets should lead to better understanding of toxicity in humans.] The structures of these proteins will shed considerable light on human biology and represent exciting opportunities to demonstrate the power of cutting-edge structural techniques. [Experimental structures of target-ligand complexes do indeed provide valuable direct evidence that a ligand is binding to a protein but the structures themselves are not particularly informative from the perspective of understanding human biology. It is actually high-quality chemical probes that are needed to shed light on human biology and here’s a link to the Chemical Probes Portal. Structures at atomic resolution for protein-ligand complexes are certainly useful for chemical probe design but are not strictly necessary for effective use of chemical probes.]  Crucially, a detailed understanding of the ways that drugs engage with avoidome targets would significantly expedite drug discovery.  [Experimentally-determined structures of anti-targets complexed with ligands are certainly informative when elucidating structure-activity relationships for binding to anti-targets. However, structural information of this nature is much less directly useful for addressing problems such as metabolic lability and active efflux.] This information holds the potential to achieve a profound impact on the discovery of new and enhanced medicines.

Conclusion

The authors assert: 

In drug discovery, truth is a molecule that transforms the practice of medicine. [I disagree with this assertion. In drug discovery truth may also be a compound that, despite an excellent pharmacokinetic profile, chokes comprehensively in phase 2.]

It's been a long post and this is a good place to leave things. While the authors have raised some valid points I found the 'Drugs mingle with many different receptors' section to be rather confused and I don't think that the drug discovery and structural biology communities are in desperate need of yet another 'ome' word. I hope that my review of FM2024 will be useful for readers of the article while providing helpful feedback for the authors and for the Editors of Cell.

Sunday 31 December 2023

Chemical con artists foil drug discovery

One piece of general advice that I offer to fellow scientists is to not let the fact that an article has been published in Nature (or any other ‘elite’ journal for that matter) cause you to switch off your critical thinking skills while reading it, and the BW2014 article (Chemistry: Chemical con artists foil drug discovery) that I’ll be reviewing in this post is an excellent case in point. My main criticism of BW2014 is that the rhetoric is not supported by data and I’ve always seen the article as something of a propaganda piece.

One observation that I’ll make before starting my review of BW2014 is that what lawyers would call ‘standard of proof’ varies according to whether you’re saying something good about a compound or something bad. For example, I would expect a competent peer reviewer to insist on measured IC50 values if I had described compounds as inhibitors of an enzyme in a manuscript. However, it appears to be acceptable, even in top journals, to describe compounds as PAINS without having to provide any experimental evidence that they actually exhibit some type of nuisance behavior (let alone pan-assay interference). I see a tendency in the ‘compound quality’ field for opinions to be stated as facts and reading some of the relevant literature leaves me with the impression that some in the field have lost the ability to distinguish what they know from what they believe. 

BW2014 has been heavily cited in the drug discovery literature (it was cited as the first reference in the ACS assay interference editorial which I reviewed in K2017) despite providing little in the way of practical advice for dealing with nuisance behavior. BW2014 appears to exert a particularly strong influence on the Chemical Probes Community, having been cited by the A2015, BW2017, AW2022 and A2022 articles as well as in the Toxicophores and PAINS Alerts section of the Chemical Probes Portal. Given the commitment of the Chemical Probes Community to open science, their enthusiasm for the PAINS substructure model introduced in BH2010 (New Substructure Filters for Removal of Pan Assay Interference Compounds (PAINS) from Screening Libraries and for Their Exclusion in Bioassays) is somewhat perplexing since neither the assay data nor the associated chemical structures were disclosed. My advice to the Chemical Probes Community is to let go of PAINS filters.

Before discussing BW2014, I’ll say a bit about high-throughput screening (HTS), which emerged three decades ago as a lead discovery paradigm. From the early days of HTS it was clear, at least to those who were analyzing the output from the screens, that not every hit smelt of roses. Here’s what I wrote in K2017:

Although poor physicochemical properties were partially blamed (3) for the unattractive nature and promiscuous behavior of many HTS hits, it was also recognized that some of the problems were likely to be due to the presence of particular substructures in the molecular structures of offending compounds. In particular, medicinal chemists working up HTS results became wary of compounds whose molecular structures suggested reactivity, instability, accessible redox chemistry or strong absorption in the visible spectrum as well as solutions that were brightly colored. While it has always been relatively easy to opine that a molecular structure ‘looks ugly’, it is much more difficult to demonstrate that a compound is actually behaving badly in an assay.

It has long been recognized that it is prudent to treat frequent-hitters (compounds that hit in multiple assays) with caution when analyzing HTS output. In K2017 I discussed two general types of behavior that can cause compounds to hit in multiple assays: Type 1 (assay result gives an incorrect indication of the extent to which the compound affects target function) and Type 2 (compound acts on target by undesirable mechanism of action (MoA)). Type 1 behavior is typically the result of interference with the assay read-out and the hits in question can be accurately described as ‘false positives’ because the effects on the target are not real. Type 1 behavior should be regarded as a problem with the assay (rather than with the compound) and, provided that the activity of a compound has been established using a read-out for which interference is not a problem, interference with other read-outs is irrelevant. In contrast, Type 2 behavior should be regarded as a problem with the compound (rather than with the assay) and an undesirable MoA should always be a show-stopper.
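Here's a minimal sketch (entirely my own illustration; the hit matrix, compound names and the 50% threshold are all assumptions) of how frequent-hitters might be flagged in HTS output. Note that the flag itself is agnostic as to mechanism: establishing whether the behavior is Type 1 or Type 2 requires follow-up experiments.

```python
import pandas as pd

# Hypothetical hit matrix: rows are compounds, columns are assays,
# entries are 1 (hit) or 0 (inactive at the screening concentration)
hits = pd.DataFrame(
    {"assay_1": [1, 0, 1], "assay_2": [1, 0, 0], "assay_3": [1, 1, 0]},
    index=["cmpd_A", "cmpd_B", "cmpd_C"],
)

hit_rate = hits.mean(axis=1)  # fraction of the assay panel hit by each compound
frequent_hitters = hit_rate[hit_rate >= 0.5]  # arbitrary 50% threshold
print(frequent_hitters)  # only cmpd_A (3/3 assays) gets flagged
```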

Interference with read-out and undesirable MoAs can both cause compounds to hit in multiple assays. However, these two types of bad behavior can still cause big problems whether or not the compounds are observed to be frequent-hitters. Interference with read-out and undesirable MoAs are very different problems in drug discovery and the failure to recognize this point is a serious deficiency that is shared by BW2014 and BH2010.

Although I’ve criticized the use of PAINS filters, I am not suggesting that compounds matching PAINS substructures are necessarily benign (many of the PAINS substructures look distinctly unwholesome to me). I have no problem whatsoever with people expressing opinions as to the suitability of compounds for screening provided that the opinions are not presented as facts. In my view the chemical con-artistry of PAINS filters lies not in the denunciation of benign compounds but in the implication that the filters are based on relevant experimental data.

Given that the PAINS filters form the basis of a cheminformatic model that is touted for prediction of pan-assay interference, one could be forgiven for thinking that the model had been trained using experimental observations of pan-assay interference. This is not so, however, and the data that form the basis of the PAINS filter model actually consist of the output of six assays that each use the AlphaScreen read-out. As noted in K2017, a panel of six assays using the same read-out would appear to be a suboptimal design for an experiment to observe pan-assay interference. Putting this in perspective, P2006 (An Empirical Process for the Design of High-Throughput Screening Deck Filters), which was based on analysis of the output from 362 assays, had actually been published four years before BH2010.
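Since the PAINS filters are, in operational terms, nothing more than a set of substructure patterns, anyone can run them with a standard cheminformatics toolkit. Here's a minimal sketch using RDKit's built-in PAINS catalog (my own illustration; the benzylidene rhodanine SMILES is there purely as an example) which underlines the point that a 'PAINS match' is a substructure flag rather than an experimental observation of interference:

```python
from rdkit import Chem
from rdkit.Chem.FilterCatalog import FilterCatalog, FilterCatalogParams

# Build RDKit's PAINS catalog (SMARTS patterns derived from BH2010)
params = FilterCatalogParams()
params.AddCatalog(FilterCatalogParams.FilterCatalogs.PAINS)
catalog = FilterCatalog(params)

# 5-benzylidene rhodanine, purely as an example structure
mol = Chem.MolFromSmiles("O=C1NC(=S)SC1=Cc1ccccc1")

# A match means 'shares a substructure with a BH2010 frequent-hitter',
# not 'has been shown by experiment to exhibit pan-assay interference'
for entry in catalog.GetMatches(mol):
    print(entry.GetDescription())
```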

After a bit of a preamble, I need to get back to reviewing BW2014 and my view is that readers of the article who didn’t know better could easily conclude that drug discovery scientists were completely unaware of the problems associated with misleading HTS assay results before the re-branding of frequent-hitters as PAINS in BH2010. Given that M2003 had been published over a decade previously, I was rather surprised that BW2014 had not cited a single article about how colloidal aggregation can foil drug discovery. Furthermore, it had been known (see FS2006) for years before the publication of BH2010 that the importance of colloidal aggregation could be assessed by running assays in the presence of detergent.

I'll be commenting directly on the text of BW2014 for the remainder of the post (my comments are italicized in red).

Most PAINS function as reactive chemicals rather than discriminating drugs. [It is unclear here whether “PAINS” refers to compounds that have been shown by experiment to exhibit pan-assay interference or simply compounds that share structural features with compounds (chemical structures not disclosed) claimed to be frequent-hitters in the BH2010 assay panel. In any case, sweeping generalizations like this do need to be backed with evidence. I do not consider it valid to present observations of frequent-hitter behavior as evidence that compounds are functioning as reactive chemicals in assays.] They give false readouts in a variety of ways. Some are fluorescent or strongly coloured. In certain assays, they give a positive signal even when no protein is present. [The BW2014 authors appear to be confusing physical phenomena such as fluorescence with chemical reactivity.]

Some of the compounds that should ring the most warning bells are toxoflavin and polyhydroxylated natural phytochemicals such as curcumin, EGCG (epigallocatechin gallate), genistein and resveratrol. These, their analogues and similar natural products persist in being followed up as drug leads and used as ‘positive’ controls even though their promiscuous actions are well-documented (8,9). [Toxoflavin is not mentioned in either Ref8 or Ref9 although T2004 would have been a relevant reference for this compound. Ref8 only discusses curcumin and I do not consider that the article documents the promiscuous actions of this compound.  Proper documentation of the promiscuity of a compound would require details of the targets that were hit, the targets that were not hit and the concentration(s) at which the compound was assayed. The effects of curcumin, EGCG (epigallocatechin gallate), genistein and resveratrol on four membrane proteins were reported in Ref9 and these effects would raise doubts about activity for any of these compounds (or their close structural analogs) that had been observed in a cell-based assay. However, I don’t consider that it would be valid to use the results given in Ref9 to cast doubt on biological activity measured in an assay that was not cell-based.] 
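For what it's worth, surveying what has actually been reported for a compound is straightforward these days. Here's a minimal sketch (my own illustration, assuming the chembl_webresource_client package and network access) that retrieves the bioactivities reported for curcumin in ChEMBL so that target coverage and assay concentrations can be inspected directly. Bear in mind that ChEMBL can only tell you about targets that were actually assayed; it is silent on the targets that were never examined.

```python
from chembl_webresource_client.new_client import new_client

# Look up curcumin by preferred name
mols = new_client.molecule.filter(pref_name__iexact="curcumin")
chembl_id = mols[0]["molecule_chembl_id"]

# Retrieve all bioactivities reported for the compound
for act in new_client.activity.filter(molecule_chembl_id=chembl_id):
    print(act["target_pref_name"], act["standard_type"],
          act["standard_value"], act["standard_units"])
```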

Rhodanines exemplify the extent of the problem. [Rhodanines are specifically discussed in K2017 in which I suggest that the most plausible explanation for the frequent-hitter behavior observed for rhodanines in the BH2010 panel of six AlphaScreen assays is that the singly-connected sulfur reacts with singlet oxygen (this reactivity has been reported for compounds with thiocarbonyl groups in their molecular structures).] A literature search reveals 2,132 rhodanines reported as having biological activity in 410 papers, from some 290 organizations of which only 24 are commercial companies. [What would the literature search have revealed if the target substructure had been ‘benzene ring’ rather than ‘rhodanine’ (see the sketch after this paragraph)? As discussed in this post, the B2023 study presented the diversity of targets hit by compounds incorporating a fused tetrahydroquinoline in their molecular structures as ‘evidence’ for pan-assay interference by compounds based on this scaffold.] The academic publications generally paint rhodanines as promising for therapeutic development. In a rare example of good practice, one of these publications (10) (by the drug company Bristol-Myers Squibb) warns researchers that these types of compound undergo light-induced reactions that irreversibly modify proteins. [The C2001 study (Photochemically enhanced binding of small molecules to the tumor necrosis factor receptor-1 inhibits the binding of TNF-α) is actually a more relevant reference since it focuses on the nature of the photochemically enhanced binding. The structure of the complex of TNFRc1 with one of the compounds studied (IV703; see graphic below) showed a covalent bond between one of the carbon atoms of the pendant nitrophenyl and the backbone amide nitrogen of A62. The structure of the IV703–TNFRc1 complex shows that covalent bond formation involving the pendant aromatic ring must also be considered a distinct possibility for the rhodanines reported in Ref10 and C2001.] It is hard to imagine how such a mechanism could be optimized to produce a drug or tool. Yet this paper is almost never cited by publications that assume that rhodanines are behaving in a drug-like manner. [It would be prudent to cite M2012 (Privileged Scaffolds or Promiscuous Binders: A Comparative Study on Rhodanines and Related Heterocycles in Medicinal Chemistry) if denouncing fellow drug discovery scientists for failure to cite Ref10.]
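The emptiness of raw literature counts for a substructure is easy to demonstrate. In the sketch below (my own illustration using an arbitrary three-compound 'library'; a real screening collection would make the contrast far starker) the benzene-ring query matches every structure while the rhodanine query matches just one, yet nobody would present the benzene count as 'evidence' of pan-assay interference by benzene rings:

```python
from rdkit import Chem

rhodanine = Chem.MolFromSmarts("O=C1CSC(=S)N1")  # rhodanine core
benzene = Chem.MolFromSmarts("c1ccccc1")         # benzene ring

# Arbitrary mini 'library': aspirin, phenol and a benzylidene rhodanine
library = ["CC(=O)Oc1ccccc1C(=O)O", "Oc1ccccc1", "O=C1NC(=S)SC1=Cc1ccccc1"]
mols = [Chem.MolFromSmiles(s) for s in library]

print(sum(m.HasSubstructMatch(rhodanine) for m in mols))  # 1
print(sum(m.HasSubstructMatch(benzene) for m in mols))    # 3
```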

In a move partially implemented to help editors and manuscript reviewers to rid the literature of PAINS (among other things), the Journal of Medicinal Chemistry encourages the inclusion of computer-readable molecular structures in the supporting information of submitted manuscripts, easing the use of automated filters to identify compounds’ liabilities. [I would be extremely surprised if ridding the literature of PAINS was considered by the JMC Editors when they decided to implement a requirement that authors include computer-readable molecular structures in the supporting information of submitted manuscripts. In any case, claims such as this do need to be supported by evidence.] We encourage other journals to do the same. We also suggest that authors who have reported PAINS as potential tool compounds follow up their original reports with studies confirming the subversive action of these molecules. [I’ve always found this statement bizarre since the BW2014 authors appear to be suggesting that authors who have reported PAINS as potential tool compounds should confirm something that they have not observed and which may not even have occurred. When using the term “PAINS” do the BW2014 authors mean compounds that have actually been shown to exhibit pan-assay interference or compounds that share structural features with compounds that were claimed to exhibit frequent-hitter behavior in the BH2010 assay panel? Would interference with the AlphaScreen read-out by a singlet oxygen quencher be regarded as a subversive action by a molecule in situations where a read-out other than AlphaScreen had been used?] Labelling these compounds clearly should decrease futile attempts to optimize them and discourage chemical vendors from selling them to biologists as valid tools. [The real problem here is compounds being sold as tools in the absence of the measured data that is needed to support the use of the compounds for this purpose. Matches with PAINS substructures would not rule out the use of a compound as a tool if the appropriate package of measured data is available. In contrast, a compound that does not match any PAINS substructures cannot be regarded as an acceptable tool if the appropriate package of measured data is not available. Put more bluntly, you’re hardly going to be able to generate the package of measured data if the compound is as bad as PAINS filter advocates say it is.]

Box: PAINS-proof drug discovery

Check the literature. [It’s always a good idea to check the literature but the failure of the BW2014 authors to cite a single colloidal aggregation article such as M2003 suggests that perhaps they should be following this advice rather than giving it. My view is that the literature on scavenging and quenching of singlet oxygen was treated in a cursory manner in BH2010 (see earlier comment in connection with rhodanines).] Search by both chemical similarity and substructure to see if a hit interacts with unrelated proteins or has been implicated in non-drug-like mechanisms. [Chemical similarity and substructure searches will identify analogs of hits but it is actually an exact-match structural search that you need to do in order to see if a particular compound is a hit in assays against unrelated proteins.] Online services such as SciFinder, Reaxys, BadApple or PubChem can assist in the check for compounds (or classes of compound) that are notorious for interfering with assays. [I generally recommend ChEMBL as a source of bioactivity data.]
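To make the distinction concrete: similarity and substructure queries retrieve analogs of a hit, whereas establishing whether a specific compound has already been reported calls for an exact-match comparison. Here's a minimal sketch (my own illustration) using InChIKeys generated with RDKit, which is one common way of doing exact-match comparison:

```python
from rdkit import Chem

def same_compound(smiles_a: str, smiles_b: str) -> bool:
    """Exact-match comparison of two structures via their InChIKeys."""
    key_a = Chem.MolToInchiKey(Chem.MolFromSmiles(smiles_a))
    key_b = Chem.MolToInchiKey(Chem.MolFromSmiles(smiles_b))
    return key_a == key_b

# Toluene is both a substructure match and a similarity neighbor of
# benzene, but it is not the same compound
print(same_compound("c1ccccc1", "Cc1ccccc1"))    # False
print(same_compound("c1ccccc1", "C1=CC=CC=C1"))  # True: same structure, different SMILES
```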

Assess assays. For each hit, conduct at least one assay that detects activity with a different readout. [This will only detect problems associated with interference with read-out. As discussed in S2009 it may be possible to assess and even correct for interference with read-out without having to run an assay with a different read-out.]  Be wary of compounds that do not show activity in both assays. If possible, assess binding directly, with a technique such as surface plasmon resonance. [SPR can also provide information about MoA since association, dissociation and stoichiometry can all be observed directly using this detection technology.] 

That concludes blogging for 2023 and many thanks to anybody who has read any of the posts this year. For too many people Planet Earth is not a very nice place to be right now and my new year wish is for a kinder, happier and more peaceful world in 2024.