Molecular Design: 2023

Sunday, 31 December 2023

Chemical con artists foil drug discovery

One piece of general advice that I offer to fellow scientists is to not let the fact that an article has been published in Nature (or any other ‘elite’ journal for that matter) cause you to switch off your critical thinking skills while reading it. The BW2014 article (Chemistry: Chemical con artists foil drug discovery) that I’ll be reviewing in this post is an excellent case in point. My main criticism of BW2014 is that the rhetoric is not supported by data and I’ve always seen the article as something of a propaganda piece.

One observation that I’ll make before starting my review of BW2014 is that what lawyers would call ‘standard of proof’ varies according to whether you’re saying something good about a compound or something bad. For example, I would expect a competent peer reviewer to insist on measured IC50 values if I had described compounds as inhibitors of an enzyme in a manuscript. However, it appears to be acceptable, even in top journals, to describe compounds as PAINS without having to provide any experimental evidence that they actually exhibit some type of nuisance behavior (let alone pan-assay interference). I see a tendency in the ‘compound quality’ field for opinions to be stated as facts and reading some of the relevant literature leaves me with the impression that some in the field have lost the ability to distinguish what they know from what they believe.

BW2014 has been heavily cited in the drug discovery literature (it was cited as the first reference in the ACS assay interference editorial which I reviewed in K2017) despite providing little in the way of practical advice for dealing with nuisance behavior. BW2014 appears to exert a particularly strong influence on the Chemical Probes Community having been cited by the A2015, BW2017, AW2022 and A2022 articles as well as in the Toxicophores and PAINS Alerts section of the Chemical Probes Portal. Given the commitment of the Chemical Probes Community to open science, their enthusiasm for the PAINS substructure model introduced in BH2010 (New Substructure Filters for Removal of Pan Assay Interference Compounds (PAINS) from Screening Libraries and for Their Exclusion in Bioassays) is somewhat perplexing since neither the assay data nor the associated chemical structures were disclosed. My advice to the Chemical Probes Community is to let go of PAINS filters.

Before discussing BW2014, I’ll say a bit about high-throughput screening (HTS) which emerged three decades ago as a lead discovery paradigm. From the early days of HTS it was clear, at least to those who were analyzing the output from the screens, that not every hit smelt of roses. Here’s what I wrote in K2017:

Although poor physicochemical properties were partially blamed (3) for the unattractive nature and promiscuous behavior of many HTS hits, it was also recognized that some of the problems were likely to be due to the presence of particular substructures in the molecular structures of offending compounds. In particular, medicinal chemists working up HTS results became wary of compounds whose molecular structures suggested reactivity, instability, accessible redox chemistry or strong absorption in the visible spectrum as well as solutions that were brightly colored. While it has always been relatively easy to opine that a molecular structure ‘looks ugly’, it is much more difficult to demonstrate that a compound is actually behaving badly in an assay.

It has long been recognized that it is prudent to treat frequent-hitters (compounds that hit in multiple assays) with caution when analysing HTS output. In K2017 I discussed two general types of behavior that can cause compounds to hit in multiple assays: Type 1 (assay result gives an incorrect indication of the extent to which the compound affects target function) and Type 2 (compound acts on target by undesirable mechanism of action (MoA)). Type 1 behavior is typically the result of interference with the assay read-out and the hits in question can be accurately described as ‘false positives’ because the effects on the target are not real. Type 1 behaviour should be regarded as a problem with the assay (rather than with the compound) and, provided that the activity of a compound has been established using a read-out for which interference is not a problem, interference with other read-outs is irrelevant. In contrast, Type 2 behavior should be regarded as a problem with the compound (rather than with the assay) and an undesirable MoA should always be a show-stopper.

Interference with read-out and undesirable MoAs can both cause compounds to hit in multiple assays. However, these two types of bad behavior can still cause big problems whether or not the compounds are observed to be frequent-hitters. Interference with read-out and undesirable MoAs are very different problems in drug discovery and the failure to recognize this point is a serious deficiency that is shared by BW2014 and BH2010.

Although I’ve criticized the use of PAINS filters there is no suggestion that compounds matching PAINS substructures are necessarily benign (many of the PAINS substructures look distinctly unwholesome to me). I have no problem whatsoever with people expressing opinions as to the suitability of compounds for screening provided that the opinions are not presented as facts. In my view the chemical con-artistry of PAINS filters is not that benign compounds have been denounced but the implication that PAINS filters are based on relevant experimental data.

Given that the PAINS filters form the basis of a cheminformatic model that is touted for prediction of pan-assay interference, one could be forgiven for thinking that the model had been trained using experimental observations of pan-assay interference. This is not so, however, and the data that form the basis of the PAINS filter model actually consist of the output of six assays that each use the AlphaScreen read-out. As noted in K2017, a panel of six assays using the same read-out would appear to be a suboptimal design of an experiment to observe pan assay interference. Putting this in perspective, P2006 (An Empirical Process for the Design of High-Throughput Screening Deck Filters) which was based on analysis of the output from 362 assays had actually been published four years before BH2010.

After a bit of a preamble, I need to get back to reviewing BW2014 and my view is that readers of the article who didn’t know better could easily conclude that drug discovery scientists were completely unaware of the problems associated with misleading HTS assay results before the re-branding of frequent-hitters as PAINS in BH2010. Given that M2003 had been published over a decade previously, I was rather surprised that BW2014 had not cited a single article about how colloidal aggregation can foil drug discovery. Furthermore, it had been known (see FS2006) for years before the publication of BH2010 that the importance of colloidal aggregation could be assessed by running assays in the presence of detergent.

I'll be commenting directly on the text of BW2014 for the remainder of the post (my comments are italicized in red).

Most PAINS function as reactive chemicals rather than discriminating drugs. [It is unclear here whether “PAINS” refers to compounds that have been shown by experiment to exhibit pan-assay interference or simply compounds that share structural features with compounds (chemical structures not disclosed) claimed to be frequent-hitters in the BH2010 assay panel. In any case, sweeping generalizations like this do need to be backed with evidence. I do not consider it valid to present observations of frequent-hitter behavior as evidence that compounds are functioning as reactive chemicals in assays.] They give false readouts in a variety of ways. Some are fluorescent or strongly coloured. In certain assays, they give a positive signal even when no protein is present. [The BW2014 authors appear to be confusing physical phenomena such as fluorescence with chemical reactivity.]

Some of the compounds that should ring the most warning bells are toxoflavin and polyhydroxylated natural phytochemicals such as curcumin, EGCG (epigallocatechin gallate), genistein and resveratrol. These, their analogues and similar natural products persist in being followed up as drug leads and used as ‘positive’ controls even though their promiscuous actions are well-documented (8,9). [Toxoflavin is not mentioned in either Ref8 or Ref9 although T2004 would have been a relevant reference for this compound. Ref8 only discusses curcumin and I do not consider that the article documents the promiscuous actions of this compound. Proper documentation of the promiscuity of a compound would require details of the targets that were hit, the targets that were not hit and the concentration(s) at which the compound was assayed. The effects of curcumin, EGCG (epigallocatechin gallate), genistein and resveratrol on four membrane proteins were reported in Ref9 and these effects would raise doubts about activity for any of these compounds (or their close structural analogs) that had been observed in a cell-based assay. However, I don’t consider that it would be valid to use the results given in Ref9 to cast doubt on biological activity measured in an assay that was not cell-based.]

Rhodanines exemplify the extent of the problem. [Rhodanines are specifically discussed in K2017 in which I suggest that the most plausible explanation for the frequent-hitter behavior observed for rhodanines in the BH2010 panel of six AlphaScreen assays is that the singly-connected sulfur reacts with singlet oxygen (this reactivity has been reported for compounds with thiocarbonyl groups in their molecular structures).] A literature search reveals 2,132 rhodanines reported as having biological activity in 410 papers, from some 290 organizations of which only 24 are commercial companies. [Consider what the literature search would have revealed if the target substructure had been ‘benzene ring’ rather than ‘rhodanine’. As discussed in this post the B2023 study presented the diversity of targets hit by compounds incorporating fused tetrahydroquinolines in their molecular structures as ‘evidence’ for pan-assay interference by compounds based on this scaffold.] The academic publications generally paint rhodanines as promising for therapeutic development. In a rare example of good practice, one of these publications (10) (by the drug company Bristol-Myers Squibb) warns researchers that these types of compound undergo light-induced reactions that irreversibly modify proteins. [The C2001 study (Photochemically enhanced binding of small molecules to the tumor necrosis factor receptor-1 inhibits the binding of TNF-α) is actually a more relevant reference since it focuses on the nature of the photochemically enhanced binding. The structure of the complex of TNFRc1 with one of the compounds studied (IV703; see graphic below) showed a covalent bond between one of the carbon atoms of the pendant nitrophenyl and the backbone amide nitrogen of A62. The structure of the IV703–TNFRc1 complex shows that covalent bond formation involving a pendant aromatic ring must also be considered as a distinct possibility for the rhodanines reported in Ref10 and C2001.] 
It is hard to imagine how such a mechanism could be optimized to produce a drug or tool. Yet this paper is almost never cited by publications that assume that rhodanines are behaving in a drug-like manner. [It would be prudent to cite M2012 (Privileged Scaffolds or Promiscuous Binders: A Comparative Study on Rhodanines and Related Heterocycles in Medicinal Chemistry) if denouncing fellow drug discovery scientists for failure to cite Ref10.]

In a move partially implemented to help editors and manuscript reviewers to rid the literature of PAINS (among other things), the Journal of Medicinal Chemistry encourages the inclusion of computer-readable molecular structures in the supporting information of submitted manuscripts, easing the use of automated filters to identify compounds’ liabilities. [I would be extremely surprised if ridding the literature of PAINS was considered by the JMC Editors when they decided to implement a requirement that authors include computer-readable molecular structures in the supporting information of submitted manuscripts. In any case, claims such as this do need to be supported by evidence.] We encourage other journals to do the same. We also suggest that authors who have reported PAINS as potential tool compounds follow up their original reports with studies confirming the subversive action of these molecules. [I’ve always found this statement bizarre since the BW2014 authors appear to be suggesting that authors who have reported PAINS as potential tool compounds should confirm something that they have not observed and which may not even have occurred. When using the term “PAINS” do the BW2014 authors mean compounds that have actually been shown to exhibit pan-assay interference or compounds that share structural features with compounds that were claimed to exhibit frequent-hitter behavior in the BH2010 assay panel? Would interference with the AlphaScreen read-out by a singlet oxygen quencher be regarded as a subversive action by a molecule in situations when a read-out other than AlphaScreen had been used?] Labelling these compounds clearly should decrease futile attempts to optimize them and discourage chemical vendors from selling them to biologists as valid tools. [The real problem here is compounds being sold as tools in the absence of the measured data that is needed to support the use of the compounds for this purpose. 
Matches with PAINS substructures would not rule out the use of a compound as a tool if the appropriate package of measured data is available. In contrast, a compound that does not match any PAINS substructures cannot be regarded as an acceptable tool if the appropriate package of measured data is not available. Put more bluntly, you’re hardly going to be able to generate the package of measured data if the compound is as bad as PAINS filter advocates say it is.]

Box: PAINS-proof drug discovery

Check the literature. [It’s always a good idea to check the literature but the failure of the BW2014 authors to cite a single colloidal aggregation article such as M2003 suggests that perhaps they should be following this advice rather than giving it. My view is that the literature on scavenging and quenching of singlet oxygen was treated in a cursory manner in BH2010 (see earlier comment in connection with rhodanines).] Search by both chemical similarity and substructure to see if a hit interacts with unrelated proteins or has been implicated in non-drug-like mechanisms. [Chemical similarity and substructure searches will identify analogs of hits and it is actually the exact match structural search that you need to do in order to see if a particular compound is a hit in assays against unrelated proteins.] Online services such as SciFinder, Reaxys, BadApple or PubChem can assist in the check for compounds (or classes of compound) that are notorious for interfering with assays. [I generally recommend ChEMBL as a source of bioactivity data.]

Assess assays. For each hit, conduct at least one assay that detects activity with a different readout. [This will only detect problems associated with interference with read-out. As discussed in S2009 it may be possible to assess and even correct for interference with read-out without having to run an assay with a different read-out.] Be wary of compounds that do not show activity in both assays. If possible, assess binding directly, with a technique such as surface plasmon resonance. [SPR can also provide information about MoA since association, dissociation and stoichiometry can all be observed directly using this detection technology.]

That concludes blogging for 2023 and many thanks to anybody who has read any of the posts this year. For too many people Planet Earth is not a very nice place to be right now and my new year wish is for a kinder, happier and more peaceful world in 2024.

Tuesday, 19 December 2023

On quality criteria for covalent and degrader probes

I’ll be taking a look at H2023 (Expanding Chemical Probe Space: Quality Criteria for Covalent and Degrader Probes) in this post and this article has also been discussed In The Pipeline. I’ll primarily be discussing the quality criteria for covalent probes in this post although I’ll also comment briefly on chemical matter criteria proposed for degrader probes. The post is intended as a contribution to the important scientific discussion that the H2023 Perspective is intended to jumpstart:

We are convinced that now is the time to initiate similar efforts to achieve a consensus about quality criteria for covalently acting and degrader probes. This Perspective is intended to jumpstart this important scientific discussion.

Covalent bond formation between ligands and targets is a drug design tactic for exploiting molecular recognition elements in targets that are difficult to make beneficial contacts with. Cysteine SH has minimal capacity to form hydrogen bonds with polar ligand atoms and the exposed nature of catalytic cysteine SH reduces its potential to make beneficial contacts with non-polar ligand atoms. One common misconception in drug discovery is that covalent bond formation between targets and ligands is necessarily irreversible and it wasn’t clear from my reading of H2023 whether the authors were aware that covalent bond formation between targets and ligands can also be reversible. In any case, it needed to be made clear that the quality criteria proposed by the authors for covalently acting small-molecule probes only apply to probes that act irreversibly.

Irreversible covalent bond formation is typically used to target non-catalytic residues and design is a lot more complicated than for reversible covalent bond formation. First, IC50 values are time-dependent (there are two activity parameters: affinity and inactivation rate constant) which makes it much more difficult to assess selectivity or to elucidate SAR. Second, the transition state structural models required for modelling inactivation cannot be determined experimentally and therefore need to be calculated using computationally intensive quantum mechanical methods.
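The time dependence of IC50 can be illustrated with a minimal sketch. It assumes the textbook two-step model for irreversible inhibition (rapid reversible binding with inhibition constant K_i, followed by first order inactivation with rate constant k_inact), no competing substrate and inhibitor in large excess over enzyme. The function names, the bisection solver and the parameter values are all hypothetical choices made for illustration and are not taken from H2023.

```python
import math

def fractional_activity(conc, t, k_inact, K_i):
    """Fraction of enzyme activity remaining after incubation for time t (s).

    Assumes rapid-equilibrium non-covalent binding followed by
    irreversible inactivation of the bound enzyme.
    """
    occupancy = conc / (K_i + conc)      # fraction of unmodified enzyme with inhibitor bound
    k_obs = k_inact * occupancy          # pseudo-first-order inactivation rate constant
    return (1.0 - occupancy) * math.exp(-k_obs * t)

def apparent_ic50(t, k_inact, K_i, lo=1e-12, hi=1e-2):
    """Concentration (M) giving 50% activity, found by bisection on a log scale."""
    for _ in range(200):
        mid = math.sqrt(lo * hi)         # geometric midpoint
        if fractional_activity(mid, t, k_inact, K_i) > 0.5:
            lo = mid                     # too little inhibition: move up
        else:
            hi = mid                     # too much inhibition: move down
    return math.sqrt(lo * hi)

K_I = 1e-6       # M, hypothetical inhibition constant
K_INACT = 1e-3   # 1/s, hypothetical inactivation rate constant

for minutes in (0, 10, 60, 240):
    ic50 = apparent_ic50(minutes * 60.0, K_INACT, K_I)
    print(f"{minutes:4d} min incubation: apparent IC50 = {ic50:.2e} M")
```

Under these assumptions the apparent IC50 equals K_i at zero incubation time and falls steadily as incubation time increases, which is why a single IC50 value says little about an irreversible inhibitor unless the incubation time is also reported.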

I’ll start my review with a couple of general comments. Intracellular concentration is a factor that is not always fully appreciated in chemical biology and I generally recommend that people writing about chemical probes demonstrate awareness of SR2019 (Intracellular and Intraorgan Concentrations of Small Molecule Drugs: Theory, Uncertainties in Infectious Diseases and Oncology, and Promise). On a more pedantic note I cautioned against using ‘molecule’ as a synonym for ‘compound’ in my review of S2023 (Systematic literature review reveals suboptimal use of chemical probes in cell-based biomedical research) and I suggest that “covalent molecule” might be something that you don't want to see in the text of an article in a chemistry journal.

However, significant efforts need to be invested into characterizing and validating covalent molecules as a prerequisite for conclusive use in biomedical research and target validation studies.

The proposed quality criteria for covalently acting small-molecule probes are given in Figure 2 of H2023 although I’ll be commenting on the text of the article. Subscripting doesn't work well in blogger and so I'll use K.i and k.inact respectively throughout the post to denote the inhibition constant and the first order inactivation rate constant.

I’ll start with Section 2.1 (Criteria for Assessing Potency of Covalent Probes) and my comments are italicised in red.

When working with irreversible covalent probes, it is important to consider that target inhibition is time-dependent and therefore IC50 values, while frequently used, are a suboptimal descriptor of potency. (21) Best practice is to use k.inact (the rate of inactivation) over K.i (the affinity for the target) values instead. (22) [I recommend that values of both k.inact and K.i be reported since this enables the extent of non-covalent target engagement by the chemical probe to be assessed. Regardless of whether binding to target is covalent or non-covalent, the concentration and affinity of substrates (as well as cofactors such as ATP) need to be properly accounted for when interpreting effects of chemical probes in cell-based assays. This is a significant issue for ATP-competitive kinase inhibitors (as discussed in my review of S2023) and I recommend this tweetorial from Keith Hornberger.]
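The point about substrate and cofactor concentrations can be made concrete for the simplest case of a reversible inhibitor that competes with ATP, for which the Cheng-Prusoff relationship gives IC50 = K_i × (1 + [S]/K_m). The sketch below is only an illustration: the K_i, K_m and ATP concentrations are hypothetical values that I've picked (with the cellular ATP concentration assumed to be in the millimolar range) and the function name is my own.

```python
def cheng_prusoff_ic50(k_i, substrate_conc, k_m):
    """IC50 (M) for a reversible competitive inhibitor (Cheng-Prusoff)."""
    return k_i * (1.0 + substrate_conc / k_m)

K_I = 10e-9        # M, hypothetical inhibition constant
K_M_ATP = 20e-6    # M, hypothetical Michaelis constant for ATP

# Hypothetical ATP concentrations: assay run at [ATP] = K_m versus an assumed
# millimolar concentration inside cells.
conditions = {"biochemical assay": 20e-6, "cell (assumed)": 2e-3}

for label, atp in conditions.items():
    ic50 = cheng_prusoff_ic50(K_I, atp, K_M_ATP)
    print(f"{label:18s} [ATP] = {atp:.1e} M  IC50 = {ic50:.2e} M")
```

With these assumed numbers the same compound would look roughly fifty times weaker at the cellular ATP concentration than in the biochemical assay, without any change in its affinity for the target.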

As measurement of k.inact/K.i values can be labor-intensive (or in certain cases technically impossible), IC50 values (or target engagement TE50 values) are often reported for covalent leads and used to generate structure–activity relationships (SARs). [The labor-intensive nature of the measurements is not a valid justification for a failure to measure k.inact and K.i values for a covalent chemical probe.] Carefully designed biochemical assays used in determining IC50 values can be well-suited as surrogates for k.inact/K.i measurements. (24) [It is my understanding that the primary reason for doing this is to increase the throughput of irreversible inhibition assays for SAR optimization and I would generally be extremely wary of any IC50 value measured for an irreversible inhibitor unless it had been technically impossible to measure k.inact or K.i values for the inhibitor.]

2.2. Criteria for Assessing Covalent Probe Selectivity

We propose a selectivity factor of 30-fold in favor of the intended target of the probe compared to that of other family members or identified off-targets under comparable assay conditions. [The authors need to be clearer as to which measure of ‘activity’ they propose should be used for calculating the ratio and some justification for the ratio (why 30-fold rather than 50-fold or 25-fold?) should be given. Regardless of whether binding to target is covalent or non-covalent, the concentration and affinity of substrates (as well as cofactors such as ATP) need to be properly accounted for when assessing selectivity. It is not clear how the selectivity factor should be defined to quantify selectivity of an inhibitor that binds covalently to the target but non-covalently to off-targets. My comments on the THZ1 probe in my review of the S2023 study may be relevant.]
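To show why the choice of activity measure matters for a selectivity factor, here is a minimal sketch for a hypothetical probe that inhibits its target irreversibly (two-step model, no competing substrate) but binds an off-target only non-covalently. All parameter values and function names are invented for illustration, and the IC50 ratio used here is just one of several ways in which a selectivity factor could be defined.

```python
import math

def target_activity(conc, t, k_inact, K_i):
    """Fractional activity of the covalently inhibited target after time t (s)."""
    occupancy = conc / (K_i + conc)
    return (1.0 - occupancy) * math.exp(-k_inact * occupancy * t)

def target_ic50(t, k_inact, K_i):
    """Apparent IC50 (M) for the target, by bisection on a log scale."""
    lo, hi = 1e-12, 1e-2
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        if target_activity(mid, t, k_inact, K_i) > 0.5:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)

def selectivity_ratio(t, k_inact, K_i_target, K_i_off):
    """Off-target IC50 divided by target IC50 at incubation time t.

    For a purely non-covalent off-target with no competing substrate the
    IC50 is taken to be equal to K_i_off and does not change with time.
    """
    return K_i_off / target_ic50(t, k_inact, K_i_target)

K_INACT = 1e-3       # 1/s, hypothetical
K_I_TARGET = 1e-6    # M, hypothetical
K_I_OFF = 3e-6       # M, hypothetical non-covalent off-target

for minutes in (0, 10, 60, 240):
    ratio = selectivity_ratio(minutes * 60.0, K_INACT, K_I_TARGET, K_I_OFF)
    verdict = "meets" if ratio >= 30 else "fails"
    print(f"{minutes:4d} min: IC50 ratio = {ratio:6.1f} ({verdict} a 30-fold criterion)")
```

In this sketch the same probe fails or meets a 30-fold criterion depending only on how long it is incubated with the target, which is why the measure of activity and the assay conditions need to be specified along with the selectivity factor.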

2.3. Chemical Matter Criteria for Covalent Probes

Ideally, the on-target activity of the covalent probe is not dominated by the reactive warhead, but the rest of the molecule provides a measurable reversible affinity for the intended target. [My view is that the reversible affinity of the probe should be greater than simply what is measurable and I suggest, with some liberal arm-waving, that a K.i cutoff of ~100 nM might be more useful (a K.i value of 10 μM is usually measurable provided that the inhibitor is adequately soluble in assay buffer).] Seeing SARs over 1–2 log units of activity resulting from core, substitution, and warhead changes is an important quality criterion for covalent probe molecules. [The authors need to be clearer about which ‘activity’ they are referring to (differences in K.i and k.inact values between compounds are likely to be greater than the corresponding differences in k.inact/K.i values). The criterion “SAR for covalent and non-covalent interactions” shown in Figure 2 is nonsensical.]

3.3. Chemical Matter Criteria for Degrader Probes

When selecting chemical degrader probes, it is recommended that a chemist critically assesses the chemical structure of the degrader for the presence of chemical groups that impart polypharmacology or interfere with assay read-outs (PAINs motifs). (78) [I certainly agree that chemists should critically assess chemical structures of probes and, if performing a critical assessment of this nature for a degrader probe, I would be taking a look in ChEMBL to see what’s known for structurally-related compounds. I consider the risk of discarding acceptable chemical matter on the basis of matches with PAINS substructures to be low although there’s a lot more to critical assessment of chemical structures than simply checking for matches against PAINS substructures. My view is that genuine promiscuity (as opposed to frequent hitter behavior resulting from interference with read-out) cannot generally be linked to chemical groups. As noted in K2017 the PAINS substructure model introduced in BH2010 was actually trained on the output of six AlphaScreen assays and the applicability domain of the model should be regarded as prediction of frequent-hitter behavior in this assay panel rather than interference with assay read-outs (that said the most plausible explanation for frequent-hitter behavior in the PAINS assay panel is interference with the AlphaScreen read-out by compounds that quench or react with singlet oxygen). My recommendation is that chemical matter criteria for chemical probes should be specified entirely in terms of measured data and the models used to select/screen potentially acceptable chemical matter should not be included in the chemical matter criteria.]

This is a good point to wrap up my contribution to the important scientific discussion that H2023 is intended to jumpstart. While some of what I've written might be seen as nitpicking please bear in mind that quality criteria for chemical probes need to be defined precisely in order to be useful to the chemical biology and medicinal chemistry communities.

Wednesday, 6 December 2023

Are fused tetrahydroquinolines interfering with your assay?

I’ll be taking a look at B2023 (Fused Tetrahydroquinolines Are Interfering with Your Assay) in this post. The article has already been discussed in posts at Practical Fragments and In The Pipeline. In anticipation of the stock straw man counterarguments to my criticisms of PAINS filters, I must stress that there is absolutely no suggestion that compounds matching PAINS filters are necessarily benign. The authors have shown that fusion of cyclopentene at C3-C4 of the tetrahydroquinoline (THQ) ring system is associated with a risk of chemical instability and I consider this to be extremely useful information for anybody thinking about using this scaffold. However, the authors do also appear to be making a number of claims that are not supported by evidence and, in my view, have not demonstrated that the chemical instability leads to pan-assay interference or even frequent-hitter behavior.

The term ‘PAINS’ crops up frequently in B2023 (the authors even refer to “the PAINS concept” although I think that’s pushing things a bit) and I’ll start by saying something about two general types of nuisance behavior of compounds in assays and these points are discussed in more detail in K2017 (Comment on The Ecstasy and Agony of Assay Interference Compounds). From the perspective of screening libraries of compounds for biological activity, the two types of nuisance behavior are very different problems that need to be considered very differently. One criticism that can be made of both BH2010 (original PAINS study) and BW2014 (Chemical con artists foil drug discovery) is that neither study considers the differing implications for drug discovery of these two types of nuisance behavior.

The first type of nuisance behavior in assays is interference with assay read-out and when ‘activity’ in an assay is due to assay interference hits can accurately be described as ‘false positives’ (this should be seen as a problem with the assay rather than the compound). Interference with assay read-outs is certainly irksome when you’re analysing output from screens because you don’t know if the ‘activity’ is real or not. However, if you’re able to demonstrate genuine activity for a compound using an assay with a read-out for which interference is not an issue then interference with other assay read-outs is irrelevant and would not rule out the compound as a viable starting point for further investigation. Interference with assay read-outs generally increases with the concentration of the compound in the assay (this is why biophysical methods are often favored for screening fragments) and I’ll direct readers to a helpful article by former colleagues. It’s also worth noting that interference with read-out can also lead to false negatives.

The second type of nuisance behavior is that the compound acts on a target by an undesirable mechanism of action (MoA) and it is not accurate to describe hits behaving in this manner as ‘false positives’ because the effect on the target is real (this should be seen as a problem with the compound rather than the assay). In contrast to interference with read-out, an undesirable MoA is a show-stopper. An undesirable MoA with which many drug discovery scientists will be familiar is colloidal aggregate formation (see M2003) and the problem can be assessed by running the assay in the absence and presence of detergent (see FS2006). In some cases patterns in screening output may point to an undesirable MoA. For example, cysteine reactivity might be indicated by compounds hitting in multiple assays for inhibition of enzymes that feature cysteine in their catalytic mechanisms.

I’ll make some comments on PAINS filters before I discuss B2023 in detail and much of what I’ll be saying has already been said in K2017 and C2017 (Phantom PAINS: Problems with the Utility of Alerts for Pan-Assay INterference CompoundS) although you shouldn’t need to consult these articles in order to read the blog post unless you want to get some more detail. The PAINS filter model introduced in BH2010 consists of a number of substructures which are claimed (I say “claimed” because the assay results and associated chemical structures are proprietary) to be associated with frequent hitter behavior in a panel of six assays that all use the AlphaScreen read-out (compounds that react with or quench singlet oxygen have the potential to interfere with this read-out). I argued in K2017 that six assays, all using the same read-out, do not constitute a credible basis for the design of an experiment to detect pan-assay interference. Put another way, the narrow scope of the data used to train the PAINS filter model restricts the applicability domain of this model to prediction of frequent-hitter behavior in these six assays. The BH2010 study does not appear to present a single example of a compound that has actually been demonstrated by experiment to exhibit pan-assay interference.

The B2023 study reports that tetrahydroquinolines (THQs) fused at C3-C4 with cyclopentene (1) are unstable. This is valuable information for anybody who may have the misfortune to be working with this particular scaffold and the observed instability implies that drug discovery scientists should also be extremely wary of any biological activity reported for compounds that incorporate this scaffold. Furthermore, the authors show that the instability can be linked to the presence of the carbon-carbon double bond in the ‘third ring’ since 2, the dihydro analog of 1, appears to be stable. I would certainly mention the chemical instability reported in B2023 if reviewing a manuscript that reported biological activity for compounds based on this scaffold. However, I would not mention that BH2010 has stated that the scaffold matches the anil_alk_ene (SLN: C[1]:C:C:C[4]:C(:C:@1)NCC[9]C@4C=CC@9 ) PAINS substructure because the nuisance behavior consists of hitting frequently in a six-assay panel of questionable relevance and the PAINS filters were based on analysis of proprietary data.

Although I wouldn’t have predicted the chemical instability reported for 1 by B2023, this scaffold is certainly not a structural feature that I would have taken into lead optimization with any enthusiasm (a hydrogen that is simultaneously benzylic and allylic does rather look like a free lunch for the CYPs). I would still be concerned about instability even if methylene groups were added to or deleted from the aliphatic parts of 1. I suspect that the electron-releasing nitrogen of 1 contributes to chemical instability although I don’t think that changing nitrogen for another atom type would eliminate the risk of chemical instability. Put another way, the instability observed for 1 should raise questions about the stability of a number of structurally-related scaffolds. Chemical instability is (or at least should be) a show-stopper in the context of drug discovery even if it doesn't lead to interference with assay read-out, an undesirable MoA or pan-assay interference.

I certainly consider the instability observed for 1 to be of interest and relevant to a number of structurally-related chemotypes. However, I have a number of concerns about B2023 and one specific criticism is that the authors use “tricyclic/fused THQ” throughout the text as a synonym for “tricyclic/fused THQ with a carbon-carbon double bond in the ‘third’ ring”. At best this is confusing and it could lead to groundless criticism, either publicly or in peer review, of a study that reported assay results for compounds based on the scaffold in 2. A more general point is that the authors make a number of claims that, in my view, are not adequately supported by evidence. I’ll start with the significance section and my comments are italicized in red:

Tricyclic tetrahydroquinolines (THQs) are a family of lesser studied pan-assay interference compounds (PAINS) [The authors need to provide specific examples of tricyclic THQs that have actually been shown to exhibit pan-assay interference to support this claim.] These compounds are found ubiquitously throughout commercial and academic small molecule screening libraries. [The authors do not appear to have presented evidence to support this claim and the presence of compounds in vendor catalogues does not prove that the compounds are actually being screened. In my view, the authors appear to be trying to ‘talk up’ the significance of their findings by making this statement.] Accordingly, they have been identified as hits in high-throughput screening campaigns for diverse protein targets. We demonstrate that fused THQs are reactive when stored in solution under standard laboratory conditions and caution investigators from investing additional resource into validating these nuisance compounds.

Continuing with the introduction

Fused tetrahydroquinolines (THQs) are frequent hitters in hit discovery campaigns. [In my view the authors have not presented sufficient evidence to support this statement and I don’t consider claims made in BH2010 for frequent-hitter behavior by compounds matching the anil_alk_ene PAINS substructure to be admissible as evidence simply because they are based on proprietary data. In any case the numbers of compounds matching the anil_alk_ene PAINS substructure and reported in BH2010 to hit in zero (17) or one (11) assays in the PAINS assay panel suggest that 28 compounds (of a total of 51 substructural matches) cannot be regarded as frequent-hitters in this assay panel.] Pan-assay interference compounds (PAINS) have been controversial in the recent literature. While some literature supports these as nuisance compounds, other papers describe PAINS as potentially valuable leads. (1 | 2 | 3 | 4) [The C2017 study referenced as 2 is actually a critique of PAINS filters and I’m assuming that the authors aren’t suggesting that it “supports these [PAINS] as nuisance compounds”. However, I would consider it a gross misrepresentation of C2017 to imply that the study describes “PAINS as potentially valuable leads”.] There have been descriptions of many different classes of PAINS that vary in their frequency of occurrence as hits in the screening literature. [In my view, the number of articles on PAINS appears to greatly exceed the number of compounds that have actually been shown to exhibit pan-assay interference.]

The number of papers that selected this scaffold during hit discovery campaigns from multiple chemical libraries supports the idea that fused THQs are frequent hitters. [Let’s take a closer look at what the authors are suggesting by considering a selection of compounds, each of which has a benzene ring in its molecular structure. Now let’s suppose that each of a large number of targets is hit by at least one of the compounds in this selection (I could easily satisfy this requirement by selecting marketed drugs with benzene rings in their molecular structures). Applying the same logic as the authors, I could use these observations to support the idea that compounds incorporating benzene rings in their molecular structures are frequent-hitters. In my view the B2023 study doesn’t appear to have presented a single example of a fused THQ that has actually been shown experimentally to exhibit frequent-hitter behavior. As mentioned earlier in this post less than half of the compounds matching the anil_alk_ene PAINS substructure that were evaluated in the BH2010 assay panel can be regarded as frequent-hitters.] At first glance, these compounds appear to be valid, optimizable hits, with reasonable physicochemical properties. Although micromolar and reproducible activity has been reported for multiple THQ analogues on many protein targets, hit-to-lead optimization programs aimed at improving the initial hits (Supporting Information (SI), Table S1) have resulted in no improvement in potency or no discernible structure–activity relationships (SAR) [Achieving increased potency and establishing SARs are certainly important objectives in hit-to-lead studies. However, assertions that hit-to-lead optimizations “have resulted in no improvement in potency or no discernible structure–activity relationships” do need to be supported with appropriate discussion of specific hit-to-lead optimization studies.]

Examples of Fused THQs as “Hits” Are Pervasive

The diversity of protein targets captured below supports the premise that the fused THQ scaffold does not yield specific hits for these proteins but that the reported activity is a result of pan-assay interference. [I could use an argument analogous to the one that I’ve just used for frequent-hitters to ‘prove’ that compounds with benzene rings in their molecular structure do not yield specific hits and that any reported activity is due to pan-assay interference. The authors do not appear to have presented a single example of a fused THQ that has been shown by experiment to exhibit pan-assay interference.]

Concluding remarks

Our review and evidence-based experiments solidify the idea that tricyclic THQs are nuisance compounds that cause pan-assay interference in the majority of screens rather than privileged structures worthy of chemical optimization. [While I certainly agree that chemical instability would constitute a nuisance, I would consider it wildly extravagant to claim that tricyclic THQs can “cause pan assay interference” since nobody appears to have actually observed pan-assay interference for even a single tricyclic THQ.] Their widespread micromolar activities on a broad range of proteins with diverse assay readouts support our assertion that they are unlikely to be valid hits. [As stated previously, I do not consider that “widespread micromolar activities on a broad range of proteins” observed for compounds that share a particular structural feature implies that all compounds with the particular structural feature are unlikely to be valid hits.]

So that concludes my review of the B2023 study. I really liked the experimental work that revealed the instability of 1 and linked it to the presence of the double bond in the 'third' ring. Furthermore, these experimental results would (at least for me) raise questions about the chemical stability of some scaffolds that are structurally-related to 1. However, I found the analysis of the bioactivity data reported in the literature for fused THQs to be unconvincing to the extent that it significantly weakened the B2023 study.

Sunday, 19 November 2023

On the misuse of chemical probes

It’s now time to get back to chemical probes and I’ll be taking a look at S2023 (Systematic literature review reveals suboptimal use of chemical probes in cell-based biomedical research) which has already been reviewed in blog posts from Practical Fragments, In The Pipeline and the Institute of Cancer Research. Readers of this blog are aware that PAINS filters usually crop up in posts on chemical probes but there are other things that I want to discuss and, in any case, references to PAINS in S2023 are minimal. Nevertheless, I’ll still stress that a substructural match of a chemical probe with a PAINS filter does not constitute a valid criticism of a chemical probe (it simply means that the chemical structure of the chemical probe shares structural features with compounds that have been claimed to exhibit frequent-hitter behaviour in a panel of six AlphaScreen assays) and one is more likely to encounter a bunyip than a compound that has actually been shown to exhibit pan-assay interference.

The authors of S2023 claim to have revealed “suboptimal use of chemical probes in cell-based biomedical research” and I’ll start by taking a look at the abstract (my annotations are italicised in red):

Chemical probes have reached a prominent role in biomedical research, but their impact is governed by experimental design. To gain insight into the use of chemical probes, we conducted a systematic review of 662 publications, understood here as primary research articles, employing eight different chemical probes in cell-based research. [A study such as S2023 that has been claimed by its authors to be systematic does need to say something about how the eight chemical probes were selected and why the literature for this particular selection of chemical probes should be regarded as representative of chemical probes literature in general.] We summarised (i) concentration(s) at which chemical probes were used in cell-based assays, (ii) inclusion of structurally matched target-inactive control compounds and (iii) orthogonal chemical probes. Here, we show that only 4% of analysed eligible publications used chemical probes within the recommended concentration range and included inactive compounds as well as orthogonal chemical probes. [I would argue that failure to use a chemical probe within a recommended concentration range is only a valid criticism if the basis for the recommendation is clearly articulated.] These findings indicate that the best practice with chemical probes is yet to be implemented in biomedical research. [My view is that the best practice with chemical probes is yet to be defined.] To achieve this, we propose ‘the rule of two’: At least two chemical probes (either orthogonal target-engaging probes, and/or a pair of a chemical probe and matched target-inactive compound) to be employed at recommended concentrations in every study. [The authors of S2023 do seem to be moving the goalposts since they’ve criticized studies for not using structurally matched target-inactive control compounds but are saying that using an additional orthogonal target-engaging probe makes it acceptable not to use a structurally matched target-inactive control compound. This suggestion does appear to contradict the Chemical Probes Portal criteria for 'classical' modulators which do require the use of a control compound defined as having a "similar structure with similar physicochemistry, non-binding against target".]

The following sentence does set off a few warning bells for me:

The term ‘chemical probe’ distinguishes compounds used in basic and preclinical research from ‘drugs’ used in the clinic, from the terms ‘inhibitor’, ‘ligand’, ‘agonist’ or ‘antagonist’ which are molecules targeting a given protein but are insufficiently characterised, and also from the term ‘probes’ which is often referring to laboratory reagents for biophysical and imaging studies.

First, the terms 'compound' and 'molecule' are not interchangeable and I would generally recommend using 'compound' when talking about biological activity or affinity. A more serious problem is that the authors of S2023 seem to be getting into homeopathic territory by suggesting that chemical probes are not ligands and this might have caused Paul Ehrlich (who died 26 years before Kaiser Wilhelm II) to spit a few feathers. Drugs and chemical probes are ligands for their targets by virtue of binding to their targets (the term 'ligand' is derived from the Latin 'ligare' which means 'to bind' and a compound can be a ligand for one target without necessarily being a ligand for another target) while the terms 'inhibitor', 'agonist' and 'antagonist' specify the consequences of ligand binding. I was also concerned by the use of the term 'in cell concentration' in S2023 given that uncertainty in intracellular concentration is an issue when working with chemical probes (as well as in PK-PD modelling). Although my comments above could be seen as nit-picking these are not the kind of errors that authors can afford to make if they’re going to claim that their “findings indicate that the best practice with chemical probes is yet to be implemented in biomedical research”.

Let’s take a look at the criteria by which the authors of S2023 have assessed the use of chemical probes. They assert that “Even the most selective chemical probe will become non-selective if used at a high concentration” although I think it’d be more correct to state that the functional selectivity of a probe depends on binding affinity of the probe for target and anti-targets as well as the concentration of the probe (at its site of action). Selectivity also depends on the concentration of anything that binds competitively with the probe and, when assessing kinase selectivity, it can be argued that assays for ATP-competitive kinase inhibitors should be run at a typical intracellular ATP concentration (here’s a recent open access review on intracellular ATP concentration). The presence of serum in cell-based assays should also be considered when setting upper concentration limits since chemical probes may bind to serum proteins such as albumin which means that the concentration of a compound that is ‘seen’ by the cells is lower than the total concentration of the compound in the assay. In my experience binding to albumin tends to increase with lipophilicity and is also favored by the presence of an acidic group such as carboxylate in a molecular structure.
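To make the dependence on ATP concentration and serum binding a little more concrete, here’s a minimal Python sketch that uses the Cheng-Prusoff relationship for a competitive inhibitor (IC50 = Ki × (1 + [ATP]/Km)). The function names, the two kinases and all of the numerical values are hypothetical and chosen purely for illustration, and treating serum binding as a single fraction unbound is a simplifying assumption.

```python
def apparent_ic50(ki_nm, atp_um, km_atp_um):
    """Cheng-Prusoff IC50 (nM) for an ATP-competitive inhibitor."""
    return ki_nm * (1.0 + atp_um / km_atp_um)


def free_concentration(total_nm, fraction_unbound):
    """Concentration 'seen' by the cells when some compound is bound to serum protein."""
    return total_nm * fraction_unbound


# Hypothetical probe with the same Ki (10 nM) for a target and an anti-target
# that differ only in their Km values for ATP (10 uM and 200 uM respectively).
KI_NM = 10.0
KM_TARGET_UM = 10.0
KM_ANTITARGET_UM = 200.0

# A low biochemical assay ATP concentration and an assumed cellular one (~1 mM)
for atp_um in (10.0, 1000.0):
    target = apparent_ic50(KI_NM, atp_um, KM_TARGET_UM)
    antitarget = apparent_ic50(KI_NM, atp_um, KM_ANTITARGET_UM)
    print(f"[ATP] = {atp_um:6.0f} uM: target IC50 = {target:7.1f} nM, "
          f"anti-target IC50 = {antitarget:6.1f} nM")

# 1 uM total probe with an assumed fraction unbound of 0.1 in serum-containing medium
print(free_concentration(1000.0, 0.1), "nM free")
```

With these made-up numbers the apparent potency against the target falls much more steeply than the apparent potency against the anti-target as the ATP concentration is raised, which is why a selectivity assessment made at a single, low ATP concentration may not carry over to cells.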

I’m certainly not suggesting that chemical probes be used at excessive concentrations but if you’re going to criticise other scientists for exceeding concentration thresholds then, at the very least, you do need to show that the threshold values have been derived in an objective and transparent manner. My view is that it would not be valid to criticise studies publicly (or in peer review of submitted manuscripts) simply because the studies do not comply with recommendations made by the Chemical Probes Portal. It is significant that the recommendations from different groups of chemical probe experts with respect to the maximum concentration at which UNC1999 should be used differ by almost an order of magnitude:

As the recommended maximal in-cell concentration for UNC1999 varies between the Chemical Probes Portal and the Structural Genomics Consortium sites (400 nM and 3 μM, respectively), we analysed compliance with both concentrations.

One of the eight chemical probes featured in S2023 is THZ1 which is reported to bind covalently to CDK7 and the electrophilic warhead is acrylamide-based, suggesting that binding is irreversible. Chemical probes that form covalent bonds with their targets irreversibly need to be considered differently to chemical probes that engage their targets reversibly (see this article). Specifically, the degree of target engagement by a chemical probe that binds irreversibly depends on time as well as concentration (if you wait long enough then you’ll achieve 100% inhibition). This means that it’s not generally possible to quantify selectivity or to set concentration thresholds objectively for chemical probes that bind to their targets irreversibly. It’s not clear (at least to me) why an irreversible covalent inhibitor such as THZ1 was included as one of the eight chemical probes covered by the S2023 study so I checked to see what the Chemical Probes Portal had to say about THZ1 and something doesn’t look quite right. The on-target potency is given as a Kd (dissociation constant) value of 3.2 nM and the potency assay is described as “time-dependent binding established supporting covalent mechanism”. However, Kd is a measure of affinity (and therefore not time-dependent) and my understanding is that it is generally difficult to measure Kd for irreversible covalent inhibitors which are typically characterized by kinact (inactivation rate constant) and Ki (inhibition constant) values obtained from analysis of enzyme inhibition data. The off-target potency of THZ1 is summarized as “KiNativ profiling against 246 kinases in Loucy cells was performed showing >75% inhibition at 1 uM of: MLK3, PIP4K2C, JNK1, JNK2, JNK3, MER, TBK1, IGF1R, NEK9, PCTAIRE2, and TBK1, but in vitro binding to off-target kinases was not time dependent indicating that inhibition was not via a covalent mechanism”. The results from the assays used to measure on-target and off-target potency for THZ1 do not appear to be directly comparable.
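The time dependence of irreversible target engagement can be illustrated with a minimal Python sketch based on the standard two-step model for irreversible inhibition, in which the observed inactivation rate constant is kobs = kinact × [I] / (Ki + [I]) and the fraction of target inactivated after time t is 1 − exp(−kobs × t). I’m assuming that the inhibitor is in large excess over the target and that there is no resynthesis of target. The kinact and Ki values are hypothetical and are not measurements for THZ1.

```python
import math


def k_obs(kinact_per_min, ki_um, conc_um):
    """Observed first-order inactivation rate constant (per minute)."""
    return kinact_per_min * conc_um / (ki_um + conc_um)


def fraction_inactivated(kinact_per_min, ki_um, conc_um, time_min):
    """Fraction of target covalently modified after time_min minutes."""
    return 1.0 - math.exp(-k_obs(kinact_per_min, ki_um, conc_um) * time_min)


# Hypothetical inhibitor: kinact = 0.1 per minute, Ki = 1 uM
KINACT = 0.1
KI_UM = 1.0

for conc_um in (0.01, 1.0):
    for time_min in (60, 6000):
        occupancy = fraction_inactivated(KINACT, KI_UM, conc_um, time_min)
        print(f"[I] = {conc_um:4.2f} uM, t = {time_min:4d} min: "
              f"{100 * occupancy:5.1f}% inactivated")
```

Under these assumptions even the lower concentration approaches complete inactivation if the incubation is long enough, so a concentration threshold quoted without an incubation time doesn’t tell you very much.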

It’s now time to wrap up and I suggest that it would not be valid to criticise (either publicly or in peer review) a study simply on the grounds that it reported results of experiments in which a chemical probe was used at a concentration exceeding a recommended maximum value. The S2023 authors assert that an additional orthogonal target-engaging probe can be substituted for a matched target-inactive control compound but this appears to contradict criteria for classical modulators given by the Chemical Probes Portal.

Wednesday, 27 September 2023

Five days in Vermont

A couple of months ago I enjoyed a visit to the US (my first for eight years) on which I caught up with old friends before and after a few days in Vermont (where a trip to the golf course can rapidly become a National Geographic Moment). One highlight of the trip was randomly meeting my friend and fellow blogger Ash Jogalekar for the first time in real life (we’ve actually known each other for about fifteen years) on the Boston T Red Line. Following a couple of nights in green and leafy Belmont, I headed for the Flatlands with an old friend from my days in Minnesota for a Larry Miller group reunion outside Chicago before delivering a short harangue on polarity at Ripon College in Wisconsin. After the harangues, we enjoyed a number of most excellent Spotted Cattle (Only in Wisconsin) in Ripon. I discovered later that one of my Instagram friends is originally from nearby Green Lake and had taken classes at Ripon College while in high school. It is indeed a small world.

The five days spent discussing computer-aided drug design (CADD) in Vermont are what I’ll be covering in this post and I think it’s worth saying something about what drugs need to do in order to function safely. First, drugs need to have significant effects on therapeutic targets without having significant effects on anti-targets such as hERG or CYPs and, given the interest in new modalities, I’ll say “effects” rather than “affinity”, although Paul Ehrlich would have reminded us that drugs need to bind in order to exert effects. Second, drugs need to get to their targets at sufficiently high concentrations for their effects to be therapeutically significant (drug discovery scientists use the term ‘exposure’ when discussing drug concentration). Although it is sometimes believed that successful drugs simply reduce the numbers of patients suffering from symptoms it has been known from the days of Paracelsus that it is actually the dose that differentiates a drug from a poison.

Drug design is often said to be multi-objective in nature although the objectives are perhaps not as numerous as many believe (this point is discussed in the introduction section of NoLE, an article that I'd recommend to insomniacs everywhere). The first objective of drug design can be stated in terms of minimization of the concentration at which a therapeutically useful effect on the target is observed (this is typically the easiest objective to define since drug design is typically directed at specific targets). The second objective of drug design can be stated in analogous terms as maximization of the concentration at which toxic effects on the anti-targets are observed (this is a more difficult objective to define because we generally know less about the anti-targets than about the targets). The third objective of drug design is to achieve controllability of exposure (this is typically the most difficult objective to define because drug concentration is a dose-dependent, spatiotemporal quantity and intracellular concentration cannot generally be measured for drugs in vivo). Drug discovery scientists, especially those with backgrounds in computational chemistry and cheminformatics, don’t always appreciate the importance of controlling exposure and the uncertainty in intracellular concentration always makes for a good stock question for speakers and panels of experts.

I posted previously on artificial intelligence (AI) in drug design and I think it’s worth highlighting a couple of common misconceptions. The first misconception is that we just need to collect enough data and the drugs will magically condense out of the data cloud that has been generated (this belief appears to have a number of adherents in Silicon Valley). The second misconception is that drug design is merely an exercise in prediction when it should really be seen in a Design of Experiments framework. It’s also worth noting that genuinely categorical data are rare in drug design and my view is that many (most?) "global" machine learning (ML) models are actually ensembles of local models (this heretical view was expressed in a 2009 article and we were making the point that what appears to be an interpolation may actually be an extrapolation). Increasingly, ML is becoming seen as a panacea and it’s worth asking why quantitative structure activity relationship (QSAR) approaches never really made much of a splash in drug discovery.

I enjoyed catching up with old friends [ D | K | S | R/J | P/M ] as well as making some new ones [ G | B/R | L ]. However, I was disappointed that my beloved Onkel Hugo was not in attendance (I continue to be inspired by Onkel’s laser-like focus on the hydrogen bonding of the ester) and I hope that Onkel has finally forgiven me for asking (in 2008) if Austria was in Bavaria. There were many young people at the gathering in Vermont and their enthusiasm made me greatly optimistic for the future of CADD (I’m getting to the age at which it’s a relief not to be greeted with: "How nice to see you, I thought you were dead!"). Lots of energy at the posters (I learned from one that Voronoi was Ukrainian) although, if we’d been in Moscow, I’d have declined the refreshments and asked for a room on the ground floor (left photo below). Nevertheless, the bed that folded into the wall (centre and right photos below) provided plenty of potential for hotel room misadventure without the ‘helping hands’ of NKVD personnel.

It'd been four years since CADD had been discussed at this level in Vermont so it was no surprise to see COVID-19 on the agenda. The COVID-19 pandemic led to some very interesting developments including the Covid Moonshot (a very different way of doing drug discovery and one I was happy to contribute to during my 19-month sojourn in Trinidad) and, more tangibly, Nirmatrelvir (an antiviral medicine that has been used to treat COVID-19 infections since early 2022). Looking at the molecular structure of Nirmatrelvir you might have mistaken trifluoroacetyl for a protecting group but it’s actually an important feature (it appears to be beneficial from the permeability perspective). My view is that the alkane/water logP (alkane is a better model than octanol for the hydrocarbon core of a lipid bilayer) for a trifluoroacetamide is likely to be a couple of log units greater than for the corresponding acetamide.

I’ll take you through how the alkane/water logP difference between a trifluoroacetamide and corresponding acetamide can be estimated in some detail because I think this has some relevance to using AI in drug discovery (I tend to approach pKa prediction in an analogous manner). Rather than trying to build an ML model for making the prediction, I’ve simply made connections between measurements for three different physicochemical properties (alkane/water logP, hydrogen bond basicity and hydrogen bond acidity) which is something that could easily be accommodated within an AI framework. I should stress that this approach can only be used because it is a difference in alkane/water logP (as opposed to absolute values) that is being predicted and these physicochemical properties can plausibly be linked to substructures.

Let’s take a look at the triptych below which I admit is not quite up to the standards of Hieronymus Bosch (although I hope that you find it to be a little less disturbing). The first panel shows values of polarity (q) for some hydrogen bond acceptors and donors (you can find these in Tables 2 and 3 in K2022) that have been derived from alkane/water logP measurements. You could, for example, use these polarity values to predict that reducing the polarity of an amide carbonyl oxygen to the extent that it looks like a ketone will lead to a 2.2 log unit increase in alkane/water logP. The second panel shows measured hydrogen bond basicity values for three hydrogen bond acceptors (you can find these in this freely available dataset) and the values indicate that a trifluoroacetamide is an even weaker hydrogen bond acceptor than a ketone. Assuming a linear relationship between polarity and hydrogen bond basicity, we can estimate that the trifluoroacetamide carbonyl oxygen is 2.4 log units less polar than the corresponding acetamide. The final panel shows measured hydrogen bond acidity values (you can find these in Table 1 of K2022) that suggest that an imide NH (q = 1.3; 0.5 log units more polar than typical amide NH) will be slightly more polar than the trifluoroacetamide NH of Nirmatrelvir. So to estimate the difference in alkane/water logP values you just need to subtract the additional polarity of trifluoroacetamide NH (0.5 log units) from the lower polarity of the trifluoroacetamide carbonyl oxygen (2.4) to get 1.9 log units.
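The final step of the estimate is just bookkeeping and can be written out as a minimal Python sketch. The two polarity changes are the values quoted above (using the imide NH as a conservative stand-in for the trifluoroacetamide NH), while the function and variable names are my own and purely illustrative.

```python
def delta_alkane_water_logp(acceptor_polarity_decrease, donor_polarity_increase):
    """Estimated change in alkane/water logP (log units).

    A less polar hydrogen bond acceptor raises alkane/water logP while
    a more polar hydrogen bond donor lowers it.
    """
    return acceptor_polarity_decrease - donor_polarity_increase


# Values quoted in the text for trifluoroacetamide relative to acetamide
CARBONYL_POLARITY_DECREASE = 2.4  # carbonyl oxygen is 2.4 log units less polar
NH_POLARITY_INCREASE = 0.5        # NH taken as up to 0.5 log units more polar

estimate = delta_alkane_water_logp(CARBONYL_POLARITY_DECREASE, NH_POLARITY_INCREASE)
print(f"Estimated increase in alkane/water logP: {estimate:.1f} log units")
```

Because the imide NH is expected to be slightly more polar than the trifluoroacetamide NH, the 0.5 log unit correction is probably an upper limit.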

Chemical space is a recurring theme in drug design and its vastness, which defies human comprehension, has inspired much navel-gazing over the years (it’s actually tangible chemical space that’s relevant to drug design). In drug discovery we need to be able to navigate chemical space (ideally without having to ingest huge quantities of Spice) and, given that Ukrainian chemists have revolutionized the world's idea of tangible chemical space (and have also made it a whole lot larger), it is most appropriate to have a Ukrainian guide who is most ably assisted by a trusty Transylvanian sidekick. I see benefits from considering molecular complexity more explicitly when mapping chemical space.

AI (as its evangelists keep telling us) is quite simply awesome at generating novel molecular structures although, as noted in a previous post, there’s a little bit more to drug design than simply generating novel molecular structures. Once you’ve generated a novel molecular structure you need to decide whether or not to synthesize the compound and, in AI-based drug design, molecular structures are often assessed using ML models for biological activity as well as absorption, distribution, metabolism and excretion (ADME) behaviour. It’s well-known that you need a lot of data for training these ML models but you also need to check that the compounds for which you’re making predictions lie within the chemical space occupied by the training set (one way to do this is to ensure that close structural analogs of these compounds exist in the training set) because you can’t be sure that the big data necessarily cover the regions of chemical space of interest to drug designers using the models. A panel discusses the pressing requirement for more data although ML modellers do need to be aware that there’s a huge difference between assembling data sets for benchmarking and covering chemical space at sufficiently high resolution to enable accurate prediction for arbitrary compounds.
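The check for close structural analogs in the training set can be sketched in a few lines of Python. This is a minimal illustration rather than a recommended protocol: I’m assuming that each fingerprint is represented as a set of ‘on’ bit indices, the similarity threshold of 0.7 is arbitrary, and the function names and toy fingerprints are hypothetical.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient for two fingerprints given as sets of 'on' bits."""
    a, b = set(fp_a), set(fp_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def nearest_neighbour_similarity(query_fp, training_fps):
    """Similarity of the query to its closest analog in the training set."""
    return max((tanimoto(query_fp, fp) for fp in training_fps), default=0.0)


def has_close_analog(query_fp, training_fps, threshold=0.7):
    """True if at least one training compound is at least as similar as threshold."""
    return nearest_neighbour_similarity(query_fp, training_fps) >= threshold


# Toy fingerprints (hypothetical bit indices, not derived from real structures)
training_set = [{1, 2, 3, 4}, {10, 11, 12}]
query = {1, 2, 3, 5}

print(nearest_neighbour_similarity(query, training_set))
print(has_close_analog(query, training_set))
```

A prediction for a compound that fails a check like this is better regarded as an extrapolation, however large the training set happens to be.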

There are other ways to think about chemical space. For example, differences in biological activity and ADME-related properties can also be seen in terms of structural relationships between compounds. These structural relationships can be defined in terms of molecular similarity (Tanimoto coefficient for the molecular fingerprints of X and Y is 0.9) or substructure (X is the 3-chloro analog of Y). Many medicinal chemists think about structure-activity relationships (SARs) and structure-property relationships (SPRs) in terms of matched molecular pairs (MMPs: pairs of molecular structures that are linked by specific substructural relationships) and free energy perturbation (FEP) can also be seen in this framework. Strong nonadditivity and activity cliffs (large differences in activity observed for close structural analogs) are of considerable interest as SAR features in their own right and because prediction is so challenging (and therefore very useful for testing ML and physics-based models for biological activity). One reason that drug designers need to be aware of activity cliffs and nonadditivity in their project data is that these SAR features can potentially be exploited for selectivity.
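Nonadditivity is commonly quantified using a ‘double transformation cycle’ of four compounds (a parent, the two singly-modified analogs and the doubly-modified analog). Here’s a minimal Python sketch in which nonadditivity is the effect of making both structural changes minus the sum of the effects of making each change alone. The pIC50 values are hypothetical and the function name is my own.

```python
def nonadditivity(p_parent, p_a, p_b, p_ab):
    """Nonadditivity (log units) for a double transformation cycle.

    p_parent: activity of the parent compound
    p_a, p_b: activities of the two singly-modified analogs
    p_ab:     activity of the analog carrying both modifications
    """
    effect_of_both = p_ab - p_parent
    sum_of_single_effects = (p_a - p_parent) + (p_b - p_parent)
    return effect_of_both - sum_of_single_effects


# Hypothetical pIC50 values for the four compounds in a cycle
value = nonadditivity(p_parent=6.0, p_a=6.5, p_b=6.2, p_ab=8.0)
print(f"Nonadditivity: {value:.1f} log units")
```

A value of zero corresponds to perfectly additive SAR, and any judgement about whether a non-zero value is significant needs to take account of the experimental uncertainty in all four measurements.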

Cheminformatic approaches can also help you to decide how to synthesize the compounds that you (or your AI Overlords) have designed and automated synthetic route planning is a prerequisite for doing drug discovery in ‘self-driving’ laboratories. The key to success in cheminformatics is getting your data properly organized before starting analysis and the Open Reaction Database (ORD), an open-access schema and infrastructure for structuring and sharing organic reaction data, facilitates training of models. One area that I find very exciting is the use of high-throughput experimentation in the search for new synthetic reactions which can lead to better coverage of unexplored chemical space. It’s well known in industry that the process chemists typically synthesize compounds by routes that differ from those used by the medicinal chemists and data-driven multi-objective optimization of catalysts can lead to more efficient manufacturing processes (a higher conversion to the desired product also makes for a cleaner crude product).

It’s now time to wrap up what’s been a long post. Some of what is referred to as AI appears to already be useful in drug discovery (especially in the early stages) although non-AI computational inputs will continue to be significant for the foreseeable future. I see a need for cheminformatic thinking in drug discovery to shift from big data (global ML models) to focused data (generate project specific data efficiently for building local ML models) and also see advantages in using atom-based descriptors that are clearly linked to molecular interactions. One issue for data-driven approaches to prediction of biological activity such as ML and QSAR modelling is that the need for predictive capability is greatest when there's not much relevant data and this is a scenario under which physics-based approaches have an advantage. In my view, validation of ML models is not a solved problem since clustering in chemical space can cause validation procedures to make optimistic assessments of model quality. I continue to have significant concerns about how relationships (which are not necessarily linear) between descriptors are handled in ML modelling and remain generally skeptical of claims for interpretability of ML models (as noted in NoLE, the contribution of a protein–ligand contact to affinity is not, in general, an experimental observable).
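One way to reduce the optimism that clustering in chemical space can introduce into validation is to keep each cluster entirely within either the training set or the test set. Here’s a minimal Python sketch of such a split. I’m assuming that cluster labels (for example, chemical series) have already been assigned by some other means, and the function name and toy labels are hypothetical.

```python
import random
from collections import defaultdict


def cluster_aware_split(cluster_ids, test_fraction=0.2, seed=0):
    """Split compound indices so that no cluster straddles train and test."""
    members = defaultdict(list)
    for index, cluster in enumerate(cluster_ids):
        members[cluster].append(index)

    clusters = sorted(members)
    random.Random(seed).shuffle(clusters)

    target_test_size = test_fraction * len(cluster_ids)
    train, test = [], []
    for cluster in clusters:
        # Fill the test set with whole clusters until it reaches the target size
        bucket = test if len(test) < target_test_size else train
        bucket.extend(members[cluster])
    return sorted(train), sorted(test)


# Ten compounds from four hypothetical chemical series
labels = ["A", "A", "A", "B", "B", "C", "C", "C", "C", "D"]
train_idx, test_idx = cluster_aware_split(labels, test_fraction=0.3)
print("train:", train_idx)
print("test: ", test_idx)
```

Because whole clusters are assigned, the size of the test set will only approximate the requested fraction, and a model that looks good with a random split may look rather less impressive when assessed this way.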

Many thanks for staying with me to the end and hope to see many of you at EuroQSAR in Barcelona next year. I'll leave you with a memory from the early days of chemical space navigation.

Wednesday, 26 July 2023

Blogger Meets Blogger

Over the years I’ve had some cool random encounters (some years ago I bumped into a fellow member of the Macclesfield diving club in the village of Pai in the north of Thailand) but the latest is perhaps the most remarkable (even if it's not quite in the league of Safecracker Meets Safecracker in Surely You’re Joking). I was riding the Red Line on Boston’s T en route to Belmont from a conference in Vermont when my friend Ash Jogalekar, well known for The Curious Wavefunction blog, came over and introduced himself. Ash and I have actually known each other for about 15 years but we’d never before met in real life.

The odds against such an encounter would appear to be overwhelming since Ash lives in California while this was my first visit to the USA since 2015. I had also explored the possibility of getting a ride to Boston (some of those attending had driven to the conference from there) because the bus drops people off at the airport. Furthermore, I was masked on the T which made it more difficult for Ash to recognize me. However, I was carrying my poster tube (now re-purposed for the transport of unclean underwear) and, fortuitously, the label with my name was easy for Ash to spot. Naturally, we discussed the physics of ligand efficiency.

Tuesday, 18 July 2023

AI-based drug design?


I’ll start this post by stressing that I’m certainly not anti-AI. I actually believe that drug design tools that are being described as AI-based are potentially very useful in drug discovery. For example, I’d expect natural language processing capability to enable drug discovery scientists to access relevant information without actually having to create database queries. I actually have a long-standing interest in automated molecular structure editing (see KS2005) and see the ability to build chemical structures in an automated manner using Generative AI as a potentially useful addition to the drug designer’s arsenal. Physical chemistry is very important in drug design and there are likely benefits to be had from building physicochemical awareness into the AI tools (one approach would be to use atom-based measures of interaction potential and I’ll direct you to some relevant articles: A1989 | K1994 | LB2000 | H2004 | L2009 | K2009 | L2011 | K2016 | K2022).

All that said, the AI field does appear to be associated with a degree of hype and a number of senior people in the drug discovery field seem to have voluntarily switched off their critical thinking skills (it might be a trifle harsh to invoke terms like “herding instinct” although doing so will give you a better idea of what I’m getting at). Trying to deal with the diverse hype of AI-based drug design in a single blog post is likely to send any blogger on a one-way trip to the funny farm so I’ll narrow the focus a bit. Specifically, I’ll be trying to understand the meaning of the term “AI-designed drug”.

The prompt for this post came from the publication of “Inside the nascent industry of AI-designed drugs” in Nature Medicine and I don’t get the impression that the author of the article is too clued up on drug design:

Despite this challenge, the use of artificial intelligence (AI) and machine learning to understand drug targets better and synthesize chemical compounds to interact with them has not been easy to sell.

Apparently, AI is going to produce the drugs as well as design them:

“We expect this year to see some major advances in the number of molecules and approved drugs produced by generative AI methods that are moving forward”, Hopkins says.

I’d have enjoyed being a fly on the wall at this meeting although perhaps they should have been asking “why” rather than “how”:

“They said to me: Alex, these molecules look weird. Tell us how you did it”, Zhavaoronkov [sic] says. "We did something in chemistry that humans could not do.”

So what I think it means to claim that a drug has been “AI-designed” is that the chemical structure of the drug has been initially generated by a computer rather than a human (I’ll be very happy to be corrected on this point). Using computers to generate chemical structures is not exactly new and people were enumerating combinatorial libraries from synthetic building blocks over two decades ago (that’s not to deny that there has been considerable progress in the field of generating chemical structures). Merely conceiving a structure does not, however, constitute design and I’d question how accurate it would be to use the term “AI-designed” if structures generated by AI had subsequently been evaluated using non-AI methods such as free energy perturbation.
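To give a feel for how little is needed to generate structures by computer, here is a minimal sketch of combinatorial library enumeration. The building blocks are hypothetical and are written as SMILES fragments chosen so that plain string concatenation of an acyl fragment and an amine fragment gives a valid amide. This is an assumption made to keep the example self-contained; real enumeration tools typically apply reaction transforms through a cheminformatics toolkit.

```python
from itertools import product

# Hypothetical building blocks, written as SMILES fragments chosen so that
# simple concatenation (acyl fragment + amine fragment) gives a valid amide.
acyl_fragments = {
    "acetyl": "CC(=O)",
    "benzoyl": "c1ccccc1C(=O)",
    "cyclopropanecarbonyl": "C1CC1C(=O)",
}
amine_fragments = {
    "methylamine": "NC",
    "morpholine": "N1CCOCC1",
    "benzylamine": "NCc1ccccc1",
}

def enumerate_amides(acyls, amines):
    """Pair every acyl fragment with every amine fragment."""
    library = {}
    for (acyl_name, acyl), (amine_name, amine) in product(acyls.items(), amines.items()):
        library[(acyl_name, amine_name)] = acyl + amine
    return library

library = enumerate_amides(acyl_fragments, amine_fragments)
for names, smiles in library.items():
    print(names, smiles)
```

Three acyl fragments and three amine fragments give nine products, and the library grows multiplicatively with the number of building blocks. Generating the structures is the easy part; deciding which of them are worth making is where the design comes in.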

One piece of advice that I routinely offer to anybody seeking to transform or revolutionize drug discovery is to make sure that you understand what a drug needs to do. First, the drug needs to interact to a significant extent with one or more therapeutic targets (while not interacting with anti-targets such as hERG and CYPs) and this is why molecular interactions (see B2010 | P2015) are of great interest in medicinal chemistry. Second, the drug needs to get to its target(s) at a sufficiently high concentration (the term exposure is commonly used in drug discovery) in order to have therapeutically useful effects on the target(s). This means that achieving controllability of exposure should be seen as a key objective of drug design. One of the challenges facing drug designers is that it’s not generally possible to measure intracellular concentration for drugs in vivo and I recommend that AI/ML leaders and visionaries take a look at the SR2019 study.

Given that this post is focused on how AI generates chemical structures, I thought it might be an idea to look at how human chemists currently decide which compounds are to be synthesized. Drug design is incremental which reflects the (current) impossibility of accurately predicting the effects that a drug will have on a human body directly from its molecular structure. Once a target has been selected, compounds are screened for having a desired effect on the target and the compounds identified in the screening phase are usually referred to as hits.

The screening phase is followed by the hit-to-lead phase and it can be helpful to draw an analogy between drug discovery and what is called football outside the USA. It’s not generally possible to design a drug from screening output alone and to attempt to do so would be the equivalent of taking a shot at goal from the centre spot. Just as the midfielders try to move the ball closer to the opposition goal, the hit-to-lead team use the screening hits as starting points for design of higher affinity compounds. The main objective in the hit-to-lead phase is to generate information that can be used for design and mapping structure-activity relationships for the more interesting hits is a common activity in hit-to-lead work.

The most attractive lead series are optimized in the lead optimization phase. In addition to designing compounds with increased affinity, the lead optimization team will generally need to address specific issues such as inadequate oral absorption, metabolic liability and off-target activity. Each compound synthesized during the course of a lead optimization campaign is almost invariably a structural analog of a compound that had already been synthesized. Lead optimization tends to be less ‘generic’ than lead identification because the optimization path is shaped by these specific issues which implies that ML modelling is likely to be less applicable to lead optimization than to lead identification.

This post is all about how medicinal chemists decide which compounds get synthesized and these decisions are not made in a vacuum. The decisions made by lead optimization chemists are constrained by the leads identified by the hit-to-lead team just as the decisions made by lead identification chemists are constrained by the screening output. While AI methods can easily generate chemical structures, it's currently far from clear that AI methods can eliminate the need for humans to make decisions as to which compounds actually get synthesized.

This is a good point at which to wrap up. One error commonly made by people with an AI/ML focus is to consider drug design purely as an exercise in prediction while, in reality, drug design should be seen more in a Design of Experiments framework.