Molecular Design: Leadeth me unto Truth and delivereth me from those who have already found it

A theory has only the alternative of being true or false.

A model has a third possibility: it may be true, but irrelevant.

With apologies to Manfred Eigen (1927 - 2019)

******************

[This post was updated on 25-Jun-2024]

I've just returned to Cheshire from the Caribbean and, to kick off blogging for 2024, I'll share a photo of the orchids at Berwick-on-Sea on the north coast of Trinidad.

Encountering words like “truth” and “beauty” (here's a good example) in the titles of scientific articles always sets off warning bells for me. I’ll start 2024 with a look at FM2024 (Structure is beauty, but not always truth) that was recently published in Cell (and has already been reviewed by Derek). The authors have highlighted important issues: we typically use single conformations of targets in design and the experimentally-determined structures used for design may differ substantially from the structures of targets as they exist in vivo. These points do need to be stressed given the expanding range of modalities being exploited by drug designers and the increasing use of AI/ML in drug design. That said, it’s my view that the authors have allowed themselves to become prisoners of their article’s title. Specifically, I see “beauty” as a complete red herring and suggest that it would have been much better to have discussed structure in terms of accuracy and relevance rather than truth. Here’s the abstract for FM2024:

Structural biology, as powerful as it is, can be misleading. We highlight four fundamental challenges: interpreting raw experimental data; accounting for motion; addressing the misleading nature of in vitro structures; and unraveling interactions between drugs and “anti-targets.” Overcoming these challenges will amplify the impact of structural biology on drug discovery.

I'll start by taking a look at the introduction and my view is that the authors do need to be much clearer about what they mean by “this hydrogen bond is better than that one” when using terms like “ground truth”. For example, we can infer that the geometry of one target-ligand hydrogen bond is closer to optimal than the geometry of another target-ligand hydrogen bond. However, the energetic cost of breaking a target-ligand hydrogen bond is not something that can generally be measured and, as noted in NoLE, the contribution of an intermolecular contact to affinity is not actually an experimental observable. Ligands associate with their targets (and anti-targets) in aqueous media and this means that intermolecular contacts, for example between polar and non-polar atoms, can destabilize the target-ligand complex without being inherently repulsive. What I’m getting at here is that structures of ligand-target complexes are relatively simple and well-defined entities within the broader context of drug discovery and yet it doesn’t appear useful to discuss them in terms of truth.

The remainder of the post follows the FM2024 section headings.

A structure is a model, not experimental reality

The term “structure” can have a number of different meanings in structure-based drug design. First, drug targets (and anti-targets) have structures that exist regardless of whether they have been experimentally determined. Second, models are built for drug targets by fitting nuclear coordinates to experimental data such as electron density (these are often referred to as experimental structures although they should strictly be called models because they are abstractions of the experimental data). Third, the structure could have been predicted using computational tools such as AlphaFold2 (here's an article, cited by FM2024, on why we still need experimentally-determined structures).

In the abstract the authors identify “interpreting raw experimental data” as one of “four fundamental challenges”. However, the actual focus of this section appears to be evaluation of predicted structures rather than interpretation of raw experimental data. While I’m sure that we can find better ways to interpret raw experimental data, and indeed to evaluate predicted structures, I don’t see either as representing a fundamental challenge.

Representing wiggling and jiggling is hard

My view is that it’s the ensemble of conformations, rather than the wiggling and jiggling, that we actually need to represent. Simulation of the wiggling and jiggling is one way to generate an ensemble of conformations but it’s not the only way (nor is it necessarily the best way). That said, it's a lot easier to sell protein motion to venture capitalists than it is to sell ensembles of conformations.

The authors state:

Analogous to how structure-based drug design is great for optimizing “surface complementarity” and electrostatics, future protein modeling approaches will unlock ensemble-based drug design with an ability to predictably tune new and important aspects of design, including entropic contributions [7] and residence times [8] of bound ligands.

The term “entropic contributions” does come across as arm-waving (especially in a drug design context) and my view is that entropy should be seen as an effect rather than a cause. Thermodynamic signatures for binding are certainly of scientific interest but I would argue that they are essentially irrelevant to drug design (it can be instructive to consider how patients might sense the benefits of enthalpically-driven drug binding). The case for increasing residence time might not be quite as solid as many believe it to be (see the F2018 study and this blog post).

In vitro can be deceiving

The authors identify “addressing the misleading nature of in vitro structures” as a fundamental challenge and they state:

While purifying a protein out of its cellular context can be enabling for in vitro drug discovery, it can also provide a false impression. Recombinant expression can lead to missing post-translational modifications (e.g., phosphorylation or glycosylation) that are critical to understanding the function of a protein.

To this I’d add that we often don’t use the full-length proteins in design and recombinant proteins may have been engineered to make them easier to crystallize or more robust for soaking experiments. Furthermore, target engagement may require the drug to interact with two or more proteins (see HC2017) which will probably be more amenable individually to structure determination than their complex. I fully agree that it is important for drug designers to be aware that the experimentally-determined structures that they're using differ from the structures of the targets as they exist in vivo. However, I don't believe that it makes any sense to talk about “the misleading nature of in vitro structures” (or indeed about “in vitro drug discovery”) because target structures are never experimentally determined in vivo and are only misleading to the extent that users overinterpret them. As a more general point, users of experimental data do need to be very careful about describing the experimental data that they’re using as “misleading” or "deceiving".

When we use structures to represent targets the issue is much less about the truth of the structures that we’re using and much more about their relevance to the targets that we’re trying to represent. This is not just an issue for structural biology and we might, for example, use the catalytic domain of an enzyme as a model for the full-length protein when running biochemical assays. We have to make assumptions in these situations and we also need to check that these assumptions are reasonable. For example, we might examine the structure-activity relationship in a cell-based assay for consistency with the structure-activity relationship that we’ve observed in the enzyme inhibition assay. It's also worth pointing out that what we observe in cells is usually a coarse approximation to what actually happens in vivo and we can't even measure the intracellular concentration of a drug in vivo.

Drugs mingle with many different receptors

Drugs do indeed mingle with many receptors in vivo but it’s important to be aware that the consequences of this mingling depend on the drug concentration (a spatiotemporal quantity) at the site of action. Drug discovery scientists use the term exposure when talking about drug concentration at the site of action and one underappreciated challenge in drug design is that intracellular drug concentration cannot generally be measured in vivo (here’s an open access article that I recommend to everybody working in drug discovery). I argue in NoLE that controllability of exposure should be seen as a drug design objective although the current impossibility of measuring intracellular concentration means that we can only assess how effectively the objective has been achieved in an indirect manner. Alternatively, drug design can be seen in terms of minimization of the dose at which therapeutically beneficial effects can be observed.

One assumption often made in drug design is that the drug concentration at the site of action is equal to the unbound concentration in plasma and this assumption is referred to as the free drug hypothesis (FDH) although the term “free drug theory” is also used. The basis for the FDH is the assumption that the drug can move freely between plasma and the target compartment. In reality the drug concentration at the site of action will generally lag behind its unbound plasma concentration and the lag time is inversely related to the ease with which the drug permeates through the barriers which separate the target from the plasma. There are a couple of scenarios under which you can’t assume that the drug concentration in the target compartment will be the same as its unbound plasma concentration. The first of these is when active transport is significant and this is a scenario with which drug designers tackling targets within the central nervous system (CNS) are familiar. The second scenario is that there is an ionizable functional group (as is the case for amines) in the molecular structure of the drug and the pH at the site of action differs significantly from plasma pH (as is the case for lysosomes).
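The second scenario can be made concrete with a rough pH-partition estimate. The sketch below is only illustrative and rests on several assumptions that go beyond the text: the drug is a monoprotic base, only the neutral form crosses the membrane, the neutral form reaches the same concentration on both sides at equilibrium, and there is no active transport or binding. The function name, the pKa of 9.0 and the lysosomal pH of 5.0 are values chosen for the example rather than measurements.

```python
def base_accumulation_ratio(pka: float, ph_site: float, ph_plasma: float = 7.4) -> float:
    """Equilibrium ratio of total (neutral + protonated) concentration of a
    monoprotic base at the site of action relative to unbound plasma.

    Assumes simple pH partitioning: only the neutral form permeates and its
    concentration is equal in both compartments at equilibrium. From the
    Henderson-Hasselbalch equation, total/neutral = 1 + 10**(pKa - pH).
    """
    total_over_neutral_site = 1.0 + 10.0 ** (pka - ph_site)
    total_over_neutral_plasma = 1.0 + 10.0 ** (pka - ph_plasma)
    return total_over_neutral_site / total_over_neutral_plasma


if __name__ == "__main__":
    # Hypothetical amine (pKa 9.0) partitioning into an acidic compartment
    # (pH 5.0 is an assumed, illustrative lysosomal value).
    ratio = base_accumulation_ratio(pka=9.0, ph_site=5.0)
    print(f"Predicted accumulation ratio: {ratio:.0f}-fold")
```

Under these assumptions the predicted concentration in the acidic compartment is a couple of hundred times the unbound plasma concentration, which is why the FDH cannot be taken for granted for basic amines in acidic compartments. Real systems will deviate from this idealized estimate.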

There are two general types of undesirable outcome that can result when a drug encounters receptors with which it mingles. First, the receptor is an anti-target and the encounter results in binding of the drug, leading to toxicity (patients are harmed). Second, the receptor is a metabolic enzyme or a transporter and the encounter leads to the drug either being turned over or pumped from where it needs to function (patients do not benefit from the treatment).

I've inserted some comments (italicised in red in the original post and enclosed in square brackets here) into the following quoted text:

The sad reality that all drug discoverers must face is that however well designed we may believe our compounds to be, they will find ways to interact with many other proteins or nucleic acids in the body and interfere with the normal functions of those biomolecules. While occasionally, the ability of a medicine to bind to multiple biomolecules will increase a drug’s efficacy, such polypharmacology is far more likely to produce undesirable effects. These undesirable outcomes take two forms. Obviously, the direct binding to an anti-target can lead to a bewildering range of toxicities, many of which render the drug too hazardous for any use. [While there are well-known anti-targets such as hERG that must be avoided, my understanding is that those responsible for drug safety generally prefer not to see any off-target activity given the difficulties in prediction of toxicity. Here are a couple of relevant articles (B2012 | J2020) and a link to some information about in vitro safety pharmacology profiling panels from Eurofins. Update 25-Jun-2024: recent review on secondary pharmacology.] More subtly, the binding to anti-targets reduces the ability of the drug to reach the desired target. A drug that largely avoids binding to anti-targets will partition more effectively through the body, enabling it to accumulate at high enough concentrations in the disease-relevant tissue to effectively modulate the function of the target. [I consider it unlikely that binding to an anti-target could account for a significant proportion of the dose. In any case, I’d expect binding of a drug to anti-targets to cause unacceptable toxicity long before it results in sequestration of a significant proportion of the dose.]

A particular challenge results from the interaction of drugs with the enzymes, transporters, channels, and receptors that are largely responsible for controlling the metabolism and pharmacokinetic properties (DMPK) of those drugs—their absorption, distribution, metabolism, and elimination. Drugs often bind to plasma proteins, preventing them from reaching the intended tissues; [A degree of binding to plasma proteins is not a problem and, in the case of warfarin, is probably essential for the safe use of the drug.] they can block or be substrates for all manner of pumps and transporters, changing their distribution through the body; [Transporters can indeed prevent drugs from getting to their sites of action at therapeutically effective concentrations and limited brain exposure resulting from active efflux is a common issue for CNS drug discovery programs (see H2012 and R2015). I am not aware of any transporters that are definitely considered to be anti-targets from the safety perspective (I'm happy to be corrected on this point) and inhibition of efflux pumps is a recognized tactic (see T2021 and H2020) in drug discovery. Update 25-Jun-2024: I thank Mohamed Diwan M. AbdulHameed (google scholar profile) for making me aware that inhibition of bile salt export pump (BSEP) is considered a risk factor for drug-induced liver injury (DILI). Here's a relevant article.] xenobiotic sensors such as PXR that turn on transcriptional programs recognizing foreign substances; and they often block enzymes like cytochrome P450s, thereby changing their own metabolism and that of other medicines. [Inhibition of CYPs is generally considered undesirable from the safety perspective because of the potential for drug-drug interactions (see H2020). That said, the CYP3A inhibitor ritonavir (see CG2003) is used in the COVID-19 treatment Paxlovid to slow metabolism of the SARS-CoV-2 main protease inhibitor nirmatrelvir.] They are themselves substrates for P450s and other metabolizing enzymes and, once altered, can no longer carry out their assigned, life-saving function. [Medicinal chemists are well aware of the challenges presented by drug-metabolizing enzymes although it must be stressed that any drug that was cleared too slowly would be considered to be an unacceptable safety risk.]

Taken together, we refer to these DMPK-related proteins, somewhat tongue-in-cheek, as the “avoidome” (Figure 2). [It is unclear why the authors have chosen to only include DMPK-related proteins in the avoidome (hERG is not a DMPK-related protein but is an anti-target that every drug discovery scientist would wish to avoid blocking). For reasons outlined in the previous paragraph I would actually argue against the inclusion of DMPK-related proteins in the avoidome.] Unfortunately, the structures of the vast majority of avoidome targets have not yet been determined. Further, many of these proteins are complex machines that contain multiple domains and exhibit considerable structural dynamism. Their binding pockets can be quite large and promiscuous, favoring distinct binding modes for even closely related compounds. [It is not clear whether this assertion is based on experimental observations.] As a consequence, multiple structures spanning a range of bound ligands and protein conformational states will be required to fully understand how best to prevent drugs from engaging these problematic anti-targets.

We believe the structural biology community should “embrace the avoidome” with the same enthusiasm that structure-based design has been applied to intended targets. [My view is that the authors need to clearly articulate their reasons for only including DMPK-related proteins in the avoidome before seeking to direct the activities of the structural biology community. I presume that the Target 2035 initiative, which aims “to create by year 2035 chemogenomic libraries, chemical probes, and/or biological probes for the entire human proteome”, will also cover anti-targets. Having chemical and/or biological probes available for anti-targets should lead to better understanding of toxicity in humans.] The structures of these proteins will shed considerable light on human biology and represent exciting opportunities to demonstrate the power of cutting-edge structural techniques. [Experimental structures of target-ligand complexes do indeed provide valuable direct evidence that a ligand is binding to a protein but the structures themselves are not particularly informative from the perspective of understanding human biology. It is actually high-quality chemical probes that are needed to shed light on human biology and here’s a link to the Chemical Probes Portal. Structures at atomic resolution for protein-ligand complexes are certainly useful for chemical probe design but are not strictly necessary for effective use of chemical probes.] Crucially, a detailed understanding of the ways that drugs engage with avoidome targets would significantly expedite drug discovery. [Experimentally-determined structures of anti-targets complexed with ligands are certainly informative when elucidating structure-activity relationships for binding to anti-targets. However, structural information of this nature is much less directly useful for addressing problems such as metabolic lability and active efflux.] This information holds the potential to achieve a profound impact on the discovery of new and enhanced medicines.

Conclusion

The authors assert:

In drug discovery, truth is a molecule that transforms the practice of medicine. [I disagree with this assertion. In drug discovery truth may also be a compound that, despite an excellent pharmacokinetic profile, chokes comprehensively in phase 2.]

It's been a long post and this is a good place to leave things. While the authors have raised some valid points, I found the 'Drugs mingle with many different receptors' section to be rather confused and I don't think that the drug discovery and structural biology communities are in desperate need of yet another 'ome' word. I hope that my review of FM2024 will be useful for readers of the article while providing helpful feedback for the authors and for the Editors of Cell.


Wednesday, 27 March 2024
