Molecular Design: May 2026

I'll open the post on drug design objectives with photos from a most enjoyable and informative visit to the Australian Synchrotron early in 2010 when I was helping with fragment library design at CSIRO.

I’ve been meaning for ages to do a post like this and was finally goaded into action when I recently looked at two short videos from interviews with Sir Demis Hassabis, founder of Google DeepMind and Isomorphic Labs, and one of the 2024 Nobel Chemistry Prize laureates. Predicting the 3D structure of a protein from its amino acid sequence is a capability that has been eagerly sought for a long time and, as we celebrate the award, we need to also recognize the remarkable foresight of those who launched the Protein Data Bank in 1971 with just seven X-ray crystal structures. We also need to recognize that proteins are inherently flexible and subject to post-translational modifications such as glycosylation and phosphorylation. Furthermore, the crystal structure that has actually been determined might correspond to a relatively small portion (for example, a tyrosine kinase domain) of a much larger structure such as a dimeric growth factor receptor.

Let’s take a look at the two videos. In the first video, Sir Demis suggests that the end of disease is “within reach maybe in the next decade or so”. It’s worth pointing out that most of the cost of bringing a drug to market comes from clinical development rather than from the actual discovery of the drug (nobody spends “ten years and billions of dollars to design just one drug” and it would be more accurate to say that we do so to see if what we've designed really is a drug). Furthermore, work in the late stage of drug discovery, when project teams are assessing their best compounds, should not really be regarded as drug design. In the second video, Sir Demis acknowledges that “knowing the structure of a protein is only one step in the drug discovery process”, although it’s not clear exactly how “many adjacent AlphaFolds” are going to meaningfully address the issues of side effects.

Drug design is frequently asserted to be a multi-objective exercise and, in this post, I’ll be trying to discuss this in a way that I hope will be helpful to drug discovery scientists using artificial intelligence (AI) and machine learning (ML) in design. The ultimate aim of drug design is to identify compounds (and biological entities such as therapeutic antibodies) that can be used to treat diseases without harming patients, and I suggest that this can be stated as three design objectives. My view is that the term 'multi-objective' is more appropriate than 'multi-parameter' in the context of drug design because, even against a single objective, design can involve optimization of multiple parameters. One characteristic of drug design is that the design process is over long before we get to find out how successfully the outputs of design perform their function (in design of materials it's possible to evaluate design outputs more directly). I recall a Head of Research and Development at Zeneca describing the process as "like steering an oil tanker".

I prefer to use the more general term ‘bioactivity’ to describe the effects of drugs on targets (and anti-targets) because in some cases these effects cannot be meaningfully described by a single parameter such as an IC₅₀ value. As an aside, this is a good point at which to celebrate the recent FDA approval of the PROTAC Vepdegestrant for treatment of ESR1m, ER+/HER2- advanced breast cancer and I'll direct readers to this most excellent and timely review on targeted protein degradation. The concentration of a drug in contact with a target (or anti-target), which varies with time, is determined by dose and by the drug’s absorption, distribution, metabolism, and excretion (commonly referred to as ADME). While the therapeutic and adverse effects of drugs are what the drug does to the body, ADME is what the body does to the drug. Put another way, minimizing toxicity and optimizing ADME are entirely different objectives and I generally recommend that the acronym ADMET not be used.

Uncertainty is omnipresent in drug discovery and, despite what many appear to believe, AI/ML is not going to make this uncertainty vanish as if by magic. Derek was emphasizing the challenges presented by the complexity of biology long before AI came to be seen by some as a panacea for the ills of Pharma/Biotech (here’s a post from almost two decades ago and I also recommend reading his 2025 post on the “End of Disease” interview which also links relevant previous posts). The complexity of biology means that even if we knew the extent of target engagement in vivo (which varies with both dose and time) we wouldn’t generally be able to predict the in vivo effects of the drug with any confidence in the absence of other information. There is also uncertainty in exposure to consider and the concentration of a drug at its site(s) of action generally cannot be measured in vivo unless the target(s) are in direct contact with plasma. Uncertainty in exposure for intracellular targets is also a clinical development issue because failure in a Phase II trial may simply reflect inadequate exposure (we noted in KM2013 that “one can argue that a typical Phase I trial provides an incomplete description of distribution”). I recommend that everybody working in drug discovery and chemical biology read Smith & Rowland (2019), “Intracellular and Intraorgan Concentrations of Small Molecule Drugs: Theory, Uncertainties in Infectious Diseases and Oncology, and Promise”, DMD 47:667-672 (DOI). I argue in NoLE that achieving controllability of exposure should be seen as an objective of drug design.

One way that pharmacokinetic/pharmacodynamic (PK/PD) modellers address the issue of intracellular exposure is to assume that the concentration of drug in contact with its target(s) (and anti-targets) equals its unbound concentration in plasma (which can be measured in real time). This assumption is referred to as the ‘free drug hypothesis’ (‘principle’ and ‘theory’ are also used in this context although I personally prefer ‘hypothesis’ because it’s an assumption we’re making). There are two scenarios under which the approximation of the concentration of drug at its site(s) of action by its unbound concentration in plasma is known to be unreliable. The first scenario is that there is significant active transport at one or more points on the path between plasma and the drug’s site(s) of action (active efflux is a common problem, especially in CNS drug discovery, although active influx will also cause the assumption to break down). The second scenario is that the pH at the drug’s site(s) of action differs from plasma pH (as would be the case for a lysosomal target) and that there is an ionizable group such as a basic nitrogen in the chemical structure of the drug.
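The second scenario can be made concrete with a minimal sketch based on simple pH partitioning. It assumes a monoprotic base, that only the neutral form crosses membranes, and that the neutral form reaches the same unbound concentration on both sides. The function name, the plasma pH of 7.4, the lysosomal pH of 5.0 and the pKa of 9.0 are all illustrative assumptions rather than values from any particular project.

```python
def base_accumulation_ratio(pka: float, ph_site: float, ph_plasma: float = 7.4) -> float:
    """Ratio of total unbound concentration (site of action / plasma) for a monoprotic base.

    Assumptions (hypothetical, for illustration only):
    - only the neutral form permeates membranes;
    - the neutral form is at the same concentration in both compartments;
    - no active transport and no binding within the compartment.
    """
    # Henderson-Hasselbalch: [BH+]/[B] = 10**(pKa - pH) for a base
    ionized_to_neutral_site = 10.0 ** (pka - ph_site)
    ionized_to_neutral_plasma = 10.0 ** (pka - ph_plasma)
    # Total = neutral + ionized; the neutral concentration cancels in the ratio
    return (1.0 + ionized_to_neutral_site) / (1.0 + ionized_to_neutral_plasma)


if __name__ == "__main__":
    # Illustrative values: amine with pKa 9.0, acidic compartment at pH 5.0
    ratio = base_accumulation_ratio(pka=9.0, ph_site=5.0)
    print(f"Predicted accumulation relative to plasma: {ratio:.0f}-fold")
```

Under these assumptions the base is predicted to accumulate by roughly 245-fold in the acidic compartment, which is why the unbound plasma concentration would be a poor proxy for the concentration at a lysosomal site of action. When the two pH values are equal the ratio is 1, which corresponds to the free drug hypothesis holding.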

While drug design does indeed have multiple objectives, it really shouldn’t need to be said that, if the required level of bioactivity cannot be achieved, it becomes irrelevant whether the other objectives are achieved, and I’ll direct readers to M2026 (The Affinity Advantage). I see M2026 as providing a much-needed cold shower for a 2024 JMC Editorial (Property-Based Drug Design Merits a Nobel Prize; see blog post) in which it is asserted that “a discovery compound is more likely to become a drug when Fsp3 > 0.40” and that “a compound is more likely to have good developability when PFI < 7”. Nevertheless, I don’t consider M2026 to be especially useful from the perspective of defining drug design objectives because bioactivity is typically quantified by potency rather than affinity in drug discovery projects (an assay for kinase inhibition might have been run at high ATP concentration to mimic the intracellular environment) and some bioactivity objectives are defined in terms of measurements made in cell-based assays. Furthermore, bioactivity for ‘new modalities’ such as irreversible covalent inhibition and targeted protein degradation cannot be adequately described by a single parameter such as an IC₅₀ value.
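The distinction between potency and affinity in the kinase example can be illustrated with a minimal sketch that uses the Cheng-Prusoff relationship. It assumes a reversible, ATP-competitive inhibitor, and the function name and all numerical values (a Ki of 1 nM, an ATP Km of 10 µM and two ATP concentrations) are hypothetical.

```python
def ic50_competitive(ki: float, substrate_conc: float, km: float) -> float:
    """Cheng-Prusoff estimate of IC50 for a reversible competitive inhibitor.

    ki is returned in the same units as IC50; substrate_conc and km must share
    units with each other (only their ratio matters).
    """
    return ki * (1.0 + substrate_conc / km)


if __name__ == "__main__":
    ki_nm = 1.0        # hypothetical affinity (nM)
    atp_km_um = 10.0   # hypothetical Km for ATP (micromolar)
    # Assay run at [ATP] = Km versus at 1 mM ATP (chosen to mimic a cell)
    for atp_um in (10.0, 1000.0):
        ic50_nm = ic50_competitive(ki_nm, atp_um, atp_km_um)
        print(f"[ATP] = {atp_um:g} uM -> IC50 = {ic50_nm:g} nM")
```

With these illustrative numbers the same compound, with the same affinity, gives an IC₅₀ of 2 nM when the assay is run at the Km for ATP and 101 nM when it is run at 1 mM ATP, so a potency criterion is only meaningful alongside the assay conditions.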

I criticized the term ‘avoid-ome’ in a previous post and, with apologies for the dreadful pun, I would recommend that its use be avoided (at the risk of repetition, ADME and toxicity are entirely separate issues that must be addressed separately). Furthermore, I would question whether drug designers actually need yet another ‘ome’ word and I consider the notion that embracing the avoid-ome will transform drug discovery to be fanciful. While inhibition of cytochrome P450 (CYP) enzymes is generally undesirable from a toxicity perspective, a compound that was not cleared by these metabolic enzymes would greatly worry those responsible for drug safety (bear in mind why we worry about inhibition of CYPs in the first place). Furthermore, I would challenge the inclusion by M2026 of serum albumin in a list of anti-targets such as hERG (I’m not aware of anybody suffering cardiac arrest on account of their medication binding to serum albumin) and the excellent B2025 study notes that "most drugs are >95% plasma protein bound (58%), with a large fraction >99% bound (29%)". Binding to plasma proteins should actually be considered within the framework of distribution (it can be instructive to pose the question as to whether you could tell where a drug was simply from knowing the total quantity of it in the body and its unbound plasma concentration). It’s also worth mentioning that binding to plasma proteins will protect an orally dosed drug from the metabolizing enzymes during its first pass through the liver (before it gets a chance to distribute into the tissues). Variation of the plasma concentration during the dosing interval is a necessary evil of oral dosing and in many situations the ‘ideal’ pharmacokinetic profile would actually be that resulting from intravenous infusion (plasma concentration of the drug is maintained at a level required for therapeutically useful effects).
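The contrast between oral dosing and intravenous infusion can be sketched with a textbook one-compartment model with first-order absorption and elimination. This is a deliberately simplified illustration: the function names and every parameter value (dose, bioavailability, rate constants, volume and dosing interval) are hypothetical, and the model ignores distribution phases, nonlinear clearance and variability between patients.

```python
import math


def oral_steady_state_conc(t, dose, f, ka, ke, v, tau):
    """Plasma concentration at time t after a dose, at steady state.

    One-compartment model, first-order absorption (ka) and elimination (ke),
    repeated dosing every tau hours. Requires ka != ke.
    """
    scale = f * dose * ka / (v * (ka - ke))
    elimination = math.exp(-ke * t) / (1.0 - math.exp(-ke * tau))
    absorption = math.exp(-ka * t) / (1.0 - math.exp(-ka * tau))
    return scale * (elimination - absorption)


def infusion_steady_state_conc(rate, ke, v):
    """Constant plasma concentration at steady state for a zero-order infusion."""
    clearance = ke * v
    return rate / clearance


def oral_profile_summary(dose, f, ka, ke, v, tau, n=1000):
    """Return (peak, trough, average) over one dosing interval at steady state."""
    times = [tau * i / n for i in range(n + 1)]
    concs = [oral_steady_state_conc(t, dose, f, ka, ke, v, tau) for t in times]
    # Trapezoidal rule for the area under the curve over one interval
    auc = sum(0.5 * (concs[i] + concs[i + 1]) * (times[i + 1] - times[i])
              for i in range(n))
    return max(concs), min(concs), auc / tau


if __name__ == "__main__":
    # Hypothetical values: 100 mg every 12 h, F = 0.5, ka = 1.0 /h, ke = 0.1 /h, V = 50 L
    dose, f, ka, ke, v, tau = 100.0, 0.5, 1.0, 0.1, 50.0, 12.0
    peak, trough, average = oral_profile_summary(dose, f, ka, ke, v, tau)
    # Infusion delivering the same amount of absorbed drug per unit time
    css = infusion_steady_state_conc(f * dose / tau, ke, v)
    print(f"Oral: peak {peak:.2f}, trough {trough:.2f}, average {average:.2f} mg/L")
    print(f"Infusion: constant {css:.2f} mg/L")
```

In this model the infusion holds the concentration at the average of the oral profile, while the oral profile swings above and below it during each dosing interval, which is the sense in which the infusion profile can be seen as the 'ideal'.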

At this point I’ll attempt to articulate three general objectives of drug design (the only thing that I’m entirely confident about here is that I won’t get these exactly right). One of the great challenges that drug designers face is that it is usually difficult to identify compounds that simultaneously achieve all the design objectives. Specifying criteria for objectives too permissively increases the risk of choking in clinical development. However, overly stringent specification of criteria for objectives decreases the likelihood of achieving all of the objectives and will slow the discovery process. I state these objectives in terms of ‘bioactivity’ rather than ‘potency’ to accommodate ‘new’ modalities such as irreversible covalent inhibition and targeted protein degradation although, in many cases, it will be possible to quantify the bioactivity for a compound by a single IC₅₀ or EC₅₀ value. I use ‘maximize’ and ‘minimize’ (as opposed to ‘optimize’) to frame the objectives because there is generally no penalty for identifying better compounds than you think you need. Assessing how well objectives have been achieved involves running a diverse range of assays and, as noted in this blog post on the A2025 study, it is important to be fully aware of the quantitation limits for each and every assay that you use.

I'll conclude the post with what I would argue are the three objectives of drug design:

Maximize on-target bioactivity. This is the least difficult objective to specify because bioactivity characterized in the in vitro assays is likely to translate to target engagement in vivo provided that the compound can be presented to the target(s) at the required concentration. Design outputs are usually evaluated in animal models for the human disease before initiating studies in humans but the design itself is almost invariably done against in vitro end points.
Minimize off-target bioactivity. It is generally more difficult to specify objectives for off-target bioactivity than for on-target bioactivity on account of the number and diversity of the assays involved. Design outputs are always evaluated for toxicity in animals before initiating studies in humans (as mandated by regulatory authorities) but the design itself is almost invariably done against in vitro end points.
Maximize controllability of exposure. This objective, which might also be stated as 'Optimize ADME', is the most difficult of the three objectives to specify because, as noted earlier in this post, exposure generally can’t be measured for targets that are not in direct contact with plasma. At an absolute minimum it is necessary to demonstrate that a pharmacokinetic profile can be achieved in animals that will maintain the (unbound) concentration of the compound at levels that we believe will result in beneficial therapeutic effects in humans. For targets not in contact with plasma the PK/PD modellers also need to be able to confidently invoke the free drug hypothesis (this is why I prefer to frame the objective in terms of exposure rather than ADME) and this requires that design outputs have good passive permeability and are not subject to active transport. In some cases it will also be necessary to demonstrate access to specific organs such as the CNS.

Wednesday, 20 May 2026

The objectives of drug design