
Tuesday, 30 July 2024

A Nobel for property-based drug design?

[This post was updated on 04-Aug-2024. I thank Tim Ritchie (see RM2009 | RM2014) for bringing YG2003 (Prediction of Aqueous Solubility of Organic Compounds by Topological Descriptors) to my attention.]

"The problems of ADME are precisely those that determine success or failure of a drug in vivo. In vitro data can give a clearer picture of the receptor characteristics, but knowledge and control of ADME are also vital. A common trap in binding studies is that binding generally increases with lipophilicity, so that one may obtain extremely potent binding that is totally unattainable in vivo."

SH Unger (1987) Computer-Aided Drug Design in the Year 2000. 
Drug Information Journal 21:267-275 DOI
******************************************

In this post I’ll be reviewing an Editorial (Property-Based Drug Design Merits a Nobel Prize) that was recently published in J Med Chem. For me, the Editorial raises questions about the critical thinking skills of its authors and about the judgement of the J Med Chem Editors (I’m guessing that some of the courteous and cultured members of the Nobel Prize committee might regard it to be somewhat pushy, and possibly even uncouth, for journals to be publishing nominations for Nobel Prizes as editorials). My advice to anybody nominating individuals for a Nobel Prize is to be aware of an observation, usually attributed to Jocelyn Bell Burnell, that it’s better that people ask why you didn’t win a Nobel Prize than why you did. Where applicable, I've used the same reference numbers that were used in the Editorial and I’ll start by reproducing the Nobel Prize proposal (as is usual in posts at Molecular Design, I’ve inserted some comments, italicized in red and enclosed in square brackets, into the quoted text):
We propose that a Nobel Prize in Physiology or Medicine should be awarded for property-based drug design, with Christopher A. Lipinski, Paul D. Leeson, and Frank Lovering as the proposed recipients for their development of “important principles for drug design” [I would describe what the proposed Nobel laureates have introduced as a rule, a metric and a molecular descriptor rather than principles.], principles that have contributed to the development of numerous approved drugs. [The authors do need to provide convincing evidence to support what appear to be some wildly extravagant claims. Specifically, the authors need to demonstrate that the rule, metric and molecular descriptor (which they describe as “principles”) were actually critical to the decision-making in projects that led to the development of numerous drugs.] While drug design previously focused primarily on optimizing potency, they introduced a more holistic approach based on the consideration of how fundamental molecular and physicochemical properties affect pharmaceutical, pharmacodynamic, pharmacokinetic, and safety properties. [My view is that none of the proposed Nobel laureates even demonstrated a single convincing link between molecular and physicochemical properties, and pharmaceutical, pharmacodynamic, pharmacokinetic, and safety properties.] The development of the Rof5 by Christopher A. Lipinski in 1997 introduced a new principle for how molecular and physicochemical properties affect oral bioavailability. The development of LipE by Paul D. Leeson in 2007 introduced a new principle for how physicochemical properties impact potency, selectivity, and safety. Finally, the development of Fsp3 by Frank Lovering in 2009 introduced a new principle for how molecular shape affects pharmaceutical properties and developability.

Before examining the contributions of the three nominated individuals it's worth saying something about the objectives of drug design. First, a drug needs to be highly active against its target(s). Second, activity against anti-targets should be very low (ideally too low to even be measured). Third, as I note in 34, the exposure (concentration at the site of action) of the drug needs to be controllable (one challenge in drug design is that intracellular drug concentration can’t generally be measured in vivo and I recommend that all drug discovery scientists read SR2019). I see controlling exposure as the primary focus of property-based design and one fundamental challenge is that structural modifications that lead to increased engagement potential for the therapeutic target(s) frequently result in reduced controllability of exposure as well as increased engagement potential for anti-targets. I’ve tried to capture these points in the graphic shown below.


It's generally accepted that excessive lipophilicity and molecular size are risk factors in drug design and the “compound quality” (CQ) literature abounds with fire-and-brimstone sermons on the evils of "molecular obesity" (see H2011). Nevertheless, the relationships between these descriptors and properties such as binding affinity for anti-targets, permeability, aqueous solubility and metabolic lability are generally not quite as strong as is commonly believed (or claimed). When using trends in data to inform design it’s really important to know how strong the trends are because this tells you how much weight to give to the trends when making decisions. It’s not unknown in CQ studies for trends in data to be made to appear to be stronger than they actually are which endows the CQ field with what I’ll politely call a “whiff of the pasture” (the term “correlation inflation” has been used; see KM2013). Transformation of continuous data (IC50 values) to categorical data (high | medium | low) prior to analysis should trigger a deafening cacophony of alarm bells as should any averaging of groups of continuous data values without showing the spread in the data values. Some examples of studies in which I consider the strengths of trends to have been exaggerated include 29, 35, HMO2016 and HY2010.

I think that one thing that everybody who actually works (or has worked) on drug discovery projects agrees on is that drug discovery is really difficult. My view is that, by focusing on Rof5, LipE and Fsp3, the Editorial actually trivializes the challenges faced by drug discovery scientists. Most drug design (as opposed to ligand design) takes place during lead optimization and lead optimization teams are typically addressing specific problems (for example, structural changes that result in increased potency also result in reduced aqueous solubility).  Lead optimization teams typically work with a lot of measured data (a significant component of drug design is efficient generation of data to enable decision-making) and a weak correlation between logP and aqueous solubility reported in the literature would be of no practical relevance when the lead optimization team is using aqueous solubility measurements for compounds in the structural series that they’re optimizing. It is common (see M2001 | G2008) for the simplicity of rules, guidelines and metrics to be touted and we noted in KM2013 that:   

Given that drug discovery would appear to be anything but simple, the simplicity of a drug-likeness model could actually be taken as evidence for its irrelevance to drug discovery.

Guidelines for successful drug discovery are often presented in terms of something good (or bad) being more likely to happen when the value of a calculated property such as Fsp3 exceeds a threshold. When using guidelines like these be aware that it’s actually very difficult to set these threshold values objectively and that the guidelines would have been stated in an identical manner had different threshold values been chosen to specify them. One difficulty with using guidelines like these is that the creators of the guidelines don’t usually say what they mean by “more likely” (millions of people book flights knowing that one is “more likely” to die in a plane crash if one takes a flight than if one doesn’t take a flight). A number of published guidelines (some of which have been referenced in the Editorial) claim that compounds that comply with the guidelines are more likely to be developable. However, giving weight to these claims would require that developability be defined in an objective manner that enables compounds with arbitrary molecular structures and differing biological activity to be meaningfully compared.   

I’ll examine the contributions of the three proposed laureates for the Nobel Prize in Physiology or Medicine following the order in the Editorial. Let's start with the first:
  
The development of the Rof5 by Christopher A. Lipinski in 1997 introduced a new principle for how molecular and physicochemical properties affect oral bioavailability. [As a reviewer of the manuscript I would have pressed the authors to explicitly state the new principle that their first nominee for the Nobel Prize for Physiology or Medicine had introduced in 1997.]

My view is that the publication of the Rof5 (22) has certainly proven to be highly influential in that it made many drug discovery scientists aware of the need to take account of physicochemical properties, in particular lipophilicity, in drug design. What is less well-known, but possibly more important in my view, is that publication of the Rof5 sent a clear message to Pharma/Biotech management that high-throughput screening wasn’t going to be the panacea that many believed that it would be. However, I don't see the Rof5 as quite the epiphany that the authors of the Editorial would have us believe it to be. The quote with which I started this post was taken from an article that had been published ten years before 22 and the inverse nature of the relationship between aqueous solubility and lipophilicity was being discussed in the scientific literature (see YV1980) more than forty years ago. The NC1996 study is also worthy of mention because it was published more than a year before 22 and it makes the important point that optimal logP values are likely to vary with chemotype ("each congeneric series for a drug backbone usually demonstrates its own optimal log P").       

Questions can be raised about the data analysis presented in support of the Rof5 and readers may find it helpful to take a look at the S2019 study as well as my comments on the Rof5 in HBD3 and in this post. I would argue that the Rof5 does not have any practical value as a drug design tool and I would challenge the assertion made in the Editorial that the publication of 22 demonstrated how “molecular and physicochemical properties affect oral bioavailability”. One aspect of the analysis presented (22) in support of the Rof5 that isn't always fully appreciated is that the compounds for which the descriptors are calculated were all treated as having equivalent oral bioavailability (compounds were selected for the analysis on the basis of having been taken into phase 2 clinical trials at some point before the Rof5 had been published in 1997). This is one reason that it’s not credible to assert that the analysis demonstrates that these molecular and physicochemical properties are linked to bioavailability (it must be stressed that, like many, I do actually believe that excessive lipophilicity and molecular size are risk factors in drug design). I make the following point in a blog post (I’ve modified the original text very slightly for consistency with the Editorial):

The Rof5 is stated in terms of likelihood of poor absorption or permeation although no measured oral absorption or permeability data are given in 22 and the Rof5 should therefore be regarded as a statement of belief. I realise that to make such an assertion runs the risk of an appointment with the auto-da-fé and I stress that had the Rof5 been stated in terms of physicochemical and molecular property distributions I would not have made the assertion.

To see what I was getting at let’s take a look at how the Rof5 was stated in 22 (“The ‘rule of 5’ states that: poor absorption or permeation are more likely when…”). However, the analysis presented in support of the Rof5 was of the distribution of compounds in chemical space defined by molecular weight, logP and numbers of hydrogen bond donors and acceptors with no account being taken of variation in either absorption or permeation for the compounds. Analysis like this can be informative but you need to demonstrate that the chemical space is actually relevant to the phenomena of interest. One way that you can demonstrate that a chemical space is relevant is to build predictive models for the phenomena of interest using only the dimensions of the chemical space as descriptors. Alternatively you might observe meaningful differences between the distributions in the chemical space for compounds that have respectively passed and failed at a particular stage in clinical development.

So that’s all that I’ll be saying about Rof5 and it’s time to take a look at the contributions of the second proposed Nobel Laureate:

The development of LipE by Paul D. Leeson in 2007 introduced a new principle for how physicochemical properties impact potency, selectivity, and safety. [As a reviewer of the manuscript I would have pressed the authors to explicitly state the new principle that their second nominee for the Nobel Prize for Physiology or Medicine had introduced in 2007.]

I'll start by saying that LipE is a simple mathematical formula and I suggest that one shouldn't be confusing simple mathematical formulae with principles when nominating people for Nobel Prizes. There are, however, other errors and these are not the kind of errors that you can afford to make when nominating people for Nobel Prizes. First, the term used in 29 is actually “ligand-lipophilicity efficiency” (LLE) although this appears to have mutated to “lipophilic ligand efficiency” (also LLE) by 2014 (see H2014). The term “LipE” was actually introduced by Pfizer scientists (see R2009) and it is significant that the more recent J2018 article defines LipE in terms of logD rather than logP (doing so means that you can make compounds more efficient simply by increasing extent of ionization and, as a drug design tactic, this is likely to end about as well as things did for the Sixth Army at Stalingrad).

The second (and more serious from the perspective of a Nobel nomination) error is that the metric had already been discussed, although not named, in the literature when 29 was published (I’m guessing that a suggestion that naming a metric merits a Nobel Prize for Physiology or Medicine might cause some members of the Nobel Prize committee to choke on their surströmming).  The L2006 book chapter, published fifteen months before 29, states:

Thus, to achieve compounds with a not too high log P while still retaining potency, the difference between the log potency and the log D can be utilised.

From the A2007 perspective, which was published three months before 29:

Lipophilicity is thought to be a driving force for binding to anti-targets such as the hERG ion channel and cytochrome p450 enzymes and potency can be scaled by lipophilicity by subtracting measured or calculated 1-octanol water partition coefficients from pIC50.

It might be helpful to say something about efficiency metrics since LipE (or LLE if you prefer) is an example of an efficiency metric. The idea behind efficiency metrics is to “normalize” a compound’s activity (typically quantified by potency or affinity) by the value of a risk factor such as lipophilicity or molecular size (for the masochists among you there’s an entire section in 34 on normalization of binding affinity). Ligand efficiency (LE) was introduced in 2004 (see H2004) and is generally regarded as the original efficiency metric although its creators do acknowledge the influence of the K1999 study. I’ve argued at length in 34 (Table 1 and Figure 1 in the article capture the essence of the argument) that LE is physically meaningless because perception of efficiency changes if you use a different concentration to define the standard state (by convention ΔGbinding values correspond to an arbitrary 1 M standard concentration) and there is no way to objectively select any particular value of the standard concentration for calculation of LE. The problem doesn’t go away if you try to define ligand efficiency in terms of logarithmically expressed values of IC50, Ki or Kd instead of ΔGbinding because these quantities still have to be divided by an arbitrary concentration value in order to be expressed as logarithms (see M2011). My view is that LE shouldn't even be described as a metric and I sometimes appropriate a quote ("it's not even wrong") that is usually attributed to Pauli because those who advocate the use of LE in drug design are unable (or unwilling) to say what it measures.
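The standard-state argument can be illustrated numerically. The sketch below uses two entirely hypothetical compounds (the Kd values and heavy atom counts are invented for illustration) and shows that the LE ranking reverses when the arbitrary standard concentration is changed from the conventional 1 M to an equally arbitrary 1 μM:

```python
import math

R = 8.314   # gas constant, J/(mol*K)
T = 298.0   # temperature, K

def ligand_efficiency(kd_molar, heavy_atoms, std_conc_molar):
    """LE = -dGbinding / heavy atom count, with dG = RT*ln(Kd/c_std).

    Units: kJ/mol per heavy atom. The choice of std_conc_molar is the
    arbitrary standard-state concentration that the argument turns on.
    """
    dg = R * T * math.log(kd_molar / std_conc_molar) / 1000.0  # kJ/mol
    return -dg / heavy_atoms

# Two hypothetical compounds: A is larger and more potent than B.
A = dict(kd=1e-9, ha=30)   # 1 nM, 30 heavy atoms
B = dict(kd=1e-6, ha=10)   # 1 uM, 10 heavy atoms

for c_std in (1.0, 1e-6):
    le_a = ligand_efficiency(A["kd"], A["ha"], c_std)
    le_b = ligand_efficiency(B["kd"], B["ha"], c_std)
    winner = "A" if le_a > le_b else "B"
    print(f"c_std = {c_std:g} M: LE(A) = {le_a:.2f}, "
          f"LE(B) = {le_b:.2f} -> {winner} looks more 'efficient'")
```

At a 1 M standard state the smaller compound B appears more efficient; at a 1 μM standard state the ranking flips to A, which is the point made in 34: a quantity that re-ranks compounds when an arbitrary convention changes isn't measuring anything physical.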

The meaninglessness of LE stems from it being defined by scaling ΔGbinding by the design risk factor (molecular size). In contrast, LipE is defined by offsetting pIC50 by the risk factor (logP) and can be interpreted (see 34) as the energetic cost of moving the ligand from octanol to its target binding site (this interpretation is only valid when the ligand binds in its neutral form and is predominantly neutral in the aqueous phase).  When considering lipophilicity in property-based design it is important to be aware that octanol is an arbitrary choice of solvent for measurement of partition coefficients and that the logP (or logD) calculated for a compound may differ significantly depending on the algorithm used for the calculations. That said, the hydrogen bond donors/acceptors and ionizable groups tend to be relatively conserved within structural series which means that the details of exactly how lipophilicity is quantified are likely to be less critical in lead optimization than for structurally-diverse sets of compounds.
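Because LipE offsets rather than scales, the arithmetic is trivial. Here is a minimal sketch using invented values for a hypothetical matched pair; note how a potency gain can still correspond to a LipE loss when it is bought with lipophilicity:

```python
def lipe(pic50, logp):
    """LipE (aka LLE): potency offset by lipophilicity, LipE = pIC50 - logP."""
    return pic50 - logp

# Hypothetical matched pair: the analogue gains one log unit of potency
# but two log units of lipophilicity, so its LipE actually falls.
parent = dict(pic50=7.0, logp=2.5)
analogue = dict(pic50=8.0, logp=4.5)
print(lipe(**parent))    # 4.5
print(lipe(**analogue))  # 3.5
```

This is the whole "metric": a subtraction. Whether the subtraction is informative depends entirely on how predictive logP (or logD) is for the liabilities of the series in question.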

When we use LipE we’re actually assuming that logP (or logD) is predictive of properties such as aqueous solubility, affinity for anti-targets and metabolic lability. That is why it’s not accurate to state that the introduction of LipE showed how “physicochemical properties impact potency, selectivity, and safety”. In some published studies the focus is less on the LipE metric and more on what might be called the "lipophilic efficiency concept" (aim for the top left corner of a plot of potency against lipophilicity). It is common to add reference lines of constant LipE to plots of potency against lipophilicity in this type of analysis and if you're doing this you really should be citing R2009 rather than 29.

I'll finish the commentary on LipE (or LLE if you prefer) with this statement made in the Editorial:

Emerging from an analysis of approved drugs, this rubric predicts a compound is more likely to be clinically developable when LipE > 5. [I don’t know what the authors of the Editorial mean by “rubric” (I'm not even sure that they do) but as a reviewer of the manuscript I would have pressed them to justify their claim. Specifically I would have been looking for a literature reference (for me, the choice of the word “emerging” does rather conjure up an image of hot gases and stoned priestesses at Delphi) and a coherent explanation for why a value of 5 yields a better rubric than values of 4 or 6.]

That’s all that I’ll be saying about LipE (or LLE if you prefer) and it’s time to take a look at the contributions of the third nominee for the Nobel Prize in Physiology or Medicine:

Finally, the development of Fsp3 by Frank Lovering in 2009 introduced a new principle for how molecular shape affects pharmaceutical properties and developability. [As a reviewer of the manuscript I would have pressed the authors to explicitly state the new principle that their third nominee for the Nobel Prize for Physiology or Medicine had introduced in 2009. My view is that Fsp3 is a thoroughly unconvincing descriptor of molecular shape and I invite readers to consider that cyclohexane (Fsp3 = 1) would arguably have a better shape match with benzene (Fsp3 = 0) than with either methane (Fsp3 = 1) or adamantane (Fsp3 = 1).]

[04-Aug-2024 update: The Fsp3 descriptor had actually been used as i_ali in the YG2003 study (Prediction of Aqueous Solubility of Organic Compounds by Topological Descriptors) six years before the publication of 35:

The aliphatic indicator of a molecule (i_ali) is equal to the number of sp3 carbons divided by the total number of carbon atoms in the molecule.

The YG2003 study discussed prediction of aqueous solubility using i_ali (renamed as Fsp3 in 35) in conjunction with other topological descriptors. In contrast with the claims made in 35 for Fsp3, the YG2003 study made no suggestion that i_ali was a highly effective predictor of aqueous solubility when used by itself.]
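The i_ali/Fsp3 definition quoted above is trivial to compute. In the sketch below the carbon counts are entered by hand for a few simple molecules (in practice one would use a cheminformatics toolkit such as RDKit to count them); benzene and cyclohexane give the extreme values that feature in my earlier comment on shape matching:

```python
def fsp3(sp3_carbons, total_carbons):
    """Fsp3 (identical to i_ali in YG2003): sp3 carbons / all carbons."""
    return sp3_carbons / total_carbons

# (sp3 carbon count, total carbon count), tallied by hand.
examples = {
    "benzene":     (0, 6),    # fully aromatic
    "cyclohexane": (6, 6),    # fully saturated
    "toluene":     (1, 7),    # one sp3 (methyl) carbon
    "adamantane":  (10, 10),  # fully saturated cage
}
for name, (n_sp3, n_c) in examples.items():
    print(f"{name}: Fsp3 = {fsp3(n_sp3, n_c):.2f}")
```

The descriptor is blind to everything except carbon hybridization, which is why it cannot distinguish the very different shapes of methane, cyclohexane and adamantane (all Fsp3 = 1).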

Before discussing the contributions of the third nominee for the Nobel Prize for Physiology or Medicine I should stress that I certainly consider gratuitous use of aromatic rings to be a very bad thing in drug design (it was the data analysis in 35 that was criticized in KM2013 but not the eminently sensible suggestion that drug designers should look beyond what the authors referred to as ‘Flatland’). Having sp3 carbon atoms in a scaffold provides drug designers with a wider range of options for placement of substituents than would be the case for a fully aromatic scaffold and we stated in KM2013 that:   

One limitation of aromatic rings as components of drug molecules is that some regions above and below the plane defined by the atomic nuclear positions are not directly accessible to substituents. Molecular recognition considerations suggest a focus on achieving axial substitution in saturated rings with minimal steric footprint, for example by exploiting the anomeric effect or by substituting N-acylated cyclic amines at C2. 

My view is that deleterious effects of aromatic rings on aqueous solubility would be more plausibly explained by molecular interactions stabilizing the solid state than in terms of molecular shape (this point is discussed in more detail in HBD3). I also see saturated ring systems such as bicyclo[1.1.1]pentane and cubane as potentially more resistant to metabolism than benzene. 

There’s one point that I need to make before discussing 35 from the data analysis perspective which is that molecular structures with basic nitrogen atoms tend to have higher Fsp3 values than molecular structures that lack basic nitrogen atoms (see L2013). This means that you can’t tell whether the benefits of higher Fsp3 values are actually caused by the higher Fsp3 values or by the presence of basic nitrogen atoms.

The Editorial states:

Stemming from an analysis of discovery compounds, investigational drugs, and approved drugs, Fsp3 predicts a discovery compound is more likely to become a drug when Fsp3 > 0.40. 

It’s not clear (at least to me) where the figure of 0.40 comes from and I would argue that compound X (IC50 against therapeutic target = 50 μM; Fsp3 = 0.80) would actually be less likely to become a drug than compound Y (IC50 against therapeutic target = 10 nM; Fsp3 = 0.20). I’m assuming that what the Editorial refers to as “analysis of discovery compounds, investigational drugs, and approved drugs” is what is shown by Figure 3 in 35. Presenting data in this manner hides the variation in Fsp3 for the compounds at each stage of development and makes the trends look much stronger than they actually are (this is verboten according to current J Med Chem author guidelines). I would challenge the suggestion that what is shown in Figure 3 in 35 can be used to calculate the probability that an arbitrary compound will become a drug (my view is that it’s not feasible to even define the probability that a compound will become a drug in a meaningful manner). Analyses of success in clinical development are generally more convincing when comparisons are made between compounds that pass or fail in individual phases of clinical development than between compounds in different phases of clinical development.

The Editorial continues:        

This observation was ascribed to increased Fsp3 leading to increased aqueous solubility, a critical physiochemical property for successful drug discovery.

I’m assuming that what the Editorial refers to as “increased Fsp3 leading to increased aqueous solubility” is the trend shown by Figure 5 of 35 (this featured prominently in the KM2013 correlation inflation article) which claims to show the relationship between Fsp3 and log S (aqueous solubility expressed as a logarithm). This claim is not accurate because the log S values have been binned and the relationship is actually between the centre point of each bin and the mean log S value for the bin. The authors of 35 used public domain aqueous solubility data for their analysis and we showed (KM2013; see Figure 5) that the Pearson correlation coefficient for the relationship between log S and Fsp3 is only 0.25 (the corresponding value for the binned data is 0.97). I consider the suggestion that such a weak correlation could have any relevance whatsoever to the likelihood of success in clinical trials to be wild and uninformed conjecture.
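The correlation inflation effect is easy to reproduce with synthetic data (the numbers below are simulated, not the data from 35, where the real figures were 0.25 raw and 0.97 binned). A genuinely weak trend becomes apparently near-perfect once one correlates bin centres with bin means:

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient, computed from first principles."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

random.seed(42)
# Synthetic data with a genuine but weak underlying trend (slope 0.5,
# swamped by unit-variance noise).
x = [random.uniform(0.0, 1.0) for _ in range(2000)]
y = [0.5 * xi + random.gauss(0.0, 1.0) for xi in x]
print(f"raw r = {pearson(x, y):.2f}")

# Bin on x, then correlate bin centres with the mean of y in each bin.
n_bins = 5
centres, means = [], []
for b in range(n_bins):
    lo, hi = b / n_bins, (b + 1) / n_bins
    ys_in_bin = [yi for xi, yi in zip(x, y) if lo <= xi < hi]
    centres.append((lo + hi) / 2)
    means.append(sum(ys_in_bin) / len(ys_in_bin))
print(f"binned r = {pearson(centres, means):.2f}")
```

Averaging within bins cancels the noise while preserving the trend, so the bin-level correlation says nothing about how well the descriptor predicts the property for an individual compound, which is the only thing a drug designer cares about.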

I'll finish my commentary on Fsp3 by reproducing this claim made in the Editorial:

Much like the Rof5 and LipE, Fsp3 has proven to be enduringly useful for the design of compounds with improved chances of clinical success. (37) [My view is there is insufficient evidence to justify this claim and I'm perplexed by the citation of 37. In any case, members of the Nobel committee are likely to focus more on whether or not Fsp3 is usefully predictive than on the endurance of this molecular descriptor.]  

It’s now time to summarise what has been a long and at times pedantic blog post, and I thank all readers who’ve stayed with me. I don’t consider any of the three studies (22 | 29 | 35) that form the basis of the Nobel Prize nomination to have reported significant scientific discoveries and I would also challenge the claim made in the Editorial that these studies introduced new principles. I’m aware that 22 is heavily cited and I certainly agree that it is common to see values of LipE and Fsp3 quoted in the drug discovery literature. Nevertheless, I would argue that the Editorial failed to provide even a single convincing example of the Rof5, LipE or Fsp3 making a critical contribution to the discovery of a marketed drug (this should be quite sufficient to rule out the award of a share in the Nobel Prize for Physiology or Medicine to any of these nominees). Furthermore, the Editorial doesn’t provide any convincing evidence that the Rof5, LipE or Fsp3 are usefully predictive in drug discovery projects.

Aside from the failure of the Editorial to demonstrate significant impact for the Rof5, LipE and Fsp3, I do have some scientific concerns about this Nobel Prize nomination. First, the Rof5 is not actually supported by data. Second, LipE had already been discussed, although not named, in the drug discovery literature when 29 was published. Third, Fsp3 had been used previously (as i_ali) for aqueous solubility prediction and the data analysis in 35 would fail to comply with current J Med Chem author guidelines.

Monday, 20 May 2024

A time and place for Nature in drug discovery?

I’ll be reviewing Y2022 (The Time and Place for Nature in Drug Discovery) in this post and stating my position on natural products in modern drug discovery is a good place to start. I certainly see value in screening natural products and natural product-like compounds (especially in phenotypic assays) and there is currently a great deal of interest in chemical probes (I’ll point you toward an article on the Target 2035 initiative and a link to the Chemical Probes Portal). In general, a natural product or natural product-like active identified by screening would either need to exhibit novel phenotypic effects or be significantly more potent than other known actives for me to be enthusiastic about following it up. I would certainly consider screening fragments that are only present in natural product structures although these would still need to comply with the criteria (typically defined in terms of properties such as molecular size, molecular complexity and lipophilicity) used to select fragments. I see significant benefits coming from the increased use of biocatalysis, both in drug discovery and for manufacturing drugs, but I don’t see these benefits as being restricted to synthesis of natural products or natural product-like compounds.

This will be a very long post (for which I make no apology) and it's a good point to say something about how the review is presented. I've used the section headings (in bold text) from Y2022 for my commentary and quoted text has been indented (my comments on the quoted text are enclosed in square brackets and italicized in red). I'd like to raise four general points before starting my review:

  1. Proprietary data cannot accurately be described as “facts” or “evidence” and it’s not valid to claim that you’ve proven or demonstrated something on the basis of analysis of proprietary data.  
  2. If continuous data such as oral bioavailability measurements have been made categorical (e.g., high | medium | low) prior to analysis then it’s generally a safe assumption that any trends "revealed" by the analysis are weak.  
  3. If basing claims on analysis of locations or distributions within a particular chemical space it is necessary to demonstrate the chemical space is actually relevant to the claims being made. One way to do this is to build usefully predictive models of relevant quantities such as aqueous solubility or permeability using only the dimensions of the chemical space as descriptors.  
  4. There are generally many ways to partition a region of chemical space into subregions with different average values for a measured quantity. Although the  boundaries resulting from these analyses typically appear to be well-defined (for example, as a line or curve in a 2-dimensional chemical space) it is a serious error to automatically interpret such boundaries as meaningful from a physicochemical perspective.    

I have a number of concerns about the Y2022 article and I’ll focus on the more serious of these in this post. I’ll also be commenting on the Rule of 5 (Ro5; see L1997), logP/logD differences, and the drug discovery “sweet spot” reported in the HK2012 article.  My view is that a number of the assertions and recommendations made by the authors of Y2022 are not supported by the analyses or the data that they’ve presented. Specifically, the authors present results of analyses that had been performed using proprietary and undocumented models and, in my view, they have grossly over-interpreted the predictions made using the models.  At times, the authors appear to be treating natural products as if these occupy a distinct and contiguous region of chemical space (this is a pitfall into which drug-likeness advocates also frequently stumble).  The authors of Y2022 discuss physicochemical properties at considerable length without making any convincing connection between this discussion and natural products. Reading the Y2022 article, I did detect a subliminal message that natural products might be infused with vital force and wouldn’t have been surprised to see Gwyneth Paltrow as a co-author.

I’ll make some general observations before examining Y2022 in detail. If you’re going to base decisions on trends in data then you need to know how strong the trends are because this tells you how much weight to give to the trends when making your decisions. In what I’ll call the ‘compound quality’ field you’ll often encounter data presentations that make it extremely difficult to see how strong (or weak) the trends in the data actually are (see KM2013: Inflation of correlation in the pursuit of drug-likeness). Since Ro5 was introduced in 1997 (see L1997) there has been a free flow of advice from self-appointed compound quality gurus as to how compounds can be made better, more developable and more beautiful (introduction of the term “Ro5 envy” in KM2013 appeared to cause some to spit feathers). This advice frequently comes in the form of dire warnings that exceeding a threshold value of a property, such as molecular weight or predicted octanol/water partition coefficient, will increase the probability of something bad happening. It’s actually very difficult to set thresholds like these objectively and you have to consider the possibility that some of these statements of probability are merely expressions of belief (to some “there is a high probability that God exists” will sound rather more convincing than “I believe in God”).

The graphical abstract is a good place to start my review of Y2022. I don’t know whether biotransformations exist that would convert the Core Scaffold into compounds that would match the Bios Collection generalized structure but a 1,3-diene in conjugation with a tertiary nitrogen is not the sort of substructure that I would want to see in a screening active that I had been charged with optimizing.  

Abstract

The authors of Y2022 state:  

The declining natural product-likeness of licensed drugs and the consequent physicochemical implications of this trend in the context of current practices are noted. [The authors do not make a convincing connection between natural product-likeness and physicochemical properties.]  To arrest these trends, the logic of seeking new bioactive agents with enhanced natural mimicry is considered; notably that molecules constructed by proteins (enzymes) are more likely to interact with other proteins (e.g., targets and transporters), a notion validated by natural products. [I consider this claim to be extravagant and it does need to be supported by evidence. The authors’ use of “validated” reminded me of the extravagant claim made in a Future Medicinal Chemistry editorial that “ligand efficiency validated fragment-based design”. Taking the statement literally, the authors appear to be suggesting that a compound would be more likely to interact with proteins if it had been isolated from natural sources than if it had been synthesized in a laboratory (I was reminded of the "water memory" explanation for why homeopathy works). If “molecules constructed by proteins” really are more likely to interact with other proteins then they’re also more likely to interact with anti-targets like hERG and CYPs. I’m guessing that the response of medicinal chemistry teams tackling CNS targets to suggestions that they should make their compounds more like natural products so as to increase the likelihood of recognition by transporters might be to ask which natural products those offering the advice had been smoking.]

Introduction

The authors show time-dependence for the values of a number of parameters calculated for drugs in Figure 1. I see analyses like these as exercises in philately and, when I first encountered examples about two decades ago, I formed a view that some senior medicinal chemists had a bit too much time on their hands. The observation of significant time-dependence for a parameter calculated for drugs can mean one of three things. First, the parameter is irrelevant to drug discovery (however, the absence of a time-dependence shouldn't be taken as evidence that the parameter is relevant to drug discovery). Second, the old ways were best and the medicinal chemists of today have lost their way (I’m guessing this might be Jacob Rees-Mogg’s interpretation if he were a medicinal chemist). Third, the old ways no longer work so well and the medicinal chemists of today have learned new ways.

I have a number of concerns about what is shown in Figure 1 (quite aside from these concerns I would question why Figures 1b and 1c were even included in the study). The data values that have been plotted are actually mean values and, as we observed in KM2013, presenting mean (or median) values without a measure of the spread in the data, such as the standard deviation or inter-quartile range, makes trends look stronger than they actually are (others use the term “voodoo correlations”). This way of presenting data is specifically verboten by J Med Chem: the Author Guidelines for that journal (viewed 18-May-2024) state:

If average values are reported from computational analysis, their variance must be documented. This can be accomplished by providing the number of times calculations have been repeated, mean values, and standard deviations (or standard errors). Alternatively, median values and percentile ranges can be provided. Data might also be summarized in scatter plots or box plots.

However, the hidden variation in the response variables is not the only issue that I have with Figure 1. Let’s take a look at Figure 1a which shows “a temporal comparison of natural product likeness of approved drugs assessed by the Natural Product Scout algorithm (12) versus the year of the first disclosure of the drug” although the caption for Figure 1a is “Natural product class probability. (8)”. I think that the authors do need to explain exactly what they mean by natural product class probability because the true probability that a compound is a natural product is either 1 (it’s a natural product) or 0 (it’s not a natural product). Put another way, there are differences between natural products and Prof. Schrödinger’s unfortunate feline companion. The measure of lipophilicity shown in Figure 1c is XLogP3, although no justification is given for the selection of this particular method for lipophilicity prediction nor is any reference provided.

Before continuing with my review of Y2022 I also need to examine Ro5 and discuss the difference between logP and logD (the reasons for these digressions will hopefully become clear later). Ro5 was based on physicochemical property distributions for compounds that had been taken into phase 2 of clinical development before 1997 (the year that L1997 was published). My view is that Ro5 certainly raised awareness of the problems associated with excessive lipophilicity and molecular size (A Good Thing) but I’ve never considered Ro5 to be useful in design. Although Ro5 is accepted by many (most?) drug discovery scientists as an article of faith, some are prepared to ask awkward questions and I’ll mention the S2019 study. Let’s take a look at how Ro5 was specified in the L1997 article (the graphic is slide #17 from a presentation that I gave late last year):


Ro5 is stated in terms of likelihood of poor absorption or permeation although no measured oral absorption or permeability data are given in the L1997 study and Ro5 should therefore be regarded as a statement of belief. I realise that to make such an assertion runs the risk of an appointment with the auto-da-fé and I stress that had Ro5 been stated in terms of physicochemical and molecular property distributions I would not have made the assertion.

Medieval cartographers annotated the unknown regions of their maps with “here be dragons” and Ro5’s dragons are poor absorption and poor permeation. However, there's another issue which I touched on in HBD3:

It is significant that attempts to build global models for permeability and solubility, using only the dimensions of the chemical space in which the Ro5 is specified as descriptors, do not appear to have been successful.

What I was getting at in HBD3 is that the chemical space in which Ro5 is specified was not demonstrated to be relevant to permeability or solubility (this relates to the third of the four points that I raised at the start of the post). It must be stressed that I'm definitely not denying that relationships exist between descriptors, such as logP, used to specify Ro5 and properties such as aqueous solubility and permeability that are more directly relevant to getting drugs to where they need to be. It’s just that these relationships are weak (see TY2020) and, while we don’t know exactly how weak the relationships are, we do know that they are weak because continuous data have been binned to display them (see also KM2013 and specifically the comments on HY2010). I would generally anticipate that these relationships will be stronger within structural series but in these cases you’ll generally observe different relationships for different structural series. In practical terms this means that a logP of 5 might be manageable in one structural series while in another structural series compounds with logP greater than 3.5 prove to be inadequately soluble. As I advised in NoLE:

Drug designers should not automatically assume that conclusions drawn from analysis of large, structurally-diverse data sets are necessarily relevant to the specific drug design projects on which they are working.

I also need to discuss the distinction between logP and logD since this is a source of confusion for medicinal chemists and compound quality 'experts' alike. Here’s a graphic (it’s slide #18) from the presentation that I did at SancaMedChem in 2019 (if the piranhas did venture into the non-polar phase they'd probably end up swimming backstroke):


The partition coefficient (P) is simply the ratio of the concentration of the neutral form of the compound in the organic phase (usually octanol) to the concentration of the neutral form of the compound in water when both phases are in equilibrium. The distribution coefficient (D) is defined analogously as the ratio of the sum of concentrations of all forms of the compound in the organic phase to the sum of concentrations of all forms of the compound in water. Values of P and D are usually quoted as their logarithms logP and logD. When interpreting logD values it is commonly assumed that only neutral forms of compounds partition into organic phases and, if we make this assumption, the relationship between logD and logP is given by Eqn 1 (see B2017):

When we perform experiments to quantify lipophilicity it is actually logD that is measured. Values of logP and logD are identical when ionization can be neglected and logP values for ionizable compounds can be obtained by examination of measured logD-pH profiles although this is rarely done. It’s usually a safe assumption that logP values used by drug discovery scientists (and quoted in medicinal chemistry publications) have been predicted and these values vary with the method used for prediction of logP. For example, L1997 states that the upper logP limit for Ro5 is 5 when logP is calculated using the ClogP method (see L1993) but 4.15 when logP is calculated using the method of Moriguchi et al. (see M1992). Values of logD that you encounter in the literature may have been calculated or measured (you might need to dig around to see if you’re dealing with real data) and it’s also important to remember that logD depends on pH. I would argue that logD is less appropriate than logP for defining compound quality metrics because excessive lipophilicity can be countered simply by increasing the extent to which compounds are ionized (I hope you can see why that would be A Bad Thing). Another way to think about this is to consider an amine with a pKa value of 8 bound to hERG at a pH of 7. Now suppose that you can change the pKa of the amine to 11 without changing anything else in the molecular structure. What effects would you expect this pKa change to have on affinity, on logD and on logP?
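For a monoprotic base the textbook relationship (assuming, as above, that only the neutral form partitions) is logD = logP − log10(1 + 10^(pKa − pH)). A minimal sketch, with purely illustrative numbers, shows why raising the pKa of an amine lowers logD while leaving logP untouched:

```python
import math


def logd_base(logp, pka, ph=7.4):
    """logD for a monoprotic base, assuming only the neutral form
    partitions into the organic phase."""
    return logp - math.log10(1 + 10 ** (pka - ph))


# Illustrative amine with logP = 3: raising the pKa from 8 to 11 leaves
# logP unchanged but drags logD down because more of the compound is ionized
print(logd_base(logp=3.0, pka=8.0))    # ≈ 2.30
print(logd_base(logp=3.0, pka=11.0))   # ≈ -0.60
```

Note that the arithmetic says nothing about affinity, which is the point of the closing question above.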

I’ll now get back to reviewing Y2022 and let’s take a look at Figure 2 which shows an adapted version of the “drug discovery sweet spot” proposed in the HK2012 study. As with Figures 1b and 1c, I would question why Figure 2 was included in the Y2022 study since the connection with natural products is tenuous. In my view the authors of the HK2012 study made a number of serious errors in their definition of the “sweet spot” and these errors have been reproduced in the Y2022 study. The authors of HK2012 claimed to have identified a “drug discovery sweet spot” in a chemical space defined by “Log P” and “Molecular mass” but they didn’t actually demonstrate that this chemical space is relevant to drug discovery (one way to demonstrate relevance is to build convincing global models for prediction of properties like permeability and aqueous solubility using only the dimensions of the chemical space as descriptors).

If claiming to have identified a drug discovery “sweet spot” it’s important that each dimension of the chemical space in which the “sweet spot” is defined corresponds to a single entity. While “Molecular mass” is unambiguous, the term “Log P” does not refer to the same entity for each of the data sets from which the “sweet spot” has been derived. As noted previously, ClogP (see L1993) was used to specify Ro5 while the Gleeson upper Log P limit (see G2008) and the “μM potency Log P” (see G2011) were specified respectively by values of clogP (calculated logP from ACD) and AlogP (no reference provided). In contrast the Pfizer Golden Triangle (see J2009) is specified using elogD (proprietary logD prediction method for which details were not provided). The Waring low and high logP/logD values stated in W2010 are at least partly based on analysis of AZlogD7.4 values (proprietary logD prediction method; details not provided) reported in the WJ2007 and W2009 studies. The W2010 study states that “the optimal range of lipophilicity lies between ~ 1 and 3” but these are not the values that are depicted in Figure 2 (or indeed in the original HK2012 study). The Gleeson upper limits for Log P and Molecular Mass stated in G2008 reflect the arbitrary schemes used to bin the data and should not be regarded as objectively-determined limits for these quantities. The authors of Y2022 have superimposed ellipses for "SHMs", "Antibiotic Space?" and "bRo5 / AbbVie MPS space for higher MW" on the HK2012 "sweet spot" in the creation of Figure 2 although it is not clear how these ellipses were constructed.

The Physicochemical Characteristics of Drugs

The authors assert:

A principle advocated by Hansch that drug molecules should be made as hydrophilic as possible without loss of efficacy (47) is commonly expressed and utilized as Lipophilic Ligand Efficiency (LLE). (48) [If actually using this principle advocated by Hansch you would optimize leads by varying hydrophilicity and observing efficacy. While LLE is one way to express Hansch’s principle it is by no means the only way and (pIC50 – 0.5 × logP) would be equally acceptable as a lipophilic ligand efficiency metric from the perspective of the Hansch’s principle.] This metric, widely accepted and exploited in drug discovery as a key metric in optimization, is expressed on a log scale as activity (e.g., −log10[XC50]) [The logarithm function is not defined for dimensioned quantities such as XC50 (see M2011) and, while it may appear to be nitpicking to point it out, this is the source of the invalidity of the ligand efficiency metric as was discussed at length in NoLE.] minus a lipophilicity term (typically the Partition coefficient or log10 P or sometimes log D7.4). (49) [Although it is common to see LLE values quoted in the drug discovery literature it’s much less clear how (or even whether) the metric was actually used to make project decisions. In many studies, however, the focus is on plots of pIC50 against logP (or logD) rather than values of the metric itself. In lead optimization, medicinal chemists typically need to balance activity against properties such as permeability, aqueous solubility, metabolic stability and off-target activity. In these situations, experienced medicinal chemists typically give much more weight to structure-activity relationships (SARs) and structure-property relationships (SPRs) that they've observed within the structural series that they're optimising than to crude metrics of questionable relevance and predictivity.
It is noteworthy that the authors of ref 49 use logD rather than logP to define LLE (which they call LiPE) and if you do this then you can make compounds more efficient simply by increasing the extent to which they are ionized.] The impact of lipophilicity on efficacy needs to be considered in the context that reducing lipophilicity (equating to increasing hydrophilicity) will generally increase the solubility, reduce the metabolism, and reduce the promiscuity of a given compound in a series. (50) [The relationships between these properties and lipophilicity shown in ref 50 are for structurally diverse data sets rather than for individual series. I consider the activity criterion (pIC50 > 5) used to quantify promiscuity in ref 50 to be at least an order of magnitude too permissive to be pharmaceutically relevant.]

Let’s take a look at Figure 3 in which values of “Calc Chrom Log D7.4” are plotted against “CMR”. This is what the authors say about Figure 3 in the text of Y2022:

The distribution of marketed oral drugs in terms of their lipophilicity and size, shows a remarkably similar distribution to the set of compounds designed by Kell as a representative set of natural products to investigate carrier mechanisms (Figure 3). (64) [To state “shows a remarkably similar distribution” is arm-waving given that there are methods for assessing the similarity of two distributions in an objective manner.]
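One such objective method is the two-sample Kolmogorov-Smirnov statistic (the maximum vertical separation between two empirical cumulative distribution functions). A minimal pure-Python sketch, with invented property values purely for illustration:

```python
import bisect


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest vertical
    distance between the two empirical CDFs (0 = identical, 1 = disjoint)."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for x in a + b:
        fa = bisect.bisect_right(a, x) / len(a)
        fb = bisect.bisect_right(b, x) / len(b)
        d = max(d, abs(fa - fb))
    return d


# Hypothetical lipophilicity values for two compound sets (numbers invented)
oral_drugs = [1.2, 1.9, 2.5, 2.8, 3.1, 3.4]
natural_products = [0.8, 1.5, 2.1, 2.6, 2.9, 3.0]
print(ks_statistic(oral_drugs, natural_products))
```

A permutation test on this statistic gives a p-value without distributional assumptions; scipy.stats.ks_2samp does both steps if SciPy is available. Either route replaces "remarkably similar" with a number.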

As is the case for Figure 1a, what is written in the text about Figure 3 differs significantly from the caption for this figure:

Figure 3. Natural products are found across most size lipophilicity combinations, as exemplified in a representative set designed and compiled by O’Hagan and Kell (64) superimposed on the Chrom log D7.4 vs cmr training set of compounds with >30% bioavailability. (51) [It is unclear why this training set was restricted to compounds with >30% bioavailability.  The LDF is shown in this figure with “Limits of confidence” but the level of confidence to which these limits correspond is not given.]

The first criticism that I’ll make is that the authors of Y2022 have not actually demonstrated the relevance of chemical space specified by the axes of Figure 3 (this is the essence of the third of the four points that I raised at the start of the post and the same criticism can be made of Figure 4 and Figure 5). The authors note, with some arm-waving, that cmr “largely correlates with MW” which does rather beg the question of why they consider this particular measure of molecular size to be superior to MW for this type of analysis. The authors claim that “the GSK model based on log D7.4 vs calculated molar refraction” (it is actually molar refractivity as opposed to molar refraction that was calculated) is a useful guide to predict oral exposure. I consider this claim to be extravagant because one would need to have access to the proprietary model for calculation of Chrom Log D7.4 in order to use the model. The proprietary nature of the GSK model means that predictions made using this model cannot credibly be presented as “evidence”.

Details of the models for calculating Chrom Log D7.4 and for prediction of oral exposure are sketchy and I regard each of these proprietary models as undocumented. A linear discriminant function (LDF) model was reportedly used for prediction of oral exposure but it is unclear how the model was trained (or if it was even validated). An LDF is a classification model and it is not clear how the classes were defined for prediction of oral exposure. I’m assuming that the oral absorption classes used in the GSK oral exposure model have been defined by categorization of continuous data (I’m happy to be corrected on this point but, given the sketchiness of details, I can be forgiven for speculation) and setting thresholds like these is difficult to achieve in an objective manner. If this was indeed the case I'd assume that the threshold value used to categorize the continuous data was arbitrary (you’ll get a different LDF model if you use a different threshold to define the classes). My view is that an LDF is an inappropriate way to model this type of data because the categorization of the data discards a huge amount of information.

Here's the caption for Figure 4:

Figure 4. Proposed regions of size/lipophilicity space for an oral drug set, (51) using the effectual combination of Chrom Log D7.4 vs calculated molar refraction (cmr) as a description of chemical space. [It’s actually molar refractivity as opposed to molar refraction that was calculated. It is unclear what the authors mean by "bRo5 principles".] The highlighted regions suggest likely absorption mechanisms, based on ref (65) with compounds colored by binned NPScout probability scores. [The authors of Y2022 appear to be using a proprietary and undocumented LDF model of unknown predictivity to infer absorption mechanisms (this is what I was getting at in the fourth of the four points that I raised at the start of the post). The depiction of data shown in Figure 4 would be much more informative had compounds known (as opposed to believed) to be orally absorbed by one of these mechanisms been plotted in this chemical space.] Below the LDF line, the mean NPScout score is 0.45, (median 0.33) and above it (indicative of likely oral exposure) the mean is 0.31 and median 0.17 (p < 0.01) [It is unclear what (p < 0.01) refers to.]

Here's the caption for Figure 5: 

Figure 5. Illustration of antibiotic drug space, expressed as Calculated Chrom Log D7.4 vs cmr adapted from data in ref (65) colored by antibiotics (circles) and TB drugs (diamonds) which are sized by NP class probabilities and colored by prediction of likelihood of oral exposure (either side of the diagonal “linear discriminant function line” so to be oral, transporters a likely mechanism for the red colored compounds, which mostly have a high NPScout score). [As is the case for Figure 4, the authors of Y2022 appear to be using a proprietary and undocumented LDF model of unknown predictivity to infer absorption mechanisms. Stating that "mostly have a high NPScout score" is arm-waving.]  Vertical (cmr < 8) and horizontal lines (Chrom Log D7.4 < 2.5) together represent likely boundaries for paracellular absorption. [The basis (measured data or belief) for this assertion is unclear. The depiction of data shown in Figure 5 would have been more convincing had compounds known to be and known not to be absorbed by the paracellular route been plotted in this chemical space. While the problems of achieving good oral absorption for antibiotics should not be underestimated, I see getting compounds into cells as the bigger issue and in some cases the transporters cause active efflux (see R2021). The depiction of data shown in Figure 5 would have been much more informative had compounds known (as opposed to believed) to exhibit active influx and active efflux been plotted in this chemical space. Although Figure 5 is presented as a description of antibiotic drug space, the study (ref 65) on which Figure 5 is based is actually focused on antitubercular drug space (one of the challenges to discovery of antitubercular drugs is that Mycobacterium tuberculosis is an intracellular pathogen; see WL2012). One article that I recommend to all drug discovery scientists, especially those working on infectious diseases, is the SM2019 review on intracellular drug concentration.]

The authors suggest:

A logical extension of this hypothesis would be to consider recognition processes with natural molecules, which are likely to have discrete interactions with carrier proteins and therapeutic targets. [The authors do need to articulate what they mean by "discrete interactions" and why "natural molecules" are likely to have "discrete interactions" with carrier proteins and therapeutic targets.] Small molecule drugs are noted to be relatively promiscuous, so making interactions with several proteins is a likely event. (76) [This assertion is not supported by ref 76 which is actually a study of nuisance compounds, PAINS filters, and dark chemical matter in a proprietary compound collection. Promiscuity of a compound is typically defined by a count of the number of targets against which activity exceeds a specific threshold and promiscuity generally increases with the permissiveness of the activity threshold (it’s therefore meaningless to describe a compound as “promiscuous” without also stating the activity threshold). The activity threshold for the analysis reported in ref 76 is ≥ 50% inhibition at a concentration of 10 µM which is appropriate if you’re worried about assay interference but, in my view, is at least an order of magnitude too permissive if considering the possibility of off-target activity for a drug in vivo.]  It similarly is logical to consider that a molecule made by a recognition process in a catalytic enzyme may also interact with another protein in a similar manner. (77) [This is not quite as logical as the authors would have us believe since enzymes catalyze reactions by stabilizing transition states. A high binding affinity of an enzyme for its reaction product would generally be expected to result in inhibition of the enzyme by the reaction product.]

Natural Product Fragments in Fragment-Based Drug Discovery

The authors note:

Fragment-based drug discovery (FBDD) can be employed to rapidly explore large areas of chemical space for starting points of molecular design. (91 | 92 | 93) However, most FBDD libraries are composed of privileged substructures of known synthetic drugs and drug candidates and populate already well-explored areas of chemical space, (94 | 95 | 96[I do not consider refs 94-96 to support this assertion (none of these three articles has a fragment screening library design focus and the most recent one was published in 2007).] often through the use of fragments with high sp2-character. (97)  Underexplored areas of chemical space can be rapidly explored by employing fragments derived from NPs that are already biologically prevalidated by evolution. [The authors appear to be suggesting that the physiological effects of natural products are more due to the fragments from which they have been constructed than of the way in which the fragments have been combined.] 

Molecular recognition

The authors state:

That the embedded recognition of natural products for proteins correlates with recognition of the biosynthetic enzyme is an increasingly validated concept. (118 | 119 | 120) [I have no idea what “embedded recognition” means and I’m guessing that the authors might be in a similar position.] The biosynthetic imprint translates to recognition of other proteins using similar interactions. [As I’ve already noted, high binding affinity of a natural product for the enzyme that catalysed its formation would lead to inhibition of the enzyme.] For example, the analysis of protein structures of 38 biosynthetic enzymes gave 64 potential targets for 25 natural products. (121) [Concepts are usually validated with measured data and not by making predictions.]

Conclusions and Prospects for Future Development

The authors assert:

More natural molecules will increase quality through their inherently improved permeability and solubility; [At the risk of appearing pedantic, permeability and solubility are properties of compounds as opposed to molecules. That said, the authors appear to be treating “natural molecules” as occupying a distinct and contiguous region of chemical space by making this claim and it is unclear what the improvements will be relative to. The authors do not present any measured data for permeability or solubility to support their claim.] this is a case of investing time and effort in the early stages of drug discovery to reap rewards with improvements in the later stages through more predictability in trials (and thus a greater chance of success, where quality rather than speed demonstrably impacts (170)) [Many, including me, do indeed believe that investing time and effort in the early stages of drug discovery increases the chances of success in the later stages. However, I would challenge the assertion by the authors of Y2022 that ref 170 actually demonstrates this.] and more sustainable manufacturing methods driven by the transformative power of biocatalysis. (171)

So that concludes my review of Y2022 and thanks for staying with me. I'll leave you with a selfie here in Trinidad's Maraval Valley with my faithful canine companions BB and Coco providing much-needed leadership (a few minutes earlier I had patiently explained to them why ligand efficiency is complete bollocks).

Wednesday, 20 September 2017

To logP or logD, that is the question

So last week I asked Twitter which lipophilicity measure was more relevant to binding of bases to hERG. The poll resulted in a landslide for logD(pH=7.4) (70%; 21 votes) over logP (30%; 9 votes). I did not vote.

So let's take another look at the question and I've cooked up a thought experiment to help you do this. Let's suppose that we have an amine bound to hERG (which your Scottish colleagues may call hairrg). It has a pKa of 10.4 and logP of 6 and the IC50 in the hERG assay is 100 nM (the safety people think that this will lead to an unpleasant torsades de pointes that will hERG a whole lot more than a corrective thrashing by Wendi Whiplasch). Provided that there is no significant partitioning of the protonated form of the amine into the octanol, the logD(7.4) value for the amine will be 3.

Let's imagine that we can change the pKa of the amine while keeping all the other physicochemical and molecular properties the same. Changing the amine pKa from 10.4 to 12.4 will get logD(7.4) down to 1. But how do you think the hERG IC50 will respond?      
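The arithmetic behind the thought experiment can be checked with the usual relationship for a monoprotic base (assuming, as stated above, no partitioning of the protonated form into octanol): logD = logP − log10(1 + 10^(pKa − pH)).

```python
import math


def logd_base(logp, pka, ph=7.4):
    """logD for a monoprotic base, assuming no partitioning of the
    protonated form into the organic phase."""
    return logp - math.log10(1 + 10 ** (pka - ph))


print(logd_base(logp=6.0, pka=10.4))   # ≈ 3.0, as stated above
print(logd_base(logp=6.0, pka=12.4))   # ≈ 1.0 after the pKa shift
```

The two-unit drop in logD comes entirely from increased ionization; logP is untouched, which is what makes the question about the hERG IC50 interesting.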

Sunday, 22 May 2016

Sailor Malan's guide to fragment screening library design


Today I'll take a look at a JMC Perspective on design principles for fragment libraries that is intended to provide advice for academics. When selecting compounds to be assayed the general process typically consists of two steps. First, you identify regions of chemical space that you hope will be relevant and then you sample these regions. This applies whether you're designing a fragment library, performing a virtual screen or selecting analogs of active compounds with which to develop structure-activity relationships (SAR). Design of compound libraries for fragment screening has actually been discussed extensively in the literature and the following selection of articles, some of which are devoted to the topic, may be useful: Fejzo (1999), Baurin (2004), Mercier (2005), Schuffenhauer (2005), Albert (2007), Blomberg (2009), Chen (2009), Law (2009), Lau (2011), Schulz (2011), Morley (2013). This series of blog posts ( 1 | 2 | 3 | 4) on fragment screening library design may also be helpful.

The Perspective opens with the following quote:

"Rules are for the obedience of fools and the guidance of wise men"

Harry Day, Royal Air Force (1898-1977)


It wasn't exactly clear what the authors were getting at here since there appears to be no provision for wise women. Also it is not clear how the authors would view rules that required darker complexioned individuals to sit at the backs of buses (or that swarthy economists should not solve differential equations on planes). That said, the quote hands me a legitimate excuse to link Malan's Ten Rules for Air Fighting and I will demonstrate that the authors of this Perspective can learn much from the wise teachings of 'Sailor' Malan.

My first criticism of this Perspective is that the authors devote an inordinate amount of space to topics that are irrelevant from the viewpoint of selecting compounds for fragment screening. Whatever your views on the value of ligand efficiency metrics and thermodynamic signatures, these are things that you think about once you've got the screening results. The authors assert, "As a result, fragment hits form high-quality interactions with the target, usually a protein, despite being weak in potency" and some readers might consider the 'concept' of high-quality interactions to be pseudoscientific psychobabble on par with homeopathy, chemical-free food and the wrong type of snow. That said, discussion of some of these peripheral topics would have been more acceptable if the authors had articulated the library design problem clearly and discussed the most relevant literature early on. By straying from their stated objective, the authors have broken the second of Malan's rules ("Whilst shooting think of nothing else, brace the whole of your body: have both hands on the stick: concentrate on your ring sight").


The section on design principles for fragment libraries opens with a slightly gushing account of the Rule of 3 (Ro3). This is unfortunate because it would have been the best place for the authors to define the fragment library design problem and review the extensive literature on the subject. Ro3 was originally stated in a short communication and the analysis that forms its basis is not shared. As an aside, you need to be wary of rules like these because the cutoffs and thresholds may have been imposed arbitrarily by those analyzing the data. For example, the GSK 4/400 rule actually reflects the scheme used to categorize continuous data and it could just as easily have been the GSK 3.75/412 rule if the data had been pre-processed differently. I have written a couple ( 1 | 2 ) of blog posts on Ro3 but I'll comment here so as to keep this post as self-contained as possible. In my view, Ro3 is a crude attempt to appeal to the herding instinct of drug discovery scientists by milking a sacred cow (Ro5). The uncertainties in hydrogen bond acceptor definitions and logP prediction algorithms mean that nobody knows exactly how others have applied Ro3. It also is somewhat ironic that the first article referenced by this Perspective actually states Ro3 incorrectly. If we assume that Ro5 hydrogen bond acceptor definitions are being used then Ro3 would appear to be an excellent way to ensure that potentially interesting acidic species such as tetrazoles and acylsulfonamides are excluded from fragment screening libraries. While this might not be too much of an issue if identification of adenine mimics is your principal raison d'être, some researchers may wish to take a broader view of the scope of FBDD. It is even possible that rigid adherence to Ro3 may have led to the fragment starting points for this project being discovered in Gothenburg rather than Cambridge.
Although it is difficult to make an objective assessment of the impact of Ro3 on industrial FBDD, its publication did prove to be manna from heaven for vendors of compounds who could now flog milligram quantities of samples that had previously been gathering dust in stock rooms.



This is a good point to see what 'Sailor' Malan might have made of this article. While dropping Ro3 propaganda leaflets, you broke rule 7 (Never fly straight and level for more than 30 seconds in the combat area) and provided an easy opportunity for an opponent to validate rule 10 (Go in quickly - Punch hard - Get out). Faster than you can say "thought leader" you've been bounced by an Me 109 flying out of the sun. A short, accurate (and ligand-efficient) burst leaves you pondering the lipophilicity of the mixture of glycol and oil that now obscures your windscreen. The good news is that you have been bettered by a top ace whose h index is quite a bit higher than yours. The bad news is that your cockpit canopy is stuck. "Spring chicken to shitehawk in one easy lesson."

Of course, there's a lot more to fragment screening library design than counting hydrogen bonding groups and setting cutoffs for molecular weight and predicted logP. Molecular complexity is one of the most important considerations when selecting compounds (fragments or otherwise) and anybody even contemplating compound library design needs to understand the model introduced by Hann and colleagues. This molecular complexity model is conceptually very important but it is not really a practical tool for selecting compounds. However, there are other ways to define molecular complexity that allow the general concept to be distilled into usable compound selection criteria. For example, I've used restriction of the extent of substitution (as detailed in this article) to control complexity and this can be achieved using SMARTS notation to impose substructural requirements. The thinking here is actually very close to the philosophy behind 'needle screening', which was first described in 2000 by researchers at Roche, although they didn't actually use the term 'molecular complexity'.
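The central argument of the Hann model can be captured in a few lines of toy Python. This is my own sketch with made-up probabilities, not anything from the original paper: the chance that every feature of a ligand is complementary to the binding site falls geometrically with the number of features, while the chance that a fully matched ligand binds tightly enough to be detected rises with it.

```python
# Toy illustration of the Hann complexity model (illustrative numbers,
# not fitted to anything). n_features is a crude proxy for complexity.

def p_unique_match(n_features, p_match=0.8):
    # every feature must independently match for a productive binding mode
    return p_match ** n_features

def p_detect(n_features, p_miss=0.9):
    # more matched features -> more binding energy -> easier to detect
    return 1.0 - p_miss ** n_features

def p_useful(n_features):
    # probability of a measurable AND well-defined hit
    return p_unique_match(n_features) * p_detect(n_features)

best = max(range(1, 11), key=p_useful)
# p_useful peaks at low-to-moderate complexity, which is the conceptual
# argument for screening fragments rather than elaborate molecules
```

The exact position of the peak obviously depends on the made-up probabilities, which is one reason the model is a conceptual guide rather than a practical selection tool.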


As one would expect, the purging of unwholesome compounds such as PAINS is discussed. The PAINS field suffers from ambiguity, extrapolation and convolution of fact with opinion. This series ( 1 | 2 | 3 | 4) of blog posts will give you a better idea of my concerns. I say "ambiguity" because it's really difficult to know whether the basis for labeling a compound as a PAIN (or should that be a PAINS) is experimental observation, model-based prediction or opinion. I say "extrapolation" because the original PAINS study equates PAIN with frequent-hitter behavior in a panel of six AlphaScreen assays and this is extrapolated to pan-assay (which many would take to mean different types of assays) interference. There also seems to be a tendency to extrapolate the frequent-hitter behavior in the AlphaScreen panel to reactivity with protein although I am not aware that any of the compounds identified as PAINS in the original study were shown to react with any of the proteins in the AlphaScreen panel used in that study. This is a good point to include a graphic to break the text up a bit and, given an underlying theme of this post, I'll use this picture of a diving Stuka.



One view of the fragment screening mission is that we are trying to present diverse molecular recognition elements to targets of interest. In the context of screening library design, we tend to think of molecular recognition in terms of pharmacophores, shapes and scaffolds. Although you do need to keep lipophilicity and molecular size under tight control, the case can be made for including compounds that would usually be considered to be beyond norms of molecular good taste. In a fragment screening situation I would typically want to be in a position to present molecular recognition elements like naphthalene, biphenyl, adamantane and (especially after my time at CSIRO) cubane to target proteins. Keeping an eye on both molecular complexity and aqueous solubility, I'd select compounds with a single (probably cationic) substituent and I'd not let rules get in the way of molecular recognition criteria. In some ways compound selections like those above can be seen as compliance with Rule 8 (When diving to attack always leave a proportion of your formation above to act as top guard). However, I need to say something about sampling chemical space in order to make that connection a bit clearer.

This is a good point for another graphic and it's fair to say that the Stuka and the B-52 differed somewhat in their approaches to target engagement. The B-52 below is not in the best state of repair and, given that I took the photo in Hanoi, this is perhaps not totally surprising. The key to library design is coverage and former bombardier Joseph Heller makes an insightful comment on this topic. One wonders what First Lieutenant Minderbinder would have made of the licensing deals and mergers that make the pharma/biotech industry such an exciting place to work.  


The following graphic, pulled from an old post, illustrates coverage (and diversity) from the perspective of somebody designing a screening library. Although I've shown the compounds in a 2-dimensional space, sampling is often done using molecular similarity, which we can think of as inversely related to distance. A high degree of molecular similarity between two compounds indicates that their molecular structures are nearby in chemical space. This is a distance-geometric view of chemical space in which we know the relative positions of molecular structures but not their absolute locations. When we describe a selection of molecular structures as diverse, we're saying that the two most similar ones are relatively distant from each other. The primary objective of screening library design is to cover relevant chemical space as effectively as possible and the devil is in details like 'relevant' and 'effectively'. The stars in the graphic below show molecular structures that have been selected to cover the chemical space shown. When representing a number of molecular structures by a single molecular structure it is important, as it is in politics, that what is representative not be too distant from what is being represented. You might ask, "how far is acceptable?" and my response would be, as it often is in Brazil, "boa pergunta". One problem is that scaffolds differ in their 'contributions' to molecular similarity and activity cliffs usually provide a welcome antidote to the hubris of the library designer.
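One common way to pick the 'stars' is greedy max-min selection. Here is a minimal sketch using illustrative 2-D coordinates in place of fingerprint-based similarity (in practice one would maximize the minimum dissimilarity between selected molecular structures; the geometry is the same):

```python
import math

# Minimal max-min diversity picker over points in a toy 2-D 'chemical
# space'. Coordinates are illustrative stand-ins for descriptor values.
def maxmin_pick(points, k):
    """Greedily select k points so that the closest (most similar) pair
    in the selection is as distant as possible."""
    dist = lambda a, b: math.hypot(a[0] - b[0], a[1] - b[1])
    picked = [points[0]]                      # arbitrary seed
    while len(picked) < k:
        # choose the candidate whose nearest picked neighbour is furthest
        best = max((p for p in points if p not in picked),
                   key=lambda p: min(dist(p, q) for q in picked))
        picked.append(best)
    return picked

# two tight clusters plus two outliers; the picker spreads across them
space = [(0, 0), (0.1, 0.1), (5, 0), (5.1, 0.2), (0, 5), (2.5, 2.5)]
stars = maxmin_pick(space, 3)
```

Note that the greedy picker happily selects singletons at the edge of the space, which is exactly why a representative must not be too distant from what it represents: max-min optimizes spread, not representativeness.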


I would argue that property distributions are more important than cutoff values for properties and it is during the sampling phase of library design that these distributions are shaped. One way of controlling distributions is to first define regions of chemical space using progressively less restrictive selection criteria and then sample these in order, starting with the most restrictively defined region. However, this is not the only way to sample and one might also try to weight fragment selection using desirability functions. Obviously, I'm not going to provide a comprehensive review of chemical space sampling in a couple of paragraphs of a blog post but I hope to have shown that the sampling of chemical space is an important aspect of fragment screening library design. I also hope to have shown that failing to address the issue of sampling relevant chemical space represents a serious deficiency of the featured Perspective.
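The tiered approach to shaping distributions can be sketched in a few lines. The tiers and descriptor values below are made up for illustration; the point is simply that the library fills from the most restrictively defined region outwards, so the distribution is shaped by the ordering rather than by a single hard cutoff:

```python
# Sketch of distribution shaping by tiered sampling: nested regions of
# chemical space are defined by progressively relaxed criteria and the
# library is filled from the most restrictive tier outwards.
def tiered_sample(candidates, tiers, library_size):
    """candidates: list of (name, heavy_atom_count, clogp) tuples.
    tiers: predicates ordered from most to least restrictive."""
    library, seen = [], set()
    for passes in tiers:
        for name, hac, clogp in candidates:
            if len(library) == library_size:
                return library
            if name not in seen and passes(hac, clogp):
                library.append(name)
                seen.add(name)
    return library

tiers = [
    lambda hac, clogp: 9 <= hac <= 13 and 0 <= clogp <= 2,   # core region
    lambda hac, clogp: 9 <= hac <= 16 and -1 <= clogp <= 3,  # relaxed
]
cands = [("A", 10, 1.0), ("B", 15, 2.5), ("C", 12, 0.5), ("D", 20, 4.0)]
lib = tiered_sample(cands, tiers, 3)
# A and C are taken from the core region first, B from the relaxed
# region, and D never qualifies
```

A desirability-function approach would replace the hard tier predicates with a continuous score and sample in score order, which achieves a similar shaping of the distributions without the discontinuities.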

The Perspective concludes with a number of recommendations and I'll conclude the post with comments on some of these. I wouldn't have too much of a problem with the proposed 9 - 16 heavy atom range as a guideline although I would consider a requirement that predicted octanol/water logP be in the range 0.0 - 2.0 to be overly restrictive. It would have been useful for the authors to say how they arrived at these figures and I invite all of them to think very carefully about exactly what they mean by "cLogP" and "freely rotatable bonds" so we don't have a repeat of the Ro3 farce. There are many devils in the details of the statement: "avoid compounds/functional groups known to be associated with high reactivity, aggregation in solution, or false positives". My response to "known" is that it is not always easy to distinguish knowledge from opinion and "associated" (like correlated) is not a simple yes/no thing. It is not clear how "synthetically accessible vectors for fragment growth" should be defined since there is also a conformational stability issue if bonds to hydrogen are regarded as growth vectors.

This is a good point at which to wrap things up and I'd like to share some more of Sailor Malan's wisdom before I go. The first rule (Wait until you see the whites of his eyes. Fire short bursts of 1 to 2 seconds and only when your sights are definitely 'ON') is my personal favorite and it provides excellent, practical advice for anybody reviewing the scientific literature. I'll leave you with a short video in which a pre-Jackal Edward Fox displays marksmanship and escaping skills that would have served him well in the later movie. At the start of the video, the chemists and biologists have been bickering (of course, this never really happens in real life) and the VP for biophysics is trying to get them to toe the line. Then one of the biologists asks the VP for biophysics if they can do some phenotypic screening and you'll need to watch the video to see what happens next...