Thursday, 1 January 2026

Hit to Lead best practice?

I'm now in Trinidad and I'll share a 180° panorama from Paramin where I walk for exercise. This district in Trinidad's Northern Range is renowned for its agriculture and the most excellent produce is grown in 'gardens' on steep hillsides. My walk would take about two and a quarter hours if I just walked but it usually takes rather longer because I like to take photos and often stop on the ridge to gaze at corbeaux 'surfing' the updrafts. Most of all I enjoy catching up with friends in Paramin and not so long ago one of them was telling me about the sound made by douens (which have terrified me since childhood because I was never baptised). Some years ago I was struggling along the ridge with a hacking cough that I'd brought with me from the UK three days previously when I heard a familiar voice (one of my friends was visiting his sister). The conversation turned to my cough and he instructed his sister to bring some medicine. She produced a bottle of a liquid that looked like fluorescein and, as she decanted some into a shot glass my friend exclaimed "dat too much yuh go kill him". The liquid appeared to have a puncheon base and my friend's sister also gave me some bush to make tea. My cough was history after three days.             


I’ll be taking a look at The European Federation for Medicinal Chemistry and Chemical Biology (EFMC) Best Practice Initiative: Hit to Lead (Q2025) in this post. I have a number of criticisms of this study and it shouldn’t need saying that you raise the bar for yourself when you present your work as defining best practices. As is customary for blog posts here at Molecular Design I’ve used Q2025 reference numbers when referring to literature studies and quoted text is indented with my comments in red italics. This will be a long post and strong coffee is recommended.

Best practices are, in essence, ways of doing things and it’s actually very difficult to demonstrate objectively that one way of doing things is better (or worse) than another way. My general view of Q2025 is of a poorly organized article that at times lacks clarity and coherence. Some of the advice offered on how best to do Hit to Lead (H2L) work is unsound and the Authors also make a number of significant errors. Although the abstract refers to “contemporary drug discovery” the recommended best practices do, in my view, appear to be firmly rooted in the past given that fragment-based design (FBD) is not covered and there is no mention of important 'new' modalities such as irreversible covalent inhibition and targeted protein degradation. It’s worth mentioning in passing that biological activity for some new modalities cannot be meaningfully quantified as a single parameter such as an IC50 value and this complicates the use of ligand efficiency metrics (a post on covalent ligand efficiency will give you an idea of some of the difficulties) which the Authors seem to consider important in H2L work. I consider the quantity of literature cited in the Q2025 study to be excessive, especially given that some of the cited articles have minimal relevance to H2L work (the failure of the Authors to cite R2009 is also noteworthy). In some cases the cited literature does not support assertions made by the Authors. In my view Figures 1, 5 and 8 are redundant.

While I see plenty wrong with Q2025 it’s worth flagging up points on which the Authors and I appear to be in agreement. I think that they put it well with the following statement:

Leads have line of sight to a development candidate and bring an understanding of what priorities Lead Optimisation should address.

I used this football analogy in an earlier post:

The screening phase is followed by the hit-to-lead phase and it can be helpful to draw an analogy between drug discovery and what is called football outside the USA. It’s not generally possible to design a drug from screening output alone and to attempt to do so would be the equivalent of taking a shot at goal from the centre spot. Just as the midfielders try to move the ball closer to the opposition goal, the hit-to-lead team use the screening hits as starting points for design of higher affinity compounds. The main objective in the hit-to-lead phase is to generate information that can be used for design and mapping structure-activity relationships for the more interesting hits is a common activity in hit-to-lead work.

I certainly agree that it is important to establish structure-activity relationships (SARs) for structural series of interest although I have no idea what the Authors mean by “dynamic SAR” (I’m guessing that some of them might be similarly in the dark on this point). I also agree that consideration of physicochemical properties, especially lipophilicity, is very important in H2L work (just as it is in optimisation of the leads) although the case for a Nobel Prize made in a 2024 JMC Editorial does, in my view, appear to have been overcooked.

I argue that drug discovery should be seen in a Design of Experiments framework (generate the information that you need as efficiently as possible) rather than as the prediction exercise that many who tout machine learning (ML) as a panacea for Pharma's ills would have you believe. Regardless of which view prevails it’s abundantly clear that generation and analysis of data are very important in contemporary drug discovery and are likely to become even more important in the future. However, if you’re going to base decisions on trends in data then it’s important that you know how strong the trends are because this tells you how much weight to give to the trends when making your decisions. Most drug discovery scientists will have encountered analyses of relationships between predictors of ADME (absorption, distribution, metabolism, and excretion) behaviour and physicochemical and chemical structure descriptors and we observed in the KM2013 perspective that:

The wide acceptance of Ro5 provided other researchers with an incentive to publish analyses of their own data and those who have followed the drug discovery literature over the last decade or so will have become aware of a publication genre that can be described as ‘retrospective data analysis of large proprietary data sets’ or, more succinctly, as ‘Ro5 envy’.

In some cases trends observed in data are presented in ways that make them appear to be stronger than they actually are (this is typically achieved by categorizing continuous-valued data prior to analysis) and [13a], [24] and [26] were criticised in this context in KM2013. When reading articles on drug-likeness and compound quality it is also important to be aware that correlation does not imply causation. One should be particularly wary of studies such as [20c] which present questionable analyses of proprietary data as "facts" or claim that such analyses have revealed "principles". I see the weakness of these trends partly as a reflection of chemical structure diversity in datasets and would expect the corresponding trends to be stronger within structural series (I offer the following advice in NoLE):

Drug designers should not automatically assume that conclusions drawn from analysis of large, structurally-diverse data sets are necessarily relevant to the specific drug design projects on which they are working.

I see erosion of critical thinking skills as a significant problem in contemporary drug discovery and some leaders in the field appear to have lost the ability to distinguish what they know from what they believe. As I observed in a review of a 2024 JMC Editorial (Property-Based Drug Design Merits a Nobel Prize) the Rule of 5 (Ro5) is not actually supported by data in the form that it was stated. The wide acceptance of Ro5 as a definition of drug-likeness propagates what I consider to be a misleading view that drugs occupy a contiguous and distinct region of chemical space. Some of the claims made in the JMC Editorial (“a compound is more likely to be clinically developable when LipE > 5”, “a discovery compound is more likely to become a drug when Fsp3 > 0.40” and “a compound is more likely to have good developability when PFI < 7”) do not appear to be based on data. I remain sceptical that developability and likelihood of clinical success of a compound can be meaningfully assessed when one doesn't even know whether the compound actually exhibits activity against the target(s) of interest. In my view the suggestion that simple drug discovery guidelines are worthy of a Nobel Prize does a huge disservice to drug discovery scientists by trivializing the very significant challenges that they face.

Like many in the drug discovery field, I consider lipophilicity to be the single most important physicochemical property in drug discovery and I would generally anticipate that a surfeit of lipophilicity will end in tears. That said, I don't consider lipophilicity to be usefully predictive of physicochemical properties such as permeability and aqueous solubility that are more relevant than lipophilicity from the perspective of oral absorption. When I assert that lipophilicity is not "usefully predictive" I'm certainly not denying that trends in the data exist. The problem is that the trends are not so strong that having a solubility value that has been predicted from lipophilicity means that you no longer need to measure aqueous solubility.

In drug discovery projects I generally recommend examination of the response of potency (expressed as a logarithm) to increased lipophilicity. In the ideal situation the correlation of potency with lipophilicity will be weak, indicating that potency is driven by factors other than lipophilicity. If the correlation of potency with lipophilicity is strong then you need the response (the slope for a linear correlation) to be relatively steep. I consider it to be generally helpful to plot potency against lipophilicity with reference lines corresponding to different LipE values (see R2009) and I would also suggest modelling the response and using the residuals to quantify the extent that individual potency measurements beat (or are beaten by) the trend in the data (the approach is outlined in the 'Alternatives to ligand efficiency for normalization of affinity' section of NoLE).
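The residual-based approach outlined in NoLE can be sketched in a few lines of Python (the data below are invented purely for illustration; in a real project you would use your own potency and lipophilicity measurements, and you might prefer a nonlinear model for the response):

```python
# Sketch of residual-based normalization of potency with respect to
# lipophilicity: fit pIC50 against log P and use the residuals to score
# how far each compound beats (or is beaten by) the trend in the data.
import numpy as np

logp = np.array([1.2, 2.0, 2.7, 3.1, 3.8, 4.5])    # hypothetical log P values
pic50 = np.array([5.1, 5.8, 6.0, 6.9, 7.1, 7.8])   # hypothetical potencies

# Least-squares fit of the response of potency to lipophilicity
slope, intercept = np.polyfit(logp, pic50, 1)
residuals = pic50 - (slope * logp + intercept)

for lp, r in zip(logp, residuals):
    print(f"log P = {lp:.1f}  residual = {r:+.2f}")
```

A positive residual indicates a compound that beats the trend in the project data; unlike LipE, this approach does not impose a slope of exactly 1 on the response of potency to lipophilicity.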

In drug discovery lipophilicity is usually quantified by the logarithm of the octanol/water partition coefficient (log P) or distribution coefficient (log D). The choice of octanol/water for quantification of lipophilicity is arbitrary and some, including me, consider saturated hydrocarbons such as cyclohexane or hexadecane to be physically more realistic than octanol as a model for the core of a lipid bilayer. It is the distribution coefficient (D) rather than the partition coefficient (P) that is measured for lipophilicity assessment although the two quantities are equivalent when ionization can be safely neglected. Values of log P for ionizable compounds can be derived from the response of log D to pH although this is not generally done routinely in drug discovery. Alternatively, you can make the assumption that only neutral forms of compounds partition into the organic phase and use Equation (1) below to convert log D values to log P values (to do this you’ll also need a reliable estimate for pKa in order to calculate the neutral fraction). When log D (as opposed to log P) is used to assess the ‘quality’ of compounds you can make compounds better simply by increasing the extent to which they are ionized and I hope you can see that going down this path is likely to end as well as things did for the Sixth Army at Stalingrad.
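For a monoprotic compound the neutral-fraction correction takes a standard form, which can be sketched as follows (this assumes, as stated above, that only the neutral form partitions into the organic phase; the function name and example values are my own):

```python
import math

def logp_from_logd(logd, pka, ph=7.4, acid=True):
    """Convert log D at a given pH to log P for a monoprotic compound,
    assuming only the neutral form partitions into the organic phase:
    log P = log D - log10(fraction neutral)."""
    if acid:
        # fraction neutral for a monoprotic acid
        frac_neutral = 1.0 / (1.0 + 10.0 ** (ph - pka))
    else:
        # fraction neutral for a monoprotic base
        frac_neutral = 1.0 / (1.0 + 10.0 ** (pka - ph))
    return logd - math.log10(frac_neutral)

# A carboxylic acid with pKa 4.4 is ~99.9% ionized at pH 7.4, so its
# log P is about 3 units higher than its log D7.4
print(logp_from_logd(logd=1.0, pka=4.4, acid=True))
```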


In drug discovery log P values are typically calculated and it can often be quite difficult when reading the literature to know which method has been used for the calculations (sometimes the term ‘cLogP’ appears to have been used merely to denote that log P values have been calculated).  For example, it is stated in [13a] that “Physical property data were obtained from AstraZeneca’s C-Lab tool, incorporating standard packages for LogP calculations (cLogP, ACDLogP), and an in-house algorithm for the distribution coefficient (1-octanol–water LogD at pH 7.4)”. In general, different prediction methods will give different log P values for the same compound (for example the Ro5 lipophilicity cutoff is 5 when ClogP is used but 4.15 when MlogP is used). That said, choice of method for predicting log P and whether you use measured log D or predicted log P become less important issues when working within structural series because hydrogen bond donors and acceptors, and ionizable groups tend to be relatively conserved under this scenario.

That log D and log P are different quantities in the context of drug design is one of a number of things that the Authors of [34a] (Molecular Property Design: Does Everyone Get It?) just don’t seem to ‘get’ and I’ll point you toward a blog post in which this point is discussed in a bit more detail. Let’s examine Figure 2 (Impact of hydrophobicity on developability assays and the profile of marketed oral drugs) of [34a] and I’d like you to take a look at the upper panel (a). You’ll notice that the visualization for some of the ‘developability’ assays is based on PFI (derived from log D measured chromatographically at pH 7.4). However, the visualization for hERG (+1 charge) and promiscuity is based on iPFI (derived from ‘Chrom logP’ and it is not clear how this quantity was defined or generated). I would also argue that the activity criterion (pIC50 > 5) for the promiscuity analysis is too permissive to be physiologically relevant (this is a common issue in the promiscuity literature). As an aside, I am unconvinced that log D values were actually measured chromatographically at pH 7.4 for all the drugs that form the basis of the analysis shown in the lower panel (b) of Figure 2.

After a long preamble it’s time to start my review of Q2025 and comments will follow the order of the article. I see citation of [2] and [3] as gratuitous while [4] does not appear to present hard evidence for the view that “ensuring high quality of lead series is a large cost and time saver in the overall process of drug discovery” (it must be stressed that I certainly don’t deny the value of high quality lead series and am merely pointing out that the chosen reference does not actually demonstrate that higher quality of lead series results in cost and time savings in drug discovery).

In my view neither Figure 1 nor its caption (see below) makes any sense.

Figure 1. Illustration of the multi-objective characterisation necessary in the journey from a hit to a drug. All these necessary characteristics, described by illustrative principal components, are influenced by the physicochemical properties of the molecules.

You’ll frequently encounter graphics like Figure 1 that show low-dimensional chemical spaces in the drug discovery literature (for example, a 2-dimensional space might be specified in terms of lipophilicity and molecular size). While it’s very easy to generate graphics like these the relevance of the chemical spaces to drug design is often unclear. There are ways in which you can demonstrate the relevance of a chemical space to drug design and, for example, you might build usefully predictive models for quantities such as IC50, aqueous solubility or permeability using only the dimensions of the particular chemical space as descriptors. Alternatively, you could show that compounds in mutually exclusive categories such as ‘progressed to phase 2’ and ‘failed to progress to phase 2’ occupy different regions of the chemical space (note that it’s not sufficient to show that a single class of compounds such as ‘approved drugs’ occupies a particular region within the chemical space and this is the essence of a general criticism that I make of Ro5 and QED). It is common to depict the different categories as ellipses that enclose a given fraction of the data points corresponding to each category and the orientation of each ellipse with respect to the axes indicates the degree to which the descriptors that define the chemical space are correlated for each category. One problem with Figure 1 is that the meaning of the ellipses is unclear and I would challenge the assertion made by the Authors that “the journey of a drug discovery campaign is characterized in Figure 1, showing how the active hit needs to be modified to address the requirements impacting the efficacy and safety of the molecule”.

Potency optimisation alone is not a viable strategy towards the discovery of efficacious and safe drugs, or even high-quality leads. Concurrent optimisation of the physicochemical properties of a molecule is the most important facet of drug discovery, as these properties influence its behaviours, disposition and efficacy [12a | 12b]. [While I certainly agree that there is a lot more to drug design than maximisation of potency I would argue that controlling exposure is a more important objective than optimization of physicochemical properties. I don't consider either reference as evidence that "concurrent optimisation of the physicochemical properties of a molecule is the most important facet of drug discovery" and it is not accurate to describe metabolic stability, active efflux and  affinity for anti-targets as "physicochemical properties".  I think the Authors need to say more about which physicochemical properties they recommend to be optimized and be clearer about exactly what constitutes optimization. Lipophilicity alone is not usefully predictive of properties such as bioavailability, distribution and clearance that determine the in vivo behaviours of drugs.] Together these outcomes define the quality of the molecule, indicative of its chances of success in the clinic, as evidenced in numerous studies [13a | 13b]. [Neither of these articles appears to provide convincing evidence of a causal relationship between “the quality of a molecule” and probability of success in the clinic.  Much of the 'analysis' in [13a] consists of plots of median values without any indication of the spreads in the corresponding distributions. As explained in KM2013 presenting data in this manner exaggerates trends and I consider it unwise to base decisions on data that have been presented in this manner. 
Quite aside from the issue of hidden variation I do not consider the relationship between promiscuity and median cLogP reported (Figure 3a) in [13a] to be indicative of probability of success in the clinic, given that the criterion for 'activity' ( > 30% inhibition at 10 µM) is far too permissive to be physiologically relevant (this is a common issue in the promiscuity literature).]

While the optimal lipophilicity range has been suggested as a log D7.4 between 1 and 3, [15] this is highly dependent on the chemical series. [The focus of the analysis was permeability and the range was actually defined in terms of AZlogD (calculated using proprietary in-house software) as opposed to log D measured at 7.4. The correlation between the logarithm of the A to B permeability and AZlogD is actually very weak (r2 = 0.16) which would imply a high degree of uncertainty in threshold values used to specify the optimal lipophilicity range. While I remain sceptical about the feasibility of meaningfully defining optimal property ranges the assertion that the proposed range in AZlogD of 1 to 3 “is highly dependent on the chemical series” is pure speculation and is not based on data.] Best practice would be to generate data for a diverse set of compounds in a series, if measuring it for all analogues is not possible, and determine the lipophilicity range that leads to the most balanced properties and potency [3 | 16]. [It is not clear what the Authors mean by “most balanced properties and potency” nor is it clear how one is actually supposed to use lipophilicity measurements to objectively “determine the lipophilicity range that leads to the most balanced properties and potency”. My view is that to demonstrate "balanced properties and potency" would require measurements of properties such as aqueous solubility and permeability that are more predictive than lipophilicity of exposure in vivo. I do not consider either [3] or [16] to support the assertions being made by the Authors.] Lipophilicity and pKa prediction models can then guide further designs and synthesis of analogues along the optimisation pathway (Figure 3 [17]), but measurements are advised, particularly by chromatographic methods, such as Chrom log D7.4 [18], in contemporary practice.
[In general, it is very difficult to convincingly demonstrate that one measure of lipophilicity is superior to another. Chromatographic measurement of log D is faster than the shake flask method used traditionally but it is unclear as to which solvent system the measurement corresponds. Furthermore, the high surface area to volume ratio of the stationary phase means that an ionized species can interact to a significant extent with the non-polar stationary phase while keeping the ionized group in contact with the polar mobile phase and one should anticipate that the contribution of ionization to log D values might be lower in magnitude than for a shake flask measurement.]

As noted earlier in the post, I consider plotting potency against lipophilicity with reference lines corresponding to different LLE (LipE) values (as is done in Figure 3, which also serves as the graphical abstract; see R2009, which really should have been cited) to be a good way for H2L project teams to visualize potency measurements for their project compounds. That said, I consider the view of the discovery process implied by Figure 3 to be neither accurate nor of any practical value for scientists working on H2L projects. It is relatively easy to define optimization of potency and measurements in an in vitro assay are typically relevant to target engagement in vivo (uncertainty in the concentration of the drug in the target compartment, and of the species with which it competes, is likely to be the bigger issue when trying to understand why in vitro potency fails to translate to beneficial effects in vivo).

However, there is quite a bit more to optimization of properties such as permeability, aqueous solubility, metabolic stability and pharmacological promiscuity that are believed to be predictive of ADME and toxicity, and I consider a view that determining "the lipophilicity range that leads to the most balanced properties and potency" constitutes optimization to be hopelessly naive. The main challenge in H2L work (and in lead optimization) is to identify compounds for which potency and properties related to ADME and toxicity are all acceptable.           

Figure 3. There are numerous routes to climb a mountain, as there are to discover a drug, but a measured approach to lipophilicity will guide an optimal path, [The Authors need to articulate what they mean by “a measured approach to lipophilicity” (which does come across as arm-waving) and provide evidence to support their claim that it “will guide an optimal path”.] where the outcome is usually driven by a balance of activity and lipophilicity [This appears to be a statement of belief and the Authors do need to provide evidence to support their claim. The Authors also need to say more about how the “balance of activity and lipophilicity” can be objectively assessed.] (The parallel lines represent LLE, i.e. pIC50 - log P). [This way of visualizing data was introduced in the R2009 study which, in my view, should have been cited.]

Thus the Distribution Coefficient, (log D at a given pH) is a highly influential physical property governing ADMET profiles [20a | 20b | 20c] such as on- and off-target potency, solubility, permeability, metabolism and plasma protein binding (Figure 4) [14b]. [I recommend that the term ‘ADMET’ not be used in drug discovery because ADME (Absorption, Distribution, Metabolism, and Excretion) and T (Toxicity) are completely different issues that need to be addressed differently in design. I would argue that the ADME profile of a drug is actually defined by its in vivo characteristics such as fraction absorbed (which may vary with dose and formulation), volume of distribution and clearance (the Authors appear to be confusing ADME with in vitro predictors of ADME) and I would also argue that toxicity is an in vivo phenomenon. In order to support the claim that log D “is a highly influential physical property governing ADMET profiles” it would be necessary to show that log D is usefully predictive of what happens to drugs in vivo. My view is that the cited literature does not support the claim that log D “is a highly influential physical property governing ADMET profiles” given that [20a] does not even mention log D and neither [20b] nor [20c] provides any evidence that log D is usefully predictive of in vivo behaviour of drugs.]

Figure 4. The impact of increasing lipophilicity on various developability outcomes [14b] [It is unclear as to whether lipophilicity is defined for this graphic in terms of log P or log D. It would be necessary to show more than just the ‘sense’ of trends for the term “impact” to be appropriate in this context. I do not consider the use of the term “developability outcomes” to be accurate.]

Aqueous solubility is certainly an important consideration in H2L work although I think that the Authors could have articulated the relevant physical chemistry rather more clearly than they have done. You can think of the process of dissolution as occurring in two steps (sublimation of the solid followed by transfer from the gas phase to water). Lipophilicity usually features in models for prediction of aqueous solubility although I consider wet octanol to be a thoroughly unconvincing model for the gas phase. We generally assume that aqueous solubility is limited by the solubility of the neutral form (which is why ionization tends to be beneficial) but when this assumption breaks down the solubility that you measure will depend on both the nature and concentration of the counter-ion. As I note in HBD3, optimization of intrinsic aqueous solubility (the solubility of the neutral form of the compound) is still a valid objective for ionizable compounds because we're typically assuming that only neutral species can cross the cell membrane by passive permeation.

Some general advice that I would offer to drug discovery scientists encountering solubility issues is that they should try to think about molecular structures from the perspectives of molecular interactions in the solid state and crystal packing. I would expect the left hand 'Reduce crystal packing' structure in Figure 6 to be able to easily adopt a conformation in which the planes corresponding to the aromatic rings and amide are all mutually coplanar (this is a scenario in which a non-aromatic replacement for an aromatic ring might be expected to have a relatively large impact). In HBD3 I suggest that deleterious effects of aromatic rings on aqueous solubility might be due to the molecular interactions of the aromatic rings rather than their planarity. I also suggest in HBD3 that elimination of non-essential hydrogen bond donors be considered as a tactic for improving aqueous solubility because it tends to increase the imbalance between hydrogen bond donors and acceptors while minimizing the resulting increase in lipophilicity.

Rational [this use of "rational" is tautological] reasons for poor solubility were succinctly described by Bergstrom, who coined "Brick Dust and Greaseballs" as two limiting phenomena in drug discovery [22] which are in line with the empirical findings that led to the General Solubility Equation [23] (Figure 5). [I don’t consider the General Solubility Equation to have any relevance to H2L work because it has not been shown to be usefully predictive of aqueous solubility for compounds of interest to medicinal chemists and the inclusion of Figure 5, which merely shows how predicted solubility values map on to an arbitrary categorisation scheme, appears to be gratuitous.] Succinctly, three factors influence solubility: lipophilicity, solid state interactions and ionisation. [It is solvation energy as opposed to lipophilicity that influences solubility and wet octanol is a poor model for the gas phase.] Determining which are the strongest drivers of low solubility will guide the optimisation (Figure 6). Using the analysis in Figure 5 the Solubility Forecast Index emerged, using the principle that an aromatic ring is detrimental to solubility, roughly equivalent to an extra log unit of lipophilicity for each aromatic ring (Thus SFI = clog D7.4 + #Ar) [24]. [I consider the use of the term “principle” in this context to be inaccurate given that the basis for SFI is subjective interpretation of a graphic generated from proprietary aqueous solubility data and I direct readers to the criticism of SFI in KM2013.] Minimising aromatic ring count is an important and statistically significant metric to consider [25] [The importance of minimizing aromatic ring count is debatable and it is meaningless to describe metrics as “statistically significant”.]
- consistent with the "escape from flatland" concept [26] that focusses on increasing the sp³ (versus sp²) ratio in molecules, [The focus in the “escape from flatland” study is actually on the fraction of carbon atoms that are sp3 (Fsp3) and not on “the sp³ (versus sp²) ratio”.] even though no significant trends are apparent in detailed analyses of sp³ fractions [27]. [The “analyses of sp³ fractions” in [27] consist of comparisons of drug - target medians for the periods 1939-1989, 1990-2009 and 2010-2020 and all appear to be statistically significant (although I don't consider these analyses to have any relevance to H2L work).]
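For readers unfamiliar with it, the General Solubility Equation referred to above is usually stated as log S = 0.5 − 0.01(MP − 25) − log P, with melting point in °C. A minimal sketch follows (the function name is my own and this is certainly not an endorsement of the equation's use in H2L work):

```python
def gse_log_solubility(melting_point_c, logp):
    """General Solubility Equation (Yalkowsky form):
    log S (molar) = 0.5 - 0.01 * (MP - 25) - log P,
    with MP in degrees Celsius (the melting point term is
    conventionally taken as zero for liquids, MP <= 25)."""
    return 0.5 - 0.01 * max(melting_point_c - 25.0, 0.0) - logp

# The two terms map loosely onto Bergstrom's limiting phenomena:
# 'greaseballs' are lipophilicity-limited, 'brick dust' is packing-limited
print(gse_log_solubility(120.0, 4.5))   # lipophilicity-dominated
print(gse_log_solubility(280.0, 1.5))   # melting-point-dominated
```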

An important factor in hit selection is to prioritise compounds with higher ligand efficiency. Ligand efficiency, defined as activity [LE is actually defined in terms of Gibbs free energy of binding and not activity.] per heavy atom (LE=1.37* pKi/Heavy Atom Count, Figure 7a), is commonly considered in discovery programmes as a quality metric [33]. [LE (Equation 3) is actually defined as the Gibbs free energy of binding, ΔG° (Equation 2), divided by the number of non-hydrogen atoms, NnH (this is identical to heavy atom count although I consider the term to be less confusing), but the quantity is physically (and thermodynamically) meaningless because perception of efficiency varies with the arbitrary concentration, C°, that defines the standard state (see Table 1 in NoLE). Using a standard concentration enables us to calculate changes in free energy that result from changes in composition and, while the convention of using C° = 1 M when reporting ΔG° values is certainly useful, it would be no less (or more) correct to report ΔG° values for C° = 1 µM. Put another way the widely held belief that 1 M is a 'privileged' standard concentration is thermodynamic nonsense (Equation 2 shows you how to interconvert ΔG° values between different standard concentrations). Given the serious deficiencies of LE as a drug design metric, I suggest modelling the response and using the residuals to quantify the extent that individual potency measurements beat (or are beaten by) the trend in the data (the approach is outlined in the 'Alternatives to ligand efficiency for normalization of affinity' section of NoLE). There are two errors in the expression that the Authors have used for LE (the molar energy units are missing and the expression is written in terms of Ki rather than KD).
The factor of 1.37 in the expression for LE comes from the conversion of affinity (or potency) to ΔG° at a temperature of 300 K, as recommended in [35], although biochemical assays are typically run at human body temperature (310 K). My view is that it is pointless to include the factor of 1.37 given that this entails dropping the molar energy units and using a temperature other than that at which the assay was run. Dropping the factor of 1.37 would also bring LE into line with LLE (LipE).] Various analyses suggest that, on average, this value barely change over the course of an optimisation process [20b | 27 | 34a | 34b] - so it is important to consider maintenance of any figure during any early SAR studies. [I disagree with this recommendation. These analyses are completely meaningless because the variation of LE over the course of an optimization itself varies with the concentration unit in which affinity (or potency) is expressed (Table 1 of NoLE illustrates this for three ligands that differ in molecular size and potency). In [34a] the start and finish values of LE were averaged over the different optimizations without showing variance and it is therefore not accurate to state that the study supports the assertion that LE values "barely change over the course of an optimisation process".] Lipophilic Ligand Efficiency (activity minus lipophilicity, typically pKi - log P, Figure 7b), which is widely recognised as the key principle in successful drug optimisation, comes into play both for hit prioritization and optimisation. [LLE is a simple mathematical expression and I don’t consider it accurate to describe it as a “principle” let alone “the key principle in successful drug optimisation”. LLE can be thought of as quantifying the energetic cost of transferring a ligand from octanol to its target binding site although this interpretation is only valid when the ligand is predominantly neutral at physiological pH and binds in its neutral form.
LLE is just one of a number of ways to normalize potency with respect to lipophilicity and I don't think that anybody has actually demonstrated that (pIC50 – log P) is any better (or worse) as a drug design principle than pIC50 – 0.9 × log P. When drug discovery scientists report that they have used LLE it often means that they have plotted their project data in a similar manner to Figure 3 as opposed to staring at a table of LLE values for their compounds. As an alternative to LLE (LipE) for normalization of affinity (or potency) with respect to lipophilicity I suggest modelling the response and using the residuals to quantify the extent to which individual potency measurements beat (or are beaten by) the trend in the data (the approach is outlined in the 'Alternatives to ligand efficiency for normalization of affinity' section of NoLE).] Improving this value reflects producing potent compounds without adding excessive lipophilicity. Taken together, it has been shown that for any given target, the drugs mostly lie towards the leading "nose" [?] where LE and LLE are both towards higher values [20b | 35]. [This is perhaps not the penetrating insight that the Authors consider it to be, given that drugs are usually more potent than the leads and hits from which they have been derived.] However, setting aspirational targets for either metric is unwise, as analysis of outcomes indicates that the values are target dependant [20b]. [I consider target dependency to be a complete red herring in this context and a more important issue is that you can’t compensate for inadequate potency by reducing molecular size or lipophilicity.] Focusing on increasing LLE to the maximum range possible and prioritizing series with higher average values is the recommended strategy [27 | 36]. 
[It is not clear what is meant by “increasing LLE to the maximum range possible” and I consider it very poor advice indeed to recommend “prioritizing series with higher average values” (my view is that you actually need to be comparing the compounds from different series that have a realistic chance of matching the desired lead profile). The Authors of Q2025 appear to be misrepresenting [36] given that the study does not actually recommend “prioritizing series with higher average values”. This blog post on [27] might be relevant.]
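To make the arithmetic behind my criticism concrete, here is a minimal Python sketch using an entirely hypothetical hit/lead pair (the numbers are illustrative only and are not taken from Q2025 or NoLE). It shows that the change in LE over an optimization flips sign depending on the concentration unit used to express potency, while the change in LLE is unaffected by that choice.

```python
# Hypothetical illustration of the unit-dependence of LE.
# LE = 1.37 * pIC50 / HA, where pIC50 is referenced to a chosen standard
# concentration (1 M by convention). Switching the unit shifts pIC50 by a
# constant, and because heavy atom counts differ between hit and lead, the
# *change* in LE over an optimization depends on that choice of unit.
# LLE = pIC50 - logP involves no division by size and has no such dependence.

def le(pic50, heavy_atoms, unit_shift=0.0):
    """Ligand efficiency; unit_shift = log10(1 M / new unit), e.g. -3 for mM."""
    return 1.37 * (pic50 + unit_shift) / heavy_atoms

def lle(pic50, logp):
    """Lipophilic ligand efficiency."""
    return pic50 - logp

# Hypothetical hit (pIC50 6, 20 heavy atoms, logP 2)
# and lead (pIC50 8, 32 heavy atoms, logP 3)
hit = (6.0, 20, 2.0)
lead = (8.0, 32, 3.0)

for shift, unit in [(0.0, "M"), (-3.0, "mM")]:
    d_le = le(lead[0], lead[1], shift) - le(hit[0], hit[1], shift)
    print(f"Change in LE with potency expressed in {unit}: {d_le:+.3f}")

d_lle = lle(lead[0], lead[2]) - lle(hit[0], hit[2])
print(f"Change in LLE (any unit): {d_lle:+.2f}")
```

With these numbers LE decreases over the 'optimization' when potency is referenced to 1 M but increases when it is referenced to 1 mM, which is exactly why averaged claims that LE "barely changes" are not unit-free statements.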

One can summarize this section with a simple but critical best practice: potency and properties (physicochemical and ADMET) have to be optimized in parallel (Figure 8) [37] to get to quality leads and later drug candidates with higher chances of clinical success. Whilst seemingly trivial, this proposition is rendered challenging by an "addiction to potency" and a constant reminder of this critical concept remains useful for medicinal chemists [38]. [My view is that many medicinal chemists had moved on from the addiction to potency when the molecular obesity article was published a decade and a half ago and I would question the article's relevance to contemporary H2L practice. The threshold values that define the GSK 4/400 rule actually come from an arbitrary scheme used to categorize the proprietary data analyzed in the G2008 study as opposed to being derived from objective analysis of the data. The study reproduces the promiscuity analysis from [13a] which I criticised earlier in this post for exaggerating the strength of the trend and using an excessively permissive threshold for ‘activity’.] With poor properties, even "good ligands" may not fully answer pharmacological questions [39a | 39b]. [These two articles focus on chemical probes and I don’t consider either article to have any relevance to H2L work. Chemical probes need to be highly selective (more so than drugs) and permeable although solubility requirements are likely to be less stringent when using chemical probes to study intracellular phenomena than in H2L work and one generally does not need to worry about achieving oral bioavailability.] 

I agree that mapping SARs for structural series of interest is an important aspect of H2L work and activity cliffs (small modifications in structure resulting in large changes in activity) are of particular interest given the potential for beating trends and achieving greater selectivity. Instances of decreased lipophilicity resulting in increased potency (or at least minimal loss of potency) should also be of significant interest to H2L teams. When mapping SARs it is important that structural transformations change a single pharmacophore feature at a time and one should always consider potential ‘collateral effects’, such as perturbed conformational preferences, that might confound the analysis. Some of the structural transformations shown in Figure 10 change more than one pharmacophore feature at a time which makes it impossible to determine which pharmacophore feature is required for activity.

Figure 10. Conceptual example of iterative SAR [the meaning of the term “iterative SAR” is unclear] to determine the pharmacophore. As each change may affect binding interactions, conformation and ionization state; complementary structural modification will be needed to understand the change in potency and determine the pharmacophore 

Is Nitrogen needed (e.g. HBA)? [In addition to eliminating the quinoline N hydrogen bond acceptor this structural transformation eliminates a potential pharmacophore feature (the amide carbonyl oxygen can function as a hydrogen bond acceptor) while creating a cationic centre which will incur a significant desolvation penalty.]

Is NH needed? [This structural transformation eliminates the amide NH but is unlikely to answer the question of whether the NH is needed because the amide carbonyl has also been eliminated.]

Is carbonyl needed? [The elimination of the amide carbonyl oxygen (hydrogen bond acceptor) creates a cationic centre which will incur a desolvation penalty.] 

As a last proposition, [49a | 49b] we suggest that the progress in computational physicochemical and ADMET property predictions represents an opportunity to accelerate the optimisation of molecules with a "predict-first" mindset [4 | 50]. [I certainly agree that models should be used if they are available. However, the citation of literature does appear to be gratuitous and it is unclear why the Authors believe that scientists working on H2L projects will benefit from knowing that a proprietary system for automated molecular design has been developed at GSK.] The first step is to generate sufficient data for a series to build confidence in [51] any models, which can then be exploited in the prioritization of compounds for synthesis that fit with aspirational profiles [My view is that it would be very unwise for H2L project teams to blindly use models without assessing how well the models predict project data and the citation of [51] appears to be gratuitous. Typically, H2L project teams use measured data to move their projects forward and generating data purely for the purpose of model evaluation is likely to be a distraction. One piece of advice that I will offer to H2L project teams is that they attempt to characterise responses of ADME predictors, such as aqueous solubility and permeability, to lipophilicity (likely to involve measurements for less potent compounds).] This ensures higher physicochemical quality [I consider “ensures” to be an exaggeration and I would argue that “physicochemical quality” is not something that can even be defined meaningfully or objectively (let alone quantified).], asks more pertinent questions and might reduce the total number of molecules made to get to the lead (Figure 11).
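As an entirely hypothetical illustration of the kind of check I'm recommending before trusting a property model prospectively, here is a minimal Python sketch (the measured and predicted logS values are made up for the example) that compares model predictions against measured project data using root-mean-square error and mean signed error.

```python
# Minimal sketch (hypothetical data) of assessing a property model against
# measured project data before using it to prioritize compounds for synthesis.
import math

# Hypothetical measured vs predicted aqueous solubility (logS) for a series
measured  = [-3.1, -4.0, -4.8, -5.5, -6.2]
predicted = [-3.4, -3.9, -5.1, -5.2, -6.8]

def rmse(obs, pred):
    """Root-mean-square prediction error (overall accuracy)."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def mean_signed_error(obs, pred):
    """Mean signed error; a consistent bias suggests the model may still
    rank compounds usefully even when absolute predictions are off."""
    return sum(p - o for o, p in zip(obs, pred)) / len(obs)

print(f"RMSE: {rmse(measured, predicted):.2f} log units")
print(f"Bias: {mean_signed_error(measured, predicted):+.2f} log units")
```

Separating overall error from systematic bias is the point of the two statistics: a model with a large RMSE but a near-constant offset can still be useful for ranking, whereas a bias-free model with a large RMSE may not be.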

It's been a long post and I'll say a big thank you for staying with me until the end. I wrote this post primarily for younger scientists and one piece of advice that I will offer to them is to not switch off their critical thinking skills just because a study has been presented as defining best practices or is highly-cited. The world right now is not a nice place for many of its inhabitants and my wish for 2026 is a kinder, gentler and fairer world.