In this post I'll examine an article entitled ‘Mapping the Efficiency and Physicochemical Trajectories of Successful Optimizations’ (YL2018). I should note that the article title reminded me that abseiling has been described as the second fastest way down the mountain. The orchids in Blanchisseuse have been particularly good this year and I’ll include some photos of them to break the text up a bit.
It’s been almost 22 years since the rule of 5 (Ro5) was published. While the Ro5 article highlighted molecular size and lipophilicity as pharmaceutical risk factors, the rule itself is actually of limited utility as a drug design tool. Some of the problems associated with excessive lipophilicity had actually been recognized (see Yalkowsky | Hansch) over a decade before the publication of Ro5 in 1997 and there’s also this article that had been published in the previous year. However, it was the emergence of high-throughput screening that can be regarded as the trigger for Ro5 which, in turn, dramatically raised awareness of the importance of physicochemical properties in drug design. The heavy citation and wide acceptance of Ro5 provided incentives for researchers to publish their own respective analyses of large (usually proprietary) data sets and this has been expressed more succinctly as “Ro5 envy”.
So let's take a look at YL2018 and the trajectories. I have to concede that ‘trajectory’ makes it all seem so physical and scientifically rigorous even though ‘path’ would be more appropriate (and easier to say after a few beers). As noted in ‘The nature of ligand efficiency’ (NoLE), I certainly believe that it is a good idea for medicinal chemistry teams to both plot potency (e.g. pIC50) against risk factors such as molecular size or lipophilicity for their project compounds and to analyze the relationships between potency and these quantities. However, it is far from clear that a medicinal chemistry team optimizing a specific structural series against a particular target would necessarily find the plots corresponding to optimization of other structural series against other targets to be especially relevant to their own project.
YL2018 claims that “the wider employment of efficiency metrics and lipophilicity control is evident in contemporary practice and the impact on quality demonstrable”. While I would agree that efficiency metrics are integral to the philatelic aspects of modern drug discovery, I don’t believe that YL2018 actually presents a single convincing example of efficiency metrics being used for decision making in a specific drug design project. I should also point out that each of the authors of YL2018 provided cannon fodder (LS2007 | HY2010) for the correlation inflation article and you might want to keep that in mind when you read the words “evident” and “demonstrable”. They also published 'Molecular Property Design: Does Everyone Get It?' back in 2015 and you may find this review of that seminal contribution to the drug design literature to be informative.
I reckon that it would actually be a lot more difficult to demonstrate that efficiency metrics were used meaningfully (i.e. for decision making rather than presentation at dog and pony shows) in projects than it would be to demonstrate that they were predictive of pharmaceutically relevant behavior of compounds. In NoLE, I stated:
"However, a depiction  of an optimization path for a project that has achieved a satisfactory endpoint is not direct evidence that consideration of molecular size or lipophilicity made a significant contribution toward achieving that endpoint. Furthermore, explicit consideration of lipophilicity and molecular size in design does not mean that efficiency metrics were actually used for this purpose. Design decisions in lead optimization are typically supported by assays for a range of properties such as solubility, permeability, metabolic stability and off-target activity as well as pharmacokinetic studies. This makes it difficult to assess the extent to which efficiency metrics have actually been used to make decisions in specific projects, especially given the proprietary nature of much project-related data."
YL2018 states, “Trajectory mapping, based on principles rather than rules, is useful in assessing quality and progress in optimizations while benchmarking against competitors and assessing property-dependent risks.” and, as a general point, you need to show you're on top of the physical chemistry if you're going to write articles like this.
Ligand efficiency represents something of a liability for anybody claiming expertise in physical chemistry. The reason for this is that perception of efficiency depends on the unit that you use to express affinity and this is a serious issue (in the "not even wrong" category) that was highlighted in 2009 and 2014 before NoLE was published. While YL2018 acknowledges that criticisms of ligand efficiency have been made, you really need to say exactly why this dependence of perception is not a problem if you're going to lecture about principles to readers of the Journal of Medicinal Chemistry.
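A minimal Python sketch makes the unit dependence concrete. The two compounds and their values are invented for illustration, the function names are hypothetical, and LE is taken here simply as pIC50 per non-hydrogen atom (the usual scaling by RT ln 10 is left out because it multiplies every value by the same positive constant).

```python
import math

def ligand_efficiency(ic50_molar, n_heavy, unit_molar=1.0):
    """pIC50 per non-hydrogen atom, with IC50 expressed in the chosen unit.

    unit_molar is the concentration unit in mol/L (1.0 is the usual 1 M).
    The conventional RT*ln(10) scaling is omitted; it does not affect ranking.
    """
    return -math.log10(ic50_molar / unit_molar) / n_heavy

# Hypothetical compounds: name -> (IC50 in mol/L, number of non-hydrogen atoms)
compounds = {"A": (1e-9, 30), "B": (1e-6, 18)}

def more_efficient(unit_molar):
    """Name of the compound with the higher LE for the given unit."""
    return max(compounds,
               key=lambda name: ligand_efficiency(*compounds[name], unit_molar))

for label, unit in [("1 M", 1.0), ("1 mM", 1e-3), ("1 uM", 1e-6)]:
    values = {name: round(ligand_efficiency(ic50, n, unit), 3)
              for name, (ic50, n) in compounds.items()}
    print(label, values, "-> higher LE:", more_efficient(unit))
```

With these made-up numbers, B looks more efficient than A when IC50 is expressed relative to 1 M, while A looks more efficient when exactly the same data are expressed relative to 1 mM or 1 µM. Nothing about the compounds has changed, only the unit.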
Ligand lipophilic efficiency (LLE), which is also known as ligand lipophilicity efficiency and lipophilic efficiency (LipE), can be described as an offset efficiency metric (lipophilicity is subtracted from potency). As such, perception of efficiency does not change when you use a different unit to express potency and, provided that ionization of the ligand is insignificant, efficiency can be seen as a measure of the ease of transfer of the ligand from octanol to its binding site. Here's a graphic that illustrates this:
LLE (LipE) measures ease of transfer of ligand from octanol to binding site
I'm not entirely convinced that the authors of YL2018 properly understood the difference between logP and logD. Even if they did, they needed to articulate the implications for drug design a lot more clearly than they have done. Here's an equation that expresses logD as a function of logP and the fraction of ligand in the neutral form at the experimental pH (assuming that only neutral forms of ligands partition into the octanol).
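Under the stated assumption that only the neutral form partitions into octanol, the relationship can be written as follows, where f_neutral (a symbol chosen here for illustration) denotes the fraction of ligand in the neutral form at the experimental pH:

```latex
\log D_{\mathrm{pH}} = \log P + \log_{10} f_{\mathrm{neutral}},
\qquad 0 < f_{\mathrm{neutral}} \le 1
```

Because the logarithmic term is never positive, logD cannot exceed logP under this assumption, and the two are equal only when ionization is negligible.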
The equation highlights the problems that result from using logD (rather than logP) to define "compound quality". In essence the difficulty stems from the composite nature of logD, which means that logD can also be reduced by increasing the extent of ionization. While this is likely to result in increased aqueous solubility, it is much less likely that problems associated with binding to anti-targets will be addressed. Increasing the extent of ionization may also compromise permeability.
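The composite nature of logD can be sketched in Python for a monoprotic acid. This assumes that only the neutral form partitions into octanol and that the neutral fraction follows the Henderson-Hasselbalch relationship. The logP and pKa values are invented for illustration and the function names are hypothetical.

```python
import math

def neutral_fraction_acid(pka, ph=7.4):
    """Fraction of a monoprotic acid in the neutral (protonated) form."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))

def log_d(log_p, neutral_fraction):
    """logD, assuming only the neutral form partitions into octanol."""
    return log_p + math.log10(neutral_fraction)

# Same logP throughout; only the acidity changes (hypothetical values).
log_p = 4.0
for pka in (9.4, 7.4, 5.4, 4.4):
    f = neutral_fraction_acid(pka)
    print(f"pKa = {pka:.1f}  f_neutral = {f:.4f}  logD(7.4) = {log_d(log_p, f):.2f}")
```

In this sketch logD at pH 7.4 falls by roughly three units as the pKa drops from 9.4 to 4.4 even though logP is held fixed, so a "better" logD here reflects nothing more than increased ionization.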
YL2018 is clearly a long article and I'm going to focus on two of the ways in which the authors present values of efficiency metrics. The first of these is the "% better" statistic which is used to reference specific compounds (e.g. optimization endpoints) to sets of compounds (e.g. everything synthesized by project chemists). The statistic is calculated as the fraction (expressed as a percentage) of compounds in the set for which both LE and LLE values are greater than the corresponding values for the compound of interest. The smallest values of the "% better" statistic are considered to correspond to the most optimal compounds. The use of the "% better" statistic could be taken as indicating that absolute thresholds for LE and LLE are not useful for analyzing optimization trajectories.
The fundamental problem with analyzing data in this manner is that LE has a nontrivial dependence on the concentration unit in which affinity is expressed (this is shown in Table 1 and Fig. 1 in NoLE). One consequence of this nontrivial dependence is that both perception of efficiency and the "% better" statistic vary with the concentration unit used to express affinity.
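Here is a minimal Python sketch of the "% better" statistic as described above, evaluated with two different concentration units. It is not the YL2018 implementation. The compound data are invented for illustration, the function names are hypothetical, and LE is again taken as pIC50 per non-hydrogen atom.

```python
import math

def p_ic50(ic50_molar, unit_molar=1.0):
    """Negative log10 of IC50 expressed in the chosen concentration unit."""
    return -math.log10(ic50_molar / unit_molar)

def metrics(compound, unit_molar=1.0):
    """Return (LE, LLE) for a compound given as (IC50 in mol/L, heavy atoms, logP)."""
    ic50, n_heavy, log_p = compound
    p = p_ic50(ic50, unit_molar)
    return p / n_heavy, p - log_p

def percent_better(reference, compounds, unit_molar=1.0):
    """Percentage of compounds with both LE and LLE above the reference values."""
    ref_le, ref_lle = metrics(reference, unit_molar)
    better = 0
    for compound in compounds:
        le, lle = metrics(compound, unit_molar)
        if le > ref_le and lle > ref_lle:
            better += 1
    return 100.0 * better / len(compounds)

# Hypothetical optimization endpoint and project compounds.
endpoint = (1e-9, 30, 3.0)
project = [
    (1e-6, 18, -1.0),
    (1e-7, 22, 0.5),
    (1e-8, 35, 3.5),
    (1e-10, 32, 2.0),
]

for label, unit in [("1 M", 1.0), ("1 uM", 1e-6)]:
    print(label, percent_better(endpoint, project, unit))
```

For this invented data set, three of the four project compounds beat the endpoint on both metrics when potency is expressed relative to 1 M, but only one does when it is expressed relative to 1 µM. The LLE comparisons are identical in both cases, so the change comes entirely from LE.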
The second way that the authors of YL2018 present values of efficiency metrics is to plot LE against LLE and, as has already been noted, this is a particularly bone-headed way to analyze data. One problem is that the plot changes in a nontrivial manner if you express affinity in a different unit. This makes it difficult to explain to medicinal chemists why they need to convert the micromolar potencies from their project database to molar units in order for The Truth to be revealed. Another problem is that LE and LLE are both linear functions of pIC50 (or pKi) and that means that the appearance of the plot is heavily influenced by the (trivial) correlation of potency with itself.
A much better way to present the data is to plot LLE against the number of non-hydrogen atoms (or any other measure of molecular size that you might prefer). In such a plot, expressing potency (or affinity) in a different unit simply shifts all points 'up' or 'down' to the same extent, which means that you no longer have the problem that the appearance of the plot changes when you change units. The other advantage of plotting the data in this manner is that there is no explicit correlation between the quantities being plotted. I have used a variant of this plot in NoLE (see Fig. 2b) to compare some fragment to lead optimizations that had been analyzed previously.
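The uniform shift can be checked with a short Python sketch. The optimization series is invented for illustration and the function name is hypothetical. Each point is (number of non-hydrogen atoms, LLE), computed once with IC50 relative to 1 M and once relative to 1 µM.

```python
import math

def lle(ic50_molar, log_p, unit_molar=1.0):
    """LLE = pIC50 - logP, with IC50 expressed in the chosen unit."""
    return -math.log10(ic50_molar / unit_molar) - log_p

# Hypothetical series: (IC50 in mol/L, non-hydrogen atoms, logP)
series = [(1e-5, 12, 1.0), (1e-7, 20, 2.0), (1e-9, 30, 3.0)]

points_molar = [(n, lle(ic50, log_p, 1.0)) for ic50, n, log_p in series]
points_micromolar = [(n, lle(ic50, log_p, 1e-6)) for ic50, n, log_p in series]

# Vertical displacement of each point when the unit changes from 1 M to 1 uM.
shifts = [y_molar - y_micro
          for (_, y_molar), (_, y_micro) in zip(points_molar, points_micromolar)]
print(shifts)
```

Every point moves by the same six log units, so the shape of the plot and the relative positions of the compounds are unaffected by the choice of unit.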
I think this is a good point to wrap things up. Even if you have found the post to be tedious, I hope that you have at least enjoyed the orchids. As we would say in Brazil, até mais!