Wednesday, 29 May 2019

Transforming computational drug discovery (but maybe not)

"A theory has only the alternative of being right or wrong. A model has a third possibility: it may be right, but irrelevant."
Manfred Eigen (1927 - 2019)

I'll start this blog post with some unsolicited advice to those who seek to transform drug discovery. First, try to understand what a drug needs to do (as opposed to what compound quality 'experts' tell us a drug molecule should look like). Second, try to understand the problems that drug discovery scientists face and the constraints under which they have to solve them. Third, remember that many others have walked this path before and difficulties that you face in gaining acceptance for your ideas may be more a consequence of extravagant claims made previously by others than of a fundamentally Luddite nature of those whom you seek to influence. As has become a habit, I'll include some photos to break the text up a bit and the ones in this post are from Armenia.

Mount Ararat taken from the Cascade in Yerevan. I stayed at the excellent Cascade Hotel which is a two minute walk from the bottom of the Cascade.

Here are a couple of slides from my recent talk at Maynooth University that may be helpful to machine learning evangelists, AI visionaries and computational chemists who may lack familiarity with drug design. The introductions to articles on ligand efficiency and correlation inflation might also be relevant.

Defining controllability of exposure (drug concentration) as a design objective is extremely difficult while unbound intracellular drug concentration is not generally measurable in vivo.

Computational chemists and machine learning evangelists commonly make (at least) one of two mistakes when seeking to make impact on drug design. First, they see design purely as an exercise in prediction. Second, they are unaware of the importance of exposure as the driver of drug action. I believe that we'll need to change (at least) one of these characteristics of drug design if we are to achieve genuine transformation.

In this post, I'm going to take a look at an article in ACS Medchem Letters entitled 'Transforming Computational Drug Discovery with Machine Learning and AI'. The article opens with a Pablo Picasso quote although I'd argue that the observation made by Manfred Eigen at the beginning of the blog post would be way more appropriate. The World Economic Forum (WEF) is quoted as referring "to the combination of big data and AI as both the fourth paradigm of science and the fourth industrial revolution". The WEF reference reminded me of an article (published in the same journal and reviewed in this post) that invoked "views obtained from senior medicinal chemistry leaders". However, I shouldn't knock the WEF reference too much since we observed in the correlation inflation article that "lipophilicity is to medicinal chemists what interest rates are to central bankers".

The Temple of Garni is the only Pagan temple in Armenia and is sited next to a deep gorge (about 20 metres behind me). I took a keen interest in the potential photo opportunities presented by two Russian ladies who had climbed the safety barrier and were enthusiastically shooting selfies...

Much of the focus of the article is on the ANI-1x potential (and related potentials), developed by the authors for calculation of molecular energies. These potentials were derived by using a deep neural network to fit calculated (DFT) molecular energies to calculated molecular geometry descriptors. This certainly looks like an interesting and innovative approach to calculating energies of molecular structures. It's also worth mentioning the Open Force Field Initiative since they too are doing some cool stuff. I'll certainly be watching to see how it all turns out.

One key question concerns accuracy of DFT energies. The authors talk about a "zoo" of force fields but I'm guessing the diversity of DFT protocols used by computational chemists may be even greater than the diversity of force fields (here's a useful review). Viewing the DFT field as an outsider, I don't see a clear consensus as to the most appropriate DFT protocol for calculating molecular energy and the lack of consensus appears to be even more marked when considering interactions between molecules. It's also worth remembering that the DFT methods are themselves parameterized.  

Potentials such as those described by the authors are examples of what drug discovery scientists would call a quantitative structure-property relationship (QSPR). When assessing whether or not a model constitutes AI in the context of drug discovery, I would suggest consideration of the nature of the model rather than the nature of the algorithm used to build the model. The fitting of DFT energies to molecular descriptors that the authors describe is considerably more sophisticated than would be the case for a traditional QSPR. However, there are a number of things that you need to keep in mind when fitting measured or calculated properties to descriptors regardless of the sophistication of the fitting procedure. This post on QSAR as well as the recent exchange ( 1 | 2 | 3 ) between Pat Walters and me may be informative. First, over-fitting is always a concern and validation procedures may make an optimistic assessment of model quality when the space spanned by descriptors is unevenly covered. Second, it is difficult to build stable and transferable models if there are relationships between descriptors (the traditional way to address this problem is to first perform principal component analysis which assumes that the relationships between descriptors are linear). Third, it is necessary to account for numbers of adjustable parameters in models in an appropriate manner if claiming that one model has outperformed another.
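To illustrate the third point, here's a minimal Python sketch using adjusted R², which is one textbook way (among several) to penalize a linear regression model for its number of adjustable parameters. It is an assumption that this statistic is a suitable stand-in here: it applies to ordinary linear models rather than to deep neural networks, and the R² values, data set size and descriptor counts are invented for illustration.

```python
def adjusted_r2(r2: float, n_obs: int, n_params: int) -> float:
    """Adjusted R^2 for a linear model with n_params descriptors plus an intercept."""
    if n_obs - n_params - 1 <= 0:
        raise ValueError("need more observations than adjustable parameters")
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_params - 1)


# Hypothetical comparison: two models fitted to the same 50 compounds.
small_model = adjusted_r2(r2=0.80, n_obs=50, n_params=5)
large_model = adjusted_r2(r2=0.82, n_obs=50, n_params=20)

print(f"5 descriptors:  adjusted R2 = {small_model:.3f}")
print(f"20 descriptors: adjusted R2 = {large_model:.3f}")
```

In this made-up case the larger model's nominally higher R² (0.82 versus 0.80) turns into a lower adjusted R² (about 0.70 versus 0.78) once the extra fifteen parameters are accounted for.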

Armenia appeared to be awash with cherry blossoms when I visited in April. This photo was taken at Tatev Monastery which can be accessed by cable car.

The authors have described what looks to be a promising approach to calculation of molecular energies. Is it AI in the context of drug discovery? I would say, "no, or at least no more so than the QSPR and QSAR models that have been around for decades". Will it transform computational drug discovery? I would say, "probably not". Now I realize that you're thinking that I'm a complete Luddite (especially given my blinkered skepticism of the drug design metrics introduced by Pharma's Finest Minds) but I can legitimately claim to have exploited knowledge of ligand conformational energy in a real discovery project. I say "probably not" simply because drug designers have been able to calculate molecular energy for many years although I concede that the SOSOF (same old shit only faster) label would be unfair. That said, I would expect faster, more accurate and more widely applicable methods to calculate molecular energy to prove very useful in computational drug discovery. However, utility is a necessary, but not sufficient, condition for transformation.

Geghard Monastery was carved from the rock

So I'll finish with some advice for those who manage (or, if you prefer, lead) drug discovery.  Suppose that you've got some folk trying to sell you an AI-based system for drug design. Start by getting them to articulate their understanding of the problems that you face. If they don't understand your problems then why should you believe their solutions? Look them in the eye when you say "unbound intracellular concentration" to see if you can detect signs of glazing over. In particular, be wary of crude scare tactics such as the suggestion that those medicinal chemists that don't use AI will lose their jobs to medicinal chemists who do use AI. If the terrors of being left behind by the Fourth Industrial Revolution are invoked then consider deploying the conference room furniture that you bought on eBay from Ernst Stavro Blofeld Associates.

Selfie with MiG-21 (apparently Artem's favorite) at the Mikoyan Brothers Museum in Sanahin where the brothers grew up. Anastas was even more famous than his brother and played a key role in defusing the Cuban Missile Crisis.

Saturday, 11 May 2019

Efficient trajectories

I'll examine an article entitled ‘Mapping the Efficiency and Physicochemical Trajectories of Successful Optimizations’ (YL2018) in this post and I should note that the article title reminded me that abseiling has been described as the second fastest way down the mountain. The orchids in Blanchisseuse have been particularly good this year and I’ll include some photos of them to break the text up a bit.

It’s been almost 22 years since the rule of 5 (Ro5) was published. While the Ro5 article highlighted molecular size and lipophilicity as pharmaceutical risk factors, the rule itself is actually of limited utility as a drug design tool. Some of the problems associated with excessive lipophilicity had actually been recognized (see Yalkowsky | Hansch) over a decade before the publication of Ro5 in 1997 and there’s also this article that had been published in the previous year. However, it was the emergence of high-throughput screening that can be regarded as the trigger for Ro5 which, in turn, dramatically raised awareness of the importance of physicochemical properties in drug design. The heavy citation and wide acceptance of Ro5 provided incentives for researchers to publish their own respective analyses of large (usually proprietary) data sets and this has been expressed more succinctly as “Ro5 envy”.

So let's take a look at YL2018 and the trajectories. I have to concede that ‘trajectory’ makes it all seem so physical and scientifically rigorous even though ‘path’ would be more appropriate (and easier to say after a few beers). As noted in ‘The nature of ligand efficiency’ (NoLE), I certainly believe that it is a good idea for medicinal chemistry teams to both plot potency (e.g. pIC50) against risk factors such as molecular size or lipophilicity for their project compounds and to analyze the relationships between potency and these quantities. However, it is far from clear that a medicinal chemistry team optimizing a specific structural series against a particular target would necessarily find the plots corresponding to optimization of other structural series against other targets to be especially relevant to their own project.

YL2018 claims that “the wider employment of efficiency metrics and lipophilicity control is evident in contemporary practice and the impact on quality demonstrable”. While I would agree that efficiency metrics are integral to the philatelic aspects of modern drug discovery, I don’t believe that YL2018 actually presents a single convincing example of efficiency metrics being used for decision making in a specific drug design project. I should also point out that each of the authors of YL2018 provided cannon fodder (LS2007 | HY2010 ) for the correlation inflation article and you might want to keep that in mind when you read the words “evident” and “demonstrable”. They also published 'Molecular Property Design: Does Everyone Get It?' back in 2015 and you may find this review of that seminal contribution to the drug design literature to be informative.

I reckon that it would actually be a lot more difficult to demonstrate that efficiency metrics were used meaningfully (i.e. for decision making rather than presentation at dog and pony shows) in projects than it would be to demonstrate that they were predictive of pharmaceutically relevant behavior of compounds. In NoLE, I stated:

"However, a depiction [6] of an optimization path for a project that has achieved a satisfactory endpoint is not direct evidence that consideration of molecular size or lipophilicity made a significant contribution toward achieving that endpoint. Furthermore, explicit consideration of lipophilicity and molecular size in design does not mean that efficiency metrics were actually used for this purpose. Design decisions in lead optimization are typically supported by assays for a range of properties such as solubility, permeability, metabolic stability and off-target activity as well as pharmacokinetic studies. This makes it difficult to assess the extent to which efficiency metrics have actually been used to make decisions in specific projects, especially given the proprietary nature of much project-related data."

YL2018 states, “Trajectory mapping, based on principles rather than rules, is useful in assessing quality and progress in optimizations while benchmarking against competitors and assessing property-dependent risks.” and, as a general point, you need to show you're on top of the physical chemistry if you're going to write articles like this.

Ligand efficiency represents something of a liability for anybody claiming expertise in physical chemistry. The reason for this is that perception of efficiency depends on the unit that you use to express affinity and this is a serious issue (in the "not even wrong" category) that was highlighted in 2009 and 2014 before NoLE was published. While YL2018 acknowledges that criticisms of ligand efficiency have been made, you really need to say exactly why this dependence of perception is not a problem if you're going to lecture about principles to readers of Journal of Medicinal Chemistry.
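To make the unit dependence concrete, here's a minimal Python sketch. It assumes the commonly used approximation LE ≈ 1.37 × pIC50 / (number of non-hydrogen atoms), in kcal/mol per atom, and the two compounds (a millimolar fragment and a 10 nM lead) are invented for illustration rather than taken from YL2018 or NoLE.

```python
from math import log10


def p_activity(ic50_molar: float, standard_conc_molar: float) -> float:
    """pIC50 relative to a chosen concentration unit (1.0 means 1 M)."""
    return -log10(ic50_molar / standard_conc_molar)


def ligand_efficiency(ic50_molar: float, heavy_atoms: int,
                      standard_conc_molar: float) -> float:
    """Approximate LE in kcal/mol per non-hydrogen atom (1.37 ~ RT ln 10)."""
    return 1.37 * p_activity(ic50_molar, standard_conc_molar) / heavy_atoms


# Hypothetical compounds: (IC50 in mol/L, number of non-hydrogen atoms).
fragment = (1e-3, 10)
lead = (1e-8, 30)

for label, unit in (("1 M", 1.0), ("1 uM", 1e-6)):
    le_fragment = ligand_efficiency(*fragment, standard_conc_molar=unit)
    le_lead = ligand_efficiency(*lead, standard_conc_molar=unit)
    winner = "fragment" if le_fragment > le_lead else "lead"
    print(f"unit = {label}: LE(fragment) = {le_fragment:.3f}, "
          f"LE(lead) = {le_lead:.3f}, more efficient: {winner}")
```

With a 1 M reference concentration the fragment looks more efficient than the lead; with a 1 µM reference the order reverses, even though nothing about either compound has changed.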

Ligand lipophilic efficiency (LLE), which is also known as ligand lipophilicity efficiency (LLE) and lipophilic efficiency (LipE), can be described as an offset efficiency metric (lipophilicity is subtracted from potency). As such, perception of efficiency does not change when you use a different unit to express potency and, provided that ionization of ligand is insignificant, efficiency can be seen as a measure of the ease of transfer of ligand from octanol to its binding site. Here's a graphic that illustrates this:
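The contrast between an offset metric (LLE) and a scaled metric (LE) can be written down in a few lines. In this sketch, $C^\circ$ denotes the concentration unit used to define pIC50, $N_{HA}$ the number of non-hydrogen atoms and $k$ the constant that converts pIC50 to an energy (about 1.37 kcal/mol near room temperature). These symbols are notation chosen here and are not taken from YL2018.

```latex
\begin{align}
\mathrm{pIC_{50}}(C^\circ) &= -\log_{10}\!\left(\frac{\mathrm{IC_{50}}}{C^\circ}\right) \\
\mathrm{pIC_{50}}(C^{\circ\prime}) &= \mathrm{pIC_{50}}(C^\circ) + \Delta,
  \qquad \Delta = \log_{10}\!\left(\frac{C^{\circ\prime}}{C^\circ}\right) \\
\mathrm{LLE} = \mathrm{pIC_{50}} - \log P
  \quad &\Rightarrow \quad \mathrm{LLE}' = \mathrm{LLE} + \Delta \\
\mathrm{LE} = \frac{k\,\mathrm{pIC_{50}}}{N_{HA}}
  \quad &\Rightarrow \quad \mathrm{LE}' = \mathrm{LE} + \frac{k\,\Delta}{N_{HA}}
\end{align}
```

The shift $\Delta$ is the same for every compound, so differences in LLE between compounds survive a change of unit. The shift in LE depends on molecular size, which is why a change of unit can reorder compounds.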

LLE (LipE) measures ease of transfer of ligand from octanol to binding site

I'm not entirely convinced that the authors of YL2018 properly understood the difference between logP and logD. Even if they did, they needed to articulate the implications for drug design a lot more clearly than they have done. Here's an equation that expresses logD as a function of logP and the fraction of ligand in the neutral form at the experimental pH (assuming that only neutral forms of ligands partition into the octanol).
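Under the stated assumption that only neutral forms partition into octanol, the relationship takes the following standard form, where $f_{\mathrm{neutral}}$ is notation chosen here for the fraction of ligand in the neutral form at the experimental pH:

```latex
\log D = \log P + \log_{10} f_{\mathrm{neutral}}
```

Since $f_{\mathrm{neutral}}$ cannot exceed 1, logD can never be greater than logP under this assumption.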

The equation highlights the problems that result from using logD (rather than logP) to define "compound quality". In essence the difficulty stems from the composite nature of logD which means that logD can also be reduced by increasing the extent of ionization. While this is likely to result in increased aqueous solubility, it is much less likely that problems associated with binding to anti-targets will be addressed. Increasing the extent of ionization may also compromise permeability.
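Here's a minimal Python sketch of how ionization alone can drag logD down. It assumes a monoprotic acid or base, the Henderson-Hasselbalch relationship for the neutral fraction, and that only the neutral form partitions into octanol. The two acids (same logP, different pKa) are hypothetical.

```python
from math import log10


def fraction_neutral(pka: float, ph: float, kind: str) -> float:
    """Neutral fraction of a monoprotic acid or base (Henderson-Hasselbalch)."""
    if kind == "acid":
        return 1.0 / (1.0 + 10.0 ** (ph - pka))
    if kind == "base":
        return 1.0 / (1.0 + 10.0 ** (pka - ph))
    raise ValueError("kind must be 'acid' or 'base'")


def log_d(log_p: float, pka: float, ph: float, kind: str) -> float:
    """logD assuming only the neutral form partitions into octanol."""
    return log_p + log10(fraction_neutral(pka, ph, kind))


# Two hypothetical acids with identical logP but different pKa, at pH 7.4.
weak_acid_log_d = log_d(log_p=3.0, pka=7.4, ph=7.4, kind="acid")
strong_acid_log_d = log_d(log_p=3.0, pka=4.4, ph=7.4, kind="acid")

print(f"pKa 7.4: logD = {weak_acid_log_d:.2f}")
print(f"pKa 4.4: logD = {strong_acid_log_d:.2f}")
```

The more acidic compound has a logD that is about 2.7 units lower even though its logP, and so the lipophilicity of its neutral form, is unchanged.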

YL2018 is clearly a long article and I'm going to focus on two of the ways in which the authors present values of efficiency metrics. The first of these is the "% better" statistic which is used to reference specific compounds (e.g. optimization endpoints) to sets of compounds (e.g. everything synthesized by project chemists). The statistic is calculated as the fraction of compounds in the set for which both LE and LLE values are greater than the corresponding values for the compound of interest. The smallest values of the "% better" statistic are considered to correspond to the most optimal compounds. The use of the "% better" statistic could be taken as indicating that absolute thresholds for LE and LLE are not useful for analyzing optimization trajectories.

The fundamental problem with analyzing data in this manner is that LE has a nontrivial dependence on the concentration unit in which affinity is expressed (this is shown in Table 1 and Fig. 1 in NoLE). One consequence of this nontrivial dependence is that both perception of efficiency and the "% better" statistic vary with the concentration unit used to express affinity.
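The unit dependence of the "% better" statistic can be sketched in Python. The function below follows the description of the statistic given in this post (both LE and LLE greater than for the compound of interest) and is not claimed to reproduce the YL2018 implementation. The 1.37 factor for LE is the usual approximation and all compounds are invented.

```python
from dataclasses import dataclass
from math import log10


@dataclass(frozen=True)
class Compound:
    name: str
    ic50_molar: float
    heavy_atoms: int
    log_p: float


def p_activity(c: Compound, unit_molar: float) -> float:
    """pIC50 relative to the chosen concentration unit (1.0 means 1 M)."""
    return -log10(c.ic50_molar / unit_molar)


def le(c: Compound, unit_molar: float) -> float:
    return 1.37 * p_activity(c, unit_molar) / c.heavy_atoms


def lle(c: Compound, unit_molar: float) -> float:
    return p_activity(c, unit_molar) - c.log_p


def percent_better(ref: Compound, others: list[Compound], unit_molar: float) -> float:
    """Percentage of compounds with both LE and LLE greater than the reference."""
    ref_le, ref_lle = le(ref, unit_molar), lle(ref, unit_molar)
    n_better = sum(
        1 for c in others
        if le(c, unit_molar) > ref_le and lle(c, unit_molar) > ref_lle
    )
    return 100.0 * n_better / len(others)


# Hypothetical optimization endpoint and project compounds.
endpoint = Compound("endpoint", 1e-8, 30, 3.0)
project = [
    Compound("fragment_1", 1e-4, 10, -1.5),
    Compound("fragment_2", 3.2e-5, 11, -1.0),
    Compound("large_analog", 1e-9, 40, 3.0),
    Compound("intermediate", 1e-6, 25, 2.0),
]

better_molar = percent_better(endpoint, project, unit_molar=1.0)
better_micromolar = percent_better(endpoint, project, unit_molar=1e-6)
print(f"% better with 1 M unit:  {better_molar:.0f}")
print(f"% better with 1 uM unit: {better_micromolar:.0f}")
```

For this made-up set the statistic is 50% when potency is expressed relative to 1 M and 25% when it is expressed relative to 1 µM. The set of compounds counted as "better" also changes, from the two fragments to the large analog.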

The second way that the authors of YL2018 present values of efficiency metrics is to plot LE against LLE and, as has already been noted, this is a particularly bone-headed way to analyze data. One problem is that the plot changes in a nontrivial manner if you express affinity in a different unit. This makes it difficult to explain to medicinal chemists why they need to convert the micromolar potencies from their project database to molar units in order for The Truth to be revealed. Another problem is that LE and LLE are both linear functions of pIC50 (or pKi) and that means that the appearance of the plot is heavily influenced by the (trivial) correlation of potency with itself.

A much better way to present the data is to plot LLE against number of non-hydrogen atoms (or any other measure of molecular size that you might prefer). In such a plot, expressing potency (or affinity) in a different unit simply shifts all points 'up' or 'down' to the same extent which means that you no longer have the problem that the appearance of the plot changes when you change units. The other advantage of plotting the data in this manner is that there is no explicit correlation between the quantities being plotted. I have used a variant of this plot in NoLE (see Fig. 2b) to compare some fragment to lead optimizations that had been analyzed previously.

I think this is a good point to wrap things up. Even if you have found the post to be tedious, I hope that you have at least enjoyed the orchids. As we would say in Brazil, até mais!