Thursday, 31 December 2015

The homeopathic limit of ligand efficiency

With the thermodynamic proxies staked, I can get back to salting the ligand efficiency (LE) soucouyant. In the previous post on this topic, I responded to a 'sound and fury' article which appeared to express the opinion that we should be using the metrics and not asking rude questions about their validity. One observation that one might make about an article like this is that the journal in question could be seen as trying to suppress legitimate scientific debate and I put this to their editorial staff. The response was that an article like this represents the opinion of the author and that I should consider a letter to the editor if there was a grievance. I reassured them that there was no grievance whatsoever and that it actually takes a lot of the effort out of providing critical commentary on the drug discovery literature when journals serve up cannon-fodder like this. In the spirit of providing helpful and constructive feedback to the editorial team, I did suggest that they might discuss the matter among themselves because a medicinal chemistry journal that is genuinely looking to the future needs to be seen as catalyzing rather than inhibiting debate. Now there is something else about this article that it took me a while to spot which is that it is freely available while 24 hours of access to the other editorial article in the same issue will cost you $20 in the absence of a subscription. If the author really coughed up for an APC so we don't have to pay to watch the toys getting hurled out of the pram then fair enough. If, however, the journal has waived the APC then claims that it is not attempting to stifle debate become a lot less convincing. Should we be talking about the Pravda of medicinal chemistry? Too early to say but I'll be keeping an eye open for more of this sort of thing.

Recalling complaints that our criticism of the thermodynamic basis of LE was 'complex', I'm going to try to make things even simpler than in the previous post. They say a picture is worth a thousand words so I'm going to use a graphical method to show how LE can be assessed. To make things really simple, we'll dispense with the pretentious energy units by using -log10(IC50/Cref) as the measure of activity and I'll also point you towards an article that explains why you need that reference concentration (Cref) if you want to calculate a logarithm for IC50. I'll plot activity for two hypothetical compounds, one of which is a fragment with 10 heavy atoms and the other is a more potent hit from high-throughput screening (HTS) that has 20 heavy atoms. I won't actually say what units IC50 values are expressed in and you can think of the heavy atom axis as sliding up or down the activity axis in response to changes in the concentration unit Cref. I've done things this way to emphasize the arbitrary nature of the concentration unit in the LE context.

Take a look at the plot in the left of the figure which I've labeled as 'A'.  Can you tell which of the compounds is more ligand-efficient just by looking at this plot?  Don't worry because I can't either. 

It's actually very easy to use a plot like this to determine whether one compound is more ligand-efficient than another one. First locate the point on the vertical axis corresponding to an IC50 of 1 M. Then draw a line through this point and the point representing the activity of one of the compounds. If the point representing the activity of the other compound lies below the line then it is less ligand-efficient and vice versa. Like they say in Stuttgart, vorsprung durch metrik!

Now take a look at the plot on the right of the figure which I've labelled 'B'. I've plotted two different lines that pass through the point corresponding to the fragment hit. The red line suggests that the fragment hit is more ligand efficient than the HTS hit but the green line suggests otherwise. Unfortunately there's no way of knowing which of these lines is the 'official' LE line (with intercept corresponding to IC50 = 1 M) because I've not told you what units IC50 is expressed in. Presenting the IC50 values in this manner is not particularly helpful if you need to decide which of the two hits is 'better' but it does highlight the arbitrary manner in which the intercept is selected in order to calculate LE. It also highlights how our choice of intercept influences our perception of efficiency.  
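To put some numbers on this, here is a minimal sketch in Python. The two IC50 values (1 mM for the 10-atom fragment and 10 µM for the 20-atom HTS hit) are hypothetical and chosen purely for illustration, as is the function name `scaled_activity`. The only thing taken from the discussion above is the definition of activity as -log10(IC50/Cref), which is then divided by the number of heavy atoms.

```python
import math

def scaled_activity(ic50_molar, heavy_atoms, c_ref_molar):
    """Return -log10(IC50/Cref) per heavy atom (LE without the energy units).

    Written as log10(Cref/IC50), which is the same quantity.
    """
    return math.log10(c_ref_molar / ic50_molar) / heavy_atoms

# Hypothetical compounds (illustrative values, not taken from any data set).
fragment = {"ic50_molar": 1e-3, "heavy_atoms": 10}
hts_hit = {"ic50_molar": 1e-5, "heavy_atoms": 20}

# Same two compounds, two choices of reference concentration: 1 M, then 1 mM.
for c_ref in (1.0, 1e-3):
    le_fragment = scaled_activity(c_ref_molar=c_ref, **fragment)
    le_hts = scaled_activity(c_ref_molar=c_ref, **hts_hit)
    winner = "fragment" if le_fragment > le_hts else "HTS hit"
    print(f"Cref = {c_ref:g} M: fragment {le_fragment:.2f}, "
          f"HTS hit {le_hts:.2f} -> {winner} looks more efficient")
```

With Cref set to 1 M the fragment comes out ahead (0.30 against 0.25) but with Cref set to 1 mM it is the HTS hit that looks more efficient (0.10 against 0.00). Nothing about the compounds has changed, only the unit.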

You can also think of the intercept as a zero molecular size limit for activity. One reason for doing so is that if the correlation between activity and molecular size is sufficiently strong, you may be able to extrapolate the trend in the data to the activity axis. Would it be a good idea to make assumptions about the intercept if the data can't tell you where it is? LE is based on an assumption that the 1 M concentration is somehow 'privileged' but, in real life, molecules don't actually give a toss about IUPAC. You can almost hear the protein saying "IUPAC... schmupack" when the wannabe ligand announces its arrival outside the binding pocket armed with the blessing of a renowned thought-leader. The best choice of a zero molecular size limit for activity would appear to be an interesting topic for debate. Imagine different experts each arguing noisily for his or her recommended activity level to be adopted as the One True Zero Molecular Size Limit For Activity. With apologies to Prof Tolkien and his pointy-eared friends,

One Unit to rule them all, One Unit to find them,
One Unit to bring them all and in the darkness bind them, 

If this all sounds strangely familiar, it might be because you can create an absence of a solute just as effectively by making its molecules infinitely small as you can by making the solution infinitely dilute. Put another way, LE may be a lot closer to homeopathy than many 'experts' and 'thought-leaders' would like you to believe.

So that's the end of blogging for the year at Molecular Design.  I wish all readers a happy, successful and metric-free 2016.

Thursday, 5 November 2015

The rise and fall of rational drug design

I'll be taking a look at a thought-provoking article (which got me chuckling several times) on drug design by my good friend Brooke Magnanti in this blog post. I've known Brooke for quite a few years and I'll start this post with some photos taken by her back in 1998 on a road trip that took us from Santa Fe to Carlsbad via Roswell and then to White Sands and the Very Large Array before returning to Santa Fe. The people in these photos (Andrew Grant, Anthony Nicholls, Roger Sayle and) also appear in Brooke's article and you can see Brooke reflected in my sunglasses. The football photos are a great way to remember Andrew who died in his fiftieth year while running and they're also a testament to Andrew's leadership skills because I don't think anybody else could have got us playing football at noon in the gypsum desert that is White Sands. We can only guess what Noël Coward would have made of it all.

What Brooke captures in her article on rational drug design is the irrational optimism that was endemic in the pharma/biotech industry of the mid-to-late nineties and she also gives us a look inside what used to be called 'Info Mesa'.  I particularly liked, 

"The name Info Mesa may be more apt than those Wired editors realised, since the prospects of a paradigm shift in drug development rose rapidly only to flatten out just when everyone thought they were getting to the top."

However, it wasn't just happening in computation and informatics in those days and it can be argued that the emergence of high-throughput screening (HTS) had already taken some of the shine off virtual screening even before usable virtual screening tools became available. The history of technology in drug discovery can be seen as a potent cocktail of hype and hope that dulls the judgement of even the most level-headed. In the early (pre-HTS) days of computational chemistry we were not, as I seem to remember the founder of a long-vanished start-up saying, going to be future-limited. In a number of academic and industrial institutions (although thankfully not where I worked), computational chemists were going to design drugs with computers and any computational chemist who questioned the feasibility of this noble mission was simply being negative. HTS changed things a bit and, to survive, the computational chemist needed to develop cheminformatic skills.

There is another aspect to technology in the pharma/biotech industry which is not considered polite to raise (although I was sufficiently uncouth to do so in this article).  When a company spends a large amount of money to acquire a particular capability, it is in the interests of both vendor and company that the purchase is seen in the most positive light. This can result in advocates for the different technologies expending a lot of energy in trying to show that 'their' technology is more useful and valuable than the other technologies and this can lead to panacea-centric thinking by drug discovery managers (who typically prefer to be called 'leaders'). In drug discovery, the different technologies and capabilities tend to have the greatest impact when deployed in a coordinated manner. For example, the core technologies for fragment-based drug discovery are detection/quantification of weak affinity and efficient determination of structures for fragment-protein complexes. Compound management, cheminformatics and the ability to model protein-ligand complexes all help but, even when used together, these cannot substitute for FBDD's core technologies.  Despite the promises, hype and optimism twenty years ago, so vividly captured by Brooke, small molecule drug discovery is still about getting compounds into assays (and it is likely to remain that way for the foreseeable future).

This is probably a good point to say something about rational drug design. Firstly, it is not a term that I tend to use because it is tautological and we are yet to encounter 'irrational drug design'. Secondly, much of the focus of rational drug design has been identification of starting points for optimization which, by some definitions, is not actually design. I would argue that few technological developments in drug discovery have been directed at the problems of lead optimization. This is not to say that technology has failed to impact on lead optimization. For example, the developments in automation that enabled HTS also led to increased throughput of CYP inhibition assays. One indication of the extent to which technological developments have ignored the lead optimization phase of drug discovery is the almost reverential view that many have of the Rule of 5 (Ro5) almost twenty years after it was first presented. There is some irony here because Ro5 is actually of very limited value in typical lead optimization scenarios in that it provides little or no guidance for how the characteristics of Ro5-compliant compounds can be improved. When rational drug design is used in lead optimization, the focus is almost always on affinity prediction which is only one half of the equation. The other half of that equation is free drug concentration which is a function of dose, location in the body and time. I discussed some of the implications of this in a blog post and have suggested that it may be useful to define measures of target engagement potential when thinking about drug action.

What I wrote in that blog post four years ago would have been familiar to many chemists working in lead optimization twenty years ago and that's another way of saying that lead identification has changed a lot more in the last twenty years than has lead optimization.  Perhaps it is unfair to use the acronym SOSOF (Same Old Shit Only Faster) but I hope that you'll see what I'm driving at.  Free drug concentration is a particular problem when the drug targets are not in direct contact with blood as is the case when the target is intracellular or on the 'dark side' of the blood brain barrier. If you're wondering about the current state of the art for predicting intracellular free drug concentration, I should mention that it is not currently possible to measure this quantity for an arbitrary chemical compound in live humans. That's a good place to leave things although I should mention that live humans were not the subject of Brooke's doctoral studies...

Friday, 30 October 2015

Voodoo thermodynamics for dummies

Metrics are like the heads of the Hydra. Dispatch one and two pop up to take its place.

So #RealTimeChem week is over and it's time to return to the topic of metrics and readers of this blog will be aware that this is a recurring theme here. Sometimes, to give them a more 'hard science feel', drug discovery metrics are cast in thermodynamic terms and 'conversion' of IC50 to free energy provides a good example of the problem. Ligand efficiency (LE) was originally defined by scaling free energy of binding by molecular size and it is instructive to observe how toys are ejected from prams when the thermodynamic basis of LE is challenged.

The most important point to note about a metric is that it's supposed to measure something and, regardless of how much you wave your arms and how noisily you assert the metric's usefulness, the metric still needs to measure. That's why we call it a 'metric' and not a 'security blanket for timid medicinal chemists' nor a 'floatation device for self-appointed experts and wannabe thought-leaders'. To be useful, a metric also has to measure something relevant and, in many drug discovery scenarios, that means being predictive of the chemical or biological behavior of compounds. Drug discovery metrics (and guidelines) are often based on trends in data and the strength of the trend tells us how much weight we should give to metrics and how rigidly we should adhere to guidelines. In the metric business, relevance trumps simplicity and even the most anemic of trends can acquire eye-wateringly impressive significance when powered by enough data.

I'll start my review of the article featured in this post by saying that, had the manuscript been sent to me, the response to the editor would have been something between 'why have you sent this out for review' and 'this manuscript needs to be put out of its misery as swiftly and mercifully as possible'. The article appears to be the write-up for material presented in webinar format which was reviewed less than favorably. The authors have made a few changes and what was previously called SEEnthalpy (Simplistic Estimate of Enthalpy) is now called PEnthalpy (Proxy for Enthalpy) but the fatal design flaws in the original metric remain and the review of that webinar will show what happens when you wander by mistake into the mess that metrics make.

Before we try to cone these thermodynamic proxies in the searchlights, it may be an idea to ask why we should worry about enthalpy or entropy when drug action is driven by affinity and free concentration. That's a good question and, to be quite honest, I really don't know the answer. Isothermal titration calorimetry (ITC) is an excellent, label-free method for measuring affinity and enthalpy of binding. However, the idea that the thermodynamic signature for binding of a compound to a protein will somehow be predictive of the behavior of the compound in all sorts of situations that do not involve that protein does seem to be entering the realms of wild conjecture. There is also the question of how isothermal systems like live humans can 'sense' the benefits of an enthalpically-optimized drug. Needless to say, these are questions that some ITC experts and many aspiring thought-leaders would prefer that you didn't think too hard about.

So let's take a look at the thermodynamic proxies which are defined in terms of the total number (HBT) of hydrogen bond donors and acceptors and the number (RB) of rotatable bonds.  The proxies are defined as follows:

 PEnthalpy = HBT/(RB + HBT)                                          (1)

 PEntropy  = RB/(RB + HBT)                                              (2)

 PEnthalpy  +  PEntropy  =  1                                                 (3)

The proxies predict that the enthalpy and entropy changes associated with binding are functions only of ligand structure and therefore are of no value for comparing the thermodynamics for a particular ligand binding to different proteins as one might want to do when assessing selectivity. Equation (3) shows that there is effectively only one metric (what a relief) since the two proxies are perfectly anticorrelated so each is as effective as the other as a predictor of either the enthalpy or entropy changes associated with ligand binding.
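For anybody who would like to see the anticorrelation rather than take my word for it, here is a minimal sketch in Python of equations (1) to (3). The (HBT, RB) pairs are hypothetical and only there for illustration, as are the function names. I have also assumed that the proxies are simply undefined when HBT + RB is zero, since the article's treatment of that case is not something I can vouch for.

```python
import math

def thermodynamic_proxies(hbt, rb):
    """Return (PEnthalpy, PEntropy) as defined in equations (1) and (2)."""
    total = hbt + rb
    if total == 0:
        # Assumption: no hydrogen bonding atoms and no rotatable bonds
        # leaves both proxies undefined (division by zero).
        raise ValueError("proxies are undefined when HBT + RB = 0")
    return hbt / total, rb / total

def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical (HBT, RB) counts for five imaginary compounds.
compounds = [(2, 1), (4, 6), (5, 5), (8, 2), (3, 9)]
p_enthalpy, p_entropy = zip(*(thermodynamic_proxies(h, r) for h, r in compounds))

for pe, ps in zip(p_enthalpy, p_entropy):
    print(f"PEnthalpy = {pe:.3f}  PEntropy = {ps:.3f}  sum = {pe + ps:.3f}")
print(f"Pearson r between the two proxies: {pearson_r(p_enthalpy, p_entropy):.3f}")
```

Whatever counts you feed in, the two proxies sum to one and the correlation between them is -1, which is just equation (3) restated.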

Now you may remember in the webinar that one of the authors of the featured article was telling us at 22:43 that "entropy comes from non-direct hydrophobic interactions like rotatable bonds". At least now they seem to realize that the rotatable bonds represent degrees of freedom although I don't get the impression from reading the article that they have a particularly solid grasp of the underlying physicochemical principles. Freezing rotatable bonds is an established medicinal chemistry tactic for increasing affinity and, if successful, we expect it to lead to a more favorable entropy of binding which some self-appointed thought-leaders would assert is a bad way to increase affinity. Trying to keep an open mind on this issue, I suggest that we might follow the lead of British Rail and try to define right and wrong types of entropy.

One of the criticisms that I made of the webinar was that no attempt was made to validate the metrics against measured values of binding enthalpy and entropy. In the article, the metrics are evaluated against a small data set of measured values.  As I mentioned earlier, there is effectively only one metric because the two metrics are perfectly anti-correlated so you need to look beyond the fit of the data to the metrics if you want to assess what I'll call the 'thermodynamic connection'. This means digging into the supplementary information.  I found the following on page 5 of the SI:

-TdeltaS =  159.80522 − 343.46172*Pentropy(RB/(HBT + RB))              (4)

which implies that:

TdeltaS  =  −159.80522 + 343.46172*Pentropy(RB/(HBT + RB))            (5)

These equations tell us that the change in entropy associated with binding actually increases with RB rather than decreasing with RB as one would expect for degrees of freedom that become frozen when the intermolecular complex forms. When you're assessing proxies for thermodynamic quantities it's a really good idea to take a look at the root mean square error (RMSE) for the fit of the quantity to the proxy. The RMSE values for fitting ΔH and TΔS are 28.35 kJ/mol and 29.30 kJ/mol respectively and I will leave it to you, the reader, to decide for yourself whether or not you consider these RMSE values to justify PEnthalpy and PEntropy being called thermodynamic proxies. The alert reader might ask where the units for ΔH and TΔS° came from since neither the article nor the SI provides this information and the answer is that you need to go to the source from which the ΔH and TΔS° values were taken to find out.
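The direction of the trend is easy to check with a few lines of Python. This is only a sketch: the coefficients are those quoted from the SI in equation (5), the units are taken to be kJ/mol on the basis of the RMSE values above, and the fixed HBT of 5 and the RB values are hypothetical numbers that I picked for illustration.

```python
def p_entropy(hbt, rb):
    """PEntropy as defined in equation (2)."""
    return rb / (hbt + rb)

def t_delta_s(p_entropy_value):
    """TdeltaS from equation (5), using the coefficients quoted from the SI."""
    return -159.80522 + 343.46172 * p_entropy_value

HBT = 5  # hypothetical count of hydrogen bond donors plus acceptors

# Add rotatable bonds to an otherwise unchanged (imaginary) compound.
for rb in (0, 2, 5, 10):
    pent = p_entropy(HBT, rb)
    print(f"RB = {rb:2d}  PEntropy = {pent:.3f}  "
          f"predicted TdeltaS = {t_delta_s(pent):8.2f}")
```

The predicted TΔS rises with every rotatable bond that is added, so the fitted equation is telling us that floppier ligands bind with a more favorable entropy change.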

Now you'll recall that these thermodynamic proxies predict constant values of ΔH and TΔS° for binding of a given compound to any protein (even those proteins to which it does not bind). The ΔG° values for the compounds in the small data set used to evaluate the thermodynamic proxies lie in a relatively narrow range (i.e. less than the RMSE values mentioned in the previous paragraph) from −37.6 kJ/mol to −57.3 kJ/mol and are not representative of the affinity of these compounds for proteins against which they had not been optimized. Any guesses how the RMSE values for fitting the data would have differed if ΔH and TΔS° values had been used for each compound binding to each of the protein targets?

Now if you've kept up to date with the latest developments in the drug discovery metric field, you'll know that even when the mathematical basis of a metric is fragile, there exists the much-exercised option of touting the metric's simplicity and claiming that it is still useful. Provided that nobody calls your bluff, metrics can prove to be very useful propaganda instruments. The featured article does present examples of data analysis based on using the thermodynamic proxies as descriptors and one general criticism that I will make of this analysis is that most of it is based on the significance rather than the strength of trends. When you tout the significance of a trend, you're saying as much about the size of your data set as you are about the strength of the trend in it. This point is discussed in our correlation inflation article and I'd suggest taking a particularly close look at what we had to say about the analysis in this much-cited article.
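The difference between significance and strength can be sketched in a few lines of Python. I'm assuming a fixed, feeble Pearson correlation of 0.1 (a hypothetical value) and using the Fisher z-transformation with a normal approximation to get a two-sided p-value, which is a standard textbook approximation and not the method used in the featured article.

```python
import math

def approx_p_value(r, n):
    """Approximate two-sided p-value for a Pearson correlation r from n points.

    Uses the Fisher z-transformation: atanh(r) * sqrt(n - 3) is treated as
    a standard normal deviate under the null hypothesis of no correlation.
    """
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2))

r = 0.1  # hypothetical, anemic trend: r squared is only 0.01

for n in (30, 1000, 100000):
    print(f"n = {n:6d}  r = {r}  r^2 = {r * r:.2f}  p ~ {approx_p_value(r, n):.3g}")
```

The trend explains about 1% of the variance in every case. All that changes as the data set grows is the p-value, which goes from unremarkable to eye-wateringly small.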

I'd like to focus on the analysis presented in the section entitled 'GSK PKIS Dataset' and which explored correlations between protein kinase % inhibition and a number of molecular descriptors. The authors state,

"In addition to PEnthalpy, we assessed the correlation across a variety for physicochemical properties including molecular weight, polar surface area, and logP in addition to PEnthalpy  (Fig. 5)"   

This statement is actually inaccurate because they have assessed the significance of the correlations rather than the correlations themselves. Although they may have done the assessment for logP and polar surface area, the results of these assessments do not seem to have materialized in Fig. 5 and we are left to speculate as to why. The strongest correlation between PEnthalpy and % inhibition was observed for CDK3/cyclinE and the plot is shown in Fig. 5b. I invite you, the reader, to ask yourself whether the correlation shown in Fig. 5b would be useful in a drug discovery project.

Since the title of the post mentions voodoo thermodynamics, we should take a look at this in the context of the article and the best place to look is in the Discussion section.  We are actually spoiled for choice when looking for examples of voodoo thermodynamics there but take a look at:

"It is assumed in the literature that the "enthalpically driven compound series" with fewer RBs tend to be (generally) lower MW compounds as well. In contrast, in cases where selectivity is steeper among compounds in a series for which activity and selectivity is likely governed by compounds with relatively more RB versus HBA and HBA [sic], than when the entropic contributors are dominating."

So that's about as much voodoo thermodynamics as I can take for a while so, if it's OK with you, I'll finish by addressing a couple of points to the authors of this article. The flagship product of the company with which the authors are associated is a database system for integrating chemical and biological data. Although I'm not that familiar with this database system, responses to my questions during the course of a discussion in the FBDD LinkedIn group suggested that a number of cheminformatic issues have been carefully thought through and that the database system could be very useful in drug discovery. One problem with the featured article is that its scientific weaknesses could lead to some customers losing confidence in the database system. Secondly, the folk who created the database system (and keep it running) may have only limited opportunities to publish and scientifically weak publications by colleagues who are perhaps less focused on what actually pays the bills may breed some resentment.

That's where I'll wrap because there is only so much voodoo thermodynamics that one can take in a day so, as we say in Brazil, 'até mais'.


Friday, 23 October 2015

From schwefeläther to octanol

So in this blog post, written specially for #RealTimeChem week on an #OldTimeChem theme, I'll start ten years after the Kaiser's grandmother became Queen of the United Kingdom of Great Britain and Ireland. I first came across Ernst Freiherr von Bibra while doing some literature work for an article on predicting alkane/water partition coefficients and learned that he was from an illustrious family of Franconian Prince-Bishops. Von Bibra certainly seems to have been a colorful character who is said to have fought no fewer than 49 duels as a young man. Presumably these were not the duels to the death that did for poor Galois and Pushkin but more like the München frat house duels that leave participants with the dueling scars that München Fräuleins find so irresistible.

So that's how I learned about von Bibra and it was his 1847 study with Harless 'Die Ergebnisse der Versuche über die Wirkung des Schwefeläthers' (The results of the experiments on the effect of the sulfuric ether) that we cited. Schwefeläther is simply diethyl ether and so named because in 1847 you needed to make it from ethanol and sulfuric acid. Von Bibra was a pioneer in the anesthesia field and proposed that anesthetics like ether exerted their effects by dissolving the fatty fraction of brain cells. Now it's easy in 2015 to scoff at this thinking but remember that in 1847 nobody knew about cell membranes or molecules and you couldn't just pick up the phone and expect the ether to arrive the next day. Put another way, if it was 1847 and I was in the laboratory (or kitchen?) gazing at a bowl of brains and a bottle of ether, I would probably have come to a similar conclusion.

What we know now is that anesthetics (and other drugs) dissolve IN lipids as opposed to dissolving THE lipids. Nobody in 1847 knew about partition coefficients and Walther Nernst didn't articulate his famous distribution law (Verteilung eines Stoffes zwischen zwei Lösungsmitteln und zwischen Lösungsmittel und Dampfraum. Z Phys Chem 8:110–139) until 1891 by which time the Kaiser had already handed Bismarck his P45. Within ten years Ernest Overton and Hans Meyer had shown incredible foresight in using amphibians as animal models and the concept of the cell membrane would soon be introduced.

Before moving on, let's take a look at the 'introduction to partition coefficients' graphic below in which the aqueous phase is marked by the presence of fish (they're actually piranhas and have graced my partition coefficient powerpoints since my first visit to Brazil in 2009). We would describe the compound on the left as lipophilic because its neutral form prefers the organic phase to water and, for now, I'm not going to be too specific about exactly what that organic phase is. The compound on the right prefers to be in the water so we describe it as hydrophilic. The red molecule on the left represents an ionized form of the compound on the left and typically these don't particularly like to go into the organic phase (especially not without counter ions for company). A compound that prefers to be in the organic phase can still be drawn into the aqueous phase by increasing the extent to which it is ionized (e.g. by decreasing pH if the compound is basic).
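The effect of pH on a basic compound can be sketched with the usual textbook expression for the distribution coefficient of a monoprotic base. This assumes that only the neutral form enters the organic phase (so no ion pairs going across with counter ions for company) and the log P of 3.0 and pKa of 9.0 are hypothetical values for an imaginary amine.

```python
import math

def log_d_monoprotic_base(log_p, pka, ph):
    """log D for a monoprotic base, assuming only the neutral form partitions.

    The fraction of compound that is neutral in the aqueous phase is
    1 / (1 + 10**(pKa - pH)), which gives
    log D = log P - log10(1 + 10**(pKa - pH)).
    """
    return log_p - math.log10(1.0 + 10.0 ** (pka - ph))

LOG_P = 3.0  # hypothetical log P of the neutral form
PKA = 9.0    # hypothetical pKa of the conjugate acid

for ph in (11.0, 9.0, 7.0, 5.0):
    print(f"pH = {ph:4.1f}  log D = {log_d_monoprotic_base(LOG_P, PKA, ph):5.2f}")
```

As pH falls below the pKa, log D drops by roughly one unit for each pH unit, which is the compound being drawn back towards the piranhas.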

Now I'd like to introduce Runar Collander (whom many of you will have heard of) and Calvin Golumbic (whom few of you will have heard of). Let's first take a look at Golumbic's 1949 study of the effects of ionization on the partitioning of phenols between water and cyclohexane. Please observe the responses to pH in Fig 1 in that article but also take a look at equation 5 which accounts for self-association in the organic phase and the discussion about how a methyl group ortho to the phenolic hydroxyl compromises the hydrogen bonding of that hydroxyl group and has observable effects on the partition coefficient.

Collander's study (The Partition of Organic Compounds Between Higher Alcohols and Water) explores how differences in the organic solvent affect partitioning behaviour of solutes.  Collander presented evidence for strong linear relationships between partition coefficients measured using different alcohols (see Fig 1 in his article) although he notes that compounds with two or more hydrophilic groups in their molecular structure tend to deviate from the trend. Now take a look at Fig 2 in Collander's study which shows a plot of octanol/water partition coefficients against their ether/water equivalents. Now the correlation doesn't look so strong although relationships within chemical families appear to be a lot better. In particular, amines appear to be more soluble in octanol than they are in ether and Collander attributes this to the greater acidity of octanol. 

When reading these articles from over sixty years ago, I'm struck by the way the authors rationalize their observations in physical terms. Don't be misled by what we would regard as the obscure use of language (e.g. Collander's "double molecules" and Golumbic's description of partition coefficients as "true") because the conceptual and linguistic basis of chemistry in 2015 is richer than it was when these pioneering studies were carried out. How these pioneers would have viewed some of the more mindless metrics by which chemists of 2015 have become enslaved can only be speculated about.

So I'm almost done but one last character in this all-star cast has yet to make his entrance and that, of course, is Corwin Hansch. Most drug discovery scientists 'know' that octanol 'defines' lipophilicity and only a small minority actually question the suitability of octanol for this purpose or even ask how this situation came to be. In order to address the second question, let's take a look at what Hansch et al have to say in this 1963 article,

"We have chosen octanol and water as a model system to approximate the effect of step I on the growth reaction in much the same fashion as the classical work of Meyer and Overton rationalized the relative activities of various anesthetics. This assumption is expressed in 2 where P is the partition coefficient (octanol-water) of the auxin.

          A = f(P)                              (2)

Collander has shown that the partition coefficients for a given compound in two different solvent systems (e.g., ether-water, octanol-water) are related as in 3.

          log P1 = a log P2 + b        (3)

This would also indicate, as does the Meyer-Overton work, that it is not unreasonable to use the results from one set of solvents to predict results in a second set". 

Now if you take another look at Collander's article, you'll see that he only claims that a linear relationship exists between partition coefficients when the organic phase is an alcohol. Collander's Fig 2 seems to suggest that Hansch et al's equation 3 cannot be used to relate octanol/water and ether/water partition coefficients. Can Collander's study be used to justify what appears to be a rather arbitrary choice of octanol as a solvent for partition coefficient measurements? That question, I will leave to you, the reader but, if you're interested, let me point you towards a short talk that I did recently at Ripon College. 
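If you want to test an equation of the form log P1 = a log P2 + b against your own measurements, the fit itself is trivial. Here is a minimal sketch in Python using ordinary least squares. The log P values below are made up for illustration and are emphatically not Collander's data, and the function name `fit_line` is my own.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a*x + b; returns (a, b)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b

# Hypothetical log P values for five imaginary solutes in two solvent systems.
log_p_system_2 = [-1.0, 0.0, 1.0, 2.0, 3.0]
log_p_system_1 = [-0.4, 0.2, 1.3, 2.0, 3.1]

a, b = fit_line(log_p_system_2, log_p_system_1)
print(f"log P1 = {a:.2f} * log P2 + {b:.2f}")

# Residuals show which solutes sit off the line, which is where
# the interesting chemistry (e.g. hydrogen bonding) tends to be hiding.
for x, y in zip(log_p_system_2, log_p_system_1):
    print(f"log P2 = {x:5.2f}  residual = {y - (a * x + b):6.2f}")
```

The slope and intercept will always come out of a fit like this, so it is the residuals, and whether they group by chemical family, that tell you if the linear relationship deserves to be believed.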

See you next year at #RealTimeChem week and don't forget to take a look at Laura's nails which get my vote for highlight of the week.

Monday, 19 October 2015

Halogen bonding and the curious case of the poisoned dogs

In this blog post, written specially for #RealTimeChem week on an #OldTimeChem theme, I'm going to start in the current century and work back to just a couple of years after the Kaiser's grandmother became Queen of the United Kingdom of Great Britain and Ireland. Readers might want to consider how history might have turned out differently had her eldest child succeeded her to the throne.

One can be forgiven for thinking that halogen bonding is a new phenomenon. The term halogen bonding refers to attractive interactions between halogens and hydrogen bond acceptors which should be repulsive because hydrogen bond acceptors and halogens are electronegative and would therefore be expected to carry negative partial charges. Nevertheless halogens (other than fluorine) do seem to rather enjoy the company of hydrogen bond acceptors and molecular interaction 'catalogs' such as A Medicinal Chemist's Guide to Molecular Interactions and Molecular Recognition in Chemical and Biological Systems ensure that the medicinal chemist of 2015 is made aware of the importance of halogen bonding and of potential opportunities for exploiting it.

I was first made aware of halogen bonding in the mid 1990s through interactions with Zeneca colleagues at Jealott's Hill and, some time after that, The Nature and Geometry of Intermolecular Interactions between Halogens and Oxygen or Nitrogen was published with one of those colleagues (who had by then moved to CCDC) as a co-author. A few years ago, I reproduced some of the analysis from that paper for a talk and here is one of the slides which shows how the closest contacts between the carbonyl oxygen and halogen are observed when the two atoms approach along the axis defined by the halogen and the carbon atom to which it is bound.

If the halogen is bound to something that is sufficiently electron-withdrawing, the molecular electrostatic potential (MEP) on a circular patch on the van der Waals surface of the halogen becomes positive and sometimes this is described as a σ-hole. This article shows that the σ-hole is most pronounced for iodine and non-existent for fluorine, which is consistent with what X-ray crystal structures tell us about halogen bonding. I created a slightly different picture of the σ-hole for my talk which shows MEP as a function of distance along two orthogonal directions of approach to the chlorine. The significance of the σ-hole is that you don't actually have to invoke polarization to 'explain' halogen bonding even though it will always happen when atoms get up close and personal.

When I first encountered halogen bonding, I remembered learning about the reaction of iodine with iodide anion as a schoolboy in Port-of-Spain in the 1970s. Iodine is not particularly water-soluble and this is a way to coax it into solution. That iodine forms complexes with Lewis bases has been known for many years and the Nantes group, better known for their extensive studies of hydrogen bond basicity, have used this to develop a halogen bond basicity scale based on this chemistry.

So in the spring of 2008 I found myself charged with doing a talk on halogens at EuroCUP (OpenEye European user group meeting) and halogen bonding was clearly going to be an important topic. To understand why I was doing this, we need to go back to the 2007 Computer-Aided Drug Design Gordon Conference and specifically the election of the vice chair for the 2009 conference. My good friend Anthony Nicholls was one of the nominees and in his candidacy speech said that he was going to have a session on halogens.  Although Ant didn't win that election, this is by far the most lucid and sensible suggestion that I've heard in a GRC candidacy speech. Needless to say the session on halogens happened in Strasbourg at EuroCUP (the morning after the conference dinner in a winery) and it was on the bus ride to that dinner that I had this conversation with my favorite Austro-Hungarian:

 "Onkel Hugo, where are you from in Germany?   
I am from AUSTRIA!   
Is that in Bavaria?"

While preparing the harangue, I decided to follow the iodine/iodide trail to see how far back it led and I was thrilled to encounter The Periodides by Albert B Prescott, writing in 1895, in which it was noted that,

"In 1839 Bouchardat, a medical writer in Paris, recounts that, when dogs were being surreptitiously poisoned with strychnine in Paris, and an antidote was asked for, first Guibourt recommended powdered galls, and then Donne advised iodine tincture, whereupon Bouchardat himself, approving the use of iodine, said they should use it in potassium iodide solution."

Thirty-one years after the nefarious activities of the notorious Parisian dog poisoners, a sudden influx of Prussian visitors demonstrated the inadequacy of Strasbourg's supply of sunbeds and I probably should now let you take a look at that infamous EuroCUP2008 harangue.

Monday, 12 October 2015

A PAINful convolution of fact with opinion?

So once more I find myself blogging on the subject of PAINS, although now in the wake of the 2015 Nobel Prize for medicine, which will have forced many drug-likeness 'experts' onto the back foot. This time the focus is on antifungal research and the article in question has already been reviewed In The Pipeline. It's probably a good idea to restate my position on PAINS so that I don't get accused of being a Luddite who willfully disregards evidence (I'm happy to be called a heretic who willfully disregards sermons on the morality of ligand efficiency metrics and the evils of literature pollution). Briefly, I am well aware that not all output from biological assays smells of roses although I have suggested that deconvolution of fact from opinion is not always quite as straightforward as some of the 'experts' and 'thought leaders' would have you believe. This three-part series of blog posts ( 1 | 2 | 3 ) should make my position clear but I'll try to make this post as self-contained as possible so you don't have to dig around there too much.

So it's a familiar story in antifungals. More publications but fewer products and the authors of the featured article note, “However, we believe that one key reason is the meager quality of some of these new inhibitors reported in the antifungal literature; many of which contain undesirable features” and I'd have to agree that if the features really do cause compounds to choke in (or before) development then the features can legitimately be described as 'undesirable'. As I've said before, it is one thing to opine that something looks yucky but another thing entirely to establish that it really is yucky. In other words, it is not easy to deconvolute fact from opinion. The authors claim, “It is therefore not surprising to see a long list of papers reporting molecules that are fungicidally active by virtue of some embedded undesirable feature. On the basis of our survey and analysis of the antifungal literature over the past 5 years, we estimate that those publications could cover up to 80% of the new molecules reported to have an antifungal effect”. I have a couple of points to make here. Firstly, it would be helpful to know what proportion of that 80% have embedded undesirable features that are actually linked to relevant bad behavior in experimental studies. Secondly, this is a chemistry journal so please say 'compound' when you mean 'compound' because 80% of even just a mole of molecules is a shit-load of molecules.

So it's now time to say something about PAINS but I first need to make the point that embedded undesirable features can lead to different types of bad behavior by compounds. Firstly, the compound can interact with assay components other than target protein and I'll term this 'assay interference'. In this situation, you can't believe 'activity' detected in the assay but you may be able to circumvent the problem by using an orthogonal assay. Sometimes you can assess the seriousness of the problem and even correct for it as described in this article. A second type of bad behaviour is observed when the compound does something unpleasant to the target (and other proteins) and I'd include aggregators and redox cyclers in this class along with compounds that form covalent bonds with the protein in an unselective manner. The third type of bad behaviour is observed when the embedded undesirable feature causes the compound to be rapidly metabolized or otherwise 'ADME-challenged'.  This is the sort of bad behaviour that lead optimization teams have to deal with but I'll not be discussing it in this post.

The authors note that the original PAINS definitions were derived from observation of “structural features of frequent hitters from six different and independent assays”. I have to admit to wondering what is meant by 'different and independent' in this context and it comes across as rather defensive. I have three questions for the authors. Firstly, if you had the output of 40+ screens available, would you consider the selection of six AlphaScreen assays to be an optimal design for an experiment to detect and characterize pan-assay interference? Secondly, are you confident that singlet oxygen quenching/scavenging by compounds can be neglected as an interference mechanism for these 'six different and independent AlphaScreen assays'? Thirdly, how many compounds identified as PAINS in the AlphaScreen assays were shown to bind covalently to one or more of their targets in the original PAINS article?

The following extract from the article should illustrate some of the difficulties involved in deconvoluting fact from opinion in the PAINS literature and I've annotated it in red (my comments appear in square brackets).  Here is the structure of compound 1:

"One class of molecules often reported as antifungal are rhodanines and molecules containing related scaffolds. These molecules are attractive, since they are easily prepared in two chemical steps. An example from the recent patent literature discloses (Z)-5-decylidenethiazolidine-2,4-dione (1) [It's not actually a rhodanine and don't even think about bringing up the fact that my friends at Practical Fragments have denounced both TZDs and thiohydantoins as rhodanines] as a good antifungal against Candida albicans (Figure 2).(16) As a potential carboxylic acid isostere, thiazolidine-2,4-diones may be sufficiently unreactive such that they can progress some way in development.(17) Nevertheless, one should be aware of the thiol reactivity associated with this type of molecule, as highlighted by ALARM NMR and glutathione assays,(18) [I can't access reference 18 but rhodanines appear to represent its primary focus. Are you able to present evidence for thiol reactivity for compound 1? What is the pKa for compound 1 and how might this be relevant to its ability to function as a Michael acceptor?] and this is especially relevant when the compound contains an exocyclic alkene such as in 1. The fact that rhodanines are promiscuous compounds has been recently highlighted in the literature.(13, 18) [Maybe rhodanines really are promiscuous but, at the risk of appearing repetitious, 1 is not a rhodanine and when we start extrapolating observations made for rhodanines to other structural types, we're polluting fact with opinion.
Also the evidence for promiscuity presented in reference 13 is frequent-hitter behavior in a panel of six AlphaScreen assays and this can't be invoked as evidence for thiol reactivity because rhodanines lacking the exocyclic carbon-carbon double bond are associated with even greater PAIN levels than rhodanines that can function as Michael acceptors] The observed antifungal activity of 1 is certainly genuine; however, this could potentially be the result of in vivo promiscuous reactivity in which case the main issue lies in the potential lack of selectivity between the fungi and the other exposed living organisms" [This does come across as arm-waving. Have you got any evidence that there really is an issue? It's also worth remembering that, once you move away from the AlphaScreen technology, rhodanines and their cousins are not the psychopathic literature-polluters of legend and it is important for PAINS advocates to demonstrate awareness of this article.]

The authors present chemical structures of thirteen other compounds that they find unwholesome and I certainly wouldn't be volunteering to optimize any of the compounds presented in this article. As an aside, when projects are handed over at transition time, it is instructive to observe how perceptions of compound quality differ between those trying to deliver a project and those charged with accepting it.  My main criticism of the article is that very little evidence that bad things are actually happening is presented. The authors seem to be of the view that reactivity towards thiols is necessarily a Bad Thing. While formation of covalent bonds between ligands and proteins may be frowned upon in Switzerland (at least on Sundays), it remains a perfectly acceptable way to tame errant targets.  A couple of quinones also appear in the rogues' gallery and it needs to be pointed out that the naphthoquinone atovaquone is one of the two components (the other is proguanil which would also cause many compound quality 'experts' to spit feathers if they had failed to recognize it as an approved drug) of the antimalarial drug Malarone. I have actually taken Malarone on several occasions and the 'quinone-ness' of atovaquone worries me a great deal less than the potential neuropsychiatric effects of the (arguably) more drug-like mefloquine that is a potential alternative. My reaction to a 'compound quality' advocate who told me that I should be taking a more drug-like malaria medication would be a two-fingered gesture that has occasionally been attributed to the English longbowmen at Agincourt. Some of the objection to quinones appears to be due to their ability to generate hydrogen peroxide via redox cycling and, when one makes this criticism of compounds, it is a good idea to demonstrate that one is at least aware that hydrogen peroxide is an integral component of the regulatory mechanism of PTP1B.

This is a good point to wrap things up. I just want to reiterate the importance of making a clear distinction in science between what you know and what you believe. This echoes Feynman who is reported to have said that, "The first principle is that you must not fool yourself and you are the easiest person to fool". Drug discovery is really difficult and I'm well aware that we often have to make decisions with incomplete information. When basing decisions on conclusions from data analysis, it is important to be fully aware of any assumptions that have been made and of limitations in what I'll call the 'scope' of the data. The output of forty high throughput screens that use different detection technologies is more likely to reveal genuinely pathological behavior than the output of six high throughput screens that all use the same detection technology. One needs to be very careful when extrapolating frequent hitter behavior to thiol reactivity and especially so when using results from a small number of assays that use a single detection technology. This article on antifungal PAINS is heavy on speculation (I counted 21 instances of 'could', 10 instances of 'might' and 5 instances of 'potentially') and light on evidence.  I'm not denying that some (much?) of the output from high throughput screens is of minimal value and one key challenge is how to detect and characterize bad chemical behavior in an objective manner.  We need to think very carefully about how the 'PAINS' term should be used and the criteria by which compounds are classified as PAINS. Do we actually need to observe pan-assay interference in order to apply the term 'PAINS' to a compound or is it simply necessary for the compound to share a substructural feature with a minimum number of compounds for which pan-assay interference has been observed?  How numerous and diverse must the assays be?
The term 'PAINS' seems to get used more and more to describe any type of bad behavior (real, suspected or imagined) by compounds in assays and a case could be made for going back to talking about 'false positives' when referring to generic bad behavior in screens.

And I think that I'll leave it there. Hopefully I've provided some food for thought.

Tuesday, 22 September 2015

Ligand efficiency metrics: why all the fuss?

I guess it had to happen.  Our gentle critique of ligand efficiency metrics (LEMs) has finally triggered a response in the form of a journal article (R2015) in which it is dismissed as "noisy arguments" and I have to admit, as Denis Healey might have observed, that this is like being savaged by a dead sheep.  The R2015 defense of LEMs follows on from an earlier one (M2014) in which the LEM Politburo were particularly well represented and it is instructive to compare the two articles.  The M2014 study was a response to Mike Schultz’s assertion ( 1 | 2 | 3 ) that ligand efficiency (LE) was mathematically invalid and its authors correctly argued that LE was a mathematically valid expression although they did inflict a degree of collateral damage on themselves by using a mathematically invalid formula for LE (their knowledge of the logarithm function did not extend to its inability to take an argument that has units).  In contrast, the R2015 study doesn’t actually address the specific criticisms that were made of LE and can be described crudely as an appeal to the herding instinct of drug discovery practitioners.  It actually generates a fair amount of its own noise by asserting that, "Ligand efficiency validated fragment-based design..." and fails to demonstrate any understanding of the points made in our critique.  The other difference between the two studies is that the LEM Politburo is rather less well represented in the R2015 study and one can only speculate as to whether any were invited to join this Quixotic assault on windmills that can evade lances by the simple (and valid) expedient of changing their units.  One can also speculate about what the responses to such an invitation might have been (got a lot on right now… would love to have helped… but don’t worry…go for it... you’ll be just fine... we’ll be right behind you if it gets ugly) just as we might also speculate as to whether Marshal Bosquet might have summed up the R2015 study as, "C'est magnifique, mais ce n'est pas la guerre".

It’s now time to say something about our critique of LE metrics and I’ll also refer readers to a three-part series ( 1 | 2 | 3 ) of blog posts and the ‘Ligand efficiency: nice concept, shame about the metrics’ presentation.  In a nutshell, the problem is that LE doesn’t actually measure what the LEM Politburo would like you to believe that it measures and it is not unreasonable to ask whether it actually measures anything at all. Specifically, LE provides a view of chemico-biological space that changes with the concentration units in which measures of affinity and inhibitory activity are usually expressed. A view of a system that changes with the units in which the quantities that describe the system are expressed might have prompted Pauli to exclaim that it was "not even wrong" and, given that LE is touted as a metric for decision-making, this is a serious deficiency that is worth making a fuss about.  If you want to base design decisions on such a metric (or planetary alignments for that matter) then by all means do so. I believe the relevant term is 'consenting adults'.

Now you might be thinking that it's a little unwise to express such heretical thoughts since this might cause the auto-da-fé to be kindled and I have to admit that, at times, the R2015 study does read rather like the reaction of the Vatican’s Miracle Validation Department to pointed questions about a canonization decision.  Our original criticism of LE was on thermodynamic grounds and this seemed reasonable given that LE was originally defined in thermodynamic terms. As an aside, anybody thinking of invoking Thermodynamics when proposing a new metric should consider the advice from Tom Lehrer in 'Be Prepared' that goes, "don't write naughty words on walls if you can't spell". Although the R2015 study complains that the thermodynamic arguments used in our critique were complex, the thermodynamic basis of our criticism was both simple and fundamental.  What we showed is that perception of ligand efficiency changes when we change the value of the concentration used to define the standard state for ΔG° and, given that the standard concentration is really just a unit, that is a very serious criticism indeed.  It is actually quite telling that the R2015 study doesn't actually say what the criticism was and to simply dismiss criticism as "noisy arguments" without counter-argument in these situations is to effectively run the white flag up the pole.  I would argue that if there was a compelling counter-argument to our criticism of LE then it's a pretty safe assumption that the R2015 study would have used it.  This is a good time to remind people that if you're going to call something a metric then you're claiming that it measures something and, if somebody calls your bluff, you'll need more than social media 'likes' to back your claims.

All that said, I realize that many in the drug discovery field find Thermodynamics a little scary so I'll try to present the case against LE in non-thermodynamic terms. I suggest that we use  IC50 which is definitely not thermodynamic and I'll define something called generalized ligand efficiency (GLE) to illustrate the problems with LE:

GLE = -(1/NHA) × log10(IC50/Cref)

In defining generalized ligand efficiency, I've dumped the energy units, used the number of heavy atoms (which means that we don't have to include a 'per heavy atom' when we quote values) and changed the base of the logarithm to something a bit more user-friendly.  Given that LE advocates often tout the simplicity of LE, I just can't figure out why they are so keen on the energy units and natural logarithms especially when they so frequently discard the energy units when they quote values of LE.  One really important point of which I'd like you to take notice is that IC50 has been divided by an arbitrary concentration unit that I'll call Cref (for reference concentration). The reason for doing this is that you can't calculate a logarithm for a quantity with units and, when you say "IC50 in molar units", you are actually talking about the ratio of IC50 to a 1 M concentration.  Units are really important in science but they are also arbitrary in the sense that coming to different conclusions using different units is more likely to indicate an error in the 'not even wrong' category rather than penetrating insight. Put another way, if you concluded that Anne-Marie was taller than Béatrice when measured in inches how would you react when told that Béatrice was taller than Anne-Marie when measured in centimetres?  Now before we move on, just take another look at the formula for GLE and please remember that Cref is an arbitrary unit of concentration and that a metric is no more than a fancy measuring tape.

Ligand efficiency calculations almost invariably involve selection of 1 M as the concentration unit although those using LE (and quite possibly those who introduced the metrics) are usually unaware that this choice has been made. If you base a metric on a specific unit then you either need to demonstrate that the choice of unit is irrelevant or you need to justify your choice of unit. GLE allows us to explore the consequences of arbitrarily selecting a concentration unit. As a potential user of a metric, you should be extremely wary of any metric if those advocating its use evade their responsibility to be open with the assumptions made in defining the metric and the consequences of making those assumptions. Let's take a look at the consequences of tying LE to the 1 M concentration unit and I'll ask you to take a look at the table below which shows GLE values calculated for a fragment hit, a lead and an optimized clinical candidate.  When we calculate GLE using 1 M as the concentration unit, all three compounds appear to be equally ligand-efficient but that picture changes when we choose another concentration unit.  If we choose 0.1 M as the concentration unit we now discover that the optimized clinical candidate is more ligand-efficient than the fragment hit. However, if we choose a different concentration unit (10 M) we find that the fragment hit is now more ligand-efficient than the optimized clinical candidate. All we need to do now is name our fragment hit 'Anne-Marie' and our optimized clinical candidate 'Béatrice' and it will be painfully apparent that something has gone horribly awry with our perception of reality.
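For anybody who would rather check the arithmetic than take my word for it, here is a minimal Python sketch of the GLE calculation. The three compounds are hypothetical (10 heavy atoms at 1 mM, 20 at 1 µM and 30 at 1 nM) and were chosen purely so that GLE comes out equal at Cref = 1 M; they are illustrative numbers and not necessarily those in the table, and the function name gle is simply my label for the formula given earlier.

```python
import math

def gle(ic50_molar, n_heavy_atoms, c_ref_molar=1.0):
    """Generalized ligand efficiency: -(1/NHA) * log10(IC50/Cref).

    IC50 and Cref must be expressed in the same concentration unit
    (molar here) so that the argument of the logarithm is dimensionless.
    """
    return -math.log10(ic50_molar / c_ref_molar) / n_heavy_atoms

# Hypothetical compounds (illustrative values only): (IC50 in M, heavy atoms)
compounds = {
    "fragment hit": (1e-3, 10),
    "lead": (1e-6, 20),
    "clinical candidate": (1e-9, 30),
}

# Same compounds, same IC50 values, three choices of reference concentration
for c_ref in (1.0, 0.1, 10.0):
    print(f"Cref = {c_ref} M")
    for name, (ic50, nha) in compounds.items():
        print(f"  {name:20s} GLE = {gle(ic50, nha, c_ref):.3f}")
```

With these assumed numbers all three compounds score 0.30 at 1 M, the clinical candidate (about 0.27) beats the fragment hit (0.20) at 0.1 M, and the fragment hit (0.40) beats the clinical candidate (about 0.33) at 10 M. Nothing about the compounds has changed between the three runs.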
One should be very worried about basing decisions on a metric that tells you different things when you express quantities in different units because 10 nM is still 10 nM whether you call it 10000 pM, 10⁻² µM or 10⁻⁸ M.  This is a good time to ask readers if they can remember LE being used to prioritize one compound (or even a series) over another in a project situation.  Would the decision have been any different had you expressed IC50 in different units when you calculated LE?   Was Anne-Marie really more beautiful than Béatrice?   The other point that I'd like you to think about is the extent to which improperly-defined metrics can invalidate the concept that they are intended to apply. I do believe that a valid concept of ligand efficiency can be retrieved from the wreckage of metrics and this is actually discussed both in the conclusion of our LEM critique and my recent presentation. But enough of that because I'd like to let the R2015 study say a few words.

The headline message for the R2015 study asserts, “The best argument for ligand efficiency is simply the assertion that on average smaller drugs have lower cost of goods, tend to be more soluble and bioavailable, and have fewer metabolic liabilities”. I do agree that excessive molecular size is undesirable (we applied the term 'pharmaceutical risk factor' to molecular size and lipophilicity in our critique) but one is still left with the problem of choosing a concentration unit or, at least, justifying the arbitrary choice of 1 M.  As such, it is not accurate to claim that the undesirability of excessive molecular size is the best argument for ligand efficiency because the statement also implicitly claims justification for the arbitrary choice of 1 M as the concentration unit.  I have no problem normalizing potency with respect to molecular size but doing so properly requires analyzing the data appropriately rather than simply pulling a concentration unit out of a hat and noisily declaring it to be the Absolute Truth in the manner in which one might assert the age of the Earth to be 6000 years.

I agree 100% with the statement, "heavy atoms are the currency we spend to achieve high-affinity binding, and all else being equal it is better to spend fewer for a given potency".  When we progress from a fragment hit to an optimized clinical candidate we spend by adding molecular size and our purchase is an increase in potency that is conveniently measured by the log of the ratio of IC50 values of the hit and optimized compound. If you divide this quantity by the difference in heavy atoms, you'll get a good measure of how good the deal was. Crucially, your perception of the deal does not depend on the concentration unit in which you express the IC50 values and this is how group efficiency works. It's a different story if you use LE and I'll direct you back to the table above so that you can convince yourself of this.
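That the 'deal' is independent of the concentration unit can also be checked in a few lines of Python. This is only a sketch: the hit (1 mM, 10 heavy atoms) and the optimized compound (1 nM, 30 heavy atoms) are hypothetical, and potency_gain_per_heavy_atom is just my name for the quantity described above.

```python
import math

def potency_gain_per_heavy_atom(ic50_hit, nha_hit, ic50_opt, nha_opt):
    """log10(IC50_hit / IC50_opt) divided by the number of heavy atoms added.

    The two IC50 values must share a concentration unit, but it does not
    matter which unit because it cancels in the ratio.
    """
    return math.log10(ic50_hit / ic50_opt) / (nha_opt - nha_hit)

# Hypothetical hit (1 mM, 10 heavy atoms) and optimized compound (1 nM, 30 heavy atoms)
in_molar = potency_gain_per_heavy_atom(1e-3, 10, 1e-9, 30)      # IC50 in M
in_nanomolar = potency_gain_per_heavy_atom(1e6, 10, 1.0, 30)    # same IC50 values in nM

print(f"IC50 in M:  {in_molar:.3f} log units per heavy atom added")
print(f"IC50 in nM: {in_nanomolar:.3f} log units per heavy atom added")
```

Both calls give 0.3 log units per added heavy atom because no reference concentration ever enters the calculation, which is exactly what cannot be said for LE.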

I noted earlier that the R2015 study fails to state what criticisms have been made of LE and simply dismisses them in generic terms as fuss and noise.  I don't know whether this reflects a failure to understand the criticisms or a 'head in the sand' reaction to a reality that is just too awful to contemplate.  Either way, the failure to address specific criticisms made of LE represents a serious deficiency in the R2015 study and it will leave some readers with the impression that the criticisms are justified even if they have not actually read our critique. The R2015 study asserts, "There is no need to become overly concerned with noisy arguments for or against ligand efficiency metrics being exchanged in the literature" and, given the analogy drawn in R2015 between LE and fuel economy, I couldn't help thinking of a second-hand car dealer trying to allay a fussy customer's concerns about a knocking noise coming from the engine compartment of a potential purchase.

So now it's time to wrap up and I hope that you have learned something from the post even if you have found it to be tedious (I do have to admit that metrics bore me shitless).  Why all the fuss?  I'll start by saying that drug discovery is really difficult and making decisions using metrics that alter your perception when different units are used is unlikely to make drug discovery less difficult. If we do bad analysis and use inappropriate metrics then those who fund drug discovery may conclude that the difficulties we face are of our own making.  Those who advocate the use of LE are rarely (if ever) open about the fact that LE metrics are almost invariably tied to a 1 M reference concentration and that changing this reference alters our perception of efficiency. We need to remember that the purpose of a metric is to measure and not to tell us what we want to hear. I don't deny that LE is widely used in drug discovery and that is precisely why we need to make a fuss about its flaws. Furthermore, there are ways to normalize activity with respect to molecular size that do not suffer from LE's deficiencies.  I'll even admit that LE is useful since the metric has proven to be a powerful propaganda tool for FBDD. So useful, in fact, that the R2015 study uses the success of FBDD as propaganda for LE. Is the tail wagging the dog or have the inmates finally taken over the asylum?