Wednesday, 6 January 2016

Looking back at 2015


I'll start the year by taking a look back at some of the 2015 blog posts. The dynamic range of the bullshitometer was severely tested last year and there was an element of 'pour encourager les autres' to more than one of the posts. I thought that it'd be fun to share some travel pics and the first is of the Danube in Belgrade (I'd dropped by to catch up with friends and deliver a harangue at the university). The early evening light was quite perfect although I hope that I won't spoil your experience of the photo by telling you that there was a pig carcass floating about 100 m from where it was taken.

 

I changed the title of the blog this year. I've not been involved with FBDD for some years now and molecular design was always my main interest. One of the ideas that I try to communicate is that there's more to design than just making predictions. After Belgrade, I dropped in at Fidelta in Zagreb where I delivered another harangue before heading south to Sarajevo.  I'm a keen student of history so it was inevitable that this would be the first photo I'd take in Sarajevo.


It seems so bizarre today. There had already been one assassination attempt for the day when the driver of the car took the fateful wrong turn that gave Gavrilo Princip the opportunity to fire two shots at the royal couple. Back in Vienna, Sophie was not always allowed out in public with Franz Ferdinand so the trip to Sarajevo may have been a special treat for her. What if SatNav had already been invented but, then again, what if Queen Victoria's eldest child had succeeded her to the throne?

Part of the problem was that, as a lowly Czech countess, Sophie was not considered an appropriate match for the Habsburg heir by Franz Josef (the reigning emperor and a puritanical old killjoy) and there were rules (although metrics and Lean Six Sigma 'belts' had, thankfully, not yet been invented). One of the rules was that the children of Sophie and Franz Ferdinand were barred from succession. It is somewhat ironic that poor Franz Ferdinand was never even supposed to be crown prince in the first place and only got the job because his cousin Rudolf had abruptly removed himself from the Habsburg line of succession a quarter of a century previously. 

All this talk of puritanical rules serves as a reminder that, before moving on, I need to point you towards a friend's blog post on roundheads (who were bigger killjoys than Franz Josef or even Lean Six Sigma 'belts') and cavaliers in drug discovery.  I really like the term 'roundhead' and I think you do have to agree that it's a lot politer than 'compound quality jackboot'. Terms like 'roundhead' and 'jackboot' are invariably associated with pain and that brings me to the next topic which is PAINS. My interest in this topic was piqued by a PAINS-shaming post at Practical Fragments and I have to thank my friends there for launching me on what has proven to be a most stimulating, although at times disturbing, line of inquiry.

My first post on PAINS examined some of the basic science and cheminformatics behind the substructural filters used. One observation that I'll make is that cheminformaticians would have done themselves rather more credit if, instead of implementing PAINS filters quite so enthusiastically, they'd first taken a more forensic look at how the filters had been derived. Singlet oxygen is an integral component of the AlphaScreen technology used in all six assays that formed the basis of the original PAINS study and the second post explored some of the consequences of this reliance on singlet oxygen. The third post was written as a 'get out of jail' card for those who need to get their use of PAINS past manuscript reviewers but, on a more serious note, it does pose some questions about how much we actually know about the behavior of PAINS compounds. The final PAINS post emphasized the need to make a clear distinction in science between what we know and what we believe. If we are unable (or unwilling) to demonstrate that we can do this in drug discovery then those who fund our work may conclude that the difficulties we face are of our own making.

There's actually a lot more to Sarajevo than dead Habsburgs and the city hosted the  1984 Winter Olympics. I took a taxi to the top of the bobsled run and walked back down to the city. Here are some photos. 




  


  
  

So I guess you're wondering where the 1984 bobsled run fits into drug discovery.  Ligand efficiency is, in essence, about slopes and intercepts and, like bobsledders, ligand efficiency advocates prefer not to think about intercepts.  I did two posts on ligand efficiency in 2015. The first post was a response to an article in which our criticism of ligand efficiency metrics was denounced as noise although, in the manner of Pravda, the article didn't actually say what the criticism was and I was left with the impression of a panicky batsman desperately trying to fend off a throat ball that had lifted sharply off just short of a length. The second post explored the link between ligand efficiency and homeopathy. 

I have described ligand efficiency as not even wrong and it also fits snugly into the voodoo thermodynamics category. Sometimes I think that if a coiled dog turd could be converted to molar energy units and scaled by coil radius then it would get adopted as a metric (which we might call 'scatological efficiency'). Voodoo thermodynamics is likely to feature more frequently in 2016 although I did manage one post on this topic in 2015. 

I took the train from Sarajevo to Mostar and the next four photos show a guy jumping, as is the local custom, off the reconstructed Stari Most into the Neretva River.


 





Now I guess you're wondering what a guy jumping off a bridge in Herzegovina has to do with molecular design and the quick answer is nothing at all. During the course of the year I jumped off a bridge of sorts (more accurately out of my applicability domain) with a post on Open Access and there'll hopefully be more of this sort of thing this year. This is probably a good point to wrap up the review of 2015 and I look forward to seeing you towards the end of the month when you'll meet the boys who cried wolf.


Thursday, 31 December 2015

The homeopathic limit of ligand efficiency

With the thermodynamic proxies staked, I can get back to salting the ligand efficiency (LE) soucouyant. In the previous post on this topic, I responded to a 'sound and fury' article which appeared to express the opinion that we should be using the metrics and not asking rude questions about their validity. One observation that one might make about an article like this is that the journal in question could be seen as trying to suppress legitimate scientific debate and I put this to their editorial staff. The response was that an article like this represents the opinion of the author and that I should consider a letter to the editor if there was a grievance. I reassured them that there was no grievance whatsoever and that it actually takes a lot of the effort out of providing critical commentary on the drug discovery literature when journals serve up cannon-fodder like this. In the spirit of providing helpful and constructive feedback to the editorial team, I did suggest that they might discuss the matter among themselves because a medicinal chemistry journal that is genuinely looking to the future needs to be seen as catalyzing rather than inhibiting debate. Now there is something else about this article that it took me a while to spot which is that it is freely available while 24 hours of access to the other editorial article in the same issue will cost you $20 in the absence of a subscription. If the author really coughed up for an APC so we don't have to pay to watch the toys getting hurled out of the pram then fair enough. If, however, the journal has waived the APC then claims that it is not attempting to stifle debate become a lot less convincing. Should we be talking about the Pravda of medicinal chemistry? Too early to say but I'll be keeping an eye open for more of this sort of thing.

Recalling complaints that our criticism of the thermodynamic basis of LE was 'complex', I'm going to try to make things even simpler than in the previous post. They say a picture is worth a thousand words so I'm going to use a graphical method to show how LE can be assessed. To make things really simple, we'll dispense with the pretentious energy units by using -log10(IC50/Cref) as the measure of activity and I'll also point you towards an article that explains why you need that reference concentration (Cref) if you want to calculate a logarithm for IC50. I'll plot activity for two hypothetical compounds, one of which is a fragment with 10 heavy atoms and the other is a more potent hit from high-throughput screening (HTS) that has 20 heavy atoms. I won't actually say what units IC50 values are expressed in and you can think of the heavy atom axis as sliding up or down the activity axis in response to changes in the concentration unit Cref. I've done things this way to emphasize the arbitrary nature of the concentration unit in the LE context.




Take a look at the plot in the left of the figure which I've labeled as 'A'.  Can you tell which of the compounds is more ligand-efficient just by looking at this plot?  Don't worry because I can't either. 

It's actually very easy to use a plot like this to determine whether one compound is more ligand-efficient than another one. First locate the point on the vertical axis corresponding to an IC50 of 1 M. Then draw a line through this point and the point representing the activity of one of the compounds. If the point representing the activity of the other compound lies below the line then it is less ligand-efficient and vice versa. Like they say in Stuttgart, vorsprung durch metrik!

Now take a look at the plot on the right of the figure which I've labelled 'B'. I've plotted two different lines that pass through the point corresponding to the fragment hit. The red line suggests that the fragment hit is more ligand efficient than the HTS hit but the green line suggests otherwise. Unfortunately there's no way of knowing which of these lines is the 'official' LE line (with intercept corresponding to IC50 = 1 M) because I've not told you what units IC50 is expressed in. Presenting the IC50 values in this manner is not particularly helpful if you need to decide which of the two hits is 'better' but it does highlight the arbitrary manner in which the intercept is selected in order to calculate LE. It also highlights how our choice of intercept influences our perception of efficiency.  
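The dependence on the intercept can be put into numbers with a minimal sketch. The IC50 values below are hypothetical (the plots deliberately don't give any) and are chosen only so that the two compounds have the 10 and 20 heavy atoms described above. The sketch calculates LE as -log10(IC50/Cref) divided by heavy atom count and changes nothing except Cref.

```python
import math

def activity(ic50, c_ref):
    """Activity as -log10(IC50/Cref); both concentrations in mol/L."""
    return -math.log10(ic50 / c_ref)

def ligand_efficiency(ic50, heavy_atoms, c_ref):
    """Activity scaled by the number of heavy atoms."""
    return activity(ic50, c_ref) / heavy_atoms

# Hypothetical compounds: the IC50 values are assumptions for illustration.
FRAGMENT = {"ic50": 1e-3, "heavy_atoms": 10}
HTS_HIT = {"ic50": 1e-5, "heavy_atoms": 20}

def more_efficient(c_ref):
    """Name the compound with the higher LE for a given Cref."""
    le_fragment = ligand_efficiency(FRAGMENT["ic50"], FRAGMENT["heavy_atoms"], c_ref)
    le_hts = ligand_efficiency(HTS_HIT["ic50"], HTS_HIT["heavy_atoms"], c_ref)
    if math.isclose(le_fragment, le_hts, abs_tol=1e-9):
        return "tie"
    return "fragment" if le_fragment > le_hts else "HTS hit"

# Same two compounds, same measurements; only the concentration unit changes.
for c_ref in (1.0, 1e-1, 1e-3):
    print(f"Cref = {c_ref:g} M: {more_efficient(c_ref)}")
```

With these assumed numbers the fragment comes out ahead when Cref is 1 M, the two compounds tie at 0.1 M, and the HTS hit comes out ahead at 1 mM, which is the red line versus green line situation in plot B.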

You can also think of the intercept as a zero molecular size limit for activity. One reason for doing so is that if the correlation between activity and molecular size is sufficiently strong, you may be able to extrapolate the trend in the data to the activity axis. Would it be a good idea to make assumptions about the intercept if the data can't tell you where it is? LE is based on an assumption that the 1 M concentration is somehow 'privileged' but, in real life, molecules don't actually give a toss about IUPAC. You can almost hear the protein saying "IUPAC... schmupack" when the wannabe ligand announces its arrival outside the binding pocket armed with the blessing of a renowned thought-leader. The best choice of a zero molecular size limit for activity would appear to be an interesting topic for debate. Imagine different experts each arguing noisily for his or her recommended activity level to be adopted as the One True Zero Molecular Size Limit For Activity. With apologies to Prof Tolkien and his pointy-eared friends,


One Unit to rule them all, One Unit to find them,
One Unit to bring them all and in the darkness bind them, 

If this all sounds strangely familiar, it might be because you can create an absence of a solute just as effectively by making its molecules infinitely small as you can by making the solution infinitely dilute. Put another way, LE may be a lot closer to homeopathy than many 'experts' and 'thought-leaders' would like you to believe.

So that's the end of blogging for the year at Molecular Design.  I wish all readers a happy, successful and metric-free 2016.

Thursday, 5 November 2015

The rise and fall of rational drug design

I'll be taking a look at a thought-provoking article (which got me chuckling several times) on drug design by my good friend Brooke Magnanti in this blog post. I've known Brooke for quite a few years and I'll start this post with some photos taken by her back in 1998 on a road trip that took us from Santa Fe to Carlsbad via Roswell and then to White Sands and the Very Large Array before returning to Santa Fe. The people in these photos (Andrew Grant, Anthony Nicholls, Roger Sayle and) also appear in Brooke's article and you can see Brooke reflected in my sunglasses. The football photos are a great way to remember Andrew who died in his fiftieth year while running and they're also a testament to Andrew's leadership skills because I don't think anybody else could have got us playing football at noon in the gypsum desert that is White Sands. We can only guess what Noël Coward would have made of it all.


What Brooke captures in her article on rational drug design is the irrational optimism that was endemic in the pharma/biotech industry of the mid-to-late nineties and she also gives us a look inside what used to be called 'Info Mesa'.  I particularly liked, 

"The name Info Mesa may be more apt than those Wired editors realised, since the prospects of a paradigm shift in drug development rose rapidly only to flatten out just when everyone thought they were getting to the top."

However, it wasn't just happening in computation and informatics in those days and it can be argued that the emergence of high-throughput screening (HTS) had already taken some of the shine off virtual screening even before usable virtual screening tools became available. The history of technology in drug discovery can be seen as a potent cocktail of hype and hope that dulls the judgement of even the most level-headed. In the early (pre-HTS) days of computational chemistry we were, as I seem to remember the founder of a long-vanished start-up saying, not going to be future-limited. In a number of academic and industrial institutions (although thankfully not where I worked), computational chemists were going to design drugs with computers and any computational chemist who questioned the feasibility of this noble mission was simply being negative. HTS changed things a bit and, to survive, the computational chemist needed to develop cheminformatic skills.

There is another aspect to technology in the pharma/biotech industry which is not considered polite to raise (although I was sufficiently uncouth to do so in this article).  When a company spends a large amount of money to acquire a particular capability, it is in the interests of both vendor and company that the purchase is seen in the most positive light. This can result in advocates for the different technologies expending a lot of energy in trying to show that 'their' technology is more useful and valuable than the other technologies and this can lead to panacea-centric thinking by drug discovery managers (who typically prefer to be called 'leaders'). In drug discovery, the different technologies and capabilities tend to have the greatest impact when deployed in a coordinated manner. For example, the core technologies for fragment-based drug discovery are detection/quantification of weak affinity and efficient determination of structures for fragment-protein complexes. Compound management, cheminformatics and the ability to model protein-ligand complexes all help but, even when used together, these cannot substitute for FBDD's core technologies.  Despite the promises, hype and optimism twenty years ago, so vividly captured by Brooke, small molecule drug discovery is still about getting compounds into assays (and it is likely to remain that way for the foreseeable future).

This is probably a good point to say something about rational drug design. Firstly, it is not a term that I tend to use because it is tautological and we are yet to encounter 'irrational drug design'. Secondly, much of the focus of rational drug design has been identification of starting points for optimization which, by some definitions, is not actually design. I would argue that few technological developments in drug discovery have been directed at the problems of lead optimization. This is not to say that technology has failed to impact on lead optimization. For example, the developments in automation that enabled HTS also led to increased throughput of CYP inhibition assays. One indication of the extent to which technological developments have ignored the lead optimization phase of drug discovery is the almost reverential view that many have of the Rule of 5 (Ro5) almost twenty years after it was first presented. There is some irony here because Ro5 is actually of very limited value in typical lead optimization scenarios in that it provides little or no guidance for how the characteristics of Ro5-compliant compounds can be improved. When rational drug design is used in lead optimization, the focus is almost always on affinity prediction which is only one half of the equation. The other half of that equation is free drug concentration which is a function of dose, location in the body and time. I discussed some of the implications of this in a blog post and have suggested that it may be useful to define measures of target engagement potential when thinking about drug action.

What I wrote in that blog post four years ago would have been familiar to many chemists working in lead optimization twenty years ago and that's another way of saying that lead identification has changed a lot more in the last twenty years than has lead optimization.  Perhaps it is unfair to use the acronym SOSOF (Same Old Shit Only Faster) but I hope that you'll see what I'm driving at.  Free drug concentration is a particular problem when the drug targets are not in direct contact with blood as is the case when the target is intracellular or on the 'dark side' of the blood brain barrier. If you're wondering about the current state of the art for predicting intracellular free drug concentration, I should mention that it is not currently possible to measure this quantity for an arbitrary chemical compound in live humans. That's a good place to leave things although I should mention that live humans were not the subject of Brooke's doctoral studies...
    




Friday, 30 October 2015

Voodoo thermodynamics for dummies


Metrics are like  the heads of the Hydra. Dispatch one and two pop up to take its place.


So #RealTimeChem week is over and it's time to return to the topic of metrics and readers of this blog will be aware that this is a recurring theme here. Sometimes, to give them a more 'hard science feel', drug discovery metrics are cast in thermodynamic terms and 'conversion' of IC50 to free energy provides a good example of the problem. Ligand efficiency (LE) was originally defined by scaling free energy of binding by molecular size and it is instructive to observe how toys are ejected from prams when the thermodynamic basis of LE is challenged.


The most important point to note about a metric is that it's supposed to measure something and, regardless of how much you wave your arms and how noisily you assert the metric's usefulness, the metric still needs to measure. That's why we call it a 'metric' and not a 'security blanket for timid medicinal chemists' nor a 'floatation device for self-appointed experts and wannabe thought-leaders'. To be useful, a metric also has to measure something relevant and, in many drug discovery scenarios, that means being predictive of the chemical or biological behavior of compounds. Drug discovery metrics (and guidelines) are often based on trends in data and the strength of the trend tells us how much weight we should give to metrics and how rigidly we should adhere to guidelines. In the metric business, relevance trumps simplicity and even the most anemic of trends can acquire eye-wateringly impressive significance when powered by enough data.


I'll start my review of the article featured in this post by saying that, had the manuscript been sent to me, the response to the editor would have been something between 'why have you sent this out for review' and 'this manuscript needs to be put out of its misery as swiftly and mercifully as possible'. The article appears to be the write up for material presented in webinar format which was reviewed less than favorably. The authors have made a few changes and what was previously called SEEnthalpy (Simplistic Estimate of Enthalpy) is now called PEnthalpy (Proxy for Enthalpy) but the fatal design flaws in the original metric remain and the review of that webinar will show what happens when you wander by mistake into the mess that metrics make.


Before we try to cone these thermodynamic proxies in the searchlights, it may be an idea to ask why we should worry about enthalpy or entropy when drug action is driven by affinity and free concentration. That's a good question and, to be quite honest, I really don't know the answer. Isothermal titration calorimetry (ITC) is an excellent, label-free method for measuring affinity and enthalpy of binding. However, the idea that the thermodynamic signature for binding of a compound to a protein will somehow be predictive of the behavior of the compound in all sorts of situations that do not involve that protein does seem to be entering the realms of wild conjecture. There is also the question of how isothermal systems like live humans can 'sense' the benefits of an enthalpically-optimized drug. Needless to say, these are questions that some ITC experts and many aspiring thought-leaders would prefer that you didn't think too hard about.


So let's take a look at the thermodynamic proxies which are defined in terms of the total number (HBT) of hydrogen bond donors and acceptors and the number (RB) of rotatable bonds.  The proxies are defined as follows:



 PEnthalpy = HBT/(RB + HBT)                                          (1)

 PEntropy  = RB/(RB + HBT)                                              (2)

 PEnthalpy  +  PEntropy  =  1                                                 (3)

The proxies predict that the enthalpy and entropy changes associated with binding are functions only of ligand structure and therefore are of no value for comparing the thermodynamics for a particular ligand binding to different proteins as one might want to do when assessing selectivity. Equation (3) shows that there is effectively only one metric (what a relief) since the two proxies are perfectly anticorrelated so each is as effective as the other as a predictor of either the enthalpy or entropy changes associated with ligand binding.
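Equation (3) is easy to check numerically. The sketch below uses a handful of made-up (HBT, RB) pairs, which are assumptions for illustration and not values from the featured article, to show that the two proxies always sum to one and that their Pearson correlation is therefore -1.

```python
import math

def p_enthalpy(hbt, rb):
    """Equation (1): PEnthalpy = HBT/(RB + HBT)."""
    return hbt / (rb + hbt)

def p_entropy(hbt, rb):
    """Equation (2): PEntropy = RB/(RB + HBT)."""
    return rb / (rb + hbt)

def pearson(xs, ys):
    """Pearson correlation coefficient, using only the standard library."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical (HBT, RB) pairs; any set with varying ratios gives the same result.
compounds = [(2, 1), (4, 3), (6, 2), (3, 7), (8, 5)]

pe = [p_enthalpy(hbt, rb) for hbt, rb in compounds]
ps = [p_entropy(hbt, rb) for hbt, rb in compounds]

for (hbt, rb), h, s in zip(compounds, pe, ps):
    print(f"HBT={hbt} RB={rb}  PEnthalpy={h:.3f}  PEntropy={s:.3f}  sum={h + s:.3f}")
print(f"Pearson r = {pearson(pe, ps):.6f}")
```

Since PEntropy is just 1 minus PEnthalpy, any model fitted with one proxy can be rewritten in terms of the other by changing the sign of the slope and shifting the intercept.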

Now you may remember in the webinar that one of the authors of the featured article was telling us at 22:43 that "entropy comes from non-direct hydrophobic interactions like rotatable bonds". At least now they seem to realize that the rotatable bonds represent degrees of freedom although I don't get the impression from reading the article that they have a particularly solid grasp of the underlying physicochemical principles. Freezing rotatable bonds is an established medicinal chemistry tactic for increasing affinity and, if successful, we expect it to lead to a more favorable entropy of binding which some self-appointed thought-leaders would assert is a bad way to increase affinity. Trying to keep an open mind on this issue, I suggest that we might follow the lead of British Rail and try to define right and wrong types of entropy.


One of the criticisms that I made of the webinar was that no attempt was made to validate the metrics against measured values of binding enthalpy and entropy. In the article, the metrics are evaluated against a small data set of measured values.  As I mentioned earlier, there is effectively only one metric because the two metrics are perfectly anti-correlated so you need to look beyond the fit of the data to the metrics if you want to assess what I'll call the 'thermodynamic connection'. This means digging into the supplementary information.  I found the following on page 5 of the SI:


-TdeltaS =  159.80522 − 343.46172*Pentropy(RB/(HBT + RB))              (4)

which implies that:

TdeltaS  =  −159.80522 + 343.46172*Pentropy(RB/(HBT + RB))            (5)

These equations tell us that the change in entropy associated with binding actually increases with RB rather than decreasing with RB as one would expect for degrees of freedom that become frozen when the intermolecular complex forms. When you're assessing proxies for thermodynamic quantities it's a really good idea to take a look at the root mean square error (RMSE) for the fit of the quantity to the proxy. The RMSE values for fitting ΔH and TΔS are 28.35 kJ/mol and 29.30 kJ/mol respectively and I will leave it to you, the reader, to decide for yourself whether or not you consider these RMSE values to justify PEnthalpy and PEntropy being called thermodynamic proxies. The alert reader might ask where the units for ΔH and TΔS° came from since neither the article nor the SI provides this information and the answer is that you need to go to the source from which the ΔH and TΔS° values were taken to find out.
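The direction of the trend in equation (5) can be confirmed with a short sketch. The coefficients are those quoted from the SI, while the HBT value and the range of RB are arbitrary choices made purely for illustration.

```python
# Coefficients as quoted in equation (5); values are in the units of the fitted data.
INTERCEPT = -159.80522
SLOPE = 343.46172

def t_delta_s(hbt, rb):
    """Equation (5): TdeltaS = -159.80522 + 343.46172 * RB/(HBT + RB)."""
    return INTERCEPT + SLOPE * rb / (hbt + rb)

# Hold HBT fixed (an arbitrary, hypothetical value) and add rotatable bonds.
HBT = 5
predicted = [t_delta_s(HBT, rb) for rb in range(0, 11)]

for rb, value in enumerate(predicted):
    print(f"RB={rb:2d}  predicted TdeltaS={value:8.2f}")

# For HBT > 0 the derivative with respect to RB is SLOPE * HBT/(HBT + RB)**2,
# which is positive, so the predicted TdeltaS rises with each added bond.
```

Every additional rotatable bond makes the predicted entropy of binding more favorable, which is the opposite of what freezing degrees of freedom on complex formation would lead you to expect.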

Now you'll recall that these thermodynamic proxies predict constant values of ΔH and TΔS° for binding of a given compound to any protein (even those proteins to which it does not bind). The ΔG° values for the compounds in the small data set used to evaluate the thermodynamic proxies lie in a relatively narrow range (i.e. less than the RMSE values mentioned in the previous paragraph) from −37.6 kJ/mol to −57.3 kJ/mol and are not representative of the affinity of these compounds for proteins against which they had not been optimized. Any guesses how the RMSE values for fitting the data would have differed if ΔH and TΔS° values had been used for each compound binding to each of the protein targets?


Now if you've kept up to date with the latest developments in the drug discovery metric field, you'll know that even when the mathematical basis of a metric is fragile, there exists the much-exercised option of touting the metric's simplicity and claiming that it is still useful. Provided that nobody calls your bluff, metrics can prove to be very useful propaganda instruments. The featured article does present examples of data analysis based on using the thermodynamic proxies as descriptors and one general criticism that I will make of this analysis is that most of it is based on the significance rather than the strength of trends. When you tout the significance of a trend, you're saying as much about the size of your data set as you are about the strength of the trend in it. This point is discussed in our correlation inflation article and I'd suggest taking a particularly close look at what we had to say about the analysis in this much-cited article.
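The distinction between significance and strength can be illustrated with a rough sketch. It holds the correlation coefficient fixed at a hypothetical r = 0.1 (so the trend explains 1% of the variance) and only increases the number of data points. The p-value uses a normal approximation to the t distribution, which is an assumption made to keep the example within the standard library and which is least accurate for the smallest sample sizes.

```python
import math

def t_statistic(r, n):
    """t statistic for testing a Pearson correlation r from n points against zero."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

def approx_two_sided_p(r, n):
    """Two-sided p-value using a normal approximation to the t distribution."""
    return math.erfc(abs(t_statistic(r, n)) / math.sqrt(2.0))

R = 0.1  # hypothetical, deliberately anemic correlation
SAMPLE_SIZES = (50, 500, 5000, 50000)

p_values = [approx_two_sided_p(R, n) for n in SAMPLE_SIZES]

for n, p in zip(SAMPLE_SIZES, p_values):
    # The strength (r squared) stays put while the significance changes with n.
    print(f"n={n:6d}  r^2={R * R:.2f}  approximate p={p:.3g}")
```

The strength of the trend is identical in every row, and only the size of the data set changes. This is why a p-value on its own says little about whether a descriptor would be of any use for prediction.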


I'd like to focus on the analysis presented in the section entitled 'GSK PKIS Dataset' and which explored correlations between protein kinase % inhibition and a number of molecular descriptors. The authors state,


"In addition to PEnthalpy, we assessed the correlation across a variety for physicochemical properties including molecular weight, polar surface area, and logP in addition to PEnthalpy  (Fig. 5)"   


This statement is actually inaccurate because they have assessed the significance of the correlations rather than the correlations themselves. Although they may have done the assessment for logP and polar surface area, the results of these assessments do not seem to have materialized in Fig. 5 and we are left to speculate as to why. The strongest correlation between PEnthalpy and % inhibition was observed for CDK3/cyclinE and the plot is shown in Fig. 5b. I invite you, the reader, to ask yourself whether the correlation shown in Fig. 5b would be useful in a drug discovery project.


Since the title of the post mentions voodoo thermodynamics, we should take a look at this in the context of the article and the best place to look is in the Discussion section.  We are actually spoiled for choice when looking for examples of voodoo thermodynamics there but take a look at:


"It is assumed in the literature that the "enthalpically driven compound series" with fewer RBs tend to be (generally) lower MW compounds as well. In contrast, in cases where selectivity is steeper among compounds in a series for which activity and selectivity is likely governed by compounds with relatively more RB versus HBA and HBA [sic], than when the entropic contributors are dominating."


So that's about as much voodoo thermodynamics as I can take for a while so, if it's OK with you, I'll finish by addressing a couple of points to the authors of this article. The flagship product of the company with which the authors are associated is a database system for integrating chemical and biological data. Although I'm not that familiar with this database system, responses to my questions during the course of a discussion in the FBDD LinkedIn group suggested that a number of cheminformatic issues have been carefully thought through and that the database system could be very useful in drug discovery. Firstly, the scientific weaknesses of the featured article could lead to some customers losing confidence in the database system. Secondly, the folk who created the database system (and keep it running) may have only limited opportunities to publish and scientifically weak publications by colleagues who are perhaps less focused on what actually pays the bills may breed some resentment.


That's where I'll wrap up because there is only so much voodoo thermodynamics that one can take in a day so, as we say in Brazil, 'até mais'.

      

Friday, 23 October 2015

From schwefeläther to octanol

So in this blog post, written specially for #RealTimeChem week on an #OldTimeChem theme, I'll start ten years after the Kaiser's grandmother became Queen of the United Kingdom of Great Britain and Ireland. I first came across Ernst Freiherr von Bibra while doing some literature work for an article on predicting alkane/water partition coefficients and learned that he was from an illustrious family of Franconian Prince-Bishops. Von Bibra certainly seems to have been a colorful character who is said to have fought no less than 49 duels as a young man. Presumably these were not the duels to the death that did for poor Galois and Pushkin but more like the München frat house duels that leave participants with the dueling scars that München Fräuleins find so irresistible.

So that's how I learned about von Bibra and it was his 1847 study with Harless 'Die Ergebnisse der Versuche über die Wirkung des Schwefeläthers' (The results of the experiments on the effect of the sulfuric ether) that we cited. Schwefeläther is simply diethyl ether and so named because in 1847 you needed to make it from ethanol and sulfuric acid. Von Bibra was a pioneer in the anesthesia field and proposed that anesthetics like ether exerted their effects by dissolving the fatty fraction of brain cells. Now it's easy in 2015 to scoff at this thinking but remember that in 1847 nobody knew about cell membranes or molecules and you couldn't just pick up the phone and expect the ether to arrive the next day. Put another way, if it was 1847 and I was in the laboratory (or kitchen?) gazing at a bowl of brains and a bottle of ether, I would probably have come to a similar conclusion.

What we know now is that anesthetics (and other drugs) dissolve IN lipids as opposed to dissolving THE lipids. Nobody in 1847 knew about partition coefficients and Walther Nernst didn't articulate his famous distribution law (Verteilung eines Stoffes zwischen zwei Lösungsmitteln und zwischen Lösungsmittel und Dampfraum. Z Phys Chem 8:110–139) until 1891 by which time the Kaiser had already handed Bismarck his P45. Within ten years Ernest Overton and Hans Meyer had shown incredible foresight in using amphibians as animal models and the concept of the cell membrane would soon be introduced.

Before moving on, let's take a look at the 'introduction to partition coefficients' graphic below in which the aqueous phase is marked by the presence of fish (they're actually piranhas and have graced my partition coefficient powerpoints since my first visit to Brazil in 2009). We would describe the compound on the left as lipophilic because its neutral form prefers the organic phase to water and, for now, I'm not going to be too specific about exactly what that organic phase is. The compound on the right prefers to be in the water so we describe it as hydrophilic. The red molecule represents an ionized form of the compound on the left and typically these don't particularly like to go into the organic phase (especially not without counter ions for company). A compound that prefers to be in the organic phase can still be drawn into the aqueous phase by increasing the extent to which it is ionized (e.g. by decreasing pH if the compound is basic).
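The effect of ionization on partitioning can be put into numbers with a minimal sketch. It assumes a compound with a single ionizable group and (this is the big assumption) that only the neutral form enters the organic phase, so that the apparent distribution coefficient D is the partition coefficient P scaled by the neutral fraction in water. The function name and the example log P and pKa values are hypothetical and chosen purely for illustration.

```python
import math


def log_d(log_p: float, pka: float, ph: float, is_base: bool = True) -> float:
    """Apparent log D for a monoprotic compound.

    Assumes that only the neutral form partitions into the organic phase,
    so log D = log P - log10(1 + 10**x), where x = pKa - pH for a base
    and x = pH - pKa for an acid.
    """
    exponent = (pka - ph) if is_base else (ph - pka)
    return log_p - math.log10(1.0 + 10.0 ** exponent)


# Hypothetical base with log P = 3.0 and pKa = 9.0 (illustrative values only)
for ph in (5.0, 7.4, 9.0, 11.0):
    print(f"pH {ph:4.1f}: log D = {log_d(3.0, 9.0, ph):.2f}")
```

For the base, log D falls as pH drops below the pKa, which is the 'drawn into the aqueous phase' behaviour described above. Ion pairing and self-association in the organic phase are ignored here.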

  
Now I'd like to introduce Runar Collander (whom many of you will have heard of) and Calvin Golumbic (whom few of you will have heard of). Let's first take a look at Golumbic's 1949 study of the effects of ionization on the partitioning of phenols between water and cyclohexane. Please observe the responses to pH in Fig 1 in that article but also take a look at equation 5, which accounts for self-association in the organic phase, and the discussion about how a methyl group ortho to the phenolic hydroxyl compromises the hydrogen bonding of that hydroxyl group and has observable effects on the partition coefficient.

Collander's study (The Partition of Organic Compounds Between Higher Alcohols and Water) explores how differences in the organic solvent affect partitioning behaviour of solutes.  Collander presented evidence for strong linear relationships between partition coefficients measured using different alcohols (see Fig 1 in his article) although he notes that compounds with two or more hydrophilic groups in their molecular structure tend to deviate from the trend. Now take a look at Fig 2 in Collander's study which shows a plot of octanol/water partition coefficients against their ether/water equivalents. Now the correlation doesn't look so strong although relationships within chemical families appear to be a lot better. In particular, amines appear to be more soluble in octanol than they are in ether and Collander attributes this to the greater acidity of octanol. 

When reading these articles from over sixty years ago, I'm struck by the way the authors rationalize their observations in physical terms. Don't be misled by what we would regard as the obscure use of language (e.g. Collander's "double molecules" and Golumbic's description of partition coefficients as "true") because the conceptual and linguistic basis of chemistry in 2015 is richer than it was when these pioneering studies were carried out. How these pioneers would have viewed some of the more mindless metrics by which chemists of 2015 have become enslaved can only be speculated about.

So I'm almost done but one last character in this all-star cast has yet to make his entrance and that, of course, is Corwin Hansch. Most drug discovery scientists 'know' that octanol 'defines' lipophilicity and only a small minority actually question the suitability of octanol for this purpose or even ask how this situation came to be. In order to address the second question, let's take a look at what Hansch et al have to say in this 1963 article,


"We have chosen octanol and water as a model system to approximate the effect of step I on the growth reaction in much the same fashion as the classical work of Meyer and Overton rationalized the relative activities of various anesthetics. This assumption is expressed in 2 where P is the partition coefficient (octanol-water) of the auxin.

          A = f(P)                              (2)

Collander has shown that the partition coefficients for a given compound in two different solvent systems (e.g., ether-water, octanol-water) are related as in 3.

          log P1 = a log P2 + b        (3)

This would also indicate, as does the Meyer-Overton work, that it is not unreasonable to use the results from one set of solvents to predict results in a second set". 

Now if you take another look at Collander's article, you'll see that he only claims that a linear relationship exists between partition coefficients when the organic phase is an alcohol. Collander's Fig 2 seems to suggest that Hansch et al's equation 3 cannot be used to relate octanol/water and ether/water partition coefficients. Can Collander's study be used to justify what appears to be a rather arbitrary choice of octanol as a solvent for partition coefficient measurements? That question, I will leave to you, the reader but, if you're interested, let me point you towards a short talk that I did recently at Ripon College. 
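If you'd like to test equation 3 against your own measurements, a minimal sketch of the fit might look like this. It is an ordinary least-squares fit of log P1 against log P2 and nothing more; the function name is hypothetical and the numbers are synthetic (generated from a = 1.1 and b = -0.2), not values taken from Collander or from Hansch et al.

```python
def fit_collander(log_p2, log_p1):
    """Least-squares fit of log P1 = a * log P2 + b.

    Returns (a, b, residuals), where residuals[i] is the observed
    log P1 minus the fitted value for compound i.
    """
    n = len(log_p2)
    if n < 2 or n != len(log_p1):
        raise ValueError("need two equal-length lists with at least two points")
    mean_x = sum(log_p2) / n
    mean_y = sum(log_p1) / n
    sxx = sum((x - mean_x) ** 2 for x in log_p2)
    if sxx == 0.0:
        raise ValueError("log P2 values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_p2, log_p1))
    a = sxy / sxx
    b = mean_y - a * mean_x
    residuals = [y - (a * x + b) for x, y in zip(log_p2, log_p1)]
    return a, b, residuals


# Synthetic, illustrative data only: constructed to lie exactly on a line
solvent2 = [-1.0, 0.0, 1.0, 2.0]
solvent1 = [-1.3, -0.2, 0.9, 2.0]
a, b, residuals = fit_collander(solvent2, solvent1)
print(f"a = {a:.2f}, b = {b:.2f}")
```

With real data, the residuals are where the interest lies. Grouping them by chemical family would be one way to see whether, for example, amines sit systematically off the line for an octanol/ether comparison.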


See you next year at #RealTimeChem week and don't forget to take a look at Laura's nails which get my vote for highlight of the week.

Monday, 19 October 2015

Halogen bonding and the curious case of the poisoned dogs

In this blog post, written specially for #RealTimeChem week on an #OldTimeChem theme, I'm going to start in the current century and work back to just a couple of years after the Kaiser's grandmother became Queen of the United Kingdom of Great Britain and Ireland. Readers might want to consider how history might have turned out differently had her eldest child succeeded her to the throne.

One can be forgiven for thinking that halogen bonding is a new phenomenon. The term halogen bonding refers to attractive interactions between halogens and hydrogen bond acceptors which should be repulsive because hydrogen bond acceptors and halogens are electronegative and would therefore be expected to carry negative partial charges. Nevertheless halogens (other than fluorine) do seem to rather enjoy the company of hydrogen bond acceptors and molecular interaction 'catalogs' such as A Medicinal Chemist's Guide to Molecular Interactions and Molecular Recognition in Chemical and Biological Systems ensure that the medicinal chemist of 2015 is made aware of the importance of halogen bonding and of potential opportunities for exploiting it.

I was first made aware of halogen bonding in the mid 1990s through interactions with Zeneca colleagues at Jealott's Hill and, some time after that, The Nature and Geometry of Intermolecular Interactions between Halogens and Oxygen or Nitrogen was published with one of those colleagues (who had by then moved to CCDC) as a co-author. A few years ago, I reproduced some of the analysis from that paper for a talk and here is one of the slides which shows how the closest contacts between the carbonyl oxygen and halogen are observed when the two atoms approach along the axis defined by the halogen and the carbon atom to which it is bound.


If the halogen is bound to something that is sufficiently electron-withdrawing, the molecular electrostatic potential (MEP) on a circular patch on the van der Waals surface of the halogen becomes positive and sometimes this is described as a σ-hole. This article shows that the σ-hole is most pronounced for iodine and non-existent for fluorine, which is consistent with what X-ray crystal structures tell us about halogen bonding. I created a slightly different picture of the σ-hole for my talk which shows MEP as a function of distance along two orthogonal directions of approach to the chlorine. The significance of the σ-hole is that you don't actually have to invoke polarization to 'explain' halogen bonding even though it will always happen when atoms get up close and personal.


When I first encountered halogen bonding, I remembered learning about the reaction of iodine with iodide anion as a schoolboy in Port-of-Spain in the 1970s. Iodine is not particularly water-soluble and this is a way to coax it into solution. That iodine forms complexes with Lewis bases has been known for many years and the Nantes group, better known for their extensive studies of hydrogen bond basicity, have used this to develop a halogen bond basicity scale based on this chemistry.

So in the spring of 2008 I found myself charged with doing a talk on halogens at EuroCUP (OpenEye European user group meeting) and halogen bonding was clearly going to be an important topic. To understand why I was doing this, we need to go back to the 2007 Computer-Aided Drug Design Gordon Conference and specifically the election of the vice chair for the 2009 conference. My good friend Anthony Nicholls was one of the nominees and in his candidacy speech said that he was going to have a session on halogens.  Although Ant didn't win that election, this is by far the most lucid and sensible suggestion that I've heard in a GRC candidacy speech. Needless to say the session on halogens happened in Strasbourg at EuroCUP (the morning after the conference dinner in a winery) and it was on the bus ride to that dinner that I had this conversation with my favorite Austro-Hungarian:

 "Onkel Hugo, where are you from in Germany?   
I am from AUSTRIA!   
Is that in Bavaria?"

While preparing the harangue, I decided to follow the iodine/iodide trail to see how far back it led and I was thrilled to encounter The Periodides by Albert B Prescott, writing in 1895, in which it was noted that, 

"In I839 Bouchardat, a medical writer in Paris, recounts that, when dogs were being surreptitiously poisoned with strychnine in Paris, and an antidote was asked for, first Guibourt recommended powdered galls, and then Donne advised iodine tincture, whereupon Bouchardat himself, approving the use of iodine, said they should use it in potassium iodide solution."

Thirty-one years after the nefarious activities of the notorious Parisian dog poisoners, a sudden influx of Prussian visitors demonstrated the inadequacy of Strasbourg's supply of sunbeds and I probably should now let you take a look at that infamous EuroCUP2008 harangue.

Monday, 12 October 2015

A PAINful convolution of fact with opinion?


So once more I find myself blogging on the subject of PAINS, although now in the wake of the 2015 Nobel Prize for medicine, which will have forced many drug-likeness 'experts' onto the back foot. This time the focus is on antifungal research and the article in question has already been reviewed In The Pipeline. It's probably a good idea to restate my position on PAINS so that I don't get accused of being a Luddite who willfully disregards evidence (I'm happy to be called a heretic who willfully disregards sermons on the morality of ligand efficiency metrics and the evils of literature pollution). Briefly, I am well aware that not all output from biological assays smells of roses although I have suggested that deconvolution of fact from opinion is not always quite as straightforward as some of the 'experts' and 'thought leaders' would have you believe. This three-part series of blog posts ( 1 | 2 | 3 ) should make my position clear but I'll try to make this post as self-contained as possible so you don't have to dig around there too much.

So it's a familiar story in antifungals: more publications but fewer products. The authors of the featured article note, “However, we believe that one key reason is the meager quality of some of these new inhibitors reported in the antifungal literature; many of which contain undesirable features” and I'd have to agree that if the features really do cause compounds to choke in (or before) development then the features can legitimately be described as 'undesirable'. As I've said before, it is one thing to opine that something looks yucky but another thing entirely to establish that it really is yucky. In other words, it is not easy to deconvolute fact from opinion. The authors claim, “It is therefore not surprising to see a long list of papers reporting molecules that are fungicidally active by virtue of some embedded undesirable feature. On the basis of our survey and analysis of the antifungal literature over the past 5 years, we estimate that those publications could cover up to 80% of the new molecules reported to have an antifungal effect”. I have a couple of points to make here. Firstly, it would be helpful to know what proportion of that 80% have embedded undesirable features that are actually linked to relevant bad behavior in experimental studies. Secondly, this is a chemistry journal so please say 'compound' when you mean 'compound' because 80% of even just a mole of molecules is a shit-load of molecules.

So it's now time to say something about PAINS but I first need to make the point that embedded undesirable features can lead to different types of bad behavior by compounds. Firstly, the compound can interact with assay components other than target protein and I'll term this 'assay interference'. In this situation, you can't believe 'activity' detected in the assay but you may be able to circumvent the problem by using an orthogonal assay. Sometimes you can assess the seriousness of the problem and even correct for it as described in this article. A second type of bad behaviour is observed when the compound does something unpleasant to the target (and other proteins) and I'd include aggregators and redox cyclers in this class along with compounds that form covalent bonds with the protein in an unselective manner. The third type of bad behaviour is observed when the embedded undesirable feature causes the compound to be rapidly metabolized or otherwise 'ADME-challenged'.  This is the sort of bad behaviour that lead optimization teams have to deal with but I'll not be discussing it in this post.

The authors note that the original PAINS definitions were derived from observation of “structural features of frequent hitters from six different and independent assays”. I have to admit to wondering what is meant by 'different and independent' in this context and it comes across as rather defensive. I have three questions for the authors. Firstly, if you had the output of 40+ screens available, would you consider the selection of six AlphaScreen assays to be an optimal design for an experiment to detect and characterize pan-assay interference? Secondly, are you confident that singlet oxygen quenching/scavenging by compounds can be neglected as an interference mechanism for these 'six different and independent AlphaScreen assays'? Thirdly, how many compounds identified as PAINS in the AlphaScreen assays were shown to bind covalently to one or more of their targets in the original PAINS article?

The following extract from the article should illustrate some of the difficulties involved in deconvoluting fact from opinion in the PAINS literature and I've annotated it in red. Here is the structure of compound 1:

"One class of molecules often reported as antifungal are rhodanines and molecules containing related scaffolds. These molecules are attractive, since they are easily prepared in two chemical steps. An example from the recent patent literature discloses (Z)-5-decylidenethiazolidine-2,4-dione (1) [It's not actually a rhodanine and don't even think about bringing up the fact that my friends at Practical Fragments have denounced both TZDs and thiohydantoins as rhodanines] as a good antifungal against Candida albicans (Figure2).(16) As a potential carboxylic acid isostere, thiazolidine-2,4-diones may be sufficiently unreactive such that they can progress some way in development.(17) Nevertheless, one should be aware of the thiol reactivity associated with this type of molecule, as highlighted by ALARM NMR and glutathione assays,(18) [I can't access reference 18 but rhodanines appear to represent its primary focus. Are you able to present evidence for thiol reactivity for compound 1? What is the pKa for compound 1 and how might this be relevant to its ability to function as a Michael acceptor?] and this is especially relevant when the compound contains an exocyclic alkene such as in 1. The fact that rhodanines are promiscuous compounds has been recently highlighted in the literature.(13, 18) [Maybe rhodanines really are promiscuous but, at the risk of appearing repetitious, 1, is not a rhodanine and when we start extrapolating observations made for rhodanines to other structural types, we're polluting fact with opinion. 
Also the evidence for promiscuity presented in reference 13 is frequent-hitter behavior in a panel of six AlphaScreen assays and this can't be invoked as evidence for thiol reactivity because rhodanines lacking the exocyclic carbon-carbon double bond are associated with even greater PAIN levels than rhodanines that can function as Michael acceptors] The observed antifungal activity of 1 is certainly genuine; however, this could potentially be the result of in vivo promiscuous reactivity in which case the main issue lies in the potential lack of selectivity between the fungi and the other exposed living organisms" [This does come across as arm-waving. Have you got any evidence that there really is an issue? It's also worth remembering that, once you move away from the AlphaScreen technology, rhodanines and their cousins are not the psychopathic literature-polluters of legend and it is important for PAINS advocates to demonstrate awareness of this article.]

The authors present chemical structures of thirteen other compounds that they find unwholesome and I certainly wouldn't be volunteering to optimize any of the compounds presented in this article. As an aside, when projects are handed over at transition time, it is instructive to observe how perceptions of compound quality differ between those trying to deliver a project and those charged with accepting it. My main criticism of the article is that very little evidence that bad things are actually happening is presented. The authors seem to be of the view that reactivity towards thiols is necessarily a Bad Thing. While formation of covalent bonds between ligands and proteins may be frowned upon in Switzerland (at least on Sundays), it remains a perfectly acceptable way to tame errant targets. A couple of quinones also appear in the rogues' gallery and it needs to be pointed out that the naphthoquinone atovaquone is one of the two components (the other is proguanil which would also cause many compound quality 'experts' to spit feathers if they had failed to recognize it as an approved drug) of the antimalarial drug Malarone. I have actually taken Malarone on several occasions and the 'quinone-ness' of atovaquone worries me a great deal less than the potential neuropsychiatric effects of the (arguably) more drug-like mefloquine that is a potential alternative. My reaction to a 'compound quality' advocate who told me that I should be taking a more drug-like malaria medication would be a two-fingered gesture that has occasionally been attributed to the English longbowmen at Agincourt. Some of the objection to quinones appears to be due to their ability to generate hydrogen peroxide via redox cycling and, when one makes this criticism of compounds, it is a good idea to demonstrate that one is at least aware that hydrogen peroxide is an integral component of the regulatory mechanism of PTP1B.

This is a good point to wrap things up. I just want to reiterate the importance of making a clear distinction in science between what you know and what you believe. This echoes Feynman who is reported to have said that, "The first principle is that you must not fool yourself and you are the easiest person to fool". Drug discovery is really difficult and I'm well aware that we often have to make decisions with incomplete information. When basing decisions on conclusions from data analysis, it is important to be fully aware of any assumptions that have been made and of limitations in what I'll call the 'scope' of the data. The output of forty high throughput screens that use different detection technologies is more likely to reveal genuinely pathological behavior than the output of six high throughput screens that all use the same detection technology. One needs to be very careful when extrapolating frequent hitter behavior to thiol reactivity and especially so when using results from a small number of assays that use a single detection technology. This article on antifungal PAINS is heavy on speculation (I counted 21 instances of 'could', 10 instances of 'might' and 5 instances of 'potentially') and light on evidence. I'm not denying that some (much?) of the output from high throughput screens is of minimal value and one key challenge is how to detect and characterize bad chemical behavior in an objective manner. We need to think very carefully about how the 'PAINS' term should be used and the criteria by which compounds are classified as PAINS. Do we actually need to observe pan-assay interference in order to apply the term 'PAINS' to a compound or is it simply necessary for the compound to share a substructural feature with a minimum number of compounds for which pan-assay interference has been observed? How numerous and diverse must the assays be?
The term 'PAINS' seems to get used more and more to describe any type of bad behavior (real, suspected or imagined) by compounds in assays and a case could be made for going back to talking about 'false positives' when referring to generic bad behavior in screens.
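To make the point about the 'scope' of the data a little more concrete, here is a rough sketch of how one might ask whether a frequent hitter has been observed to hit across detection technologies or only within one. Everything here is hypothetical: the function name, the compound and assay identifiers and the technology labels are invented for illustration and are not taken from the PAINS article or from any real screening collection.

```python
from collections import defaultdict


def technologies_hit(hits, assay_technology):
    """Map each compound to the set of detection technologies it hit in.

    hits: iterable of (compound_id, assay_id) pairs, one per observed hit.
    assay_technology: dict mapping assay_id to a detection technology label.
    """
    result = defaultdict(set)
    for compound, assay in hits:
        result[compound].add(assay_technology[assay])
    return dict(result)


# Hypothetical assays and hits (illustrative only)
assay_technology = {
    "A1": "AlphaScreen",
    "A2": "AlphaScreen",
    "A3": "AlphaScreen",
    "F1": "fluorescence",
    "S1": "SPR",
}
hits = [
    ("cpd_X", "A1"), ("cpd_X", "A2"), ("cpd_X", "A3"),
    ("cpd_Y", "A1"), ("cpd_Y", "F1"), ("cpd_Y", "S1"),
]

for compound, techs in sorted(technologies_hit(hits, assay_technology).items()):
    print(compound, len(techs), sorted(techs))
```

Both hypothetical compounds hit three assays, but only one of them has been seen to misbehave under more than one detection technology. A count like this says nothing about mechanism (interference, aggregation or covalent modification), which still has to be established experimentally.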

And I think that I'll leave it there. Hopefully I've provided some food for thought.