Sunday, 12 July 2015

Molecular recognition, controlled experiments and discrete chemical space

A recent Curious Wavefunction post (The fundamental philosophical dilemma of chemistry) got me thinking a bit.  Ash notes,

“For me the greatest philosophical dilemma in chemistry is the following: It is the near impossibility of doing controlled experiments on the molecular level. Other fields also suffer from this problem, but I am constantly struck by how directly one encounters it in chemistry”.  

I have a subtly different take on this, although I do think that Ash is most definitely pointing the searchlight in the right direction.  His post was also discussed recently by members of the LinkedIn Computational Chemistry Group.

Ash illustrates the problem from a perspective that will be extremely familiar to anybody with experience in pharmaceutical molecular design,

“Let’s say I am interested in knowing how important a particular hydrogen bond with an amide in the small molecule is. What I could do would be to replace the amide with a non hydrogen-bonding group and then look at the affinity, either computationally or experimentally. Unfortunately this change also impacts other properties of the molecules; its molecular weight, its hydrophobicity, its steric interactions with other molecules. Thus, changing a hydrogen bonding interaction also changes other interactions, so how can we then be sure that any change in the binding affinity came only from the loss of the hydrogen bond? The matter gets worse when we realize that we can’t even do this experimentally”

The “gets worse” acknowledges that the contribution of an interaction to affinity is not an experimental observable.  As we noted in a recent critique of ligand efficiency metrics,

“In general, the contribution of a particular intermolecular contact (or group of contacts) to affinity (or the changes in enthalpy, entropy, heat capacity or volume associated with binding) cannot be measured experimentally”

It’s easy to see why this should be the case for interaction of a drug with its target because this happens in an aqueous environment and the binding event is coupled to changes in solvent structure that are non-local with respect to the protein-ligand interactions.  As former colleagues at AZ observe in their critique of enthalpy optimization,

“Both ΔH° and ΔS° are typically sums over an enormous amount of degrees of freedom, many contributions are opposing each other and comparable in amplitude, and in the end sums up to a comparably smaller number, ΔG°”

Water certainly complicates interpretation of binding thermodynamics in terms of structure but, even in gas phase, the contributions of intermolecular contacts are still no more observable experimentally than atomic charges. 

Let’s think a bit about what would constitute a controlled experiment in the context of generating a more quantitative understanding of the importance of individual interactions.  Suppose that it’s possible to turn a hydrogen bond acceptor into (apolar) hydrocarbon without changing the remainder of the molecule. We could measure the difference in affinity or potency (ΔpKd or ΔpIC50) between the compound of interest and its analog in which the hydrogen bond acceptor has been transformed into hydrocarbon. The Figure below shows how this can be done and the pIC50 values (taken from this article) can be interpreted as evidence that hydrogen bonds to one of the pyridazine nitrogen atoms are more beneficial for potency against cathepsins S and L2 than against cathepsin L.

Although analysis like this can provide useful insight for design, it still doesn’t tell us what absolute contributions these hydrogen bonds make to potency because our perception depends on our choice of reference. Things get more complicated if we are trying to assess the importance of the hydrogen bond donor and acceptor characteristics of a secondary amide in determining activity. We might replace the carbonyl oxygen atom with a methylene group to assess the importance of hydrogen bond acceptor ability... er... maybe not.  The amide NH could be replaced with a methylene group but this will reduce the hydrogen bond acceptor strength of the carbonyl oxygen atom as well as changing the torsional preferences of the amidic bond.  This illustrates the difficulty, highlighted by Ash, of performing controlled experiments when trying to dissect the influences of different molecular properties on the behavior of compounds.

The above examples raise another issue that rarely, if ever, gets discussed.  Although chemical space is vast, it is still discrete at the molecular level and that may prove to be an even bigger dilemma than not being able to perform controlled experiments at the molecular level.  As former colleagues and I suggested in our FBDD screening library article, fragment-based approaches may enable chemical space to be sampled at a more controllable resolution.  Could it be that fragments may have advantages in dealing with the discrete nature of chemical space?

Wednesday, 1 July 2015

Into the mess that metrics make

So it looks like the post publication peer review of the article featured in the Expertitis blog post didn't go down too well but as Derek said recently, “…open, post-publication peer review in science is here, and it's here to stay. We'd better get used to it”.   In this post, I’m going to take a look at the webinar that drew me to that article and I hope that participants in that webinar will find the feedback to be both constructive and educational.

Metrics feature prominently in this webinar and the thing that worries me most is the Simplistic Estimate of Enthalpy (SEEnthalpy).   Before getting stuck into the thermodynamics, it’ll be necessary to point out some errors so please go to 20:10 (slide entitled ‘ligand efficiency lessons’) in the webinar.   

The ‘ligand efficiency lessons’ slide concedes that LE has been criticized in print but then incorrectly asserts that the criticism has been rebutted in print.  It is well-known that Mike Schultz has criticized ( 1 | 2 | 3 ) LE as being mathematically invalid and the counter to this is that LE (written as free energy of binding divided by number of non-hydrogen atoms) is a mathematically valid expression (even though the metric itself is thermodynamic nonsense). Nevertheless, Mike still identified the Achilles' heel of LE which is that a linear response of affinity to molecular size has to have zero intercept in order for that linear response to represent constant LE. It is also somewhat ironic that the formula for LE used in the rebuttal to Mike's criticism is itself mathematically invalid because it includes a logarithm of a quantity with units of concentration.  However, another criticism (article and related blog posts 1 | 2 | 3 ) has been made of LE and this has not been rebutted in print.   The ‘ligand efficiency lessons’ slide also asserts that "LE tends to decline with ligand size" and it should also be pointed out that some of the size dependency of LE is an artefact of the arbitrary choice of 1 mole per litre as the standard concentration (article and blog posts 1 | 2 ).   The Ligand Efficiency: Nice Concept, Shame About the Metrics presentation may also be helpful.
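
The point about the standard concentration can be made concrete with a little arithmetic. What follows is a minimal sketch, assuming LE is written as the negative standard free energy of binding (from Kd relative to a standard concentration) divided by the number of non-hydrogen atoms. The two compounds and the temperature are hypothetical values that I have chosen purely for illustration and they do not come from any dataset. Changing the standard concentration adds the same constant to every free energy, which is equivalent to changing the intercept of a linear response of affinity to molecular size.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol K)
T = 298.15         # temperature in K (assumed for illustration)


def ligand_efficiency(kd_molar, heavy_atoms, c_std=1.0):
    """Return LE = -dG/N with dG = RT ln(Kd / c_std); c_std is in mol/L."""
    delta_g = R_KCAL * T * math.log(kd_molar / c_std)
    return -delta_g / heavy_atoms


# Hypothetical compounds (illustrative values only)
fragment = {"kd": 1e-3, "heavy_atoms": 10}  # 1 mM binder with 10 heavy atoms
lead = {"kd": 1e-9, "heavy_atoms": 30}      # 1 nM binder with 30 heavy atoms

# The same pair of compounds viewed with three choices of standard concentration
for c_std in (10.0, 1.0, 1e-3):
    le_fragment = ligand_efficiency(fragment["kd"], fragment["heavy_atoms"], c_std)
    le_lead = ligand_efficiency(lead["kd"], lead["heavy_atoms"], c_std)
    print(f"C = {c_std:g} M  LE(fragment) = {le_fragment:.3f}  LE(lead) = {le_lead:.3f}")
```

With these made-up numbers the two compounds have the same LE when the standard concentration is 1 M, the fragment looks more efficient at 10 M and the lead looks more efficient at 1 mM. Nothing about the compounds has changed, only the unit in which we chose to express affinity.
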

This is a good point to introduce SEEnthalpy and, were I writing a satire on drug discovery metrics, I could scarcely do better than this.  Perhaps even ‘Enthalpy – the musical’ with lyrics by Leonard Cohen (Some girls wander by mistake/ Into the mess that metrics make)?  Let’s pick this up at 22:43 in the webinar with the ‘Conventional wisdom – simplistic view’ slide which teaches us that rotatable bonds are hydrophobic interactions (you really did hear that correctly, “entropy comes from non-direct, hydrophobic interactions like rotatable bonds”). I don’t happen to agree with ‘conventional’ or ‘wisdom’ in this context although I do fully endorse ‘simplistic’. Let’s take a closer look at SEEnthalpy which is defined (24:45) from the numbers of hydrogen bond donors (#HBD), hydrogen bond acceptors (#HBA) and rotatable bonds (#RB):

SEEnthalpy = (#HBD + #HBA)/(#HBD + #HBA + #RB)
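
For anybody who wants to play with the formula, here is a minimal sketch of it as a function. The function name and the decision to raise an error when all three counts are zero are my own choices (the formula is simply undefined in that case) and the counts in the usage lines are arbitrary.

```python
def se_enthalpy(n_hbd, n_hba, n_rb):
    """Return (#HBD + #HBA) / (#HBD + #HBA + #RB) from three integer counts."""
    polar = n_hbd + n_hba
    total = polar + n_rb
    if total == 0:
        # No donors, acceptors or rotatable bonds: the ratio is 0/0
        raise ValueError("SEEnthalpy is undefined when all three counts are zero")
    return polar / total


# Arbitrary counts chosen only to show how the ratio behaves
print(se_enthalpy(2, 4, 0))  # no rotatable bonds
print(se_enthalpy(2, 4, 6))  # same polar groups, six rotatable bonds
print(se_enthalpy(1, 2, 3))  # half the size, same ratio as the line above
```

The function sees only three counts, so any two molecules that share those counts get the same value regardless of where the groups are or what they interact with.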

As I’ve mentioned before, definitions of hydrogen bonding groups can be ambiguous (as can be definitions of rotatable bonds) but I don’t want to get sidetracked by this right now because there are more pressing issues to deal with. You can learn a lot about metrics simply by thinking about them (have a look at comments on LELP in this article) and one question that you might like to ask yourselves is what is #RB (which they’re telling us is associated with entropy) doing in an enthalpy metric?  This may be a good time to note that the contribution of an intermolecular contact (or group of contacts) to the changes in enthalpy or entropy associated with binding is not, in general, an experimental observable.  Could also be a good idea to take a look at A Medicinal Chemist's Guide to Molecular Interactions and Ligand Binding Thermodynamics in Drug Discovery: Still a Hot Tip? 

I have to admit that I found the reasoning behind SEEnthalpy to be unconvincing but, given that the metric appears to have the endorsement of all participants of the webinar, perhaps we should at least give them the chance to demonstrate that it is meaningful. If I were introducing an enthalpy metric then the first thing that I’d do would be to (try to) show that it measured what it was claimed to measure (we call something a metric because it measures something and not because it's easy to remember the formula or because we've thought up a cute acronym). As it turns out, the Binding Database is public and simply oozing with the thermodynamic data that could be used to investigate the relationship between metric and reality.  This makes how they’ve chosen to evaluate the metric seem somewhat bizarre.

The first test of SEEnthalpy was performed using a TB activity dataset.  About 1000 of the compounds in this dataset were found to be active by high throughput screening (HTS) and the remaining 100-ish had come from structure-based drug design (SBDD).   The differences between the two groups of compounds were found to be significant although it is not clear what to make of this.  One interpretation is that the HTS compounds are screening hits (possibly from phenotypic screens) and the SBDD compounds have been optimized.  If this is the case, it will not be too difficult to perceive differences between the two groups of compounds and doing so does not represent one of the more pressing prediction problems in drug discovery.  This is probably a good time to note that correlation does not equal causation.  The other point worth making is that observing that two mean values are significantly different doesn’t tell us about the size of the effect which is more relevant to prediction.  If you want to illustrate the strength of the trend (as opposed to its statistical significance) then you need to show standard deviations with the mean values rather than just standard errors.  If this is unfamiliar territory then why not take a look at this article and make it more familiar.
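
The difference between significance and effect size is easy to see with synthetic numbers. The sketch below is an illustration under stated assumptions and not a reanalysis of the webinar data: the group sizes echo the 1000 and 100-ish compounds mentioned above, but the means and standard deviations are invented, and the samples are built deterministically from evenly spaced normal quantiles so that the result does not depend on a random seed.

```python
import math
from statistics import NormalDist, mean, stdev


def quantile_sample(mu, sigma, n):
    """Deterministic, roughly normal sample built from evenly spaced quantiles."""
    dist = NormalDist(mu, sigma)
    return [dist.inv_cdf((i + 0.5) / n) for i in range(n)]


def summarize(values):
    """Return sample size, mean, standard deviation and standard error."""
    n = len(values)
    sd = stdev(values)
    return {"n": n, "mean": mean(values), "sd": sd, "se": sd / math.sqrt(n)}


# Invented metric values for two groups (sizes mirror the text, values do not)
hts = summarize(quantile_sample(0.50, 0.15, 1000))
sbdd = summarize(quantile_sample(0.55, 0.15, 100))

diff = sbdd["mean"] - hts["mean"]

# Welch t statistic: difference in means scaled by its standard error
welch_t = diff / math.sqrt(hts["se"] ** 2 + sbdd["se"] ** 2)

# Cohen's d: difference in means scaled by the pooled standard deviation
pooled_var = ((hts["n"] - 1) * hts["sd"] ** 2 + (sbdd["n"] - 1) * sbdd["sd"] ** 2) / (
    hts["n"] + sbdd["n"] - 2
)
cohens_d = diff / math.sqrt(pooled_var)

print(f"HTS : mean={hts['mean']:.3f} sd={hts['sd']:.3f} se={hts['se']:.4f}")
print(f"SBDD: mean={sbdd['mean']:.3f} sd={sbdd['sd']:.3f} se={sbdd['se']:.4f}")
print(f"Welch t={welch_t:.2f}  Cohen's d={cohens_d:.2f}")
```

The standard errors are small because the groups are large, which is what drives the t statistic, while the standard deviations show that the two distributions overlap heavily. Error bars drawn from standard errors would hide that overlap.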

The next test of SEEnthalpy was to investigate its correlation with biological activity and the chosen activity measure was %inhibition in protein kinase assays.  Typically when we model and analyze activity we use pIC50 or pKi (see this webinar, for example) and it is not at all clear why %inhibition was used instead given the vast amount of public data available in ChEMBL.  One problem with using %inhibition as a measure of activity is that it has a very limited dynamic range and there really is a reason that people waste all that time measuring a concentration response so that they can calculate pIC50 (or pKi).   Let’s pick up the webinar at 27:10 and at 28:13 a plot will appear.   This plot appears to have been generated by ordering compounds by SEEnthalpy and then plotting %inhibition against order.  If you look at the plot you’ll notice that the %inhibition values for most of the compounds are close to zero, indicating that they are essentially inactive at the chosen test concentration, which means that the correlation between %inhibition and SEEnthalpy will be very weak.  But follow the webinar to 28:32 and you will learn that, “…for this particular kinase, there was a really clear relationship between SEEnthalpy and the activity”.  I’ll leave it to you, the reader, to decide for yourself how clear that trend actually is.
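
To see why %inhibition has such a limited dynamic range, it helps to convert pIC50 into the %inhibition expected at a single test concentration. This is a minimal sketch that assumes a simple concentration response with a Hill slope of one and complete inhibition at saturation. The 10 µM test concentration is my assumption because the concentration used in the webinar is not stated here.

```python
def percent_inhibition(pic50, conc_molar, hill=1.0):
    """Expected %inhibition at one concentration for a simple Hill curve."""
    ic50 = 10.0 ** (-pic50)  # pIC50 is the negative log10 of IC50 in mol/L
    return 100.0 / (1.0 + (ic50 / conc_molar) ** hill)


TEST_CONC = 1e-5  # 10 micromolar, an assumed single-point test concentration

# Seven log units of potency squeezed into a 0-100 scale
for pic50 in (3, 4, 5, 6, 7, 8, 9):
    print(f"pIC50={pic50}  %inhibition={percent_inhibition(pic50, TEST_CONC):.2f}")
```

Under these assumptions, compounds that differ by a hundredfold in potency at the top of the range differ by less than one percentage point in %inhibition, and weak compounds pile up near zero in the same way.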

Let’s go to 28:42 in the webinar, at which point the question is posed as to whether SEEnthalpy correlates more strongly than other metrics with activity.   However, when we get to 28:53, we discover that it is actually statistical significance (p values) rather than strength of correlation that is the focus of the analysis.   It is really important to remember that given sufficiently large samples, even the most anemic of trends can acquire eye-wateringly impressive statistical significance and it is the correlation coefficient not the p value that tells us how strong a trend is. Once again, I am forced to steer the participants of the webinar towards this article and this time, in order to reinforce the message, I'll also link an article which illustrates the sort of tangle into which you can get if you confuse the strength of a trend with its statistical significance.
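
The way that sample size inflates significance can be shown without any data at all. The sketch below holds a weak correlation fixed and varies only the number of points, assuming the usual t statistic for a Pearson correlation coefficient. To keep it free of dependencies I have used a normal approximation to the t distribution, so the p value for the smallest sample is only rough.

```python
import math


def approx_p_value(r, n):
    """Two-sided p value for a Pearson r from n points (normal approximation)."""
    t = r * math.sqrt((n - 2) / (1.0 - r * r))
    return math.erfc(abs(t) / math.sqrt(2.0))


r = 0.1  # a weak trend: r squared is 0.01

# Same strength of trend, increasing sample size
for n in (30, 300, 3000, 30000):
    print(f"n={n:>6}  r={r:.2f}  r^2={r * r:.2f}  p~{approx_p_value(r, n):.2e}")
```

The correlation coefficient, and hence the fraction of variance explained, is identical in every row. Only the p value changes, and it does so by many orders of magnitude.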

I think this is a good point at which to wrap things up. I'll start by noting that attempting to educate the drug discovery community and shape the thinking of that community's members can backfire if weaknesses in one's grasp of the fundamental science become apparent.  For example, confusing the statistical significance of a trend with its strength in a webinar like this may get people asking uncouth questions about the machine learning models that were mentioned (although not discussed) during the webinar.  I'll also mention that the LE values are presented in the webinar without units and I'll finish off by posing the question of whether it is correct, especially for cell-based assays, to convert IC50 to molar energy for the purpose of calculating LE.