So it looks like the post-publication peer review of
the article featured in the Expertitis blog post didn't go down too well but, as
Derek said recently, “…open, post-publication peer review in science is here,
and it's here to stay. We'd better get used to it”. In this post, I’m going to take a look at
the webinar that drew me to that article and I hope that participants in
that webinar will find the feedback to be both constructive and educational.
Metrics feature prominently in this webinar and the thing that worries me most is the Simplistic Estimate of Enthalpy
(SEEnthalpy). Before getting stuck into
the thermodynamics, it’ll be necessary to point out some errors so please go to
20:10 (slide entitled ‘ligand efficiency lessons’) in the webinar.
The ‘ligand efficiency lessons’ slide concedes that LE has
been criticized in print but then incorrectly asserts that the criticism has
been rebutted in print. It is well-known
that Mike Schultz has criticized ( 1 | 2 | 3 ) LE as being mathematically
invalid and the counter to this is that LE (written as free energy of binding
divided by number of non-hydrogen atoms) is a mathematically valid expression (even though the metric itself is thermodynamic nonsense). Nevertheless, Mike still identified the Achilles' heel of LE which is that a linear response of affinity to molecular size has to have zero intercept in order for that linear response to represent constant LE. It is also somewhat ironic that the
formula for LE used in the rebuttal to Mike's criticism is itself mathematically invalid because it includes a logarithm of a quantity with units of concentration. However, another criticism (article and related
blog posts 1 | 2 | 3 ) has been made of LE and this has not been rebutted in
print. The ‘ligand efficiency lessons’
slide also asserts that "LE tends to decline with ligand size" and it should also
be pointed out that some of the size dependency of LE is an artefact of the
arbitrary choice of 1 mole per litre as
the standard concentration (article and blog posts 1 | 2 ). The Ligand Efficiency: Nice Concept, Shame About the Metrics presentation may also be helpful.
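The standard concentration point can be made concrete with a small sketch. It assumes LE is defined as the negative of the standard free energy of binding divided by the number of non-hydrogen atoms, with the free energy written as RT ln(Kd/C°). The two compounds, the temperature of 298.15 K and the function name are illustrative assumptions and are not taken from the webinar.

```python
import math

R_KCAL = 0.0019872   # gas constant in kcal/(mol*K)
TEMP_K = 298.15      # assumed temperature

def ligand_efficiency(kd_molar, n_heavy, c_std_molar=1.0):
    """LE = -dG/N with dG = RT ln(Kd / C_std), in kcal/mol per heavy atom."""
    # Written as RT ln(C_std / Kd) / N, which is the same quantity.
    return R_KCAL * TEMP_K * math.log(c_std_molar / kd_molar) / n_heavy

# Two hypothetical compounds: a 1 mM fragment and a 1 nM lead.
fragment = {"kd_molar": 1e-3, "n_heavy": 10}
lead = {"kd_molar": 1e-9, "n_heavy": 30}

for c_std in (1.0, 1e-3):  # 1 M (conventional) and 1 mM standard concentrations
    le_fragment = ligand_efficiency(c_std_molar=c_std, **fragment)
    le_lead = ligand_efficiency(c_std_molar=c_std, **lead)
    print(f"C_std = {c_std:g} M: fragment LE = {le_fragment:.2f}, lead LE = {le_lead:.2f}")
```

With a 1 M standard concentration the two compounds come out equally efficient, but with 1 mM the fragment has an LE of zero while the lead does not. The perception of which compound is more efficient therefore depends on a choice of concentration unit, which is the heart of the criticism.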
This is a good point to introduce SEEnthalpy and, were I
writing a satire on drug discovery metrics, I could scarcely do better than
this. Perhaps even ‘Enthalpy
– the musical’ with lyrics by Leonard Cohen (Some girls wander by mistake/ Into
the mess that metrics make)? Let’s pick
this up at 22:43 in the webinar with the ‘Conventional wisdom – simplistic
view’ slide which teaches us that rotatable bonds are hydrophobic interactions
(you really did hear that correctly, “entropy comes from non-direct,
hydrophobic interactions like rotatable bonds"). I don’t happen to agree with ‘conventional’
or ‘wisdom’ in this context although I do fully endorse ‘simplistic’. Let’s
take a closer look at SEEnthalpy which is defined (24:45) from the numbers of hydrogen
bond donors (#HBD), hydrogen bond acceptors (#HBA) and rotatable bonds (#RB):
SEEnthalpy = (#HBD + #HBA)/(#HBD + #HBA + #RB)
As I’ve mentioned before, definitions of hydrogen bonding
groups can be ambiguous (as can be definitions of rotatable bonds) but I don’t
want to get sidetracked by this right now because there are more pressing
issues to deal with. You can learn a lot about metrics simply by thinking about
them (have a look at comments on LELP in this article) and one question that you might like to ask yourselves is what is
#RB (which they’re telling us is associated with entropy) doing in an enthalpy metric? This may be a good time to note that the contribution of an intermolecular contact (or group of contacts) to the changes in enthalpy or entropy associated with binding is not, in general, an experimental observable. It could also be a good idea to take a look at A Medicinal Chemist's Guide to Molecular Interactions and Ligand Binding Thermodynamics in Drug Discovery: Still a Hot Tip?
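To make it easier to think about the metric, here is a minimal sketch of the SEEnthalpy formula as it is defined above. The function name and the counts are hypothetical, and how hydrogen bond donors, acceptors and rotatable bonds are actually counted is left to whatever definitions you prefer.

```python
def seenthalpy(n_hbd: int, n_hba: int, n_rb: int) -> float:
    """SEEnthalpy = (#HBD + #HBA) / (#HBD + #HBA + #RB)."""
    polar = n_hbd + n_hba
    total = polar + n_rb
    if total == 0:
        # The ratio is undefined when all three counts are zero.
        raise ValueError("SEEnthalpy is undefined when #HBD, #HBA and #RB are all zero")
    return polar / total

# Hypothetical counts: same hydrogen bonding groups, more rotatable bonds.
print(seenthalpy(2, 3, 5))    # 5 / 10
print(seenthalpy(2, 3, 10))   # 5 / 15
```

Adding rotatable bonds lowers the 'enthalpy' metric even though nothing about the hydrogen bonding groups has changed. Note also that the metric knows nothing about the target, so a given compound gets the same value against every protein.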
I have to admit that I found the reasoning behind SEEnthalpy to be unconvincing but, given that the metric appears to have the
endorsement of all participants of the webinar, perhaps we should at
least give them the chance to demonstrate that it is meaningful. If I were
introducing an enthalpy metric then the first thing that I’d do would be to
(try to) show that it measured what it was claimed to measure (we call something a metric because it measures something and not because it's easy to remember the formula or because we've thought up a cute acronym). As it
turns out, the Binding Database is public and simply oozing with the
thermodynamic data that could be used to investigate the relationship between metric and reality. This makes how they’ve chosen to evaluate the metric seem somewhat bizarre.
The first test of SEEnthalpy was performed using a TB
activity dataset. About 1000 of the
compounds in this dataset were found to be active by high throughput screening
(HTS) and the remaining 100-ish had come from structure-based drug design (SBDD). The differences between the two groups of
compounds were found to be significant although it is not clear what to make of
this. One interpretation is that the HTS
compounds are screening hits (possibly from phenotypic screens) and the SBDD
compounds have been optimized. If this
is the case, it will not be too difficult to perceive differences between the two
groups of compounds and doing so does not represent one of the more pressing
prediction problems in drug discovery.
This is probably a good time to note that correlation does not equal
causation. The other point worth making
is that observing that two mean values are significantly different doesn’t tell
us about the size of the effect which is more relevant to prediction. If you want to illustrate the strength of the
trend (as opposed to its statistical significance) then you need to show standard deviations with the mean values rather than just standard errors. If this is unfamiliar territory then why not
take a look at this article and make it more familiar?
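The difference between significance and effect size can be sketched with summary statistics alone. The group sizes below echo the roughly 1000 HTS and 100 SBDD compounds, but the means and standard deviations are made up for illustration and are not taken from the webinar.

```python
import math

def standard_error(sd, n):
    """Standard error of a mean: shrinks as the sample gets larger."""
    return sd / math.sqrt(n)

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch t statistic: difference in means scaled by its standard error."""
    return (mean2 - mean1) / math.sqrt(sd1**2 / n1 + sd2**2 / n2)

def cohens_d(mean1, sd1, n1, mean2, sd2, n2):
    """Cohen's d: difference in means scaled by the pooled standard deviation."""
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    return (mean2 - mean1) / math.sqrt(pooled_var)

# Hypothetical summary statistics (mean, SD, n) for the two groups.
hts = (0.50, 0.20, 1000)
sbdd = (0.56, 0.20, 100)

print("SE (HTS):", round(standard_error(hts[1], hts[2]), 4))
print("SE (SBDD):", round(standard_error(sbdd[1], sbdd[2]), 4))
print("Welch t:", round(welch_t(*hts, *sbdd), 2))
print("Cohen's d:", round(cohens_d(*hts, *sbdd), 2))
```

In this made-up case the t statistic is comfortably above 2, so the difference in means would be declared significant, yet the means are separated by less than a third of a standard deviation and the two distributions overlap heavily. Error bars drawn from standard errors would hide that overlap.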
The next test of SEEnthalpy was to investigate its
correlation with biological activity and the chosen activity measure was
%inhibition in protein kinase assays.
Typically when we model and analyze activity we use pIC50 or pKi (see this
webinar, for example) and it is not at all clear why %inhibition was used instead
given the vast amount of public data available in ChEMBL. One problem with using %inhibition as a
measure of activity is that it has a very limited dynamic range and there
really is a reason that people waste all that time measuring a concentration response
so that they can calculate pIC50 (or pKi). Let’s pick up the webinar at 27:10 and at 28:13
a plot will appear. This plot appears to
have been generated by ordering compounds by SEEnthalpy and then plotting
%inhibition against order. If you look
at the plot you’ll notice that the %inhibition values for most of the compounds
are close to zero indicating that they are essentially inactive at the chosen test
concentration which means that the correlation between %inhibition and SEEnthalpy will be
very weak. But follow the webinar to
28:32 and you will learn that, “…for this particular kinase, there was a really
clear relationship between SEEnthalpy and the activity”. I’ll leave it to you, the reader, to decide
for yourself how clear that trend actually is.
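The limited dynamic range of %inhibition is easy to see in a quick sketch. It assumes a simple concentration response with a Hill slope of 1 and a single test concentration of 10 µM; both are assumptions for illustration, since the assay conditions are not given in the webinar.

```python
def percent_inhibition(pic50, conc_molar, hill=1.0):
    """%inhibition at one test concentration for a given pIC50 (Hill equation)."""
    ic50_molar = 10.0 ** (-pic50)
    return 100.0 / (1.0 + (ic50_molar / conc_molar) ** hill)

TEST_CONC = 1e-5  # assumed test concentration: 10 micromolar

for pic50 in (3, 4, 5, 6, 7, 8, 9):
    print(f"pIC50 = {pic50}: {percent_inhibition(pic50, TEST_CONC):6.2f} %inhibition")
```

Compounds with pIC50 values of 7 and 9 differ 100-fold in potency but by less than one percentage point in %inhibition at this concentration, while anything much weaker than the test concentration looks essentially inactive. Most of the potency scale gets squashed against one end or the other.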
Let’s go to 28:42 in the webinar, at which point the question is
posed as to whether SEEnthalpy correlates more strongly than other metrics with
activity. However, when we get to 28:53,
we discover that it is actually statistical significance (p values) rather than
strength of correlation that is the focus of the analysis. It is
really important to remember that given sufficiently large samples, even the
most anemic of trends can acquire eye-wateringly impressive statistical significance
and it is the correlation coefficient not the p value that tells us how strong a
trend is. Once again, I am forced to steer the participants of the webinar towards this article and this time, in order to reinforce the message, I'll also link an article which illustrates the sort of tangle into which you can get if you confuse the strength of a trend with its statistical significance.
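Here is a rough sketch of how an anemic trend acquires an impressive p value as the sample grows. The correlation coefficient of 0.05 and the sample sizes are made up, and the p value uses a normal approximation to the t distribution, which is an assumption that is only reasonable for large samples.

```python
import math

def t_statistic(r, n):
    """t statistic for testing a Pearson correlation r from n data points."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

def approx_two_sided_p(t):
    """Two-sided p value using a normal approximation (large n only)."""
    return math.erfc(abs(t) / math.sqrt(2.0))

r = 0.05  # hypothetical correlation: r squared is 0.0025
for n in (100, 1000, 10000, 100000):
    p = approx_two_sided_p(t_statistic(r, n))
    print(f"n = {n:6d}: r = {r}, r^2 = {r * r:.4f}, p ~ {p:.1e}")
```

The correlation coefficient, and therefore the strength of the trend, is identical in every row. Only the p value changes, and it does so because of the sample size rather than because the metric has become any more predictive.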
I think this is a good point at which to wrap things up. I'll start by noting that attempting to educate the drug discovery community and shape the thinking of that community's members can backfire if weaknesses in one's grasp of the fundamental science become apparent. For example, confusing the statistical significance of a trend with its strength in a webinar like this may get people asking uncouth questions about the machine learning models that were mentioned (although not discussed) during the webinar. I'll also mention that the LE values are presented in the webinar without units and I'll finish off by posing the question of whether it is correct, especially for cell-based assays, to convert IC50 to molar energy for the purpose of calculating LE.
1 comment:
Isn't the main problem here, among many, that if this "metric" were predictive, it would suggest that to increase enthalpy, you simply add more H-bond donors and acceptors and reduce the rotatable bonds? As Pete points out, rotatable bonds will have nothing to do with enthalpy and H-bonding could increase or decrease enthalpy depending on how the orientation of H-bond groups align with complementary (or otherwise) groups in the receptor. I seem to remember this used to be called a pharmacophore.
If the metric were predictive, inositol (all isomers equally) would always be much more active than hexane, say, regardless of the protein target.