Wednesday, 4 June 2014

Ligand efficiency metric teaser

Here's a bit of fun. See how these parallel lines transform when you divide Y by X (this is how you calculate ligand efficiency).

Here's some more fun. See how these lines transform when you subtract X from Y (this is how you calculate lipophilic efficiency).

Would you consider activities lying on a straight line when plotted against molecular size or lipophilicity to represent compounds with the same efficiency?
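If you'd like to play with the numbers rather than the pictures, here is a minimal sketch of the two transformations. The slope, intercepts and X values are made up for illustration and the function names are my own. It shows that Y/X is constant along a line only when the intercept is zero and that Y - X is constant along a line only when the slope is one.

```python
def line(slope, intercept, xs):
    """Return Y values lying on a straight line when plotted against X."""
    return [slope * x + intercept for x in xs]


def divide_by_x(xs, ys):
    """Ligand-efficiency style transform: Y / X."""
    return [y / x for x, y in zip(xs, ys)]


def subtract_x(xs, ys):
    """Lipophilic-efficiency style transform: Y - X."""
    return [y - x for x, y in zip(xs, ys)]


# Hypothetical values: three parallel lines sharing one slope.
xs = [10.0, 20.0, 40.0]
slope = 0.25
intercepts = [0.0, 2.0, 4.0]

for c in intercepts:
    ys = line(slope, c, xs)
    # Y/X equals slope + c/X, so the lines fan out unless c is zero.
    # Y-X equals (slope - 1)*X + c, so the lines stay parallel but still tilt.
    print(c, divide_by_x(xs, ys), subtract_x(xs, ys))
```

Compounds on the same straight line only share a single value of Y/X when that line passes through the origin.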

Friday, 23 August 2013

To review or not to review?

I'll admit that I don’t always see eye-to-eye with journal editors although it has been a while since I last blogged on this theme.  Recently we published an article on prediction of alkane/water partition coefficients and, as was noted at the end of the blog post, the article had originally been submitted to another journal which declined to review the submission.  Much is written about peer review and, if your manuscript actually gets reviewed, you’ll get a good idea about why the journal said nein.  Put another way, there is some accountability even if you don’t like the outcome.
It can be very different when the journal declines to review and I see this as a sort of shadow land in the manuscript review world.  You often (usually?) don’t get any useful feedback as to why your submission was not even worthy of review and the explanation will often be that the manuscript lacked broad appeal.  Sometimes the manuscript will be farmed out to a member of the editorial advisory board who will effectively make the decision for the editor but you never get to see what actually got said.   This is why I talk about a shadow land. 
Our alkane/water partition coefficient manuscript was originally submitted to the Journal of Chemical Information and Modeling and I really did think it was a JCIM sort of manuscript.  Those open science enthusiasts reading this might be interested to know that we were providing source code (albeit source that used the proprietary OEChem toolkit from OpenEye) and the data that was used to build and test the model.  I went through the submission process which included creating a graphical abstract and providing email addresses for potential reviewers (whom I contacted to see if it’d be OK).  The response from JCIM was swift:

'Thank you for considering the Journal of Chemical Information and Modeling for your manuscript submission. The Journal is committed to promoting high standards in the fields that it covers, so we will be quite selective. Most ACS journals including JCIM are also experiencing substantial increases in submissions that are burdening the editorial and reviewing systems. Consequently, most editors are now prescreening and returning ca. 30% of submissions without further review.

For JCIM, some typical problems that will lead to return of a manuscript by the Editor include papers that:
(A) fall outside the aims and scope of JCIM, sometimes characterized by few if any references to papers in JCIM or other mainstream journals in computational chemistry,
(B) are routine studies, do not provide significant new results, or promote new methods without demonstration of advantages over prior alternatives,
(C) center on developments and interest in research areas that are tangential to the scope of JCIM, or
(D) are characterized by poor presentation of the work that is inconsistent with the style and quality of typical papers in JCIM. In particular, QSAR/QSPR submissions are welcome, but authors must be careful to avoid problems under heading (B).

In my judgment, your submission is inappropriate for JCIM.  The journal has limited interest in QSAR/QSPR studies; ones that are considered need to adhere to the guidelines published in JCIM 46, 937 (2006).
I regret any inconvenience, but hope that this prompt return will allow you to consider alternative journals without further delay.'
I was genuinely surprised by the decision but take the view that journals can publish anything they want to. Nobody is entitled to have their manuscripts published.  Unless of course they have paid for the service as is the case for Open Access.  I was a bit alarmed that the journal appeared to be using number of references to its own articles ('…fall outside the aims and scope of JCIM, sometimes characterized by few if any references to papers in JCIM or other mainstream journals in computational chemistry') as a criterion for deciding whether to send a manuscript for review. However, life must go on and so I replied as follows:

'Thank you for your prompt response to our manuscript submission. While we are disappointed to learn that JCIM is not interested in prediction of alkane/water partition coefficients, we fully accept your decision and will seek a more appropriate home for the manuscript.' (07-Mar-2013)

The manuscript was duly submitted to JCAMD which accepted it but just as the article went to press, I noticed that JCIM had recently published an article on alkane/water partition coefficients.  I was aware that our article had cited two or three articles from JCIM’s predecessor JCICS but only one from JCIM itself.  Given that it appeared that JCIM were actually interested in alkane/water partitioning, I sought specific feedback as to why our submission was not considered worth sending for review:

'Although our alkane/water partition article is currently in press at JCAMD, I must now confess to being perplexed by your decision to reject the manuscript without review given the recent publication of "Structural Determinants of Drug Partitioning in n-Hexadecane/Water System" in JCIM.  The possibility that number of references to JCIM articles might be used as a criterion for deciding whether to send a manuscript for review did worry me (and continues to do so) when you first communicated the decision in March.  I would be interested to know if you are able to comment more specifically on this matter.' (23-May-2013)

JCIM have not yet responded to my query and I am still perplexed.

Literature cited
Kenny, Montanari, Prokopczyk (2013) ClogPalk: A method for predicting alkane/water partition coefficient. JCAMD 27:389-402 DOI

Natesan, Wang, Lukacova, Peng, Subramaniam, Lynch, Balaz (2013) Structural determinants of drug partitioning in n-hexadecane/water system. JCIM 53:1424-1435 DOI

Tuesday, 13 August 2013

Malan's Ten Rules for Air Fighting

I'm going to introduce the #SavingPharma hashtag in this post but I'm also going to introduce WW2 fighter pilot (and apartheid opponent) Adolph Gysbert Malan (better known as Sailor).  Some of what these combat pilots had to say is still relevant and resonant today.  One of my favorites is the cricketer (and former Mosquito pilot) Keith Miller responding to an interviewer's question about the pressures of Test Cricket by noting that, "Pressure is a Messerschmitt up your arse, playing cricket is not".  Malan's Ten Rules for Air Fighting can teach those who would claim to be leading Drug Discovery and I reproduce them here:

1. Wait until you see the whites of his eyes. Fire short bursts of one to two seconds only when your sights are definitely "ON".

2. Whilst shooting think of nothing else, brace the whole of your body: have both hands on the stick: concentrate on your ring sight.

3. Always keep a sharp lookout. "Keep your finger out".

4. Height gives you the initiative.

5. Always turn and face the attack.

6. Make your decisions promptly. It is better to act quickly even though your tactics are not the best.

7. Never fly straight and level for more than 30 seconds in the combat area.

8. When diving to attack always leave a proportion of your formation above to act as a top guard.

9. INITIATIVE, AGGRESSION, AIR DISCIPLINE, and TEAMWORK are words that MEAN something in Air Fighting.

10. Go in quickly - Punch hard - Get out

Sunday, 4 August 2013

Presentations on slideshare

I recently uploaded some presentations to slideshare including my recent Gordon conference talk and the slide set used on visits to a couple of Pharmaceutical companies before the conference.  My slideshare profile is pwkenny and there are currently 13 presentations uploaded.  Be warned that there is plenty of overlap between some slide sets and it is fair to say that there are more presentations than themes.  I hope that you'll find the slide sets useful.

Monday, 29 July 2013

Some reflections on computer-aided drug design (after attending CADD Gordon conference)

I’ve just returned to Trinidad where I’ve been spending the summer.  I was in the USA at the Computer-Aided Drug Design (CADD) Gordon Conference (GRC) organized by Anthony Nicholls and Martin Stahl.  The first thing that I should say is that this will not be a conference report because what goes on at all Gordon Conferences is confidential and off-record. This is intended to make discussions freer and less inhibited and you won’t see proceedings of GRCs published.  Nevertheless, the conference program is available online so I think that it’ll be OK to share some general views of the CADD field (which have been influenced by what I picked up at the conference) even though commenting on specific lectures, posters or discussions would be verboten.

The focus of the conference was Statistics in CADD and the stuff that I took most notice of was the use of Bayesian methods although I still need to get my head round things a bit more. Although modelling affinity/potency and output of ADMET assays (e.g. hERG blockade) tends to dominate the thinking of CADD scientists, the links between these properties and clinical outcomes in live humans are not as strong as many assume.  Could the wisdom of Rev Bayes be applied to PK/PD modeling and the design of clinical trials?  I couldn’t help worrying about closet Bayesians in search of a good posterior and what would be the best way to quantify the oomph of a ROC curve...

Reproducibility is something that needs to be addressed in CADD studies and if we are to improve this we must be prepared to share both data and methods (e.g. software).  This open access article should give you an idea of some of the issues and directions in which we need to head. Journal editors have a part to play here and must resist the temptation to publish retrospective analyses of large proprietary data sets because of the numbers of citations that they generate.  At the same time, journal editors should not be blamed for supplemental information ending up in PDF format.  For example, I had no problems (I just asked) getting JMC (2008), JCIM (2009) and JCAMD (2013 and 2013) to publish supplemental information in text (or zipped text) format.

When you build models from data, it is helpful to think of signal and noise.  The noise can be thought of as coming from both the model and from the data and in some cases it may be possible to resolve it into these two components.  The function of Statistics is to provide an objective measure of the relative magnitudes of signal and noise but you can’t use Statistics to make noise go away (not that this stops people from trying).  Molecular design can be defined as control of behavior of compounds and materials by manipulation of molecular properties and can be thought of as being prediction-driven or hypothesis-driven.   Prediction-driven molecular design is about building predictive models but it is worth remembering that much (most?) pharmaceutical design involves a significant hypothesis-driven component. One way of thinking about hypothesis-driven molecular design is as a framework for assembling structure-activity/property relationships (SAR/SPR) as efficiently as possible but this is not something that statistical methodology currently appears equipped to do particularly well.

The conference has its own hashtag (#grccadd) and appeared to out-tweet the Sheffield Cheminformatics conference which ran concurrently.  Some speakers have shared their talks publicly and a package of statistical tools created especially for the conference is available online.

Literature cited and links to talks
WP Walters (2013) Modeling, informatics, and the quest for reproducibility. JCIM 53:1529-1530 DOI

CC Chow, Bayesian and MCMC methods for parameter estimation and model comparison. Link

N Baker, The importance of metadata in preserving and reusing scientific information Link

PW Kenny, Tales of correlation inflation.  Link

CADD Stat Link

Sunday, 14 July 2013

Prediction of alkane/water partition coefficients

Those of you who follow this blog will know that I have a long-standing interest in alkane/water partition coefficients and I’d like to tell you a bit about the ClogPalk model for predicting these from molecular structure that we published during my time in Brasil. Some years ago we explored prediction of ΔlogP (logPoct - logPalk) from calculated molecular electrostatic potentials and this can be thought of as treating the alkane/water partition coefficient as a perturbation of the octanol/water partition coefficient.  One disadvantage of this approach is that it requires access to logPoct and I was keen to explore other avenues.  The correlation of logPalk with computed molecular surface area (MSA) is excellent for saturated hydrocarbons and I wondered if this class of compound might represent a suitable reference state for another type of perturbation model.  Have a look at Fig 1 which shows plots of logPalk against MSA for saturated hydrocarbons (green), aliphatic alcohols (red) and aliphatic diols (blue).  You can see how adding a single hydroxyl group to a saturated hydrocarbon shifts logPalk down by about 4.5 units and adding two hydroxyl groups shifts logPalk further still.

The perturbations are defined substructurally using SMARTS notation. Specifically, each perturbation term consists of a SMARTS definition for the relevant functional group and a decrement term (e.g. 4.5 units for alcohol hydroxyl).  The model also allows functional groups to interact with each other.  For example, an intramolecular hydrogen bond ‘absorbs’ some of a molecule’s polarity and manifests itself as an unexpectedly high logPalk value.  Take a look at this article if you’re interested in this sort of thing.  The interaction terms can be thought of as perturbations of perturbations. The ClogPalk model is shown in Fig 2.
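To make the bookkeeping concrete, here is a minimal sketch of how a perturbation model of this general shape could be evaluated. It is not the published ClogPalk code. The reference slope and intercept, the SMARTS string, the interaction name and its value are all placeholders that I've made up for illustration (only the 4.5 unit hydroxyl decrement comes from the discussion above) and the sketch assumes that substructure matching has already been done by a toolkit so that match counts are simply passed in.

```python
# Hypothetical reference line for saturated hydrocarbons: logPalk = A + B * MSA.
# These are placeholder numbers, not the fitted ClogPalk parameters.
REF_INTERCEPT = 0.0
REF_SLOPE = 0.03

# Decrement for each functional group, keyed by an illustrative SMARTS string.
# The 4.5 unit decrement for alcohol hydroxyl is the value quoted in the post.
DECREMENTS = {"[OX2H][CX4]": 4.5}

# Interaction terms ('perturbations of perturbations') give some polarity back.
# The name and the value are hypothetical.
INTERACTIONS = {"intramolecular_hbond": 1.0}


def predict_logpalk(msa, group_counts, interaction_counts=None):
    """Reference-line value minus group decrements plus interaction terms.

    msa: molecular surface area
    group_counts: {SMARTS: number of matches}, assumed computed elsewhere
    interaction_counts: {interaction name: number of occurrences}
    """
    value = REF_INTERCEPT + REF_SLOPE * msa
    for smarts, count in group_counts.items():
        value -= DECREMENTS[smarts] * count
    for name, count in (interaction_counts or {}).items():
        value += INTERACTIONS[name] * count
    return value


hydrocarbon = predict_logpalk(200.0, {})
alcohol = predict_logpalk(200.0, {"[OX2H][CX4]": 1})
print(hydrocarbon - alcohol)  # shift caused by one hydroxyl at fixed MSA
```

Keeping the decrements and interaction terms in plain lookup tables means that each new functional group is just one more SMARTS definition and one more number.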

The performance of the model against external test data is shown in Fig 3.  There do appear to be some issues with some of the data and measured values of logPalk were found to differ by two or more units for some compounds (Atropine, Propranolol, Papaverine).  Also there are concerns about the self-consistency of the measurements for Cortexolone, Cortisone and Hydrocortisone. Specifically, the logPalk of Cortexolone (-1.00) is actually lower than that for its keto analogue Cortisone (-0.55).

The software was built using OpenEye programming toolkits (OEChem and Spicoli) and you’ll find the source code and makefiles in the supplementary information with all the data used to parameterize and test the models. It’s not completely open source because you’ll need a license from OpenEye to actually run the software.  However, the documentation for the toolkits is freely available online and you may even be able to get an evaluation license to see how things work.  You’ll also find the source code for SSProFilter in the supplemental material and this is an improved (it also profiles) version of the Filter program that I put together with the Daylight toolkit back in 1996. Very useful for designing screening libraries and you might want to take a look at this post on SMARTS from a couple of years ago.

There's some general discussion in the article that is not specific to the ClogPalk model and I'll mention it briefly since I think this is relevant to molecular design. Those of you who believe that the octanol/water partition coefficient is somehow fundamental might like to trace how we ended up with this particular partitioning system.  We also address the question of whether logP or logD is the more appropriate measure of lipophilicity and some ligand efficiency stuff from an earlier post makes its journal debut.
That’s about all I wanted to say for now and I’ll finish by noting that the manuscript was originally submitted to another journal but that's going to be the subject of a post all of its very own...
Literature cited
Toulmin, Wood, Kenny (2008) Toward prediction of alkane/water partition coefficients. J Med Chem 51:3720-3730 DOI
Kenny, Montanari, Prokopczyk (2013) ClogPalk: A method for predicting alkane/water partition coefficient. JCAMD 27:389-402 DOI

Friday, 26 April 2013

Thermodynamics and molecular interactions

So it’s #RealTimeChem week on twitter and I thought I’d get into the spirit with a blog post.  The article that I’ve selected for review focuses on non-additivity of functional group contributions to affinity.  The protein in question is Thrombin and ligand binding was characterised using protein crystallography and isothermal titration calorimetry.  

Before reviewing the article, it’s probably a good idea to articulate my position on the thermodynamics of ligand-protein binding.  Firstly, G, H and S are three state functions, each of which can be written in terms of the other two, but only one of which is directly relevant to the binding of ligands to proteins.  Kd is no less thermodynamic than ΔH° or ΔS° and the contribution of a particular intermolecular contact to ΔG° (or ΔH° for that matter) is not in general an experimental observable.  Thermodynamics with state functions is like accountancy in that if you over-pay one interaction, the other interactions will lose out.
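The bookkeeping can be sketched in a few lines, using the standard relations ΔG° = RT ln(Kd/C°) and ΔG° = ΔH° - TΔS°. The function names, the 1 M standard concentration and the ΔH° value in the usage lines are my own illustrative choices and are not taken from the article under review.

```python
import math

R = 8.314462618  # gas constant in J/(mol K)


def delta_g_from_kd(kd_molar, temperature=298.15, standard_conc=1.0):
    """Standard binding free energy (J/mol) from a dissociation constant (M)."""
    return R * temperature * math.log(kd_molar / standard_conc)


def delta_s_from_g_and_h(delta_g, delta_h, temperature=298.15):
    """Standard entropy change (J/(mol K)) implied by dG = dH - T*dS."""
    return (delta_h - delta_g) / temperature


# Hypothetical inhibitor: Kd of 1 nM and a made-up ITC enthalpy of -40 kJ/mol.
dg = delta_g_from_kd(1.0e-9)
ds = delta_s_from_g_and_h(dg, -40000.0)
print(dg / 1000.0, ds)  # dG in kJ/mol (roughly -51) and dS in J/(mol K)
```

Once any two of ΔG°, ΔH° and ΔS° are fixed, the third follows, which is the accountancy point made above.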

Now back to the featured article.  One observation presented as evidence for non-additivity is that the slopes of plots of ΔG° against hydrophobic contact area differ according to whether X (see figure below) is H or NH2.  Inhibitors with the amino group bind with greater affinity than the corresponding inhibitors lacking the amino group which interacts (in its cationic form) with the protein.   However, the difference is not constant and actually increases with hydrophobic contact area.   I would certainly agree that something interesting is going on and if we’re ever going to understand ligand-protein binding then combining affinity measurements for structurally-related ligands with structural information will be very useful.
About enthalpy measurements, I am a lot less sure.  Isothermal titration calorimetry (ITC) is a direct, label-free method for measuring affinity but the fact that you get ΔH° from the experiment at no extra cost doesn’t by itself make ΔH° useful.     If measured values of ΔH° lead to improved predictions of ΔG° or provide clear insight into the nature of the interactions between ligand and protein then I certainly agree that we should make use of ΔH°.  However, there is still the problem that there is no unique way to distribute ΔG°  (or, for that matter, ΔH°) over intermolecular contacts and biomolecular recognition also takes place in aqueous media.    The cohesiveness of liquid water that drives hydrophobic association in aqueous media is a consequence of strong, cooperative hydrogen bonds between water molecules and I like to think of the hydrophobic force as a non-local, indirect, electrostatic interaction.  This non-local nature of hydrophobic interactions complicates interpretation of affinity measurements in structural terms.

Let's see what the authors have to say:

"Analysis of the individual crystal structures and factorizing the free energy into enthalpy and entropy demonstrates that the binding affinity of the ligands results from a mixture of enthalpic contributions from hydrogen bonding and hydrophobic contacts, and entropic considerations involving an increasing loss of residual mobility of the bound ligands."

I'm going to put my cards on the table and say that I believe this statement represents an exercise in arm waving.    See what you think and ask yourself whether this statement would help you design a higher affinity Thrombin inhibitor.  It's also worth thinking carefully about the relationship between mobility and entropy.   One way of looking at entropy is as the degree to which systems are constrained and a more highly constrained system will be less mobile.    The authors state:

"The present study shows, by use of crystal structure analysis and isothermal titration calorimetry for a congeneric series of thrombin inhibitors, that extensive cooperative effects between hydrophobic contacts and hydrogen bond formation are intimately coupled via dynamic properties of the formed complexes."

I think that one needs to be very careful when talking about 'dynamic properties' in the context of equilibrium thermodynamics.  Ultimately entropy is determined by the characteristics of potential energy surfaces and Statistical Mechanics tells us that entropy (and other thermodynamic properties) can be calculated from the partition function.  It may be instructive to think how one would use the partition function to put these 'dynamic properties' on a quantitative basis.
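As a toy illustration of what calculating entropy from the partition function means, here is a minimal sketch for a system with a handful of discrete energy levels in the canonical ensemble, using S = k ln Q + U/T. The function name and the energy values are my own illustrative choices and a real ligand-protein complex in water is, of course, vastly more complicated than this.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def canonical_entropy(energies, temperature):
    """Entropy (J/K) from the partition function of discrete energy levels (J)."""
    beta = 1.0 / (K_B * temperature)
    e_min = min(energies)  # shifting the energy zero leaves the entropy unchanged
    weights = [math.exp(-beta * (e - e_min)) for e in energies]
    q = sum(weights)  # partition function relative to the lowest level
    mean_energy = sum(w * (e - e_min) for w, e in zip(weights, energies)) / q
    return K_B * math.log(q) + mean_energy / temperature


# Two-level toy systems at 300 K: the wider energy gap is the more constrained one.
loose = canonical_entropy([0.0, 1.0e-21], 300.0)
tight = canonical_entropy([0.0, 1.0e-20], 300.0)
print(loose > tight)
```

Nothing in this calculation refers to how fast anything moves. The entropy depends only on which states are accessible and how they are weighted.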

I'm now going to change direction because there is one factor that the authors appear not to have considered and I think that it could be quite important.  The ligand amino group (see figure above; it's protonated under assay conditions) interacts with the protein but it can also affect affinity in another way.  Have a look at this figure (showing the structure of the complex with 3e) from the article which shows how one of the inhibitors lacking the amino group binds.  Now I'd like you to look at one of the dihedral angles in the ligand structure and you can see how it is defined by looking at the substructures inset in the histograms below.

The histograms show the distributions of (the absolute value of) this dihedral angle observed for the instances of substructure in the CSD. The histogram on the left suggests that the dihedral angle will tend to be 180° when the carbon next to the carbonyl carbon has two attached hydrogens. The histogram on the right shows how a substituent (X ≠ H) on that carbon shifts the distribution of dihedral angles.  Now let's go back to the structure of complex 3e, which can be downloaded from the PDB as refcode 2ZIQ, and we can see that the relevant dihedral angle is 117°.  Comparing the two histograms tells us that a substituent (such as an amino group) on the carbon next to the carbonyl group will tend to stabilise the bound conformations of these inhibitors in addition to making direct interactions with the protein.  Do the ITC results tell us this?
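If you want to measure this sort of dihedral angle yourself from a set of coordinates, here is a minimal sketch of the usual calculation (the angle between the normals of the two planes defined by the four atoms). The function names are mine and the coordinates in the usage lines are made up rather than taken from 2ZIQ. Only the absolute value is returned, to match the histograms.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def abs_dihedral(p1, p2, p3, p4):
    """Absolute dihedral angle (degrees) for four points, 0 = cis and 180 = trans."""
    b1, b2, b3 = _sub(p2, p1), _sub(p3, p2), _sub(p4, p3)
    n1 = _cross(b1, b2)  # normal to the plane containing p1, p2, p3
    n2 = _cross(b2, b3)  # normal to the plane containing p2, p3, p4
    cosine = _dot(n1, n2) / math.sqrt(_dot(n1, n1) * _dot(n2, n2))
    cosine = max(-1.0, min(1.0, cosine))  # guard against rounding error
    return math.degrees(math.acos(cosine))


# Made-up coordinates describing a trans arrangement about the p2-p3 bond.
print(abs_dihedral((0.0, 1.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, -1.0, 0.0)))
```

The calculation fails (division by zero) if three consecutive atoms are collinear, which shouldn't happen for the substructures discussed here.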
Literature cited
Baum, Muley, Smolinski, Heine, Hangauer and Klebe (2010) Non-additivity of functional group contributions by protein-ligand binding: A comprehensive study by crystallography and isothermal titration calorimetry. J Mol Biol 397:1042-1054  doi