Monday, 13 April 2015
I will never call myself an expert for to do so would be to take the first step down a very slippery slope. It is important to remember that each and every expert has both an applicability domain and a shelf life. It’s not the experts themselves to which I object but what I’ll call ‘expert-driven decision-making’ and the idea that you can simply delegate your thinking to somebody else.
I’m going to take a look at an article that describes attempts to model an expert’s evaluation of chemical probes identified by NIH-funded high throughput screening. The study covers some old ground and one issue facing authors of articles like these is how to deal with the experts in what might be termed a ‘materials and methods context’. Experts who are also coauthors of studies like these become, to some extent, self-certified experts. One observation that can be made about this article is that its authors appear somewhat preoccupied with the funding levels for the NIH chemical probe initiative and I couldn't help wondering about how this might have shaped or influenced the analysis.
A number of approaches were used to model the expert’s assessment of the 322 probe compounds, of which 79% were considered to be desirable. The approaches used by the authors ranged from simple molecular property filtering to more sophisticated machine learning models. I noted two errors (missing molar energy units; taking the logarithm of a quantity with units) in the formula for ligand efficiency and it’s a shame they didn’t see our article on ligand efficiency metrics which became available online about six weeks before they submitted their article (the three post series starting here may be helpful). The authors state, “PAINS is a set of filters determined by identifying compounds that were frequent hitters in numerous high throughput screens” which is pushing things a bit because the PAINS filters were actually derived from analysis of the output from six high throughput screening campaigns (this is discussed in detail in the three-post series that starts here). Mean pKa values of 2.25 (undesirable compounds) and 3.75 (desirable compounds) were reported for basic compounds and it certainly wasn’t clear to me how compounds were deemed to be basic given that these values are well below neutral pH. In general, one needs to be very careful when averaging pKa values. While these observations might be seen as nit-picking, using terms like ‘expert’, ‘validation’ and ‘due diligence’ in the title and abstract does set the bar high.
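To make the units issue concrete, here is a minimal sketch (my own illustration, not the formula from the article) of a dimensionally correct ligand efficiency calculation: the logarithm is taken of the dimensionless ratio Kd/C° (with C° the 1 M standard concentration) and the gas constant carries the molar energy units.

```python
import math

def ligand_efficiency(kd_molar, n_heavy, temperature=298.0):
    """Ligand efficiency in kcal/mol per heavy atom.

    The logarithm is taken of the dimensionless ratio Kd/C_std,
    not of Kd itself, and R carries the molar energy units.
    """
    R = 1.987e-3            # gas constant in kcal/(mol*K)
    C_std = 1.0             # standard concentration, 1 mol/L
    dG = R * temperature * math.log(kd_molar / C_std)  # kcal/mol
    return -dG / n_heavy

# e.g. a 1 uM binder with 20 heavy atoms
le = ligand_efficiency(1e-6, 20)   # about 0.41 kcal/mol per heavy atom
```

Note that the numerical value still depends on the 1 M convention, which is the separate (and much-discussed) objection to LE as a metric.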
A number of machine learning models were described and compared in the article and it’s worth saying something about models like these. A machine learning model is usually the result of an optimization process. When we build a machine learning model, we search for a set of parameters that optimizes an objective such as fit or discrimination for the data with which we train the model. The parameters may be simple coefficients (as in a regression model) but they might also be threshold values for rules. The more parameters you use to build a model (machine learning or otherwise), the more highly optimized the resulting model will be and we use the term ‘degrees of freedom’ to say how many parameters we’ve used when training the model. You have to be very careful when comparing models that have different numbers of degrees of freedom associated with them and one criticism that I would make of machine learning models is that the number of degrees of freedom is rarely (if ever) given. Over-fitting is always a concern with models and it is customary to validate machine learning models using one or more of a variety of protocols. Once a machine learning model has been validated, the number of degrees of freedom is typically considered to be a non-issue. Clustering in data can cause validation to make optimistic assessments of model quality and the predictive chemistry community does need to pay more attention to Design of Experiments. Here’s a slide that I sometimes use in molecular design talks.
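The point about clustering can be illustrated with a toy sketch (hypothetical data, nothing to do with the featured article): when near-duplicate analogues from the same series land on both sides of a random train/test split, even a trivial nearest-neighbour model looks far better than it does when whole series are held out together.

```python
import random

random.seed(0)

# toy data: 20 series ("clusters") of 5 near-duplicate analogues each;
# the activity label is a property of the series, not the individual compound
data = []
for c in range(20):
    centre = c * 7.3            # well-separated cluster centres on one descriptor
    label = c % 2               # hypothetical active/inactive flag per series
    for _ in range(5):
        data.append((centre + random.gauss(0.0, 0.1), label, c))

def predict_1nn(train, x):
    # 1-nearest-neighbour prediction on the single descriptor
    return min(train, key=lambda t: abs(t[0] - x))[1]

def accuracy(train, test):
    return sum(predict_1nn(train, x) == y for x, y, _ in test) / len(test)

# random split: analogues from the same series leak across the split
shuffled = data[:]
random.shuffle(shuffled)
half = len(shuffled) // 2
acc_random = accuracy(shuffled[:half], shuffled[half:])

# leave-cluster-out split: each series is wholly train or wholly test
train = [d for d in data if d[2] < 10]
test = [d for d in data if d[2] >= 10]
acc_grouped = accuracy(train, test)   # no better than guessing here
```

The random split gives near-perfect accuracy because each test compound usually has a sibling in the training set; holding out whole series removes that leakage and the apparent model quality collapses.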
Let’s get back to the machine learning models in the featured article. Comparisons were made between models (see Figure 4 and Table 5 in the article) but no mention is made of numbers of degrees of freedom for the models. I took a look in the supplementary information to see if I could get this information by looking at the models themselves and discovered that the models had not actually been reported. In fact, the expert’s assessment of the probes had not been reported either and I don't believe that this article scores highly for either reproducibility or openness. Had this come to me as a manuscript reviewer, the response would have been swift and decisive and you probably wouldn’t be reading this blog post. How much weight should those responsible for the NIH chemical probes initiative give to the study? I’d say they can safely ignore it because the data set is proprietary and models trained on it are only described and not actually specified. Had the expert's opinion on the desirability (or otherwise) of the probes been disclosed then it would have been imprudent for the NIH folk to ignore what the expert had to say. At the same time, it's worth remembering that we seek different things from probes and from drugs and one expert's generic opinion of a probe needs to be placed in the context of any specific design associated with the probe's selection.
However, there are other reasons that the NIH chemical probes folk might want to be wary of data analysis from this source and I'll say a bit more about these in the next post.
Wednesday, 1 April 2015
So I have to admit that I got it wrong and it looks like Ligand Efficiency (LE) is actually thermodynamically valid after all. You’ll recall my earlier objections to LE on the grounds that the standard concentration of 1 M has been arbitrarily chosen and that our perception of compound quality depends on the units of concentration in which the relevant equilibrium constants are defined. A Russian friend recently alerted me to an article in the Siberian Journal of Thermodynamics in which she thought I might be interested and was kind enough to scan it for me because this journal is not available in the West due to the current frosty relations with Russia. This seminal study by RR Raskolnikov demonstrates unequivocally that the 1 M standard concentration is absolutely and uniquely correct thus countering all known (and even some unknown) objections to LE as a metric of compound quality. The math is quite formidable and the central proof is based on the convergence characteristics of the trace of the partial molar polarizability tensor. Enigmatically, the author acknowledges the assistance of an individual named only as Porfiry Petrovich in helping him to find the truth after a long search. Raskolnikov’s career path appears to have been rather unorthodox. Originally from St Petersburg, he moved east because of some trouble with a student loan and for many years he was dependent on what his wife was able to earn.
Sunday, 22 March 2015
So I thought that I’d conclude this mini-series ( 1 | 2 ) of PAINS posts with some lighter fare, the style of which is intended to be a bit closer to that of a PAINS-shaming post ( 1 | 2 ) than is normal for posts here. As observed previously, the PAINS-shaming posts are vapid and formulaic although I have to admit that it’s always a giggle when people spit feathers on topics outside their applicability domains. I must also concede that one of the PAINS-shaming posts was even cited in a recent article in ACS Medicinal Chemistry Letters although this citation might be regarded as more confirmation of the old adage that, ‘flattery will get you anywhere’ than indication of the scientific quality of the post. I shouldn’t knock that post too much because it’s what goaded me into taking a more forensic look at the original PAINS article. However, don’t worry if you get PAINS-shamed because, with acknowledgement to Denis Healey, being PAINS-shamed is “like being savaged by a dead sheep”.
I should say something about the graphic with which I’ve illustrated this blog post. It shows a diving Junkers Ju 87 Stuka and I’ll let aviation author William Green, writing in ‘Warplanes of the Third Reich’, tell you more about this iconic aircraft:
“The Ju 87 was an evil-looking machine, with something of the predatory bird in its ugly contours – its radiator bath and fixed, spatted undercarriage resembling gaping jaws and extended talons – and the psychological effect on the recipients of its attentions appeared almost as devastating as the bombs that it delivered with such accuracy. It was an extremely sturdy warplane, with light controls, pleasant flying characteristics and a relatively high standard of manoeuvrability. It offered crew members good visibility and it was able to hit a target in a diving attack with an accuracy of less than 30 yards. All these were highly desirable characteristics but they tended to blind Ob.d.L. to the Ju 87’s shortcomings. Its use presupposed control of the air, for it was one of the most vulnerable of combat aircraft and the natural prey of the fighter…”
I really should get back on-topic because I doubt that Rudel ever had to worry about singlet oxygen while hunting T-34s on the Eastern Front. I’ve promised to show you how to get away with polluting the literature so let’s suppose you’ve submitted a manuscript featuring PAINful structures and the reviewers have said, “Nein, es ist verboten” (“No, it is forbidden”). What should you do next? The quick answer is, “It depends”. If the reviewers don't mention the original PAINS article and simply say that you’ve just not done enough experimental work to back up your claims then, basically, you’re screwed. This is probably a good time to get your ego to take a cold shower and to find an unfussy open access journal that will dispense with the tiresome ritual of peer review and quite possibly include a package of downloads and citations for no additional APC.
Let’s look at another scenario, one in which the reviewers have stated that the manuscript is unacceptable simply because the compounds match substructures described in the original PAINS article. This is the easiest situation to deal with, although if you’re using AlphaScreen to study a protein-protein interaction you should probably consider the open access suggestion outlined above. If not using AlphaScreen, you can launch your blitzkrieg, although try not to make a reviewer look like a complete idiot because the editor might replace him/her with another one who is a bit more alert. You need to point out to the editor that the applicability domain (using this term will give your response a degree of authority) for the original PAINS filters is AlphaScreen used to assay protein-protein interactions and therefore the original PAINS filters are completely irrelevant to your submission. You might also play the singlet oxygen card if you can find evidence (here’s a useful source for this type of information) for quenching/scavenging behavior by compounds that have aggrieved the reviewers on account of matching PAINS filters.
Now you might get a more diligent reviewer who looks beyond the published PAINS filters and digs up some real dirt on compounds that share a substructure with the compounds that you’ve used in your study and, when this happens, you need to put as much chemical space as you can between the two sets of compounds. Let’s use Voss et al (I think that this was what one of the PAINS-shaming posts was trying to refer to) to illustrate the approach. Voss et al describe some rhodanine-based TNF-alpha antagonists, the ‘activity’ of which turned out to be light-dependent and I would certainly regard this sort of light-dependency as very dirty indeed. However, there are only four rhodanines described in this article (shown below) and each has a heteroaromatic ring linked to the exocyclic double bond (an extended pi-system is highly relevant to photochemistry) and each is substituted with ethyl on the ring nitrogen. Furthermore, that heteroaromatic ring is linked to either an aryl or heteroaromatic ring in each of the four compounds. So here’s how you deal with the reviewers. First point out that the bad behavior is only observed for four rhodanines assayed against a single target protein. If your rhodanines lack the exocyclic double bond, you can deal with the reviewers without breaking sweat because the substructural context of the rhodanine ring is so different and you might also mention that your rhodanines can’t function as Michael acceptors. You should also be able to wriggle off the hook if your rhodanines have the exocyclic double bond but only alkyl substituents on it. Sanitizing a phenyl substituent on the exocyclic double bond is a little more difficult and you should first stress that the bad behavior was only observed for rhodanines with five-membered electron-rich heterocycles linked to that exocyclic double bond.
You’ll also be in a stronger position if your phenyl ring lacks the additional aryl or heteroaryl substituent (or ring fusion) that is conserved in the four rhodanines described by Voss et al because this can be argued to be relevant to photochemistry.
Things will be more difficult if you’ve got a heteroaromatic ring linked to the exocyclic double bond and this is when you’ll need to reach into the bottom drawer for appropriate counter-measures with which to neutralize those uncouth reviewers. First take a close look at that heteroaromatic ring. If it is six-membered and/or relatively electron-poor, consider drawing the editor’s attention to the important differences between your heteroaromatic ring and those of the offending rhodanines of Voss et al. The lack of aryl or heteroaryl substituents (or ring fusions) on your heteroaromatic ring will also strengthen your case so make sure the editor knows. Finally, consider calculating molecular similarity between your rhodanines and those in Voss et al. You want this to be as low as possible so experiment with different combinations of fingerprints and metrics (e.g. Tanimoto coefficient) to find whatever gives the best results (i.e. the lowest similarity).
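For readers unfamiliar with the metric, here is a minimal sketch of the Tanimoto coefficient on fingerprints represented as sets of on-bit indices (the fingerprints below are hypothetical placeholders, not real molecular fingerprints); the calculation itself is standard, and the post's point is that the number you get depends heavily on which fingerprint you chose in the first place.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient for fingerprints stored as sets of on-bits:
    shared bits divided by total distinct bits."""
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 1.0

# hypothetical fingerprints (sets of on-bit indices)
fp_query = {3, 17, 42, 128, 255}
fp_voss  = {17, 42, 99, 255, 301, 404}
sim = tanimoto(fp_query, fp_voss)   # 3 shared bits / 8 distinct bits = 0.375
```

Because different fingerprint definitions switch on different bits for the same pair of molecules, shopping around for the combination that minimizes this number is exactly the kind of cherry-picking the post is lampooning.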
Wednesday, 18 March 2015
So apparently I’m a critic of the PAINS concept so maybe it’s a good idea to state my position. Firstly, I don’t know exactly what is meant by ‘PAINS concept’ so, to be quite honest, it is difficult to know whether or not I am a critic. Secondly, I am fully aware that many compounds are observed as assay hits for any of a number of wrong reasons and completely agree that it is important to understand the pathological behavior of compounds in assays so that resource does not get burned unnecessarily. At the same time we need to think more clearly about different types of behavior in assays. One behavior is that the compound does something unwholesome to a protein and, when this is the case, it is absolutely correct to say, ‘bad compound’ regardless of what it does (or doesn't do) to other proteins. Another behavior is that the compound interferes with the assay but leaves the target protein untouched and, in this case, we should probably say ‘bad assay’ because the assay failed to conclude that the protein has emerged unscathed from its encounter with the compound. It is usually a sign of trouble when structurally-related compounds show activity in a large number of assays but there are potentially lessons to be learned by those prepared to look beyond hit rates. If the assays that are hit are diverse in type then we should be especially worried about the compounds. If, however, the assays that are hit are of a single type then perhaps the specific assay type is of greater concern. Even when hit rates are low, appropriate analysis of the screening output may still reveal that something untoward is taking place. For example, a high proportion of hits in common may reflect that a mechanistic feature (e.g. catalytic cysteine) is shared between two enzymes (e.g. PTP and cysteine protease).
While I am certainly not critical of attempts to gain a greater understanding of screening output, I have certainly criticized over-interpretation of data in print ( 1 | 2 ) and will continue to do so. In this spirit, I would challenge the assertion, made in the recent Nature PAINS article, that “Most PAINS function as reactive chemicals rather than discriminating drugs” on the grounds that no evidence is presented to support it. As noted in a previous post, the term ‘PAINS’ was introduced to describe compounds that showed frequent-hitter behavior in a panel of six AlphaScreen assays and this would have been considered a small number of assays even two decades ago when some of my Zeneca colleagues (and presumably our opposite numbers elsewhere in Pharma) started looking at frequent-hitters. After reading the original PAINS article, I was left wondering why only six of 40+ screens were used in the analysis and exactly how these six screens had been selected. The other point worth reiterating is that only including a single type of assay in analysis like this makes it impossible to explore the link between frequent-hitter behavior and assay type. Put another way, restricting analysis to a single assay type means that the results of the analysis constitute much weaker evidence that compounds interfere with other assay types or are doing something unpleasant to target proteins.
I must stress that I’m definitely not saying that the results presented in the original PAINS article are worthless. Knowledge of AlphaScreen frequent-hitters is certainly useful if you’re running this type of assay. I must also stress that I’m definitely not claiming that AlphaScreen frequent hitters are benign compounds. Many of the chemotypes flagged up as PAINS in that article look thoroughly nasty (although some, like catechols, look more ‘ADMET-nasty’ than ‘assay-nasty’). However, the issue when analyzing screening output is not simply to be of the opinion that something looks nasty but to establish its nastiness (or otherwise) definitively in an objective manner.
It’s now a good time to say something about AlphaScreen and there’s a helpful graphic in Figure 3 of the original PAINS article. Think of two beads held in proximity by the protein-protein interaction that you’re trying to disrupt. The donor bead functions as a singlet oxygen generator when you zap it with a laser. Some of this singlet oxygen makes its way to the acceptor bead where its arrival is announced with the emission of light. If you disrupt the protein-protein interaction then the beads are no longer in close proximity and the (unstable) singlet oxygen doesn’t have sufficient time to find an acceptor bead before it is quenched by solvent. I realize this is a rushed explanation but I hope that you’ll be able to see that disruption of the protein-protein interaction will lead to a loss of signal because most of the singlet oxygen gets quenched before it can find an acceptor bead.
I’ve used this term ‘quench’ and I should say a bit more about what it means. My understanding of the term is that it describes the process by which a compound in an excited state is returned to the ground state and it can be thought of as a physical rather than chemical process, even though intermolecular contact is presumably necessary. The possibility of assay interference by singlet oxygen quenchers is certainly discussed in the original PAINS article and it was noted that:
“In the latter capacity, we also included DABCO, a strong singlet oxygen quencher which is devoid of a chromophore, and diazobenzene itself”
An apparent IC50 of 85 micromolar was observed for DABCO in AlphaScreen and that got me wondering about what the pH of the assay buffer might have been. The singlet oxygen quenching abilities of DABCO have been observed in a number of non-aqueous solvents which suggests that the neutral form of DABCO is capable of quenching singlet oxygen. While I don’t happen to know if protonated DABCO is also an effective quencher of singlet oxygen, I would expect (based on a pKa of 8.8) the concentration of the neutral form in an 85 micromolar solution of DABCO buffered at neutral pH to be about 1 micromolar. Could this be telling us that quenching of singlet oxygen in AlphaScreen assays is possibly a bigger deal than we think?
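The back-of-envelope calculation behind that last estimate is just Henderson-Hasselbalch, and a short sketch (using the pKa of 8.8 and the 85 micromolar IC50 quoted above) shows where the "about 1 micromolar" figure comes from.

```python
def neutral_fraction_base(pka, ph):
    """Fraction of a monoprotic base present in the neutral (unprotonated)
    form at a given pH, from the Henderson-Hasselbalch equation."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))

total_uM = 85.0                        # apparent IC50 of DABCO in AlphaScreen
f_neutral = neutral_fraction_base(8.8, 7.0)   # ~1.6% neutral at pH 7
neutral_uM = total_uM * f_neutral             # ~1.3 uM of neutral DABCO
```

So if only the neutral form quenches singlet oxygen, the effective quencher concentration at neutral pH is roughly sixty-fold lower than the nominal concentration, which is what makes the observed potency of DABCO so intriguing.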
Compounds can also react with singlet oxygen and, when they do so, the process is sometimes termed ‘scavenging’. If you just observe the singlet oxygen lifetimes, you can’t tell whether the singlet oxygen is returned harmlessly to its ground state or if a chemical reaction occurs. Now if you read enough PAINS articles or PAINS-shaming blog posts, you’ll know that there is a high likelihood that, at some point, The Great Unwashed will be castigated for failing to take adequate notice of certain articles deemed to be of great importance by The Establishment. In this spirit, I’d like to mention that compounds with sulfur doubly bonded to carbon have been reported ( 1 | 2 | 3 | 4 | 5 ) to quench or scavenge singlet oxygen and this may be relevant to the ‘activity’ of rhodanines in AlphaScreen assays.
The original PAINS article is a valuable compilation of chemotypes associated with frequent-hitter behavior in AlphaScreen assays although I have questioned whether or not this behavior represents strong evidence that compounds are doing unwholesome things to the target proteins. It might be prudent to check the singlet oxygen quencher/scavenger literature a bit more carefully before invoking a high hit rate in a small panel of AlphaScreen assays in support of assertions that literature has been polluted or that somebody’s work is crap. I’ll finish the post by asking whether tethering donor and acceptor beads covalently to each other might help identify compounds that interfere with AlphaScreen by taking out singlet oxygen. Stay tuned for the next blog post in which I’ll show you, with some help from Denis Healey and the Luftwaffe, how to pollute the literature (and get away with it).
Friday, 6 March 2015
Free energy simulation methods such as free energy perturbation (FEP) have been around for a while and, back in the late eighties when my Pharma career started, they were being touted for affinity prediction in drug discovery. The methods never really caught on in the pharma/biotech industry and there are a number of reasons why this may have been the case including the compute-intensive nature of the calculations and the level of expertise required to run them. This is not to say that nobody in pharma/biotech was using the methods. It’s just that the capability was not widely-perceived to give those who had it a clear advantage over their competitors. Also there are other ways to use protein structural information in lead optimization and I’ve already written about the importance of forming molecular interactions with optimal binding geometry but without incurring conformational/steric energy penalties. Nevertheless, being able to predict affinity accurately would be high on every drug discovery scientist’s wish list.
A recently published study appears to represent a significant step forward and I decided to take a closer look after seeing it Pipelined and reviewed. The focus of the study is FEP and a number of innovations are described including an improved force field, enhanced sampling and automated work flow. The quantity calculated in FEP is ΔΔG°, which is a measure of relative binding affinity and this is typically what you want to predict in lead optimization. We say ΔΔG° because it’s the difference between two ΔG° values which might, for example, correspond to a compound with an unsubstituted phenyl ring and to the corresponding compound with a chloro substituent at C3 of that aromatic ring. When we focus on ΔΔG° we are effectively assuming that it is easier to predict differences in affinity than it is to predict affinity itself from molecular structure and this is a theme that I've touched on in a previous post. Readers familiar with matched molecular pair analysis (MMPA 1 | 2 | 3 | 4 | 5 ) will see a parallel with FEP which I failed to draw when first writing about MMPA although the point has been articulated in subsequent publications (1 | 2). Of course FEP has been around a lot longer than MMPA so it’s actually much more appropriate to describe the latter as the data-analytic analog of the former.
As with MMPA, the rationale is that it is easier to predict differences in the values of a quantity than it is to predict values of the quantity directly from molecular structure. The authors state:
“In drug discovery lead optimization applications, the calculation of relative binding affinities (i.e., the relative difference in binding energy between two compounds) is generally the quantity of interest and is thought to afford significant reduction in computational effort as compared to absolute binding free energy calculations”
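The relationship between the two ΔG° values and the relative quantity can be made concrete with a short sketch (the dissociation constants below are illustrative numbers, not taken from the study); note that the standard concentration cancels in the ratio, so ΔΔG°, unlike ΔG° itself, does not depend on the 1 M convention.

```python
import math

R_KCAL = 1.987e-3   # gas constant in kcal/(mol*K)

def ddg(kd_ref_molar, kd_new_molar, temperature=298.0):
    """Relative binding free energy (ddG°) in kcal/mol from two
    dissociation constants; the standard concentration cancels
    in the ratio, so only the ratio of Kd values matters."""
    return R_KCAL * temperature * math.log(kd_new_molar / kd_ref_molar)

# e.g. a chloro analogue improving Kd from 1 uM to 100 nM
delta = ddg(1e-6, 1e-7)   # about -1.36 kcal/mol (more negative = tighter)
```

A tenfold improvement in Kd is worth about 1.4 kcal/mol at room temperature, which is a handy rule of thumb when eyeballing FEP error distributions like the one in Figure S1.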
This study does appear to represent the state of the art although I would like to have seen the equivalent of Figure 3 (plot of FEP-predicted ΔG° versus experimental ΔG°) for the free energy differences which are the quantities that are actually calculated. I would argue that Figure 3 is somewhat misleading because some of the variation in FEP-predicted ΔG° is explained by variation in the reference ΔG° values. That said, the relevant information is summarized in Table S2 of the supporting information and the error distribution for the relative binding free energies (ΔΔG°) is shown in Figure S1.
One perception of FEP is that it becomes more difficult to get good results if the perturbation is large and the authors note:
“We find that our methodology is robust up to perturbations of approximately 10 heavy atoms”
Counting atoms is not the only way to gauge the magnitude of a perturbation. It’d also be interesting to see how robustly the methodology handles perturbations that involve changes in ionization state and whether ΔΔG° values of greater magnitude are more difficult to predict than those of smaller magnitude. Prediction of affinity for compounds that bind covalently, but reversibly, to targets like cysteine proteases would probably also be feasible using these methods. Something I've wondered about for a few years is what would happen if the aromatic nitrogen that frequently accepts a hydrogen bond from the tyrosine kinase hinge was mutated into an aromatic carbon. If the resulting loss of affinity for this structural transformation was as small as some seem to believe it ought to be then it would certainly open up some 'patent space' in what is currently a bit of a log jam. You can also see how FEP might be integrated with MMPA in a lead optimization setting by using the former to predict the effects of structural modifications on affinity and the latter to assess the likely impact of these modifications on ADME characteristics like solubility, permeability and metabolic stability.
So lots of possibilities and this is probably a good place to leave it for now.
Thursday, 19 February 2015
So it arrived as a call to beware of von Neumann’s elephants and the session has already been most ably covered by The Curious Wavefunction (although I can’t resist suggesting that Ash's elephant may look more like a cyclic voltammogram than the pachyderm of the title). The session to which I refer was Derek Lowe’s much anticipated presentation to a ravening horde of computational chemists that was organized by Boston Area Group Modeling and Informatics (BAGIM). Derek Pipelined beforehand and I provided some helpful tips (comment 1) on how to keep the pack at bay. I have to admit that I meant to say that many continuum solvent models are charge-type symmetric (as opposed to asymmetric) and should point out that it is actually reality (e.g. DMSO) that is asymmetric with respect to charge type. At least no damage appears to have been done. As an aside, I was gratified by how enthusiastically the bait was taken (see comments 8 and 9) but that’s getting away from the aim of this post which is to explore the relationship between medicinal chemist and computational chemist.
Many in the drug discovery business seem to be of the opinion that the primary function of the pharmaceutical computational chemist is to predict activity (e.g. enzyme inhibition; hERG blockade) and properties (e.g. solubility; plasma protein binding) of compounds before they are synthesized. I agree that the goal of both computational chemists and medicinal chemists is identification of efficacious and safe compounds and accurate prediction would be of great value in enabling the objective to be achieved efficiently. However, useful prediction of the (in vivo) effects of an arbitrary compound directly from molecular structure is not something that is currently feasible nor does it look like it will become feasible for a long time. Tension is likely to develop between the computational chemist and the medicinal chemist when either or both believe that the primary function of the former is simply to build or use predictive (e.g. QSAR) models for activity and ADME behavior.
One way of looking at Drug Discovery is as a process of accumulating knowledge and perhaps the success of the drug discovery project team should be measured by how efficiently they acquire that knowledge. A project team that quickly and unequivocally invalidates a target has made a valuable contribution because the project can be put out of its misery and the team members can move on to projects with greater potential. Getting smarter (and more honest) about shutting down projects is one area in which the pharma/biotech industry can improve. Although there are a number of ways (e.g. virtual/directed screening; analysis of screening output) that the computational chemist can contribute during hit identification, I’d like to focus on how computational and medicinal chemists can work together once starting points for synthesis have been identified (i.e. during hit-to-lead and lead optimization phases of a project).
In drug discovery projects we typically accumulate knowledge by posing hypotheses (e.g. a 3-chloro substituent will increase potency or a 2-methoxy will lock the molecule into its bound conformation) and that’s why I use the term hypothesis-driven molecular design (HDMD) that I wrote about in an earlier post. When protein structural information is available, the hypothesis often takes the form that making a new interaction will lead to an increase in affinity and the key is finding ways of forming interactions with optimal binding geometry without incurring conformational/steric energy penalties. The computational chemist is better placed than the medicinal chemist to assess interactions by these criteria while the medicinal chemist is better placed to assess the synthetic implications of forming the new interactions. However, either or both may have generated the ideas that the computational chemist assessed and many medicinal chemists have a strong grasp of molecular interactions and conformational preferences even if they are unfamiliar with the molecular modelling software used to assess these. I always encourage medicinal chemists to learn to use the Cambridge Structural Database (CSD) because it provides an easy way to become familiar with 3D structures and conformations of molecules as well as providing insight into the interactions that molecules make with one another. It also uses experimental data so you don’t need to worry about force fields and electrostatic models. Here’s a post in which I used the CSD to gain some chemical insight that will give you a better idea of what I’m getting at.
One question posed in the Curious Wavefunction post was whether medicinal chemists should make dead compounds to test a model. My answer is that compounds should be prioritized for synthesis on the basis of how informative they are likely to be or how active they are likely to be and, as pointed out by Curious Wavefunction, synthetic accessibility always needs to be accounted for. In hit-to-lead or early lead optimization, I’d certainly consider synthetic targets that were likely to be less active than the compounds that had already been synthesized, but these would need to have the potential to provide information. You might ask how we should assess the potential of a compound to provide information, and my response would be that it is not, in general, easy, but this is what hypothesis-driven molecular design is all about. The further you get into lead optimization, the less likely it becomes that an inactive compound will be informative.
I realize that talking about hypothesis-driven molecular design and the potential of synthetic targets to provide information may seem a bit abstract, so I’ll finish this post with something more practical. This goes back over a decade to a glycogen phosphorylase inhibitor project for diabetes (see articles 1 | 2 | 3 | 4 | 5 | 6 ). While lead discovery tends to be a relatively generic activity, the challenges in lead optimization tend to be more project-specific. We were optimizing some carboxylic acids (exemplified by the molecular structure in the figure below) and were interested in reducing the fraction bound to plasma protein, which is often an issue for compounds that are predominantly anionic under physiological conditions.
I should point out that it wasn’t clear that reducing the bound fraction would have increased the free concentration (this point is discussed in more depth here), but hypothesis-driven design is more about asking good questions than making predictions. We wanted to increase the unbound fraction (Fu) but we also wanted to keep the compounds anionic, which suggested replacing the carboxylic acid with a bioisostere (see 1 | 2 | 3 | 4 | 5 ) such as tetrazole. If you happen to be especially interested in the basis for the bioisosteric relationship between these molecular recognition elements, have a look at Fig 7 in this article, but I’ll be focusing on finding out what effect the bioisosteric replacement would have on Fu. Tetrazoles are less lipophilic (logP values are 0.3 to 0.4 units lower) than the corresponding carboxylic acids, so we might expect that replacing a carboxylic acid with tetrazole will result in an increase in Fu (this thinking is overly simplistic, although saying why would take the blog post off on a tangent; I’m happy to pick this up in comments). We did what has become known as matched molecular pair analysis (MMPA; see 1 | 2 | 3 | 4 | 5 | 6 ) and searched the in-house plasma protein binding (PPB) database for pairs of compounds in which carboxylic acid was replaced by tetrazole while keeping the remainder of the molecular structure constant. We do MMPA because we believe that it is easier to predict differences in the values of properties or activity than it is to predict the values themselves directly from molecular structure. Computational chemists may recognize MMPA as a data-analytic equivalent of free energy perturbation (FEP; see 1 | 2 ).
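To give a feel for the pair-matching step, here is a minimal sketch (this is not the software we actually used; the scaffold keys, column layout and numbers are all invented for illustration). Acids and tetrazoles sharing a common scaffold are paired and the difference in logFu is tabulated:

```python
# Hypothetical MMPA sketch: pair carboxylic acids with their tetrazole
# analogues via a shared scaffold key, then compute delta log(Fu).
# All data below are invented for illustration.

records = [
    # (scaffold_key, acidic_group, log_fu)
    ("scaffold_A", "COOH",      -1.20),
    ("scaffold_A", "tetrazole", -1.55),
    ("scaffold_B", "COOH",      -0.90),
    ("scaffold_B", "tetrazole", -1.30),
    ("scaffold_C", "COOH",      -1.05),  # no tetrazole partner -> unmatched
]

# Group measurements by scaffold so matched pairs can be identified
by_scaffold = {}
for key, group, log_fu in records:
    by_scaffold.setdefault(key, {})[group] = log_fu

# delta = log Fu(tetrazole) - log Fu(acid); a negative value means the
# tetrazole is more highly protein bound than its matched acid
deltas = [
    groups["tetrazole"] - groups["COOH"]
    for groups in by_scaffold.values()
    if "COOH" in groups and "tetrazole" in groups
]
print([round(d, 2) for d in deltas])  # [-0.35, -0.4]
```

The unmatched compound (scaffold_C) simply drops out of the analysis, which is one reason MMPA needs a reasonably large and chemically diverse database to be useful.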
The original MMPA performed in 2003 suggested that tetrazoles would be more highly protein bound (i.e. lower Fu) than the corresponding carboxylic acids, and on that basis we decided not to synthesize tetrazoles. Subsequent analysis of data for a larger number of matched molecular pairs that were available in 2008 arrived at the same conclusion with a higher degree of confidence (look at the SE values), but it was the 2003 analysis that was used to make the project decisions. The standard deviation (SD) values of 0.23 and 0.20 are also informative because they suggest that the PPB assay is very reproducible even though it was not run in replicate (have a look at this article to see this point discussed in more depth).
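To make the SD/SE distinction concrete (using invented deltas, not the project data): the standard error of the mean difference shrinks as the number of matched pairs grows, which is why the larger 2008 dataset gave a more confident estimate, while the standard deviation of the deltas puts an upper bound on the assay variability because each delta carries the noise of two independent measurements:

```python
import statistics

# Invented delta log(Fu) values for matched acid/tetrazole pairs
deltas = [-0.35, -0.40, -0.10, -0.55, -0.30, -0.45, -0.20, -0.25]

n = len(deltas)
mean = statistics.mean(deltas)   # mean difference (the MMPA "prediction")
sd = statistics.stdev(deltas)    # sample standard deviation of the deltas
se = sd / n ** 0.5               # standard error of the mean difference

print(f"n={n} mean={mean:.3f} SD={sd:.3f} SE={se:.3f}")
# n=8 mean=-0.325 SD=0.144 SE=0.051
```

A small SD here would imply that both the transformation effect is consistent across chemical contexts and that the underlying assay is reproducible, since assay noise alone would inflate the SD.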
You might ask what we would have done if we hadn’t been able to find any matched molecular pairs with which to test our hypothesis. Had we decided to synthesize a tetrazole then it would have been sensible to select a reference carboxylic acid for which Fu was close to the center of the quantifiable range so as to maximize the probability of the matching tetrazole being ‘in-range’.
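That selection criterion is easy to state as code (a toy sketch with an invented quantifiable range and invented measurements): choose the acid whose measured value sits closest to the midpoint of the assay’s quantifiable range, so that the matched tetrazole has headroom in either direction:

```python
# Invented quantifiable range for log(Fu) and invented measurements;
# a real assay would define these limits from its calibration.
LOG_FU_MIN, LOG_FU_MAX = -2.0, 0.0
midpoint = (LOG_FU_MIN + LOG_FU_MAX) / 2

acids = {"acid_1": -1.8, "acid_2": -1.1, "acid_3": -0.3}

# Reference compound: the acid closest to the centre of the range
reference = min(acids, key=lambda name: abs(acids[name] - midpoint))
print(reference)  # acid_2
```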
This is probably a good place to leave things. Even if you don't agree with what I've said here, I hope this blog post has at least got you thinking that there may be more to pharmaceutical computational chemistry than just making predictions.
Sunday, 25 January 2015
I was in Kuala Lumpur about this time last year and visited the International Medical University, where I delivered a harangue. It was a very enjoyable day and the lunch was excellent (as is inevitable in Malaysia, where it seems impossible to find even mediocre food). We discussed molecular complexity at lunch and, since a picture paints a thousand words, I put the place mat to good use.