Thursday, 13 September 2018

On the Nature of QSAR

With EuroQSAR2018 fast approaching, I'll share some thoughts from Brazil since I won't be there in person. I've not got any QSAR related graphics handy so I'll include a few random photos to break the text up a bit.

East of Marianne River on north coast of Trinidad

Although Corwin Hansch is generally regarded as the "Father of QSAR", it is helpful to look further back to the work of Louis Hammett in order to see the prehistory of the field. Hammett introduced the concept of the linear free energy relationship (LFER) which forms the basis of the formulation of QSAR by Hansch and Toshio Fujita. However, the LFER framework encodes two other concepts that are also relevant to drug design. First, the definition of a substituent constant relates a change in a property to a change in molecular structure and this underpins matched molecular pair analysis (MMPA). Second, establishing an LFER allows the sensitivity of physicochemical behavior to structural change to be quantified and this can be seen as a basis for the activity cliff concept.
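
For anybody who would like to see the arithmetic, the Hammett relationship is usually written as log10(K_X / K_H) = ρσ_X, with σ defined from the ionization of substituted benzoic acids (for which ρ is 1 by definition). Here is a minimal sketch of both ideas, the substituent constant and the sensitivity. The function names and all of the numbers are illustrative assumptions rather than measured values.

```python
# Minimal sketch of a Hammett-style linear free energy relationship:
#     log10(K_X / K_H) = rho * sigma_X
# All numerical values below are made up for illustration only.

def substituent_constant(pka_parent, pka_substituted):
    """sigma for substituent X: the change in log10(Ka) on substitution
    in the reference series (where rho is 1 by definition)."""
    return pka_parent - pka_substituted


def fit_rho(sigmas, log_k_ratios):
    """Least-squares slope through the origin: the sensitivity of the
    property (log10 K) to the structural change described by sigma."""
    numerator = sum(s * y for s, y in zip(sigmas, log_k_ratios))
    denominator = sum(s * s for s in sigmas)
    return numerator / denominator


# Hypothetical series: sigma values and log10(K_X / K_H) for another reaction
sigmas = [0.0, 0.2, 0.7]
log_k_ratios = [0.0, 0.5, 1.7]
rho = fit_rho(sigmas, log_k_ratios)
print(f"rho = {rho:.2f}")
```

The substituent constant is a parameter attached to a structural relationship (X replaces H) and ρ measures how sharply the property responds to that change, which is why a large magnitude of ρ can be read as cliff-like behavior.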

Kasbah cats in Ouarzazate 

As David Winkler and the late Prof. Fujita noted in this 2016 article, QSAR has evolved into "two QSARs":

Two main branches of QSAR have evolved. The first of these remains true to the origins of QSAR, where the model is often relatively simple and linear and interpretable in terms of molecular interactions or biological mechanisms, and may be considered “pure” or classical QSAR. The second type focuses much more on modeling structure–activity relationships in large data sets with high chemical diversity using a variety of regression or classification methods, and its primary purpose is to make reliable predictions of properties of new molecules—often the interpretation of the model is obscure or impossible.

I'll label the two branches of QSAR as "classical" (C) and "machine learning" (ML). As QSAR evolved from its origins into ML-QSAR, the descriptors became less physical and more numerous. While I would not attempt to interpret ML-QSAR models, I'd still be wary of interpreting a C-QSAR model if there was a high degree of correlation between the descriptors. One significant difficulty for those who advocate ML-QSAR is that machine learning is frequently associated with (or even equated to) artificial intelligence (AI) which, in turn, oozes hype. Here are a couple of recent In The Pipeline posts (don't forget to look at the comments) on machine learning and AI.

One difference between C-QSAR models and ML-QSAR models is that the former are typically local (training set compounds are closely related structurally) while the latter are typically non-local (although not as global as their creators might have you believe). My view is that most 'global' QSAR models are actually ensembles of local models although many QSAR modelers would have me dispatched to the auto-da-fé for this heresy. A C-QSAR model is usually defined for a particular structural series (or scaffold) and the parameters are often specific (e.g. p value for C3-substituent) to the structural series. Provided that relevant data are available for training, one might anticipate that, within its applicability domain, a local model will outperform a global model since the local model is better able to capture the structural context of the scaffold.

I would guess that most chemists would predict the effect on logP of chloro-substituting a compound more confidently than they would predict logP for the compound itself. Put another way, it is typically easier to predict the effect of a relatively small structural change (a perturbation) on chemical behavior than it is to predict chemical behavior directly from molecular structure. This is the basis for using free energy calculations to predict relative affinity and it also provides a motivation for MMPA (which can be seen as the data-analytic equivalent of free energy perturbation). This suggests viewing activity and properties in terms of structural relationships between compounds. I would argue that C-QSAR models are better able than ML-QSAR models to exploit structural relationships between compounds.
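
The perturbation view can be sketched in a few lines of code. The example below assumes that the matched pairs have already been identified (finding them is the hard part and is not shown) and simply averages the property difference for each structural transformation. Compound identifiers, transformation labels and logP values are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def transformation_deltas(pairs, prop):
    """Summarize property changes for each structural transformation.

    pairs: iterable of (parent_id, analog_id, transformation_label)
    prop:  dict mapping compound id to a measured property value
    Returns {transformation: (mean change, number of pairs)}.
    """
    deltas = defaultdict(list)
    for parent, analog, transformation in pairs:
        deltas[transformation].append(prop[analog] - prop[parent])
    return {t: (mean(d), len(d)) for t, d in deltas.items()}


# Hypothetical logP values for three parents and their chloro analogs
logp = {"a": 2.0, "a_Cl": 2.7, "b": 1.1, "b_Cl": 1.8, "c": 3.0, "c_Cl": 3.8}
pairs = [("a", "a_Cl", "H>>Cl"), ("b", "b_Cl", "H>>Cl"), ("c", "c_Cl", "H>>Cl")]

summary = transformation_deltas(pairs, logp)
for transformation, (mean_delta, n_pairs) in summary.items():
    print(transformation, round(mean_delta, 2), n_pairs)
```

Note that nothing here predicts logP for any compound; the only thing being estimated is the effect of the structural change.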

Down the islands with Venezuela in the distance 

ML-QSAR models typically use many parameters to fit the data and this means that more data is needed to build them. One of the issues that I have with machine learning approaches to modeling is that it is not usually clear how many parameters have been used to build the models (and it's not always clear that the creators of the models know). You can think of number of parameters as the currency in which you pay for the quality of fit to the training data and you need to account for number of parameters when comparing performance of different models. This is an issue that I think ML-QSAR advocates need to address.

Overfitting of training data is an issue even for C-QSAR models that use small numbers of parameters. Generally, it is assumed that if a model satisfies validation criteria it has not been over-fitted. However, cross-validation can lead to an optimistic assessment of model quality if the distribution of compounds in the training space is very uneven. An analogous problem can arise even when using external test sets. Hawkins advocated creating test sets by removing all representatives of particular chemotypes from training sets and I was sufficiently uncouth to mention this to one of the plenaries at EuroQSAR 2016. Training set design and model validation do not appear to be solved problems in the context of ML-QSAR.
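
A rough sketch of the chemotype-based splitting that Hawkins advocated might look like the following. It assumes that each compound has already been assigned a chemotype label (how you do that is a separate question) and the labels used here are hypothetical.

```python
def leave_chemotype_out(chemotypes):
    """Yield (held_out, train_indices, test_indices) for each chemotype.

    chemotypes: list with one chemotype label per compound.
    Every representative of the held-out chemotype goes into the test
    set, so the model never sees a close analog of a test compound.
    """
    for held_out in sorted(set(chemotypes)):
        train = [i for i, c in enumerate(chemotypes) if c != held_out]
        test = [i for i, c in enumerate(chemotypes) if c == held_out]
        yield held_out, train, test


# Hypothetical chemotype assignments for five compounds
labels = ["indole", "indole", "quinoline", "pyrazole", "quinoline"]
for held_out, train, test in leave_chemotype_out(labels):
    print(held_out, "train:", train, "test:", test)
```

If performance collapses under this sort of split relative to random cross-validation then the random split was probably flattering the model.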

The Corniche in Beirut 

I get the impression that machine learning algorithms may be better suited for classification than QSAR and it is common to see potency (or affinity) values classified as 'active' or 'inactive' for modeling. This creates a number of difficulties and I'll also point you towards the correlation inflation article that explains why gratuitous categorization of continuous data is very, very naughty. First, transformation of continuous data to categorical data throws away huge amounts of information which would seem to be the data science equivalent of shooting yourself in the foot. Second, categorization distorts your perception of the data (e.g. a pIC50 value of 6.5 might be regarded as more similar to one of 9.0 than one of 5.5). Third, a constant uncertainty in potency translates to a variable uncertainty in the classification. Fourth, if you categorize continuous data then you need to demonstrate that conclusions of analysis do not depend on the categorization scheme.
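
To make the second, third and fourth points concrete, here is a small sketch. The thresholds and the assumed measurement uncertainty (a normally distributed error with a standard deviation of 0.3 log units) are illustrative assumptions and the function names are my own.

```python
from statistics import NormalDist


def classify(pic50, threshold=6.0):
    """Label a potency value as 'active' or 'inactive'."""
    return "active" if pic50 >= threshold else "inactive"


def flip_probability(true_pic50, threshold=6.0, sd=0.3):
    """Probability that a measurement with normally distributed error
    (standard deviation sd) lands on the other side of the threshold."""
    p_below = NormalDist(mu=true_pic50, sigma=sd).cdf(threshold)
    return p_below if true_pic50 >= threshold else 1.0 - p_below


# Distortion: 6.5 is grouped with 9.0 rather than with 5.5
print([classify(x) for x in (5.5, 6.5, 9.0)])

# Constant uncertainty in potency, variable uncertainty in the class
for x in (6.1, 7.0, 9.0):
    print(x, round(flip_probability(x), 4))

# Dependence on the categorization scheme: move the threshold to 7.0
print([classify(x, threshold=7.0) for x in (5.5, 6.5, 9.0)])
```

The same 0.3 log unit uncertainty leaves the class of a compound near the threshold close to a coin toss while the class of a very potent compound is essentially certain.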

In the machine learning area not all QSAR is actually QSAR. This article reports that "the performance of Naïve Bayes, Random Forests, Support Vector Machines, Logistic Regression, and Deep Neural Networks was assessed using QSAR and proteochemometric (PCM) methods". However, the QSAR methods used appear to be based on categorical rather than quantitative definitions of activity. Even when more than two activity categories (e.g. high, medium, low) are defined, analysis might not be accounting for the ordering of the categories and this issue was also discussed in the correlation inflation article. Some clarification from the machine learning community may be in order as to which of their offerings can be used for modelling quantitative activity data.

I'll conclude the post by taking a look at where QSAR fits into the framework of drug design. Applying QSAR methods requires data and one difficulty for the modeler is that the project may have delivered its endpoint (or been put out of its misery) by the time that there is sufficient data for developing useful models. Simple models can be useful even if they are not particularly predictive. For example, modelling the response of pIC50 to logP makes it easy to see the extent to which the activity of each compound beats (or is beaten by) the trend in the data. Provided that there is sufficient range in the data, a weak correlation between pIC50 and logP is actually very desirable and I'll leave it to the reader to ponder why this might be the case. My view is that ML-QSAR models are unlikely to have significant impact for predicting potency against therapeutic targets in drug discovery projects.  
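
Here is a minimal sketch of what I mean by beating the trend. It fits a straight line to pIC50 as a function of logP by ordinary least squares and reports the residual for each compound. The data are invented for illustration.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def residuals(x, y):
    """Observed minus fitted value for each point; a positive residual
    means that the compound beats the trend."""
    slope, intercept = fit_line(x, y)
    return [yi - (slope * xi + intercept) for xi, yi in zip(x, y)]


# Invented data for four compounds
logp = [1.0, 2.0, 3.0, 4.0]
pic50 = [5.0, 6.5, 6.0, 7.5]

for lp, res in zip(logp, residuals(logp, pic50)):
    print(f"logP {lp:.1f}: residual {res:+.2f}")
```

The residual tells you how much potency a compound delivers beyond what its lipophilicity alone would suggest, so the model is useful for ranking compounds even when it predicts poorly.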

So that's just about all I've got to say. Have an enjoyable conference and make sure to keep the speakers honest with your questions. It'd be rude not to.

Early evening in Barra 

Saturday, 7 April 2018


I first became aware of Louis Hammett during the third term of my first year as an undergraduate at the University of Reading. Hammett was a pioneer in physical-organic chemistry and is widely regarded as one of the founders of that field. He would have been 124 today and was less than a year younger than Christopher Ingold, another pioneer in the field. Hammett passed away in 1987 at the age of 92 (here is an excellent obituary).

Today Hammett is remembered primarily for the parameters that describe electronic interactions between aromatic rings and their substituents. He also introduced linear free energy relationships which form the basis of classical QSAR. These days, QSAR has evolved away from its origins in physical-organic chemistry into what many call machine learning and parameters have become less physical (and considerably more numerous). Hammett's work provided an early lesson to wannabe molecular designers in how to think about molecules.

Jens Sadowski and I introduced matched molecular pair analysis (MMPA) in a chapter of a cheminformatics book that was conceived and edited by my dear friend (and favorite Transylvanian) Tudor Oprea. Here's a photo of Tudor and me at an OpenEye meeting (I think CUP II in 2001) during which our props (Tudor is wearing a PoD cape) were provided by the session chair (the formidable Janet Newman who intimidates proteins to the extent that they 'voluntarily' crystallize).

Now you might be wondering what MMPA has to do with Hammett. The short answer is that our book chapter included a table of what are effectively substituent constants for aqueous solubility and these have Hammett's fingerprints all over them. The longer answer is that Hammett introduced the idea of associating parameters with structural relationships (e.g. X is chloro analog of Y) between compounds. This is an important idea because much pharmaceutical design is focused on understanding and predicting the effects of structural modifications on the activity and properties of compounds. One rationale for this focus is the belief that it is easier to predict differences (e.g. relative affinity) in chemical behavior between structurally-related compounds than it is to predict chemical behavior directly from molecular structure.

At first, I didn't see the deeper connection between Hammett's work and pharmaceutical design. The main focus of our book chapter was preparing chemical structures in databases for virtual screening so the full extent of Hammett's influence on MMPA was not immediately recognized. As is often the case, we think we've discovered something really new only to find out later that somebody had been thinking along similar lines many years before. 

Happy 124th birthday, Louis Hammett.  

Sunday, 1 April 2018

The maximal quality of molecular interactions

There is a lot more to drug design than maximization of affinity and the key to successful design is actually that drugs form high quality interactions with their targets. Before the epiphany of ligand efficiency, measurement of interaction quality was a very inexact science. Ground-breaking research from the Budapest Enthalpomics Group (BEG) now puts the concept on a firm theoretical footing by unequivocally demonstrating that individual interactions can be localized on the affinity-quality axis in a unique manner that is completely independent of the standard state definition.

The essence of this novel approach is that, in addition to its contributions to enthalpy and entropy of binding, each molecular interaction will now be awarded points for the artistic elements of the contact between ligand and target. This industry-leading application of Big Data uses the Blofeld-Auric Normalized Zeta Artificial Intelligence (BANZAI) algorithm to score aesthetic aspects of molecular interactions. This revolutionary machine learning application uses variable-depth, convolutional networks to model the covariance structure of the reduced efficiency tensor. Commenting on these seminal and disruptive findings the institute director, Prof. Kígyó Olaj, noted that "the algorithm is particularly accurate for scoring synchronization of vibrational modes and is even able to determine whether or not a hydrogen bond has made deliberate use of the bottom of the pool to assist another hydrogen bond during the binding routine".

Friday, 15 December 2017

Eu matei o Oscar Niemeyer

I never meant to kill Oscar Niemeyer. 

Vasily Grigoryevich Zaytsev was a pretty good shot. During the pivotal battle of Stalingrad he dispatched 225 enemy soldiers, including 11 snipers. Zaytsev died in Kiev on Oscar Niemeyer's 84th birthday and was reburied 14 years later on Mamayev Hill in Volgograd with full military honors. 

Oscar Ribeiro de Almeida Niemeyer Soares Filho was born in Rio de Janeiro on 15th December, 1907 and he is best known for design of some of the buildings in Brasilia. Here are examples of his work: 

I first learned of Curitiba in the 1980s from a Scientific American article on urban planning and it had been on my list of places to visit for almost 30 years when I arrived in Brazil in 2012. I also learned about Curitiba in Portuguese class and, while planning the trip, I was particularly excited to discover that Museu Oscar Niemeyer is in Curitiba.

I flew to Curitiba on Thursday November 30, 2012 and the museum visit was scheduled for the Saturday. I found a nice vegetarian restaurant near the museum and lunched there. Although not a vegetarian, I certainly enjoy vegetarian food and, in any case, there is only so much picanha that one can eat. As an aside, I can claim a degree of skill in finding vegetarian restaurants in unlikely places such as El Calafate (Peron must be turning in his grave) and Montevideo (didn't think that vegetables were allowed in Uruguay). The museum is sometimes called Museu do Olho (museum of the eye) on account of its most distinctive feature. Here are some photos:

I thoroughly enjoyed the afternoon in the museum and, as well as viewing the exhibits, I watched a video on the life of Niemeyer and also read some biographical material. However, neither source mentioned when he had died. On returning to São Carlos the next day, I googled him and was amazed to discover that he was still alive (and due to turn 105 in a couple of weeks). With all this talk about Zaytsev, you're probably thinking that I borrowed a sniper rifle and picked off Brazil's National Treasure as he shuffled around the garden on his zimmer frame. However, reality is even more bizarre.

At that time, I was taking Portuguese classes and the weekend in Curitiba was certainly going to give me something to talk about (typically the conversation would be about how deadly Friday's group meeting had been and why ponto de fusão is so important).  On being asked about my weekend, I responded, "Eu viajei para Curitiba e visitei o museu Oscar Niemeyer. Ele vive ainda!" (I travelled to Curitiba and visited the Oscar Niemeyer museum. He is still alive!)

Oscar Niemeyer died a few hours later.    

Thursday, 9 November 2017

Hydrogen bonding and electronegativity

So once again it's #RealTimeChem Week and to 'celebrate' we'll be taking a look at the relationship between hydrogen bond basicity and electronegativity in this blog post. The typical hydrogen bond is an interaction between an electronegative atom and a hydrogen atom that is covalently bonded to another electronegative atom. We tend to think about hydrogen bonding as electrostatic in nature and we often use electrostatic models to describe the phenomenon. Let's take a look at hydrogen fluoride dimer which is probably the simplest hydrogen bonded system.
Fluorine is more electronegative than hydrogen which means that it tends to draw the electrons it shares with hydrogen towards itself. This gives fluorine a partial negative charge and hydrogen a partial positive charge. This simple electrostatic model suggests that a hydrogen bond will get stronger in response to increases in the electronegativity of either the acceptor atom or the atom to which the donor hydrogen is covalently bonded. 

Hydrogen bond strength can be quantified as the equilibrium constant for the association of a hydrogen bond donor with a hydrogen bond acceptor in a non-polar solvent. For example, pKBHX can be used as a measure of hydrogen bond basicity where KBHX is the equilibrium constant for association of the hydrogen bond acceptor compound (e.g. pyridine) with 4-fluorophenol in carbon tetrachloride.  Let's take a look at some pKBHX values for three structurally prototypical  compounds that present nitrogen, oxygen or fluorine to a hydrogen bond donor. 

The trend is the complete opposite of what you might have expected on the basis of the simple electrostatic model for hydrogen fluoride dimer. However, this is not as weird as you might think because electronegativity tells us about distribution of charge between atoms but at hydrogen bonding distances the donor can 'sense' the distribution of charge within the acceptor atom. Electronegativity quantifies the extent to which an atom can function as an 'electron sink' and this is also related to how effectively the atom can 'hide' the resulting excess charge from the environment around it. Put another way, fluorine will appear to be really weird if you think of it as a large, negative partial atomic charge.

This is a good place to wrap up and, if you're interested in this sort of thing, why not take a look at this article on prediction of hydrogen bond basicity from molecular electrostatic potential. My most up to date hydrogen bond basicity data set can be found in the supplemental information (check the zip file) for this article and that's where I got the figures for the table.

Monday, 23 October 2017

War and PAINS

Non-interfering chemotypes are all alike;
 every interfering chemotype interferes in its own way

From afar, the Grande Armée appears invulnerable. The PAINS filters have been written into the Laws of MedChem and celebrated in Nature. It may be a couple of thousand kilometers to Moscow but what could possibly go wrong?

Viscount Wellington (he is yet to become the 'Iron Duke') shadows the Grande Armée from the south. Dismissed as 'The Sepoy General' (he just writes blogs), Wellington knows that the best way to win battles is to persuade opponents to first underestimate him and then to attack him. He also knows that seemingly intractable problems often have simple solutions and, when asked years later by Queen Victoria as to how the sparrow problem of the new Crystal Palace might be resolved, his pithy response is, "Sparrowhawks, Ma'am".

Marshal Ney guards the southern flank of the Grande Armée and Wellington knows, from the Peninsular War, that Ney is a formidable opponent. Wellington is fully aware that, on the steppes, it will be all but impossible to exploit Ney's principal weakness (an unquestioning and unwavering belief in meaningfulness of Ligand Efficiency). Wellington knows that he will have to be very, very careful because Ney is a masterful exponent of the straw man defense.

The first contact with the Grande Armée occurs unexpectedly in the Belovezhskaya Pushcha. One of Ney's subordinates has set off in hot pursuit of a foraging party of thiohydantoins (which he has mistaken for rhodanines) and left the flank of the Grande Armée exposed. Wellington orders an attack in an attempt to capitalize on the blundering subordinate's intemperance and it is only through the prompt action of Ney, who takes personal charge of the situation, that disaster is averted.

The first skirmish proves to be a tactical draw although Ney's impetuous subordinate has been relieved of his command and put on clam-gathering duty. Wellington orders a diversionary attack designed to probe the Grande Armée defenses and then another intended to lure Ney into counter-attacking his carefully prepared defensive position. Ney initially appears to take the bait but soon disengages once he perceives the full extent of Wellington's defensive position. It will prove to be their final clash for the duration of this campaign.

The next contact with the Grande Armée takes place at Smolensk. A regiment of Swiss Guards, on loan from the Vatican, becomes detached from the main force and blunders into Wellington's outer defensive belt. The Swiss Guards' halberds prove to be of little use in this type of warfare and they are swiftly overwhelmed. As they are taken captive, many of the Swiss Guards are heard to mutter that the six assays of the PAINS panel are "different and independent, different and independent..." although none seems wholly convinced by their mantra.

Following the tactical victory at Smolensk, Wellington receives word by messenger pigeon that Marshal Kutuzov will attempt to stop the Grande Armée at Borodino (on the road to Moscow) and that Wellington should do what he can to harass the Grande Armée so as to buy more time for Kutuzov to complete his preparations. Wellington orders another diversionary attack which exposes the narrowness of the PAINS filter applicability domain before proceeding to Borodino where he arrays his troops to the south of the road to Moscow.

The armies of Wellington and Kutuzov are now disposed so as to counter a flanking maneuver by the Grande Armée but it is the army of Kutuzov that will bear the brunt of the attack while Wellington's force is held in reserve. Wellington marvels at Kutuzov's preparations and the efficient manner in which he has achieved optimal coverage of the terrain with the design of the training set. Not a descriptor is wasted and each differs in its own way since they are uncorrelated with each other. Nobody will be able to accuse Kutuzov of overfitting the data. Over a century later, Rokossovsky will tell Zhukov, "everything I know about QSAR, I learned from Kutuzov".

The Grande Armée advances confidently but Kutuzov is ready and up to the task at hand. The Grande Armée is first stopped in its tracks by a withering hail of grapeshot (more than half of the PAINS alerts were derived from one or two compounds only) and then driven back (...were originally derived from a proprietary library tested in just six assays measuring protein–protein interaction (PPI) inhibition using the AlphaScreen detection technology only). Running out of options, the Grande Armée commanders are forced to commit the elite Garde Impériale which temporarily blunts Kutuzov's advance. Wellington maneuvers his troops from their defensive squares into an attacking formation and awaits Kutuzov's order to commit the reserve.

Although the Grande Armée commanders consider it beneath them to do battle with the lowly Sepoy General, they have at least strengthened their southern flank in acknowledgement of his presence there. This in turn has weakened the northern flank which has been assumed to be safe from interference and it is at this point in the battle that the Grande Armée gets an unpleasant surprise.  There is the unmistakable sound of hoofbeats coming from the north, quiet at first but getting louder by the minute. Emerging from the smoke of battle, Marshal Blücher's Uhlans slam into the lightly-protected northern flank of the Grande Armée  (the same PAINS substructure was often found in consistently inactive and frequently active compounds). Following an attack plan on which Blücher has provided vital feedback, Wellington commits his troops although, in reality, there is little left for them to do aside from pursuing the retreating Garde Impériale.

There are lessons to be learned from the fate of the Grande Armée.  The PAINS filters were caught outside their narrow applicability domain on the vast Russian steppes and their fundamental weaknesses were brought into sharp focus by Blücher and Kutuzov who made effective use of large, non-proprietary data sets. Whether you're talking about T-34s or data, quantity has a quality all of its own. Science blogs are here to stay although, from time to time, every blogger should write a journal article pour encourager les autres.

Thursday, 12 October 2017

The resurrection of Maxwell's Demon

Sometimes when reading the residence time literature, I get the impression that the off-raters have re-animated Maxwell's Demon. It seems as if a nano-doorman stands guard at the entrance of the binding site, only opening his nano-door to ligand molecules that want to get in. Microscopic Reversibility? Stop being so negative! With Big Data, Artificial Intelligence, Machine Learning (formerly known as QSAR) and Ligand Efficiency Metrics we can beat Microscopic Reversibility and consign The Second Law to the Dustbin Of History!

There were a number of things that triggered this blog post. First, I saw a recent article that got me thinking about philatelic drug discovery.  Second, some of the off-raters will be getting together in Berlin next week and I wanted to share some musings because I won't be there in person. Third, my former colleague Rutger Folmer has published a useful (dare I say, brave) critique of the residence time concept that is bang on target. 

I'm not actually going to say much about Rutger's article except to suggest that you read it. That's because I really want to examine the article on philatelic drug discovery in more detail (it's actually about thermodynamic and kinetic profiling but I thought the reference to philately would better grab your attention). My standard opening move when playing chess with an off-rater is to assert that slow binding is equivalent to slow distribution. In what situations would you design a drug to distribute slowly?

Chemical kinetics is all about energy barriers and, the higher the barrier, the slower things will happen. Microscopic reversibility tells us that a barrier to association is a barrier to dissociation and that the ligand will return to solution along the same path that it took to its binding site. Microscopic reversibility tells you that if you got into the parking spot you can get out of it as well although that may not be the experience of every driver. The reason that microscopic reversibility doesn't always seem to apply to parking is that most humans, with the possible exception of tank drivers in the Italian army, are more comfortable in forward gear than in reverse. Molecules, in contrast, have no more concept of forward and reverse than they do of standard states, IUPAC or the opinions of the 'experts' who might quantitatively estimate their drug-likeness while judging their beauty. Molecules don't actually do concepts. Put more uncouthly, molecules just don't give a toss.

I've created a graphic to show how things might look in vivo when there is a barrier to association (and, therefore, to dissociation). We can think of the ligand molecule having to get over the barrier in order to get to its binding site and we call the top of the barrier the 'transition state'. This is a simplified version of reality (it is actually the system that passes from the unbound state through the transition state to the bound state and for some ligand-protein associations there is no barrier) but it'll serve for what I'd like to say. The graphic consists of three panels and the first (A) of these illustrates the situation soon after dosing when the concentration of ligand (L) is relatively high and the target protein (P) has not had sufficient time to respond. If the barrier is sufficiently high, the system can't get to equilibrium before the ligand concentration starts to fall in what a pharmacokineticist might refer to as the elimination phase. Under this scenario the system will be at equilibrium briefly as the ligand concentration falls and I've shown this in panel B. After the equilibrium point is reached, the rate of dissociation exceeds the rate of association and this is shown in panel C.

There's something else that I'd like you to take a look at in the graphic and that's the free energy (G) of the unbound state (P + L). See how it goes down relative to the free energy of the bound state (P.L) as the concentration of ligand decreases. When thinking about energetics of these systems, it actually makes a lot of sense to use the unbound state as the reference but you do need to use a reference concentration (e.g. 1 M) to do this.

When we do molecular design we often think in terms of manipulating energy differences. For example, we try to increase affinity by stabilizing the bound state relative to the unbound state. Once you start trying to manipulate off-rates, you soon realize that you can't change one thing at a time (unless you draft Maxwell's Demon into your project team). I've created a second graphic which looks similar to the first graphic although there are important differences between the two graphics. In particular, I'm referencing energy to the unbound state (P + L) which means that the ligand concentration is constant in all three panels. Let's consider the central panel as the starting point for design. We can go left from that starting point and stabilize the bound state which is equivalent to optimizing affinity. Stabilizing the bound state will also result in slower dissociation provided that the transition state energy remains unchanged. This is a good thing but it's difficult to show that the benefits come from the slower dissociation and not from the increased affinity. If you raise the barrier (i.e. increase the energy of the transition state) to reduce the off-rate you'll find that you have slowed the on-rate to an equal extent.

Before moving on, it may be useful to sum up where we've got to so far. First, ask yourself why you think off-rates will be relevant in situations where concentration changes on a longer time scale than binding. Second, you'll need to enlist the help of Maxwell's Demon if you want to reduce off-rate without affecting on-rate and/or affinity. Third, if you want to consider binding kinetics in design then it'd be best to use barrier height (referenced to unbound state) and affinity as your design parameters.
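
Here is a minimal sketch of the third point, using barrier height (referenced to the unbound state) and binding free energy as the two design parameters. It assumes simple transition state theory (an Eyring prefactor of kT/h with a transmission coefficient of 1) and a 1 M reference concentration, both of which are simplifications that I've chosen for illustration. The energies are invented.

```python
import math

R = 8.314462618e-3   # gas constant, kJ/(mol K)
KB = 1.380649e-23    # Boltzmann constant, J/K
H = 6.62607015e-34   # Planck constant, J s


def rate_constants(barrier_on, dg_bind, temperature=298.15):
    """Return (kon, koff, KD) from two design parameters in kJ/mol.

    barrier_on: transition state energy relative to the unbound state
    dg_bind:    bound state energy relative to the unbound state
                (negative for favorable binding)
    Assumes an Eyring prefactor and a 1 M reference concentration.
    """
    prefactor = KB * temperature / H
    rt = R * temperature
    kon = prefactor * math.exp(-barrier_on / rt)
    # The dissociation barrier is measured from the bound state
    koff = prefactor * math.exp(-(barrier_on - dg_bind) / rt)
    return kon, koff, koff / kon


start = rate_constants(40.0, -50.0)
raised_barrier = rate_constants(45.0, -50.0)      # transition state raised
stabilized_bound = rate_constants(40.0, -55.0)    # bound state stabilized

for label, (kon, koff, kd) in [("start", start),
                               ("raised barrier", raised_barrier),
                               ("stabilized bound state", stabilized_bound)]:
    print(f"{label}: kon={kon:.3g}  koff={koff:.3g}  KD={kd:.3g}")
```

Raising the barrier scales kon and koff by the same factor and leaves KD untouched, while stabilizing the bound state slows dissociation only by improving affinity. No demon required.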

Now I'd like to take a look at the philatelic drug discovery article. This is a harsh term but it does capture a tendency in some drug discovery programs to measure things for the sake of it (or at least to keep the grinning Lean Six Sigma 'belts' grinning). Some of this is a result of using techniques such as isothermal titration calorimetry (ITC) and surface plasmon resonance (SPR) that yield information in addition to affinity (that is of primary interest) at no extra cost. I really don't want to come across as a Luddite and I must stress that measurements of enthalpy, entropy, on-rate and off-rate are of considerable scientific interest and are also valuable for improving physical models. Furthermore, I am continually awed by the exquisite sensitivity of modern ITC and SPR instruments and would always want the option to be able to measure affinity using at least one of these techniques. However, problems start when the access to enthalpy, entropy, off-rates and on-rates becomes exploited for 'metrication' and drug discovery scientists seek 'enthalpy-driven' binding simply because the binding will be more 'enthalpy-driven'. It is easier to make the case for relevance of binding kinetics although, as Rutger points out, reducing the off-rate may very well make things worse if the on-rate is also reduced. It is much more difficult to assemble a coherent case for the relevance of thermodynamic signatures in drug discovery. Perhaps, some day, a seminal paper from the Budapest Enthalpomics Group (BEG) will reveal that isothermal systems like live humans can indeed sense the enthalpy and entropy changes associated with drugs binding to their targets although I will not be holding my breath.

Unsurprisingly, the thermodynamic and kinetic profiling (aka philatelic drug discovery) article advocates thermodynamic profiling of bioactive compounds in lead optimization projects. I'm going to focus on the kinetic profiling and it is worrying that the authors don't seem to be aware that on-rates and off-rates have to be seen in a pharmacokinetic context in order to make the connection with drug discovery. The authors may find it instructive to think about how inhibitor concentration would have varied over the course of a typical experiment in their cell-based assays. They are also likely to find Rutger's article to be educational and I recommend that they familiarize themselves with its content.

The following statement suggests that it may be beneficial for the authors to also familiarize themselves with the rudiments of chemical kinetics:

"Association and dissociation rate constants (kon and koff) of compound binding to a biological target are not intrinsically related to one another, although they are connected by dissociation equilibrium constant KD (KD = koff/kon)."

The processes of association and dissociation are actually connected by virtue of taking place along the same path and by having to pass through the same transition states. The difference in barrier heights for association and dissociation is given by the binding free energy. 

Some analysis of relationships between potency in a cell-based assay and KD, koff and kon was presented in Figure 6 of the article. I have a number of gripes with the analysis. First, it would be better to use logarithms of quantities like KD, IC50, koff and kon when performing analysis of this nature. In part, this is because we typically look for linear free energy relationships in these situations. There is another strong rationale for using logarithms because analysis of correlations between continuous variables works best when the uncertainties in data values are as constant as possible. My second gripe is that the authors have chosen to bin their data for analysis and this is a great way to shoot yourself in the foot. When you bin continuous data you both reduce your data analysis options and leave people wondering whether the binning has been done to hide the weakness of the trends in the data. I have droned at length about why it is naughty to bin continuous data so I'll leave it at that.
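
The point about constant uncertainties is easy to illustrate. The sketch below assumes that each KD value carries the same twofold uncertainty (an assumption that I've made for illustration) and compares the width of the uncertainty interval on the linear and logarithmic scales for a 1 nM and a 1 µM compound.

```python
import math


def interval_widths(kd, fold=2.0):
    """Width of the interval [kd / fold, kd * fold] on the linear
    scale and on the log10 scale."""
    low, high = kd / fold, kd * fold
    return high - low, math.log10(high) - math.log10(low)


for kd in (1e-9, 1e-6):  # 1 nM and 1 uM, in molar units
    linear_width, log_width = interval_widths(kd)
    print(f"KD={kd:.0e} M  linear width={linear_width:.2e}  log width={log_width:.3f}")
```

On the linear scale the uncertainty for the weaker compound is a thousand times larger than for the more potent one, while on the logarithmic scale the two are identical.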

It's been a long post and it's time to wrap things up. If you've found the post to be 'cansativo' (sounds so much more soothing in Portuguese) then spare a thought for the person who had to write it. To conclude, I'll leave you with a quote that I've taken from the abstract for Rutger's article:
"Moreover, fast association is typically more desirable than slow, and advantages of long residence time, notably a potential disconnect between pharmacodynamics (PD) and pharmacokinetics (PK), would be partially or completely offset by slow on-rate."