Sunday, 8 May 2016

A real world perspective on molecular design

I'll be taking a look at a Real-World Perspective on Molecular Design which has already been reviewed by Ash. I don't agree that this study can accurately be described as 'prospective' although, in fairness, it is actually very difficult to publish molecular design work in a genuinely prospective manner. Another point to keep in mind is that molecular modelers (like everybody else in drug discovery) are under pressure to demonstrate that they are making vital contributions. Let's take a look at what the authors have to say:

"The term “molecular design” is intimately linked to the widely accepted concept of the design cycle, which implies that drug discovery is a process of directed evolution (Figure 1). The cycle may be subdivided into the two experimental sections of synthesis and testing, and one conceptual phase. This conceptual phase begins with data analysis and ends with decisions on the next round of compounds to be synthesized. What happens between analysis and decision making is rather ill-defined. We will call this the design phase. In any actual project, the design phase is a multifaceted process, combining information on status and goals of the project, prior knowledge, personal experience, elements of creativity and critical filtering, and practical planning. The task of molecular design, as we understand it, is to turn this complex process into an explicit, rational and traceable one, to the extent possible. The two key criteria of utility for any molecular design approach are that they should lead to experimentally testable predictions and that whether or not these predictions turn out to be correct in the end, the experimental result adds to the understanding of the optimization space available, thus improving chances of correct prediction in an iterative manner. The primary deliverable of molecular design is an idea [4] and success is a meaningful contribution to improved compounds that interrogate a biological system."

This is certainly a useful study although I will make some criticisms in the hope that doing so stimulates discussion. I found the quoted section to lack coherence and would argue that the design cycle is actually more of a logistic construct than a conceptual one. That said, I have to admit that it's not easy to clearly articulate what is meant by the term 'molecular design'. One definition of molecular design is control of behavior of compounds and materials by manipulation of molecular properties. Using the term 'behavior' captures the idea that we design compounds to 'do' rather than merely to 'be'. I also find it useful to draw a distinction between hypothesis-driven molecular design (ask good questions) and prediction-driven molecular design (synthesize what the models, metrics or tea leaves tell you to). Asking good questions is not as easy as it sounds because it is not generally possible to perform controlled experiments in the context of molecular design, as discussed in another post from Ash. Hypothesis-driven molecular design can also be thought of as a framework in which to efficiently obtain the information required to make decisions and, in this sense, there are analogies with statistical molecular design. I believe that the molecular design that the authors describe in the quoted section is of the hypothesis-driven variety but hand-wringing about how "ill-defined" it is doesn't really help move things forward. The principal challenges for hypothesis-driven molecular design are to make it more objective, systematic and efficient. I'll refer you to a trio of blog posts (1 | 2 | 3) in which some of this is discussed in more detail.

I'll not say anything specific about the case studies presented in this study except to note that sharing specific examples of application of molecular design as case studies does help to move the field forward even when the studies are incomplete. The examples do illustrate how computational tools and structural databases can be used to provide a richer understanding of molecular properties such as conformational preferences and interaction potential. The CSD (Cambridge Structural Database) is a particularly powerful tool and, even in my Zeneca days, I used to push hard to get medicinal chemists using it. Something that we in the medicinal chemistry community might think about is how incomplete studies can be published so that specific learning points can be shared widely in a timely manner.

But now I'd like to move on to the conclusions, starting with 1 (value of quantitative statements). The authors note:

"Frequently, a single new idea or a pointer in a new direction is sufficient guidance for a project team. Most project impact comes from qualitative work, from sharing an insight or a hypothesis rather than a calculated number or a priority order. The importance of this observation cannot be overrated in a field that has invested enormously in quantitative prediction methods. We believe that quantitative prediction alone is a misleading mission statement for molecular design. Computational tools, by their very nature, do of course produce numerical results, but these should never be used as such. Instead, any ranked list should be seen as raw input for further assessment within the context of the project. This principle can be applied very broadly and beyond the question of binding affinity prediction, for example, when choosing classification rather than regression models in property prediction."

This may be uncomfortable reading for QSAR advocates, metric touts and those who would have you believe that they are going to disrupt drug discovery by putting cheminformatics apps on your phone. It is also close to my view of the role of computational chemistry in molecular design (the observant reader will have noticed that I didn't equate the two activities) although, in the interests of balance, I'll refer you to a review article on predictive modelling. We also need to acknowledge that predictive capability will continue to improve (although pure prediction-driven pharmaceutical design is likely to be at least a couple of decades away) and readers might find this blog post to be relevant.

Let's take a look at conclusion 5 (staying close to experiment), where the authors note:

"One way of keeping things as simple as possible is to preferentially utilize experimental data that may support a project, wherever this is meaningful. This may be done in many different ways: by referring to measured parameters instead of calculated ones or by utilizing existing chemical building blocks instead of designing new ones or by making full use of known ligands and SAR or related protein structures. Rational drug design has a lot to do with clever recycling."

This makes a lot of sense although I don't recommend use of the tautological term 'rational drug design' (has anybody ever done irrational drug design?). What they're effectively saying here is that it is easier to predict the effect of structural changes on properties of compounds than it is to predict those properties directly from molecular structure. The implication of this for cheminformaticians (and others seeking to predict behavior of compounds) is that they need to look at activity and chemical properties in terms of relationships between the molecular structures of compounds. I've explored this theme, both in an article and a blog post, although I should point out that there is a very long history of associating changes in the values of properties of compounds with modifications to molecular structures.

However, there is another side to "staying close to experiment" and that is recognizing what is and what isn't an experimental observable. The authors are clearly aware of this point when they state: 

"MD trajectories cannot be validated experimentally, so extra effort is required to link such simulation results back to truly testable hypotheses, for example, in the qualitative prediction of mechanisms or protein movements that may be exploited for the design of binders."

When interpreting structures of protein-ligand complexes, it is important to remember that the contribution of an intermolecular contact to affinity is not, in general, an experimental observable. As such, it would have been helpful if the authors had been a bit more explicit about exactly which experimental observable(s) form the basis of the "Scorpion network analysis of favorable interactions". The authors make a couple of references to ligand efficiency and I do need to point out that scaling the free energy of binding by molecular size has no thermodynamic basis because, in general, our perception of efficiency changes with the concentration used to define the standard state. On a lighter note, there is a connection between ligand efficiency and homeopathy that anybody writing about molecular design might care to ponder and that's where I'll leave things.
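
For readers who would like to see the standard state point in numbers, here is a minimal sketch. It assumes the common definition of ligand efficiency as the standard free energy of binding divided by the number of heavy atoms, with the free energy taken from RT ln(Kd/C°). The two compounds (a millimolar 'fragment' with 10 heavy atoms and a nanomolar 'lead' with 30) are hypothetical and were chosen only to make the arithmetic easy to follow; the function name and the three concentrations are likewise my own choices for illustration.

```python
import math

R = 1.987e-3  # gas constant in kcal/(mol K)
T = 298.15    # temperature in K


def ligand_efficiency(kd_molar, heavy_atoms, standard_conc_molar=1.0):
    """Return -dG/N in kcal/mol per heavy atom, where dG = RT ln(Kd/C)."""
    # -RT ln(Kd/C) is written as RT ln(C/Kd) to keep the sign explicit.
    return R * T * math.log(standard_conc_molar / kd_molar) / heavy_atoms


# Hypothetical compounds (illustration only): name -> (Kd in M, heavy atoms)
compounds = {
    "fragment": (1e-3, 10),
    "lead": (1e-9, 30),
}

# Same binding data, three choices of standard concentration (1 M is convention).
for c0 in (1.0, 1e-3, 1e3):
    values = {
        name: ligand_efficiency(kd, n_heavy, c0)
        for name, (kd, n_heavy) in compounds.items()
    }
    summary = ", ".join(f"{name} = {le:.2f}" for name, le in values.items())
    print(f"standard concentration {c0:g} M: {summary}")
```

With the conventional 1 M standard state the two compounds appear equally efficient, with a 1 mM standard state the lead looks more efficient, and with a 1000 M standard state the fragment does. Nothing about the binding has changed between the three cases, only the concentration unit.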
