Sunday, 27 January 2019

Reviewing the reviewers

I recently published The Nature of Ligand Efficiency (NoLE) as a ChemRxiv preprint and this was featured (for all the right reasons) in a post at In The Pipeline. The material had been previously submitted to J Med Chem but it proved a bit too spicy for two of the three reviewers. I'll review the J Med Chem reviewers in this blog post and I hope that the feedback will be useful in the event of the journal being presented with similarly flavored material in the future. NoLE was my second publication from Berwick-on-Sea in the village of Blanchisseuse on the north coast of my native Trinidad and I'll include some photos from there to break up the text a bit.

Gate at Berwick-on-Sea in Blanchisseuse. The house was built (quite literally) by my late father (who would have been 89 today) and was named for my mother's home town of Berwick-upon-Tweed which has changed hands between England and Scotland on a number of occasions and may even still be at war with Imperial Russia.

The selection of reviewers for manuscripts that criticize previous studies presents a dilemma for journal editors. While it is prudent to consult those with a stake in what is being criticized, these may not be the best people to ask about whether or not the criticism should be made. In particular, a reviewer using his/her position as a reviewer to suppress criticism of something in which he/she has a stake raises ethical questions. A stake in ligand efficiency (LE) could take any of a number of forms. First, one could have introduced a metric for LE. Second, one could have written articles endorsing ligand efficiency metrics or asserting their validity. Third, one could have enthusiastically promoted the LE metric at one's institution (e.g. by mandating that LE values be quoted when presenting project updates at the dog and pony shows that are an essential part of modern drug discovery). Fourth, one might be a devout member of the Fragment Cult (for whom the Doctrinal Correctness of LE is an Article of Faith).

There were three reviewers for my manuscript and I'll call them A, B and C since their numbers got scrambled between different rounds of review (also using the term 'Reviewer 3' might give some readers anxiety attacks). Reviewer A had nothing constructive to say and simply spat feathers. Reviewer B was very positive about the manuscript and made a number of helpful suggestions. Reviewer C demanded that the manuscript be watered down to homeopathic levels (and that was never going to happen).

Here's my office at Berwick-on-Sea. That's a printout of NoLE on my desk (under the hanging beach towel).

The central theme of my manuscript is the argument that ligand efficiency is physically meaningless because perception of efficiency changes with the concentration unit in which affinity is expressed. This is actually a very serious criticism since a change in perception resulting from a change in a unit would normally be regarded in physical science as an error in the "not even wrong" category. It's not something that one can simply sweep under the carpet as a "limitation" of ligand efficiency. Despite their howls of protest, neither Reviewer A nor Reviewer C offered a coherent counter-argument.
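To make the unit dependence concrete, here is a minimal sketch in Python. It assumes the commonly quoted definition of LE as the standard free energy of binding (sign reversed) divided by the number of non-hydrogen atoms, with the standard free energy taken as RT ln(KD/C°). The two compounds, their KD values, their heavy-atom counts and the temperature are all made up for illustration and are not taken from NoLE.

```python
import math

R_KCAL = 1.987e-3   # gas constant in kcal/(mol*K)
T_KELVIN = 298.15   # assumed temperature


def ligand_efficiency(kd_molar, heavy_atoms, c_std_molar=1.0):
    """LE = -dG0 / N_heavy with dG0 = RT ln(KD/C0) (assumed definition)."""
    return R_KCAL * T_KELVIN * math.log(c_std_molar / kd_molar) / heavy_atoms


# Hypothetical compounds: (KD in mol/L, number of non-hydrogen atoms)
fragment = (1e-3, 10)   # 1 mM binder, 10 heavy atoms
lead = (1e-8, 30)       # 10 nM binder, 30 heavy atoms

# Compare the two compounds at two choices of standard concentration
for c_std in (1.0, 1e-3):
    le_fragment = ligand_efficiency(*fragment, c_std_molar=c_std)
    le_lead = ligand_efficiency(*lead, c_std_molar=c_std)
    winner = "fragment" if le_fragment > le_lead else "lead"
    print(f"C0 = {c_std:g} M: fragment {le_fragment:.2f}, "
          f"lead {le_lead:.2f}, more 'efficient': {winner}")
```

With these invented numbers the fragment looks more efficient when C° is 1 M and the larger compound looks more efficient when C° is 1 mM, even though nothing about either compound has changed.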

The tactic adopted by Reviewer C was to simply dismiss the physical arguments presented in the manuscript as "opinion" without presenting counter-argument. J Med Chem really does need to make it clear to reviewers that they need to do much better than this since it reflects badly on the journal.

Reviewer C. "'Physically meaningless' is at best an inflammatory opinion whereas the fact that other choices could have been made is often under-appreciated."
PWK. This criticism appears to be doctrinal rather than scientific and I note that Reviewer C has not offered counter-argument to the argument that LE is physically meaningless.

Here's a view of the Caribbean Sea. The 20 m drop from the gap in the vegetation is just as precipitous as you would expect although we've not (yet) lost any personnel or household pets over the edge.

Reviewer A struggled woefully with rudimentary physical chemistry throughout the review process and, given that I'd suggested a number of potential reviewers with the necessary expertise in molecular recognition and chemical thermodynamics, I was at a loss to understand why a reviewer who was so ill-equipped for the task at hand had been invited to review the manuscript.

Reviewer A. Reactions are considered to be spontaneous under standard conditions when the free energy is negative, but by changing the definition of C° in an arbitrary manner, any reaction can be said to be spontaneous or not. This is true in a trivial sense, but generations of researchers have found the concept of negative or positive free energies useful.
PWK. The flaw in this argument is that if you change the value of C° then you also change whether or not the reaction is spontaneous under the standard conditions. This is the basis of the law of mass action and it is also important to remember that KD values are not measured at a single concentration. A chemical process (at constant temperature and pressure) by which the system changes from state A to state B will be spontaneous if ΔG[A→B] is negative. Regardless of the experiences of generations of researchers, medicinal chemists rarely (if ever) appear to use the sign of ΔG (e.g. for binding under assay conditions) when analyzing SAR or for making any other decisions.
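The point about the sign can be illustrated with a short sketch, assuming the usual relation ΔG° = RT ln(KD/C°) for binding. The KD value, the temperature and the function name are hypothetical choices for illustration only.

```python
import math

R_KCAL = 1.987e-3   # gas constant in kcal/(mol*K)
T_KELVIN = 298.15   # assumed temperature


def standard_binding_free_energy(kd_molar, c_std_molar):
    """dG0 = RT ln(KD/C0) in kcal/mol (assumed relation for binding)."""
    return R_KCAL * T_KELVIN * math.log(kd_molar / c_std_molar)


kd = 1e-6  # hypothetical 1 micromolar binder

dg_at_1_molar = standard_binding_free_energy(kd, c_std_molar=1.0)
dg_at_1_nanomolar = standard_binding_free_energy(kd, c_std_molar=1e-9)

# KD is below 1 M, so dG0 is negative for the 1 M standard state;
# KD is above 1 nM, so dG0 is positive for the 1 nM standard state.
print(f"C0 = 1 M : dG0 = {dg_at_1_molar:+.2f} kcal/mol")
print(f"C0 = 1 nM: dG0 = {dg_at_1_nanomolar:+.2f} kcal/mol")
```

The sign flips because the standard conditions themselves have changed: binding is favorable when every species is at 1 M and unfavorable when every species is at 1 nM, which is simply mass action at work for a 1 µM binder.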

This is the start of the path down to the lower deck.

In one round of review, Reviewer C stated “I believe that it is incumbent on the author to argue that the choice of standard state used by medicinal chemists is not useful” and Reviewer A repeated the criticism in a subsequent round, noting that this was "the central problem with the manuscript". I thought this was a bit rich given that Reviewer A and Reviewer C had each accused me of using straw man tactics at different points in the review process. The more serious problem, however, is that we have two LE advocates each attempting to transfer the burden of proof that (in science) one accepts as soon as one advocates that people take an action (e.g. use LE metrics). Reviewers A and C appeared to do this in order to evade their responsibility as reviewers to present counter-argument to the arguments in the manuscript. This would be like a thought leader (yes, there really are people who call themselves 'thought leaders') responding to criticism of a claim that AI was going to transform drug discovery by saying that it was incumbent on the critics to argue that AI was not useful. Imagine if they ran clinical trials like this?

At this point, Reviewer A did rather lose it and I was half expecting to have to fend off a counterattack by Steiner's division. Needless to say, the latest version of the manuscript now opens with "Ligand efficiency (LE) is, in essence, a good concept that is poorly served by a bad metric." and this can be considered the equivalent of a two-fingered gesture that is mistakenly attributed to the English and Welsh longbowmen at Agincourt.

Reviewer A. Dr. Kenny dodges this challenge by stating that the burden of proof should not be on him, but by arguing that LE is a “bad metric” despite its wide usage, he does in fact have to explain why free energy is also a “bad” concept. Not doing so makes the manuscript deeply misleading and therefore inappropriate for publication.
PWK. I only used the term “bad metric” in the conclusions where I wrote “Ligand efficiency is, in essence, a good concept served by a bad metric.” so it is incorrect to state that I have argued that LE is a “bad metric”. In any case, in the revised manuscript, I now question whether LE can accurately be described as a metric since neither its creators nor its advocates appear able (or willing) to say what it measures. Wide usage does not validate rules, guidelines or metrics and I note that, at one time, the prevailing view was that the sun orbited the earth. Once again, Reviewer A is making the serious error of assuming that everything that applies to free energy also applies to any function of free energy. The simple counter to Reviewer A’s challenge is that free energy is a state function and an integral part of the framework of thermodynamics. Although defined in terms of free energy, the LE metric is not part of thermodynamics simply because it appears to require a privileged standard state.

I have occasionally stated that "useful is the last refuge of the scoundrel" and this tends to be misinterpreted as an assertion that utility of a model is unimportant. Nothing could actually be further from the truth and the statement is more a comment on the way that models can be 'validated' by simply labeling them as "useful". In some ways "useful" is analogous to the "God created it that way" statements that you will encounter if you are careless enough to become ensnared in arguments with Creationists. I should also point out that the manuscript did discuss the difficulties of demonstrating the utility of LE while neither A nor C presented any evidence (fervent belief does not usually constitute evidence in science) to support their assertion that the 1 M standard state is more useful than any other standard state.

Reviewer A appeared particularly aggrieved that one of The Great Unwashed should have the temerity to even question the value of LE and the toys were duly ejected from the pram. As my response below indicates, Reviewer A's comment is more what one might have expected from an inquisitor at a fifteenth century heresy trial than from an expert reviewer of a manuscript submitted to the premier medicinal chemistry journal. It is also worth pointing out that LE was touted as "useful" even as it was introduced in a 2004 letter to Drug Discovery Today and all three coauthors of that seminal contribution to the medicinal chemistry literature appeared to be blissfully unaware of the nontrivial dependency of their creation on the standard concentration. As such, I would argue that it would actually be a dereliction of duty not to question the utility of LE.

Reviewer A. Sixth, Dr. Kenny repeatedly questions the utility of LE; for example “The LE metric is claimed by advocates to be useful although it is rarely, if ever, shown to be predictive of pharmaceutically-relevant behavior” (p. 15) and “the LE metric is rarely, if ever, shown to be predictive of phenomena that are relevant to drug discovery” (p. 39).
PWK. This appears to be a doctrinal rather than scientific criticism.

Lower deck. I only swim from here if snorkeling because it's rocky.

Reviewer B was very positive about the "Molecular Size and Design Risk" section and made useful suggestions for its expansion. It's also worth mentioning that Derek quoted from this section in his post. However, Reviewer C suggested that the whole section be purged from the manuscript although it is possible that Reviewer C's underlying objective was to ensure that certain articles were not discussed. Reviewer C complained that my criticism of ref 48 was unfair although it may be that the reviewer considered ref 48 to be a liability (this post will give readers an idea why some LE advocates might consider ref 48 to be a liability). Another possibility is that the objection to criticism of ref 48 was actually a smokescreen and the real reason for suggesting that the section be purged was actually to avoid discussion of ref 45 (which might be considered to be an even greater liability by LE advocates).

Ref 58 and ref 59 are rare examples of articles that respond to criticism of LE and a study such as NoLE really does need to discuss them (especially since both articles completely miss the point). The fundamental flaw that is common to both articles is that neither addresses the problems associated with the change in perception that results from using a different unit to express affinity. Reviewer C protested that it was gratuitous to single out ref 58 and even cited this 2014 post from Molecular Design in support of the charge that I was unfairly picking on ref 58. Reviewer C did seem rather rattled and also complained that I had quoted "non-scientific sections" of ref 59. I must confess to being unfamiliar with the concept that a scientific article can have non-scientific sections that can be declared off-limits for challenge. This was, perhaps, not Reviewer C's finest moment.

Reviewer A and Reviewer C both seemed rather keen that ref 94 not be discussed and they said that I should not be "attacking" fit quality (FQ) because it is rarely, if ever, used. I suspect the real reason was that both reviewers consider the metric (and ref 94) to be a significant liability from the LE perspective. I responded by noting that FQ had got its own box in the NRDD LE review and that ref 94 was cited in ref 58 (which asserts the validity of LE), suggesting that FQ may be of greater interest than Reviewer A and Reviewer C would have us believe. Another reason that Reviewer C might have preferred that the spotlight not be focused on FQ is that the discussion further exposes the illusion that fragments bind more efficiently than ligands of greater molecular size.

This is where I go swimming. It's a 5 minute walk from the house

So that concludes my review of the reviewers. I believe that the J Med Chem editors do need to think carefully about how (or even whether) they wish to have controversial topics addressed in their journal. Dr Eric Williams, the first Prime Minister of Trinidad and Tobago, suggested that his hearing impairment was an advantage in dealing with dissent because he could simply switch off his hearing aid. However, dealing with controversial topics in drug discovery might not be quite so simple. In particular, a journal needs to consider the potential vested interests of those from whom it seeks advice. For example, the Editors of a number of ACS journals may find it quite instructive to take a very close look at exactly how their journals came to endorse a frequent hitter model (trained on results from a panel of only six assays that all use the same readout) as a predictor of pan-assay interference...

I'll leave you with a selfie taken on the roof. A few minutes earlier I'd seen off a determined counter-attack by some jack spaniards (or should that be jacks spaniard?). Normally, I'd leave them alone but they were too close to where I needed to work. The technique is simple but its execution takes some nerve. First, arm yourself with a can of Baygon (don't forget to test it beforehand) and a broom. Second, with Baygon aimed, prod nest with broom. Third, spray a protective curtain of Baygon as the jack spaniards attack you (they are aggressive and they always attack). 

PWK one, jack spaniards nil 
