Molecular Design: PAINS

Showing posts with label PAINS. Show all posts

Wednesday, 1 April 2026

PAINS and Prejudice

<< previous || next >>

PAINS (pan assay interference compounds) filters have exerted a hold over the drug discovery community ever since the BH2010 study appeared over 15 years ago. Initially I didn’t take much notice of PAINS filters and, in any case, I’d already moved on from analysis of high-throughput screening (HTS) output by that point (I might add ‘thankfully’ because looking at too much HTS output is a sure-fire route to the funny farm). I started analysing HTS output from about 1993 at what was then Zeneca. I used the Daylight toolkit to create the Struct_Anal SMARTS-based chemical structure profiler in 1995 and, at that time, we were already using in house software named Flush (even at that stage it was clear that much of the HTS output being generated was going to disappear round the S-bend and our friends at what was then Rhône-Poulenc Rorer developed HARPick to ensure that nothing remained stuck to the porcelain).

Photo from 2011 at 'The Black Hole' (Los Alamos NM)

Something that had always worried me was that it was very easy to opine that a compound looked nasty but it was much more difficult to demonstrate objectively that the compound was indeed nasty. Late in 2014 a blog post, which fell well short of the standards that the drug discovery community has come to expect from Practical Fragments, prompted me to take a more forensic look at PAINS filters. What I found was that PAINS filters were based on the output from screening compounds in just six AlphaSceen assays (if a panel of six assays that all use the same read-out strikes you as suboptimal design of an experiment to detect pan-assay interference then you’re not alone). After blogging periodically about PAINS filters for a couple of years I wrote a Perspective on the topic (as noted in this blog post: from time to time, every blogger should write a journal article “pour encourager les autres”).

Nevertheless, doubts about the correctness of my position started to creep in when I was denounced for being insufficiently thoughtful in my published comments on PAINS by the authors, one of whom is a former colleague, of the seminal, insightful and Nobel-worthy ‘Seven Year Itch’ article (BN2017) which oozes wisdom and penetrating insight. Although stung by the criticism and wracked by self-doubt to the extent that I considered therapy, it was a recent study led by the world-renowned expert on tetrodotoxin pharmacology, Prof. Angelique Bouchard-Duvalier of the Port-au-Prince Institute of Biogerontology, working in collaboration with the Budapest Enthalpomics Group (BEG), that removed any lingering doubts about the sublime elegance and extreme predictivity of PAINS filters. The manuscript has not yet been made publicly available although I was able to access it with the help of my associate ‘Anastasia Nikolaeva’ (not sure exactly what she’s doing these days although I understand that she’s currently visiting Port-au-Prince for a medication review with Prof. Bouchard-Duvalier). There is no doubt that this genuinely disruptive study will comprehensively reshape the predictive biochemistry landscape, enabling drug discovery scientists to accurately, meaningfully and robustly predict assay interference using only chemical structures as input for the very first time.

Prof. Bouchard-Duvalier’s seminal study clearly demonstrates that singlet oxygen quenching is actually a conserved feature for all known and unknown mechanisms of interference with assay read-outs and that PAINS filters dramatically outperform all other methods for prediction of assay interference. The math is truly formidable (the rudimentary nature of my understanding of Haitian patois didn’t help either) and involves first projecting the atomic isothermal compressibility matrix into the quadrupole-normalized polarizability tensor before applying the Barron-Samedy transformation, followed by hepatic eigenvalue extraction using a the elegant algorithm devised by E. V. Tooms (a reclusive Baltimore resident and connoisseur of liver pâté whose illustrious thought leadership of the analytic topology field unravelled almost 32 years ago after he failed to comply with the safety instructions for an escalator). The incisive analysis of Prof. Bouchard-Duvalier shows without a shadow of doubt that singlet oxygen quenching as quantified by the AlphaScreen assay read-out is a fundamental principle in biomolecular assay science. Furthermore, ‘Anastasia Nikolaeva’ was also able to ‘liberate’ a prepared press release in which the grinning BEG director Prof. Kígyó Olaj explains:

Possibilities are limitless now that we can accurately and robustly predict the assay interference that compounds will exhibit directly from their chemical structures and we can safely consign experimental biochemical assays to the dustbin of history. Surely the Journal of Medicinal Chemistry Editors will now finally recognize the colossal impact that PAINS filters have made on real world drug discovery and development when they make their FIFA Prize nominations later this year.

Sunday, 31 December 2023

Chemical con artists foil drug discovery

One piece of general advice that I offer to fellow scientists is to not let the fact that an article has been published in Nature (or any other ‘elite’ journal for that matter) cause you to switch off your critical thinking skills while reading it and the BW2014 article (Chemistry: Chemical con artists foil drug discovery) that I’ll be reviewing in this post is an excellent case in point. My main criticism of BW2014 that is that the rhetoric is not supported by data and I’ve always seen the article as something of a propaganda piece.

One observation that I’ll make before starting my review of BW2014 is that what lawyers would call ‘standard of proof’ varies according to whether you’re saying something good about a compound or something bad. For example, I would expect a competent peer reviewer to insist on measured IC50 values if I had described compounds as inhibitors of an enzyme in a manuscript. However, it appears to be acceptable, even in top journals, to describe compounds as PAINS without having to provide any experimental evidence that they actually exhibit some type of nuisance behavior (let alone pan-assay interference). I see a tendency in the ‘compound quality’ field for opinions to be stated as facts and reading some of the relevant literature leaves me with the impression that some in the field have lost the ability to distinguish what they know from what they believe.

BW2014 has been heavily cited in the drug discovery literature (it was cited as the first reference in the ACS assay interference editorial which I reviewed in K2017) despite providing little in the way of practical advice for dealing with nuisance behavior. B2014 appears to exert a particularly strong influence on the Chemical Probes Community having been cited by the A2015, BW2017, AW2022 and A2022 articles as well as in the Toxicophores and PAINS Alerts section of the Chemical Probes Portal. Given the commitment of the Chemical Probes Community to open science, their enthusiasm for the PAINS substructure model introduced in BH2010 (New Substructure Filters for Removal of Pan Assay Interference Compounds (PAINS) from Screening Libraries and for Their Exclusion in Bioassays) is somewhat perplexing since neither the assay data nor the associated chemical structures were disclosed. My advice to the Chemical Probes Community is to let go of PAINS filters.

Before discussing BW2014, I’ll say a bit about high-throughput screening (HTS) which emerged three decades ago as a lead discovery paradigm. From the early days of HTS it was clear, at least to those who were analyzing the output from the screens, that not every hit smelt of roses. Here’s what I wrote in K2017:

Although poor physicochemical properties were partially blamed (3) for the unattractive nature and promiscuous behavior of many HTS hits, it was also recognized that some of the problems were likely to be due to the presence of particular substructures in the molecular structures of offending compounds. In particular, medicinal chemists working up HTS results became wary of compounds whose molecular structures suggested reactivity, instability, accessible redox chemistry or strong absorption in the visible spectrum as well as solutions that were brightly colored. While it has always been relatively easy to opine that a molecular structure ‘looks ugly’, it is much more difficult to demonstrate that a compound is actually behaving badly in an assay.

It has long been recognized that it is prudent to treat frequent-hitters (compounds that hit in multiple assays) with caution when analysing HTS output. In K2017 I discussed two general types of behavior that can cause compounds to hit in multiple assays: Type 1 (assay result gives an incorrect indication of the extent to which the compound affects target function) and Type 2 (compound acts on target by undesirable mechanism of action (MoA)). Type 1 behavior is typically the result of interference with the assay read-out and the hits in question can be accurately described as ‘false positives’ because the effects on the target are not real. Type 1 behaviour should be regarded as a problem with the assay (rather than with the compound) and, provided that the activity of a compound has been established using a read-out for which interference is not a problem, interference with other read-outs is irrelevant. In contrast, Type 2 behavior should be regarded as a problem with the compound (rather than with the assay) and an undesirable MoA should always be a show-stopper.

Interference with read-out and undesirable MoAs can both cause compounds to hit in multiple assays. However, these two types of bad behavior can still cause big problems whether or not the compounds are observed to be frequent-hitters. Interference with read-out and undesirable MoAs are very different problems in drug discovery and the failure to recognize this point is a serious deficiency that is shared by BW2014 and BH2010.

Although I’ve criticized the use of PAINS filters there is no suggestion that compounds matching PAINS substructures are necessarily benign (many of the PAINS substructures look distinctly unwholesome to me). I have no problem whatsoever with people expressing opinions as to the suitability of compounds for screening provided that the opinions are not presented as facts. In my view the chemical con-artistry of PAINS filters is not that benign compounds have been denounced but the implication that PAINS filters are based on relevant experimental data.

Given that the PAINS filters form the basis of a cheminformatic model that is touted for prediction of pan-assay interference, one could be forgiven for thinking that the model had been trained using experimental observations of pan-assay interference. This is not so, however, and the data that form the basis of the PAINS filter model actually consist of the output of six assays that each use the AlphaScreen read-out. As noted in K2017, a panel of six assays using the same read-out would appear to be a suboptimal design of an experiment to observe pan assay interference. Putting this in perspective, P2006 (An Empirical Process for the Design of High-Throughput Screening Deck Filters) which was based on analysis of the output from 362 assays had actually been published four years before BH2010.

After a bit of a preamble, I need to get back to reviewing BW2014 and my view is that readers of the article who didn’t know better could easily conclude that drug discovery scientists were completely unaware of the problems associated with misleading HTS assay results before the re-branding of frequent-hittters as PAINS in BH2010. Given that M2003 had been published over a decade previously. I was rather surprised that BW2014 had not cited a single article about how colloidal aggregation can foil drug discovery. Furthermore, it had been known (see FS2006) for years before the publication of BH2010 that the importance of colloidal aggregation could be assessed by running assays in the presence of detergent.

I'll be commenting directly on the text of BW2014 for the remainder of the post (my comments are italicized in red).

Most PAINS function as reactive chemicals rather than discriminating drugs. [It is unclear here whether “PAINS” refers to compounds that have been shown by experiment to exhibit pan-assay interference or simply compounds that share structural features with compounds (chemical structures not disclosed) claimed to be frequent-hitters in the BH2010 assay panel. In any case, sweeping generalizations like this do need to be backed with evidence. I do not consider it valid to present observations of frequent-hitter behavior as evidence that compounds are functioning as reactive chemicals in assays.] They give false readouts in a variety of ways. Some are fluorescent or strongly coloured. In certain assays, they give a positive signal even when no protein is present. [The BW2014 authors appear to be confusing physical phenomena such as fluorescence with chemical reactivity.]

Some of the compounds that should ring the most warning bells are toxoflavin and polyhydroxylated natural phytochemicals such as curcumin, EGCG (epigallocatechin gallate), genistein and resveratrol. These, their analogues and similar natural products persist in being followed up as drug leads and used as ‘positive’ controls even though their promiscuous actions are well-documented (8,9). [Toxoflavin is not mentioned in either Ref8 or Ref9 although T2004 would have been a relevant reference for this compound. Ref8 only discusses curcumin and I do not consider that the article documents the promiscuous actions of this compound. Proper documentation of the promiscuity of a compound would require details of the targets that were hit, the targets that were not hit and the concentration(s) at which the compound was assayed. The effects of curcumin, EGCG (epigallocatechin gallate), genistein and resveratrol on four membrane proteins were reported in Ref9 and these effects would raise doubts about activity for any of these compounds (or their close structural analogs) that had been observed in a cell-based assay. However, I don’t consider that it would be valid to use the results given in Ref9 to cast doubt on biological activity measured in an assay that was not cell-based.]

Rhodanines exemplify the extent of the problem. [Rhodanines are specifically discussed in K2017 in which I suggest that the most plausible explanation for the frequent-hitter behavior observed for rhodanines in the BH2010 panel of six AlphaScreen assays is that the singly-connected sulfur reacts with singlet oxygen (this reactivity has been reported for compounds with thiocarbonyl groups in their molecular structures).] A literature search reveals 2,132 rhodanines reported as having biological activity in 410 papers, from some 290 organizations of which only 24 are commercial companies. [Consider what the literature search would have revealed if the target substructure had been ‘benzene ring’ rather than ‘rhodanine’? As discussed in this post the B2023 study presented the diversity of targets hit by compounds incorporating a fused tetrahydroquinolines in their molecular structures as ‘evidence’ for pan-assay interference by compounds based on this scaffold.] The academic publications generally paint rhodanines as promising for therapeutic development. In a rare example of good practice, one of these publications (10) (by the drug company Bristol-Myers Squibb) warns researchers that these types of compound undergo light-induced reactions that irreversibly modify proteins. [The C2001 study (Photochemically enhanced binding of small molecules to the tumor necrosis factor receptor-1 inhibits the binding of TNF-α) is actually a more relevant reference since it focuses of the nature of the photochemically enhanced binding. The structure of the complex of TNFRc1 with one of the compounds studied (IV703; see graphic below) showed a covalent bond between one of carbon atoms of the pendant nitrophenyl and the backbone amide nitrogen of A62. The structure of the IV703–TNFRc1 complex shows that a covalent bond between pendant aromatic ring must also be considered as a distinct possiblity for the rhodanines reported in Ref10 and C2001.] It is hard to imagine how such a mechanism could be optimized to produce a drug or tool. Yet this paper is almost never cited by publications that assume that rhodanines are behaving in a drug-like manner. [It would be prudent to cite M2012 (Privileged Scaffolds or Promiscuous Binders: A Comparative Study on Rhodanines and Related Heterocycles in Medicinal Chemistry) if denouncing fellow drug discovery scientists for failure to cite Ref10.]

In a move partially implemented to help editors and manuscript reviewers to rid the literature of PAINS (among other things), the Journal of Medicinal Chemistry encourages the inclusion of computer-readable molecular structures in the supporting information of submitted manuscripts, easing the use of automated filters to identify compounds’ liabilities. [I would be extremely surprised if ridding the literature of PAINS was considered by the JMC Editors when they decided to implement a requirement that authors include computer-readable molecular structures in the supporting information of submitted manuscripts. In any case, claims such as this do need to be supported by evidence.] We encourage other journals to do the same. We also suggest that authors who have reported PAINS as potential tool compounds follow up their original reports with studies confirming the subversive action of these molecules. [I’ve always found this statement bizarre since the BW2014 authors appear to be suggesting that that authors who have reported PAINS as potential tool compounds should confirm something that they have not observed and which may not even have occurred. When using the term “PAINS” do the BW2014 authors mean compounds that have actually been shown to exhibit pan-assay interference or compounds that that share structural features with compounds that were claimed to exhibit frequent-hitter behavior in the BH2010 assay panel? Would interference in with the AlphaScreen read-out by a singlet oxygen quencher be regarded as a subversive action by a molecule in situations when a read-out other than AlphaScreen had been used?] Labelling these compounds clearly should decrease futile attempts to optimize them and discourage chemical vendors from selling them to biologists as valid tools. [The real problem here is compounds being sold as tools in the absence of the measured data that is needed to support the use of the compounds for this purpose. Matches with PAINS substructures would not rule out the use of a compound as a tool if the appropriate package of measured data is available. In contrast, a compound that does not match any PAINS substructures cannot be regarded as an acceptable tool if the appropriate package of measured data is not available. Put more bluntly, you’re hardly going to be able to generate the package of measured data if the compound is as bad as PAINS filter advocates say it is.]

Box: PAINS-proof drug discovery

Check the literature. [It’s always a good idea to check the literature but the failure of the BW2014 authors to cite a single colloidal aggregation article such as M2003 suggests that perhaps they should be following this advice rather than giving it. My view is that the literature on scavenging and quenching of singlet oxygen was treated in a cursory manner in BH2010 (see earlier comment in connection with rhodanines).] Search by both chemical similarity and substructure to see if a hit interacts with unrelated proteins or has been implicated in non-drug-like mechanisms. [Chemical similarity and substructure search will identify analogs of hits and it is actually the exact match structural search that you need do in order to see if a particular compound is a hit in assays against unrelated proteins.] Online services such as SciFinder, Reaxys, BadApple or PubChem can assist in the check for compounds (or classes of compound) that are notorious for interfering with assays. [I generally recommend ChEMBL as a source of bioactivity data.]

Assess assays. For each hit, conduct at least one assay that detects activity with a different readout. [This will only detect problems associated with interference with read-out. As discussed in S2009 it may be possible to assess and even correct for interference with read-out without having to run an assay with a different read-out.] Be wary of compounds that do not show activity in both assays. If possible, assess binding directly, with a technique such as surface plasmon resonance. [SPR can also provide information about MoA since association, dissociation and stoichiometry can all be observed directly using this detection technology.]

That concludes blogging for 2023 and many thanks to anybody who has read any of the posts this year. For too many people Planet Earth is not a very nice place to be right now and my new year wish is for a kinder, happier and more peaceful world in 2024.

Wednesday, 6 December 2023

Are fused tetrahydroquinolines interfering with your assay?

I’ll be taking a look at B2023 (Fused Tetrahydroquinolines Are Interfering with Your Assay) in this post. The article has already been discussed in posts at Practical Fragments and In The Pipeline. In anticipation of the stock straw man counterarguments to my criticisms of PAINS filters, I must stress that there is absolutely no suggestion that compounds matching PAINS filters are necessarily benign. The authors have shown that fusion of cyclopentene at C3-C4 of the tetrahydroquinoline (THQ) ring system is associated with a risk of chemical instability and I consider this to be extremely useful information for anybody thinking about using this scaffold. However, the authors do also appear to be making a number of claims that are not supported by evidence and, in my view, have not demonstrated that the chemical instability leads to pan-assay interference or even frequent-hitter behavior.

The term ‘PAINS’ crops up frequently in B2023 (the authors even refer to “the PAINS concept” although I think that’s pushing things a bit) and I’ll start by saying something about two general types of nuisance behavior of compounds in assays and these points are discussed in more detail in K2017 (Comment on The Ecstasy and Agony of Assay Interference Compounds). From the perspective of screening libraries of compounds for biological activity, the two types of nuisance behavior are very different problems that need to be considered very differently. One criticism that can be made of both BH2010 (original PAINS study) and BW2014 (Chemical con artists foil drug discovery) is that neither study considers the differing implications for drug discovery of these two types of nuisance behavior.

The first type of nuisance behavior in assays is interference with assay read-out and when ‘activity’ in an assay is due to assay interference hits can accurately be described as ‘false positives’ (this should be seen as a problem with the assay rather than the compound). Interference with assay read-outs is certainly irksome when you’re analysing output from screens because you don’t know if the ‘activity’ is real or not. However, if you’re able to demonstrate genuine activity for a compound using an assay with a read-out for which interference is not an issue then interference with other assay read-outs is irrelevant and would not rule out the compound as a viable starting point for further investigation. Interference with assay read-outs generally increases with the concentration of the compound in the assay (this is why biophysical methods are often favored for screening fragments) and I’ll direct readers to a helpful article by former colleagues. It’s also worth noting that interference with read-out can also lead to false negatives.

The second type of nuisance behavior is that the compound acts on a target by an undesirable mechanism of action (MoA) and it is not accurate to describe hits behaving in this manner as ‘false positives’ because the effect on the target is real (this should be seen as a problem with the compound rather than the assay). In contrast to interference with read-out, an undesirable MoA is a show-stopper. An undesirable MoA with which many drug discovery scientists will be familiar is colloidal aggregate formation (see M2003) and the problem can be assessed by running the assay in the absence and presence of detergent (see FS2006). In some cases patterns in screening output may point to an undesirable MoA. For example, cysteine reactivity might be indicated by compounds hitting in multiple assays for inhibition of enzymes that use feature cysteine in their catalytic mechanisms.

I’ll make some comments on PAINS filters before I discuss B2023 in detail and much of what I’ll be saying has already been said in K2017 and C2017 (Phantom PAINS: Problems with the Utility of Alerts for Pan-Assay INterference CompoundS) although you shouldn’t need to consult these articles in order to read the blog post unless you want to get some more detail. The PAINS filter model introduced in BH2010 consists of number of substructures which are claimed (I say “claimed” because the assay results and associated chemical structures are proprietary) to be associated with frequent hitter behavior in a panel of six assays that all use the AlphaScreen read-out (compounds that react with or quench singlet oxygen have the potential of interfere with this read-out). I argued in K2017 that six assays, all using the same read-out, do not constitute a credible basis for the design of an experiment to detect pan-assay interference. Put another way, the narrow scope of the data used to train the PAINS filter model restricts the applicability domain of this model to prediction of frequent-hitter behavior in these six assays. The BH2010 study does not appear present a single example of a compound that has been actually been demonstrated by experiment to exhibit pan-assay interference.

The B2023 study reports that tetrahydroquinolines (THQs) fused at C3-C4 with cyclopentene (1) are unstable. This is valuable information for anybody who may be have the misfortune to be working with this particular scaffold and the observed instability implies that drug discovery scientists should also be extremely wary of any biological activity reported for compounds that incorporate this scaffold. Furthermore, the authors show that the instability can be linked to the presence of the carbon-carbon double bond in the ‘third ring’ since 2, the dihydro analog of 1, appears to be stable. I would certainly mention the chemical instability reported in B2023 if reviewing a manuscript that reported biological activity for compounds based on this scaffold. However, I would not mention that BH2010 has stated that the scaffold matches the anil_alk_ene (SLN: C[1]:C:C:C[4]:C(:C:@1)NCC[9]C@4C=CC@9 ) PAINS substructure because the nuisance behavior consists of hitting frequently in a six-assay panel of questionable relevance and the PAINS filters were based on analysis of proprietary data.

Although I wouldn’t have predicted the chemical instability reported for 1 by B2023, this scaffold is certainly not a structural feature that I would have taken into lead optimization with any enthusiasm (a hydrogen that is simultaneously benzylic and allylic does rather look like a free lunch for the CYPs). I would still be concerned about instability even if methylene groups were added to or deleted from the aliphatic parts of 1. I suspect that the electron-releasing nitrogen of 1 contributes to chemical instability although I don’t think that changing nitrogen for another atom type would eliminate the risk of chemical instability. Put another way, the instability observed for 1 should raise questions about the stability of a number of structurally-related scaffolds. Chemical instability is (or at least should be) a show-stopper in the context of drug discovery even if doesn't lead to interference with assay read-out, an undesirable MoA or pan-assay interference.

I certainly consider the instability observed for 1 to be of interest and relevant to a number of structurally-related chemotypes. However, I have a number of concerns about B2023 and one specific criticism is that the authors use “tricyclic/fused THQ” as a synonym throughout the text as a synonym for “tricyclic/fused THQ with a carbon-carbon double bond in the ‘third’ ring”. At best this is confusing and it could lead to groundless criticism, either publicly or in peer review, of a study that reported assay results for compounds based on the scaffold in 2. A more general point is that the authors make a number of claims that, in my view, are not adequately supported by evidence. I’ll start with the significance section and my comments are italicized in red:

Tricyclic tetrahydroquinolines (THQs) are a family of lesser studied pan-assay interference compounds (PAINS) [The authors need to provide specific examples of tricyclic THQs that have been actually been shown to exhibit pan-assay interference to support this claim.] These compounds are found ubiquitously throughout commercial and academic small molecule screening libraries. [The authors do not appear to have presented evidence to support this claim and the presence of compounds in vendor catalogues does not prove that the compounds are actually being screened. In my view, the authors appear to be trying to ‘talk up’ the significance of their findings by making this statement.] Accordingly, they have been identified as hits in high-throughput screening campaigns for diverse protein targets. We demonstrate that fused THQs are reactive when stored in solution under standard laboratory conditions and caution investigators from investing additional resource into validating these nuisance compounds.

Continuing with the introduction

Fused tetrahydroquinolines (THQs) are frequent hitters in hit discovery campaigns. [In my view the authors have not presented sufficient evidence to support this statement and I don’t consider claims made in the BH2010 for frequent-hitter behavior by compounds matching the anil_alk_ene PAINS substructure to be admissible as evidence simply because they are based on proprietary data. In any case the numbers of compounds matching the anil_alk_ene PAINS substructure and reported in BH2010 to hit in zero (17) or one (11) assays in the PAINS assay panel suggest that 28 compounds (of a total of 51 substructural matches) cannot be regarded as frequent-hitters in this assay panel.] Pan-assay interference compounds (PAINS) have been controversial in the recent literature. While some literature supports these as nuisance compounds, other papers describe PAINS as potentially valuable leads. (1 | 2 | 3 | 4) [The C2017 study referenced as 2 is actually a critique of PAINS filters and I’m assuming that the authors aren’t suggesting that it “supports these [PAINS] as nuisance compounds”. However, I would consider it a gross misrepresentation of C2017 to imply that the study describes “PAINS as potentially valuable leads”.] There have been descriptions of many different classes of PAINS that vary in their frequency of occurrence as hits in the screening literature. [In my view, the number of articles on PAINS appears to greatly exceed the number of compounds that have actually been shown to exhibit pan-assay interference.]

The number of papers that selected this scaffold during hit discovery campaigns from multiple chemical libraries supports the idea that fused THQs are frequent hitters. [Let’s take a closer look at what the authors are suggesting by considering a selection of compounds, each of which has a benzene ring in its molecular structure. Now let’s suppose that each of a large number of targets is hit by at least one of the compounds in this selection (I could easily satisfy this requirement by selecting marketed drugs with benzene rings in their molecular structures). Applying the same logic as the authors, I could use these observations to support the idea that compounds incorporating benzene rings in their molecular structures are frequent-hitters. In my view the B2023 study doesn’t appear to have presented a single example of a fused THQ that has actually been shown experimentally to exhibit frequent-hitter behavior. As mentioned earlier in this post less than half of the compounds matching the anil_alk_ene PAINS substructure that were evaluated in the BH2010 assay panel can be regarded as frequent-hitters.] At first glance, these compounds appear to be valid, optimizable hits, with reasonable physicochemical properties. Although micromolar and reproducible activity has been reported for multiple THQ analogues on many protein targets, hit-to-lead optimization programs aimed at improving the initial hits (Supporting Information (SI), Table S1) have resulted in no improvement in potency or no discernible structure–activity relationships (SAR) [Achieving increased potency and establishing SARs are certainly important objectives in hit-to-lead studies. However, assertions that hit-to-lead optimizations “have resulted in no improvement in potency or no discernible structure–activity relationships” do need to be supported with appropriate discussion of specific hit-to-lead optimization studies.]

Examples of Fused THQs as “Hits” Are Pervasive

The diversity of protein targets captured below supports the premise that the fused THQ scaffold does not yield specific hits for these proteins but that the reported activity is a result of pan-assay interference. [I could use an argument analogous to the one that I’ve just used for frequent-hitters to ‘prove’ that compounds with benzene rings in their molecular structure do not yield specific hits and that any reported activity is due to pan-assay interference. The authors do not appear to have presented a single example of a fused THQ that has been shown by experiment to exhibit pan-assay interference.]

Concluding remarks

Our review and evidence-based experiments solidify the idea that tricyclic THQs are nuisance compounds that cause pan-assay interference in the majority of screens rather than privileged structures worthy of chemical optimization. [While I certainly agree that chemical instability would constitute a nuisance, I would consider it wildly extravagant to claim that tricyclic THQs can “cause pan assay interference” since nobody appears to have actually observed pan-assay interference for even a single tricyclic THQ.] Their widespread micromolar activities on a broad range of proteins with diverse assay readouts support our assertion that they are unlikely to be valid hits. [As stated previously, I do not consider that “widespread micromolar activities on a broad range of proteins” observed for compounds that share a particular structural feature implies that all compounds with the particular structural feature are unlikely to be valid hits.]

So that concludes my review of the B2023 study. I really liked the experimental work that revealed the instability of 1 and linked it to the presence of the double bond in the 'third' ring. Furthermore, these experimental results would (at least for me) raise questions about the chemical stability of some scaffolds that are structurally-related to 1. However, I found the analysis of the bioactivity data reported in the literature for fused THQs to be unconvincing to the extent that it significantly weakened the B2023 study.

Wednesday, 22 February 2023

Structural alerts and assessment of chemical probes

<< previous |

I’ll wrap up (at least for now) the series of posts on chemical probes by returning to the use of cheminformatic models for assessment of the suitability of compounds for use as chemical probes. My view is that there is currently no cheminformatic model, at least in the public domain, that is usefully predictive of the suitability (or unsuitability) of compounds for use as chemical probes and that assessments should therefore be based exclusively on experimental measurements of affinity, selectivity etc. Put another way, acceptable chemical probes will need to satisfy the same criteria regardless of the extent to which they offend the tastes of PAINS filter evangelists (and if PAINS really are as bad as the evangelists would have us believe then they’re hardly going to satisfy these acceptability criteria). My main criticism of PAINS filters (summarized in this comment on the ACS assay interference editorial) is that there is a significant disconnect between dogma and data.

I’ll start by saying something about cheminformatics since, taken together, the PAINS substructures can be considered as a cheminformatic predictive model. If you’re using a cheminformatic predictive model then you also need to be aware that it will have an applicability domain which is limited by the data used to train and validate the model. Consider, for example, that you have access to a QSAR model for hERG blockade that has been trained and validated using only data for compounds that are protonated at the assay pH. If you base decisions on predictions for compounds that are neutral under assay conditions then you’d be using the model outside its applicability domain (and therefore in a very weak position to blame the modelers if the shit hits the fan). While cheminformatic predictive models might (or might not) help you get to a desired endpoint more quickly you’ll still need experimental measurements in order to know that you have indeed got the desired end point.

But let’s get back to PAINS filters which were introduced in this 2010 study. PAINS is an acronym for pan-assay interference compounds and you could be forgiven for thinking that PAINS filters were derived by examining chemical structures of compounds that had been shown to exhibit pan-assay interference. However, the original PAINS study doesn’t appear to present even a single example of a compound that is shown experimentally to exhibit pan-assay interference and the medicinal chemistry literature isn’t exactly bursting at the seams with examples of such compounds.

The data set on which the PAINS filters were trained consisted of the hits (assay results in which the response was greater than a threshold when the compound was tested at a single concentration) from six high-throughput screens, each of which used AlphaScreen read-out. Although PAINS filters are touted as predictors of pan-assay interference it would be more accurate to describe them as predictors of frequent-hitter behavior in this particular assay panel (as noted in a previous post promiscuity generally increases as the activity threshold is made more permissive). From a cheminformatic perspective the choice of this assay panel appears to represent a suboptimal design of an experiment to detect and characterize pan-assay interference (especially given that data from “more than 40 primary screening campaigns against enzymes, ion channels, protein-protein interactions, and whole cells” were available for analysis). Those who advocate the use of PAINS filters for the assessment of the suitability of compounds for use as chemical probes (and the Editors-in-Chief of more than one ACS journal) may wish to think carefully about why they are ignoring a similar study based on a larger, more diverse (in terms of targets and read-outs) data set that had been published four years before the PAINS study.

Although a number of ways in which potential nuisance compounds can reveal their dark sides are discussed in the original PAINS study the nuisance behavior is not actually linked to the frequent-hitter behavior reported for compounds in the assay panel. Also, it can be safely assumed that none of the six protein-protein interaction targets of the PAINS assay panel feature a catalytic cysteine and my view is that any frequent-hitter behavior that is observed in the assay panel for ‘cysteine killers’ is more likely to be due to reaction with (or quenching of) singlet oxygen. It’s also worth pointing out that when compounds are described as exhibiting pan-assay interference (or as frequent hitters) that the relevant nuisance behavior has often been predicted (or assumed) as opposed to being demonstrated with measured data. I would argue that even a ‘maximal PAINS response’ (the compounds is actually observed as a hit in each of the six assays of the PAINS assay panel) would not rule out the use of a compound as a chemical probe.

I have argued on cheminformatic grounds that it’s not appropriate to use PAINS filters for assessment of potential probes but there’s another reason that those seeking to set standards for chemical probes shouldn’t really be endorsing the use of PAINS filters for this purpose. “A conversation on using chemical probes to study protein function in cells and organisms” that was recently published in Nature Communications stresses the importance of Open Science. However, the PAINS structural alerts were trained on proprietary data and using PAINS filters to assess potential chemical probes will ultimately raise questions about the level of commitment to Open Science. I made a very similar point in my comment on the ACS assay interference editorial (Journal of Medicinal Chemistry considers the publication of analyses of proprietary data to be generally unacceptable).

Let’s take a look at “The promise and peril of chemical probes” that was published in Nature Chemical Biology in 2015. The authors state:

“We learned that many of the chemical probes in use today had initially been characterized inadequately and have since been proven to be nonselective or associated with poor characteristics such as the presence of reactive functionality that can interfere with common assay features [3] (Table 2). The continued use of these probes poses a major problem: tens of thousands of publications each year use them to generate research of suspect conclusions, at great cost to the taxpayer and other funders, to scientific careers and to the reliability of the scientific literature.”

Let’s take a look at Table 2 (Examples of widely used low-quality probes) from "The promise and peril of chemical probes". You’ll see “PAINS” in the problems column of Table 2 for two of the six low-quality probes in and this rings a number of alarm bells for me. Specifically, it is asserted that flavones are “often promiscuous and can be pan-assay interfering (PAINS) compounds” and Epigallocatechin-3-gallate is a “promiscuous PAINS compound” which raises a number of questions. Were the (unspecified) flavones and Epigallocatechin-3-gallate actually observed to be promiscuous and if so what activity threshold was used for quantifying promiscuity? Were any of the (unspecified) flavones or Epigallocatechin-3-gallate actually observed to exhibit pan-assay interference? Were affinity and selectivity measurements actually available for the (unspecified) flavones or Epigallocatechin-3-gallate?

I’ll conclude the post by saying something about cheminformatic predictive models. First, to use a cheminformatic predictive model outside its applicability domain is a serious error (and will cast doubts on the expertise of anybody doing so). Second, predictions might (or might not) help you get to a desired end point but you’ll still need measured data to establish that you’ve got to the desired end point or that a compound is unfit for a particular purpose.

Monday, 2 January 2023

Assessment of chemical probes

| next >>

I’ll be taking a look at some of the criteria, specifically structural alerts, by which chemical probes are assessed and here’s the link to the Chemical Probes Portal. Before getting into the post there are a couple of points that I need to stress. First, structural alerts derived from analysis of screening hits (defined as responses that exceed a threshold when assayed at a particular concentration) are not necessarily useful for assessing higher affinity compounds for which concentration responses have been determined. Second, chemical probes will have to satisfy the same set of acceptability criteria whether or not they trigger structural alerts.

I’ll start by commenting on “A conversation on using chemical probes to study protein function in cells and organisms” that was recently published in Nature Communications since it was this article that triggered the blog post. I consider most of the views expressed in the in the article to be sound although I disagree with much of what is stated in the following paragraph:

“The first essential thing that needs to be done is to eliminate the really bad nuisance compounds, which can have problematic behavior—like being non-specifically very reactive with proteins; forming colloidal aggregates that non-specifically adsorb and inactivate proteins; exerting toxicity toward cells, for example through a membrane damaging effect called phospholipidosis; or exhibiting spectral or fluorescence properties that interfere with the biological assay read-out. These undesirable compounds are often referred to as Pan Assay Interference or PAINS compounds, as highlighted by Jonathan Baell [4]. There are software filters or algorithms available that should be used routinely to identify any risk of such chemical promiscuity and simple lab assays should be run to check for the various problematic properties we mentioned. Such compounds should never be considered further or used as chemical probes. They should be excluded from compound libraries. Yet many are sold by commercial vendors as chemical probes and widely used.”

In 2017 a number of ACS journals simultaneously published “The Ecstasy and Agony of Assay Interference Compounds” editorial and I believe that a number of points raised in a comment on this editorial are still relevant to dealing with nuisance compounds. In the comment, I classified bad behavior of screening ‘actives’ as Type 1 (compound hits in the assay but does not affect target function) and Type 2 (compound affects target function through an undesirable mechanism of action). These are two very different problems and each requires very different solutions. Type 1 behavior, which can also be described as interference with read-out, is primarily a problem from the perspective of analysis of high-throughput screening (HTS) output because you don’t know whether observed ‘activity’ is real or not. From the perspective of probe promiscuity, Type 1 behavior is much less of a problem than Type 2 behavior because the ‘activity’ is not real. If you’re trying to decide whether a potential chemical probe is acceptable then genuine activity at 50 nM against another protein is going to hurt a whole lot more than responses of >50% in several assays at a test concentration of 10 μM.

It is asserted in the conversation that there are “software filters or algorithms available that should be used routinely to identify any risk of such chemical promiscuity”. When recommending their use of predictive models for assessment of potential probes, it’s important to be aware of their inherent limitations. Specifically, models derived from analysis of data have applicability domains that are imposed by the data used to build the models. For example, PAINS filters were derived from analysis of the output of six screens that all use the same read-out (AlphaScreen) and this limits the applicability domain of the PAINS filter model to prediction of frequent-hitter behavior in AlphaScreen assays. It is asserted in the conversation that commercial vendors are selling compounds as chemical probes that are unfit for purpose and I strongly recommend that anybody making such assertions should carefully examine the supporting evidence. I would argue that sharing structural features with compounds (for which structures that have not been disclosed) that have been observed to exhibit frequent-hitter behavior when screened at a single concentration (e.g., 10 μM) would not credibly support an assertion that a compound is unsuitable for use as a chemical probe. A specific criticism I would make of the way that structural alerts (especially those derived using proprietary data) are used is that it is sometimes suggested, for example in the ACS assay interference editorial, that HTS hits that don’t trigger structural alerts can be checked less thoroughly than hits that do trigger structural alerts.

The Information Centre of the Chemical Probes Portal includes a “Toxicophores and PAINS Alerts” section in which it is correctly stated that “the presence of the toxicophore or PAINS substructures within the chemical structure of a compound does not necessarily mean that it will be non-specifically active or toxic, or give rise to assay interference”. The “Toxicophores and PAINS Alerts” section might work better as a “Structural Alerts” section and the toxicophores citation appears to be incorrect (reference 10 actually cites an article on toxicity risks associated with excessive lipophilicity). If doing this, I would recommend saying something about the applicability domains of any structural alerts that are highlighted and considering the inclusion of Aggregator Advisor (link to article) and BadApple (here's link to article)

Alternatively, it might be an idea to create separate “Nuisance Compounds” and “Toxicophores” sections because these are very different problems. I would generally recommend the use of the term “nuisance compounds” since PAINS and colloidal aggregators are sometimes treated as separate categories of bad actor, as is the case in the ACS assay interference editorial, and the criteria for labelling compounds as PAINS are ambiguous. It would certainly be useful to include some reviews on assay interference, such as this one, in a “Nuisance Compounds” section. I quite like this article by former colleagues which shows how interference with read-out can be assessed and even corrected for. As for a “Structural Alerts” section, the applicability domains of any predictive models should be indicated so people don’t end up using models that have been trained using hits from screening at 10 μM to assess probes with 20 nM affinity.

This is a good point at which to wrap up and it’s worth stressing that the essence of the criticism of PAINS filters is simply that the rhetoric is not supported by the data. Those like me who are critical of the way that PAINS filters are used are certainly not suggesting that screening hits all smell of roses (back in 1995 I used the Daylight toolkits to build the SMARTS-matching software that was used in the Zeneca ‘de-crapper’ and colleagues also created the Flush software) nor are we denying that assay interference is a serious problem. Although I believe that it is certainly helpful to have scientists who have worked with HTS data share their experiences and opinions with respect to hit quality, I would argue that there are dangers in giving such opinions too much weight (this article may be of interest) especially when data that might be used to justify the opinions are proprietary. Specifically, I would strongly advise against making statements that a compound is unfit for use as a chemical probe unless the assertion is supported by measured data in the public domain for the compound in question.

I’ll leave it there for now. In the next post on chemical probes, I’ll be taking a look at permeability.

Sunday, 14 October 2018

A PAINful itch

I've been meaning to take a look at the Seven Year Itch (SYI) article on PAINS for some time. SYI looks back over the the preceding 7 years of PAINS while presenting a view of future directions. One general comment that I would make of SYI is that it appears to try to counter criticisms of PAINS filters without explicitly acknowledging these criticisms.

This will a long post and strong coffee may be required. Before starting, it must be stressed that I neither deny that assay interference is a significant problem nor do I assert that compounds identified by PAINS filters are benign. The essence of my criticism of much of the PAINS analysis is that the rhetoric is simply not supported by the data. It has always been easy to opine that chemical structures look unwholesome but it has always been rather more difficult to demonstrate that compounds are behaving pathologically in assays. One observation that I would make about modern drug discovery is that fact and opinion often become entangled to the extent that those who express (and seek to influence) opinions are no longer capable of distinguishing what they know from what they believe.

I've included some photos to break up the text a bit and these are from a 2016 visit to the north of Vietnam. I'll start with this one taken from the western shore of Hoan Kiem Lake the night after the supermoon.

Hanoi moon

I found SYI to be something of a propaganda piece with all the coherence of a six-hour Fidel Castro harangue. As is typical for articles in the PAINS literature, SYI is heavy in speculation and opinion but is considerably lighter in facts and measured data. It wastes little time in letting readers know how many times the original PAINS article was cited. One criticism that I have made about the original PAINS article (that also applies to SYI and the articles in between) is that the article neither defines the term PAINS (other than to expand the acronym) nor does it provide objective criteria by which a compound can be shown experimentally to be (or not to be) a PAINS (or is that a PAIN). An 'unofficial' definition for the term PAINS has actually been published and I think that it's pretty good:

"PAINS, or pan-assay interference compounds, are compounds that have been observed to show activity in multiple types of assays by interfering with the assay readout rather than through specific compound/target interactions."

While PAINS purists might denounce the creators of the unofficial PAINS definition for heresy and unspecified doctrinal errors, I would argue that the unofficial definition is more useful than the official definition (PAINS are pan-assay interference compounds). I would also point out that some of those who introduced the unofficial definition actually use experiments to study assay interference when much of the official PAINSology (or should that be PAINSomics) consists of speculation about the causes of frequent-hitter behavior. One question that I shall put to you, the reader, is how often, when reading an article on PAINS, do you see real examples of experimental studies that have clearly demonstrated that specific compounds exhibit pan-assay interference?

Restored bunker and barbed wire at Strongpoint Béatrice which was the first to fall to the Viet Minh.

Although the reception of PAINS filters has generally been positive, JCIM has published two articles (the first by an Associate Editor of that journal and the second by me) that examine the PAINS filters critically from a cheminformatic perspective. The basis of the criticism is that the PAINS filters are predictors of frequent hitter behavior for assays using an AlphaScreen readout and they have been developed using proprietary data. It's a quite a leap from frequent-hitter behavior when tested at single concentrations in a panel of six AlphaScreen assays to pan-assay interference. In the language of cheminfomatics, we can state that the PAINS filters have been extrapolated out of a narrow applicability domain and they have been reported (ref and ref) to be less predictive of frequent-hitter behavior in these situations. One point that I specifically made was that a panel of six assays all using the same readout is a suboptimal design of an experiment to detect and quantify pan-assay interference.

In my article, bad behavior in assays was classified as Type 1 ( assay result gives an incorrect indication of the extent to which the compound affects the function of the target) or Type 2 (compounds affect target function by an undesirable mechanism of action). I used these rather bland labels because I didn't want to become ensnared in a Dien Bien Phu of nomenclature and it must be stressed that there is absolutely no suggestion that other people use these labels. My own preference would actually be to only use the term interference for Type 1 bad behavior and it's worth remembering that Type 1 bad behavior can also lead to false negatives.

The distinction between Type 1 and and Type 2 behaviors is an important and useful one to make from the perspective of drug discovery scientists who are making decisions as to which screening hits to take forward. Type 1 behavior is undesirable because it means that you can't believe the screening result for hits but, provided that you can find an assay (e.g. label-free measurement of affinity) that is not interfered with, Type 1 behavior is a manageable, although irksome, problem. Running a second assay that uses an orthogonal readout may shed light on whether Type 1 behavior is an issue although, in some cases, it may be possible to assess, and even correct for, interference without running the orthogonal second assay. Type 2 behavior is a much more serious problem and a compound that exhibits Type 2 behavior needs to be put out of its misery as swiftly and mercifully as possible. The challenge presented by Type 2 behavior is that you need to establish the mechanism of action simply to determine whether or not it is desirable. Running a second assay with an orthogonal readout is unlikely to provide useful information since the effect on target function is real.

Barbed wire at Strongpoint Béatrice. I'm guessing that it was not far from here that, on the night of 13th/14th March, 1954, Captain Riès would have made the final transmission: "It's all over - the Viets are here. Fire on my position. Out."

Most (all?) of the PAINSology before SYI failed to make any distinction between Type 1 and Type 2 bad behavior. SYI states "There does not seem to be an industry-accepted nomenclature or ontology of anomalous binding behavior" and makes some suggestions as to how this state of affairs might be rectified. SYI recommends that "Actives" be first classified as "target modulators" or "readout modulators". The "target modulators" are all considered to be "true positives" and these are further classified as "true hits" or "false hits". All the "readout modulators" are labelled as "false positives". Unsurprisingly, the authors recommend that all the "false hits" and "false positives" be labelled as pan-assay interference compounds regardless of whether the compounds in question actually exhibit pan-assay interference. In general, I would advise against drawing a distinction between the terms "hit" and "positive" in the context of screening but, if you chose to do so, then you do really do need to define the terms much more precisely than the authors have done.

I think the term "readout modulator" is reasonable and is equivalent to my definition of Type 1 behavior (assay result gives an incorrect indication of the extent to which the compound affects the function of the target). However, I strongly disagree with the classification of compounds showing "non-specific interaction with target leading to active readout" as readout modulators since I'd regard any interaction with the target that affects its function to be modulation. My understanding is that the effects of colloidal aggregators on protein function are real (although not exploitable) and that it is often possible to observe reproducible concentration responses. My advice to the authors is that, if you're going to appropriate colloidal aggregators as PAINS, then you might at least put them in the right category.

While the term "target modulator" is also reasonable, it might not be a such great idea to use it in connection with assay interference since it's also quite a good description of a drug. Consider the possibility of homeopaths and anti-vaxxers denouncing the pharmaceutical industry for poisoning people with target modulators. However, I disagree with the use of the term "false hit" since the modulation of the target is real even when the mechanism of action is not exploitable. There is also a danger of confusing the "false hits" with the "false positives" and SYI is not exactly clear about the distinction between a "hit" and a "positive". In screening both terms tend to be used to specify results for which the readout exceeds a threshold value.

The defensive positions on one of the hills of Strongpoint Béatrice have not been restored. Although the trenches have filled in with time, they are not always as shallow as they appear to be in this photo (as I discovered when I stepped off the path).

It's now time to examine what SYI has to to say and singlet oxygen is as good a place as any to start from. One criticism of PAINS filters that I have made, both in my article and the Molecular Design blog, is that some of the frequent-hitter behavior in the PAINS assay panel may be due to quenching or scavenging of singlet oxygen which is an essential component of the the AlphaScreen readout. SYI states:

"However, while many PAINS classes contain some member compounds that registered as hits in all the assays analyzed and that therefore could be AlphaScreen-specific signal interference compounds, most compounds in such classes signal in only a portion of assays. For these, chemical reactivity that is only induced in some assays is a plausible mechanism for platform-independent assay interference."

The authors seem to be interpreting the observation that a compound only hits in a portion of assays as evidence for platform-independent assay interference. This is actually a very naive argument for a number of reasons. First, compounds do not all appear to have been assayed at the same concentration in the original PAINS assay panel and there may be other sources of variation that were not disclosed. Second, different readout thresholds may have been used for the assays in the panel and noise in the readout introduces a probabilistic element to whether or not the signal for a compound exceeds the threshold. Last, but definitely not least, the molecular structure of a compound does influence the efficiency with which it quenches or scavenges singlet oxygen. A recent study observed that PAINS "alerts appear to encode primarily AlphaScreen promiscuous molecules".

If you read enough PAINS literature, you'll invariably come across sweeping generalizations made about PAINS. For example, it has been claimed that "Most PAINS function as reactive chemicals rather than discriminating drugs." SYI follows this pattern and asserts:

"Another comment we frequently encounter and very relevant to this journal is that PAINS may not be appropriate for drug development but may still comprise useful tool compounds. This is not so, as tool compounds need to be much more pharmacologically precise in order that the biological responses they invoke can be unambiguously interpreted."

While it is encouraging that the authors have finally realized the significance of the distinction between readout modulators and target modulators, they don't seem to be fully aware of the implications of making this distinction. Specifically, one can no longer make the sweeping generalizations about PAINS that are common in PAINS literature. Consider a hypothetical compound that is an efficient quencher of singlet oxygen and that has shown up as a hit in all six AlphaScreen assays of the original PAINS assay panel. While many would consider this compound to be a PAINS (or PAIN), I would strongly challenge a claim that observation of frequent-hitter behavior in this assay panel would be sufficient to rule out the use of the compound as a tool.

SYI notes that PAINS are recognized by other independently developed promiscuity filters.

"The corroboration of PAINS classes by such independent efforts provides strong support for the structural filters and subsequent recognition and awareness of poorly performing compound classes in the literature. It is instructive therefore to introduce two more recent and fully statistically validated frequent-hitter analytical methods that are assay platform-independent. The first was reported in 2014 by AstraZeneca(16) and the second in 2016 by academic researchers and called Badapple.(27)"

I don't think it is particularly surprising (or significant) that some of the PAINS classes are recognized as frequent-hitters by other models for frequent-hitter behavior. What is not clear is how many of the PAINS classes are recognized by the other frequent-hitter models or how 'strong' the recognition is. I would challenge the description of the AstraZeneca frequent-hitter model as "fully statistically validated" since validation was performed using proprietary data. I made a similar criticism of the original PAINS study and would suggest that the authors take a look at what this JCIM editorial has to say about the use of proprietary data in modeling studies.

The French named this place Eliane and it was quieter when I visited than it would have been on 6th May, 1954 when the Viet Minh detonated a large mine beneath the French positions. It has been said that the alphabetically-ordered (Anne-Marie to Isabelle) strongpoints at Dien Bien Phu were named for the mistresses of the commander, Colonel (later General) Christian de Castries although this is unlikely.

SYI summarizes as follows:

"In summary, we have previously discussed a variety of issues key to interpretation of PAINS filter outputs, ranging from HTS library design and screening concentration, relevance of PAINS-bearing FDA-approved drugs, issues in SMARTS to SLN conversion, the reality of nonfrequent hitter PAINS, as well as PAINS and non-PAINS that are respectively not recognized or recognized in the PAINS filters as originally published. However, nowhere has a discussion around these key principles been summarized in one article, and that is the point of the current article. Had this been the case, we believe some recent contributions to the literature would have been more thoughtfully directed. (21,32)"

I must confess that reference to the reality of nonfrequent hitter pan assay interference compounds would normally prompt me to advise authors to stay off the peyote until the manuscript has been safely submitted. However, the bigger problem embedded in the somewhat Rumsfeldesque first sentence is that you need objective and unambiguous criteria by which compounds can be determined to be PAINS or non-PAINS before you can talk about "key principles". You also need to acknowledge that interference with readout and and undesirable mechanisms of action are entirely different problems requiring entirely different solutions.

I noted that recent contributions to the literature from me and from a JCIM Associate Editor (who might know a bit more about cheminformatics than the authors) were criticized for being insufficiently thoughtful. To be criticized in this manner is, as the late, great Denis Healey might have observed, "like being savaged by a dead sheep". Despite what the authors believe, I can confirm that my contribution to the literature would have been very similar even if SYI had been published beforehand. Nevertheless, I would suggest to the authors that dismissing the feedback from a JCIM Associate Editor as if he were a disobedient schoolboy might not have been such a smart move. For example, it could get the JMC editors wondering a bit more about exactly what they'd got themselves into when they decided to endorse a frequent-hitter model as a predictor of pan-assay interference. The endorsement of a predictive model by a premier scientific journal represents a huge benefit to the creators of the model but the flip side is that it also represents a huge risk to the journal.

So that's all that I want to say about PAINS and it's a good point to wrap things up so that I can return to Vietnam for the remainder of the post.

I'm pretty sure that neither General Giap nor General de Castries visited the summit of Fansipan which at 3143 meters is the highest point in Vietnam (I wouldn't have either had a cable car had not been installed a few months before I visited). It's a great place to enjoy the sunset.

Back in Hanoi, I attempted to pay my respects to Uncle Ho, as I've done on two previous visits to this city, but timing was not great (they were doing the annual formaldehyde change). Uncle Ho is in much better shape than Chairman Mao who is actually seven years 'younger' and this is a consequence of having been embalmed by the Russians (the acknowledged experts in this field). Chairman Mao had the misfortune to expire when Sino-Soviet relations were particularly frosty and his pickling was left to some of his less expert fellow citizens. It is also said that the Russian embalming team arrived in Hanoi before Uncle Ho had actually expired...

Catching up with Uncle Ho