Saturday, 14 February 2009

Molecular complexity and extent of substitution

Having introduced extent of substitution as a measure of molecular complexity in an earlier post, I was particularly interested by Dan's posts on AT7519 and AT9283. In each case, the screening hit used as a starting point for further elaboration lacked acyclic substituents.

You might wonder how you could impose this substructural requirement when selecting compounds for screening. This is actually very easy using SMARTS notation (Daylight SMARTS tutorial | OpenEye SMARTS pattern matching | SMARTS in Wikipedia). The requirement that terminal non-hydrogen atoms be absent can be specified as:

[A;D1] 0

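A denotes an aliphatic atom and D1 restricts the match to atoms with exactly one explicit connection. When hydrogen atoms are implicit, this picks out every terminal non-hydrogen atom (aromatic atoms are always ring members, so they are never terminal), and the 0 requires that no such atoms be present in acceptable molecules. A requirement like this can be combined with a requirement for 10 to 20 non-hydrogen atoms: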

* 10-20
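To make the meaning of these two rules concrete, here is a minimal Python sketch. It is not a SMARTS matcher, and in practice a cheminformatics toolkit would do the pattern matching. It assumes each molecule is already available as a heavy-atom graph (a dict mapping an atom index to the indices of its bonded non-hydrogen neighbours), and the names add_bond, count_terminal_atoms and passes_filter are hypothetical helpers invented for this illustration.

```python
from typing import Dict, List

# Assumed representation: atom index -> indices of bonded non-hydrogen atoms.
# Hydrogens are implicit, so they do not appear in the graph at all.
HeavyAtomGraph = Dict[int, List[int]]


def add_bond(graph: HeavyAtomGraph, a: int, b: int) -> None:
    """Record a bond between heavy atoms a and b (hypothetical helper)."""
    graph.setdefault(a, []).append(b)
    graph.setdefault(b, []).append(a)


def count_terminal_atoms(graph: HeavyAtomGraph) -> int:
    """Count atoms bonded to exactly one other heavy atom (the D1 condition)."""
    return sum(1 for neighbours in graph.values() if len(neighbours) == 1)


def passes_filter(graph: HeavyAtomGraph, min_atoms: int = 10, max_atoms: int = 20) -> bool:
    """Apply both rules: no terminal atoms, and 10 to 20 heavy atoms."""
    has_no_terminal_atoms = count_terminal_atoms(graph) == 0
    size_in_range = min_atoms <= len(graph) <= max_atoms
    return has_no_terminal_atoms and size_in_range


# Naphthalene: a 10-atom perimeter plus the ring-fusion bond between atoms 0 and 5.
naphthalene: HeavyAtomGraph = {}
for i in range(10):
    add_bond(naphthalene, i, (i + 1) % 10)
add_bond(naphthalene, 0, 5)

# 1-Methylnaphthalene: the same skeleton with one acyclic substituent (atom 10).
methylnaphthalene: HeavyAtomGraph = {k: list(v) for k, v in naphthalene.items()}
add_bond(methylnaphthalene, 1, 10)

# Benzene: no substituents, but only 6 heavy atoms.
benzene: HeavyAtomGraph = {}
for i in range(6):
    add_bond(benzene, i, (i + 1) % 6)

print(passes_filter(naphthalene))        # True
print(passes_filter(methylnaphthalene))  # False: the methyl carbon is terminal
print(passes_filter(benzene))            # False: fewer than 10 heavy atoms
```

Naphthalene passes both rules, the methyl group on 1-methylnaphthalene trips the terminal-atom rule, and benzene fails only on size.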

I will discuss the use of SMARTS for compound selection in more detail in connection with the design of screening libraries, so think of this as a taster. I've also tried to keep things simple by assuming that hydrogen atoms are implicit, which means that they are treated as a property of the atoms to which they are bonded rather than as atoms in their own right.

1 comment:

Dan Erlanson said...

This is an interesting observation, and it got me wondering about its generality. The one other “fragment to clinic” featured on Practical Fragments (besides AT7519 and AT9283) started with a fragment that had two sites of substitution off an indole core. But this trifecta is a small data set, so I dug up my 2006 Current Opinion in Biotechnology review and was somewhat shocked at the results: of the 40+ fragments listed, only a single one lacked acyclic substituents! A more recent review from Astex in J. Med. Chem. (Congreve et al., 2008, 3661) gave similar results: of the roughly 30 fragments highlighted, only one lacked any acyclic substituents.

I’m not sure what this means, but I suspect it has to do with the reluctance of many chemists to pursue fragments that are viewed as too simple, as discussed in earlier posts. Both reviews rely on published accounts of fragments that were, in most cases, advanced to more potent compounds. Fragments with equally attractive ligand efficiency but lower complexity that were never pursued, because of a dislike of unsubstituted starting points, would therefore have been unfairly excluded.

What we don’t know is how often unsubstituted fragments show up as hits relative to their abundance in fragment libraries. Unfortunately this information is not publicly available, but I bet many companies have enough proprietary data to answer this question. Any volunteers?