Tuesday, 3 November 2009

Screening libraries: Sampling Chemical Space


I am currently in Rapa Nui (aka Easter Island) and it seemed fitting to continue the series on compound library design from here since the first two posts have been from less commonly visited places like Asuncion and Tierra del Fuego. In the previous post I discussed 2D molecular similarity and showed how this can be used to define diversity and coverage, two important compound library characteristics. In general, compounds in a library need to be mutually diverse in order to provide good coverage although high diversity does not guarantee optimal coverage.

In this post, I’ll take you through an approach to library design called ‘Core and Layer’ (CaL). Although we used this to select compounds for generic fragment libraries and more specialised NMR screening libraries, the method is quite general and I have used it to design a compound library for black box cell screening and to select compounds to complement high throughput screens. The software tools (Flush and BigPicker) used to apply CaL were created at Zeneca by Dave Cosgrove and are described in our article in some detail. Although you might think that the tools were developed in order to apply the CaL method, things actually happened the other way round and it was the availability of the software that led to CaL being adopted as an approach to library design.

Figure 1 shows a schematic view of CaL. The core consists of the compounds currently in the library at any point of the design process and a layer is a set of compounds that have been selected to be diverse with respect to the core. Once a layer has been selected, it is added to the core and the combined set of compounds becomes the new core. The process of selecting layers goes on until you’re either happy with the library or you run out of patience.
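To make the core and layer step concrete, here is a minimal sketch in Python. It is not Flush or BigPicker, and it is not how those tools work internally. It assumes that each compound is represented by a set of integer feature identifiers (a stand-in for a 2D fingerprint), that similarity is measured with the Tanimoto coefficient, and that a candidate is 'diverse' if its similarity to everything already chosen is at or below a cutoff. The function names, the `max_sim` cutoff and the greedy first-come selection are all illustrative assumptions, as is the requirement that compounds within a layer also be diverse with respect to each other.

```python
from typing import Dict, FrozenSet

# Assumption: a compound is a name mapped to a set of feature ids,
# standing in for a real 2D fingerprint.
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto coefficient: shared features divided by total distinct features."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def select_layer(
    core: Dict[str, Fingerprint],
    candidates: Dict[str, Fingerprint],
    max_sim: float,
) -> Dict[str, Fingerprint]:
    """Greedily pick candidates that are diverse with respect to the core.

    A candidate is accepted only if its similarity to every core compound,
    and to every compound already accepted into this layer, is <= max_sim.
    """
    reference = list(core.values())
    layer: Dict[str, Fingerprint] = {}
    for name, fp in candidates.items():
        if name in core:
            continue  # already in the library
        if all(tanimoto(fp, ref) <= max_sim for ref in reference):
            layer[name] = fp
            reference.append(fp)
    return layer


def add_layer(
    core: Dict[str, Fingerprint],
    candidates: Dict[str, Fingerprint],
    max_sim: float,
) -> Dict[str, Fingerprint]:
    """Select a layer and merge it into the core to give the new core."""
    return {**core, **select_layer(core, candidates, max_sim)}


# Hypothetical toy data, purely for illustration.
core = {"A": frozenset({1, 2, 3, 4})}
candidates = {
    "B": frozenset({1, 2, 3, 5}),     # too close to A
    "C": frozenset({7, 8, 9}),        # far from A
    "D": frozenset({7, 8, 9, 10}),    # far from A but too close to C
    "E": frozenset({20, 21}),         # far from everything
}
new_core = add_layer(core, candidates, max_sim=0.5)
```

With this toy data, B is rejected because it is too similar to A, and D is rejected because C was picked first. Calling `add_layer` again with the new core and a fresh set of candidates would give the next layer.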



You’re probably thinking that this is a very tedious and time-consuming way to build a compound library and might ask whether it would be better to select a maximally diverse set of compounds in a single step. However, there are advantages in building up a library in this manner. In library design, not all compounds are equal and CaL allows you to bias compound selection in a highly controlled manner. I’ll discuss fragment selection criteria in some detail in future posts in this series so please just assume for now that there are some fragments that you would rather have in your library than others. The initial core consists of a sampling of your favourite fragments and, as you add layers, the compounds in them become progressively less attractive. Another feature of CaL is that it provides a solution to the problem of selected compounds proving to be unavailable, as can be the case when trying to source relatively large samples from commercial suppliers.
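One way to picture the biasing is to rank the candidate pools by how much you like them and let each pool supply one layer in turn. The sketch below is my own illustration under stated assumptions rather than the procedure from the article: compounds are sets of integer feature identifiers, similarity is the Tanimoto coefficient, and `tiers`, `max_sim` and `build_library` are hypothetical names. The first tier is sampled to form the initial core and every later tier can only contribute compounds that are dissimilar to everything already selected.

```python
from typing import Dict, FrozenSet, List


def tanimoto(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Tanimoto coefficient between two sets of feature ids."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def build_library(
    tiers: List[Dict[str, FrozenSet[int]]],
    max_sim: float,
) -> Dict[str, int]:
    """Build a library from pools ordered from most to least preferred.

    Returns a mapping of compound name to the index of the layer in which
    it was selected (0 is the initial core).
    """
    library: Dict[str, FrozenSet[int]] = {}
    origin: Dict[str, int] = {}
    for layer_index, pool in enumerate(tiers):
        for name, fp in pool.items():
            if name in library:
                continue
            # Accept only compounds dissimilar to everything chosen so far.
            if all(tanimoto(fp, other) <= max_sim for other in library.values()):
                library[name] = fp
                origin[name] = layer_index
    return origin


# Hypothetical pools: favourites first, least attractive last.
tiers = [
    {"fav1": frozenset({1, 2, 3}), "fav2": frozenset({1, 2, 3, 4})},
    {"ok1": frozenset({1, 2, 9}), "ok2": frozenset({5, 6})},
    {"meh1": frozenset({5, 6, 7}), "meh2": frozenset({11, 12})},
]
picked = build_library(tiers, max_sim=0.4)
```

In this toy run, a less attractive compound only gets into the library if it covers a region that the more attractive compounds have left empty, which is the sense in which the selection is biased.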

I think this is a good place to stop as it’s dinner time in Rapa Nui. CaL is an approach to biased sampling of chemical space but it doesn’t tell us which regions of chemical space should be sampled preferentially. In the next posts of this series I’ll take a look at what makes one fragment better than another. On the travel front, I fly into Auckland in a couple of weeks’ time for a month and a half in New Zealand and expect to be around Melbourne for the first four months of the New Year. Feel free to get in touch if you’ve got fragment stuff that you’d like to discuss.

Literature cited

Blomberg et al., Design of compound libraries for fragment screening. JCAMD, 2009, 23, 513-525.