Molecular Design: June 2009


I’m looking forward to posting on design of compound libraries for fragment screening. However, before launching that series I wanted to comment on an article entitled ‘The influence of lead discovery strategies on the properties of drug candidates’ that has already been reviewed by Dan.

My first comment is that the analysis uses a database of 335 hit-lead pairs from HTS. I think it’s quite difficult to define meaningful hit-lead pairs for HTS. I first got into analysis of HTS output a decade and a half ago and from the start we looked for groups of structurally similar compounds in the actives. The larger the group of similar compounds, the greater the interest, since observation of these active clusters increases confidence that activity is real. When a lead comes out of a cluster of actives, singling out one compound as ‘the hit’ is somewhat arbitrary, so I’m really not sure that hit-lead pairs can be defined meaningfully for HTS-derived leads, especially when journal articles are the primary information source.

That said, the main reason for this post is to take a closer look at ligand efficiency defined in terms of lipophilicity. The most obvious way to define a ligand efficiency metric in terms of lipophilicity is simply to subtract logP from pIC50. There is the issue of whether one should use logP or logD as the measure of lipophilicity but, particularly when using calculated partition coefficients, I think it’s best to use logP. Some of this was discussed in the AstraZeneca fragment based lead generation review from a couple of years ago, although I’m sure that the idea of subtracting logP from pIC50 was not exactly new then. The difference (pIC50 – ClogP) has since become known as ligand lipophilicity efficiency (LLE).
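As a minimal sketch of the arithmetic, LLE can be computed as below. The function names and the example values (a 100 nM IC50 and a logP of 3) are made up for illustration and are not taken from the article.

```python
import math


def p_value(concentration_molar: float) -> float:
    """Return -log10 of a concentration given in mol/L (e.g. IC50 -> pIC50)."""
    if concentration_molar <= 0:
        raise ValueError("concentration must be positive")
    return -math.log10(concentration_molar)


def lle(pic50: float, logp: float) -> float:
    """Ligand lipophilicity efficiency: pIC50 minus logP (or ClogP)."""
    return pic50 - logp


# Hypothetical compound: IC50 = 100 nM, logP = 3
example_pic50 = p_value(1e-7)
example_lle = lle(example_pic50, 3.0)
print(f"pIC50 = {example_pic50:.2f}, LLE = {example_lle:.2f}")
```

Whether the logP passed in is measured or calculated is left to the caller, so it is worth recording which was used alongside the result.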

Unlike molecular size measures of ligand efficiency, (pKd – logP) has a firm, although somewhat obscure, thermodynamic basis (at least for neutral molecules). The product (Kd x P) is an equilibrium constant in its own right. Just as Kd is a measure of the relative stabilities of bound and aqueous ligand, (Kd x P) is a measure of the relative stabilities of bound ligand (in an aqueous medium) and ligand at its standard state in octanol. The product (Kd x P) and its negative logarithm (pKd – logP) both quantify the extent to which the ligand would ‘prefer’ to be bound to protein or solvated in octanol. It’s worth noting at this point that octanol is quite polar and alkane/water partition coefficients would probably represent a better measure of lipophilicity if they were more accessible. I’ll probably discuss this in more detail at some point in the future but for now you might want to take a look at a recent article on prediction of alkane/water partition coefficients because it reviews a lot of the earlier work.
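The identity between the product (Kd x P) and the difference (pKd – logP) is easy to check numerically. This is only a sketch, and the Kd and logP values are arbitrary illustrative numbers.

```python
import math


def neg_log_product(kd_molar: float, partition_coefficient: float) -> float:
    """Negative log10 of the product (Kd x P), with Kd in mol/L."""
    return -math.log10(kd_molar * partition_coefficient)


def pkd_minus_logp(kd_molar: float, partition_coefficient: float) -> float:
    """The same quantity written as pKd - logP."""
    pkd = -math.log10(kd_molar)
    logp = math.log10(partition_coefficient)
    return pkd - logp


# Arbitrary illustrative values: Kd = 1 micromolar, P = 100 (logP = 2)
kd, p = 1e-6, 100.0
print(neg_log_product(kd, p), pkd_minus_logp(kd, p))
```

Both expressions come out at about 4 for these inputs, which is what the rule log(ab) = log(a) + log(b) requires.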

The authors of the featured article assert that LLE does not include ligand efficiency. I don’t completely agree with that statement because one could say that LLE is a measure of how efficiently a ligand exploits its lipophilicity to bind to the target protein. However, it is clear that no explicit measure of molecular size is used in the definition of LLE. The authors propose dividing logP by ligand efficiency (LE), call this function LELP, and suggest that it should be between -10 and 10 (no units specified) for acceptable leads.

I must confess that I just don’t get it. Using this metric, a compound with logP = 1 and LE = 0.1 (in whatever units they’re using) is equivalent to one of logP = 3 and LE = 0.3. Also a compound with logP = 0 becomes an acceptable lead even with a millimolar Kd. I’m not convinced that LELP gets the balance right between penalising size and lipophilicity.
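The two cases above can be reproduced with a few lines of code. This is a sketch only: LE is treated as a bare number because the units are not specified, the inclusive range check is my assumption about how ‘between -10 and 10’ should be read, and the LE of 0.05 in the last case is a made-up value.

```python
import math


def lelp(logp: float, le: float) -> float:
    """LELP as described in the text: logP divided by ligand efficiency (LE)."""
    if le == 0:
        raise ValueError("LE must be non-zero")
    return logp / le


def within_suggested_range(value: float, low: float = -10.0, high: float = 10.0) -> bool:
    """Check the suggested LELP window; inclusive bounds are an assumption."""
    return low <= value <= high


# The two compounds from the text get the same score.
compound_a = lelp(1.0, 0.1)
compound_b = lelp(3.0, 0.3)
same_score = math.isclose(compound_a, compound_b)

# With logP = 0 the score is zero whatever the (non-zero) LE is,
# so even a very weak binder falls inside the window.
zero_logp = lelp(0.0, 0.05)  # LE value is hypothetical

print(compound_a, compound_b, same_score, zero_logp, within_suggested_range(zero_logp))
```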

I’m not a great fan of ligand efficiency metrics although they have a place for comparing hits from the same assay and I’ve certainly used them for that purpose. My favored approach to combining size and lipophilicity into a single efficiency measure would be to use one of the following two functions depending on which measure of affinity/potency is being used:

where HA is the number of non-hydrogen (heavy) atoms and measured or calculated values of logP can be used depending on which are available. It’s worth pointing out that the lipophilicity that gets measured is actually logD (typically at pH = 7.4) rather than logP. For neutral compounds the two are the same (unless you get self-association in one of the phases as might happen with a lactam in hydrocarbon) but for compounds that are ionised you’ll either need to know the pKa or determine logD as a function of pH in order to get logP.
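For the simplest ionisable case, logP can be backed out of a measured logD if the pKa is known. The sketch below assumes a monoprotic acid or base and that only the neutral form partitions into the organic phase; neither assumption is stated in the post, and the function name and example values are hypothetical.

```python
import math


def logp_from_logd(logd: float, pka: float, ph: float = 7.4, kind: str = "acid") -> float:
    """Estimate logP from logD for a monoprotic acid or base.

    Assumes only the neutral form partitions, so that
    logD = logP - log10(1 + 10**(pH - pKa)) for an acid and
    logD = logP - log10(1 + 10**(pKa - pH)) for a base.
    """
    if kind == "acid":
        exponent = ph - pka
    elif kind == "base":
        exponent = pka - ph
    else:
        raise ValueError("kind must be 'acid' or 'base'")
    return logd + math.log10(1.0 + 10.0 ** exponent)


# Hypothetical carboxylic acid: logD(7.4) = 0.0, pKa = 4.4
print(round(logp_from_logd(0.0, 4.4, ph=7.4, kind="acid"), 2))
```

For this hypothetical acid the correction is about three log units, which shows how much the choice between logP and logD can matter for ionised compounds.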

Something this article flagged up for me is the difficulty in finding concise but meaningful names for ligand efficiency metrics. I think of efficiency as something associated with a process or action rather than an object and therefore prefer to talk about ‘binding efficiency’. The other benefit of doing this is that it reminds us that the efficiency is defined for the combination of ligand and assay system (protein, co-factors, substrates, buffer components etc) and not the ligand in isolation.

To get us thinking about how we might improve matters, I’ll make some suggestions. I think we should be using pIC50 or pKd to quantify binding since units of energy never seem to get quoted when binding free energies are used to define what I’ll now refer to as binding efficiency. Also we tend to think more in terms of pIC50 than energy when we talk about potency and binding. Here are three equations that we can use to define binding efficiency with the subscripts indicating the property or properties used to scale potency. I suggest calling each quantity binding efficiency by the appropriate property. For example equation 3 defines ‘binding efficiency by size and lipophilicity’.

Common measures of size include number of heavy atoms, molecular weight (molar mass), surface area and volume, and equations 1 and 3 can be used with any of these provided that the appropriate units (heavy atoms, g/mol, Da, square Angstrom, cubic Angstrom) are quoted. If the unit of size is included, it is possible to tell which measure of size has been used to scale potency. As an aside, note how potency, which has units of concentration, is converted into a dimensionless number by dividing by M (mol/litre). I’m a bit less happy with using subscript-L to denote lipophilicity because it doesn’t allow us to distinguish binding efficiencies derived using logP and logD. Nevertheless, this represents a start although it’s likely that other folk will have ideas that we can use to refine the definitions.
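The three binding efficiencies can be sketched in code. The exact forms used here are an assumption inferred from the surrounding description (potency divided by M before taking the logarithm, scaling by size, by lipophilicity, or by both); the function names and the example compound are hypothetical.

```python
import math


def p_potency(potency_molar: float) -> float:
    """-log10(potency / M): dividing by 1 mol/L makes the argument dimensionless."""
    unit_molar = 1.0  # M (mol/litre)
    return -math.log10(potency_molar / unit_molar)


def be_by_size(p: float, size: float) -> float:
    """Assumed form of equation 1: potency scaled by size (quote the size unit)."""
    return p / size


def be_by_lipophilicity(p: float, logp: float) -> float:
    """Assumed form of equation 2: potency offset by lipophilicity."""
    return p - logp


def be_by_size_and_lipophilicity(p: float, logp: float, size: float) -> float:
    """Assumed form of equation 3: lipophilicity-corrected potency scaled by size."""
    return (p - logp) / size


# Hypothetical compound: IC50 = 1 micromolar, logP = 2, 20 heavy atoms
p = p_potency(1e-6)
print(be_by_size(p, 20), "per heavy atom")
print(be_by_lipophilicity(p, 2.0))
print(be_by_size_and_lipophilicity(p, 2.0, 20), "per heavy atom")
```

Passing molecular weight or surface area as `size` instead of heavy atom count changes only the unit that should be quoted with the result.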

Literature cited

Keserű & Makara, The influence of lead discovery strategies on the properties of drug candidates. Nature Rev. Drug Discov. 2009, 8, 203-212

Albert et al, An Integrated Approach to Fragment Based Lead Generation: Philosophy, Strategy and Case Studies from AstraZeneca's Drug Discovery Programs. Curr. Top. Med. Chem. 2007, 7, 1600-1629

Leeson & Springthorpe, The influence of drug-like concepts on decision-making in medicinal chemistry. Nature Rev. Drug Discov. 2007, 6, 881-890

Toulmin et al, Toward prediction of alkane/water partition coefficients. J. Med. Chem. 2008, 51, 3720-3730


Sunday, 7 June 2009

Scaling potency by lipophilicity and molecular size