It has been a while since I did a proper blog post. Some of you may have encountered a Perspective article
entitled, ‘Ligand efficiency metrics considered harmful’ and I’ll post on it
because the journal has made the article open access until Sept 14. The Perspective has already been reviewed by
Practical Fragments and highlighted in an F1000 review. Some of the points discussed in the article
were actually raised last year in Efficient Voodoo Thermodynamics, Wrong Kind of Free Energy and ClogPalk : a method for predicting alkane/water partition coefficients. There has been recent
debate about the validity of ligand efficiency (LE) which is summarized in a blog post (make sure to look at the comments as well). However, I believe that both sides missed the
essential point which is that the choice (conventionally 1 M) of concentration
that is used to define the standard state is entirely arbitrary.
In this blog post, I’ll focus on what I call ‘scaled’ ligand efficiency metrics
(LEMs). Scaling means that a measure of activity or affinity is divided by a
risk factor such as molecular weight (MW) or heavy (i.e. non-hydrogen) atoms
(HA). For example, LE can be calculated
by dividing the standard free energy of binding (ΔG°) by HA:
LE = (−1/HA) × RT ln(Kd/C°)
Now you’ll notice that I’ve written ΔG° in terms of the
dissociation constant (Kd) and the standard concentration (C°) and I articulated why
this is important early last year. The
logarithm function is only defined for numbers (i.e. dimensionless quantities)
and, inconveniently, Kd has units of concentration. This means that LE is a function of both Kd
and C° and I’m going to first
redefine LE a bit to make the problem a bit easier to see. I'll use IC50 instead of Kd to define a new LEM which
I’m not going to name (in the LEM literature names and definitions keep changing,
so it is much better simply to state the relevant formula for whatever is actually used):
(−1/NHA) × log10(IC50/Cref)
I have a number of reasons for defining the metric in this
manner and the most important of these is that the new metric is
dimensionless. Note how I use the number
of heavy atoms (NHA) rather than HA to define the metric. Just in case you’re wondering what the
difference is, HA for ethanol is 3 heavy atoms while NHA is 3
(lacking units of heavy atoms). [In the original post I'd counted 2 for this molecule but if you check the comments you'll see that this howler was picked up by an alert reader and it has now been corrected.]
Apologies for being pedantic (some might even call me a units Nazi) but
if people had paid more attention to units, we’d never have got into this sorry
mess in the first place. The other point
to note is that I’ve not converted IC50 to units of energy, mainly
for the reason that it is incorrect to do so because an IC50 is not
a thermodynamic quantity. However, there are other reasons for not introducing
units of energy. Often units of energy go AWOL when values of LE are presented
and there is no way of knowing whether the original units were kcal/mol or
kJ/mol. Even when energy units are stated explicitly, this might be masking a
situation in which affinity and potency measurements have been combined in a
single analysis. Of course, one can be cynical and
suggest that the main reason for introducing energy units is to make
biochemical measurements appear to be more physical.
So let’s get back to that new metric and you’ll have noticed
a quantity in the defining equation that I’ve called Cref (reference
concentration). This is similar to the
standard concentration in that the choice of its value is completely arbitrary
but it is also different because it has no thermodynamic significance. You need
to use it when defining the new LEM because, at the risk of appearing
repetitive, you can’t calculate a logarithm for a quantity that has units. Another way of thinking about Cref
is as an arbitrary unit of concentration that we’re using to analyze some
potency measurements. Something that is
really, really important and fundamental in science is that your perception of
a system should not change when you change the units of the quantities that
describe the system. If it does then, in
the words of Pauli, you are “not even wrong” and you should avoid claiming
penetrating insight so as to prevent embarrassment later. So let’s take a look at
how changing Cref affects our perception of ligand efficiency. The table below is essentially the same as
what was given in the Perspective article (the only difference is that I’m
using IC50 in the blog post rather than Kd). I also should point out that Huan-Xiang Zhou
and Mike Gilson made a similar criticism of LE back in 2009 (although we cited their
article in the context of standard states, we failed to notice the critique at
the end of their article and there really is no excuse for having missed
it). When a reference concentration of 1
M is used, the three compounds are all equally ligand efficient according to
the new LEM. If Cref = 0.1 M,
the compounds appear to become more ligand efficient as molecular size
increases but the opposite behavior is observed for Cref = 10 M.
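To make the dependence on Cref concrete, here is a minimal Python sketch of the metric defined above. The three compounds (10, 20 and 30 heavy atoms with IC50 values of 1 mM, 1 µM and 1 nM) are hypothetical values that I've chosen for illustration, and the function names are my own.

```python
import math


def scaled_metric(ic50_molar, n_heavy_atoms, c_ref_molar=1.0):
    """Return (-1/NHA) * log10(IC50/Cref), with both concentrations in mol/L."""
    return -math.log10(ic50_molar / c_ref_molar) / n_heavy_atoms


# Hypothetical compounds as (NHA, IC50 in M); illustrative values only
compounds = [(10, 1e-3), (20, 1e-6), (30, 1e-9)]


def metric_table(c_ref_values=(0.1, 1.0, 10.0)):
    """Metric for every compound at each arbitrary reference concentration."""
    return {
        c_ref: [scaled_metric(ic50, nha, c_ref) for nha, ic50 in compounds]
        for c_ref in c_ref_values
    }


for c_ref, values in metric_table().items():
    print(f"Cref = {c_ref} M:", [round(v, 3) for v in values])
```

With these assumed inputs the values are level at Cref = 1 M, rise with molecular size at 0.1 M and fall with molecular size at 10 M, even though nothing about the compounds themselves has changed.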
Here’s a figure that shows the problem from a different
angle. See how the three parallel lines respond to the scaling transformation (Y => Y/X). The line that passes through
the origin transforms to a line of zero slope (Y/X is independent of X) while
the other two lines transform to curves (Y/X is dependent on X). This graphic has important implications for
fit quality (FQ) because it shows that some of the size dependency of LE is a
consequence of the (arbitrary) choice of a value (usually 1 M) for Cref
or C°.
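The same behavior can be sketched numerically. For a line Y = mX + c the scaling transformation gives Y/X = m + c/X, so only a zero intercept removes the dependence on X. The slope of 0.3 and the intercepts of −1, 0 and +1 below are assumptions made for illustration (changing Cref by a factor of ten shifts the intercept by one log unit).

```python
def scaled(x, slope, intercept):
    """Apply Y => Y/X to the line Y = slope*X + intercept."""
    return (slope * x + intercept) / x


sizes = [10, 20, 30, 40]  # hypothetical values of X (e.g. heavy atom counts)

# Three parallel lines that differ only in intercept
for intercept in (-1.0, 0.0, 1.0):
    row = [round(scaled(x, 0.3, intercept), 3) for x in sizes]
    print(f"intercept = {intercept:+.1f}:", row)
```

Only the line through the origin gives a flat Y/X. The other two give Y/X values that drift with X purely because of where the intercept was placed.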
A common response that I’ve encountered when raising these
points is that LE is still useful. My counter-response
is that Religion is also useful (e.g. as a means for pastors to fleece their flocks
efficiently) but that, by itself, neither makes it correct nor ensures that
it is being used correctly (if that is even possible for Religion). One occasionally expressed opinion is that,
provided you use a single value for Cref or C°, the resulting LE values will be
consistent. However, consistency is no
guarantee of correctness and we need to remember that we create metrics for the
purpose of measuring things with them. When you advocate use of a metric in drug discovery,
the burden of proof is on you to demonstrate that the metric has a
sound scientific basis and actually measures what it is supposed to measure.
The Perspective is critical of the LEMs used in drug
discovery but it does suggest alternatives that do not suffer from the same
deficiencies. This is a good point to
admit that it took me a while to figure out what was wrong with LE and I’ll
point you towards a blog post from over five years ago that will give you an idea about how my
position on LEMs has evolved. It is
often stated that LEMs normalize activity with respect to a risk factor although
it is rarely, if ever, stated explicitly what is meant by the term ‘normalize’. One way of thinking about normalization is
as a way to account for the contribution of a risk factor, such as molecular
size or lipophilicity, to activity. You
can do this by modelling the activity as a function of risk factor and using the
residual as a measure of activity that has been corrected for the contribution
made by the risk factor in question. You
can also think of an LEM as a measure of the extent to which the activity of a
compound beats a trend (e.g. linear response of activity to HA). If you’re
going to do this then why not use the trend actually observed in the data rather
than some arbitrarily assumed trend?
Although I still prefer to use residuals to quantify the extent
to which activity beats a trend, the process of modelling activity measurements
hints at a way that LE might be rehabilitated to some extent. When you fit a line to activity (or affinity)
measurements, you are effectively determining the value of Cref (or C°) that will make the line
of fit pass through the origin and which you can then use to redefine activity
(or affinity) for the purpose of calculating LE. I would argue that the
residual is a better measure of the extent to which the activity of a compound
beats the trend in the data because the residual has sign and the uncertainty
in it does not depend explicitly on the value of a scaling variable such as HA.
However, LE defined in this manner can at least be claimed to take account of
the observed trend in the activity data. Something that you might want to think about in this context is whether or not you'd expect the same parameters (slope and intercept) if you were to fit activity measured against different targets to MW or HA. My reason for bringing this up is that it has implications for the validity of mixing results from different assays in LE-based analyses so see what you think.
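Here is a rough Python sketch of both ideas: residuals from a fitted line, and the value of Cref that makes that line pass through the origin. The heavy atom counts and pIC50 values are made-up numbers for a single hypothetical assay, and I've assumed a plain least-squares fit of pIC50 against NHA, which is only one of several ways the trend could be modelled.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical data: heavy atom counts and pIC50 = -log10(IC50 / 1 M)
nha = [12, 16, 20, 24, 28, 32]
pic50 = [4.1, 5.3, 5.6, 6.9, 7.0, 8.2]

slope, intercept = fit_line(nha, pic50)

# Residual: how far each compound sits above (+) or below (-) the observed trend
residuals = [y - (slope * x + intercept) for x, y in zip(nha, pic50)]

# -log10(IC50/Cref) = pIC50 + log10(Cref / 1 M), so the fitted line
# passes through the origin when log10(Cref / 1 M) = -intercept
c_ref = 10.0 ** (-intercept)

# Activity redefined against the fitted Cref, then scaled by NHA
shifted = [y + math.log10(c_ref) for y in pic50]
rescaled_le = [y / x for x, y in zip(nha, shifted)]

print(f"slope = {slope:.3f}, intercept = {intercept:.3f}, Cref = {c_ref:.3g} M")
```

The residuals carry a sign and need no scaling variable, while the rescaled values depend on a Cref that has been taken from the data rather than assumed. Fitting a different assay would generally give a different slope and intercept, and therefore a different Cref.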
I’ll wrap up by directing you to a presentation that I've been
doing lately and it includes material from the earlier correlation inflation study. A point that I make when presenting
this material is that, if we do bad science and bad data analysis, people can
be forgiven for thinking that the difficulties in drug discovery may actually be
of our own making.