In the previous post, I questioned the validity of
scaled ligand efficiency metrics (LEMs) such as LE. However, LEMs can also be defined by
subtracting the value of the risk factor from activity and this has been termed offsetting. For example, you can
subtract a measure of lipophilicity from pIC50 to give functions
such as:
pIC50 – logP
pIC50 – ClogP
pIC50 – logD(pH)
As you will have gathered from the previous post, I am
not a big fan of naming LEMs. The reason for this is that you often (usually?)
can’t tell from the definition exactly what has been calculated and I think it
would be a lot better if people were forced (journal editors, you can achieve something
of lasting value here) to be explicit about the mathematical function(s) with
which they normalize the activity of compounds. In
some ways, the problems are actually worse when activity is normalized by
lipophilicity because a number of different measures of lipophilicity can be used
and because differences between lipophilicity measures are not always well understood
(even by LEM ‘experts’). Does LLE mean ‘ligand-lipophilicity efficiency’ or ‘lipophilic ligand efficiency’?
When informed that a compound has an LLE of 4, how can I tell whether it
has been calculated using logD (please specify pH), logP or predicted logP (please specify
prediction method since there are several to choose from)?
Have a look at this figure and observe how the three
lines respond to the offsetting transformation (Y => Y-X). The line with unit slope transforms to a line
of zero slope (Y-X is independent of X) while the other two lines transform to
lines of non-zero slope (Y-X is still dependent on X). This figure is analogous to the one in the
previous post that showed how three parallel lines transformed under the
scaling transformation.
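If it helps to see the arithmetic, here is a minimal sketch (with invented slopes and intercepts, purely for illustration) of how lines of different slope behave under the offset transformation; only the unit-slope line gives a Y-X that is independent of X.

# Minimal numerical sketch (invented values) of the offset transformation Y -> Y - X.
# A line of unit slope becomes flat (independent of X); lines with other slopes do not.
import numpy as np

X = np.linspace(0, 5, 6)

for slope, intercept in [(0.5, 2.0), (1.0, 2.0), (1.5, 2.0)]:
    Y = slope * X + intercept
    offset = Y - X                              # the offset transformation
    print(f"slope = {slope}: Y - X = {np.round(offset, 2)}")
# Only the slope = 1.0 line gives a constant Y - X; the others still vary with X.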
It’s going to be helpful to generalize Lipophilic
Efficiency (LipE) so let’s do that first:
LipEgen = pIC50 – (λ × ClogP)
Generalizing
LipE in this manner shows us that LipE (λ = 1) is actually quite arbitrary (in much the same way that a standard or
reference state is arbitrary) and one might ask whether λ = 0.5 might not be a better LEM. Note that a similar criticism was made of the Solubility Forecast Index in the Correlation Inflation Perspective. One
approach to validating an LEM would be to show that it actually predicted
relevant behavior of compounds. In the case of LEMs based on lipophilicity, it would be necessary to
show that the best predictions were observed for λ = 1. Although
one can think of LEMs as simple quantitative structure-activity relationships
(QSARs), they are rarely, if ever, validated in a way
that QSAR practitioners would regard as acceptable.
Can anybody find a sentence in the pharmaceutical literature containing the words ‘ligand’, ‘efficiency’ and ‘validated’? Answers on a postcard...
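To make the dependence on λ concrete, here is a small illustrative sketch of the generalized metric defined above. The pIC50 and ClogP values are invented purely for demonstration; the point is simply that the ranking of compounds can change when λ changes.

# Illustrative sketch of generalized lipophilic efficiency: LipEgen = pIC50 - (lambda * ClogP).
# The pIC50 and ClogP values below are invented purely for demonstration.
pIC50 = [7.2, 6.5, 8.1]
ClogP = [3.5, 1.8, 4.9]

def lipe_gen(pic50, clogp, lam=1.0):
    # Offset LEM with a tunable weight on the lipophilicity term; lam = 1 recovers LipE.
    return pic50 - lam * clogp

for lam in (0.5, 1.0):
    values = [round(lipe_gen(p, c, lam), 2) for p, c in zip(pIC50, ClogP)]
    print(f"lambda = {lam}: {values}")
# With these made-up numbers the third compound ranks worst at lambda = 1
# but best at lambda = 0.5, which illustrates the arbitrariness of the choice.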
Offset LEMs do
differ from scaled LEMs and one might invoke a thermodynamic argument to
justify the use of LipE as an LEM. In a nutshell,
it can be argued that LipE is a measure of the ease of moving a compound from a
non-polar environment to its binding site in the protein. There are two flaws in this argument, both of which were
discussed in our LEM critique (which will be open access for another 10 days or so). Firstly, when a
ligand binds in an ionized form, lipophilicity measures do not quantify the
ease of moving the bound form from octanol to water because ionized forms of
compounds do not usually partition into octanol to a significant extent.
Secondly, octanol/water is just one of a number of partitioning systems and one
needs to demonstrate that lipophilicity derived from it is optimal for
definition of an LEM. The figure below shows
how logP can differ when measured in alternative partitioning systems and you
should be aware of an occasionally expressed misconception that the relevant logP
values simply differ by a constant amount.
One solution to
the problem is to model pIC50 as a function of your favored measure of lipophilicity and use the
residuals to quantify the extent to which activity beats the trend in the
data. This is exactly what I suggested in the previous post as an
alternative to scaling activity by risk factors such as molecular weight or
heavy atoms and the approach can be seen as bringing these risk factors and
lipophilicity into a common data-analytical framework. Even if you don’t like the idea of using the
residuals, it is still useful to model the measured activity because a slope of
unity helps to validate LipE (assuming that you’re using ClogP to model activity).
Even if the slope of the line of fit
differs from unity, you can set λ to its value to create a lipophilic efficiency metric
that has been tuned to the data set that you wish to analyze.
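As a sketch of what this might look like in practice (the data below are invented and ClogP is just one possible choice of lipophilicity measure), one could fit a straight line, work with the residuals, and, if desired, use the fitted slope as λ:

# Hypothetical sketch of the residual-based approach: fit pIC50 against a chosen
# lipophilicity measure, then use the residuals to show how far each compound
# beats (or trails) the trend in the data. Data are invented for illustration.
import numpy as np

ClogP = np.array([1.2, 2.0, 2.7, 3.5, 4.1, 4.8])
pIC50 = np.array([5.1, 5.8, 6.0, 6.9, 7.1, 7.2])

slope, intercept = np.polyfit(ClogP, pIC50, 1)   # ordinary least-squares line of fit
residuals = pIC50 - (slope * ClogP + intercept)  # activity relative to the observed trend

print(f"slope of fit = {slope:.2f}")             # a slope near 1 would support LipE (lambda = 1)
print("residuals:", np.round(residuals, 2))

# Alternatively, tune an offset metric to this data set by using the fitted slope as lambda:
lipe_tuned = pIC50 - slope * ClogP
print("tuned offset metric:", np.round(lipe_tuned, 2))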
This is a good point
at which to wrap up. As noted (and reiterated)
in the LEM critique, when you use LEMs, you're making assumptions about
trends in data and your perception of the system is distorted when these
assumptions break down. Modelling the
data by fitting activity to the risk factor allows you to use the trends actually
observed in the data to normalize activity.
That’s just about all I want to say for now and please don’t get me started
on LELP…