It appears that whoever first described Economics as the ‘Dismal Science’ had never encountered a ligand efficiency metric. I’ll be taking a look at the FK2025 study (Covalent ligand efficiency) in this post and the study has already been reviewed by Dan. Something that I’ve observed repeatedly over the years is that authors of ligand efficiency studies exhibit a lack of understanding of units and dimensions associated with physicochemical quantities that would shame a first year undergraduate studying introductory physical chemistry (this is somewhat ironic given that creators of ligand efficiency metrics frequently tout their creations in physicochemical terms). I consider covalent ligand efficiency (CLE) as defined in the FK2025 study to have no value whatsoever for design of drugs that bind irreversibly to their targets through covalent bond formation given that the metric is time-dependent and based on an invalid measure of bioactivity. The formidable Lady Bracknell is clearly unimpressed and I should mention that the photo is from the wikipedia page for English actress Rose Leclercq (1843-1899). Given the serious deficiencies in the FK2025 study this is going to be a long and tedious post (even more so than usual 😁😁😁) so please ensure that you have strong coffee close to hand. As is usually the case in posts here I've used the same reference numbers as were used in FK2025 and quoted text is indented with my comments in red italics. I've organized some of the mathematical material into three tables and references to tables in the post are to these (and not to any of the tables in the FK2025 study).
Before starting the review of FK2025 it’s worth examining irreversible covalent inhibition from a molecular design perspective and I’ll direct readers to the informative S2016 and McW2021 reviews, and the recent L2025 study which presents COOKIE-Pro for covalent inhibitor binding kinetics profiling on the proteome scale. Covalent bond formation between RNA and ligands can also be exploited (S2025 | K2025 | L2015) and I generally use 'target' rather than 'protein' in blog posts and journal articles. An irreversible covalent inhibitor acts in two steps: it first binds non-covalently to its target and the covalent bond then forms between an electrophilic ligand atom (the term warhead is commonly used) and a nucleophilic target atom such as the sulfur atom of a cysteine. A commonly used measure of activity for irreversible covalent inhibitors is the kinact/KI ratio which can be thought of as the product of affinity (1/KI) and reactivity (kinact). In design of irreversible covalent inhibitors we try to place the electrophilic atom of the warhead within reacting distance of the nucleophilic atom of the target (this is relatively easy if you have a reliable structure of a complex of the target with a relevant ligand that lacks the electrophilic warhead). The non-covalent complex between target and ligand is stabilised by the non-covalent contacts between the target and ligand (the term ‘molecular interactions’ is also used although I prefer to think in terms of ‘non-covalent contacts’ since the latter can be observed experimentally). However, non-covalent contacts also determine reactivity of the non-covalently bound complex by stabilising the transition state (I consider it more correct to think in terms of reactivity of the complex than in terms of reactivity of either the electrophilic warhead or the target nucleophile).
In the design context, this means attempting to tune non-covalent contacts to stabilise the transition state to a greater extent than the non-covalent complex.
The LE and CLE metrics share a very serious deficiency in that your perception of efficiency can be altered if you change the value of an arbitrary term in the formula for the metric and I'll start the review of FK2025 by critically examining LE. The meaninglessness of LE stems from a fundamental misunderstanding of how logarithms work and I'll point you toward M2011 (Can one take the logarithm or the sine of a dimensioned quantity or a unit? Dimensional analysis involving transcendental functions) that was published in the Journal of Chemical Education. In drug discovery we frequently need to calculate logarithms for quantities and you need to be aware that you can’t calculate the logarithm of a dimensioned quantity. Let’s take pIC50 as an example and this quantity is commonly defined as the negative logarithm of the IC50 in mole per litre (M). However, what you actually do when you calculate pIC50 is that you take the negative logarithm of the numerical value of the IC50 when expressed in mole per litre (this is a bit of a mouthful and it can be written more compactly as Equation 1 below). While not denying that it is useful to have a convention such as this for expressing potency values logarithmically it should be remembered that the choice of mole per litre (M) is entirely arbitrary and it would be equally correct to use other valid concentration units such as μM or nM. One consequence of choosing mole per litre (M) for expressing IC50 values is that pIC50 values (or at least measured pIC50 values) will generally be positive because of the extreme difficulty of measuring meaningful IC50 values that are greater than 1 M.
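The point about units can be made concrete with a minimal Python sketch. The IC50 values and compound names below are hypothetical and the function name `p_ic50` is simply a label chosen for this illustration. The sketch shows that changing the concentration unit shifts every pIC50 value by the same constant, so differences between compounds are unaffected.

```python
import math

# Hypothetical IC50 values stored in mol/L (illustrative only, not from any study).
ic50_molar = {"compound_A": 2.0e-8, "compound_B": 5.0e-6}

# Each unit is represented by its size in mol/L.
units = {"M": 1.0, "uM": 1.0e-6, "nM": 1.0e-9}

def p_ic50(ic50_in_molar, unit_in_molar):
    """Negative log10 of the numerical value of IC50 expressed in the chosen unit."""
    return -math.log10(ic50_in_molar / unit_in_molar)

for unit_name, unit_value in units.items():
    values = {name: p_ic50(ic50, unit_value) for name, ic50 in ic50_molar.items()}
    difference = values["compound_A"] - values["compound_B"]
    # The individual values change with the unit; the difference does not.
    print(unit_name, {k: round(v, 2) for k, v in values.items()}, round(difference, 2))
```

The constant offset is harmless as long as the logarithmic values are only ever compared by subtraction. It stops being harmless once the values are divided by something that differs between compounds.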
Let’s take a look at the binding free energy ΔG° and you’ll notice that I’ve written it with a degree symbol which indicates that this quantity corresponds to a standard state defined by a concentration value C° (the standard concentration). Equation 2 shows how the binding free energy is defined as a difference in chemical potential between ‘reactants’ (target + ligand) and ‘product’ (target-ligand complex) with each species at the standard concentration (the degree symbol indicates that both the binding free energy and chemical potential depend on the value of C° and I’ve also shown this explicitly in the equation although this is not actually necessary). Equation 3 shows the dependence of chemical potential on the concentration C of the species and the standard concentration C°. Taken together, Equation 2 and Equation 3 should clarify where the dependence of binding free energy on the standard concentration comes from (there are two ‘reactants’ but only one ‘product’). We can’t actually measure binding free energy directly but we can calculate it from the dissociation constant KD using Equation 4 (which can be derived from Equation 2 and Equation 3). It’s important to be aware that if you use Equation 4 to convert ΔG° values between different values of the standard concentration C° you’ll be making the assumption that solutions are dilute (ΔH is independent of concentration) and this is shown in Equation 5.
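As a rough numerical illustration, the sketch below uses the standard dilute-solution relationship ΔG° = RT ln(KD/C°), which is assumed here to be the form that the conversion from KD takes. The KD values, the temperature and the function name `delta_g_standard` are all assumptions made for the example.

```python
import math

R = 8.314462618e-3  # gas constant in kJ/(mol K)
T = 298.15          # assumed temperature in K

def delta_g_standard(kd_molar, c_standard_molar=1.0):
    """Standard binding free energy in kJ/mol, assuming dG = RT ln(KD / C_standard)."""
    return R * T * math.log(kd_molar / c_standard_molar)

# Hypothetical dissociation constants in mol/L.
kd_weak, kd_tight = 1.0e-3, 1.0e-9

for c_standard in (0.1, 1.0, 10.0):
    dg_weak = delta_g_standard(kd_weak, c_standard)
    dg_tight = delta_g_standard(kd_tight, c_standard)
    # Each dG value shifts by RT ln(C_standard), but their difference is unchanged.
    print(c_standard, round(dg_weak, 2), round(dg_tight, 2), round(dg_tight - dg_weak, 2))
```

Under these assumptions each ΔG° value moves when C° changes, while the ΔΔG between the two ligands stays the same.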
Let’s now take a look at ligand efficiency (LE) and you can see from the photo above that some heretics regard the metric as physically nonsensical (if you're interested in how I came to be chatting with fellow blogger Ash then take a look at this post). The LE metric, which is regarded as an article of faith in the fragment-based design community, was introduced in the (p5) study with the symbol Δg (see Equation 6 below) and the authors of that study did not actually state that it had to be calculated using a C° value of 1 M (I consider it unlikely that any of the authors were even aware of the dependence of ΔG° on C°). In The Nature of Ligand Efficiency (p9) I defined the quantity ηbind (see Equation 7) by dividing Δg (LE) by RT (when LE values are quoted the molar energy units are usually discarded and T often does not correspond to the temperature at which the assay was run) and by the factor (2.303) used to convert between natural logarithms and base 10 logarithms. The quantity ηbind is directly proportional to Δg (LE) and using it makes it much easier to see how using a different standard concentration can alter your perception of efficiency. Take a look at Table 1 in (p9) and you’ll see that the three compounds (a fragment, a lead and a clinical candidate) bind with equal efficiency when C° is 1 M. Change C° to 0.1 M and the clinical candidate binds more efficiently than the fragment but when C° is changed to 10 M the fragment becomes more ligand-efficient than the clinical candidate. As noted in (p9) “In thermodynamic analysis, a change in perception resulting from a change in a standard state definition would generally be regarded as a serious error rather than a penetrating insight.”
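The change in perception can be reproduced with a few lines of Python. The heavy atom counts and KD values below are hypothetical numbers chosen so that the three compounds come out equally efficient at C° = 1 M (they are not copied from Table 1 in (p9)), and `eta_bind` is assumed to take the form of minus log10(KD/C°) divided by the number of non-hydrogen atoms.

```python
import math

# Hypothetical compounds: (number of heavy atoms, KD in mol/L). Illustrative only.
compounds = {
    "fragment": (10, 1.0e-3),
    "lead": (20, 1.0e-6),
    "clinical candidate": (30, 1.0e-9),
}

def eta_bind(kd_molar, heavy_atoms, c_standard_molar):
    """Size-scaled affinity: -log10(KD / C_standard) per heavy atom (assumed form)."""
    return -math.log10(kd_molar / c_standard_molar) / heavy_atoms

for c_standard in (0.1, 1.0, 10.0):
    scores = {
        name: eta_bind(kd, n_heavy, c_standard)
        for name, (n_heavy, kd) in compounds.items()
    }
    # Rank from most to least 'efficient' for this choice of standard concentration.
    ranking = sorted(scores, key=scores.get, reverse=True)
    print(c_standard, {k: round(v, 3) for k, v in scores.items()}, ranking)
```

Changing C° adds the same constant to each logarithm, but dividing by the heavy atom count turns that constant into a size-dependent shift, which is why the ordering can flip.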
Here's what the authors of FK2025 say about LE:
LE depends on the choice of the standard concentration (normally 1 M) (p8) (p9) and its maximal available value is size dependent.(p10) (p11) [It's true that LE depends on C° but it’s also true that ΔG° depends on C° and the difference between the two dependencies is that LE “depends upon the choice of standard concentration in a nontrivial fashion” (p8). The issue is not so much that LE depends on C° but that using a different unit to express KD changes how we perceive efficiency. The ΔΔG values that determine perception of affinity don’t change when you use a different value of C° (equivalent to using a different unit to express KD). However, if you use a different value of C° for calculating LE you can see from Table 1 in (p9) that even the ordering of LE values between two ligands can change. I consider the molecular size dependencies of LE observed by the authors of (p10) and (p11) to be artefactual and I’ll point you toward Fig. 1 in (p9) which shows that using a different value of C° can change how we perceive the molecular size dependency of LE.] Nevertheless, LE is an established tool to normalize potency and facilitate the comparison of ligands with a range of potencies and sizes. [It is not uncommon for adherents of religions to consider their beliefs to be established facts.] The usefulness of LE and other efficiency metrics in drug discovery has been extensively analyzed and reviewed elsewhere. (p6) (p12) (p13) (p14) (p15) (p16) (p17) [My view is that nobody has actually demonstrated the usefulness of LE and I’m unconvinced that it would even be possible to do so meaningfully in an objective manner (consider the feasibility of comparing success rates between a group of individuals using LE in discovery projects and a control group of individuals not using LE in discovery projects).
Usefulness means that using something provides demonstrable benefits and ‘widely-used’ is not equivalent to ‘useful’ (I’m guessing that more people use homeopathic ‘medicines’ than use ligand efficiency metrics). One piece of advice that I’ll offer to anybody advocating the use of LE in drug design is to ensure that you fully understand the implications of changes in perception resulting from using different units to express quantities not least because you might find yourself lecturing to people who do understand.]
After a lengthy preamble it’s now time to review the FK2025 study. One of the challenges in design of drugs that engage their targets irreversibly is that it’s not possible to meaningfully quantify activity with a single parameter. This is particularly relevant to definition of efficiency metrics which are typically derived by either scaling or offsetting a measured activity value by a risk factor such as molecular size or lipophilicity. While you can certainly measure an IC50 value for an irreversible covalent inhibitor the value that you measure will be time-dependent and it’s not generally meaningful to compare two IC50 values that have been measured using different incubation times. While the kinact/KI ratio is time-independent using it as a measure of activity necessarily entails a degree of information loss.
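To see why an IC50 value for an irreversible inhibitor depends on incubation time, consider a deliberately simplified two-step model. The assumptions are mine rather than anything taken from FK2025: no competing substrate, inhibitor in large excess over target, and activity proportional to the fraction of unmodified target, which decays as exp(-kobs·t) with kobs = kinact·[I]/(KI + [I]). Setting that fraction to one half gives IC50(t) = KI·ln2/(kinact·t − ln2). The rate constants below are hypothetical.

```python
import math

def ic50_at_time(k_inact, k_i, t):
    """IC50 after incubation time t in the simplified model described above.

    k_inact in 1/s, k_i in mol/L, t in s. Returns math.inf when the incubation
    is too short for half of the target to be inactivated at any concentration.
    """
    half = math.log(2.0)
    if k_inact * t <= half:
        return math.inf
    return k_i * half / (k_inact * t - half)

# Hypothetical inhibitor: k_inact = 1e-3 per second, K_I = 1 micromolar.
k_inact, k_i = 1.0e-3, 1.0e-6

for minutes in (15, 60, 240):
    t = minutes * 60.0
    print(minutes, "min:", ic50_at_time(k_inact, k_i, t), "mol/L")
```

Within this model the IC50 keeps falling as the incubation gets longer and, for long incubations, approaches ln2 divided by (kinact/KI)·t, so two IC50 values measured at different incubation times are not comparable.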
The authors of FK2025 state:
Our starting point is the LE introduced for noncovalent ligands as a useful metric for lead selection. (p5) [LE was claimed to be useful when it was introduced although no evidence was presented in support of the claim.]
Let’s take a look at Table 2 which shows two equations from FK2025. The first equation, which appears in the text of the article, illustrates two common errors in the efficiency metric field (taking logarithms of dimensioned quantities and discarding units). Either error should ring alarm bells for the reader, especially if the authors go on to interpret values of the efficiency metrics.
The authors of FK2025 assert that “LE can be decomposed into contributions from the noncovalent recognition and the covalent reaction (Box 2, Equation III)” and this is reproduced in Table 2 as Equation 2. The first term is a commonly-used mathematical formula for LE when inhibition is reversible and it is important to be aware that KI has been divided by an arbitrary concentration value (1 M) in order that it can be expressed as a logarithm (see Equation 1 in Table 1). The argument of the logarithm in the second term is dimensionless although its magnitude does vary with t. Each term in Equation III (Box 2) has a nontrivial dependence on the value of an arbitrary quantity (the 1 M concentration in the first term and t in the second term). This means that your perception of efficiency when calculated according to Equation III (Box 2) will be altered if you use either a different concentration unit or a different value of t. You can see this effect in Figure 2 (effect of varying t) and the appearance of Figure 1 will be altered if you use a value of t other than 1 h or a concentration unit other than M for the calculation of LE.
It's now time to examine CLE (defined as Equation II in Box 3 of the FK2025 study) and I’ll direct you to Table 3 below in which I’ve made some comments. Using CLE requires that the IC50 values for all the inhibitors of interest correspond to the same time point (t) and it is not clear whether the authors are suggesting that the IC50 values should all be measured using the same incubation time or need to be calculated from measured KI and kinact values using Equation III in Box 2. A quantity t is also explicitly present in the argument of the logarithm in Equation II in Box 3 and this is necessary for the argument of the logarithm to be dimensionless (see M2011). The argument of the logarithm in Equation II (Box 3) is clearly time-dependent and this means that your perception of efficiency will be altered if you use a different value of t when calculating CLE (just as your perception of efficiency will be altered if you use a different concentration unit to express IC50 when you calculate LE for reversible inhibitors). It also means that the molecular size dependency of CLE will vary with time just as the molecular size dependency of LE varies with the concentration unit used to express affinity, as can be seen in Fig. 1 of (p9).
However, there is another difficulty which is that the argument of the logarithm in Equation II (Box 3) is not a valid measure of activity (the same criticism can also be made of the xLE metric introduced in the Z2025 study that Dan has already reviewed). This problem is a bit more subtle and it’s important to remember that knowing the IC50 value for a reversible inhibitor enables you to generate a concentration response for inhibition. When you express an IC50 value as a logarithm you need to scale it by a concentration value to ensure that the argument of the logarithm function is dimensionless (see M2011) but it’s important to remember that the concentration unit is still there even though it’s not shown (see Equation 1 in Table 1).
This is a good point at which to wrap up and I’ve argued that CLE has two deficiencies. First, perception of efficiency and its dependency on molecular size both vary with an arbitrary quantity (t) in the argument of the logarithm (this is analogous to the problems caused by the arbitrary nature of the concentration unit used for scaling affinity/potency in the definition of LE for reversible binders). Second, the argument of the logarithm is not a valid measure of activity because it cannot be used to generate a concentration response. Furthermore, I would question the value of aggregating results from multiple assays for analysis even for a valid metric without these deficiencies and I offered the following advice in (p9):
Drug designers should not automatically assume that conclusions drawn from analysis of large, structurally-diverse data sets are necessarily relevant to the specific drug design projects on which they are working.
I’ve criticized the FK2025 study at length and saying how I might use data like this in drug design projects is a good way to conclude the post. A general criticism that I have made of drug design efficiency metrics is that they are based on assumptions of relationships between activity and risk factors such as molecular size. I argued in (p9) that one should use the trend that is actually observed in the data to normalize activity with respect to risk factors and I’ll point you to the relevant section (Alternatives to ligand efficiency for normalization of affinity) in that article. I would start by attempting to model the relationship between kinact and reactivity with glutathione. The objective of this exercise is to identify inhibitors that best exploit their intrinsic reactivity when forming covalent bonds with the target residue (you can quantify this by how far the point for an inhibitor lies above the trend line and the most interesting compounds have the largest positive residuals). I would also examine the relationship between kinact and KI with a view to identifying the inhibitors for which non-covalent interactions with the target most effectively stabilise the transition state relative to the non-covalent complex. I should stress that there is no suggestion that these analyses would necessarily yield useful insight.
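The residual-based analysis described above might be sketched as follows. Everything in the example is hypothetical: the inhibitor names, the log-transformed values and the choice of an ordinary least-squares straight line as the trend model are assumptions made purely to show the mechanics.

```python
# Hypothetical data: inhibitor -> (log10 glutathione reactivity, log10 kinact).
# The underlying rate constants are assumed to have been divided by a stated
# unit before taking logarithms.
data = {
    "inh_A": (-3.0, -3.5),
    "inh_B": (-2.5, -2.4),
    "inh_C": (-2.0, -2.9),
    "inh_D": (-1.5, -1.6),
    "inh_E": (-1.0, -2.0),
}

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

xs = [x for x, _ in data.values()]
ys = [y for _, y in data.values()]
slope, intercept = fit_line(xs, ys)

# Residual = observed log kinact minus the value predicted from GSH reactivity.
residuals = {name: y - (slope * x + intercept) for name, (x, y) in data.items()}

# Largest positive residuals first: these inhibitors beat the trend by the most.
for name in sorted(residuals, key=residuals.get, reverse=True):
    print(name, round(residuals[name], 2))
```

Because the residuals are differences between logarithms, they do not change if the rate constants are expressed in different units, which is the property that LE and CLE lack. Whether a straight line is an adequate trend model would need to be checked against the real data.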




