So you’ve got a little data analysis problem. You have some compounds with a range of IC_{50} values and you’d like to explore the extent to which molecular size contributes to potency/affinity. Here’s one suggestion for starters. Plot pIC_{50} against your favourite measure of molecular size (number of non-hydrogen atoms or molecular weight, for example) and look at the residuals, which tell you how much each compound beats the trend (or is beaten by it). Let’s start by defining pIC_{50}, which you can calculate from:
pIC_{50} = -log_{10}(IC_{50}/M)    (1)
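If you’d rather let a script do the arithmetic, here is a minimal sketch of equation 1 in Python. The function names are my own invention for illustration, and the logarithm is taken to be base 10, which is the usual convention for pIC_{50}.

```python
import math


def pic50_from_ic50(ic50_molar: float) -> float:
    """Return pIC50 = -log10(IC50 / M) for an IC50 expressed in mol/L."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def pic50_from_nanomolar(ic50_nm: float) -> float:
    """Convenience wrapper for IC50 values reported in nM (hypothetical helper)."""
    return pic50_from_ic50(ic50_nm * 1e-9)


if __name__ == "__main__":
    # A 1 micromolar IC50 corresponds to a pIC50 of about 6.
    print(round(pic50_from_ic50(1e-6), 2))
```

Dividing by the unit (M) before taking the logarithm is what makes pIC_{50} a pure number, so it pays to convert everything to molar first.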
You might be asking why I’m not suggesting that you use the standard Gibbs free energy of binding, which is defined by:

ΔG° = RT ln(K_{d}/C°)    (2)
The main reason for not doing this is that we can’t. You’ll notice that ΔG° is calculated from K_{d} and not IC_{50}, and the two are not the same thing even though they both have units of concentration. Those of you who have worked on kinase projects may even have used IC_{50} values measured at different ATP concentrations to get a better idea of how much kick your inhibitors will have at physiological ATP concentration. Put another way, you can measure the concentration of sugar in your coffee and plug this into equation 2 but that does not make what you calculate a standard Gibbs free energy of binding. If you’ve measured IC_{50} then you really should use pIC_{50} in this analysis. Converting pIC_{50} to ΔG° is technically incorrect and arguably pretentious, since the converted value can give the impression that it is somehow more thermodynamic than the pIC_{50} from which it was calculated. Converting pIC_{50} to ΔG° also introduces additional units (of energy/mole), and there is always a degree of irony when these units, which may have been introduced just to make biological data look more physical, get lost when the results are presented.
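To see why IC_{50} depends on the assay in a way that K_{d} does not, here is a rough sketch that assumes a purely ATP-competitive inhibitor obeying the Cheng-Prusoff relationship, IC_{50} = K_{i}(1 + [ATP]/K_{m}). That assumption, the function name and all of the numbers below are mine and are there only for illustration.

```python
import math


def ic50_competitive(ki_molar: float, atp_molar: float, km_molar: float) -> float:
    """Cheng-Prusoff IC50 for a competitive inhibitor: Ki * (1 + [S]/Km)."""
    return ki_molar * (1.0 + atp_molar / km_molar)


if __name__ == "__main__":
    # Hypothetical inhibitor and enzyme: Ki = 10 nM, Km for ATP = 10 uM.
    ki, km = 1e-8, 1e-5
    # Assay run at ATP = Km versus an assumed millimolar cellular ATP level.
    for atp in (1e-5, 1e-3):
        ic50 = ic50_competitive(ki, atp, km)
        print(f"[ATP] = {atp:.0e} M  IC50 = {ic50:.2e} M  pIC50 = {-math.log10(ic50):.2f}")
```

Same compound, same enzyme, same K_{i}, and yet the pIC_{50} moves with the ATP concentration, which is why plugging an IC_{50} into equation 2 does not give you a ΔG°.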
So let’s get back to the data analysis problem. Suppose that we’ve plotted pIC_{50} against number of heavy (i.e. non-hydrogen) atoms and the next step is to fit the data. The best way to start is to fit a straight line, although you could also fit a curve if the data justify this. Let’s assume that we’re fitting the straight line:
pIC_{50} = A + B×N_{HA}    (3)
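Here is a minimal sketch of fitting equation 3 by ordinary least squares and pulling out the residuals. The data are made up purely for illustration and `fit_line` is a hypothetical helper, written out longhand so that nothing beyond plain Python is needed.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = A + B*x; returns (A, B)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Invented example data: heavy atom counts and measured pIC50 values.
n_heavy = [12, 15, 18, 22, 25, 30]
pic50 = [4.1, 4.9, 5.0, 6.2, 6.1, 7.3]

a, b = fit_line(n_heavy, pic50)

# Residual = observed minus fitted; positive means the compound beats the trend.
residuals = [obs - (a + b * n) for n, obs in zip(n_heavy, pic50)]

if __name__ == "__main__":
    print(f"A = {a:.2f}, B = {b:.3f}")
    for n, obs, res in zip(n_heavy, pic50, residuals):
        print(f"N_HA = {n:2d}  pIC50 = {obs:.1f}  residual = {res:+.2f}")
```

In practice you would probably reach for `numpy.polyfit` or `statistics.linear_regression` rather than writing the sums yourself; the point is simply that the residuals, not the raw pIC_{50} values, tell you which compounds are punching above their weight.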
I realise that using an intercept term (A) will cause a few eyebrows to become raised. Surely the line of fit should go through the origin? There is a problem with this line of thinking and it’s helpful now to talk instead in terms of affinity and ΔG° to develop the point a bit more. You might be thinking that in the limit of zero molecular size a compound should have zero free energy of binding. However, a zero free energy of binding corresponds to K_{d} being equal to the standard concentration, and you’ll remember that the choice of standard state is arbitrary. If you must derive insights from thermodynamic measurements then the very least that you can do is to ensure that any insights you derive are invariant with respect to the value of the standard concentration.
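To spell the invariance argument out, suppose we swap the standard concentration C° for a different one, written here as C°′. The primed symbols, and the coefficients a and b for a straight-line fit of ΔG° against N_{HA}, are my own notation for this sketch. Starting from equation 2:

```latex
\begin{align*}
\Delta G^{\circ\prime}
  &= RT\ln\!\left(\frac{K_d}{C^{\circ\prime}}\right)
   = RT\ln\!\left(\frac{K_d}{C^{\circ}}\right)
     + RT\ln\!\left(\frac{C^{\circ}}{C^{\circ\prime}}\right)
   = \Delta G^{\circ} + RT\ln\!\left(\frac{C^{\circ}}{C^{\circ\prime}}\right) \\
\Delta G^{\circ} = a + b\,N_{HA}
  \;&\Longrightarrow\;
  \Delta G^{\circ\prime}
   = \Bigl[a + RT\ln\!\left(\tfrac{C^{\circ}}{C^{\circ\prime}}\right)\Bigr] + b\,N_{HA}
\end{align*}
```

Every compound picks up the same constant shift, so differences between compounds, the slope and the residuals are all untouched, and only the intercept moves. Forcing the line through the origin amounts to insisting that one particular value of C° is special.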
When you use ligand efficiency (-ΔG°/N_{HA}) you’re effectively assuming that the value of ΔG° should be directly proportional to the number of heavy atoms in the ligand molecule. One consequence of defining ligand efficiency in this manner is that relative values of ligand efficiency for compounds with different numbers of heavy atoms will change if you change (as thermodynamics tells us that we are allowed to do) the standard concentration used to define ΔG°. I've droned on enough though and it's time to check out. I will, however, leave you with the question of whether it makes sense to try to correct ligand efficiency for the effects of molecular size.
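As a postscript for anyone who wants to see the standard-concentration dependence in numbers, here is a small sketch. The two compounds, their K_{d} values and the choice of 298.15 K are all hypothetical, and the function names are mine.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # gas constant in kJ/(mol K)
TEMPERATURE_K = 298.15           # assumed temperature


def delta_g_standard(kd_molar: float, c_std_molar: float) -> float:
    """Standard free energy of binding, RT ln(Kd / C_std), in kJ/mol."""
    return R_KJ_PER_MOL_K * TEMPERATURE_K * math.log(kd_molar / c_std_molar)


def ligand_efficiency(kd_molar: float, n_heavy: int, c_std_molar: float) -> float:
    """Ligand efficiency, -dG / N_HA, in kJ/mol per heavy atom."""
    return -delta_g_standard(kd_molar, c_std_molar) / n_heavy


# Hypothetical compounds: name -> (Kd in mol/L, number of heavy atoms).
compounds = {"fragment": (1e-4, 10), "lead": (1e-9, 30)}

if __name__ == "__main__":
    # Compare the conventional 1 M standard state with a 1 mM one.
    for c_std in (1.0, 1e-3):
        le = {name: ligand_efficiency(kd, n, c_std)
              for name, (kd, n) in compounds.items()}
        winner = max(le, key=le.get)
        print(f"C_std = {c_std:g} M: {le}  most efficient: {winner}")
```

With a 1 M standard state the fragment comes out as the more efficient compound, and with a 1 mM standard state the lead does. Nothing about the molecules has changed, only the bookkeeping.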