Molecular Design: AI-based drug design?

|| >> Next

I’ll start this post by stressing that I’m certainly not anti-AI. I actually believe that drug design tools that are being described as AI-based are potentially very useful in drug discovery. For example, I’d expect natural language processing capability to enable drug discovery scientists to access relevant information without actually having to create database queries. I actually have a long-standing interest in automated molecular structure editing (see KS2005) and see the ability to build chemical structures in an automated manner using Generative AI as a potentially useful addition to the drug designer’s arsenal. Physical chemistry is very important in drug design and there are likely benefits to be had from building physicochemical awareness into the AI tools (one approach would be to use atom-based measures of interaction potential and I’ll direct you to some relevant articles: A1989 | K1994 | LB2000 | H2004 | L2009 | K2009 | L2011 | K2016 | K2022)

All that said, the AI field does appear to be associated with a degree of hype and number of senior people in the drug discovery field seem to have voluntarily switched off their critical thinking skills (it might be a trifle harsh to invoke terms like “herding instinct” although doing so will give you a better idea of what I’m getting at). Trying to deal with the diverse hype of AI-based drug design in a single blog post is likely to send any blogger on a one-way trip to the funny farm so I’ll narrow the focus a bit. Specifically, I’ll be trying to understand the meaning of the term “AI-designed drug”.

The prompt for this post came from the publication of “Inside the nascent industry of AI-designed drugs” DOI in Nature Medicine and I don’t get the impression that the author of the article is too clued up on drug design:

Despite this challenge, the use of artificial intelligence (AI) and machine learning to understand drug targets better and synthesize chemical compounds to interact with them has not been easy to sell.

Apparently, AI is going to produce the drugs as well as design them:

“We expect this year to see some major advances in the number of molecules and approved drugs produced by generative AI methods that are moving forward”, Hopkins says.

I’d have enjoyed being a fly on the wall at this meeting although perhaps they should have been asking “why” rather than “how”:

“They said to me: Alex, these molecules look weird. Tell us how you did it”, Zhavaoronkov [sic] says. "We did something in chemistry that humans could not do.”

So what I think it means to claim that a drug has been “AI-designed” is that the chemical structure of the drug has been initially generated by a computer rather than a human (I’ll be very happy to be corrected on this point). Using computers to generate chemical structures is not exactly new and people were enumerating combinatorial libraries from synthetic building blocks over two decades ago (that’s not to deny that there has been considerable progress in the field of generating chemical structures). Merely conceiving a structure does not, however, constitute design and I’d question how accurate it would be to use the term “AI-designed” if structures generated by AI had been subsequently been evaluated using non-AI methods such as free energy perturbation.

One piece of advice that I routinely offer to anybody seeking to transform or revolutionize drug discovery is to make sure that you understand what a drug needs to do. First, the drug needs to interact to a significant extent with one or more therapeutic targets (while not interacting with anti-targets such as hERG and CYPs) and this is why molecular interactions (see B2010 | P2015 ) are of great interest in medicinal chemistry. Second, the drug needs to get to its target(s) at a sufficiently high concentration (the term exposure is commonly used in drug discovery) in order to have therapeutically useful effects on the target(s). This means that achieving controllability of exposure should be seen as a key objective of drug design. One of the challenges facing drug designers is that it’s not generally possible to measure intracellular concentration for drugs in vivo and I recommend that AI/ML leaders and visionaries take a look at the SR2019 study.

Given that this post is focused on how AI generates chemical structures, I thought it might be an idea to look at how human chemists currently decide which compounds are to be synthesized. Drug design is incremental which reflects the (current) impossibility of accurately predicting the effects that a drug will have on a human body directly from its molecular structure. Once a target has been selected, compounds are screened for having a desired effect on the target and the compounds identified in the screening phase are usually referred to as hits.

The screening phase is followed by the hit-to-lead phase and it can be helpful to draw an analogy between drug discovery and what is called football outside the USA. It’s not generally possible to design a drug from screening output alone and to attempt to do so would be the equivalent of taking a shot at goal from the centre spot. Just as the midfielders try move the ball closer to the opposition goal, the hit-to-lead team use the screening hits as starting points for design of higher affinity compounds. The main objective in the hit-to-lead phase to generate information that can be used for design and mapping structure-activity relationships for the more interesting hits is a common activity in hit-to-lead work.

The most attractive lead series are optimized in the lead optimization phase. In addition to designing compounds with increased affinity, the lead optimization team will generally need to address specific issues such as inadequate oral absorption, metabolic liability and off-target activity. Each compound synthesized during the course of a lead optimization campaign is almost invariably a structural analog of a compound that had already been synthesized. Lead optimization tends to be less ‘generic’ than lead identification because the optimization path is shaped by these specific issues which implies that ML modelling is likely to be less applicable to lead optimization than to lead identification.

This post is all about how medicinal chemists decide which compounds get synthesized and these decisions are not made in a vacuum. The decisions made by lead optimization chemists are constrained by the leads identified by the hit-to-lead team just as the decisions made by lead identification chemists are constrained by the screening output. While AI methods can easily generate chemical structures, it's currently far from clear that AI methods can eliminate the need for humans to make decisions as to which compounds actually get synthesized.

This is a good point at which to wrap up. One error commonly made by people with an AI/ML focus is to consider drug design purely as an exercise in prediction while, in reality, drug design should be seen more in a Design of Experiments framework.

Molecular Design

Tuesday, 18 July 2023

AI-based drug design?

1 comment: