AI is learning the strategies chemists use to build new molecules


Something happens when an experienced chemist looks at a synthesis route someone else designed. Within seconds, they’ll often say it won’t work – not because they ran a calculation, but because the order of decisions just looks wrong.

Getting software to replicate that judgment has been one of chemistry’s harder problems. Algorithms can generate routes by the hundreds.

Evaluating them the way an expert would is something else – and a new system from EPFL suggests that gap can be narrowed.

AI and molecule synthesis

Most synthesis starts at the end. Chemists begin with the target molecule and work backward – a method called retrosynthesis. Break the molecule into smaller pieces. Ask what simpler ingredients could rebuild it from scratch.

Each disconnection sets up more decisions. Form this ring early or later? Protect that fragile group, or risk it? Use a quick reaction with a low yield, or a slower one that scales better?

Existing software can list options at every step. What it has not been able to do is reliably tell which combination of moves a thoughtful chemist would actually pursue.

That gap is what a team led by Philippe Schwaller, a chemist at the École Polytechnique Fédérale de Lausanne (EPFL), set out to close.

A different role

Schwaller’s group built a framework called Synthegy. It uses large language models, but not in the way most people picture. The models do not invent the chemistry themselves.

Standard retrosynthesis programs still generate the candidate routes, drawing on enormous reaction databases and decades of research into how chemicals combine. Synthegy steps in afterward.

A language model reads each candidate route, written out in plain text, and judges how well it lines up with whatever the chemist typed at the start. The chemist might say: avoid protecting groups. Or form the cyclohexane ring early.

The model scores each route against that prompt, ranks the options, and explains its reasoning.
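The score-rank-explain loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Synthegy's actual code: the names rank_routes, ScoredRoute and toy_judge are hypothetical, the 0 to 10 scale is an arbitrary choice, and the language-model call is replaced by a crude keyword counter so the example runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class ScoredRoute:
    route: str       # plain-text description of one candidate route
    score: float     # hypothetical 0-10 scale, higher = closer to the instruction
    rationale: str   # short written explanation from the judge


# A judge maps (instruction, route_text) to (score, rationale).
# In a real system this would wrap a language-model call.
Judge = Callable[[str, str], Tuple[float, str]]


def rank_routes(instruction: str, routes: List[str], judge: Judge) -> List[ScoredRoute]:
    """Score every candidate against the instruction and sort best-first."""
    scored = [ScoredRoute(r, *judge(instruction, r)) for r in routes]
    return sorted(scored, key=lambda s: s.score, reverse=True)


def toy_judge(instruction: str, route: str) -> Tuple[float, str]:
    """Stand-in for the model: counts protection/deprotection mentions.

    It ignores the instruction entirely, which a real judge would not do.
    """
    n = route.lower().count("protect")
    return max(0.0, 10.0 - 3.0 * n), f"{n} protection/deprotection step(s) found"


candidates = [
    "Step 1: Boc protection. Step 2: amide coupling. Step 3: deprotection.",
    "Step 1: amide coupling. Step 2: reductive amination.",
]
ranked = rank_routes("avoid protecting groups", candidates, toy_judge)
for item in ranked:
    print(f"{item.score:4.1f}  {item.rationale}  |  {item.route}")
```

Keeping the judge as a plain function argument mirrors the division of labor the article describes: the route generator and the ranking logic stay fixed, and only the component that reads the chemist's instruction changes.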

Plain English routes

Changing the interface sounds like a small thing. It isn’t. Older tools relied on rigid filters – forbidden reactions, hard-coded preferences, numerical thresholds – and a chemist who wanted the program to think differently had to re-teach it through new code.

With Synthegy, the same chemist types a sentence. The route-generating software still produces its long list, but what the chemist gets back is a short ranking of routes that match the instruction, each with a written explanation.

Iteration shrinks from hours to minutes. Grading 60 candidate routes takes about 12 minutes and costs roughly $2 to $3 in computing fees, the team reports.
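Broken down per route, those reported figures work out as follows. The snippet below only divides the numbers quoted above; it assumes time and cost scale evenly across routes, which the article does not state.

```python
# Figures as reported: 60 routes, about 12 minutes, roughly $2 to $3.
routes = 60
minutes = 12
cost_low, cost_high = 2.0, 3.0

# Assumption: every route takes an equal share of the time and cost.
seconds_per_route = minutes * 60 / routes
cost_per_route_low = cost_low / routes
cost_per_route_high = cost_high / routes

print(f"{seconds_per_route:.0f} s per route")
print(f"${cost_per_route_low:.3f} to ${cost_per_route_high:.3f} per route")
```

That is about 12 seconds and three to five cents per route graded.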

“With Synthegy, we’re giving chemists the power to just talk, allowing them to iterate much faster and navigate more complex synthetic ideas,” said Andres M. Bran, the paper’s first author.

Tracking electron moves

The same method applies at a smaller scale. Reaction mechanisms show the step-by-step details of how a reaction unfolds – electrons moving between atoms, bonds breaking and reforming.

Mechanisms are how chemists explain why a reaction works, not just what comes out of it. Synthegy can step through those electron moves and flag which sequences make chemical sense.

Chemists can feed in extra context, too. The temperature. A hunch about a specific pathway. The model uses that context to narrow the search.

Tested by chemists

Not content with internal benchmarks, the team ran a double-blind study. Thirty-six chemists were shown pairs of synthesis routes for the same target molecule and asked which one better matched a written prompt.

Their picks were compared with Synthegy’s picks across 368 valid evaluations. The system agreed with the chemists 71.2 percent of the time. Not perfect. But high enough, the authors argue, to suggest the model is tracking real strategic reasoning, not surface features of the text.
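To get a feel for how far 71.2 percent sits from a coin flip, here is a rough back-of-the-envelope check. It is not the authors' analysis. The count of agreements is back-calculated from the reported percentage rather than taken from the paper, and the Wilson interval treats all 368 evaluations as independent, which is a simplification given that they came from 36 chemists.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


n_evals = 368                          # valid evaluations, as reported
agreements = round(0.712 * n_evals)    # inferred from the reported 71.2 percent
rate = agreements / n_evals
low, high = wilson_interval(agreements, n_evals)

print(f"agreement rate: {rate:.1%}")
print(f"approx. 95% interval: {low:.1%} to {high:.1%}")
```

Under those assumptions the whole interval stays well above 50 percent, which is consistent with the claim that the agreement is more than chance.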

Until this study, no one had shown that a language model could grade hundreds of multi-step synthesis routes against a chemist’s stated strategy and agree with human experts more often than not.

Where it stumbles

Synthegy has clear limits. Smaller language models performed close to random when judging routes, so the system depends on the largest, most expensive models to do useful work.

Sometimes the models misread which direction a reaction runs, which produces wrong feasibility calls. The system also struggles to follow routes longer than 20 steps coherently.

A narrower test

The 71.2 percent agreement figure comes from a specific slice of the problem. Route pairs in the human evaluation were constrained to between 6 and 15 steps.

All were judged against one type of strategic prompt. Whether the match rate holds for routes outside that length range or for a broader range of instructions remains untested.

What this opens

The study shows that a system scoring synthesis routes against plain-language instructions can match expert chemists’ choices about 71 percent of the time – suggesting the strategic reasoning that takes years to develop can be captured, at least in part, by the right prompt.

For drug discovery labs, that lowers the cost of exploring more aggressive strategies on a tight timeline. For graduate students, it puts a senior chemist’s instinct one prompt away.

The technique could even plug into automated synthesis robots, letting machines take on routes already screened for strategic sense before any glassware is touched.

A tool that grasps strategy can also grasp critique. Synthegy already flags unnecessary protecting steps and judges feasibility – the kind of feedback that once required a senior chemist looking over your shoulder.

The study is published in Matter.
