Down in the Synthetic Details

Here’s a concentrated shot of process chemistry for you, with a bit of a modern kick to it (if you’re not up for doing those chemistry shots, just skip down to the last paragraph!) It’s about synthetic routes to a Bruton’s tyrosine kinase drug candidate, branebrutinib (at right), a covalent inhibitor that’s now in clinical trials. At first glance, you have a substituted indole, connected to an aminopiperidine, which is acylated with the reactive alkyne-amide group. There are a lot of ways to approach this one, and the initial synthesis from the drug discovery group is pretty much what I (and many other chemists) would have tried: a Fischer indole synthesis to give you (after some manipulation) a 4-bromo-5-fluoro-7-cyanoindole, which is palladium-coupled with a Boc-protected aminopiperidine; that piperidine is then deprotected and coupled with the reactive butynoic acid, and along the way you hydrolyze the 7-nitrile to the amide. It’s a nine-step route with an overall 7% yield – seven steps if you count from the hydrazine intermediate in the Fischer indole part of the synthesis, an intermediate that was available by the ton from a previous in-house kinase inhibitor program.

Although the synthesis got the job done, it was nine steps in a straight line, and the timelines for the next delivery of the drug to the clinic were rather tight compared to how long that synthesis would take. The team looked at shortening the route by doing the Pd coupling on an indole where the amide was already present, instead of having to carry it as a nitrile and hydrolyze it back to the amide at the end. The discovery chemists had tried this, obviously, but the reaction failed under a number of conditions, so they moved on. But the general rule is that just about any metal-catalyzed coupling can be made to work if you’re willing to throw enough effort at it, and “throw enough effort”, these days, really should mean some sort of high-throughput experimentation (HTE) rather than poking at the reaction one guess at a time. Several rounds of ten-micromole-scale reaction arrays were set up, and effective conditions were found that changed the solvent, the palladium catalyst, the base, and the additives, taking the amide-indole coupling step from producing nothing but dark gunk to giving a reproducible 60% yield on multigram scale. The new route was now seven steps with an overall 30% yield, which I’m sure made everyone much happier.
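To put those overall yields in perspective, here’s a quick back-of-the-envelope sketch (my own illustration, in Python, using the numbers from the text): in a linear route, the average yield that each step has to deliver is the geometric mean of the overall yield.

```python
def per_step_yield(overall: float, steps: int) -> float:
    """Average per-step yield needed in a linear route (geometric mean)."""
    return overall ** (1.0 / steps)

# Step counts and overall yields from the text above:
original = per_step_yield(0.07, 9)   # nine-step discovery route, 7% overall
improved = per_step_yield(0.30, 7)   # seven-step reworked route, 30% overall

print(f"original route: ~{original:.0%} per step")
print(f"improved route: ~{improved:.0%} per step")
```

In other words, cutting two steps and pushing the average step from the mid-70s to the mid-80s (percent) is what turns a 7% overall yield into a 30% one.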

But the process group wasn’t done. Pushing the piperidine-coupling part of the synthesis back to before the indole was formed could have several advantages, not least replacing a palladium coupling with what could be a simple SNAr displacement. (Purification of metal-catalyzed couplings on scale is really one of their weak points – there can be a lot of side products produced in small amounts, and the palladium itself has to be thoroughly removed from anything that’s going to be a drug). Another round of HTE was applied to the reaction of the Boc-aminopiperidine intermediate with a difluoroanthranilic acid, with the 4-fluoro being the one displaced by the piperidine. It would be even neater (in theory) if this could have been done with the whole alkyne-amide piperidine, but that functional group was just too reactive to carry through the synthesis in place, so installing it was fixed as the final step no matter what. An array of ten different bases in six different solvents showed that the base hardly made any difference to the reaction (it comes into the picture after the rate-limiting addition step, so most anything will do). But the solvent was a big variable: diglyme and anisole were mostly awful, and DMSO was excellent. So DMSO and potassium bicarbonate won that round.

Since this was done on an anthranilic acid derivative, the product now has a plain aniline amine where the Fischer indole synthesis calls for a hydrazine. There are ways to make indoles straight from anilines, but the lack of a classic name reaction for that transformation is a tip-off that there are no obvious try-this-first conditions for it. The team came up with whiteboard routes that could go via a reductive pathway (using perhaps the buttery-smelling compound diacetyl as the two-carbon source), an oxidative one (with a wide range of coupling-reagent possibilities), or a neutral one (perhaps using some sort of 2-halobutanone). They opted for the neutral route – the reductive one had too few options, and the oxidative one had the potential to lead to trouble when it came to the choice of oxidants on scale. Time for more HTE! 288 reactions were set up in three 96-well plates with six different two-carbon electrophile synthons, four different solvents, and twelve Lewis acid choices, and what really stood out from that was the combination of the acyloin acetoin (3-hydroxy-2-butanone) and tin triflate. Another 96-well plate was set up with an additional array of oxophilic Lewis acids in eight different solvents, and what that showed was that zirconium chloride was a good choice, and acetic acid, interestingly, was a good solvent. It wasn’t just the protic acid doing the trick, apparently – triflic acid was a loser in acetic acid and in every other solvent.
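That 288-reaction count, by the way, is just the full cross of the three variables. Here’s a minimal sketch of how such an array enumerates (the names are placeholders of mine, not the paper’s actual reagents):

```python
from itertools import product

# Placeholder labels standing in for the actual screening set:
electrophiles = [f"synthon_{i}" for i in range(1, 7)]    # six 2-carbon synthons
solvents = [f"solvent_{i}" for i in range(1, 5)]         # four solvents
lewis_acids = [f"lewis_acid_{i}" for i in range(1, 13)]  # twelve Lewis acids

# Full factorial: every combination of the three variables
conditions = list(product(electrophiles, solvents, lewis_acids))

plates_needed = -(-len(conditions) // 96)  # ceiling division into 96-well plates
print(len(conditions))   # 288 combinations
print(plates_needed)     # 3 plates
```

A full factorial like this is the brute-force end of HTE design – you pay for it in well count, but you never miss an interaction between variables the way a one-factor-at-a-time search can.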

Well, I like to paraphrase Tolstoy when it comes to acid catalysts, and say that protic acids are (mostly) all alike, while every Lewis acid is a Lewis acid in its own way. I (and many other chemists) have done just this sort of experimentation, setting up a whole row of small reactions and grabbing all the weirdo metal salts off the shelf to see what happens. Doing it in 96-well plates with mechanical dispensing help and automated LC/MS to check the reactions is definitely the way to go, though! But where my protic/Lewis scheme breaks down is solubility: protic acids are mostly all alike as long as everything stays in solution, and a further HTE screen of those showed that the problem with triflic acid, HCl, and others was that they were forming insoluble complexes. Diphenylphosphoric acid (not anyone’s first thing to reach for, I can assure you) turned out to be strong enough to push the reaction, while keeping everything in solution. The yields were higher than the Lewis acid ones, and there was no need to clean out residual zirconium, cerium, scandium or what have you.

A final round of HTE was conducted to optimize the key butynoic amide coupling in the last step. The tricky part was that things could start adding in 1,4 fashion to the reactive alkyne – which is what you want the Cys residue on the kinase target to do eventually, but you don’t want to see it before that! An array of 24 coupling reagents was run with two different bases in four different solvents, and they sure saw a lot of undesired addition. To make matters worse, the reagents that performed the best (things like PyBroP and TPTU) were not suitable for large-scale synthesis due to cost, sourcing, and waste-stream problems. A second HTE round focused on the four remaining reagents that had managed to give the desired product without heaps of 1,4-addition side products, investigating those with sixteen bases in six solvents. This 384-reaction screen identified diphenylphosphinic chloride with N-methylmorpholine in DMF as the best of the bunch.

So at right you see the final synthesis, after all this investigation. Four steps, about 38% yield on a kilo scale, easier purifications, and doable in a far shorter time than the original route as well. And all it took was hundreds and hundreds of reactions, set up with conditions chosen by the expertise of a large team of experienced synthetic chemists! So let’s think about what this tells us, because this is not at all a weird or strange process chemistry example – it’s just one that was done very efficiently through aggressive use of high-throughput experimentation.
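One way to see the practical payoff (my own illustration, using the step counts and yields from the text, and deliberately ignoring molecular-weight changes along the route): the material you have to push in per unit of product scales as the inverse of the overall yield.

```python
# Routes as (steps, overall yield); numbers from the text above.
routes = {
    "discovery route": (9, 0.07),
    "reworked coupling route": (7, 0.30),
    "final route": (4, 0.38),
}

for name, (steps, overall) in routes.items():
    # Crude "input per unit of product" = 1 / overall yield
    # (mole-for-mole, ignoring molecular weight - illustrative only).
    print(f"{name}: {steps} steps, {overall:.0%} overall, "
          f"~{1 / overall:.1f}x input material per unit of product")
```

Roughly a 14-fold material burden for the discovery route against well under 3-fold for the final one – and that’s before you count the purification headaches the shorter route also removes.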

First off, I know that there will be some readers who will hark back to my most recent “synthesis machine” post and ask what in the world I can be thinking about, because that’s not how this molecule was produced, for sure. But the automated synthesis that I see coming is more like the discovery chemistry route: it’s something that will make product for you, so you can then run experiments to see what that product does. There is no guarantee that a synthesis machine will provide you with the best route to a given compound – in fact, it’s almost guaranteed not to – but what it will provide is product, with minimal effort on your part, for you to use to answer higher-level questions. A hypothetical automated synthesis machine could definitely have been used to crank out analogs around these structures for enzyme assays, for example.

But once you narrow down to the desired drug candidate, you have to do work like you see in this paper. There’s more food for thought here about automation and software: it would be interesting to see what the various retrosynthesis programs would propose for this molecule. You can bet that they would not have given you the route at right (although it would not be surprising if they gave you one like the original route). These programs, as they stand now, are also there to show you a route that has a reasonable chance of working, not to provide you with the best synthesis possible. A strength of the software, though, is that as new reactions and routes are published, they get incorporated. This very paper under discussion, for example, will be fed into the hoppers of the various retrosynthesis programs now that it’s out in the literature, and this particular indole synthesis, to pick one step, will now be an option that they can suggest. I personally would set the filters on such software to strongly favor routes that appear in papers like this one and others from the OPR&D journal, because (as you can see) these tend to have a lot of care and attention given to them.

What would it take to have software that showed you the best synthesis for a given compound, though? Now that’s a dream that even I think is out of reach for us, at least for as far out into the future as I can imagine. And this paper illustrates why! Look at all the tiny variations that end up making a difference, and sometimes a big difference. If we could model or compute our way to the answers in such situations, believe me, we would do that rather than set up endless arrays of reactions just to see what happens. Everyone who’s done research-level synthetic organic chemistry has experienced this: you flip one chiral center in your molecule, or make a chain one carbon longer, or change the solvent from one ether to another, raise or lower the temperature a bit, switch a sodium salt for a potassium one, change a ligand on your palladium catalyst, whatever, and all sorts of craziness breaks loose. And it’s often not easy to see why things changed so much. Ex post facto you can sometimes come up with hypotheses, and use those to fix things up if you’re right. But there are plenty of throw-your-hands-up moments that just never get explained at all.

Organic chemistry wobbles and teeters across an energy landscape that (from the viewpoint of any given reaction) is full of huge hills, deep valleys, and twisty little pathways that are followed by walking along the edges of steep, crumbling ridges. But from a distance, all that topography is compressed into a pretty narrow thermodynamic range. The differences between a reaction working and not working – between it giving you mostly Product A, mostly Product B, mostly returned starting material, or mostly scorched pan drippings – are energetically very small. All sorts of little changes can send things off in different directions, and these can be largely inside the error bars of our attempts to model them. Organic chemistry is indeed a mature science, but don’t confuse that with thinking that it’s a solved problem. If you just want the molecules, damn the cost, to answer other questions, we can generally provide them. But if you want them made elegantly, you’ll need to take a seat – and you’d better have packed some lunch.