Crystal Fragments

I started doing a good deal of fragment-based drug discovery ten or fifteen years ago, and I still have a lot of respect for the technique. For those outside the business, the idea behind FBDD is to not start off with a big screen of drug-sized molecules that you might have in your general screen collection or various focused libraries, but rather to do a smaller screen (generally in the low-thousands-of-compounds range at most) that only uses smaller structures with a limited number of features in each one. If you select these carefully, you can end up covering a rather large amount of chemical space a bit at a time. These fragment-sized compounds are of course going to bind more weakly than a strong hit that has more structural features, but the concept of “ligand efficiency” is what you use to judge these.


To illustrate: a compound that weighs (say) 425 daltons and binds at (say) 30 micromolar affinity is probably not very interesting at all. Your screening collection has a huge number of compounds in that molecular weight range (which is pretty typical for many drugs), and you probably have hundreds of them that have double-digit micromolar binding numbers. But what do you do with them? That binding affinity probably needs to be increased about a thousand-fold or more to be a reasonable drug candidate, and if you get there by adding a bunch of new features to a compound that starts off weighing 425, you’re likely to have a bulkier and much less desirable compound by the time you finish, affinity or not. But if you can start out with something that weighs around 150 daltons but still manages 30 micromolar binding, well, that’s actually a pretty solid number for something that small. And if you build out from its smaller structure seeking more binding interactions, you have a better chance of picking up that increased affinity while still staying within a reasonable size and complexity. The plan is to build out while keeping a close eye on what you’re getting for your added groups, always aiming for the most bang from your hard-earned molecular weight bucks.
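
To put rough numbers on that comparison, here is a minimal sketch of the usual ligand efficiency calculation (binding free energy divided by heavy-atom count). It treats the 30 micromolar figure as a dissociation constant at room temperature, and the heavy-atom counts (31 for the 425-dalton compound, 11 for the 150-dalton fragment) are my own assumptions for illustration, since real counts depend on the actual structures.

```python
import math

R_KCAL = 1.987204e-3  # gas constant, kcal/(mol*K)


def binding_free_energy(kd_molar, temp_k=298.15):
    """Return delta-G = RT * ln(Kd) in kcal/mol (negative for Kd below 1 M)."""
    return R_KCAL * temp_k * math.log(kd_molar)


def ligand_efficiency(kd_molar, heavy_atoms, temp_k=298.15):
    """Return binding free energy per heavy atom, as a positive number."""
    return -binding_free_energy(kd_molar, temp_k) / heavy_atoms


# Assumed heavy-atom counts (hypothetical, chosen to match the molecular weights).
big_hit = ligand_efficiency(30e-6, heavy_atoms=31)   # the ~425 Da compound
fragment = ligand_efficiency(30e-6, heavy_atoms=11)  # the ~150 Da fragment

print(f"425 Da hit:      LE = {big_hit:.2f} kcal/mol per heavy atom")
print(f"150 Da fragment: LE = {fragment:.2f} kcal/mol per heavy atom")
```

Under those assumptions the larger compound comes out around 0.2 and the fragment above 0.5, which is the whole argument in one line: same affinity, but the fragment is getting far more binding out of each atom.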


That’s a big change from the old days, defined here as “back when I started working in this industry”. Back then, high-micromolar compounds were generally considered carpet sweepings, lint from your pocket that was almost certainly not worth working on, no matter what its molecular weight. The concept of “Hey, that’s actually really good binding for a compound that small; we should build something starting from that” hadn’t caught on.


It certainly has now, though! FBDD is pervasive throughout the industry, and there’s been a lot of technical development along the way. One of the key areas is how to screen weak binders like this. There are all sorts of techniques that can tell you about a single-digit nanomolar binder – in fact, if your assay can’t pick up one of those when it shows up, you’re in big trouble. But a fifty or hundred micromolar compound, well. . .getting reliable signals from those may be a challenge. For some years now the preferred approach has been to start off with one of the well-established biophysical assay techniques (NMR, SPR, thermal shift, what have you) and run the hits you get from that through the others to see what sort of overlap you get. Ideally, you’ll run your whole fragment collection against your drug target through more than one of those assays. In the end, there will always be compounds that look like good hits in one of them that don’t show up in the others, and the rule of thumb is that you will be much happier working on compounds that hit by two or more orthogonal techniques.
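
The "two or more orthogonal techniques" rule of thumb is easy to sketch as a bookkeeping step. The compound IDs and hit lists below are entirely made up, and a real triage would weigh signal quality as well as a yes/no call, so take this as an illustration of the counting only.

```python
from collections import Counter


def confirmed_hits(hits_by_assay, min_assays=2):
    """Return the compounds that were called as hits in at least min_assays assays."""
    counts = Counter()
    for hits in hits_by_assay.values():
        counts.update(set(hits))  # count each compound once per assay
    return {cpd for cpd, n in counts.items() if n >= min_assays}


# Hypothetical hit lists from three biophysical screens of one fragment collection.
hits_by_assay = {
    "NMR": {"F001", "F007", "F012", "F030"},
    "SPR": {"F007", "F012", "F019"},
    "thermal_shift": {"F012", "F030", "F044"},
}

confirmed = confirmed_hits(hits_by_assay)
print(sorted(confirmed))
```

Raising `min_assays` to 3 tightens the filter to compounds seen by every technique, which in practice tends to be a much shorter list.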


Now, I left one classic method off that list: X-ray crystallography. In the early days of fragment screening, X-ray was often used as a primary assay. You’d soak your fragment collection (compound by compound or as mixtures) into crystals of your protein target and look for electron density. That’s something of an art form, both in the soaking techniques and in the interpretation of the data, but you can often get surprisingly clear looks at your fragments nestled into various crannies of your protein target. These experiments led to a lot of realizations that hadn’t been illustrated so vividly until then. There were small fragments that would bind to several spots on a given protein simultaneously, for one thing, or (similarly) proteins that would pick up different fragments binding to different places (the canonical active binding site, for sure, but others as well). And there were binding sites that would pick up two or three separate molecules of a given fragment simultaneously, given enough room. When you started looking at closely related structures to your fragment hits, you would often find that some of them were indeed binding to the same pocket as the original versions, but had unexpectedly flipped around as if they had decided to back into the parking spot instead. You could never assume anything; it was much better to go out and get the data (it always is, frankly).


In general, it has always been far easier to run a fragment optimization program if you have repeated looks at the structure in this way – it tells you where to build off of your current molecule to pick up new interactions, and alerts you to weirdo unexpected changes in the binding conformations as you do so. As the years went on, X-ray generally wasn’t the primary assay, because the other methods tended to be less labor-intensive with higher throughput. But you definitely tried to get X-ray structures of your best hits and the compounds that you made from them.


This new paper, though, is from a German company called “Crystals First” and their co-workers, and they mean what they say, using a hybrid of experimental fragment work and computational techniques. They started off with X-ray data from fragment screens against protein kinase A, and picked the most solid data sets. Looking over the structure of the active site, these all bound in the same spot, with slightly different conformations of four amino acid side chains to accommodate the different ligands. Four chemically distinct fragments were chosen along with a model of the binding site (and see below – these were chosen just for chemical diversity, and not on the basis of affinity at all). This set was used for a computational docking screen of about 200,000 fragment-sized compounds from the Enamine REAL library, each of which had one of the four “seed” fragments as a substructure. That gave about 15,000 new related fragments in about 75,000 poses. These were then enumerated through the sorts of one-step reactions that the Enamine intermediates are set up for, giving about 500,000 molecules, which were then docked to produce about two million poses.


Scoring and filtering (such as getting rid of molecules that clashed into the protein surface in the models) took the count back down to the 500,000 range, and these were reduced to about three thousand clusters. At this point, the paper says that these were “visually inspected” to take the list down to 106 compounds that were selected for synthesis (that step does not sound like a lot of fun, to be honest). These had rather low similarity (Tanimoto) to known PKA inhibitors. 93 of the compounds had a successful synthesis, and 75 of them were soluble enough to be assayed. 40% of these were active in a PKA phosphorylation assay, which is a nice hit rate, and the best of them was about 700 nM. 13 of the most active were taken into either crystal soaking or cocrystallization experiments for X-ray. Six of these gave good structures, which generally matched up well with the predicted binding modes.
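
For anyone who hasn't run into it, the Tanimoto number is just the overlap between two structural fingerprints divided by their union. Here is a minimal sketch on sets of "on" bits; the bit positions are invented for illustration, and I am not claiming this is the fingerprint type or software the paper used (real work would generate fingerprints with a cheminformatics toolkit).

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient: shared on-bits divided by total distinct on-bits."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(union)


# Hypothetical fingerprints, given as the positions of the bits that are set.
candidate = {3, 17, 42, 88, 101, 256}
known_inhibitor = {3, 42, 77, 130, 256, 300, 411}

sim = tanimoto(candidate, known_inhibitor)
print(f"Tanimoto similarity: {sim:.2f}")
```

Identical fingerprints score 1.0 and completely disjoint ones score 0, so "rather low similarity" means the new compounds share only a modest fraction of their structural features with the known PKA inhibitors.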


Even by this point, the compounds were still just on the high end of fragment space in molecular weight (since they were basically the product of two small Enamine compounds reacting), and could potentially be taken through yet another round of similar screening. It’s interesting to note that the four fragments that started off this whole process, when checked experimentally, showed only millimolar activity in the PKA assay and no real effect in thermal shift assays at all. In other words, these were compounds that under conventional conditions would never have been selected as fragment starting points in the first place. This computationally intensive technique, at least in this case, really does seem to be pulling useful starting points out of chemical matter that no one would have looked at twice, and it makes you wonder about the standard fragment screening cascades and what’s being missed.


The authors believe that the combination of starting with experimentally verified fragments (the initial X-ray structures) and the computational enumeration of large (yet focused and synthesizable) libraries is potentially a higher-success strategy than traditional fragment screening, and it appears to be faster as well (this whole campaign is said to have taken nine weeks). I don’t see the speed as a clinching argument myself, but the production of active chemical matter from fragments that (while real) were basically too weak to measure for affinity is very interesting. 


And no one would have done anything like this in those old days I was talking about, that’s for sure. For one, there are the computational needs, which of course would have been completely out of the question. But as mentioned, no one really did fragment-based screening back then, either, and they absolutely wouldn’t have tried it with compounds that were such weak binders to start with, X-ray or no X-ray. And there’s another factor: the existence of the Enamine REAL library, which provides opportunities both for computation and for high-success synthesis. Such libraries are the apotheosis of the “analog by catalog” approach, and that wasn’t such a big thing back then, either. The Maybridge catalog was probably the closest thing, and they were the only company that was really pointing itself in that direction. It’s a different world!