Virtual Screening Versus the Numbers

Virtual screening is what a lot of people think of when they think of using computational modeling to do drug discovery. That’s the idea, existing in increasingly complex form over the last few decades, of coming up with lists of potential compounds (which may or may not exist physically) and calculating how they might interact with a target’s binding site. The ultimate goal – which is not yet available in any stores – is to make classic screening assays more or less obsolete, because you’d be able to model the exact same interactions and get the same answers. That would also allow you (again, in theory) to screen far larger compound libraries than actually exist in vials, including lots of compounds that seem like they’d be make-able but which no one has actually gotten around to making yet.

There are several factors that keep us from doing all our screening this way (he said calmly, eyebrows twitching). One is that the business end of any such effort, the ability to “dock and score” the interactions between a small molecule and a potential binding site, is not quite where it needs to be. In some cases you get pretty reliable answers, and in some cases you don’t, and it’s generally not obvious which of those domains you might be working in today. Our abilities to calculate the energies involved are a lot better than they used to be, but there’s quite a variety of processes at work when a small molecule contacts a protein surface (hydrogen bonds, pi interactions, van der Waals interactions, salt bridges, and many more). And remember, we’re not just looking at enthalpic changes, but entropic ones as well. The flexibility and amount of “disorder” in both the small molecule and the protein before and after a binding event have to be taken into account. On top of all this are the water molecules that are solvating the small molecule beforehand and the waters that are sitting in defined places in the binding site itself, some of which may well be kicked out into the solution when binding occurs. Those all make enthalpic and entropic contributions to the overall calculation, and none of them can be ignored. The free-energy difference between an interesting lead candidate’s binding and something of no use at all really isn’t that large, and can be within the error bars of your calculation, unfortunately.
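
To put a number on how tight those margins are (this is my own back-of-the-envelope arithmetic, not anything from the paper), here’s what a factor of ten in binding affinity is actually worth in free-energy terms:

```python
import math

# Back-of-the-envelope only: how much binding free energy separates a so-so
# hit from a genuinely interesting lead, versus typical scoring errors.
R = 1.987e-3   # gas constant, kcal/(mol*K)
T = 298.15     # room temperature, K

def delta_g(ki_molar):
    """Binding free energy from the dissociation constant: dG = RT * ln(Ki)."""
    return R * T * math.log(ki_molar)

for ki in (1e-5, 1e-6, 1e-7, 1e-8):   # 10 micromolar down to 10 nanomolar
    print(f"Ki = {ki:.0e} M  ->  dG = {delta_g(ki):6.2f} kcal/mol")

# Every factor of ten in affinity is worth only RT*ln(10), about 1.4 kcal/mol,
# which is uncomfortably close to the error bars on many docking scores.
print(f"per 10x in affinity: {R * T * math.log(10):.2f} kcal/mol")
```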

A second big factor is, well, bigness. As mentioned, you’d like to be able to run these virtual screens on a huge number of molecules (since you don’t have to buy, make, store, and dispense all the damn things when it’s all happening computationally). Big Pharma libraries of individual real compounds in vials can go up into the low-millions range. DNA-encoded library technology can quickly send you into the hundreds of millions or low billions – it’s a different world and a different sort of screening, but you really are covering a lot of ground. Until you look at the possible space you could be covering, that is: there have been many estimates of the number of compounds in “druglike space” over the years, but you could toss all the DNA-encoded libraries in the world into even the more conservative ones (by which I mean merely on the order of ten to the twentieth compounds) and never see them again. In case you’re wondering, the less conservative estimates go up into the ten-to-the-sixtieth range, which is totally beyond the ability of the human brain to picture.
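
Just to make that scale concrete (these figures are my own illustrative round numbers, not anyone’s official counts), the arithmetic goes something like this:

```python
# Illustrative scale check, using my own round numbers rather than anyone's
# official counts: how do real collections stack up against even the
# conservative 1e20 estimate of druglike chemical space?
pharma_deck    = 2e6    # "low millions" of physical compounds in vials
dna_encoded    = 1e9    # DNA-encoded libraries, order of a billion members
druglike_space = 1e20   # the conservative end of the estimates

for name, n in (("pharma deck", pharma_deck), ("DEL", dna_encoded)):
    print(f"{name}: covers {n / druglike_space:.1e} of conservative druglike space")

# Even a wildly optimistic brute-force virtual screen barely dents it:
docks_per_second = 1e6                       # assume a million scored poses per second
years_needed = druglike_space / docks_per_second / 3.15e7   # ~3.15e7 seconds per year
print(f"Docking all of 1e20 compounds at that rate: ~{years_needed:.1e} years")
```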

This problem is made worse (oh joy) by the fact that all these druglike compounds have three-dimensional shapes, many with a complement of rotatable bonds that give them various amounts of conformational flexibility. That means that for every compound you want to screen, you might actually want to screen a few dozen or a few hundred potential conformers out of the godzillion or so possible ones, and you will be narrowing down to that set by performing yet more computations and energy estimates on each individual compound as you move its bonds around. Those have error bars in them too, naturally. So you have processor time piled into processor time piled onto processor time, and you can very quickly find yourself in a situation where there hasn’t been enough of it since the formation of the solar system (or maybe since the Big Bang) to run your desired virtual screen.
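
If you want a cartoon of how the conformer problem piles on (a deliberately crude model, assuming a few low-energy torsional states per rotatable bond), it looks like this:

```python
# A deliberately crude cartoon: assume each rotatable bond has about three
# low-energy torsional states, so the nominal conformer count is 3**n before
# any energy-based pruning gets done.
def nominal_conformers(n_rotatable_bonds, states_per_bond=3):
    return states_per_bond ** n_rotatable_bonds

for n in (3, 5, 8, 10):
    print(f"{n:2d} rotatable bonds -> {nominal_conformers(n):>6,} nominal conformers")

# You would prune that down to a few dozen or a few hundred representatives per
# molecule, but the pruning is itself another pile of energy calculations.
```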

Unless you find a way to narrow the problem, that is. So that’s another art involved in this field: wrestling the problem down to a size that can be dealt with, but doing so in a way that doesn’t throw away too many of those great hits that you know must be lurking out there somewhere. Even though you don’t know where they are or what they might look like. This new paper is probably the largest-scale effort to date in this regard, and it’s a state-of-the-art look at ways to chop an inhuman, incomprehensible problem down to a merely Godawful one. It’s based around the Enamine REAL library and its extensions, which is a set of compounds made from a (rather large) collection of simply functionalized building blocks. If you mix-and-match all the combinations, you get billions of potential compounds with a high chance that they’ll actually be synthesize-able in a relatively short time via two- and three-step routes. But how to dig through that many possible compounds to pick the ones that you really want?
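
The combinatorial arithmetic is the whole point here. With made-up building-block counts (not Enamine’s actual ones), you can see how quickly reaction-based enumeration runs past any physical collection:

```python
# Made-up building-block counts (not Enamine's actual numbers), just to show
# how fast reaction-based enumeration outruns any physical compound collection.
acids  = 50_000    # hypothetical building blocks for position A
amines = 100_000   # hypothetical building blocks for position B
third  = 20_000    # hypothetical building blocks for a third position

two_component   = acids * amines           # e.g. a single amide coupling
three_component = acids * amines * third   # add one more diversity position

print(f"two-component products:   {two_component:.1e}")    # ~5e9 compounds
print(f"three-component products: {three_component:.1e}")  # ~1e14 compounds
```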

The authors propose a “virtual synthon hierarchical enumeration screening” (V-SYNTHES) approach. They do the docking calculations on a set of fragment compounds that are a good representation of all the scaffolds available for library synthesis and the corresponding reagents to elaborate them. The best combinations are then turned into a virtual library of their own, and these compounds are docked again computationally. The paper says that they end up focusing on less than 0.1% of the synthons available, bringing the calculations back into the feasible range. The hope is to enrich the chemical space as early as possible with things that are more likely to lead to real hits (with a filter in place to keep any single class from dominating the resulting set). There are other refinements – for example, the fragment-like compounds have “dummy” capping R groups on some positions, and the workflow tries to prioritize hits that have these pointed into a region of the protein target with space to grow. As the paper notes, this is in some ways a computational recapitulation of fragment-based drug discovery, where you start with a small piece with greater-than-expected affinity for the target and grow out from that.
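
Here’s a toy sketch of that two-stage flow as I read it. The “docking” is just a random-number stand-in, and the function names, cap notation, and keep-fractions are mine rather than the authors’, but it shows where the savings come from:

```python
import random
from itertools import product

random.seed(0)

def dock_score(molecule):
    """Stand-in for a real docking engine; lower scores are better."""
    return random.uniform(-12.0, -2.0)

def v_synthes_toy(scaffolds, synthons_a, synthons_b, keep_fraction=0.001, keep_top=10):
    # Stage 1: "minimal enumeration" of scaffold plus one real synthon, with
    # the other attachment point capped by a dummy group.
    stage1 = []
    for scaf, a in product(scaffolds, synthons_a):
        fragment = f"{scaf}({a})([CAP])"
        stage1.append((dock_score(fragment), scaf, a))
    stage1.sort()
    winners = stage1[: max(1, int(len(stage1) * keep_fraction))]

    # Stage 2: enumerate full products only for the winning scaffold/synthon
    # pairs, and dock those. (The paper's extra touches, such as preferring
    # poses whose caps point into open pocket space, chemotype caps, and
    # higher-level re-docking, are left out of this cartoon.)
    stage2 = []
    for _, scaf, a in winners:
        for b in synthons_b:
            full_product = f"{scaf}({a})({b})"
            stage2.append((dock_score(full_product), full_product))
    stage2.sort()
    return stage2[:keep_top]

scaffolds  = [f"scaffold{i}" for i in range(50)]
synthons_a = [f"A{i}" for i in range(1000)]
synthons_b = [f"B{i}" for i in range(1000)]

hits = v_synthes_toy(scaffolds, synthons_a, synthons_b)
full_library = len(scaffolds) * len(synthons_a) * len(synthons_b)
stage1_docks = len(scaffolds) * len(synthons_a)
stage2_docks = max(1, int(stage1_docks * 0.001)) * len(synthons_b)
print(f"full library: {full_library:,}; actually docked: {stage1_docks + stage2_docks:,}")
```

The real workflow obviously runs an actual docking engine and the pose-based “room to grow” check; the point of the toy is just the bookkeeping, which is that you dock tens of thousands of capped fragments instead of tens of millions of fully enumerated products.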

This technique was tried out in a cannabinoid receptor docking effort (CB2), going after 11 billion potential REAL compounds from the Enamine collection. Their algorithmic approach reduced things down to docking about 1.5 million compounds, which is good, because an 11-billion-compound library is just not within range of anyone’s abilities if you try to tackle it directly. Overall, it was about a 5,000-fold improvement in computational screening cost. They benchmarked this against a random set of another 1.5 million compounds from the 11 billion, and found they had indeed produced an enhanced hit set – the best compounds were much more energetically favorable than the ones from the random collection, and there were many more hits overall. Comparing the two 1.5-million-compound sets along the way showed about a fifty-fold increase in potential hits even at the most permissive energy cutoffs, and higher enrichments with more stringent ones. The theoretical enrichment factor would be about 500 for the two-component reactions and about 20,000 for the much larger three-component set, and what they saw in practice was about 250 and 460 – getting within range of perfect enrichment for the simpler reaction set, anyway.
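
For those keeping score at home, the enrichment bookkeeping works roughly like this (this is the standard definition; the paper’s own benchmarking protocol has more moving parts, and the hit counts below are placeholders of mine, not their data):

```python
# Standard enrichment-factor bookkeeping (the paper's benchmarking has more
# moving parts; the hit counts below are placeholders, not their data).
def enrichment_factor(hits_selected, n_selected, hits_reference, n_reference):
    """Ratio of hit rates: selected set versus a reference (e.g. random) set."""
    return (hits_selected / n_selected) / (hits_reference / n_reference)

# With two equally sized 1.5M-compound sets, enrichment is just the ratio of
# hit counts at whatever score cutoff you choose:
print(enrichment_factor(2_000, 1_500_000, 40, 1_500_000))   # -> 50.0
```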

They took the best 5,000 compounds and ran the usual sorts of filters over them – undesirable functional groups, too much similarity to known CB2 ligands, etc. They then docked the remaining compounds at a higher computational level and took members of each structural cluster out for actual synthesis (80 compounds). Enamine was able to deliver 60 of those within about five weeks, and 21 of these showed antagonist activity in a real receptor assay (>40% inhibition at a 10 micromolar screening concentration). In a radioligand binding assay, five of them had Ki values in the single-digit micromolar range or better for CB1 and 16 for CB2, and these also looked good in a counterscreen against 300 other human receptors. As a further benchmark, the authors also compared these results to a standard virtual screen run at about the largest scale at which you’d want to do such a thing, on a diverse 115-million-compound set from the 11 billion Enamine structures, using the same receptor model and docking algorithm as in the V-SYNTHES protocol. This rather computationally intensive effort pulled out 97 hits, with nine of these showing single-digit micromolar or better Ki values at CB1 and five at CB2.

Interestingly, the V-SYNTHES hits were built on rather different scaffolds from the ones reported in the literature for these receptors. And because of the nature of the Enamine collection, they are readily expanded in an “SAR by catalog” effort. The team took three of the best hits and requested 121 similar compounds to be synthesized, with 104 delivered in a similar five-week time frame. This pulled out dozens of micromolar-and-below compounds, with several down in the low nanomolar range with 50x selectivity against CB1, which takes you down to potential lead candidate territory pretty quickly. At the very end of the paper they mention a similar effort against a completely different target, the kinase ROCK1, which produced similar enrichment factors and led to six micromolar-or-better compounds, one of which had an assay Kd of about 8 nanomolar.

From a data-science perspective, the size of the virtual libraries grows in polynomial fashion (as the square of the number of synthons for two-component reaction products, for example). Indeed, since this work was done, the full Enamine REAL library has gone from 11 billion to 21 billion potential compounds. But the computational demands of this protocol grow only linearly with the number of synthons, making its potential advantages more and more appealing (in the relative sense) as the libraries themselves get larger. I realize that saying “only linearly” might make real comp-sci people shiver a bit, but we’ll take what we can get out here. The hope is that, as the actual docking algorithms improve (as long as we’re not paying too high a computational price along the way), this and the other virtual screening techniques will show even greater hit-rate enhancements.
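
Here’s what that scaling argument looks like with some illustrative counts (a hypothetical scaffold count and synthon counts of my own choosing):

```python
# Illustrative counts for the quadratic-versus-linear argument: for a
# two-component scheme the enumerated library grows with the product of the
# synthon counts, while the stage-1 docking workload grows with their sum.
scaffolds = 100                        # hypothetical scaffold count
for n in (10_000, 50_000, 100_000):    # synthons available at each position
    library = scaffolds * n * n        # full enumeration: quadratic in n
    stage1  = scaffolds * (n + n)      # capped scaffold+synthon fragments: linear in n
    print(f"n = {n:>7,}: full library ~{library:.1e}, stage-1 fragments ~{stage1:,}")
```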

And as usual, I’m a long-term optimist on this sort of thing. We’re already able to do much more interesting and impressive work in this regard than was possible ten years ago, much less what could be done back when I entered the drug industry workforce (whose techniques by now look roughly like chipping flint to make hand axes). It’s not going to happen next year, but eventually it seems likely that real, physical compound screens (with plates and liquid handlers and fluorescent readers and all) will disappear in favor of computational approaches. Various hypemeisters have claimed, at various points stretching back to the early 1980s or so, that we had already reached that state, but just because people have allowed themselves to get a bit too enthusiastic from time to time doesn’t mean that the idea is laughable or will never be realized. Decades of hard work have, in fact, been realizing it piece by piece, and that will continue. Getting it done that way is not as sexy as proclaiming that the New Era Has Begun!, but it’s how things generally work out here in reality, where most of us spend the bulk of our time.