Virtual Screening for Coronavirus Protease Inhibitors: A Waste of Good Electrons?

Here’s a provocative title, in its way: “When Virtual Screening Yields Inactive Drugs”. You can take that several ways, naturally. I mean, all screening programs come up with mostly inactive compounds – you actually get suspicious when the hit rate is too high, and “too high” might easily be something like 0.1%, depending on what target you’re screening against and what library of compounds you’re throwing at it. But in this case the authors are talking about the flood of screens (especially virtual screens) against coronavirus targets that started up in 2020.

Man oh man, were there ever a lot of those papers. You have to think that everyone who threw something like this together at the time got it published somewhere, right? And who would turn down a well-intentioned manuscript to identify compounds to fight a burgeoning pandemic? But even at the time, it was clear that not all of these papers were of equal quality, and that, what's more, they couldn't all be right about what they'd identified. The first published X-ray structure of the coronavirus main protease (6LU7) appeared in April 2020, and the virtual screening groups wasted no time – in fact, that very structure paper had some screening ideas, none of them in the end particularly fruitful, and that protease structure was later the basis for some truly humongous virtual screens by other groups. I think it's safe to say that the coronavirus MPro has been virtually screened about as thoroughly as the current state of the art will allow. The authors of this new paper estimate that at least 700 docking/screening efforts have been reported in one fashion or another.

But what have all these efforts led to? The first of those two giant screens just linked to was the subject of further work by another team, narrowing down its hit set by drug-likeness criteria, chemical diversity and the like. But the top 200 or so compounds arrived at by those cutoffs were of no interest as actual coronavirus protease inhibitors when they were screened. Meanwhile, the two best compounds from that second big-screen paper had about 30 to 50 micromolar potency, which isn't anything to start a conga line with, either, not at full drug-sized molecular weights, anyway. All three of those screening papers took rather different computational approaches, though, and this new one proposes to go back over their hit sets with what is in theory a more discriminating tool, the Molecular Mechanics/Generalized Born Surface Area (MMGBSA) calculation. Like all such techniques, it's not going to spit out the “real” binding energies (or at least that's not the way to bet), but it should be better at rank-ordering a large set of compounds, if you have the horsepower available to do the calculations. It's supposed to generate low-energy poses for the ligands themselves, with attention to solvation energies, and to allow for some relaxation and wiggle room in the protein binding site as well.
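For anyone who hasn't run one of these, here's a rough sketch in Python of what an MM/GBSA rescoring step boils down to. The energy inputs and function names below are made up for illustration; this is not the authors' actual pipeline. The idea is to average the molecular-mechanics, polar-solvation, and nonpolar-solvation terms over a set of snapshots for the complex, the free protein, and the free ligand, subtract, and rank compounds by the result (the entropy term usually gets ignored when all you want is an ordering).

```python
# Sketch of MM/GBSA rescoring logic (illustration only, not the authors' pipeline).
# For each docked compound the binding free energy is estimated as
#   dG_bind ~ <E_complex> - <E_receptor> - <E_ligand>
# where each E term is E_MM (gas-phase molecular mechanics) + G_GB (polar
# solvation, Generalized Born) + G_SA (nonpolar solvation, surface area).
# The -T*dS entropy term is often dropped when only rank-ordering matters.

from statistics import mean

def snapshot_energy(e_mm: float, g_gb: float, g_sa: float) -> float:
    """Total energy of one species in one snapshot (kcal/mol)."""
    return e_mm + g_gb + g_sa

def mmgbsa_score(complex_frames, receptor_frames, ligand_frames) -> float:
    """Estimate dG_bind from lists of (E_MM, G_GB, G_SA) tuples per frame."""
    avg = lambda frames: mean(snapshot_energy(*f) for f in frames)
    return avg(complex_frames) - avg(receptor_frames) - avg(ligand_frames)

def rank_hits(hits: dict) -> list:
    """Rank a hit set: more negative score = predicted tighter binder.
    'hits' maps compound IDs to (complex, receptor, ligand) frame lists."""
    scored = {cid: mmgbsa_score(*frames) for cid, frames in hits.items()}
    return sorted(scored.items(), key=lambda kv: kv[1])
```

The arithmetic itself is trivial; the computational horsepower goes into producing the minimized or simulated snapshots that those energy terms come from.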

So the authors re-scored the virtual screens above using this new criterion, and actually tested the best of these new hit sets against the coronavirus protease itself (since these were screens of compounds that should be commercially available). Interestingly, many of the 200 or so compounds re-scored previously from the first big-screen paper showed up again by this method, although there were certainly some new ones in there. But actual testing of the top of the lists from both re-scoring efforts yielded nothing of interest whatsoever – the best compound, on dose-response testing, came in with an IC50 of about 800 micromolar, which for a compound of its molecular weight is simply useless. In the end, none of these large state-of-the-art virtual screens have produced anything that is in any way a candidate for a coronavirus protease inhibitor. Indeed, they have a great deal of difficulty producing anything that even measurably binds to the target at all.
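To put that 800 micromolar figure in perspective, here's a quick back-of-the-envelope ligand efficiency calculation. These numbers are mine, not the paper's: I'm assuming a drug-sized molecule of roughly 35 heavy atoms and treating the IC50 as a rough stand-in for a binding constant.

```python
# Back-of-the-envelope ligand efficiency for an ~800 micromolar hit,
# assuming ~35 heavy atoms (my assumption, not a number from the paper).
import math

RT = 0.593           # kcal/mol at ~298 K
ic50 = 800e-6        # molar; treated here as a rough stand-in for Kd
heavy_atoms = 35

dG = RT * math.log(ic50)               # about -4.2 kcal/mol
ligand_efficiency = -dG / heavy_atoms  # about 0.12 kcal/mol per heavy atom

print(f"dG = {dG:.1f} kcal/mol, LE = {ligand_efficiency:.2f}")
```

Anything much below the usual rule-of-thumb cutoff of about 0.3 kcal/mol per heavy atom is generally considered not worth chasing at that size, and 0.12 is not even in the neighborhood.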

This might be worth keeping in mind the next time someone tries to hype you up about virtual screening. The authors of this new paper say that they are hoping to see fewer papers that involve “overproduction of VS works that simulate a single crystal and a few randomly selected molecules”, and note that “The increasing number of ‘false theoretical friends’ in the literature will not help to rationalize the search of an efficient antiviral.” They’re being too diplomatic, but I don’t have to be. This stuff not only does no good, it does actual harm. It gives outside observers ideas about the efficacy of virtual screening that are simply not confirmed by real-world experiments, and it clutters up the literature with proposals that lead nowhere. If you’re going to virtually screen big databases of available compounds, don’t stop there: go screen some of your damn hits in a lab and see if they’re any good.

The time for publishing papers that are nothing more than “Here are a bunch of compounds proposed by our software!” should have ended, and the time when you could publish such lists in the expectation that there were actionable hits in there has not yet arrived. Virtual screening is still at that awkward age when claims need to be backed up with hard data. At the risk of sounding like a curmudgeonly pain in the rear, I’d be fine if journals started rejecting every one of these papers that doesn’t provide any.