Faked Crystallography

I’ll admit that I didn’t see this one coming: Retraction Watch is reporting that the Cambridge Crystallographic Data Centre (CCDC), the world’s main repository of small-molecule crystal data, is on the way to pulling nearly a thousand deposited crystal structures because they appear to have been faked. A preprint earlier this year from David Bimler flagged what seems to be a paper-mill operation flooding out bogus papers on metal-organic frameworks: hundreds and hundreds of weirdly worded manuscripts on nonexistent MOFs and their imaginary applications, full of apparently randomly selected “references” to the rest of the literature. And these papers came with crystal data deposited at the CCDC, which is the step that I really didn’t expect.

After all, anyone who studies the scientific literature has (especially in recent years) seen these auto-generated papers full of crap. But faked crystal structure files? That’s nasty. The record of these papers shows a sudden jump in 2020 and 2021, leading Bimler to wonder:

The dates paint a picture of accelerating publication, as if a small-scale cottage industry had been scaled up to a production line with a larger staff. One can imagine crystallographers initially ghostwriting manuscripts as a favour for friends, moonlighting from their day job, and becoming progressively more professional, though this must remain speculation.

These things have appeared in a whole list of journals, with a good number showing up in Inorganic and Nano-Metal Chemistry, Journal of Structural Chemistry, Journal of Molecular Structure, Main Group Chemistry, and Zeitschrift für Kristallographie – New Crystal Structures. These are not the most widely read journals in all of science, but they have been considered real places where real papers appear with real data. A few papers have appeared in even more general chemistry titles such as the Bulletin of the Chemical Society of Japan. There are also a few dozen papers that were shoveled into the Journal of the Iranian Chemical Society and the Arabian Journal of Chemistry, and a few that went into really obscure titles such as the Brazilian Journal of Medical and Biological Research. Apparently one of the highest-count titles for placement of these junkheaps is the Latin American Journal of Pharmacology from Buenos Aires, and I can confidently state that I have never heard of that title in my life. What they’re doing publishing scores of papers on metal-organic frameworks is beyond me, and it seems to be beyond Bimler’s ken as well. He notes that the journal does not offer library subscriptions, and papers have to be bought for $50 each, so there are probably a number of them that haven’t even been counted up yet.

Retraction Watch reports that the CCDC doesn’t pull things from its database until the background papers themselves have been retracted. For over 900 papers, that’s going to take quite a while, at traditional journal-correction rates. So they’ve issued “Expressions of Concern” on their own part for these papers, which is not something you see a database like this doing. But it’s a lot better than waiting for the journals to get their acts together, especially since the likes of the Latin American Journal of Pharmacology may in fact have no such coherence to collect in the first place.

And I had not realized it, but there was a fake crystal data scandal back in 2010. That one perhaps gives some insights into how you fake such things: apparently the authors in those cases would take an actual set of crystallographic data (somebody else’s, naturally!) and just step in and substitute some new metal atoms and the like. Voila, a new “compound” and a new CIF to deposit. In that round of fakery, it turned out that one structure had been used to generate eighteen supposed new ones, which is impressive in its way. Similar techniques have probably been used this time around, too; no doubt one could write a script to just auto-fake this stuff. To quote Samuel Johnson, “A man might write such stuff for ever, if he would but abandon his mind to it”.
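The flip side is that a swap like that ought to leave fingerprints. Changing the metal in a real compound would normally shift the unit cell at least a little, so two deposits that claim different metals but report essentially the same cell are worth a second look. Here is a minimal sketch of that check in Python. To be clear about what is assumed: the entry names and cell values are made up for illustration, the tolerances are arbitrary, and nothing here is the CCDC's or Bimler's actual method. It also skips reading real CIF files entirely and just takes the six cell parameters (a, b, c, alpha, beta, gamma) as plain numbers.

```python
from itertools import combinations

def same_cell(cell_a, cell_b, length_tol=0.01, angle_tol=0.05):
    """True if two unit cells (a, b, c, alpha, beta, gamma) agree within
    the given tolerances. The tolerance values are illustrative guesses."""
    lengths_ok = all(abs(x - y) <= length_tol
                     for x, y in zip(cell_a[:3], cell_b[:3]))
    angles_ok = all(abs(x - y) <= angle_tol
                    for x, y in zip(cell_a[3:], cell_b[3:]))
    return lengths_ok and angles_ok

def suspicious_pairs(structures):
    """Return every pair of entry names whose unit cells match."""
    entries = sorted(structures.items())
    return [(name_1, name_2)
            for (name_1, cell_1), (name_2, cell_2) in combinations(entries, 2)
            if same_cell(cell_1, cell_2)]

# Hypothetical entries: lengths in angstroms, angles in degrees.
structures = {
    "original_Zn":  (10.123, 12.456, 14.789, 90.0, 101.23, 90.0),
    "claimed_Cd":   (10.123, 12.456, 14.789, 90.0, 101.23, 90.0),
    "unrelated_Cu": (8.901, 15.432, 9.876, 90.0, 95.40, 90.0),
}

print(suspicious_pairs(structures))
```

A matching cell is only a flag for a human to follow up on, not proof of anything: genuinely isostructural compounds exist, and a more careful faker could nudge the numbers. Comparing every pair also gets slow for a database-sized collection, so a real screen would need something smarter than this.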

One final note: the “authors” of all these current 992 papers are from Chinese medical institutions, most of them appearing only once. If anyone got a raise or promotion based on their publication record off this stuff, what a waste of money that was. . .