Lists Get Checked Twice This Season; Experiments Do Not


By Anette Breindl
Science Editor

It's the time of year to make lists and check them twice. On Friday, Science published its Breakthrough of the Year issue, describing its picks for the most important scientific advances of the year.

A day earlier, Nature put out a news piece on "Nature's 10," 10 individuals who were instrumental in some way to the scientific enterprise, from discoveries through their publication to the policies based on them.

As journals were making their picks for the biggest scientific stories of the year, though, one emerging story about the year in science was not about the content of those discoveries – but about the dawning realization of just how little of that content is reproducible.

One of Nature's 10 is Elizabeth Iorns, who founded the Reproducibility Initiative to provide both ways and incentives to validate translational research. (See BioWorld Today, Aug. 22, 2012.)

The need for such an initiative has become painfully obvious in recent years. The magnitude of the problem is sobering. Last March, MD Anderson's Lee Ellis (who sits on the advisory board of the Reproducibility Initiative) and Glenn Begley, who is a senior vice president at TetraLogic Pharmaceuticals Inc., published a paper detailing the attempts of Begley's group, during his time as head of oncology and hematology research of Amgen Inc., to reproduce landmark cancer studies. They were unable to do so in more than 90 percent of cases. (See BioWorld Insight, April 2, 2012, and BioWorld Today, June 8, 2011.)

Other studies have differed in the exact numbers. But whatever the exact proportion of translational studies that can't be reproduced, it appears to be well above 50 percent.

For the most part, the problem is not outright fraud. Fraud certainly exists, and journals and referees have been working on better systems for detecting it ever since the decade's most high-profile case of fraud – Woo-Suk Hwang, who fabricated data to support his claim that he and his team had created patient-specific stem cell lines with high efficiency, and embryonic stem cell lines through cloning. (See BioWorld Today, Jan. 11, 2006.)

But much more often, data are not faked outright. They are, instead, collected and analyzed under extreme systemic pressure to achieve perfect results. As a result, those data are cherry-picked and massaged – often unconsciously – until they are, literally, too good to be true.

Those findings, in turn, can ruin drug discovery programs and academic careers alike, as researchers struggle to build their own work on the scientific equivalent of quicksand that no one has even tried to replicate.

Part of the problem is intrinsic to what science is about. Scientists, by and large, are attracted to science for the thrill of discovery – not for the mundane work of doing someone else's quality control. Nor does replication bring the glory that an original finding does. Instead, it can be hard to publish at all.

Altogether, that state of affairs makes it perfectly reasonable for an individual principal investigator to devote scarce funding to original discoveries rather than replication. But the net result is a sort of tragedy of the commons – since it is in no one's personal interest to replicate, the scientific literature ends up being made up to a significant extent of bogus findings.

The Reproducibility Initiative addresses the problem by providing both a mechanism and an incentive to replicate findings. Scientists can arrange for third parties to replicate experiments. Those validation studies are routinely published by PLoS ONE, which gives the folks doing the replication a published paper, and the original researchers additional attention for their work, as well as data that are of higher value for licensing to industry.

Begley also suggested ways to make replication a more attractive proposition. But a better way might ultimately be to nip the need for such replication in the bud by performing higher-quality research in the first place.

Begley pointed to blinding during data analysis as the single most important way to improve the quality of translational research. If blinding became routine, he told BioWorld Today, "I think we would get rid of a lot of this."

Other fixes – all of which, he said, "seem self-evident but are seldom practiced" – include showing all data, not just best examples that are suddenly "representative" in the figures; repeating experiments; running both positive and negative controls; validating reagents; and using appropriate statistics. High-profile studies, he said, typically fail on several of those quality indicators.

The sort of research that makes it to annual best-of lists is likely to be more robust than average, because there are many labs working on it, meaning that reproducibility issues are more likely to show up before publication, not after. One item on this year's list, the ENCODE project's publication of regulatory DNA elements, was done by 442 researchers and published in more than 30 papers simultaneously.

But a high profile of the research itself is no guarantee of its accuracy.

Before his flameout, Hwang's claims of human cloning were a staple of 2004's 10-best lists, even though he must have known that others would attempt to reproduce his findings.