Science funders like NIH and NSF need a dramatic expansion of their capacity to assess reproducibility, and to investigate the more extreme case of research fraud. Otherwise, we could easily be wasting several billion dollars a year on fraud and irreproducible research.
Reproducibility
As for reproducibility, numerous studies have identified a reproducibility problem across virtually every major area of biomedical research.
For one example, the Reproducibility Project: Cancer Biology concluded in the past year, with fairly dismal results. Not only were the replicated effects around 85% smaller than in the original studies, but it was an enormous challenge even to attempt a replication experiment in the first place.
Literally zero of the original experiments “were described in sufficient detail” to even set up a replication experiment, and 41% of the original authors were “minimally helpful” or “not at all helpful” in giving advice on what their original experiment had done.
As another example, consider ALS research. The serial entrepreneur Jamie Heywood had a brother who came down with ALS – also known as Lou Gehrig’s disease – at a fairly young age. Heywood was determined to save his brother, and founded the ALS Therapy Development Institute (which raised many millions of dollars for ALS research).
But it was all for naught. His brother died, and Heywood told me that “we spent $40 million dollars on ALS experiments, and not a single one would replicate.”
Heywood came across as deeply bitter as he told me this.
It’s no wonder. Imagine raising millions of dollars for medical research to help a family member on his deathbed, only to find out that none of the studies worked.
As one of Heywood’s deputies wrote, “we have tested more than 100 potential drugs in an established mouse model of this disease (mostly unpublished work). Many of these drugs had been reported to slow down disease in that same mouse model; none was found to be beneficial in our experiments . . . . Eight of these compounds ultimately failed in clinical trials, which together involved thousands of people.”
Indeed, some drugs that had highly positive effects in published mouse experiments turned out to have negative effects in the replications.
In other words, reproducibility isn’t just a technical matter of being finicky about dotting i’s and crossing t’s.
It has real human consequences: lost opportunities to address deadly diseases, and thousands of people put through clinical trials that may have been pointless from the outset.
****
To be sure, NIH and NSF have already addressed reproducibility in various ways. NSF has occasionally announced small lines of funding for replication studies in neuroimaging, in computing and communications research, and in social science more broadly. NIH has taken many actions as well, including funding a wide range of individual projects aimed at improving reproducibility, along with a number of funding announcements (such as one on reproducing clinical trials on alcoholism).
All of this work is great! I don’t mean to discount it at all.
But we could build on this by being more systematic: require that a certain percentage of each year’s budget be dedicated to such efforts, rather than leaving the amount to ad hoc decisions from year to year.
That is, NIH and NSF should be required to spend 0.1% of their funding each year on replication studies. By comparison, the Centers for Medicare & Medicaid Services spends around 0.17% of its yearly budget on the Center for Program Integrity.
One-tenth of one percent isn’t too much to ask when it comes to quality control.
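To put rough numbers on that (these are my own ballpark figures, not official ones): NIH’s annual budget is on the order of $45 billion and NSF’s around $9 billion, so the set-aside would come to something like $45 million and $9 million per year, respectively:

```python
# Back-of-the-envelope: what a 0.1% replication set-aside would mean in dollars.
# The budget figures below are rough recent-year approximations, not official numbers.
SET_ASIDE = 0.001  # one-tenth of one percent

budgets = {"NIH": 45e9, "NSF": 9e9}  # approximate annual budgets, USD
for agency, budget in budgets.items():
    print(f"{agency}: ~${SET_ASIDE * budget / 1e6:.0f} million per year")
# NIH: ~$45 million per year
# NSF: ~$9 million per year
```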
This funding should be administered by the NIH or NSF Director, rather than imposed as an equal requirement across the Institutes and Centers (or the Directorates, in the case of NSF), and the precise allocation should be determined by a council of representatives from across the organization.
For example, it might be more important (and more costly) to replicate a series of cancer experiments sponsored by the National Cancer Institute than to replicate studies from the National Institute of Dental and Craniofacial Research. Likewise, NSF would probably get much more out of replicating studies from education, social science, or perhaps earth science than from mathematics or computer science.
As well, whenever Congress considers funding for a specific line of research (such as the recent burst of hundreds of millions of dollars committed to Alzheimer’s), it should require the agency to conduct a comprehensive replication project (which would have been a very wise decision with regard to Alzheimer’s…).
Fraud Investigations
Imagine a world in which police and prosecutors basically didn’t exist, and most crime was caught only if there was a private vigilante around. Crime would be higher. But that’s the world that researchers live in: no official agency *proactively* investigates important published papers to check for signs of fraud.
In the vast majority of cases, the only reason we know about fraud is that dedicated amateurs spent thankless hours, risking their own professional reputations and careers, to investigate someone else’s research. Indeed, most of the most prominent fraud-detectors now work outside academia, despite having PhDs and having at one point intended an academic career. (See, e.g., here and here.)
In other cases, PhD students (who are even more vulnerable to retaliation) have taken on a particular investigation in their spare time, often in the face of advice not to waste their time (see here, here, and here).
Offices of Research Integrity do exist within government and universities, but they don’t proactively look for fraud. Those who do spend time on such matters often get nowhere, make enemies in their field, and miss opportunities to work on their own original research. The incentives are entirely against spending one’s time and money on checking research quality.
Just recently, we all learned – some 15-16 years after the fact – that a line of Alzheimer’s research seems to have been fraudulent. The original research credits “grants from the NIH” to three of the authors. As Nobel Laureate Thomas Südhof of Stanford told Science, “The immediate, obvious damage is wasted NIH funding and wasted thinking in the field because people are using these results as a starting point for their own experiments.”
How did the fraud investigation start? With anonymous commenters on PubPeer, an online forum often used to critique published scientific articles. Yet the fraud was apparently quite obvious once anyone bothered to look.
A similar case occurred with Piero Anversa of Harvard Medical School, who fabricated data on stem cells in heart disease. A Reuters story estimated that NIH had spent $249 million on such research even after being aware of Anversa’s fraud.
Recounting all the cases of fraud (e.g., Anil Potti) in other government-sponsored grants would be a lengthy endeavor.
But the scenario is usually the same:
Fraud persists for years until a dedicated observer takes a closer look at the data and immediately spots obvious discrepancies, or results that are simply too good to be true. If anyone had taken that closer look earlier, the fraud would have been discovered earlier.
This scenario is far from optimal. When we spend many tens of billions of tax dollars on research every year, it makes no sense to leave the actual research unpoliced except for the occasional private vigilante.
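To make “taking a closer look” concrete, here is a minimal sketch of one simple screen that fraud-detectives actually use: the GRIM test (Brown & Heathers), which checks whether a reported mean is even arithmetically possible given integer-valued data and the reported sample size. This is my own illustration, not any agency’s tool, and the numbers in the example are made up:

```python
def grim_consistent(reported_mean: float, n: int, decimals: int = 2) -> bool:
    """GRIM test: can any integer total, divided by n, round to the reported mean?
    (Uses Python's built-in round(); a production tool would handle
    alternative rounding conventions as well.)"""
    # Any achievable mean must come from an integer total near reported_mean * n.
    approx_total = round(reported_mean * n)
    return any(
        round(t / n, decimals) == round(reported_mean, decimals)
        for t in (approx_total - 1, approx_total, approx_total + 1)
    )

# Example: a mean of 5.19 from n = 28 integer responses is impossible;
# the closest achievable means are 5.18 (145/28) and 5.21 (146/28).
print(grim_consistent(5.19, 28))  # False -> worth a closer look
print(grim_consistent(5.18, 28))  # True  -> arithmetically possible
```

A failed GRIM check isn’t proof of fraud, of course; it’s exactly the kind of cheap, automatable red flag that a funded investigator could run across thousands of papers, instead of waiting for a volunteer to stumble on one.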
NIH and NSF should sponsor serious efforts to look at sponsored studies for signs of fraud, and to engage in more proactive investigations rather than waiting for anonymous Internet commenters to identify problems years or decades after the fact.
As well, we should set up one or more organizations that would work full-time on replicating studies and on systematically auditing research fields and government agencies for signs of research fraud and other problems.
These proactive efforts would preserve the value of the taxpayer investment in research, both by unearthing more cases of fraud and by deterring many other fraudsters.
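As one illustration of what such an organization could automate, here is a rough sketch of a first-pass screen for duplicated figure panels (the kind of image reuse at issue in the Alzheimer’s and Anversa cases), using perceptual hashing via the Pillow and imagehash libraries. The directory layout and threshold are my own assumptions, and a real audit would also need rotation/crop detection and human review:

```python
# pip install pillow imagehash
from itertools import combinations
from pathlib import Path

import imagehash
from PIL import Image

def flag_near_duplicate_panels(figure_dir: str, max_distance: int = 4):
    """Flag figure files whose perceptual hashes are nearly identical,
    which can indicate reused or relabeled panels across papers."""
    hashes = {
        path.name: imagehash.phash(Image.open(path))
        for path in sorted(Path(figure_dir).glob("*.png"))
    }
    for (name_a, hash_a), (name_b, hash_b) in combinations(hashes.items(), 2):
        distance = hash_a - hash_b  # Hamming distance between the two hashes
        if distance <= max_distance:
            print(f"{name_a} ~ {name_b} (distance {distance})")
```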
Sample legislative text
Replication Experiments:
Section 402 of the Public Health Service Act (42 U.S.C. § 282) is amended by inserting the following new subsection (o):
“(o) Replication Initiatives–
In General: The Director of NIH shall spend one-tenth of one percent of the NIH budget each year on replicating important experiments and studies from across NIH.
Selection and Advisory Council: The Director of NIH shall assemble a Replication Advisory Council composed of representatives from each NIH Institute or Center that spends at least $500 million a year on extramural grants. That Council shall meet at least twice a year in order to decide which areas of NIH-sponsored research deserve replication so as to advance the field. The Council shall make recommendations to the Director as to the amount to be spent on any particular replication project, with the goal of a well-balanced portfolio. The Director shall follow the Council’s recommendations to the extent feasible, and shall briefly explain any departures in an annual report submitted to Congress.
Publication—The overall results and data from any replication project shall be made publicly available as soon as possible.
Section 7009 of the National Science Foundation Act of 1950 (42 U.S.C. § 1862o-1) is amended by designating the existing text as subsection (a) and adding a new subsection (b) as follows:
“(b) Replication Initiatives–
In General: The Director shall spend one-tenth of one percent of the Foundation budget each year on replicating important experiments and studies previously funded by the Foundation.
Selection and Advisory Council: The Director shall assemble a Replication Advisory Council composed of representatives from each Directorate. That Council shall meet at least twice a year in order to decide which areas of Foundation-sponsored research deserve replication so as to advance the field. The Council shall make recommendations to the Director as to the amount to be spent on any particular replication project, with the goal of a well-balanced portfolio. The Director shall follow the Council’s recommendations to the extent feasible, and shall briefly explain any departures in an annual report submitted to Congress.
Publication—The overall results and data from any replication project shall be made publicly available as soon as possible.
Fraud Investigations:
Section 493 of the Public Health Service Act (42 U.S.C. § 289b) is amended by adding a new subsection (f):
“(f) Investigations of Potential Research Fraud—
Establishment of New Fraud-Detection Organization: The Office of Research Integrity shall, as soon as practicable, seek to establish and fund one or more extramural fraud-detection organizations.
Leadership: This organization shall be led by, and composed largely of, scientists and researchers with expertise sufficient to detect academic fraud in data, images, and other research materials.
Activities: The fraud-detection organization shall regularly seek out opportunities to detect fraud in NIH-funded publications and research, including by conducting proactive investigations and by responding to reports (for which anonymous submissions shall be accepted).
Section 7009 of the National Science Foundation Act of 1950 (42 U.S.C. § 1862o-1) is amended by adding a new subsection (c) as follows:
“(c) Investigations of Potential Research Fraud—
Establishment of New Fraud-Detection Organization: The Foundation shall, as soon as practicable, seek to establish and fund one or more extramural fraud-detection organizations.
Leadership: This organization shall be led by, and composed largely of, scientists and researchers with expertise sufficient to detect academic fraud in data, images, and other research materials.
Activities: The fraud-detection organization shall regularly seek out opportunities to detect fraud in Foundation-funded publications and research, including by conducting proactive investigations and by responding to reports (for which anonymous submissions shall be accepted).