> Pharmalot: What do you make, though, of the editors at The New England Journal of Medicine? In a recent editorial, they expressed reservations and set off a ruckus by saying that some researchers worry about ‘research parasites.’
> Krumholz: There are a lot of people who feel the research they do is the best research. It comes across as self-serving and absurd. Science is about replicating and testing your work. We need scientists to test and investigate and provide new insights. … Remember, people may make errors and the only way to catch mistakes and make corrections is to see the data. To call people parasites who are taking advantage of existing data and using that to generate new knowledge is not parasitism. That’s synergy and building on work of others. … Anyone who holds that view is not seeing the big picture about how different people can work in different ways to help society advance.
There's been a lot of anger at NEJM for this editorial, more because it used the brazenly derogatory term "research parasites" than because of any other particular complaint. The same discussion happened last decade in the basic research community, as gene expression microarray studies started publishing lots of data and no single initial publication could fully mine it. Now that the clinical science community is encountering more of this, and information-dense technologies are being used to gather data during clinical trials, the same discussion is getting rehashed.
Meanwhile, those whose primary research thrust is analysis of data rather than generation often have more offers to collaborate than they could ever meet. They may not get the Science or Nature papers though, as those are typically reserved for data producers.
> They may not get the Science or Nature papers though, as those are typically reserved for data producers.
[I'm an undergrad research student at my university, doing research into asteroseismology.] While this might be the case for medical data, I can give a counterexample from astrophysics. As a community, we only have so many telescopes. Most of my work involves the NASA Kepler space telescope, and several of my colleagues have written articles in Nature and Science about new techniques and results from data that is publicly available from that telescope. Several papers I've coauthored have mainly been about new data processing techniques to learn more about stars using existing data.
So, from my perspective, the whole "research parasite" FUD was particularly ridiculous. It's incredibly arrogant to claim that you've learnt all you can learn from a single collection of data, and just as arrogant to call people who salvage even more information from other data sets "parasites."
The derogatory term is far from the only problem with the NEJM editorial. The problem is that the article was acting as if publications and socially approved results are primarily a reward for putting in effort, as opposed to part of a search for truth.
The article criticized "rude" behavior like using someone's own data to refute their conclusion. That's the biggest bit of cluelessness in the editorial.
Sadly, most researchers are primarily judged on publication record, with the not-too-subtle subtext that "low" rates of publication are due to lack of either effort or talent. In such rating exercises, truth is never a factor.
Creating data can be expensive. Analyzing data is cheap. People who come up with a new theory get a lot more fame than people who publish raw data. The incentives are then to suck your data absolutely dry of theoretical insight before you publish it.
The solution would be to realign the incentives by giving more credit to the creators of the data.
Medical research is currently a clusterfuck. It is absolute nonsense to create throwaway data for the purpose of verifying or falsifying one theory. Fucking morons. Sometimes I think I'm the only sane person on this rock, but I used to think that maybe I was just arrogant. Nope. Now I am older and wiser and I realise it's just true.
I'm always glad to see submissions to Hacker News from the new Stat news service. This submission has a lead paragraph before the interview transcript that sets a lot of context: "In an extraordinary move, the International Committee of Medical Journal Editors last week issued a proposal to require researchers to share their clinical trial data as a condition for publication. And the researchers would also have to submit plans for how their data can be shared. The journal editors, who represent such periodicals as The New England Journal of Medicine and the Annals of Internal Medicine, believe data sharing 'will help to fulfill our moral obligation to study participants, and we believe it will benefit patients, investigators, sponsors, and society.'"
The key words there are "condition for publication." If researchers can no longer publish in the major journals without an ironclad guarantee that they will share their data, the incentive to share data well improves a lot. Interestingly, the Retraction Watch group-edited blog about scientific research has a guest post today "Sharing data is a good thing. But we need to consider the costs"[1] by Liz Wager referring to the same proposal by medical journals, but suggesting that some other rules would be even more helpful for improving research.
"I want to re-emphasize that I am not against data sharing. If I had a magic wand (aka unlimited funding for research and its dissemination) I would undoubtedly wave it over all research and create a system in which raw data were permanently linked to all types of report and all the report formats were linked (so that, for example, somebody reading a press release could easily check the journal article, and, if they wished, also the protocol, full study findings and raw data). But if the fairy gave me two wishes instead of a wand, I would wish for prospective trial registration and access to full trial reports for all trials before wishing for raw data."
If you're curious about this, check Ben Goldacre's website (http://www.badscience.net/) and the OpenTrials project (http://opentrials.net/). He wrote a couple of very interesting books about this issue.
I'm helping http://www.myire.com, which is set to launch publicly this quarter and is working on this exact problem. It covers everything from ideation to publication, with collaboration along the way. Truly reproducible research.
If you want to chat about it email me hn at strapr dot com and let's talk.
This is pretty awesome. I think what's as important as actually providing the data + results is details of the steps taken to produce the analysis. Clinical trials data analysis is often performed by underqualified research assistants and then sent to a biostatistician for a rubber stamp. Having a mechanism to review analysis would not only save time in getting potential treatments to patients, but also reduce the effort required to analyze new datasets.
"In an extraordinary move, the International Committee of Medical Journal Editors last week issued a proposal to require researchers to share their clinical trial data as a condition for publication."
If this happens, it would turn a very important philosophical and practical corner in the sciences. It might even prevent things like this:
What's noteworthy about the above result is that many studies that were supposed to be included couldn't be, because the original data couldn't be located.