Our VP is worried about data archiving…

Jack Payne is the Vice-President of Food and Agricultural Sciences at UF and he is a great guy – we’re lucky to have him. He’s fought for us at the state and national level and is very progressive as university administrators go. But this morning he sent this email to his faculty:


Please see memo below from Cathy Woteki, USDA Chief Scientist and Under Secretary for REE, which is the office through which our funds flow from USDA.  I have concerns about the memo.  On the surface it all sounds wonderful – sharing data and feeding the world, but the federally mandated open access rule, to me, has serious complications.  Basically, the law claims that if your research was funded by federal dollars, you must make your results available independent of publishing in a peer-reviewed journal, etc.  So, how do we protect “truth?”  What stops bad data from being released to the public?  There are many more implications along this line of reasoning.  Also, what about any intellectual property that may be involved in the research?  More importantly, the argument being used is that the public pays for the research through their taxes and is therefore entitled to the results.  But I ask what public?  Once the data becomes public, it will be the citizens of the world who will have access to the results, raising national security issues.  Anyway, just some of my thoughts.  This will become a huge issue in the coming months as university scientists become more aware of the implications.  I do know that the National Academies are meeting next week to discuss this new federal requirement.  Soon, AAU, APLU and others should be weighing in on the matter as well as the various professional organizations.


Kudos for seeking faculty input and putting this issue on the radar of researchers who may not be thinking about it.  But I’m afraid my response may not have been as convincing as it could have been because I didn’t address any of his concerns regarding biosecurity or intellectual property. Any suggestions? Please post below in the comments.


Jack,  The benefits of making data open far outweigh any perceived costs. I lay out some of these benefits in a recent editorial for a journal of which I am an Editor (http://tinyurl.com/3wqehzw; see also this story in Science: http://tinyurl.com/ykodmu9). The benefits include:

1)     Making data (and code used for analyses) publicly available allows others to check your results. Transparency improves science because it catches mistakes, the consequences of which can be tremendous. See this example from economics: http://tinyurl.com/c3jzz27

2)     Data can be used in meta-analyses or to address questions not originally envisioned by the people who collected them. I provide an example related to climate change in my editorial.

3)     Papers that make data available freely are cited more than papers that don’t. This means they have a greater impact (Piwowar et al. 2007. Sharing detailed research data is associated with increased citation rate. PLoS One 2: e308). Link: http://tinyurl.com/ydc3356

4)     Many data archiving sites, such as the NSF-supported Data Dryad, allow for an embargo period (typically 1 year post-publication) before data are released to the public. See http://datadryad.org/ . These sites also have datasets in citable formats with DOIs, so researchers will continue to get credit for data collection via citation.

5)     Finally, the loss of datasets is a tremendous waste of intellectual effort and public funding.  We have a responsibility to spend the public’s funds wisely, and if one’s data die in a filing cabinet or on a hard drive we have essentially wasted taxpayer money.

This train has left the station; we’re just playing catch-up at this point. Indeed, many prestigious journals now require that the data used in a paper be archived and freely available for the paper to be published. There are some legitimate concerns, but they have been suitably dealt with in many other places. IFAS needs to encourage its researchers to archive their data in publicly available archives because it’s the ethical thing to do, will make our science more robust, and will increase its impact tremendously in both the short and long term.


PS I practice what I preach. The data for five recent papers my lab has published are freely available for reuse by others at http://tinyurl.com/dxnj2ca.




So what do you think?

