I worked with the University of Minnesota Libraries on a pilot data curation program. I have discussed this before. The article about the project, “Preserving Data for Future Research,” is now online in their Continuum magazine. There is a rare photo of me smiling.
“This is a service that the Libraries can provide and nobody else on campus is currently providing,” said Lisa Johnston, a University of Minnesota librarian, who also is Co-Director of the University Digital Conservancy. Johnston is working on a plan to meet the federal mandate.
“This is just a new type of resource that we will be providing,” she said. “It’s a natural extension of library services.”
Johnston led a pilot data curation project last year that involved faculty members, researchers, and students representing five different data sets. The project leveraged the Libraries’ existing infrastructure, the University Digital Conservancy, the institutional repository for the University of Minnesota (conservancy.umn.edu).
“Feedback from the faculty in the pilot was very positive and anticipated that this service might satisfy the upcoming requirements from federal funding agencies,” Johnston said. Now she’s working toward building a repository for the campus, which may be open for business later this fall.
“University libraries are the natural repository for research conducted at a particular university,” said David Levinson, professor in the Department of Civil Engineering. Levinson – who conducts research in the area of infrastructure, particularly transportation infrastructure – currently maintains some of his research data on his office desktop computer.
“I won’t be here in 20 years; I’ll be retired. What will happen to the data sets when I retire?” he asks. “What if someone forgets to migrate it?”
Levinson was involved in the pilot study. He called it a “step in the right direction, but it’s a baby step,” citing potential lack of resources and compliance as two challenges to a fully functioning data curation repository.
“You could probably have one librarian for every department at the University … who could have a full-time job collating and collecting the data for that department each year,” he said, noting that a funding model has not yet been established. He adds, “[The funding] should come from the grants.”
So, why is it important for publicly funded research data to be preserved?
“First of all, the data is oftentimes unique, you could never recreate it,” Johnston said. “It’s also very expensive. And what do you get out of it? One, two, five papers? You could instead make that underlying research data available so that other researchers can take a look at the data, re-analyze it and come up with new results – perhaps competing results, perhaps validating results.”
Levinson agreed, saying that Libraries already have the infrastructure, the resources and the tools to not only preserve the data but to make it “findable” by the public.
“There’s 7 billion people in the world – most of whom don’t want to use my data – but a couple of whom might. And they might not know that the data exist” if it’s just sitting on his computer, he said. “Putting it out into a standardized, findable public forum makes it easier for them to: A) Know that the data exists; and B) Actually get at the data.”