IASSIST 2008 Conference Session: Moving Research Data Into and Out of Institutional Repositories

Robin Rice of EDINA and Edinburgh University Data Library gave an excellent overview of open access to data. She quoted Peter Buneman of the Digital Curation Centre, who said, “The best way to preserve your data is to publish it!”

She also gave an overview of open data licences, including the Science Commons Open Data Protocol and the Public Domain Dedication & License (PDDL), which avoids the attribution stacking problems that may occur with Creative Commons licences.

Robin raised the following issues:

  • What are the incentives for researchers to manage and share data?
  • How can funders’ requirements be met? Researchers need to define how they are going to share their data, or explain why they cannot.
  • Do higher education institutions have the capacity to provide data management services?

Finally, she gave an overview of the DISC-UK DataShare project and its work tracking the tools and guidelines available for open data. DISC-UK is a collaboration between Southampton University, Edinburgh University, Oxford University and the London School of Economics.

Katherine McNeill from MIT Libraries, DSpace and the Harvard-MIT Data Center gave a talk concerning interoperability between MIT’s institutional repository (IR) and data repositories.

MIT has multiple locations for depositing data – DSpace, Harvard-MIT Data Center (HMDC) and ICPSR.

This presents challenges for searching, unifying collections and archiving. It is also difficult to advise faculty where to deposit their data as each location has its own advantages and disadvantages.

Therefore, there is a need for interoperability.

Opportunity: PLEDGE Project – designed to foster the use of data grid technologies, replicating content across multiple systems for the purpose of preservation.

The project developed an ‘agent’ that makes the DSpace software and the HMDC Dataverse software interoperable. The agent was developed by Mark Diggory.

Goal: to archive, preserve and provide access in DSpace to MIT-authored studies in HMDC.

Issues raised:


(1) Workflow for selecting and processing studies

  • Currently this is a manual process and an informal service

(2) Updating of studies

  • Currently, if content is updated in HMDC, there is no ‘flag’ to notify repository managers that the copy in DSpace needs to be updated
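
One way such a ‘flag’ could work is to compare modification dates between the two systems. The Python sketch below is purely illustrative and is not how the PLEDGE agent works: `StudyRecord`, `find_stale_studies` and the `last_modified` field are hypothetical names, and it assumes that both repositories can export a study identifier and a last-modified timestamp for each study.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class StudyRecord:
    """Hypothetical minimal description of a study held in a repository."""
    study_id: str
    last_modified: datetime


def find_stale_studies(source, replica):
    """Return the sorted IDs of studies that need (re)loading into the replica.

    A study is flagged when it is missing from the replica, or when the
    source copy was modified after the replica copy.
    """
    replica_by_id = {record.study_id: record for record in replica}
    stale = []
    for record in source:
        copy = replica_by_id.get(record.study_id)
        if copy is None or record.last_modified > copy.last_modified:
            stale.append(record.study_id)
    return sorted(stale)
```

Run on a schedule, a check like this would give repository managers a list of studies to refresh, instead of relying on someone noticing the change by hand.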

(3) License agreements and terms of use

  • DSpace has licensing screens which give DSpace permission to disseminate etc.
  • With the new system, content is loaded in the ‘back end’, so the researcher never sees the licensing screens
  • Repository manager has to get permission via email
  • How to deal with this in the future?
  • Further, end-users are usually told what they can do with the data, but with back-end loading this information is less visible. What are the implications?

(4) Keeping the agent up to date