Tag Archives: IASSIST 2008

IASSIST 2008 Conference Session: Licensing, Privacy and Protection

Thomas Lindsay of the University of Minnesota gave a presentation entitled “The Digital Locked File Cabinet: A Problem Metaphor”. His talk addressed: (a) interpreting Institutional Review Board (IRB) regulations and guidelines for human subject research, particularly in relation to privacy; and (b) data security where private information has been collected.

US Code: Protection of Human Subjects, created 1974 and revised 2005 – primarily biomedical and behavioural research, applied directly to all federally funded or conducted research – human subject research defined in 45 CFR 46.102(f)

Research obligations under law include:
1. minimise subjects’ risk through sound methodology
2. ensure risks are appropriate to benefits

Human subjects research in the digital age – online surveys, no face to face communication, written consent is often impossible (at least where a signature is required) and traditional data security measures do not apply.

Regulations that do not strictly translate to the digital age must be interpreted. IRB panels make individualised, case-by-case decisions (which often conflict with decisions in similar cases) and do not always understand the nuances of internet data security issues.

CLA Survey Services at the University of Minnesota has developed software and standards to deal with these problems. These include online consent forms, data security software etc.

CLA-OIT Survey Services (Case Study) –
Key Issues:

(1) Informed Consent

  • Title 45 part 46.117 – legislation requires informed consent – researchers must collect signatures from participants to indicate consent
  • Signature collection is not practical in most online research settings – vast majority of population is not set up for federally-recognised “digital signatures”
  • Compromise with IRB – the consent form is the first page of the survey and meets the same requirements as the paper form; in place of a signature, the participant is asked whether they consent, and they can only continue with the survey if they say yes
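The consent compromise described above can be sketched in code. This is only an illustration, not the CLA Survey Services software: the `SurveySession` class, the page names and both functions are hypothetical names chosen for the example. It shows the consent question acting as a gate, so that survey questions are only served after an explicit yes.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class SurveySession:
    """Hypothetical record of one participant's progress through an online survey."""
    consent_given: Optional[bool] = None  # None until the consent question is answered
    consent_recorded_at: Optional[datetime] = None
    answers: dict = field(default_factory=dict)


def record_consent(session: SurveySession, answer: str) -> bool:
    """Store the answer to the consent question; only an explicit 'yes' counts as consent."""
    session.consent_given = answer.strip().lower() == "yes"
    session.consent_recorded_at = datetime.now(timezone.utc)
    return session.consent_given


def next_page(session: SurveySession) -> str:
    """Choose the page to show: consent form first, questions only after a yes."""
    if session.consent_given is None:
        return "consent_form"
    if not session.consent_given:
        return "exit_page"
    return "questions"
```

Recording the time of the answer is an assumption about what a review board might want in place of a signature; the talk did not specify what is stored.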

(2) Required Answers

  • Can researchers require a participant to answer a question? This can conflict with the principle that a participant can withdraw consent at any time
  • Face to face is different to online – in face to face the researcher can intimidate – online a participant can never be truly forced to answer a question, because they can just shut the browser and walk away from the computer

(3) Data Security

  • Multiple layers of protection including encryption, dedicated database, passwords etc.

[I asked two questions:

(1) Have you developed a system whereby the participant can give varying levels of consent for use of their data when they begin the survey?
(2) What is the protocol if the participant does “close the browser and walk away”? Are they informed at the beginning of the survey that this will constitute a withdrawal of consent? Are they told what will be done with their data entered thus far if they decide to stop completing the survey?

Response: they have not done either of these things, but will consider both carefully. There may be some difficulties in convincing the researcher who is conducting the survey to agree to allow participants to give varying levels of consent, because the researcher usually wants to obtain consent to do everything they can with the data.]

IASSIST 2008 Conference Session: Data Discovery and Dissemination: Linking Librarians, Vendors and Archives

Terrence Bennett, The College of New Jersey
Austin McLean, ProQuest
Myron Gutmann, ICPSR (Inter-University Consortium for Political and Social Research)

This session centred on data in dissertations.

ProQuest UMI has a commitment to enter and maintain the dissertation or thesis in the scholarly records and make copies available according to the author’s choice for access and dissemination.

[My question: what mechanisms have you implemented to ensure that you are providing the level of access and reuse rights chosen by the author – do you use licensing mechanisms such as Creative Commons licences?

Their response: they are looking into working with CC and Science Commons – “we think it’s a great idea”.]

Students engaging in the research process are seeking ways to enhance knowledge; they explore the possibility of using existing data and/or whether new data must be collected to best answer their research question.

In December, ProQuest began to facilitate the access of supplemental files – the form of these supplemental files is up to the institution and what they will accept (ProQuest will accept anything submitted by researcher).

ProQuest’s policy concerns for dissertations include:

  • Will the “published version” differ from the “preprint” version?
  • What permissions will need to accompany the dissertations?
  • Are institutions doing the robust check of permissions or are they expecting ProQuest to do this? ProQuest makes clear what they look at – not the content of the submission but only the forms completed by the student, and the title, the abstract and the table of contents of the dissertation.
  • What are the preservation expectations of the universities that submit dissertations to ProQuest?
  • What file types do university publishing partners expect ProQuest to commit to migrate?

Bibliographic citations enhance the value of data collections. So why not add a direct link to the data in the ProQuest online catalog?

Challenges: need for ProQuest to modify web page & get author permissions – would authors agree to provide a direct link to their data?


  • How do we protect students’ IP to ensure that they can publish and build their careers?
  • Who owns the data? – what if it is not the student, and it is a faculty member or another party?

IASSIST 2008 Conference Session: Moving Research Data Into and Out of Institutional Repositories

Robin Rice of EDINA and Edinburgh University Data Library gave an excellent overview of open access to data. She quoted Peter Buneman of the Digital Curation Centre who said, “The best way to preserve your data is to publish it!”

She also gave an overview of open data licences, including the Science Commons Open Data Protocol and the Public Domain Dedication & License (PDDL), which avoids the attribution stacking problems that may occur with Creative Commons licences.

Robin raised the following issues:

  • What are the incentives for researchers to manage and share data?
  • How to meet funders’ requirements: researchers need to define how they are going to share their data or why they cannot
  • Capacity of higher education institutions to provide services for data management?

Finally, she gave an overview of the DISC-UK DataShare project (DISC-UK is a collaboration between Southampton University, Edinburgh University, Oxford University and London School of Economics) and what it is doing in tracking the tools and guidelines available relating to open data.

Katherine McNeill from MIT Libraries, DSpace and the Harvard-MIT Data Center gave a talk concerning interoperability between MIT’s institutional repository (IR) and data repositories.

MIT has multiple locations for depositing data – DSpace, Harvard-MIT Data Center (HMDC) and ICPSR.

This presents challenges for searching, unifying collections and archiving. It is also difficult to advise faculty where to deposit their data as each location has its own advantages and disadvantages.

Therefore, there is a need for interoperability.

Opportunity: PLEDGE Project – designed to foster the use of data grid technologies, replicating content across multiple systems for the purpose of preservation.

Developed an ‘agent’ so that DSpace software and HMDC Dataverse software are interoperable. Developer of agent: Mark Diggory.

Goal: to archive, preserve and provide access in DSpace to MIT-authored studies in HMDC.


(1) Workflow for selecting and processing studies

  • Currently this is a manual process and an informal service

(2) Updating of studies

  • Currently, if content is updated in HMDC, there is no ‘flag’ to notify repository managers that it needs to be updated for DSpace
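One way such a ‘flag’ could work is to compare modification dates between the two systems. The sketch below is an assumption-laden illustration only: it does not use the DSpace or Dataverse APIs, and `StudyRecord`, `studies_needing_update` and the idea that both systems expose a last-modified date are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List


@dataclass(frozen=True)
class StudyRecord:
    """Hypothetical summary of a study as held in the source system (e.g. HMDC)."""
    study_id: str
    last_modified: datetime


def studies_needing_update(source: List[StudyRecord],
                           archived: Dict[str, datetime]) -> List[str]:
    """Return IDs of studies that are missing from the archive or changed since they were copied.

    `archived` maps study ID to the time the archive copy (e.g. in DSpace) was last refreshed.
    """
    stale = []
    for study in source:
        copied_at = archived.get(study.study_id)
        if copied_at is None or study.last_modified > copied_at:
            stale.append(study.study_id)
    return stale
```

A repository manager could run a check like this periodically and review the returned list, rather than relying on researchers to report changes.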

(3) License agreements and terms of use

  • DSpace has licensing screens which give DSpace permission to disseminate etc.
  • With the new system content is loaded in the ‘back end’ so the researcher is not actually exposed to the licensing screens
  • Repository manager has to get permission via email
  • How to deal with this in the future?
  • Further, end-users are usually notified of what they can do with data but this way it is more hidden – implications?

(4) Keeping the agent up to date

IASSIST 2008 Conference – Technology of Data: Collection, Communication, Access and Preservation

From 28th to 30th May I attended the IASSIST 2008 conference, Technology of Data: Collection, Communication, Access and Preservation at Stanford University.

I was there representing the Legal Framework for e-Research Project, which is hosted at QUT. Under this project, we have been examining and developing legal and management frameworks for data access, sharing and reuse.

I appeared to be the only lawyer or legal academic at the IASSIST conference, which was somewhat surprising considering the number of times that presenters raised legal questions or concerns in their sessions. The primary concern seemed to be how to determine ownership of data given the vast number of researchers, database managers and other interested parties that may assert ownership interests in the data. Copyright was a concern (does it attach? how do we deal with it so as to provide wide access to the data?), as was privacy. Finally, even where ownership rights could be determined, the big question was: how do we get our researchers to share? It was generally agreed that researchers are notoriously protective (overprotective?) of their data.

My thoughts on these matters were enthusiastically received. In brief, I advocated the use of Data Management Plans from the conception of a research project, which set out:

  • the different parties with an interest in the data collected by the research project;
  • who owns the data and/or who may control the data;
  • who is responsible for managing the data;
  • any legal controls applying to the data, including contractual conditions (arising in a funding agreement, employment agreement or any other agreement), copyright, confidentiality or privacy restrictions;
  • how data collected by the research project will be integrated with existing data from other sources in a way that complies with all responsibilities imposed by law;
  • how data will be disseminated;
  • how data will be attributed;
  • what uses other researchers may make of the data; and
  • data preservation and sustainability.
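For repository managers who want to capture these elements in a structured form, the list above could be recorded along the following lines. This is a minimal sketch, not a standard schema: the class and field names are hypothetical and simply mirror the bullet points.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class DataManagementPlan:
    """Hypothetical record mirroring the plan elements listed above."""
    interested_parties: List[str] = field(default_factory=list)
    owner_or_controller: str = ""
    data_manager: str = ""
    legal_controls: List[str] = field(default_factory=list)  # contracts, copyright, privacy etc.
    integration_approach: str = ""
    dissemination: str = ""
    attribution: str = ""
    permitted_uses: List[str] = field(default_factory=list)
    preservation: str = ""


def missing_elements(plan: DataManagementPlan) -> List[str]:
    """List the plan elements that have not yet been filled in."""
    return [f.name for f in fields(plan) if not getattr(plan, f.name)]
```

Checking for empty elements at the start of a project is one way to prompt the ownership and licensing conversations early, before data is collected.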

Whether privacy will be an issue will depend on the type of data collected and whether it can identify an individual. Whether copyright law will apply will depend again on the type of data collected and the jurisdiction in which the data is collected. In Australia, databases may attract copyright protection, but this is unlikely to be the case in the United States. Where data or a data compilation does attract copyright protection, licensing mechanisms can be employed to ensure wide distribution and reuse of the data. One option is applying a Creative Commons licence to the copyrightable elements of the data or database. Alternatively, Science Commons has developed an Open Data Protocol.

I have written more about the legal frameworks surrounding data here (with Professor Anne Fitzgerald).

Another pervasive concern of conference participants (most of whom were – I gathered – data librarians and database/repository managers) was obtaining accurate and reliable metadata from researchers who usually feel that they have a million better things to do than enter metadata into a computer system. This has been a problem faced by librarians dealing with theses and dissertation repositories and journal article repositories for years. Different institutions have different ways of dealing with the problem of reluctant academics. Some institutions have taken it upon themselves to enter the metadata on behalf of the academic. However, I still feel that the best approach is through consistent advocacy, education and demonstrations to show academics the enormous benefits to them of having their work easily searchable, findable and citable online.

The following posts comprise my notes from some of the conference sessions that I found most interesting. Apologies if the notes are a little rough. If you attended the conference and have any corrections, please let me know.

Congratulations to IASSIST for organising a fabulous 2008 conference and to Stanford for hosting it.