On Thursday 2 September, I attended the Australian National Data Service (ANDS) Workshop at the eResearch Australasia Conference 2008. This was a full-day workshop, but the ANDS team did a great job of keeping it interesting and highly interactive, and the day went very quickly.
In the morning, there were a few brief presentations – notably from Andrew Treloar of Monash University and the ANDS Establishment Project, and Tracey Hind from CSIRO. I particularly enjoyed Tracey’s presentation, which, at a conference that seemed dominated by IT issues, focused on the social and governance issues involved in managing and sharing research data. My notes from Tracey’s talk are below.
The rest of the day was spent in small round-table discussions. The liveliest discussion concerned what institutions and research bodies need to help them manage and share their data, and how ANDS could help. The group found that there was a need for:
- an openly accessible registry of ontologies for metadata of datasets, so that institutions can start using common and enduring metadata to describe their data;
- training for researchers, repository managers, research management staff, librarians, archivists and IT staff about data management (including the legal issues surrounding data management), database/repository infrastructure (how to make the database easy to use and sustainable), open access (why should you share your data?) and metadata. It was agreed that the training materials might have a generic introductory component that could be used by all groups, followed by different kinds of training materials that provide relevant detail to different groups (e.g. research management staff will have different concerns from IT staff; science researchers may have different concerns from humanities researchers);
- developing conventions for the citation of data, so that researchers can get credit for sharing their data; and
- proper and comprehensive data management plans (DMPs).
There was a consensus that data management plans were particularly important and that it would be useful to develop template DMPs which included specific sections that could be added or deleted as appropriate (for example, a section about compliance with privacy laws might be relevant to medical research but not to astronomy research). It was also thought that ANDS could select a few research projects from different disciplines and assist these projects in formulating a DMP. The resulting DMPs could then be made available online for other projects to use and adapt.
The group also discussed how ANDS might select particular projects to assist more broadly with their data management and release (“engagement targets”), in the hope that these projects would then serve as “exemplar projects” for other groups. It was considered that appropriate selection criteria might be:
- broadness of audience and impact;
- potential for reuse of data and the ongoing reusability/sustainability of the data;
- the project’s willingness to assist others to develop their data management skills;
- wide inter-disciplinary appeal;
- willingness to transfer data around; and
- projects which will have good exemplary value to attract other communities.
I believe that ANDS will make the notes taken from the workshop available online.
Here are my notes from Tracey’s talk:
Tracey Hind – CSIRO
- ownership of data should stay with researcher
- but still need to manage CSIRO’s data at a higher level – maybe provide an “enabling” service for this rather than dictate a “one size fits all” approach
- As of now, CSIRO still does not formally recognise the idea of data management
- The real challenges are not technology but the human factors – issues of acceptance, understanding, people being prepared to share their data, IP, etc.
- High demand for storage, but storage is not management
- Scientists are not working as well across disciplines as the Flagship vision had hoped; much of this is because “you don’t know what you don’t know” – and it’s hard getting insight into other research disciplines
- Making data easily discoverable is the key to achieving multi-disciplinary outcomes
- Lesson is that data is a complex issue – especially when researchers don’t understand the potential benefits – you need exemplar projects to demonstrate the benefits of data management to get buy-in.
- CSIRO’s data management vision (eSIM) – CSIRO scientists will be able to…gather, analyse and share scientific information securely and efficiently, leading to greater scientific outcomes for Australia
- Four layers – people, processes, technology and governance
- People challenges = incentives for deposit into a repository;
- Process challenges = making sure that the workflows created actually support the technology and make things easy
- Governance = making sure all of this is properly funded and that data management is a part of the decision making (i.e. make sure researchers have a DMP before they are awarded funding)
- CSIRO’s exemplar projects = AuScope project; Atlas of Living Australia; Corporate Communications