Open Access Day

Today, 15 October 2008, is Open Access Day!

Today I attended an OA Day event in the QUT Library, which was sponsored by SPARC, PLoS and Students for Free Culture.

First, we watched the “Voices of Open Access” video, which is available on the Open Access Day website, and the QUT Library Secretariat “Shout Out” for OA video.

We then had some presentations and discussions, moderated by Elizabeth Stark.

Peter Jerram, CEO of PLoS, gave a short introduction. He stated that there are now:

  • Over 3600 journals in the Directory of Open Access Journals (DOAJ)
  • More than 12,000 OA repositories in more than 70 countries
  • More than 50 mandates for OA in 28 countries

He also gave his thanks to:

  • Authors who choose to publish in OA
  • Peter Suber
  • Melissa Hagemann, OSI
  • DOAJ
  • Publishers and editors of OA journals
  • Research funders such as the Wellcome Trust that provide funds for OA journals
  • SPARC and Students for Free Culture
  • Advocates of OA

Dr Phil Bourne, Editor in Chief of PLoS Computational Biology, who was presenting from the University of California, San Diego, gave the keynote presentation. The webcast can be accessed at http://openaccessday.org/program/

Presentation: The Promise of Open Access

SciVee
mash up of academic content
e.g. Pubcast – video integrated with the full text of the paper – but this requires openness in relation to the paper i.e. unrestricted access, Creative Commons licence
e.g. Professional Profile includes all sorts of content: publications, pubcasts and videos etc – profiles are a first step to virtual research environments

BioLit: Tools for new modes of scientific dissemination
http://biolit.ucsd.edu
Mash up between database and journal article
Integrates biological literature and biological databases, and includes:

  • A database of journal text
  • Authoring tools to facilitate database storage of journal text
  • Tools to make static figures and tables interactive

Semantic enrichment of text
Semantic enrichment at the point of authoring – like the spell checker in Word – scans for specific information or words (e.g. the name of a gene) and goes out and retrieves information. The information appears in a column to the side of the paper, and the author can choose whether to link to it or not.
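To make the idea concrete for myself, here is a minimal sketch of how such a tool might work. It is not the actual Word plug-in described in the talk: the term table, the example.org links and the function names are all my own assumptions, and a real tool would query a biological database rather than a hard-coded dictionary.

```python
import re
from dataclasses import dataclass

# Hypothetical stand-in for a database lookup: term -> link to a record.
# A real tool would retrieve this information from a biological database.
KNOWN_TERMS = {
    "BRCA1": "https://example.org/gene/BRCA1",
    "TP53": "https://example.org/gene/TP53",
}


@dataclass
class Suggestion:
    term: str   # the recognised word, e.g. a gene name
    start: int  # character offset where the term starts in the text
    end: int    # character offset just past the end of the term
    link: str   # the record the author may choose to link to


def suggest_links(text, known_terms):
    """Scan the text for known terms and return one suggestion per match."""
    if not known_terms:
        return []
    alternatives = "|".join(re.escape(term) for term in known_terms)
    pattern = re.compile(r"\b(" + alternatives + r")\b")
    return [
        Suggestion(m.group(1), m.start(), m.end(), known_terms[m.group(1)])
        for m in pattern.finditer(text)
    ]


def apply_accepted(text, suggestions, accepted_terms):
    """Insert links only for the terms the author chose to accept."""
    # Work from the end of the text so earlier offsets stay valid.
    for s in sorted(suggestions, key=lambda s: s.start, reverse=True):
        if s.term in accepted_terms:
            text = text[:s.start] + f"{s.term} <{s.link}>" + text[s.end:]
    return text
```

The two steps are kept separate on purpose: `suggest_links` only proposes (the "column to the side of the paper"), and nothing changes in the text until the author accepts a term and `apply_accepted` is called.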

Questions:

Q: How does peer review fit into the new multi-media environment?
A: It is a misconception that peer review does not fit into the OA environment
For Pubcast, the paper associated with the video has already been peer reviewed.

Q: Is there a plug-in for the semantic enrichment tool for OpenOffice or other platforms that are not Word?
A: Not yet, but probably coming. Will be open source and people can do what they like with it. No restraints imposed by Microsoft.

ARROW Repository Day

On 14 October 2008, I attended the ARROW Repository Day held in Customs House in Brisbane. I presented on the legal issues surrounding management of data for inclusion in a repository. You can access my slides here.

Chris Rusbridge of the Digital Curation Centre in the UK also presented. Some brief notes from his talk are below. Chris was live blogging the day, so if you are interested I suggest you read his notes at the Digital Curation Blog.

Chris Rusbridge (Digital Curation Centre) – Moving the repository upstream

The resistant scholar

  • Uncertainty, risk – about copyright; about Ingelfinger Rule
  • Change
  • Too busy
  • Doesn’t fit into the way they do things now
  • Not well motivated by advantages to others
  • Little in it for them!

Research workflow

  • many different tasks in parallel
  • all different stages
  • teaching (several), research (several), writing up research, writing grant proposals, reviewing papers, administrative tasks etc

On negative clicks

Asked – how many extra clicks are you willing to make to ensure preservation of your record?

Answer – zero

Negative click repository?

Can the repository help rather than hinder?
Towards a Research Repository System? [diagram]

Maybe we could…

  • help with publisher liaison
  • support multiple authoring across several institutions
  • more permissive identity management
  • support multiple versions
  • fine grained access control
  • checkpointing
  • support supplementary data
  • provide basic data management capability
  • provide simple, cross-platform, persistent storage
  • provide some longevity
  • provide additional benefits

More on the Brisbane Declaration

This is what Professor Arthur Sale of the University of Tasmania, one of the chief architects of the Brisbane Declaration, has written about it:

…May I tease out a few strands of the Brisbane Declaration for
readers of the list, as a person who was at the OAR Conference in
Brisbane.

1. The Declaration was adopted on the voices at the Conference,
revised in line with comments, and then participants were asked to put
their names to it post-conference. It represents an overwhelming
consensus of the active members of the repository community in
Australia.

2. The Conference wanted a succinct statement that could be used to
explain to senior university administrators, ministers, and the public
as to what Australia should do about making its research accessible.
It is not a policy, as it does not mention any of the exceptions and
legalisms that are inevitably needed in a formal policy.

3. The Conference wanted to support the two Australian Ministers with
responsibility for Innovation, Science and Health in their moves to
make open access mandatory for all Australian-funded research.

4. Note in passing that the Declaration is not restricted to
peer-reviewed articles, but looks forward to sharing of research data
and knowledge (in the humanities and arts).

5. At the same time, it was widely recognized that publishers’ pdfs
(“Versions of Record”) were not the preferred version of an article to
hold in a repository, primarily because a pdf is a print-based concept
which loses a lot of convenience and information for harvesting, but
also in recognition of the formatting work of journal editors (which
should never change the essence of an article). The Declaration
explicitly make it clear that it is the final draft (“Accepted
Manuscript”) which is preferred. The “Version of Record” remains the
citable object.

6. The Declaration also endorses author self-archiving of the final
draft at the time of acceptance, implying the ID/OA policy (Immediate
Deposit, OA when possible).

While the Brisbane Declaration is aimed squarely at Australian
research, I believe that it offers a model for other countries. It
does not talk in pieties, but in terms of action. It is capable of
implementation in one year throughout Australia. Point 1 is written so
as to include citizens from anywhere in the world, in the hope of
reciprocity. The only important thing missing is a timescale, and
that’s because we believe Australia stands at a cusp..

What are the chances of a matching declaration in other countries?

Arthur Sale
University of Tasmania

This is what Peter Suber had to say on his blog:

This is not the first call for OA to publicly-funded research. But I particularly like the way it links that call to (1) OA repositories at universities, (2) national research monitoring programs, like the HERDC, and (3) the value of early deposits. Kudos to all involved.

Just announced: Brisbane Declaration [on open access in Australia]

Following the conference on Open Access and Research held in September in Australia, and hosted by Queensland University of Technology, the following statement was developed and has the endorsement of over sixty participants.

Brisbane Declaration

Preamble
The participants recognise Open Access as a strategic enabling activity, on which research and inquiry will rely at international, national, university, group and individual levels.

Strategies
Therefore the participants resolve the following as a summary of the basic strategies that Australia must adopt:

  1. Every citizen should have free open access to publicly funded research, data and knowledge.
  2. Every Australian university should have access to a digital repository to store its research outputs for this purpose.
  3. As a minimum, this repository should contain all materials reported in the Higher Education Research Data Collection (HERDC).
  4. The deposit of materials should take place as soon as possible, and in the case of published research articles should be of the author’s final draft at the time of acceptance so as to maximize open access to the material.

Brisbane, September, 2008

ANDS Workshop at eResearch Australasia Conference

On Thursday 2 October, I attended the Australian National Data Service (ANDS) Workshop at the eResearch Australasia Conference 2008. This was a full day workshop, but the ANDS team did a great job of keeping the workshop interesting and highly interactive, and the day went very quickly.

In the morning, there were a few brief presentations – notably from Andrew Treloar of Monash University and the ANDS Establishment Project and Tracey Hind from CSIRO. I particularly enjoyed Tracey’s presentation, which, at a conference that seemed dominated by IT issues, focused on the social issues and the governance issues involved in data management and sharing research data. My notes from Tracey’s talk are below.

The rest of the day was spent in small round-table discussions. The most lively discussion surrounded questions about what institutions and research bodies need to help them in managing and sharing their data, and how ANDS could help. The group found that there was a need for:

  • an openly accessible registry of ontologies for metadata of datasets, so that institutions can start using common and enduring metadata to describe their data;
  • training for researchers, repository managers, research management staff, librarians, archivists and IT staff about data management (including the legal issues surrounding data management), database/repository infrastructure (how to make the database easy to use and sustainable), open access (why should you share your data?) and metadata. It was agreed that the training materials might have a generic introduction component that could be used by all groups, but then there should be different kinds of training materials that provide relevant detail to different groups (e.g. research management staff will have different concerns to IT staff; science researchers may have different concerns to humanities researchers);
  • developing conventions for the citation of data, so that researchers can get credit for sharing their data; and
  • proper and comprehensive data management plans (DMP).

There was a consensus that data management plans were particularly important and that it would be useful to develop template DMPs which included specific sections that could be added or deleted as appropriate (for example, a section about compliance with privacy laws might be relevant to medical research but not to astronomy research). It was also thought that ANDS could select a few research projects from different disciplines and assist these projects in formulating a DMP. The resulting DMPs could then be made available online for other projects to use and adapt.
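As a rough illustration of what a template DMP with optional sections could look like, here is a small sketch. The section titles and the `build_dmp` helper are purely my own assumptions for illustration; they are not an ANDS template and nothing like this was shown at the workshop.

```python
# Hypothetical section titles, chosen only for illustration.
CORE_SECTIONS = [
    "Data description",
    "Storage and backup",
    "Access and sharing",
]

# Sections that are added or left out depending on the discipline.
OPTIONAL_SECTIONS = {
    "privacy": "Compliance with privacy laws",
    "ethics": "Ethics approvals",
}


def build_dmp(project, optional_keys=()):
    """Return a plain-text DMP outline: core sections plus chosen extras."""
    unknown = [key for key in optional_keys if key not in OPTIONAL_SECTIONS]
    if unknown:
        raise KeyError(f"Unknown optional sections: {unknown}")
    sections = CORE_SECTIONS + [OPTIONAL_SECTIONS[key] for key in optional_keys]
    lines = [f"Data Management Plan: {project}"]
    lines += [f"{number}. {title}" for number, title in enumerate(sections, 1)]
    return "\n".join(lines)
```

A medical research project might call `build_dmp("Medical study", ["privacy"])`, while an astronomy project would simply call `build_dmp("Sky survey")` and get the core sections only.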

In relation to ANDS selecting particular projects to assist, in a broader way, with their data management and release (“engagement targets”) in the hope that these projects might then appear as “exemplar projects” for other groups, it was considered that appropriate selection criteria might be:

  • broadness of audience and impact;
  • potential for reuse of data and the ongoing reusability/sustainability of the data;
  • the project’s willingness to assist others to develop their data management skills;
  • wide inter-disciplinary appeal;
  • willingness to transfer data around; and
  • projects which will have good exemplary value to attract other communities.

I believe that ANDS will make the notes taken from the workshop available online.

Here are my notes from Tracey’s talk:

Tracey Hind – CSIRO

  • ownership of data should stay with researcher
  • but still need to manage CSIRO’s data at a higher level – maybe provide an “enabling” service for this rather than dictate a “one size fits all” approach
  • As of now, CSIRO still does not formally recognise the idea of data management
  • Real challenges are not technology – it is the human factors – issues of acceptance, understanding, people being prepared to share their data, IP etc
  • High demand for storage, but storage is not management
  • Scientists are not working as well across disciplines as the Flagship vision had hoped; much of this is because “you don’t know what you don’t know” – and it’s hard getting insight into other research disciplines
  • Making data easily discoverable is the key to achieving multi-disciplinary outcomes
  • Lesson is that data is a complex issue – especially when researchers don’t understand the potential benefits – you need exemplar projects to demonstrate the benefits of data management to get buy in.
  • CSIRO’s data management vision (eSIM) – CSIRO scientists will be able to…gather, analyse and share scientific information securely and efficiently, leading to greater scientific outcomes for Australia
  • Four layers – people, processes, technology and governance
  • People challenges = incentives for deposit into a repository;
  • Processes challenges = making sure that the workflows created actually support the technology and make things easy
  • Governance = making sure all of this is properly funded and that data management is a part of the decision making (i.e. make sure researchers have a DMP before they are awarded funding)
  • CSIRO’s exemplar projects = Auscope project; Atlas of Living Australia; Corporate Communications

eResearch Australasia Conference 2008 – Tuesday morning (30 September)

John Wilbanks – Uncommon Knowledge and e-Research

Once again, John Wilbanks gave an informative and dynamic presentation. It was geared towards the audience in attendance here at the eResearch Australasia Conference (who are somewhat more IT and science focused than the audience at the OAR conference last week) and so described in detail many aspects of the NeuroCommons Project. If you are interested, I suggest that you see the Neurocommons website. I don’t think any summary that I could provide here would do the project justice. But here are some notes from the beginning of John’s presentation:

Why “eResearch”?

1. eResearch is a requirement imposed on us by the flood of data

  • the web doesn’t give us the same results for science as it does for culture
  • so what can we do?
  • We can…collaborate
  • E.g. Watson and Crick – their success came from building on a series of blocks of knowledge that were available to them from a range of sources
  • But humans can’t build models to scale anymore
  • We need to utilize digital resources

One way to think about eResearch is that it is about:

  • Finding the right collaborator;
  • making big discoveries;
  • getting credit for one’s work

2. We need to convert what we know into digital formats that support model building

  • “the web” – no organising topics – hyperlinking allows us to organise things in a dynamic way
  • all the data and all the ideas: building blocks
  • open access attempts to solve the legal problems – giving credit where credit is due; allows humans to read the papers; allows publicly funded research to be accessed by the public
  • but it doesn’t solve the technical problem of paper-based formats that cannot be read by machines
  • we need to develop machine-searchable formats

Kerstin Lehnert, Columbia University – New Science Communities for Cyberinfrastructure: The Example of Geochemistry

Kerstin described eResearch as a vision to provide a genuine infrastructure of highly reliable, widely accessible ICT capabilities to assist researchers in their work – ultimately about people

She discussed the cultural issues involved in sharing data. She identified data citation (what I would call “attribution”) as a big problem. How can all scientists and contributors be cited? Many want to be attributed personally (not just by a project), but there are so many contributors and this quickly becomes a big and messy problem. This observation reflects the problem that we at the OAK Law and Legal Framework to eResearch Projects identified in assessing whether Creative Commons licences could be applied to data compilations. Attribution is an important condition of the CC licence. Researchers and research projects need to decide and identify (before applying a CC licence) how the data compilation is to be attributed, otherwise users could run into all sorts of problems and confusion.

Jane Hunter (UQ) – National Committee for Data in Science (NCDS)

A committee of the Australian Academy of Science – established in February 2008; member of CODATA

Mission – to promote enduring access to Australia’s scientific data assets in order to drive national research and innovation
And to provide a National Data Science voice
Encourage and facilitate cross-fertilisation between specific science disciplines and other data generation/management disciplines

Future activities include engaging with Chairs of other national committees, including looking at what role they can play within ANDS (Australian National Data Service) to support their goals.

Review: Anatomy Titus Fall of Rome

On Thursday 25 September, I saw The Bell Shakespeare Company’s production, “Anatomy Titus Fall of Rome” at the Cremorne Theatre. The play was directed by Michael Gow and starred John Bell as Titus Andronicus.

I was very impressed with this production. It was contemporary (all actors performed in regular clothes and sometimes wore rather absurd masks) and powerful. I wasn’t quite sure how they were going to depict what is probably Shakespeare’s bloodiest tragedy, and in the end they did it with a lot of blood – a bucket of “blood” centre-stage, to be exact, which the actors flung all over the stage during the course of the production.

The actors did a wonderful job and carried the audience through the entire 2.5 hours without pause and without a hitch. The intermingling of comedy throughout the tragedy certainly helped.

The parts I liked best were where modern objects and references were woven amongst the Shakespearian ones – books (I think all were actually copies of Shakespeare’s works) were used as weapons and the actors’ monologues frequently featured random modern words thrown in as if to keep the audience on their toes.

However, my favourite part was after the play itself, when the actors took some time to talk directly with the audience. This was a wonderful thing for them to do and it resulted in some very interesting discussion. Importantly, we discussed why a play that featured a prominent black character and the violent raping and torturing of a young woman was performed entirely by a white male cast. Several female members of the audience expressed the feeling that they would not have been able to watch the rape scene had it been performed with a female actor, and were consequently glad that a man had played the part. I actually thought the absence of both a dark-skinned actor and a female actor only served to vividly (and almost shockingly) reveal to the audience the racist and sexist undertones in Titus Andronicus, and indeed, in much of the world still today. I was impressed with the way the cast discussed these issues with the audience – they proved to be intelligent and sensitive to the issues. (However, it did not change the fact that the actors could only ever act out their interpretation, as a white male, of what it was like to be a woman or a black man.)

I would highly recommend seeing this production before it closes on 4 October.

eResearch Australasia Conference 2008 – Cloud Computing

Monday – Plenary: Cloud Infrastructure Services Panel Session

Chair: Nick Tate, UQ
Tony Hey – Microsoft Research
Peter Elford – Cisco
Kevin Mayo – Sun Microsystems
Anne Fitzgerald – QUT

Tony – A Digital Data Deluge in Research

– outsourcing of IT infrastructure
– minimize costs
– small businesses have access to large scale resources
– eg – Virtual Research Environment run by British Library: content management; knowledge management; social networking; online collaboration tools
[similar presentation to the one at the OAR conference]

Peter –

– is cloud computing really a new idea?
– don’t think so – still just software as a service
– so what is the “cloud”?
– do researchers struggle to get access to machines? – probably no
– but do they have problems managing them well – probably yes
– balance between technology, people and processes
– it is a natural evolution and another opportunity
– but not a disruptive technology

Kevin –

From point of view of building these systems:
– need a successful business model
– need to consider privacy and security in a global world
– need to understand technical considerations
– there are a number of services out there at the moment because they have managed to deal with the business model problems….
– …but they may not have effectively dealt with the other issues
– e.g. how you get your data to and from the service
– in the future – we might see: automating the collection and analysis of census data; climate data etc – with barely any interference by people

Anne –
– when we think of cloud computing, many legal issues come to mind: privacy, data security etc
– so far, adapting the law to the digital environment has developed in a very ad hoc manner
– so maybe we would be better to approach it from principles; I propose the following principles:

1. establishing trust in the online environment
– cloud computing = applications that can be accessed anywhere by anyone
– so issues of data security, privacy, reliability of the data and the service
– not much on this (beyond some privacy restrictions) in Australia at the moment

2. equivalence of traditional and online transactions
– need a set of rules to apply to online activities that are equivalent to traditional activities
– at the moment, attempt to transpose current laws in online environment = copyright, electronic transactions act
– but when we look at cloud computing we see this principle is not being applied in a consistent way
– need for clarification of concepts of ownership of data stored on someone else’s equipment
– vast difference between copyright licence given to Google for Google Docs – vs rights that would be given to someone in the real world who is storing and managing someone else’s documents (i.e. they would be given virtually no rights) – why the immense difference just because the storage and management occurs online?

3. Participation of Government in regulating online activities
– would enactment of legislation help or hinder here?

4. We need openness in this environment
– open standards and maybe also open source
– affordability of cloud computing can help to overcome the digital divide
– expectation of users is that they can access the service where and when they like

Development of laws and policies in this environment has occurred primarily at an international level (e.g. OECD – Seoul Declaration), but there is still no international body charged with regulating online commerce

Questions:

Q: Ashley Buckle – Monash: not convinced that this is a solution for him running a small research lab – this is the problem: convincing people that this is for them, especially when they don’t want to be guinea pigs for new projects that may not work

A: Tony – you can only be convinced by something that works for you. There will be a variety of academic cloud services. But the real test is that it is easy to use, can be acquired easily and cheaply, and it should work for you and if it doesn’t work then you shouldn’t use it.

Q: If Microsoft and Google etc operate cloud computing services outside of the USA, does the Patriot Act still apply to them?

A: Not an expert on the Patriot Act, but – we need to establish a uniformity or conformity throughout the world, after discussion among countries, and not just have one country’s law dominate, otherwise this could actually be a barrier to trade etc.

eResearch Australasia Conference 2008

I am currently in Melbourne for the week, attending the eResearch Australasia Conference 2008, hosted by the Australian Government Department of Innovation, Industry, Science and Research (DIISR) at the Sebel and Citigate Hotels, Albert Park. The conference runs from Monday 29 September – Wednesday 1 October, then there are two days of workshops on Thursday 2 and Friday 3 October. I will be here until Friday. I will try to blog my notes as I go (subject to internet availability) and I will post my overall comments at the end.