When it comes to making your dataset open access, it’s all pretty simple, isn’t it? Collect and collate the data you need to present, then make it available in an institutional or subject repository under an open licence, and that’s it. Well, not quite!
You need to be aware of how copyright and other types of legal protection work in respect of datasets, and of any steps you will need to take to ensure that your datasets can be made freely available to others whilst respecting any rights that apply to individual components or types of information contained in your dataset.
WHY DOES IT MATTER?
Protection of different types of data takes many forms. Researchers need to be aware of the legal landscape within which they produce and use research data, and of how this affects the way in which datasets can be made available for re-use by other researchers. The University recommends using the Open Data Commons Open Database Licence as a means of making your data open. This licence allows users to use, copy and distribute the database, to create other works based on the contents of the database and to modify, adapt or build on the database. (N.B. The terms “dataset” and “database” are used interchangeably in this context.) These activities are often regulated by other legal forms of protection such as copyright, database rights and the General Data Protection Regulation (GDPR). Before applying an open licence to a dataset, a researcher therefore needs to know whether any of the elements in the dataset attract protection and, if so, how this could affect the ability to make the dataset open access and what, if anything, needs to be done to ensure that the rights and protections applying to its contents are not infringed.
HOW CAN I TELL WHAT IS PROTECTED?
Research data comes in many forms. Depending on the type of project, the dataset containing the evidence underpinning the research could comprise, amongst others, observational data such as temperature measurements or body-weight recordings, computer code or software, survey data, collections of digital images, collections of newspaper articles, collections of private correspondence, transcripts or recordings of interviews, or physical artefacts such as artworks or musical compositions. Individual data elements within a dataset may enjoy protection, and collections of data can also enjoy sui generis database protection. Personal data are protected under the GDPR (General Data Protection Regulation).
Copyright law grants the rightsholder a number of exclusive rights, including the right to copy, distribute and adapt the work and to sell or license the copyright for use by others. Facts in themselves are not protected by copyright. Rather, it is the original expression that is protected, and the work in question must demonstrate “the author’s own intellectual creation”.
A wide range of outputs enjoy copyright protection including:
- Original literary, dramatic, musical or artistic works;
- Computer programs and software code;
- Databases (in addition to the separate Database Right). This only applies where the selection or arrangement is original, and the protection only applies to the structure of the database and not the contents.
- Sound recordings, films or broadcasts; and
- Typographical arrangements of published editions.
Datasets comprising third-party copyright material, for example collections of newspaper articles or posts from social media sites, cannot be made open access without first obtaining the permission of the rightsholders. Similarly, where a researcher wants to include recorded interviews in a dataset that is intended to be made available to others on an open-access basis, they would need to obtain the permission of the participants in order to do this. Data obtained from archived datasets hosted in subject or other repositories is often made available for personal use, but if the datasets are intended to be further disseminated, then the permission of the rightsholder(s) of the dataset will need to be obtained.
In the European Union, the Sui Generis Database Right (SGDR) protects both original and non-original databases. A database right can only apply where there has been substantial investment in the collection, verification and presentation of data obtained from independent sources. Effort expended in creating the data that populate the database does not automatically confer a database right. The database right lasts for 15 years from the date of creation or publication. Once a database has been made available to the public, the database right allows authorised users to extract and re-use a substantial portion of the content for specific non-commercial purposes under “fair dealing”. For some complex databases, the structure itself can be categorised as a literary work (even if its contents are of a visual nature) and attract 70 years’ copyright protection, similar to other literary material.
General Data Protection Regulation (GDPR)
Datasets arising from research in any discipline that uses personal data, such as medical science, the social sciences and the humanities, must comply with the provisions of the GDPR. Datasets containing personal data cannot be made open access, even where the data have been pseudonymised. Datasets containing fully anonymised data may be made available.
WHAT DO I NEED TO DO?
Before you start your research, you should be aware of any requirements to make your dataset open. This may be as a result of a mandate from your funding body, or in response to the University’s Research Data Management Policy. The best time to consider whether you need to obtain permission to make data available is when creating your data management plan. The plan should outline what data will be created and how, and should include details on how the data will be shared, paying attention to any rights, protections and restrictions that may need to be taken into consideration.
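One way to keep track of these considerations while drafting a data management plan is to record, for each component of the dataset, whether permission is needed and whether it contains personal data. The following is a minimal sketch in Python, not an established tool: the class, field and function names are hypothetical, and it encodes only the rules described in this article (permission for third-party material and recorded interviews, and full anonymisation of personal data).

```python
from dataclasses import dataclass


@dataclass
class Component:
    """One part of a dataset, with illustrative (hypothetical) rights fields."""
    name: str
    needs_permission: bool = False        # third-party copyright material or recorded interviews
    permission_obtained: bool = False     # rightsholders or participants have agreed to open release
    contains_personal_data: bool = False
    fully_anonymised: bool = False        # pseudonymised data does not count


def blockers(component: Component) -> list[str]:
    """Return the reasons this component cannot yet be made open access."""
    issues = []
    if component.needs_permission and not component.permission_obtained:
        issues.append("permission not yet obtained")
    if component.contains_personal_data and not component.fully_anonymised:
        issues.append("personal data not fully anonymised")
    return issues


def release_report(components: list[Component]) -> dict[str, list[str]]:
    """Map each blocked component name to its outstanding issues."""
    report = {}
    for component in components:
        issues = blockers(component)
        if issues:
            report[component.name] = issues
    return report


# Example inventory for an imaginary project.
inventory = [
    Component("temperature measurements"),
    Component("newspaper articles", needs_permission=True),
    Component(
        "interview transcripts",
        needs_permission=True,
        permission_obtained=True,
        contains_personal_data=True,
    ),
]

report = release_report(inventory)
for name, issues in report.items():
    print(f"{name}: {', '.join(issues)}")
```

An empty report would suggest that, on the criteria recorded here, nothing stands in the way of applying an open licence. Any real project would still need to check each component against the actual rights and protections that apply to it.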
Mary Mowat, Copyright Officer, University of Aberdeen