About data citation
Providing a citation for a scientific dataset is beneficial to both the original data collector and those who wish to reuse the data in the future. Data citation is relatively new (compared to the citation of scientific publications) and best practice is still evolving, but a number of groups have created excellent guidance to follow.
The following web links provide this guidance and have been picked to suit different levels of interest and expertise:
- Basic: A briefing paper created by the UK Digital Curation Centre suitable for anyone wanting a general overview of data citation
- Intermediate: A working guide on how to cite datasets created by the UK Digital Curation Centre suitable for both researchers and data managers
- Advanced: A data citation working group under the Research Data Alliance that is suitable for those who want to be involved in the development of data citation best practice
In most cases, creating a dataset citation results in the publication of the dataset, even if only through a local repository. Publication doesn’t in itself prove that the data are of good quality or, perhaps more importantly, useful to the scientific community. Scientific publications address this through peer review, and it is becoming more common for datasets to be reviewed in a similar way. Data journals now exist for many scientific disciplines, and many traditional publishers are introducing submission policies specifically for data. Anyone wishing to keep up with current developments in data publication should follow the activities of the Research Data Alliance Publishing Data Interest Group.
The UKRI Open Access policy states that all publications acknowledging funding from UKRI or any of its councils must have a data access statement that tells readers where the underlying research materials associated with a paper are available and how they can be accessed. Research materials include data.
The statement should include:
• where the data can be accessed (i.e. which data repository);
• a unique persistent identifier, such as a Digital Object Identifier (DOI) or accession number, or a link to a permanent record for the data;
• details of any restrictions on accessing the data and a justifiable explanation (e.g. for legal, ethical or commercial reasons).
A dataset citation will cover these requirements.
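As an illustration only, the three elements of a data access statement listed above can be assembled programmatically. The sketch below is not part of the UKRI policy or any official tool: the class name `DataAccessStatement`, its fields, the sentence wording, and the example repository and identifier are all hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataAccessStatement:
    """Hypothetical container for the three elements listed above."""

    repository: str                     # where the data can be accessed
    identifier: str                     # DOI, accession number, or permanent link
    restrictions: Optional[str] = None  # any access restrictions, with justification

    def render(self) -> str:
        # The first two elements are always required, so reject blank values.
        for name in ("repository", "identifier"):
            if not getattr(self, name).strip():
                raise ValueError(f"{name} must not be empty")
        text = (
            "The data supporting this publication are available from "
            f"{self.repository} at {self.identifier}."
        )
        # Restrictions are only stated when there are any to report.
        if self.restrictions and self.restrictions.strip():
            text += f" Access restrictions: {self.restrictions.strip()}"
        return text


# Placeholder values for illustration; this is not a real repository or DOI.
example = DataAccessStatement(
    repository="Example Data Repository",
    identifier="https://doi.org/10.xxxx/example",
    restrictions="Records are embargoed for ethical reasons.",
)
statement = example.render()
print(statement)
```

The wording a publisher or funder expects may differ, so the generated sentence should be treated as a starting point and checked against the relevant journal and repository guidance.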