Selecting Data for Preservation


For data archiving to be a meaningful and cost-effective activity, it is important to be selective when deciding which data to keep. Data selection is a subjective process and will depend on the nature of the project and the types of data produced. Consider the following when deciding which data to keep:

  • Any relevant institutional or legal requirements (Does your funding body have a data policy that specifies a retention period for the project’s data? Is the data affected by legislation such as Data Protection, Freedom of Information or copyright?)
  • The scientific or historical value of the data (Is the data vital to your project? Has it been reused in subsequent projects or research? Can the data be replicated or re-measured without considerable cost or new external funding?)
  • Uniqueness of the data (Does it duplicate existing work or is it unique? Do other copies exist elsewhere, and if so will they be preserved?)
  • Potential for reuse of the data (Are there any intellectual property rights (IPR) issues relating to sharing or reuse of the data? Are human subjects involved and was consent given for archiving or reusing the dataset? Is the dataset in a format that allows others to reuse it without cost or other restrictions?)
  • Cost-effectiveness of the data storage (How much will long-term storage of the data cost? Have you secured funds to cover the storage costs?)
  • Documentation of the data (Is there a data dictionary explaining things such as field names and the context of the data? Is there sufficient documentation to allow the data to be found wherever it is stored?)

(Based on information from the University of Cambridge)

These criteria will be used when assessing data to be submitted to Royal Holloway's Figshare.