Gather your data files, documentation, and any information necessary to reuse your dataset. You may choose to provide processed data, ‘raw’ unprocessed data or both, as well as the scripts, code or software needed to reanalyze your data.
You may choose to upload a version of your analysis scripts alongside your data, but we recommend that you deposit source code or software in purpose-built repositories such as GitHub, GitLab or Bitbucket. You can link directly to these other outputs from your FRDR metadata and reference them in your README.
When you deposit your data in FRDR, your file structure (how you have arranged your data into directories or folders) is retained. Consider arranging these files by type, date, or analysis to make them easier to understand. For example:
Example a)
├── Code
│ ├── process_raw_data.r
│ ├── analysis_1.r
│ └── analysis_2.r
├── Data
│ ├── Raw_data
│ │ ├── file_a.raw
│ │ └── file_b.raw
│ └── Processed_data
│   ├── file_a.csv
│   └── file_b.csv
├── Outputs
│ ├── Figures
│ └── Models
└── README.txt
Example b)
├── Documentation
│ ├── site_information.csv
│ ├── site_1.shp
│ └── site_2.shp
├── Data
│ ├── year_01
│ │ ├── site_1.csv
│ │ └── site_2.csv
│ └── year_02
│   ├── site_1.csv
│   └── site_2.csv
└── README.txt
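A folder layout like Example a) can be set up with a short script before data collection starts. The following is a minimal Python sketch, not FRDR tooling: the function name `create_layout`, the `LAYOUT` list and the placeholder README text are illustrative assumptions, and the folder names are simply copied from Example a).

```python
from pathlib import Path

# Folder names taken from Example a); adjust them to suit your project.
LAYOUT = [
    "Code",
    "Data/Raw_data",
    "Data/Processed_data",
    "Outputs/Figures",
    "Outputs/Models",
]


def create_layout(root):
    """Create the folders in LAYOUT under root and return their paths.

    An existing README.txt is never overwritten.
    """
    root = Path(root)
    created = []
    for relative in LAYOUT:
        folder = root / relative
        # parents=True also creates root and any intermediate folders.
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)

    readme = root / "README.txt"
    if not readme.exists():
        # Placeholder text only (an assumption, not the FRDR README template).
        readme.write_text(
            "Describe the dataset, folder layout and file naming convention here.\n",
            encoding="utf-8",
        )
    return created
```

Calling `create_layout("my_dataset")` (a hypothetical folder name) would create the five folders and a placeholder README.txt inside `my_dataset`. Running it again is harmless, because existing folders and an existing README are left untouched.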
Name your files in a logical and descriptive way, so that you and other researchers can understand them at a glance. Keep file names brief, and consider including information about the project, content, date or version number as part of the filename. Use alphanumeric characters, with underscores to separate the parts of a name as in the examples below, and avoid spaces and special characters (such as % ^ & * ’). Describe your naming convention in your README.
Example: StanleyPark_Temperatures_20200801.csv
Example: AnalysisPoem_IV05_v03.txt
For further advice, see UBC’s File Naming Conventions.
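The naming advice above can be checked automatically before deposit. The sketch below is one possible interpretation rather than an FRDR rule: it assumes that letters, digits, underscores, hyphens and periods are the only safe characters, and the 64-character limit (`MAX_LENGTH`) and the function name `check_filename` are hypothetical choices made for illustration.

```python
import re

# Assumption: anything other than letters, digits, '.', '_' and '-' is flagged.
UNSAFE = re.compile(r"[^A-Za-z0-9._-]")

# Hypothetical limit standing in for "keep file names brief".
MAX_LENGTH = 64


def check_filename(name):
    """Return a list of problems found in a file name (empty if none)."""
    problems = []

    # Collect each offending character once, including spaces.
    bad = sorted(set(UNSAFE.findall(name)))
    if bad:
        shown = ", ".join(repr(ch) for ch in bad)
        problems.append(f"avoid these characters: {shown}")

    if len(name) > MAX_LENGTH:
        problems.append(f"name is longer than {MAX_LENGTH} characters")

    return problems
```

Both example names above, `StanleyPark_Temperatures_20200801.csv` and `AnalysisPoem_IV05_v03.txt`, pass this check with an empty list, while a name such as `my data%.csv` is flagged for the space and the percent sign.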
Open, non-proprietary file formats are preferred for long-term preservation, but sometimes sharing proprietary file formats is necessary for the reuse of data. Consider the needs of the future researcher when deciding what types of files to deposit:
What file formats are widely used in your field? Is it likely that other researchers will have access to the software necessary to open your files?
If you transform your files into open formats for deposit, will any information be lost?
FRDR is able to accept and ensure bit-level preservation for a variety of file formats, and will work with you to retain your data in the most appropriate format. However, we highly recommend the following preservation-friendly file formats:
For more information on preservation formats, see the guidelines from the UK Data Service, Cornell, and the University of Edinburgh.
Data will only be useful (and beneficial) in the long term if they are thoroughly described. To ensure your data are interpreted correctly, it is important to include a codebook and/or a README file with your data, and to document your data collection methods. For this reason, an FRDR curator will ask that a README file be provided with your submission. You may use the FRDR README template.
Tips for writing READMEs:
Further guidance is available in UBC’s ‘Quick Guide: Creating a README for your dataset’ and Cornell University’s ‘Guide to writing "readme" style metadata’.
Please look your dataset over before you submit it for review. Some things you may wish to consider:
FRDR curators will work with you to review your data at the time of submission to help ensure the quality of the metadata in the repository and to improve the findability and accessibility of your data. Curators are typically librarians employed by research institutions around the country. They may also be data managers embedded within research groups who have agreed to work with CARL Portage, and have been granted special permissions in the FRDR system.
Have you obtained data or code from a third party who may hold copyright or intellectual property rights that would prevent you from re-distributing them?
If you used secondary data in your research, you will need to confirm you have permission to re-publish these data in FRDR before your submission can be approved by a curator. Uncertain if you need permission? Data that were made freely available for research purposes are not necessarily ‘free.’ Ask yourself:
Were you required to log in to a website to download the data?
Did you agree to any specific terms of use, sign a data use agreement, or reach an understanding with the data provider that would prevent you from publishing these data in FRDR?
If the data are readily available from another source and you have not manipulated or edited the datasets for your research, please consider linking to the original source rather than re-publishing. You may do so using the ‘related identifier’ field when you deposit your data. Please also include full citations for any data or software you reused for your study in your README file.
Please be aware that we are unable to provide restricted access to data at this time. Although we can set an embargo to protect your data from download in the short term, all data deposited into FRDR under current terms will eventually be made publicly available. Please confirm that you can share your data, and that appropriate steps have been taken to process and anonymize those data where necessary. You may need to consult your participant consent forms or other documentation to confirm that publishing data in FRDR will not violate the terms under which you collected your data.
Some common types of restricted data are:
If your research involves human participants or contains human biological material, please confirm that you have consent to share your data, and prepare your data in compliance with any applicable legal or ethical guidelines. Learn more about potential restrictions and advice for processing human participant data for sharing in this helpful guide: Can I Share My Data? If you need to anonymize or de-identify your data for deposit, please see the following De-identification Guidance.
Indigenous community leaders are in the best position to assess the benefits and risks of sharing Indigenous knowledge, as well as data collected from Indigenous people, Indigenous lands, water, and ice. These data can only be shared in FRDR if community leaders have agreed that sharing is appropriate. Please consult your Research Ethics Board. For more information, see:
The First Nations Principles of Ownership, Control, Access and Possession (OCAP™)
First Nations Information Governance Centre (FNIGC). A First Nations Data Governance Strategy. March 30, 2020.
Inuit Tapiriit Kanatami (ITK) and Nunavut Research Institute. Negotiating Research Relationships with Inuit Communities: A Guide for Researchers
National Aboriginal Health Organization’s Principles of Ethical Métis Research
Global Indigenous Data Alliance’s CARE Principles for Indigenous Data Governance
You may need to remove or coarsen location information if your data were collected from field sites in protected areas, at sensitive archaeological sites, or on private property where consent to reveal the location was not obtained, or where revealing it could devalue the property or cause stigmatization. You may also need to remove or coarsen occurrence data for vulnerable species. Please consult the IUCN Red List of Threatened Species for species status and known risks and threats to the species. The Global Biodiversity Information Facility’s Guide to Best Practices for Generalising Sensitive Species-Occurrence Data includes a matrix for assessing risk of harm and guidance for generalizing spatial information.
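One simple way to coarsen location information is to reduce the precision of the coordinates. The sketch below rounds decimal-degree latitude and longitude to a chosen number of decimal places. It is an illustration only and is not the procedure set out in the GBIF guide: the function name `coarsen_coordinates` and the default of one decimal place are assumptions, and the appropriate precision depends on your own assessment of the risk of harm.

```python
def coarsen_coordinates(latitude, longitude, decimals=1):
    """Round decimal-degree coordinates to reduce their spatial precision.

    One decimal place corresponds to a grid of roughly 11 km north-south;
    the east-west spacing becomes smaller towards the poles.
    """
    if not -90 <= latitude <= 90:
        raise ValueError("latitude must be between -90 and 90 degrees")
    if not -180 <= longitude <= 180:
        raise ValueError("longitude must be between -180 and 180 degrees")
    return round(latitude, decimals), round(longitude, decimals)
```

For example, `coarsen_coordinates(49.3043, -123.1443)` returns `(49.3, -123.1)`. Rounding alone may not be enough for highly sensitive records, which may need to be withheld entirely.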