The Research Data Management Team is ready to consult with you on your questions about research data, including compliance with funding agency mandates. Contact us!
Email: repub@psu.edu
This page is a brief overview of the questions you should consider at each stage of the research process before you write your data management plan (DMP). Working through each question helps you identify the data management actions your work will require, which should therefore be included in your DMP. Completing the Questions to Consider makes composing the DMP, with all needed details, much easier than going straight to the DMPTool or a template.
This page will not cover best practices for each of these stages. To further explore each of these topics in detail, see this guide from the University of Arizona or this guide from Penn State Harrisburg.
First, consider the types of data you will collect, their format and approximate volume, and the methods you will use to collect them.
Metadata & Organization
One of the main purposes of data management and data sharing is to make the data generated from funded research available to other parties for replication and reuse, among other uses. In order for research data to be as useful and relevant as it can be, data should follow the FAIR Principles as closely as possible. FAIR stands for: Findable, Accessible, Interoperable, and Reusable. A brief outline is included below.
To that end, it's important to determine how you will describe your data and how it will be organized throughout the project. This includes naming conventions and hierarchy of files to be created or collected, README files, data dictionaries, and metadata. See best practices for these here.
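As one illustration of a naming convention, the rule can be written down in a small helper so that every file in the project is named the same way. The sketch below is only an assumption for demonstration: the pattern (project, date, description, version) and the names `build_name` and `is_valid_name` are hypothetical and are not prescribed by this guide or by any funder.

```python
import re
from datetime import date

# Hypothetical convention: <project>_<YYYYMMDD>_<description>_v<NN>.<ext>
NAME_PATTERN = re.compile(
    r"^[a-z0-9]+_\d{8}_[a-z0-9]+(?:-[a-z0-9]+)*_v\d{2}\.[a-z0-9]+$"
)


def build_name(project, day, description, version, ext):
    """Build a file name that follows the hypothetical convention above."""
    # Keep only lowercase letters and digits in the project code.
    project_code = re.sub(r"[^a-z0-9]+", "", project.lower())
    # Turn the free-text description into a hyphen-separated slug.
    slug = re.sub(r"[^a-z0-9]+", "-", description.lower()).strip("-")
    return f"{project_code}_{day:%Y%m%d}_{slug}_v{version:02d}.{ext.lower()}"


def is_valid_name(filename):
    """Return True if the file name matches the hypothetical convention."""
    return NAME_PATTERN.match(filename) is not None


if __name__ == "__main__":
    name = build_name("Soil", date(2024, 3, 5), "Field Samples", 1, "csv")
    print(name, is_valid_name(name))
    print("final data (2).xlsx", is_valid_name("final data (2).xlsx"))
```

Whatever pattern you choose, describing it in the README file lets collaborators and future users interpret file names without guessing.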
Active Storage, Security, & Backups
This step covers how the data will be stored during the active phase of the project. If your data are sensitive or protected, pay close attention to security during active storage. This includes physical security, network security, and the security of computer systems and files, so that your data are protected from unauthorized access, alteration, disclosure, or destruction. Be sure to check relevant Penn State policies for guidance on the security of certain types of protected data.
Once you have identified any limitations on where and how you can store your research data, you can look at the storage options available. This section deals specifically with active storage, not with repositories or preservation of data; those are covered later. Penn State has created a Data Storage Finder, which allows researchers to select relevant criteria for potential storage solutions and then explore the options at Penn State. See also the Storage Options for Research Data on the Additional Resources page.
You should always have a plan for where and when your data will be backed up. Best practice for backups is to always have three copies of your data:
Note that once you have selected your active storage solutions, there is often boilerplate language to include in your DMP about that storage option.
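Keeping several copies only helps if the copies are actually identical to the original. One common way to check this is to compare checksums. The following is a minimal sketch, assuming the copies are reachable as ordinary file paths; the function names `sha256_of` and `copies_match` are hypothetical, and the choice of SHA-256 is an assumption rather than a Penn State requirement.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large data files do not fill memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(original, copies):
    """Map each backup path to True if it exists and matches the original."""
    expected = sha256_of(original)
    return {
        str(copy): Path(copy).is_file() and sha256_of(copy) == expected
        for copy in copies
    }
```

A check like this could be run on a schedule, for example after each backup, with any `False` result prompting you to restore that copy from a good one.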
Data Sharing & Reuse
A large part of the reason that DMPs are required is to ensure that researchers plan how they will share their data after the project ends. Because this is such an integral part of research, there will be a separate guide that covers data sharing best practices (stay tuned for updates). In addition, many funders, institutions, and publishers require researchers to share their data, so be sure to check the requirements for your project.
Preservation
Finally, you will want to consider what happens to your data several years after the close of the project. Many repositories do have options for preservation, but that is one aspect you will want to investigate further. Check with your institution and funder for preservation requirements and best practices for disposal of data.
Ethical, Confidentiality, and Privacy Concerns
Identify these considerations early, because they inform storage and documentation decisions. Several aspects of a project could raise a confidentiality or ethical concern, most notably the collection of human-related data. This could include genomic data, personally identifiable information (PII) related to a subject, or health data covered by the Health Insurance Portability and Accountability Act (HIPAA). If your research data involve any of these ethical or confidentiality concerns, you should note them in the DMP. If your project will generate human subjects data, you should also state your intention to comply with Penn State's IRB requirements as set by research administration guidelines and policies. Consult Penn State's Human Subjects Research (IRB) site. Review RA 22: HIPAA and Research at Penn State University, or, if applicable, RA 23: HIPAA and the Milton S. Hershey Medical Center and Penn State College of Medicine.