
Data Management Plans

This guide covers what Data Management Plans are, how to make an implementable one, and how to use the DMPTool

What This Guide Covers

This guide provides guidance and information for researchers on Data Management Plans (DMPs) including:

  1. What a DMP is
  2. Why a DMP is useful and whether one may be required 
  3. How to create an implementable DMP

If you are looking to create a DMP as a requirement for a sponsored project, see this page

If you are already familiar with DMPs and the requirements for them, see Quick Links at the bottom of this page for frequently used resources and tools. 

Data Management Terms and Definitions

Throughout this guidance, several terms are used which are defined here:

Data management describes the processes of collecting, organizing, describing, sharing, and preserving data. Data management is vital to any research project to prevent data issues - such as unorganized data or loss of data - from derailing your research project as well as to support making your data findable, accessible, interoperable, and reusable (FAIR).

Research (or scientific) data are the recorded factual material commonly accepted in the scientific community as of sufficient quality to validate and replicate research findings, regardless of whether the data are used to support scholarly publications [adapted from NIH policy (effective January 2023)]. Research data could be observational, experimental, simulated, or derived. Some examples include tables of numbers, transcripts of interviews, survey results, images, video or audio recordings, genomic data, or code, among others. 

A Data Management Plan (DMP) is an outline of what you will do with your data during and after a research project, including how you will collect, organize, document, describe, share, and preserve your data to make it FAIR. DMPs are often required by funders, but they benefit researchers whether or not they are required. Most funder-required DMPs are submitted as part of the funding proposal as a one- to two-page narrative document. However, if a DMP is not required, it can be in any format that is helpful to your research team (e.g., an Excel sheet or bullet points). A DMP should be a living document that is created before research begins and updated as research progresses.

Data Sharing refers to the practice of making data available to other research stakeholders, including other investigators, research subjects, and the broader public. Various funding agencies, publishers, and other research institutions mandate open data sharing to promote transparency and research reproducibility and to increase the impact of research data [NNLM]. It is important to check the requirements of each of these entities before starting your research. Your plan for how and where you will share your data should be included in the DMP.

FAIR Data Principles are guidelines for making data findable, accessible, interoperable, and reusable (FAIR). Learn more by visiting the GO FAIR website.

Components of a DMP

A DMP is a written document or standard operating procedure outlining what you will do with acquired or generated research data over the course of the project and afterwards, including how you plan to collect, organize, document, describe, share, and preserve your data to make it FAIR. A DMP is a living document that should be created as early as possible, optimally during the planning phase of a project, and updated throughout the project. Things change often, and that's okay - just be sure to update your plan accordingly so it is relevant throughout the life of the project. DMPs are often required for grant-funded research proposals to help ensure that data are properly managed, documented, stored, analyzed, preserved and subsequently shared with other researchers while accounting for legal, privacy, intellectual property and other considerations. 

Planning ahead can help you identify any hurdles to making your data FAIR and ensure you have the proper resources to fulfill that goal. There are five major questions that a DMP should answer (adapted from University of Arizona):

  1. What type of data will be produced?
  2. How will it be organized and what standards will be used for documentation and metadata describing the data?
  3. What steps will be taken to protect privacy, security, confidentiality, intellectual property or other rights?
  4. If others are allowed to reuse the data, how, where and when will the data be accessed and shared?
  5. Where will the data be archived and preserved and for how long?

Formalizing the holistic vision of what data will be produced and how it can be shared with the wider scientific community can help you find gaps and make it easier to communicate the plan to all members of the research team, even if you have already considered these aspects of your research.

Generally, DMPs will address the same topics, but the implementation aspects of the DMP will look different for different researchers, areas of study, and project scopes. The Digital Curation Centre (DCC) has developed a checklist for Data Management Plans which further outlines the topics that should be addressed in a DMP, alongside questions to consider and guidance. These sections include:

  • Administrative data: project name, funder, DOI, PI, relevant dates
  • Data collection: what data and how is it collected
  • Documentation and metadata: metadata schema, form of documentation
  • Ethics and legal compliance: HIPAA, copyright, intellectual property
  • Storage and back-up: where will data be stored, who has access, what security precautions will be taken
  • Selection and preservation: what data is preserved and how/where will it be preserved
  • Data sharing: what data will be shared and how will it be shared, restrictions to sharing
  • Responsibilities and resources: who is responsible for each step above, will execution of DMP require additional resources/budget
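
For an informal DMP kept as a simple structured record, the checklist sections above can double as a completeness check. The following is a minimal sketch under stated assumptions: the section names come from the list above, but the function name `missing_sections`, the dictionary layout, and the sample draft entries are hypothetical illustrations and not part of the DCC checklist, the DMPTool, or any funder template.

```python
# Illustrative sketch only: section names mirror the checklist above;
# the function and the sample draft are hypothetical, not an official format.
DMP_SECTIONS = [
    "Administrative data",
    "Data collection",
    "Documentation and metadata",
    "Ethics and legal compliance",
    "Storage and back-up",
    "Selection and preservation",
    "Data sharing",
    "Responsibilities and resources",
]


def missing_sections(dmp):
    """Return the checklist sections that are absent or blank in a draft DMP."""
    return [s for s in DMP_SECTIONS if not dmp.get(s, "").strip()]


# A partially completed draft with placeholder text (not real project details).
draft = {
    "Administrative data": "Project name, funder, PI, relevant dates",
    "Data collection": "Survey responses collected through an online form",
    "Storage and back-up": "   ",  # left blank, so it counts as missing
    "Data sharing": "Share de-identified data in a repository after publication",
}

for section in missing_sections(draft):
    print(f"Still to write: {section}")
```

Rerunning a check like this whenever the plan is updated is one way to keep a living DMP complete as the project changes.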

Do I Need a DMP?

Whether a DMP is required will depend on institutional and funder policies, but it's always best practice to create a data management plan for each research project.  A funder-required DMP must be submitted in the grant application package, and there are specific guidelines on how it should be formatted and what it must include. If a DMP is not funder-required, it can be more informal: a simple text document or spreadsheet containing the relevant details - any format that will assist you and the rest of the research team. 

Creating a DMP is considered best practice for everyone

Proper data management provides a lot of benefits to a research team:

  • Saves Time: Properly managing data is in your best interest; being able to locate past or current data saves time, frustration, and money for the whole research team.
  • Increases Citations: When it is possible to share data openly, well-managed data can itself be cited and may also lead to more citations for the original paper [1].
  • Enhances Reproducibility: Data management enhances reproducibility by making the methodology more transparent.
  • Preserves Data: While data management encourages researchers to consider backup and security measures, it also ensures that data is preserved, not just stored. Preservation focuses on the long-term ability to access and use data, and considers interoperability and open file formats.

A DMP, whatever form it takes, can reduce redundant work, help new members of the research team as they join the project, and keep everyone on the same page about how data will be collected, stored, described, and shared. 

And it may be required by your funder. 

In February 2013, the Office of Science and Technology Policy (OSTP) issued a call to federal agencies with budgets in excess of $100 million to provide plans for public access to the results of research they fund. By fall 2015, most, if not all, federal agencies falling under this requirement had issued their public access plans. Check your funder's website or use the SPARC Research Funder Data Sharing Policies Tool to navigate existing public access policies and find out whether a DMP is required for your project.

If you are still having trouble, contact the Data Learning Center Data Management support team at repub@psu.edu. 

 

[1] Heather A. Piwowar and Todd J. Vision, “Data Reuse and the Open Data Citation Advantage,” PeerJ 1 (October 2013): e175, https://dx.doi.org/10.7717/peerj.175.

Quick Links

If you're already familiar with DMPs and the resources available, here are some quick links that might be useful: