Whether your research project requires you to gather and create your own data or to locate pre-existing datasets, it is important to know your data discovery and data storage options. Knowing how and where to locate and store data is essential for a smooth research process, and this section helps you do so. What data are you looking for? What data do you want to store? Often, researchers are looking for (or creating) three types of data:
If you need assistance, the library can help! Your Harrisburg librarians can provide frontline assistance and get you started on your data journey. If necessary, your librarian will connect you with the Research Informatics and Publishing department, where you can submit your data-related questions via their contact form.
A data repository is a location that holds data, makes data available to users, and organizes the data in a logical manner (National Library of Medicine, 2020). Data repositories are good places to find and/or store research data. They fall into three categories: general purpose, discipline/domain specific, and institutional. Each category is described below.
General purpose data repositories are domain agnostic. Generally, these repositories accept all files and all formats. While they are good to search, they are not always the best fit for finding and storing data. Examples include Dryad, figshare, and Zenodo.
Discipline/domain-specific data repositories are primarily designed for specific research domains. These repositories offer more search options than general purpose ones. Often, users can browse by subject, country, and/or content type, and several filtering options are available. If you are interested in locating a domain-specific repository for data, a great place to begin is the Registry of Research Data Repositories, also known as re3data.
Institutional repositories are data repositories that are created and hosted by institutions. They are available to researchers affiliated with the institution, and the availability of data is determined by the researcher and/or research team. Researchers can make their data open and publicly available or available only to those within the institution. ScholarSphere is Penn State's institutional repository, and anyone with a Penn State Access ID can deposit materials relating to the University’s teaching, learning, and research mission. All types of scholarly materials are accepted, including publications, instructional materials, creative works, and research data. In addition to ScholarSphere, DataCommons is a disciplinary data repository at Penn State to which researchers may submit data for dissemination and compliance purposes.
Like other aspects of a research project, investigating and selecting an appropriate data repository can be daunting. If you or your research team have a voice in the selection process, considering these features may help:
At Penn State, researchers have several data storage options that offer different features. Here you will find a description of four storage options available to Penn State users: ScholarSphere, Penn State's G Suite, PSU OneDrive, and Penn State's Institute for Computational and Data Sciences (ICDS). Ultimately, your data storage selection should serve your needs, and you are encouraged to reach out to the responsible parties for more information.
Storage Option | Key Features |
---|---|
ScholarSphere | |
Penn State's G Suite | |
PSU OneDrive | |
PSU ICDS | |
Looking for more assistance comparing your data storage options? Penn State created a website to do just that. Check out the university's Data Storage Finder to find the data storage option that best suits your research needs.