
Collaborative Archive & Data Research Environment (CADRE)

Need Further Support?

If you have any questions, please contact the Penn State CADRE representative, Dan Coughlin.

You can also contact the CADRE team directly.

You can also stay updated on CADRE's latest news and releases by subscribing to its newsletter.

What is CADRE?

CADRE is a cloud-based platform that provides access to a standardized version of the Web of Science and Microsoft Academic Graph datasets. CADRE includes GUI data querying, analysis, storage and visualization capabilities. 

Because Penn State University is an official partner of the CADRE project, Penn State researchers have full access to all of CADRE's standardized datasets and to its data management and analysis tools. 

CADRE aids researchers interested in working with big bibliometric datasets. You do not need to have any coding experience to perform data querying or analysis on CADRE. The platform is currently in its beta phase, but is being used by researchers. 

Learn more about how you can use CADRE below. You can also find more information on the CADRE homepage, or start working now by logging in to the CADRE Gateway.

Dataset Access and Information

You will receive access to standardized, high-quality versions of the following datasets:

  • Web of Science: a leading commercial dataset that includes 73 million papers and 1.7 billion citations. Web of Science indexes selected journals that cover "core" and "emerging" sciences. 
  • Microsoft Academic Graph: an open bibliometric dataset that holds 208 million documents and 1.4 billion citations. Microsoft Academic Graph includes a broad spectrum of internet documents for all sciences. 
  • U.S. Patent and Trademark Office: an open government dataset that includes 9 million patent application documents. 

All of CADRE's datasets are updated by the CADRE team as updates become available to ensure researchers are working with the latest data release. 

Data Querying, Analysis, and Visualization

CADRE's Gateway contains the tools you need to query, analyze, and publish your research. Everything created in CADRE can be reproduced by other researchers. 

  • Query Builder: Use the user-friendly GUI Query Builder to easily query big bibliometric datasets. 
  • Jupyter Notebook: Proficient coders can take advantage of the Jupyter Notebook features to build custom data analysis and visualization tools. 
  • Marketplace: After users create data analysis tools in Jupyter Notebook, they can publish them to the Marketplace for other users to apply to their own research. The Marketplace also allows you to publish and reproduce queries, derived data, and workflows. 
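
As a rough illustration of the kind of custom analysis a notebook makes possible, the sketch below summarizes a small table of query results using only the Python standard library. The sample data and the column names (paper_id, year, citation_count) are assumptions made for this example; they are not CADRE's actual output schema, so adjust them to match the files your own queries produce.

```python
import csv
import io
from collections import Counter

# Hypothetical sample standing in for a saved query result.
# The column names are assumptions, not CADRE's real output schema.
SAMPLE_CSV = """paper_id,year,citation_count
p1,2018,12
p2,2018,3
p3,2019,7
p4,2020,0
"""


def papers_per_year(csv_file):
    """Count how many rows (papers) fall in each publication year."""
    reader = csv.DictReader(csv_file)
    return Counter(int(row["year"]) for row in reader)


def mean_citations(csv_file):
    """Return the average citation count, or 0.0 for an empty file."""
    reader = csv.DictReader(csv_file)
    counts = [int(row["citation_count"]) for row in reader]
    return sum(counts) / len(counts) if counts else 0.0


# Each function consumes the file once, so open a fresh handle per call.
yearly = papers_per_year(io.StringIO(SAMPLE_CSV))
for year in sorted(yearly):
    print(year, yearly[year])

print("mean citations:", mean_citations(io.StringIO(SAMPLE_CSV)))
```

To run this against a real file, replace io.StringIO(SAMPLE_CSV) with open("your_results.csv", newline=""), where the file name is a placeholder for your own query output.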

Data Storage and Archiving

Users can store their query outputs, data analysis tools, and research results in the CADRE cloud. 

Users will soon have the ability to attach DOIs to their reproducible packages as well. CADRE provides three tiers of DOI allocation:

  1. Packages with no DOI or metadata (discoverable only by users with a CADRE account)
  2. Packages with temporary DOIs and metadata (discoverable only by users with a CADRE account)
  3. Permanently archived packages with DOIs and metadata (discoverable by anyone)

Getting Started

To start working with the CADRE platform:

  1. Visit the CADRE Gateway.
  2. Click the Log In to CADRE button.
  3. You will then be redirected to a CILogon portal. Select your current institution and log in with your university-affiliated email address. 
  4. You can now begin working on the platform immediately!

CADRE's Resources page hosts a collection of informational videos, including extensive walkthroughs of CADRE's features and demos for running and analyzing your first queries.

How researchers are using CADRE

  • Mapping Collaborations and Partnerships in SDG Research (MCAP): Researchers used CADRE's datasets to study research output and patterns of global collaboration that support the United Nations' Sustainable Development Goals (SDGs). 
  • The global network of air links and scientific collaboration - a quasi-experimental analysis: The research team is determining how the introduction and availability of long-distance flights impacted international scientific collaboration by measuring collaboration through co-authorship and co-affiliation on CADRE's datasets. 
  • Study of Pandemic Publishing: How Scholarly Literature is Affected by COVID-19 Pandemic: Researchers are studying the quality of COVID-19 related scholarly works by using CADRE's datasets to identify signs of incoherency, irreproducibility, and haste.