Constellate's data builder encourages users to start with materials from certain journals, but the search facets let you widen or narrow your parameters to develop a more specific dataset. You can preview your changes in the results interface before finalizing the output.
You can build your dataset through the search interface, starting from particular words or concepts of interest. Some options include:
You may want to explore these settings before you settle on a final output.
Once you've created a dataset, or found one you want to use among Constellate's pre-selected material, you can download several kinds of data from it. For example, if you want information related to African-American history drawn from everything available in African American Review, Black American Literature Forum, and Negro American Literature Forum between 1967 and 2020, you can download the following data. (The relevant file type is included in parentheses.)
You can also request all metadata, as well as sheets of unigrams, bigrams, and/or trigrams (all as .csv files). Finally, you can request a Constellate Document Format JSON file, which includes all metadata, unigrams, bigrams, trigrams, and full text. Because different people have different needs, these options provide a range of ways to get what you need. Read more about what each download option offers on the Constellate Help Pages:
Constellate offers a Jupyter Notebooks environment for analyzing and manipulating your datasets in Python. Each notebook comes in two versions: an annotated "Tutorial" version designed to help beginners, and a more minimalist "Analysis" version for research. The pre-built notebooks include scripts for metadata and pre-processing, simple word frequencies, more complex word frequencies (TF-IDF), and topic modeling. If you have your own scripts, you can also import a notebook from any existing GitHub repository.
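To give a sense of what the TF-IDF notebooks compute, here is a minimal sketch in plain Python. It is not Constellate's own notebook code: the function name `tf_idf`, the toy documents, and the particular weighting (term count divided by document length, multiplied by the natural log of N/df) are all assumptions chosen for illustration, and other TF-IDF variants weight terms differently.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per document.

    docs is a list of token lists. This hypothetical helper uses
    tf = count / document length and idf = log(N / df), which is
    one common textbook variant, not necessarily Constellate's.
    """
    n_docs = len(docs)
    # Document frequency: how many documents contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1  # avoid dividing by zero for empty documents
        scores.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


# Toy documents, invented for illustration only.
docs = [
    "the review published poetry and criticism".split(),
    "the forum published essays and criticism".split(),
    "the journal published poetry".split(),
]
scores = tf_idf(docs)
```

Words that appear in every document (such as "the" here) score zero, while words concentrated in a single document score highest, which is why TF-IDF is described as a more complex measure than a simple word count.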
You can also run your own analyses with the data downloadable outright from Constellate.
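As a rough illustration of working with a downloaded file outside Constellate, the sketch below totals word counts from a unigram .csv using only Python's standard library. The column names `id`, `ngram`, and `count`, the helper name `total_counts`, and the sample rows are assumptions made for this example; check the header row of your own download and adjust the names to match.

```python
import csv
import io
from collections import Counter


def total_counts(csv_file, ngram_col="ngram", count_col="count"):
    """Sum the counts for each n-gram across all rows of an open CSV file.

    The default column names are assumptions, not a documented schema.
    """
    totals = Counter()
    for row in csv.DictReader(csv_file):
        totals[row[ngram_col]] += int(row[count_col])
    return totals


# Stand-in for a downloaded file; these rows are invented for illustration.
sample = io.StringIO(
    "id,ngram,count\n"
    "doc-1,poetry,3\n"
    "doc-1,review,1\n"
    "doc-2,poetry,2\n"
)
totals = total_counts(sample)
top = totals.most_common(1)
```

With a real download, you would pass an open file instead of the sample, for example `with open("unigrams.csv", newline="", encoding="utf-8") as f: totals = total_counts(f)`, where `unigrams.csv` is a placeholder for your own file name.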
The Constellate team has developed robust help documents to help you get started with their platform.
They also provide how-to guides that outline the material covered here in a step-by-step format:
If you need more help