Library Guides: Research Using Web Archives: Web Archives Explained

Web Archives Explained

According to the International Internet Preservation Consortium (IIPC), "Web archiving is the process of collecting portions of the World Wide Web, preserving the collections in an archival format, and then serving the archives for access and use."

Penn State University Libraries uses Archive-It, a vended service from the Internet Archive, to crawl and capture content on the Web. Using Archive-It, Penn State selects URLs for capture, organizes them into collections, takes periodic snapshots of those websites on specific dates, and then makes the captured websites available for "playback" that is meant to match the original browsing experience. Because the Web changes frequently and the technologies used to deliver web content are complex and varied, there is no guarantee that every file from a website will be captured or reconstructed with perfect fidelity. As you navigate an archived website, please be aware that some content may not have been captured accurately, or may have fallen outside the scope of the crawl, resulting in broken links. If you encounter missing content that you believe should be available, please let us know. More information about Archive-It can be found online at https://archive-it.org/. The full scope of University Libraries’ web archives can be found online at https://archive-it.org/home/psu.
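
For readers who find it easier to think in terms of data, the relationship between collections, selected URLs, and dated snapshots described above can be sketched in a few lines of Python. This is only an illustrative model, not Archive-It's actual data structures or API: the class names (Collection, Seed, Snapshot), the missing_resources field, and the example URL and dates are all assumptions made up for this sketch.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Snapshot:
    """One capture of a selected URL on a specific date (hypothetical model)."""
    captured_on: date
    # Paths the crawl failed to capture; these show up as broken links on playback.
    missing_resources: List[str] = field(default_factory=list)


@dataclass
class Seed:
    """A URL selected for capture, together with its dated snapshots."""
    url: str
    snapshots: List[Snapshot] = field(default_factory=list)

    def closest_snapshot(self, target: date) -> Optional[Snapshot]:
        """Return the snapshot captured nearest to the target date, if any exist."""
        if not self.snapshots:
            return None
        return min(self.snapshots, key=lambda s: abs((s.captured_on - target).days))


@dataclass
class Collection:
    """A named group of selected URLs."""
    name: str
    seeds: List[Seed] = field(default_factory=list)


# Illustrative data only: a made-up seed with two snapshots.
seed = Seed(
    url="https://example.org/",
    snapshots=[
        Snapshot(date(2020, 1, 15)),
        Snapshot(date(2021, 6, 1), missing_resources=["/video/intro.mp4"]),
    ],
)
collection = Collection(name="Hypothetical Collection", seeds=[seed])

# A researcher interested in March 2021 would be shown the nearest capture.
nearest = seed.closest_snapshot(date(2021, 3, 1))
```

The point of the sketch is that an archived website is not a single continuous record: a researcher sees whichever snapshot exists nearest to the date of interest, and any resource missing from that snapshot appears as a gap or broken link.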

Penn State University Archivist

Scholarly Product and Applications in Special Collections specialist