The Archives Unleashed Toolkit is an open-source toolkit for analyzing web archives.
updated at June 21, 2024, 9:23 p.m.
Web application for distributed compute analysis of Archive-It web archive collections.
updated at June 17, 2024, 9:16 p.m.
An Apache Spark framework for easy data processing, extraction, and derivation for web archives and archival collections, developed at the Internet Archive.
updated at June 12, 2024, 4:19 p.m.
Internet Archive's Sparkling Data Processing Library
updated at June 5, 2024, 8:32 p.m.
An open-source toolkit for analyzing line-oriented JSON Twitter archives with Apache Spark.
updated at June 12, 2023, 7:59 a.m.