Snapshots a web page as a static, self-contained HTML document.
created at July 13, 2017, 11:31 p.m.
Zotero extension that combats link rot by archiving webpages and journal articles.
created at Aug. 29, 2019, 5:51 p.m.
Streaming WARC/ARC library for fast web archive IO
created at March 6, 2017, 6:17 p.m.
WarcDB: Web crawl data as SQLite databases.
created at May 29, 2022, 11:09 a.m.
A tool to push web resources into web archives
created at Feb. 9, 2017, 12:29 p.m.
Automatically archive links to videos, images, and social media content from Google Sheets (and more).
created at Jan. 15, 2021, 10:30 a.m.
A curated list of awesome tools for website diffing and change monitoring.
created at May 24, 2017, 5:33 a.m.
Run a high-fidelity browser-based crawler in a single Docker container
created at Nov. 2, 2020, 4:37 a.m.
brozzler - distributed browser-based web crawler
created at July 13, 2015, 11:48 p.m.
The archivist's web crawler: WARC output, dashboard for all crawls, dynamic ignore patterns
created at Feb. 5, 2015, 5:01 a.m.
Core Python Web Archiving Toolkit for replay and recording of web archives
created at Dec. 9, 2013, 3:30 a.m.
A Python and Command-Line Interface to Archive.org
created at Aug. 15, 2012, 7:18 p.m.