linkstat by httpreserve

CLI implementation of httpreserve that can test links and retrieve Internet Archive replacements.

created at March 19, 2019, 9:23 p.m.

Go

3 +0

8 +1

0 +0

GitHub

DownloadNet by dosyago

💾 DownloadNet - All content you browse online available offline. Search through the full-text of all pages in your browser history. ⭐️ Star to support our work!

created at Dec. 20, 2019, 9:47 a.m.

JavaScript

42 +0

3,674 +3

136 +0

GitHub

WarcDB by Florents-Tselai

WarcDB: Web crawl data as SQLite databases.

created at May 29, 2022, 11:09 a.m.

Python

10 +0

386 +1

11 +0

GitHub

internetarchive by jjjake

A Python and Command-Line Interface to Archive.org

created at Aug. 15, 2012, 7:18 p.m.

Python

53 +0

1,546 +1

215 +0

GitHub

browsertrix by webrecorder

Browsertrix is the hosted, high-fidelity, browser-based crawling service from Webrecorder designed to make web archiving easier and more accessible for all!

created at June 28, 2021, 10:46 p.m.

TypeScript

11 +1

139 +4

29 +1

GitHub

jwat-tools by netarchivesuite

JWAT Tools

created at Aug. 30, 2018, 5:54 p.m.

Java

7 +0

5 +0

2 +0

GitHub

warc-safe by natliblux

A tool for detecting viruses and NSFW material in WARC files

created at May 3, 2024, 6:24 a.m.

Python

4 +0

7 +0

0 +0

GitHub

jwat by netarchivesuite

Java Web Archive Toolkit

created at Aug. 30, 2018, 5:28 p.m.

Java

8 +0

3 +0

2 +0

GitHub