MinerU

A one-stop, open-source, high-quality data extraction tool that converts PDF to Markdown and JSON.

created at Feb. 29, 2024, 8:52 a.m.

Python

104

19,331

1,379

GitHub

labelU

A data annotation toolbox that supports image, audio, and video data.

created at Oct. 19, 2022, 9:03 a.m.

Python

12

656

58

GitHub

PDF-Extract-Kit

A comprehensive toolkit for high-quality PDF content extraction.

created at June 27, 2024, 1:05 p.m.

Python

37

5,482

364

GitHub