Paper Parse
Parse academic PDF papers into markdown with figure extraction.
32 downloads
Free
Reviewed
Paper Parse
Parse academic PDF papers into structured markdown with figure extraction using PyMuPDF.
Usage
uv run {baseDir}/scripts/parse_paper.py --pdf /path/to/paper.pdf [--output-dir ./output]
Output
The tool generates:
{paper_name}_content.md- Full paper content in markdown{paper_name}_parsed.json- Structured metadata including:- Paper title
- Number of pages
- Extracted figures with captions and paths
cover_title_authors.png- First-page snapshot focused on title + authors regionfigures/- Directory containing high-resolution figure screenshots
Example
uv run scripts/parse_paper.py --pdf ~/papers/my-paper.pdf --output-dir ./parsed
Output structure:
./parsed/
├── my-paper_content.md
├── my-paper_parsed.json
└── figures/
├── figure_1.png
├── figure_2.png
└── ...
Dependencies
- PyMuPDF (fitz) - PDF parsing and rendering
- pymupdf4llm - Markdown conversion
These are automatically managed by uv via the inline script metadata.
Download
ZIP package — ready to use
Skill Info
- Creator
- Chen-Li-17
- Downloads
- 32
- Published
- Mar 15, 2026
- Updated
- Mar 16, 2026