Docling Python Library
The core Docling Python library and `docling` CLI. Parses PDFs, DOCX, PPTX, XLSX, HTML, images (PNG/TIFF/JPEG), audio (WAV/MP3), WebVTT, LaTeX, and plain text into a unified `DoclingDocument` representation that can be exported to Markdown, HTML, lossless JSON, DocTags, and WebVTT. Implements advanced PDF understanding (page layout, reading order, table structure via TableFormer, code and formula recognition, and picture classification), plus OCR (EasyOCR, Tesseract, RapidOCR, Mac OCR) and the GraniteDocling visual language model pipeline. Runs locally, which suits air-gapped environments and sensitive data.
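As a rough illustration of the parse-then-export flow described above, the sketch below converts one file to Markdown with the library's `DocumentConverter`. The `looks_supported` helper and its suffix set are hypothetical, assembled only from the formats named in this description; they are not Docling's own format detection. `docling` is imported lazily inside the function, so the helper can be used without the package installed.

```python
from pathlib import Path

# Hypothetical pre-filter: suffixes guessed from the formats listed in the
# description. Docling performs its own format detection; this is only a sketch.
SUPPORTED_SUFFIXES = {
    ".pdf", ".docx", ".pptx", ".xlsx", ".html",
    ".png", ".tiff", ".jpeg", ".jpg",
    ".wav", ".mp3", ".vtt", ".tex", ".txt",
}


def looks_supported(source: str) -> bool:
    """Return True if the file suffix matches one of the listed formats."""
    return Path(source).suffix.lower() in SUPPORTED_SUFFIXES


def convert_to_markdown(source: str) -> str:
    """Parse a local path or URL into a DoclingDocument and export Markdown."""
    # Imported here so the module loads even when docling is not installed.
    from docling.document_converter import DocumentConverter

    result = DocumentConverter().convert(source)
    # result.document is the unified DoclingDocument representation.
    return result.document.export_to_markdown()
```

With the package installed (`pip install docling`), a call such as `convert_to_markdown("report.pdf")` would return the document as a Markdown string; the file name here is a placeholder.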
Docling Python Library is one of 16 APIs that Docling publishes on the APIs.io network, described by a machine-readable OpenAPI specification.
This API exposes 1 machine-runnable capability that can be deployed as a REST, MCP, or Agent Skill surface via Naftiko.
Tagged areas include Documents, Parsing, Python, SDK, and PDF. The published artifact set on APIs.io includes API documentation, a getting-started guide, SDKs, an OpenAPI specification, and 1 Naftiko capability spec.