Docling Serve REST API

Docling Serve exposes the Docling pipeline as an HTTP service. Synchronous endpoints `POST /v1/convert/source` and `POST /v1/convert/file` accept URL- or upload-sourced documents and return converted JSON, Markdown, HTML, or a zipped bundle. Asynchronous variants (`/v1/convert/source/async`, `/v1/convert/file/async`) return a task handle that can be polled at `/v1/status/poll/{task_id}`, streamed via WebSocket `/v1/status/ws/{task_id}`, and retrieved at `/v1/result/{task_id}`. Container images ship CPU, CUDA 12.8/13.0, and AMD ROCm 6.3 variants; Kubernetes deployment is supported via the Docling Operator.
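As a rough illustration of the synchronous path described above, the following sketch builds a `POST /v1/convert/source` request using only the Python standard library. The field names (`http_sources`, `options.to_formats`, `target.kind`) and the `http://localhost:5001` base URL are taken from the OpenAPI specification on this page; the helper names `build_convert_source_request` and `convert_source` are hypothetical, and the default timeout is an arbitrary assumption.

```python
import json
import urllib.request

# Base URL taken from the "servers" entry of the spec; adjust for your deployment.
BASE_URL = "http://localhost:5001"


def build_convert_source_request(urls, to_formats=("md",), base_url=BASE_URL):
    """Build (but do not send) a POST /v1/convert/source request.

    Hypothetical helper: the payload mirrors the ConvertSourceRequest schema.
    """
    payload = {
        "http_sources": [{"url": u} for u in urls],
        "options": {"to_formats": list(to_formats)},
        "target": {"kind": "inbody"},  # ask for the result inline in the response body
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/convert/source",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def convert_source(urls, to_formats=("md",), timeout=300):
    """Send the request and return the parsed ConvertResponse JSON as a dict."""
    req = build_convert_source_request(urls, to_formats)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)
```

With a Docling Serve container running locally, `convert_source(["https://example.com/paper.pdf"])` would be expected to return a dictionary shaped like `ConvertResponse`, with the Markdown rendition under `document.md_content`. The example URL is a placeholder.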

Docling Serve REST API is one of 16 APIs that Docling publishes on the APIs.io network, described by a machine-readable OpenAPI specification.

This API exposes 2 machine-runnable capabilities, which can be deployed via Naftiko as REST, MCP, or Agent Skill surfaces, along with 2 JSON Schema definitions.

Tagged areas include Documents, Parsing, REST, PDF, and OCR. The published artifact set on APIs.io includes API documentation, an OpenAPI specification, a JSON-LD context, 2 Naftiko capability specs, and 2 JSON Schemas.

OpenAPI Specification

docling-serve-openapi.yml
openapi: 3.1.0
info:
  title: Docling Serve REST API
  description: |
    HTTP service exposing the Docling document-parsing pipeline. Submit documents as URLs
    or uploads and receive a `DoclingDocument` together with optional Markdown, HTML, and
    text renditions. Synchronous endpoints return the converted document inline; the
    asynchronous endpoints return a task handle that can be polled, streamed over
    WebSocket, and fetched on completion.
  version: '1.0'
  license:
    name: MIT
    url: https://github.com/docling-project/docling-serve/blob/main/LICENSE
  contact:
    name: Docling Project
    url: https://github.com/docling-project/docling-serve
servers:
- url: http://localhost:5001
  description: Local Docling Serve container.
tags:
- name: Convert
  description: Synchronous conversion endpoints.
- name: Async
  description: Asynchronous conversion submission.
- name: Tasks
  description: Task status, results, and streaming.
- name: System
  description: Health and metadata.
paths:
  /v1/convert/source:
    post:
      tags:
      - Convert
      summary: Convert Documents From Source URLs
      description: |
        Synchronously convert one or more documents pulled from HTTP source URLs or
        provided inline as base64. Returns the converted document(s) directly in the
        response body.
      operationId: convertSource
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/ConvertSourceRequest'
      responses:
        '200':
          description: Conversion completed.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ConvertResponse'
            application/zip:
              schema:
                type: string
                format: binary
        '400':
          $ref: '#/components/responses/Error'
        '500':
          $ref: '#/components/responses/Error'
  /v1/convert/file:
    post:
      tags:
      - Convert
      summary: Convert Documents From Uploaded Files
      description: |
        Synchronously convert one or more documents uploaded as multipart form data.
        Conversion options are supplied as additional form fields.
      operationId: convertFile
      requestBody:
        required: true
        content:
          multipart/form-data:
            schema:
              $ref: '#/components/schemas/ConvertFileForm'
      responses:
        '200':
          description: Conversion completed.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ConvertResponse'
            application/zip:
              schema:
                type: string
                format: binary
        '400':
          $ref: '#/components/responses/Error'
        '500':
          $ref: '#/components/responses/Error'
  /v1/convert/source/async:
    post:
      tags:
      - Async
      summary: Submit Source Conversion Asynchronously
      description: |
        Submit a source-based conversion job to the async queue. Returns a `TaskDetail`
        with the queue position and a `task_id` for subsequent polling.
      operationId: convertSourceAsync
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/ConvertSourceRequest'
      responses:
        '200':
          description: Job accepted.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/TaskDetail'
        '400':
          $ref: '#/components/responses/Error'
  /v1/convert/file/async:
    post:
      tags:
      - Async
      summary: Submit File Conversion Asynchronously
      description: |
        Submit an upload-based conversion job to the async queue. Returns a `TaskDetail`
        for polling.
      operationId: convertFileAsync
      requestBody:
        required: true
        content:
          multipart/form-data:
            schema:
              $ref: '#/components/schemas/ConvertFileForm'
      responses:
        '200':
          description: Job accepted.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/TaskDetail'
        '400':
          $ref: '#/components/responses/Error'
  /v1/status/poll/{task_id}:
    get:
      tags:
      - Tasks
      summary: Poll Asynchronous Task Status
      description: Return the current `TaskDetail` for the task identified by `task_id`.
      operationId: pollTaskStatus
      parameters:
      - name: task_id
        in: path
        required: true
        schema:
          type: string
      responses:
        '200':
          description: Current task status.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/TaskDetail'
        '404':
          $ref: '#/components/responses/Error'
  /v1/result/{task_id}:
    get:
      tags:
      - Tasks
      summary: Get Asynchronous Task Result
      description: Return the conversion result for a completed asynchronous task.
      operationId: getTaskResult
      parameters:
      - name: task_id
        in: path
        required: true
        schema:
          type: string
      responses:
        '200':
          description: Conversion result.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ConvertResponse'
            application/zip:
              schema:
                type: string
                format: binary
        '404':
          $ref: '#/components/responses/Error'
        '409':
          description: Task is not yet complete.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponse'
  /health:
    get:
      tags:
      - System
      summary: Service Health Check
      description: Liveness/readiness probe for Docling Serve.
      operationId: getHealth
      responses:
        '200':
          description: Service is healthy.
          content:
            application/json:
              schema:
                type: object
                properties:
                  status:
                    type: string
                    example: ok
  /openapi.json:
    get:
      tags:
      - System
      summary: Get OpenAPI Specification
      description: Returns the live OpenAPI specification for the running Docling Serve instance.
      operationId: getOpenApiSpec
      responses:
        '200':
          description: OpenAPI document.
          content:
            application/json:
              schema:
                type: object
components:
  responses:
    Error:
      description: Error response.
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/ErrorResponse'
  schemas:
    ConvertSourceRequest:
      type: object
      description: Request body for source-based document conversion.
      properties:
        http_sources:
          type: array
          description: HTTP/HTTPS URLs to fetch and convert.
          items:
            $ref: '#/components/schemas/HttpSource'
        file_sources:
          type: array
          description: Inline base64-encoded documents.
          items:
            $ref: '#/components/schemas/FileSource'
        options:
          $ref: '#/components/schemas/ConvertDocumentsOptions'
        target:
          $ref: '#/components/schemas/Target'
    HttpSource:
      type: object
      required:
      - url
      properties:
        url:
          type: string
          format: uri
        headers:
          type: object
          additionalProperties:
            type: string
    FileSource:
      type: object
      required:
      - base64_string
      - filename
      properties:
        base64_string:
          type: string
          description: Base64-encoded file content.
        filename:
          type: string
          description: Original filename, used for format detection.
    ConvertFileForm:
      type: object
      properties:
        files:
          type: array
          items:
            type: string
            format: binary
        from_formats:
          type: array
          items:
            type: string
        to_formats:
          type: array
          items:
            type: string
        image_export_mode:
          type: string
          enum: [embedded, placeholder, referenced]
        do_ocr:
          type: boolean
        ocr_engine:
          type: string
        force_ocr:
          type: boolean
        ocr_lang:
          type: array
          items:
            type: string
        pdf_backend:
          type: string
        table_mode:
          type: string
          enum: [fast, accurate]
        do_table_structure:
          type: boolean
        include_images:
          type: boolean
        images_scale:
          type: number
        return_as_file:
          type: boolean
    ConvertDocumentsOptions:
      type: object
      description: Conversion behavior knobs shared by sync and async endpoints.
      properties:
        from_formats:
          type: array
          description: Input formats to accept.
          items:
            type: string
            enum: [pdf, docx, pptx, xlsx, html, md, asciidoc, image, audio, csv, xml_uspto, xml_jats]
        to_formats:
          type: array
          description: Output formats to produce.
          items:
            type: string
            enum: [md, html, json, text, doctags]
        image_export_mode:
          type: string
          enum: [embedded, placeholder, referenced]
        do_ocr:
          type: boolean
        force_ocr:
          type: boolean
        ocr_engine:
          type: string
          enum: [easyocr, tesseract, tesseract_cli, rapidocr, mac_ocr, ocrmac]
        ocr_lang:
          type: array
          items:
            type: string
        pdf_backend:
          type: string
          enum: [dlparse_v1, dlparse_v2, pypdfium2]
        table_mode:
          type: string
          enum: [fast, accurate]
        do_table_structure:
          type: boolean
        do_code_enrichment:
          type: boolean
        do_formula_enrichment:
          type: boolean
        do_picture_classification:
          type: boolean
        do_picture_description:
          type: boolean
        picture_description_area_threshold:
          type: number
        include_images:
          type: boolean
        images_scale:
          type: number
        pipeline:
          type: string
          enum: [standard, vlm]
        vlm_model:
          type: string
        return_as_file:
          type: boolean
        abort_on_error:
          type: boolean
    Target:
      type: object
      description: Optional delivery target for the converted output.
      properties:
        kind:
          type: string
          enum: [inbody, zip, s3, http]
        zip_file_name:
          type: string
    ConvertResponse:
      type: object
      properties:
        document:
          $ref: '#/components/schemas/DoclingDocumentRendering'
        status:
          type: string
          enum: [success, partial_success, failure]
        errors:
          type: array
          items:
            type: object
        processing_time:
          type: number
        timings:
          type: object
          additionalProperties:
            type: number
    DoclingDocumentRendering:
      type: object
      description: Container of one or more renderings of the converted document.
      properties:
        filename:
          type: string
        md_content:
          type: string
        html_content:
          type: string
        json_content:
          $ref: '#/components/schemas/DoclingDocument'
        text_content:
          type: string
        doctags_content:
          type: string
    DoclingDocument:
      type: object
      description: Canonical Docling document representation. See docling-core for the full schema.
      properties:
        schema_name:
          type: string
        version:
          type: string
        name:
          type: string
        origin:
          type: object
        furniture:
          type: array
          items:
            type: object
        body:
          type: object
        groups:
          type: array
          items:
            type: object
        texts:
          type: array
          items:
            type: object
        tables:
          type: array
          items:
            type: object
        pictures:
          type: array
          items:
            type: object
        key_value_items:
          type: array
          items:
            type: object
        pages:
          type: object
    TaskDetail:
      type: object
      description: Async task descriptor returned by submit/poll/status endpoints.
      required:
      - task_id
      - task_status
      properties:
        task_id:
          type: string
        task_status:
          type: string
          enum: [pending, started, success, failure, revoked]
        task_position:
          type: integer
          description: Position in the queue (0 = currently running).
        task_meta:
          type: object
          additionalProperties: true
        created_at:
          type: string
          format: date-time
        started_at:
          type: string
          format: date-time
        finished_at:
          type: string
          format: date-time
    ErrorResponse:
      type: object
      properties:
        detail:
          type: string
        code:
          type: string
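
The asynchronous flow defined by the specification above (submit, poll `/v1/status/poll/{task_id}`, then fetch `/v1/result/{task_id}`) can be sketched as a small polling loop. This is a minimal sketch, not the project's client: the helper names, the polling interval, and the poll limit are assumptions, while the paths and the `task_status` values come from the `TaskDetail` schema. The `fetch` and `sleep` parameters are injectable so the loop can be exercised without a running server.

```python
import json
import time
import urllib.request

# Base URL taken from the "servers" entry of the spec; adjust for your deployment.
BASE_URL = "http://localhost:5001"

# Treated as terminal here; values come from the TaskDetail.task_status enum.
TERMINAL_STATES = {"success", "failure", "revoked"}


def http_get_json(path, base_url=BASE_URL, timeout=30):
    """GET a path on the service and parse the JSON response body."""
    with urllib.request.urlopen(f"{base_url}{path}", timeout=timeout) as resp:
        return json.load(resp)


def wait_for_task(task_id, fetch=http_get_json, interval=2.0, max_polls=150,
                  sleep=time.sleep):
    """Poll the status endpoint until the task reaches a terminal state."""
    for _ in range(max_polls):
        detail = fetch(f"/v1/status/poll/{task_id}")
        if detail["task_status"] in TERMINAL_STATES:
            return detail
        sleep(interval)
    raise TimeoutError(f"task {task_id} did not finish after {max_polls} polls")


def fetch_result(task_id, fetch=http_get_json, sleep=time.sleep):
    """Wait for the task, then return the ConvertResponse from /v1/result."""
    detail = wait_for_task(task_id, fetch=fetch, sleep=sleep)
    if detail["task_status"] != "success":
        raise RuntimeError(f"task {task_id} ended as {detail['task_status']}")
    return fetch(f"/v1/result/{task_id}")
```

The `task_id` passed to `fetch_result` would be the one returned in the `TaskDetail` from `/v1/convert/source/async` or `/v1/convert/file/async`. Waiting for a terminal state before calling the result endpoint avoids the `409` response the spec reserves for tasks that are not yet complete.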