Mistral AI Chat Completions API

The Mistral AI Chat Completions API enables developers to interact with Mistral's language models in a conversational manner. It supports multi-turn conversations, function calling, JSON mode for structured outputs, and vision capabilities. The API provides access to a range of models including Mistral Large, Mistral Small, and Codestral, each optimized for different use cases, from general reasoning to code generation.
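The basic request shape described above can be sketched in Python using only the standard library. This is a minimal sketch, not official client code: it assembles a request body matching the `ChatCompletionRequest` schema below and shows how it would be POSTed with bearer authentication (the actual network call is left commented out and assumes an API key in the `MISTRAL_API_KEY` environment variable).

```python
import json
import os
import urllib.request

API_URL = "https://api.mistral.ai/v1/chat/completions"

def build_chat_request(model, messages, **options):
    """Assemble a request body per the ChatCompletionRequest schema."""
    payload = {"model": model, "messages": messages}
    payload.update(options)  # temperature, max_tokens, stream, ...
    return payload

def send_chat_request(payload, api_key):
    """POST the payload with bearer authentication (makes a network call)."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)  # a ChatCompletionResponse object

payload = build_chat_request(
    "mistral-large-latest",
    [{"role": "user", "content": "What is the capital of France?"}],
    temperature=0.7,
    max_tokens=64,
)
# response = send_chat_request(payload, os.environ["MISTRAL_API_KEY"])
# print(response["choices"][0]["message"]["content"])
```

The official `mistralai` Python client wraps this same endpoint; the raw-HTTP form is shown here only to make the wire format explicit.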

OpenAPI Specification

mistral-ai-chat-completions-openapi.yml
openapi: 3.1.0
info:
  title: Mistral AI Chat Completions API
  description: >-
    The Mistral AI Chat Completions API enables developers to interact with
    Mistral's language models in a conversational manner. It supports
    multi-turn conversations, function calling, JSON mode for structured
    outputs, and vision capabilities. The API provides access to a range of
    models including Mistral Large, Mistral Small, and Codestral, each
    optimized for different use cases from general reasoning to code
    generation.
  version: '1.0.0'
  contact:
    name: Mistral AI Support
    url: https://docs.mistral.ai
  termsOfService: https://mistral.ai/terms
externalDocs:
  description: Mistral AI Chat Completions Documentation
  url: https://docs.mistral.ai/api/endpoint/chat
servers:
  - url: https://api.mistral.ai/v1
    description: Mistral AI Production Server
tags:
  - name: Chat Completions
    description: >-
      Endpoints for generating chat completions using Mistral language models
      in a conversational format.
security:
  - bearerAuth: []
paths:
  /chat/completions:
    post:
      operationId: createChatCompletion
      summary: Create chat completion
      description: >-
        Creates a model response for the given chat conversation. The endpoint
        accepts a list of messages comprising a conversation and returns a
        model-generated message as a response. Supports multi-turn
        conversations, function calling, JSON mode, and streaming.
      tags:
        - Chat Completions
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/ChatCompletionRequest'
      responses:
        '200':
          description: Successful chat completion response
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ChatCompletionResponse'
        '400':
          description: Bad request due to invalid parameters
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Error'
        '401':
          description: Unauthorized due to missing or invalid API key
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Error'
        '429':
          description: Rate limit exceeded
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Error'
components:
  securitySchemes:
    bearerAuth:
      type: http
      scheme: bearer
      bearerFormat: API Key
  schemas:
    ChatCompletionRequest:
      type: object
      required:
        - model
        - messages
      properties:
        model:
          type: string
          description: >-
            ID of the model to use. You can use the Models API to see all
            available models.
          example: mistral-large-latest
        messages:
          type: array
          description: >-
            A list of messages comprising the conversation so far. Each message
            has a role and content.
          items:
            $ref: '#/components/schemas/ChatMessage'
        temperature:
          type: number
          description: >-
            Sampling temperature between 0.0 and 1.5. Higher values like 0.7
            produce more random output, while lower values like 0.2 produce
            more focused and deterministic output.
          minimum: 0.0
          maximum: 1.5
          default: 0.7
        top_p:
          type: number
          description: >-
            Nucleus sampling parameter. The model considers only the tokens
            comprising the top_p cumulative probability mass. A value of 0.1
            means only tokens in the top 10% of probability mass are
            considered.
          minimum: 0.0
          maximum: 1.0
          default: 1.0
        max_tokens:
          type: integer
          description: >-
            The maximum number of tokens to generate in the chat completion.
            The total token count of the prompt plus max_tokens cannot exceed
            the model's context length.
          minimum: 1
        stream:
          type: boolean
          description: >-
            Whether to stream back partial progress as server-sent events. If
            true, tokens are sent as data-only events as they become available,
            terminated by a data: [DONE] message.
          default: false
        stop:
          oneOf:
            - type: string
            - type: array
              items:
                type: string
          description: >-
            Stop generation when this token is detected, or, when an array is
            provided, when any of its tokens is detected.
        random_seed:
          type: integer
          description: >-
            The seed to use for random sampling. If set, repeated calls with
            the same parameters will produce deterministic results.
        response_format:
          type: object
          description: >-
            An object specifying the format that the model must output. Setting
            to json_object enables JSON mode.
          properties:
            type:
              type: string
              enum:
                - text
                - json_object
              description: >-
                The format type. Use json_object to enable JSON mode.
        tools:
          type: array
          description: >-
            A list of tools the model may call. Currently only functions are
            supported as a tool.
          items:
            $ref: '#/components/schemas/Tool'
        tool_choice:
          type: string
          description: >-
            Controls which tool is called by the model. Can be auto, none, any,
            or required.
          enum:
            - auto
            - none
            - any
            - required
        presence_penalty:
          type: number
          description: >-
            Penalizes repetition of words or phrases. A higher value encourages
            the model to use a wider variety of words and phrases.
          minimum: -2.0
          maximum: 2.0
          default: 0.0
        frequency_penalty:
          type: number
          description: >-
            Penalizes repetition based on frequency in the generated text. A
            higher value discourages repeating frequently used words.
          minimum: -2.0
          maximum: 2.0
          default: 0.0
        safe_prompt:
          type: boolean
          description: >-
            Whether to inject a safety prompt before all conversations.
          default: false
    ChatMessage:
      type: object
      required:
        - role
        - content
      properties:
        role:
          type: string
          description: >-
            The role of the message author.
          enum:
            - system
            - user
            - assistant
            - tool
        content:
          oneOf:
            - type: string
            - type: array
              items:
                type: object
          description: >-
            The content of the message. Can be a string or an array of content
            parts for multimodal inputs.
        tool_calls:
          type: array
          description: >-
            Tool calls generated by the model.
          items:
            $ref: '#/components/schemas/ToolCall'
        tool_call_id:
          type: string
          description: >-
            The ID of the tool call this message is responding to, required
            for tool role messages.
    Tool:
      type: object
      required:
        - type
        - function
      properties:
        type:
          type: string
          enum:
            - function
          description: >-
            The type of the tool. Currently only function is supported.
        function:
          $ref: '#/components/schemas/FunctionDefinition'
    FunctionDefinition:
      type: object
      required:
        - name
      properties:
        name:
          type: string
          description: >-
            The name of the function to be called.
        description:
          type: string
          description: >-
            A description of what the function does.
        parameters:
          type: object
          description: >-
            The parameters the function accepts, described as a JSON Schema
            object.
    ToolCall:
      type: object
      properties:
        id:
          type: string
          description: >-
            The ID of the tool call.
        type:
          type: string
          enum:
            - function
          description: >-
            The type of the tool call.
        function:
          type: object
          properties:
            name:
              type: string
              description: >-
                The name of the function to call.
            arguments:
              type: string
              description: >-
                The arguments to call the function with, as a JSON string.
    ChatCompletionResponse:
      type: object
      properties:
        id:
          type: string
          description: >-
            A unique identifier for the chat completion.
        object:
          type: string
          description: >-
            The object type, always chat.completion.
          enum:
            - chat.completion
        created:
          type: integer
          description: >-
            The Unix timestamp in seconds of when the completion was created.
        model:
          type: string
          description: >-
            The model used for the chat completion.
        choices:
          type: array
          description: >-
            A list of chat completion choices.
          items:
            $ref: '#/components/schemas/ChatCompletionChoice'
        usage:
          $ref: '#/components/schemas/Usage'
    ChatCompletionChoice:
      type: object
      properties:
        index:
          type: integer
          description: >-
            The index of the choice in the list of choices.
        message:
          $ref: '#/components/schemas/ChatMessage'
        finish_reason:
          type: string
          description: >-
            The reason the model stopped generating tokens.
          enum:
            - stop
            - length
            - tool_calls
            - model_length
    Usage:
      type: object
      properties:
        prompt_tokens:
          type: integer
          description: >-
            Number of tokens in the prompt.
        completion_tokens:
          type: integer
          description: >-
            Number of tokens in the generated completion.
        total_tokens:
          type: integer
          description: >-
            Total number of tokens used in the request.
    Error:
      type: object
      properties:
        message:
          type: string
          description: >-
            A human-readable error message.
        type:
          type: string
          description: >-
            The type of error.
        code:
          type: integer
          description: >-
            The HTTP status code.