Cohere Chat API

The Cohere Chat API enables developers to integrate large language model text generation capabilities into their applications through a conversational interface. It supports multi-turn conversations, tool use with JSON schema definitions, retrieval-augmented generation, and streaming responses. The API is available via the v2 endpoint and works with Cohere's Command family of models.

OpenAPI Specification

cohere-chat-api-openapi.yml
openapi: 3.1.0
info:
  title: Cohere Chat API
  description: >-
    The Cohere Chat API enables developers to integrate large language model
    text generation capabilities into their applications through a
    conversational interface. It supports multi-turn conversations, tool use
    with JSON schema definitions, retrieval-augmented generation, and
    streaming responses. The API is available via the v2 endpoint and works
    with Cohere's Command family of models.
  version: '2.0'
  contact:
    name: Cohere Support
    url: https://support.cohere.com
  termsOfService: https://cohere.com/terms-of-use
externalDocs:
  description: Cohere Chat API Documentation
  url: https://docs.cohere.com/reference/chat
servers:
  - url: https://api.cohere.com
    description: Cohere Production Server
tags:
  - name: Chat
    description: >-
      Endpoints for generating text responses through conversational
      interactions with Cohere language models.
security:
  - bearerAuth: []
paths:
  /v2/chat:
    post:
      operationId: chat
      summary: Chat with a Cohere model
      description: >-
        Generates a text response to a user message. Accepts a list of chat
        messages in chronological order representing a conversation between
        the user and the model. Each message has one of the roles user,
        assistant, system, or tool. Supports tool use, retrieval-augmented
        generation, and structured JSON output.
      tags:
        - Chat
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/ChatRequest'
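            # Illustrative request body, added as a sketch and not taken from
            # Cohere's documentation. It shows a minimal two-message
            # conversation that satisfies the ChatRequest schema defined under
            # components. The message text and parameter values are
            # assumptions chosen for illustration only.
            example:
              model: command-r-plus
              messages:
                - role: system
                  content: You are a concise assistant.
                - role: user
                  content: Summarize the plot of Hamlet in one sentence.
              max_tokens: 200
              temperature: 0.3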
      responses:
        '200':
          description: Successful chat completion response
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ChatResponse'
        '400':
          description: Bad request due to invalid parameters
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Error'
        '401':
          description: Unauthorized due to missing or invalid API key
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Error'
        '429':
          description: Rate limit exceeded
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Error'
  /v2/chat/stream:
    post:
      operationId: chatStream
      summary: Chat with streaming response
      description: >-
        Generates a text response to a user message and streams it using
        server-sent events (SSE). Partial results are delivered as they are
        generated, enabling real-time display in user interfaces. Emits
        stream-start, content-delta, citation-start, citation-end,
        tool-call-start, tool-call-delta, tool-call-end, and stream-end
        events.
      tags:
        - Chat
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/ChatRequest'
      responses:
        '200':
          description: Streaming chat completion response delivered as SSE events
          content:
            text/event-stream:
              schema:
                $ref: '#/components/schemas/StreamEvent'
        '400':
          description: Bad request due to invalid parameters
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Error'
        '401':
          description: Unauthorized due to missing or invalid API key
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Error'
        '429':
          description: Rate limit exceeded
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Error'
components:
  securitySchemes:
    bearerAuth:
      type: http
      scheme: bearer
      description: >-
        Bearer authentication using a Cohere API key. Pass the API key
        in the Authorization header as Bearer <token>.
  schemas:
    ChatRequest:
      type: object
      required:
        - model
        - messages
      properties:
        model:
          type: string
          description: >-
            The name of a compatible Cohere model to use for generation.
          example: command-r-plus
        messages:
          type: array
          description: >-
            A list of chat messages in chronological order representing a
            conversation between the user and the model. Each message has
            one of the roles user, assistant, system, or tool.
          items:
            $ref: '#/components/schemas/Message'
        tools:
          type: array
          description: >-
            A list of tools (functions) available to the model. The model
            may choose to call these tools during generation. Each tool is
            defined with a name, description, and JSON schema for parameters.
          items:
            $ref: '#/components/schemas/Tool'
        max_tokens:
          type: integer
          description: >-
            The maximum number of output tokens the model will generate in
            the response. If not set, defaults to the model's maximum output
            token limit.
          minimum: 1
        stop_sequences:
          type: array
          description: >-
            A list of up to 5 strings that the model will use to stop
            generation. If the model generates a string matching any entry,
            it will stop generating tokens.
          items:
            type: string
          maxItems: 5
        temperature:
          type: number
          description: >-
            A non-negative float that tunes the degree of randomness in
            generation. Lower temperatures mean less random generations and
            higher temperatures mean more random generations. Defaults to 0.3.
          minimum: 0
          maximum: 2
          default: 0.3
        response_format:
          type: object
          description: >-
            Controls the format of the model's output. Set type to
            json_object to force JSON output. Optionally provide a JSON
            Schema in json_schema to constrain the structure of that output.
          properties:
            type:
              type: string
              enum:
                - text
                - json_object
              description: >-
                The format type for the response output.
            json_schema:
              type: object
              description: >-
                An optional JSON Schema that the output must conform to
                when type is json_object.
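          # Illustrative value, added as a sketch: requests JSON output that
          # should conform to a small schema. The title field is a
          # hypothetical name used only for this example.
          example:
            type: json_object
            json_schema:
              type: object
              properties:
                title:
                  type: string
              required:
                - title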
        safety_mode:
          type: string
          enum:
            - CONTEXTUAL
            - STRICT
            - NONE
          description: >-
            Used to select the safety instruction inserted into the prompt.
            Defaults to CONTEXTUAL.
          default: CONTEXTUAL
    Message:
      type: object
      required:
        - role
      properties:
        role:
          type: string
          enum:
            - user
            - assistant
            - system
            - tool
          description: >-
            The role of the message author in the conversation.
        content:
          type: string
          description: >-
            The text content of the message.
        tool_call_id:
          type: string
          description: >-
            The ID of the tool call this message is responding to. Required
            when role is tool.
        tool_calls:
          type: array
          description: >-
            Tool calls generated by the model. Present when role is assistant
            and the model decided to call tools.
          items:
            $ref: '#/components/schemas/ToolCall'
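      # Illustrative tool-result message, added as a sketch. It shows how
      # tool_call_id ties a tool message back to the call it answers. The
      # id call_0 and the JSON payload are hypothetical values.
      example:
        role: tool
        tool_call_id: call_0
        content: '{"temperature_c": 21}'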
    Tool:
      type: object
      required:
        - type
        - function
      properties:
        type:
          type: string
          enum:
            - function
          description: >-
            The type of tool. Currently only function is supported.
        function:
          type: object
          required:
            - name
            - description
          properties:
            name:
              type: string
              description: >-
                The name of the function to be called.
            description:
              type: string
              description: >-
                A description of what the function does.
            parameters:
              type: object
              description: >-
                The parameters the function accepts, described as a JSON Schema
                object.
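      # Illustrative tool definition, added as a sketch. The function name
      # get_weather and its city parameter are hypothetical; they only show
      # how parameters is expressed as a JSON Schema object.
      example:
        type: function
        function:
          name: get_weather
          description: Returns the current temperature for a city.
          parameters:
            type: object
            properties:
              city:
                type: string
            required:
              - city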
    ToolCall:
      type: object
      properties:
        id:
          type: string
          description: >-
            The unique identifier for this tool call.
        type:
          type: string
          enum:
            - function
          description: >-
            The type of tool call.
        function:
          type: object
          properties:
            name:
              type: string
              description: >-
                The name of the function to call.
            arguments:
              type: string
              description: >-
                The arguments to pass to the function, as a JSON string.
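      # Illustrative tool call, added as a sketch. Note that arguments is a
      # JSON-encoded string rather than a nested object, as the schema above
      # states. The id and function name are hypothetical values.
      example:
        id: call_0
        type: function
        function:
          name: get_weather
          arguments: '{"city": "Toronto"}'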
    ChatResponse:
      type: object
      properties:
        id:
          type: string
          description: >-
            Unique identifier for the chat completion.
        finish_reason:
          type: string
          enum:
            - complete
            - max_tokens
            - stop_sequence
            - tool_call
            - error
            - timeout
          description: >-
            The reason the chat request finished. Values include complete,
            max_tokens, stop_sequence, tool_call, error, and timeout.
        message:
          $ref: '#/components/schemas/Message'
        usage:
          $ref: '#/components/schemas/Usage'
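      # Illustrative response, added as a sketch of how the fields above fit
      # together. The id, message text, and token counts are made-up
      # placeholder values, not output captured from the API.
      example:
        id: example-response-id
        finish_reason: complete
        message:
          role: assistant
          content: Hamlet avenges his father's murder at the cost of his own life.
        usage:
          billed_units:
            input_tokens: 18
            output_tokens: 14
          tokens:
            input_tokens: 85
            output_tokens: 14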
    Usage:
      type: object
      properties:
        billed_units:
          type: object
          properties:
            input_tokens:
              type: integer
              description: >-
                The number of billed input tokens.
            output_tokens:
              type: integer
              description: >-
                The number of billed output tokens.
        tokens:
          type: object
          properties:
            input_tokens:
              type: integer
              description: >-
                The total number of input tokens processed.
            output_tokens:
              type: integer
              description: >-
                The total number of output tokens generated.
    StreamEvent:
      type: object
      description: >-
        A server-sent event emitted during streaming chat generation. Event
        types include stream-start, content-delta, citation-start,
        citation-end, tool-call-start, tool-call-delta, tool-call-end,
        and stream-end.
      properties:
        event_type:
          type: string
          enum:
            - stream-start
            - content-delta
            - citation-start
            - citation-end
            - tool-call-start
            - tool-call-delta
            - tool-call-end
            - stream-end
          description: >-
            The type of streaming event.
        delta:
          type: object
          description: >-
            The incremental content payload for delta events.
          properties:
            message:
              type: object
              properties:
                content:
                  type: object
                  properties:
                    text:
                      type: string
                      description: >-
                        The incremental text content.
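      # Illustrative content-delta event, added as a sketch of the nesting
      # described above (delta, then message, then content, then text). The
      # text value is a placeholder.
      example:
        event_type: content-delta
        delta:
          message:
            content:
              text: Hello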
    Error:
      type: object
      properties:
        message:
          type: string
          description: >-
            A human-readable error message describing what went wrong.
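      # Illustrative error body, added as a sketch. The message text is a
      # hypothetical placeholder, not a string the API is known to return.
      example:
        message: 'invalid request: messages must contain at least one item'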