Microsoft Azure
Microsoft Azure Computer Vision API

Microsoft Azure Computer Vision API is a cloud-based service for analyzing images and extracting information from them. According to the specification below, version 1.0 can categorize and tag image content, detect faces and mature content, estimate dominant and accent colors, describe an image in complete English sentences, recognize printed and handwritten text, apply domain-specific models such as celebrity recognition, and generate thumbnails. These operations let users understand the content of their imagery and automate image processing tasks.
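As a rough illustration of how the pieces of the specification fit together, the Python sketch below builds (without sending) a POST request to /analyze from the host template, base path, and API key header that the spec defines. Several things here are assumptions rather than facts from this page: the region name westus, the placeholder key, and the JSON body shape {"url": ...}, because the AzureRegion values and the ImageUrl body parameter live in shared files that are not reproduced here. build_analyze_request is a hypothetical helper name.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

BASE_PATH = "/vision/v1.0"  # basePath from the spec


def build_analyze_request(region, subscription_key, image_url, details=()):
    """Build (but do not send) a POST /analyze request."""
    # hostTemplate from the spec: '{AzureRegion}.api.cognitive.microsoft.com'
    host = f"{region}.api.cognitive.microsoft.com"
    url = f"https://{host}{BASE_PATH}/analyze"
    if details:
        # collectionFormat: csv means one comma-separated query value
        url += "?" + urlencode({"details": ",".join(details)}, safe=",")
    # ASSUMPTION: the ImageUrl body parameter is a JSON object with a "url" key
    body = json.dumps({"url": image_url}).encode("utf-8")
    headers = {
        "Ocp-Apim-Subscription-Key": subscription_key,  # apim_key in the spec
        "Content-Type": "application/json",
    }
    return Request(url, data=body, headers=headers, method="POST")


# "westus" and the key are placeholders, not values taken from the spec.
request = build_analyze_request(
    "westus",
    "<your-subscription-key>",
    "https://example.com/photo.jpg",
    details=["Celebrities", "Landmarks"],
)
print(request.get_method(), request.full_url)
```

Passing the returned object to urllib.request.urlopen would send it; the sketch stops short of that so it can be run without credentials.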
Documentation

https://learn.microsoft.com/en-us/azure/ai-services/computer-vision/
Specifications

OpenAPI
https://raw.githubusercontent.com/api-evangelist/microsoft-azure/refs/heads/main/openapi/computer-vision-api-openapi-original.yml
OpenAPI Specification

swagger: '2.0'
info:
  version: '1.0'
  title: Microsoft Azure Computer Vision API
  description: >-
    The Computer Vision API provides state-of-the-art algorithms to process
    images and return information. For example, it can be used to determine if
    an image contains mature content, or it can be used to find all the faces in
    an image.  It also has other features like estimating dominant and accent
    colors, categorizing the content of images, and describing an image with
    complete English sentences.  Additionally, it can intelligently
    generate image thumbnails for displaying large images effectively.
securityDefinitions:
  apim_key:
    type: apiKey
    name: Ocp-Apim-Subscription-Key
    in: header
security:
  - apim_key: []
x-ms-parameterized-host:
  hostTemplate: '{AzureRegion}.api.cognitive.microsoft.com'
  parameters:
    - $ref: ../../../Common/ExtendedRegions.json#/parameters/AzureRegion
basePath: /vision/v1.0
schemes:
  - https
paths:
  /models:
    get:
      description: >-
        This operation returns the list of domain-specific models that are
        supported by the Computer Vision API.  Currently, the API only supports
        one domain-specific model: a celebrity recognizer. A successful response
        will be returned in JSON.  If the request failed, the response will
        contain an error code and a message to help understand what went wrong.
      operationId: microsoftAzureListmodels
      produces:
        - application/json
      responses:
        '200':
          description: List of available domain models.
          schema:
            $ref: '#/definitions/ListModelsResult'
        default:
          description: Error response.
          schema:
            $ref: '#/definitions/ComputerVisionError'
      x-ms-examples:
        Successful List Domains request:
          $ref: ./examples/SuccessfulListDomainModels.json
      summary: Microsoft Azure Get Models
      tags:
        - Models
  /analyze:
    post:
      description: >-
        This operation extracts a rich set of visual features based on the image
        content. Two input methods are supported -- (1) Uploading an image or
        (2) specifying an image URL.  Within your request, there is an optional
        parameter to allow you to choose which features to return.  By default,
        image categories are returned in the response.
      operationId: microsoftAzureAnalyzeimage
      consumes:
        - application/json
      produces:
        - application/json
      parameters:
        - $ref: '#/parameters/VisualFeatures'
        - name: details
          in: query
          description: >-
            A string indicating which domain-specific details to return.
            Multiple values should be comma-separated. Valid domain-specific
            detail types include: Celebrities - identifies celebrities if
            detected in the image; Landmarks - identifies landmarks if detected
            in the image.
          type: array
          required: false
          collectionFormat: csv
          items:
            type: string
            x-nullable: false
            x-ms-enum:
              name: Details
              modelAsString: false
            enum:
              - Celebrities
              - Landmarks
        - $ref: '#/parameters/ServiceLanguage'
        - $ref: ../../../Common/Parameters.json#/parameters/ImageUrl
      responses:
        '200':
          description: >-
            The response includes the extracted features in JSON format. Here
            are the definitions for enumeration types. ClipartType: Non-clipart
            = 0, ambiguous = 1, normal-clipart = 2, good-clipart = 3.
            LineDrawingType: Non-LineDrawing = 0, LineDrawing = 1.
          schema:
            $ref: '#/definitions/ImageAnalysis'
        default:
          description: Error response.
          schema:
            $ref: '#/definitions/ComputerVisionError'
      x-ms-examples:
        Successful Analyze with Url request:
          $ref: ./examples/SuccessfulAnalyzeWithUrl.json
      summary: Microsoft Azure Post Analyze
      tags:
        - Analyze
  /generateThumbnail:
    post:
      description: >-
        This operation generates a thumbnail image with the user-specified width
        and height. By default, the service analyzes the image, identifies the
        region of interest (ROI), and generates smart cropping coordinates based
        on the ROI. Smart cropping helps when you specify an aspect ratio that
        differs from that of the input image. A successful response contains the
        thumbnail image binary. If the request failed, the response contains an
        error code and a message to help determine what went wrong.
      operationId: microsoftAzureGeneratethumbnail
      consumes:
        - application/json
      produces:
        - application/octet-stream
      parameters:
        - name: width
          type: integer
          in: query
          required: true
          minimum: 1
          maximum: 1023
          description: >-
            Width of the thumbnail. It must be between 1 and 1024. Recommended
            minimum of 50.
        - name: height
          type: integer
          in: query
          required: true
          minimum: 1
          maximum: 1023
          description: >-
            Height of the thumbnail. It must be between 1 and 1024. Recommended
            minimum of 50.
        - $ref: ../../../Common/Parameters.json#/parameters/ImageUrl
        - name: smartCropping
          type: boolean
          in: query
          required: false
          default: false
          description: Boolean flag for enabling smart cropping.
      responses:
        '200':
          description: The generated thumbnail in binary format.
          schema:
            type: file
        default:
          description: Error response.
          schema:
            $ref: '#/definitions/ComputerVisionError'
      x-ms-examples:
        Successful Generate Thumbnail request:
          $ref: ./examples/SuccessfulGenerateThumbnailWithUrl.json
      summary: Microsoft Azure Post Generatethumbnail
      tags:
        - generateThumbnail
  /ocr:
    post:
      description: >-
        Optical Character Recognition (OCR) detects printed text in an image and
        extracts the recognized characters into a machine-usable character
        stream.   Upon success, the OCR results will be returned. Upon failure,
        the error code together with an error message will be returned. The
        error code can be one of InvalidImageUrl, InvalidImageFormat,
        InvalidImageSize, NotSupportedImage,  NotSupportedLanguage, or
        InternalServerError.
      operationId: microsoftAzureRecognizeprintedtext
      consumes:
        - application/json
      produces:
        - application/json
      parameters:
        - $ref: '#/parameters/DetectOrientation'
        - $ref: ../../../Common/Parameters.json#/parameters/ImageUrl
        - $ref: '#/parameters/OcrLanguage'
      responses:
        '200':
          description: >-
            The OCR results in the hierarchy of region/line/word. The results
            include text and bounding boxes for regions, lines, and words.
            textAngle: the angle, in degrees, of the detected text with respect
            to the closest
            horizontal or vertical direction. After rotating the input image
            clockwise by this angle, the recognized text lines become horizontal
            or vertical. In combination with the orientation property it can be
            used to overlay recognition results correctly on the original image,
            by rotating either the original image or recognition results by a
            suitable angle around the center of the original image. If the angle
            cannot be confidently detected, this property is not present. If the
            image contains text at different angles, only part of the text will
            be recognized correctly.
          schema:
            $ref: '#/definitions/OcrResult'
        default:
          description: Error response.
          schema:
            $ref: '#/definitions/ComputerVisionError'
      x-ms-examples:
        Successful Ocr request:
          $ref: ./examples/SuccessfulOcrWithUrl.json
      summary: Microsoft Azure Post Ocr
      tags:
        - Ocr
  /describe:
    post:
      description: >-
        This operation generates a description of an image in human readable
        language with complete sentences.  The description is based on a
        collection of content tags, which are also returned by the operation.
        More than one description can be generated for each image.  Descriptions
        are ordered by their confidence score. All descriptions are in English.
        Two input methods are supported -- (1) uploading an image or (2)
        specifying an image URL. A successful response will be returned in
        JSON.  If the request failed, the response will contain an error code
        and a message to help understand what went wrong.
      operationId: microsoftAzureDescribeimage
      consumes:
        - application/json
      produces:
        - application/json
      parameters:
        - name: maxCandidates
          in: query
          description: >-
            Maximum number of candidate descriptions to be returned.  The
            default is 1.
          type: string
          required: false
          default: '1'
        - $ref: '#/parameters/ServiceLanguage'
        - $ref: ../../../Common/Parameters.json#/parameters/ImageUrl
      responses:
        '200':
          description: Image description object.
          schema:
            $ref: '#/definitions/ImageDescription'
        default:
          description: Error response.
          schema:
            $ref: '#/definitions/ComputerVisionError'
      x-ms-examples:
        Successful Describe request:
          $ref: ./examples/SuccessfulDescribeWithUrl.json
      summary: Microsoft Azure Post Describe
      tags:
        - Describe
  /tag:
    post:
      description: >-
        This operation generates a list of words, or tags, that are relevant to
        the content of the supplied image. The Computer Vision API can return
        tags based on objects, living beings, scenery or actions found in
        images. Unlike categories, tags are not organized according to a
        hierarchical classification system, but correspond to image content.
        Tags may contain hints to avoid ambiguity or provide context, for
        example the tag 'cello' may be accompanied by the hint 'musical
        instrument'. All tags are in English.
      operationId: microsoftAzureTagimage
      consumes:
        - application/json
      produces:
        - application/json
      parameters:
        - $ref: '#/parameters/ServiceLanguage'
        - $ref: ../../../Common/Parameters.json#/parameters/ImageUrl
      responses:
        '200':
          description: Image tags object.
          schema:
            $ref: '#/definitions/TagResult'
        default:
          description: Error response.
          schema:
            $ref: '#/definitions/ComputerVisionError'
      x-ms-examples:
        Successful Tag request:
          $ref: ./examples/SuccessfulTagWithUrl.json
      summary: Microsoft Azure Post Tag
      tags:
        - Tag
  /models/{model}/analyze:
    post:
      description: >-
        This operation recognizes content within an image by applying a
        domain-specific model.  The list of domain-specific models that are
        supported by the Computer Vision API can be retrieved using the /models
        GET request.  Currently, the API only provides a single domain-specific
        model: celebrities. Two input methods are supported -- (1) Uploading an
        image or (2) specifying an image URL. A successful response will be
        returned in JSON.  If the request failed, the response will contain an
        error code and a message to help understand what went wrong.
      operationId: microsoftAzureAnalyzeimagebydomain
      consumes:
        - application/json
      produces:
        - application/json
      parameters:
        - name: model
          in: path
          description: The domain-specific content to recognize.
          required: true
          type: string
        - $ref: '#/parameters/ServiceLanguage'
        - $ref: ../../../Common/Parameters.json#/parameters/ImageUrl
      responses:
        '200':
          description: Analysis result based on the domain model
          schema:
            $ref: '#/definitions/DomainModelResults'
        default:
          description: Error response.
          schema:
            $ref: '#/definitions/ComputerVisionError'
      x-ms-examples:
        Successful Domain Model analysis request:
          $ref: ./examples/SuccessfulDomainModelWithUrl.json
      summary: Microsoft Azure Post Models Model Analyze
      tags:
        - Models
  /recognizeText:
    post:
      description: >-
        Recognize Text operation. When you use the Recognize Text interface, the
        response contains a header called 'Operation-Location'. The
        'Operation-Location' header contains the URL that you must use for your
        Get Text Operation Result operation.
      operationId: microsoftAzureRecognizetext
      parameters:
        - $ref: ../../../Common/Parameters.json#/parameters/ImageUrl
        - $ref: '#/parameters/HandwritingBoolean'
      consumes:
        - application/json
      produces:
        - application/json
      responses:
        '202':
          description: >-
            The service has accepted the request and will start processing
            later. It will return Accepted immediately and include an
            Operation-Location header. Client side should further query the
            operation status using the URL specified in this header. The
            operation ID will expire in 48 hours.
          headers:
            Operation-Location:
              description: >-
                URL to query for status of the operation. The operation ID will
                expire in 48 hours. 
              type: string
        default:
          description: Error response.
          schema:
            $ref: '#/definitions/ComputerVisionError'
      x-ms-examples:
        Successful Recognize Text request:
          $ref: ./examples/SuccessfulRecognizeTextWithUrl.json
      summary: Microsoft Azure Post Recognizetext
      tags:
        - recognizeText
  /textOperations/{operationId}:
    get:
      description: >-
        This interface is used for getting the result of a text operation. The
        URL to this interface should be retrieved from the 'Operation-Location'
        header returned by the Recognize Text interface.
      operationId: microsoftAzureGettextoperationresult
      parameters:
        - name: operationId
          in: path
          description: >-
            ID of the text operation, returned in the response of the
            'Recognize Text' operation.
          required: true
          type: string
      produces:
        - application/json
      responses:
        '200':
          description: Returns the operation status.
          schema:
            $ref: '#/definitions/TextOperationResult'
        default:
          description: Error response.
          schema:
            $ref: '#/definitions/ComputerVisionError'
      x-ms-examples:
        Successful Get Text Operation Result request:
          $ref: ./examples/SuccessfulGetTextOperationResult.json
      summary: Microsoft Azure Get Textoperations Operationid
      tags:
        - textOperations
x-ms-paths:
  /analyze?overload=stream:
    post:
      description: >-
        This operation extracts a rich set of visual features based on the image
        content.
      operationId: AnalyzeImageInStream
      consumes:
        - application/octet-stream
        - multipart/form-data
      produces:
        - application/json
      parameters:
        - $ref: '#/parameters/VisualFeatures'
        - name: details
          in: query
          description: >-
            A string indicating which domain-specific details to return.
            Multiple values should be comma-separated. Valid domain-specific
            detail types include: Celebrities - identifies celebrities if
            detected in the image; Landmarks - identifies landmarks if detected
            in the image.
          type: string
          required: false
          enum:
            - Celebrities
            - Landmarks
        - $ref: '#/parameters/ServiceLanguage'
        - $ref: ../../../Common/Parameters.json#/parameters/ImageStream
      responses:
        '200':
          description: >-
            The response includes the extracted features in JSON format. Here
            are the definitions for enumeration types. ClipartType: Non-clipart
            = 0, ambiguous = 1, normal-clipart = 2, good-clipart = 3.
            LineDrawingType: Non-LineDrawing = 0, LineDrawing = 1.
          schema:
            $ref: '#/definitions/ImageAnalysis'
        default:
          description: Error response.
          schema:
            $ref: '#/definitions/ComputerVisionError'
      x-ms-examples:
        Successful Analyze with Stream request:
          $ref: ./examples/SuccessfulAnalyzeWithStream.json
  /generateThumbnail?overload=stream:
    post:
      description: >-
        This operation generates a thumbnail image with the user-specified width
        and height. By default, the service analyzes the image, identifies the
        region of interest (ROI), and generates smart cropping coordinates based
        on the ROI. Smart cropping helps when you specify an aspect ratio that
        differs from that of the input image. A successful response contains the
        thumbnail image binary. If the request failed, the response contains an
        error code and a message to help determine what went wrong.
      operationId: GenerateThumbnailInStream
      consumes:
        - application/octet-stream
        - multipart/form-data
      produces:
        - application/octet-stream
      parameters:
        - name: width
          type: integer
          in: query
          required: true
          minimum: 1
          maximum: 1023
          description: >-
            Width of the thumbnail. It must be between 1 and 1024. Recommended
            minimum of 50.
        - name: height
          type: integer
          in: query
          required: true
          minimum: 1
          maximum: 1023
          description: >-
            Height of the thumbnail. It must be between 1 and 1024. Recommended
            minimum of 50.
        - $ref: ../../../Common/Parameters.json#/parameters/ImageStream
        - name: smartCropping
          type: boolean
          in: query
          required: false
          default: false
          description: Boolean flag for enabling smart cropping.
      responses:
        '200':
          description: The generated thumbnail in binary format.
          schema:
            type: file
        default:
          description: Error response.
          schema:
            $ref: '#/definitions/ComputerVisionError'
      x-ms-examples:
        Successful Generate Thumbnail request:
          $ref: ./examples/SuccessfulGenerateThumbnailWithStream.json
  /ocr?overload=stream:
    post:
      description: >-
        Optical Character Recognition (OCR) detects printed text in an image and
        extracts the recognized characters into a machine-usable character
        stream.   Upon success, the OCR results will be returned. Upon failure,
        the error code together with an error message will be returned. The
        error code can be one of InvalidImageUrl, InvalidImageFormat,
        InvalidImageSize, NotSupportedImage,  NotSupportedLanguage, or
        InternalServerError.
      operationId: RecognizePrintedTextInStream
      consumes:
        - application/octet-stream
        - multipart/form-data
      produces:
        - application/json
      parameters:
        - $ref: '#/parameters/OcrLanguage'
        - $ref: '#/parameters/DetectOrientation'
        - $ref: ../../../Common/Parameters.json#/parameters/ImageStream
      responses:
        '200':
          description: >-
            The OCR results in the hierarchy of region/line/word. The results
            include text and bounding boxes for regions, lines, and words.
            textAngle: the angle, in degrees, of the detected text with respect
            to the closest
            horizontal or vertical direction. After rotating the input image
            clockwise by this angle, the recognized text lines become horizontal
            or vertical. In combination with the orientation property it can be
            used to overlay recognition results correctly on the original image,
            by rotating either the original image or recognition results by a
            suitable angle around the center of the original image. If the angle
            cannot be confidently detected, this property is not present. If the
            image contains text at different angles, only part of the text will
            be recognized correctly.
          schema:
            $ref: '#/definitions/OcrResult'
        default:
          description: Error response.
          schema:
            $ref: '#/definitions/ComputerVisionError'
      x-ms-examples:
        Successful Ocr request:
          $ref: ./examples/SuccessfulOcrWithStream.json
  /describe?overload=stream:
    post:
      description: >-
        This operation generates a description of an image in human readable
        language with complete sentences.  The description is based on a
        collection of content tags, which are also returned by the operation.
        More than one description can be generated for each image.  Descriptions
        are ordered by their confidence score. All descriptions are in English.
        Two input methods are supported -- (1) uploading an image or (2)
        specifying an image URL. A successful response will be returned in
        JSON.  If the request failed, the response will contain an error code
        and a message to help understand what went wrong.
      operationId: DescribeImageInStream
      consumes:
        - application/octet-stream
        - multipart/form-data
      produces:
        - application/json
      parameters:
        - name: maxCandidates
          in: query
          description: >-
            Maximum number of candidate descriptions to be returned.  The
            default is 1.
          type: string
          required: false
          default: '1'
        - $ref: '#/parameters/ServiceLanguage'
        - $ref: ../../../Common/Parameters.json#/parameters/ImageStream
      responses:
        '200':
          description: Image description object.
          schema:
            $ref: '#/definitions/ImageDescription'
        default:
          description: Error response.
          schema:
            $ref: '#/definitions/ComputerVisionError'
      x-ms-examples:
        Successful Describe request:
          $ref: ./examples/SuccessfulDescribeWithStream.json
  /tag?overload=stream:
    post:
      description: >-
        This operation generates a list of words, or tags, that are relevant to
        the content of the supplied image. The Computer Vision API can return
        tags based on objects, living beings, scenery or actions found in
        images. Unlike categories, tags are not organized according to a
        hierarchical classification system, but correspond to image content.
        Tags may contain hints to avoid ambiguity or provide context, for
        example the tag 'cello' may be accompanied by the hint 'musical
        instrument'. All tags are in English.
      operationId: TagImageInStream
      consumes:
        - application/octet-stream
        - multipart/form-data
      produces:
        - application/json
      parameters:
        - $ref: '#/parameters/ServiceLanguage'
        - $ref: ../../../Common/Parameters.json#/parameters/ImageStream
      responses:
        '200':
          description: Image tags object.
          schema:
            $ref: '#/definitions/TagResult'
        default:
          description: Error response.
          schema:
            $ref: '#/definitions/ComputerVisionError'
      x-ms-examples:
        Successful Tag request:
          $ref: ./examples/SuccessfulTagWithStream.json
  /models/{model}/analyze?overload=stream:
    post:
      description: >-
        This operation recognizes content within an image by applying a
        domain-specific model.  The list of domain-specific models that are
        supported by the Computer Vision API can be retrieved using the /models
        GET request.  Currently, the API only provides a single domain-specific
        model: celebrities. Two input methods are supported -- (1) Uploading an
        image or (2) specifying an image URL. A successful response will be
        returned in JSON.  If the request failed, the response will contain an
        error code and a message to help understand what went wrong.
      operationId: AnalyzeImageByDomainInStream
      consumes:
        - application/octet-stream
        - multipart/form-data
      produces:
        - application/json
      parameters:
        - name: model
          in: path
          description: The domain-specific content to recognize.
          required: true
          type: string
        - $ref: '#/parameters/ServiceLanguage'
        - $ref: ../../../Common/Parameters.json#/parameters/ImageStream
      responses:
        '200':
          description: Analysis result based on the domain model
          schema:
            $ref: '#/definitions/DomainModelResults'
        default:
          description: Error response.
          schema:
            $ref: '#/definitions/ComputerVisionError'
      x-ms-examples:
        Successful Domain Model analysis request:
          $ref: ./examples/SuccessfulDomainModelWithStream.json
  /recognizeText?overload=stream:
    post:
      description: >-
        Recognize Text operation. When you use the Recognize Text interface, the
        response contains a header called 'Operation-Location'. The
        'Operation-Location' header contains the URL that you must use for your
        Get Text Operation Result operation.
      operationId: RecognizeTextInStream
      parameters:
        - $ref: '#/parameters/HandwritingBoolean'
        - $ref: ../../../Common/Parameters.json#/parameters/ImageStream
      consumes:
        - application/octet-stream
      produces:
        - application/json
      responses:
        '202':
          description: >-
            The service has accepted the request and will start processing
            later.
          headers:
            Operation-Location:
              description: >-
                URL to query for status of the operation. The operation ID will
                expire in 48 hours. 
              type: string
        default:
          description: Error response.
          schema:
            $ref: '#/definitions/ComputerVisionError'
      x-ms-examples:
        Successful Recognize Text request:
          $ref: ./examples/SuccessfulRecognizeTextWithStream.json
definitions:
  TextOperationResult:
    type: object
    properties:
      status:
        type: string
        description: Status of the text operation.
        enum:
          - Not Started
          - Running
          - Failed
          - Succeeded
        x-ms-enum:
          name: TextOperationStatusCodes
          modelAsString: false
        x-nullable: false
      recognitionResult:
        $ref: '#/definitions/RecognitionResult'
  RecognitionResult:
    type: object
    properties:
      lines:
        type: array
        items:
          $ref: '#/definitions/Line'
  Line:
    type: object
    properties:
      boundingBox:
        $ref: '#/definitions/BoundingBox'
      text:
        type: string
      words:
        type: array
        items:
          $ref: '#/definitions/Word'
  Word:
    type: object
    properties:
      boundingBox:
        $ref: '#/definitions/BoundingBox'
      text:
        type: string
  BoundingBox:
    type: array
    items:
      type: integer
      x-nullable: false
  ImageAnalysis:
    type: object
    description: Result of AnalyzeImage operation.
    properties:
      categories:
        type: array
        description: An array indicating identified categories.
        items:
          $ref: '#/definitions/Category'
      adult:
        $ref: '#/definitions/AdultInfo'
      color:
        $ref: '#/definitions/ColorInfo'
      imageType:
        $ref: '#/definitions/ImageType'
      tags:
        type: array
        description: A list of tags with confidence level.
        items:
          $ref: '#/definitions/ImageTag'
      description:
        $ref: '#/definitions/ImageDescriptionDetails'
      faces:
        type: array
        description: An array of possible faces within the image.
        items:
          $ref: '#/definitions/FaceDescription'
      requestId:
        type: string
        description: Id of the request for tracking purposes.
      metadata:
        $ref: '#/definitions/ImageMetadata'
  OcrResult:
    type: object
    properties:
      language:
        type: string
        description: The BCP-47 language code of the text in the image.
      textAngle:
        type: number
        format: double
        description: >-
          The angle, in degrees, of the detected text with respect to the
          closest horizontal or vertical direction. After rotating the input
          image clockwise by this angle, the recognized text lines become
          horizontal or vertical. In combination with the orientation property
          it can be used to overlay recognition results correctly on the
          original image, by rotating either the original image or recognition
          results by a suitable angle around the center of the original image.
          If the angle cannot be confidently detected, this property is not
          present. If the image contains text at different angles, only part of
          the text will be recognized correctly.
      orientation:
        type: string
        description: >-
          Orientation of the text recognized in the image. The value
          (up, down, left, or right) refers to the direction that the top of the
          recognized text is facing, after the image has been rotated around its
          center according to the detected text angle (see textAngle property).
      regions:
        type: array
        description: >-
          An array of objects, where each object represents a region of
          recognized text.
        items:
          $ref: '#/definitions/OcrRegion'
  OcrRegion:
    type: object
    description: >-
      A region consists of multiple lines (e.g. a column of text in a
      multi-column document).
    properties:
      boundingBox:
        type: string
        description: >-
          Bounding box of a recognized region. The four integers represent the
          x-coordinate of the left edge, the y-coordinate of the top edge,
          width, and height of the bounding box, in the coordinate system of the
          input image, after it has been rotated around its center according to
          the detected text angle (see textAngle property), with the origin at
          the top-left corner, and the y-axis pointing down.
      lines:
        type: array
        items:
          $ref: '#/definitions/OcrLine'
  OcrLine:
    type: object
    description: An object describing a single recognized line of text.
    properties:
      boundingBox:
        type: string
        description: >-
          Bounding box of a recognized line. The four integers represent the
          x-coordinate of the left edge, the y-coordinate of the top edge,
          width, and height of the bounding box, in the coordinate system of the
          input image, after it has been rotated around its center according to
          the detected text angle (see textAngle property), with the origin at
          the top-left corner, and the y-axis pointing down.
      words:
        type: array
        description: An array of objects, where each object represents a recognized word.
        items:
          $ref: '#/definitions/OcrWord'
  OcrWord:
    type: object
    description: Information on a recognized word.
    properties:
      boundingBox:
        type: string
        description: >-
          Bounding box of a recognized word. The four integers represent the
          x-coordinate of the left edge, the y-coordinate of the top edge,
          width, and height of the bounding box, in the coordinate system of the
          input image, af

# --- truncated at 32 KB (44 KB total) ---
# Full source: https://raw.githubusercontent.com/api-evangelist/microsoft-azure/refs/heads/main/openapi/computer-vision-api-openapi-original.yml
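
The asynchronous flow that the spec describes for /recognizeText and /textOperations/{operationId} (a 202 response carrying an Operation-Location header, followed by polling until the status is Succeeded or Failed) can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not an official client: fetch_result is a hypothetical callable standing in for an authenticated GET against the Operation-Location URL, and the polling interval and attempt limit are arbitrary choices. The status strings and the recognitionResult/lines/text field names come from the TextOperationResult, RecognitionResult, and Line definitions.

```python
import time
from urllib.parse import urlparse


def operation_id_from_location(operation_location):
    """Return the last path segment of an Operation-Location URL."""
    path = urlparse(operation_location).path
    return path.rstrip("/").rsplit("/", 1)[-1]


def wait_for_text_operation(fetch_result, interval_s=1.0, max_attempts=30,
                            sleep=time.sleep):
    """Poll until the operation finishes and return the recognized lines.

    fetch_result is a hypothetical zero-argument callable that returns the
    parsed TextOperationResult JSON as a dict.
    """
    for _ in range(max_attempts):
        result = fetch_result()
        status = result.get("status")
        if status == "Succeeded":
            lines = result.get("recognitionResult", {}).get("lines", [])
            return [line.get("text", "") for line in lines]
        if status == "Failed":
            raise RuntimeError("text operation reported status 'Failed'")
        # "Not Started" or "Running": wait and try again
        sleep(interval_s)
    raise TimeoutError("text operation did not finish in time")


# Simulated responses so the sketch runs without network access.
fake_responses = iter([
    {"status": "Not Started"},
    {"status": "Running"},
    {"status": "Succeeded",
     "recognitionResult": {"lines": [{"text": "hello world"}]}},
])
recognized = wait_for_text_operation(lambda: next(fake_responses),
                                     sleep=lambda seconds: None)
print(recognized)
```

Injecting fetch_result and sleep keeps the polling logic separate from HTTP and timing concerns, which makes it straightforward to exercise with canned responses as shown.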