Prime Intellect Evaluations API

Create and manage evaluations against the Prime Intellect Environments Hub (2,500+ RL environments), and submit samples to them. The API supports both client-driven evaluations (push samples, finalize, retrieve) and fully hosted evaluations executed by Prime Intellect, with logs, cancellation, and selectable inference models. It backs the `prime eval` CLI commands.
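The client-driven flow (create, push samples, finalize, retrieve) maps onto four endpoints in the specification below. The Python sketch that follows only builds the requests, without sending them, to show the paths, methods, and call order. The helper names are hypothetical, and the request bodies for creating an evaluation and pushing samples are left to the caller because the `CreateEvaluationRequest` and `PushSamplesRequest` schemas are not shown in this excerpt.

```python
import json
import urllib.request
from typing import Any, Dict, Optional
from urllib.parse import quote, urlencode

BASE_URL = "https://api.primeintellect.ai"  # `servers` entry in the spec


def _request(method: str, path: str, token: str,
             body: Optional[Dict[str, Any]] = None) -> urllib.request.Request:
    """Build (but do not send) a bearer-authenticated JSON request."""
    headers = {"Authorization": f"Bearer {token}", "Accept": "application/json"}
    data = None
    if body is not None:
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(BASE_URL + path, data=data,
                                  headers=headers, method=method)


def create_evaluation_request(token: str, body: Dict[str, Any]) -> urllib.request.Request:
    # Step 1: POST /api/v1/evaluations/ (body fields come from the full spec).
    return _request("POST", "/api/v1/evaluations/", token, body)


def push_samples_request(token: str, evaluation_id: str,
                         body: Dict[str, Any]) -> urllib.request.Request:
    # Step 2: may be repeated to stream samples as they are generated.
    path = f"/api/v1/evaluations/{quote(evaluation_id, safe='')}/samples"
    return _request("POST", path, token, body)


def finalize_request(token: str, evaluation_id: str,
                     metrics: Optional[Dict[str, Any]] = None) -> urllib.request.Request:
    # Step 3: the request body is optional; `metrics` is its only field.
    path = f"/api/v1/evaluations/{quote(evaluation_id, safe='')}/finalize"
    body = {"metrics": metrics} if metrics is not None else None
    return _request("POST", path, token, body)


def get_samples_request(token: str, evaluation_id: str,
                        page: int = 1, limit: int = 100) -> urllib.request.Request:
    # Step 4: paginated retrieval; bounds mirror the spec (page >= 1, limit 1..1000).
    if page < 1 or not 1 <= limit <= 1000:
        raise ValueError("page must be >= 1 and limit must be in 1..1000")
    query = urlencode({"page": page, "limit": limit})
    path = f"/api/v1/evaluations/{quote(evaluation_id, safe='')}/samples?{query}"
    return _request("GET", path, token)
```

Each returned object can be passed to `urllib.request.urlopen` to perform the call. In practice the `evaluation_id` used in steps 2 to 4 would presumably be read from the response to step 1.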

Prime Intellect Evaluations API is one of 6 APIs that Prime Intellect publishes on the APIs.io network, described by a machine-readable OpenAPI specification.

This API exposes 2 machine-runnable capabilities, which can be deployed as REST, MCP, or Agent Skill surfaces via Naftiko, plus 1 JSON Schema definition.

Tagged areas include Evaluations, Benchmarks, Reinforcement Learning, and Environments. The published artifact set on APIs.io includes API documentation, an OpenAPI specification, 2 Naftiko capability specs, and 1 JSON Schema.

OpenAPI Specification

prime-intellect-evaluations-api-openapi.yml
openapi: 3.1.0
info:
  title: Prime Intellect Evaluations API
  version: 0.1.0
  description: Run and manage model evaluations against the Environments Hub. Includes both client-driven evaluations (push
    samples, finalize, retrieve) and hosted evaluations executed by Prime Intellect with logs, cancellation, and selectable
    inference models.
  contact:
    name: Prime Intellect
    url: https://www.primeintellect.ai
servers:
- url: https://api.primeintellect.ai
security:
- HTTPBearer: []
tags:
- name: evals
- name: hosted-evaluations
- name: Feedback
paths:
  /api/v1/evaluations/:
    post:
      tags:
      - evals
      summary: Create Evaluation
      description: 'Create a new evaluation


        This endpoint supports:

        - Environment evaluations: Provide environments

        - Prime RL evaluations: Provide run_id

        - Suite evaluations: Provide suite_id


        Ownership:

        - If team_id is provided in request, the evaluation will be owned by the team

        - Otherwise, the evaluation will be owned by the authenticated user'
      operationId: create_evaluation_api_v1_evaluations__post
      security:
      - HTTPBearer: []
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/CreateEvaluationRequest'
      responses:
        '201':
          description: Successful Response
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/CreateEvaluationResponse'
        '401':
          description: Authorization failed
        '422':
          description: Invalid request data
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponse'
    get:
      tags:
      - evals
      summary: List Evaluations
      description: 'Get a list of evaluations owned by the authenticated user or their teams


        By default, returns all evaluations the user has access to (personal + all teams).'
      operationId: list_evaluations_api_v1_evaluations__get
      security:
      - HTTPBearer: []
      parameters:
      - name: team_id
        in: query
        required: false
        schema:
          anyOf:
          - type: string
          - type: 'null'
          description: Filter by specific team ID
          title: Team Id
        description: Filter by specific team ID
      - name: environment_id
        in: query
        required: false
        schema:
          anyOf:
          - type: string
          - type: 'null'
          description: Filter by environment ID
          title: Environment Id
        description: Filter by environment ID
      - name: environment_name
        in: query
        required: false
        schema:
          anyOf:
          - type: string
          - type: 'null'
          description: Filter by environment name
          title: Environment Name
        description: Filter by environment name
      - name: suite_id
        in: query
        required: false
        schema:
          anyOf:
          - type: string
          - type: 'null'
          description: Filter by suite ID
          title: Suite Id
        description: Filter by suite ID
      - name: skip
        in: query
        required: false
        schema:
          type: integer
          minimum: 0
          default: 0
          title: Skip
      - name: limit
        in: query
        required: false
        schema:
          type: integer
          maximum: 100
          minimum: 1
          default: 50
          title: Limit
      responses:
        '200':
          description: Successful Response
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ListEvaluationsResponse'
        '401':
          description: Authorization failed
        '422':
          description: Invalid request data
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponse'
  /api/v1/evaluations/bulk-delete:
    post:
      tags:
      - evals
      summary: Bulk Delete Evaluations
      operationId: bulk_delete_evaluations_api_v1_evaluations_bulk_delete_post
      requestBody:
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/BulkDeleteEvaluationsRequest'
        required: true
      responses:
        '200':
          description: Successful Response
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/BulkDeleteEvaluationsResponse'
        '401':
          description: Authorization failed
        '422':
          description: Invalid request data
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponse'
      security:
      - HTTPBearer: []
  /api/v1/evaluations/{evaluation_id}:
    put:
      tags:
      - evals
      summary: Update Evaluation
      description: Update an existing evaluation
      operationId: update_evaluation_api_v1_evaluations__evaluation_id__put
      security:
      - HTTPBearer: []
      parameters:
      - name: evaluation_id
        in: path
        required: true
        schema:
          type: string
          title: Evaluation Id
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/UpdateEvaluationRequest'
      responses:
        '200':
          description: Successful Response
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/UpdateEvaluationResponse'
        '401':
          description: Authorization failed
        '422':
          description: Invalid request data
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponse'
    get:
      tags:
      - evals
      summary: Get Evaluation
      description: Get detailed information about a specific evaluation.
      operationId: get_evaluation_api_v1_evaluations__evaluation_id__get
      security:
      - HTTPBearer: []
      parameters:
      - name: evaluation_id
        in: path
        required: true
        schema:
          type: string
          title: Evaluation Id
      responses:
        '200':
          description: Successful Response
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/GetEvaluationResponse'
        '401':
          description: Authorization failed
        '422':
          description: Invalid request data
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponse'
    delete:
      tags:
      - evals
      summary: Delete Evaluation
      operationId: delete_evaluation_api_v1_evaluations__evaluation_id__delete
      security:
      - HTTPBearer: []
      parameters:
      - name: evaluation_id
        in: path
        required: true
        schema:
          type: string
          title: Evaluation Id
      responses:
        '200':
          description: Successful Response
          content:
            application/json:
              schema: {}
        '401':
          description: Authorization failed
        '422':
          description: Invalid request data
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponse'
  /api/v1/evaluations/{evaluation_id}/finalize:
    post:
      tags:
      - evals
      summary: Finalize Evaluation
      description: Mark an evaluation as complete and compute final statistics.
      operationId: finalize_evaluation_api_v1_evaluations__evaluation_id__finalize_post
      security:
      - HTTPBearer: []
      parameters:
      - name: evaluation_id
        in: path
        required: true
        schema:
          type: string
          title: Evaluation Id
      requestBody:
        content:
          application/json:
            schema:
              anyOf:
              - $ref: '#/components/schemas/FinalizeEvaluationRequest'
              - type: 'null'
              title: Request
      responses:
        '200':
          description: Successful Response
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/FinalizeEvaluationResponse'
        '401':
          description: Authorization failed
        '422':
          description: Invalid request data
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponse'
  /api/v1/evaluations/{evaluation_id}/samples:
    post:
      tags:
      - evals
      summary: Push Samples
      description: 'Push evaluation samples


        This endpoint can be called multiple times to stream samples as they''re generated.'
      operationId: push_samples_api_v1_evaluations__evaluation_id__samples_post
      security:
      - HTTPBearer: []
      parameters:
      - name: evaluation_id
        in: path
        required: true
        schema:
          type: string
          title: Evaluation Id
      requestBody:
        required: true
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/PushSamplesRequest'
      responses:
        '200':
          description: Successful Response
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/PushSamplesResponse'
        '401':
          description: Authorization failed
        '422':
          description: Invalid request data
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponse'
    get:
      tags:
      - evals
      summary: Get Samples
      description: Get samples for a specific evaluation
      operationId: get_samples_api_v1_evaluations__evaluation_id__samples_get
      security:
      - HTTPBearer: []
      parameters:
      - name: evaluation_id
        in: path
        required: true
        schema:
          type: string
          title: Evaluation Id
      - name: page
        in: query
        required: false
        schema:
          type: integer
          minimum: 1
          default: 1
          title: Page
      - name: limit
        in: query
        required: false
        schema:
          type: integer
          maximum: 1000
          minimum: 1
          default: 100
          title: Limit
      responses:
        '200':
          description: Successful Response
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/GetSamplesResponse'
        '401':
          description: Authorization failed
        '422':
          description: Invalid request data
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponse'
  /api/v1/feedback:
    post:
      tags:
      - Feedback
      summary: Submit Feedback
      operationId: submit_feedback_api_v1_feedback_post
      requestBody:
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/FeedbackRequest'
        required: true
      responses:
        '201':
          description: Successful Response
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/FeedbackResponse'
        '401':
          description: Authorization failed
        '422':
          description: Invalid request data
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponse'
      security:
      - HTTPBearer: []
  /api/v1/hosted-evaluations:
    post:
      tags:
      - hosted-evaluations
      summary: Create Hosted Evaluation
      description: Create and start a hosted evaluation.
      operationId: create_hosted_evaluation_api_v1_hosted_evaluations_post
      requestBody:
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/CreateHostedEvaluationRequest'
        required: true
      responses:
        '201':
          description: Successful Response
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/CreateHostedEvaluationResponse'
        '401':
          description: Authorization failed
        '422':
          description: Invalid request data
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponse'
      security:
      - HTTPBearer: []
  /api/v1/hosted-evaluations/models:
    get:
      tags:
      - hosted-evaluations
      summary: Get Inference Models
      description: Get available models from Prime Inference API for hosted evaluations
      operationId: get_inference_models_api_v1_hosted_evaluations_models_get
      responses:
        '200':
          description: Successful Response
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ListInferenceModelsResponse'
        '401':
          description: Authorization failed
        '422':
          description: Invalid request data
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponse'
      security:
      - HTTPBearer: []
  /api/v1/hosted-evaluations/{evaluation_id}/cancel:
    patch:
      tags:
      - hosted-evaluations
      summary: Cancel Hosted Evaluation Route
      description: Cancel a running hosted evaluation.
      operationId: cancel_hosted_evaluation_route_api_v1_hosted_evaluations__evaluation_id__cancel_patch
      security:
      - HTTPBearer: []
      parameters:
      - name: evaluation_id
        in: path
        required: true
        schema:
          type: string
          title: Evaluation Id
      responses:
        '200':
          description: Successful Response
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/CancelHostedEvaluationResponse'
        '401':
          description: Authorization failed
        '422':
          description: Invalid request data
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponse'
  /api/v1/hosted-evaluations/{evaluation_id}/logs:
    get:
      tags:
      - hosted-evaluations
      summary: Get Hosted Evaluation Logs Route
      description: Get real-time logs from the sandbox running a hosted evaluation.
      operationId: get_hosted_evaluation_logs_route_api_v1_hosted_evaluations__evaluation_id__logs_get
      security:
      - HTTPBearer: []
      parameters:
      - name: evaluation_id
        in: path
        required: true
        schema:
          type: string
          title: Evaluation Id
      responses:
        '200':
          description: Successful Response
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/GetHostedEvaluationLogsResponse'
        '401':
          description: Authorization failed
        '422':
          description: Invalid request data
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponse'
components:
  securitySchemes:
    HTTPBearer:
      type: http
      scheme: bearer
  schemas:
    FinalizeEvaluationResponse:
      properties:
        evaluation_id:
          type: string
          title: Evaluation Id
        status:
          $ref: '#/components/schemas/EvaluationStatus'
        message:
          anyOf:
          - type: string
          - type: 'null'
          title: Message
        total_samples:
          anyOf:
          - type: integer
          - type: 'null'
          title: Total Samples
        avg_score:
          anyOf:
          - type: number
          - type: 'null'
          title: Avg Score
        min_score:
          anyOf:
          - type: number
          - type: 'null'
          title: Min Score
        max_score:
          anyOf:
          - type: number
          - type: 'null'
          title: Max Score
        viewer_url:
          anyOf:
          - type: string
          - type: 'null'
          title: Viewer Url
      type: object
      required:
      - evaluation_id
      - status
      title: FinalizeEvaluationResponse
      description: Response after finalizing an evaluation
    EvaluationStatus:
      type: string
      enum:
      - PENDING
      - RUNNING
      - PROCESSING
      - COMPLETED
      - FAILED
      - TIMEOUT
      - CANCELLED
      title: EvaluationStatus
      description: Evaluation status enum
    ListInferenceModelsResponse:
      properties:
        models:
          items:
            type: object
          type: array
          title: Models
          description: List of available models
      type: object
      required:
      - models
      title: ListInferenceModelsResponse
      description: Response for listing available inference models
    CreateHostedEvaluationRequest:
      properties:
        environment_ids:
          items:
            type: string
          type: array
          minItems: 1
          title: Environment Ids
          description: List of environment IDs to evaluate
        inference_model:
          type: string
          title: Inference Model
          description: Model ID for inference
        eval_config:
          $ref: '#/components/schemas/HostedEvalConfig'
          description: Evaluation configuration
        team_id:
          anyOf:
          - type: string
          - type: 'null'
          title: Team Id
          description: Optional team ID to own the hosted evaluation
        name:
          anyOf:
          - type: string
          - type: 'null'
          title: Name
          description: Optional custom evaluation name
      type: object
      required:
      - environment_ids
      - inference_model
      - eval_config
      title: CreateHostedEvaluationRequest
      description: Request to create and start a hosted evaluation
    HostedEvalConfig:
      properties:
        num_examples:
          type: integer
          maximum: 9007199254740991.0
          minimum: -1.0
          title: Num Examples
          description: Number of examples to evaluate (-1 for all)
        rollouts_per_example:
          type: integer
          maximum: 2048.0
          minimum: 1.0
          title: Rollouts Per Example
          description: Rollouts per example
        env_args:
          anyOf:
          - type: object
          - type: 'null'
          title: Env Args
          description: Optional environment arguments to pass to the evaluation
        allow_sandbox_access:
          anyOf:
          - type: boolean
          - type: 'null'
          title: Allow Sandbox Access
          description: Allow sandbox read/write access
          default: false
        allow_instances_access:
          anyOf:
          - type: boolean
          - type: 'null'
          title: Allow Instances Access
          description: Allow instance creation and management access
          default: false
        allow_tunnel_access:
          anyOf:
          - type: boolean
          - type: 'null'
          title: Allow Tunnel Access
          description: Allow tunnel creation and management access
          default: false
        timeout_minutes:
          anyOf:
          - type: integer
            maximum: 1440.0
            minimum: 120.0
          - type: 'null'
          title: Timeout Minutes
          description: 'Custom timeout in minutes for the hosted eval run (default: 1440 = 24 hours, min: 120, max: 1440)'
        custom_secrets:
          anyOf:
          - additionalProperties:
              type: string
            type: object
          - type: 'null'
          title: Custom Secrets
          description: Custom secrets to set in the evaluation sandbox (e.g., API keys, tokens)
        sampling_args:
          anyOf:
          - type: object
          - type: 'null'
          title: Sampling Args
          description: Optional sampling arguments forwarded to `prime eval run --sampling-args`
        max_concurrent:
          anyOf:
          - type: integer
            maximum: 9007199254740991.0
            minimum: 1.0
          - type: 'null'
          title: Max Concurrent
          description: Optional max concurrency forwarded to `prime eval run --max-concurrent`
        max_retries:
          anyOf:
          - type: integer
            maximum: 9007199254740991.0
            minimum: 0.0
          - type: 'null'
          title: Max Retries
          description: Optional max retries forwarded to `prime eval run --max-retries`
        state_columns:
          anyOf:
          - items:
              type: string
            type: array
          - type: 'null'
          title: State Columns
          description: Optional state columns forwarded to `prime eval run --state-columns`
        independent_scoring:
          anyOf:
          - type: boolean
          - type: 'null'
          title: Independent Scoring
          description: Forward `--independent-scoring` to the hosted eval runner
          default: false
        verbose:
          anyOf:
          - type: boolean
          - type: 'null'
          title: Verbose
          description: Forward `--verbose` to the hosted eval runner
          default: false
        headers:
          anyOf:
          - items:
              type: string
            type: array
          - type: 'null'
          title: Headers
          description: Optional repeated headers forwarded to `prime eval run --header`
        extra_env_kwargs:
          anyOf:
          - type: object
          - type: 'null'
          title: Extra Env Kwargs
          description: Optional environment constructor kwargs forwarded to `prime eval run --extra-env-kwargs`
        api_client_type:
          anyOf:
          - type: string
          - type: 'null'
          title: Api Client Type
          description: Optional API client type forwarded to `prime eval run --api-client-type`
        api_base_url:
          anyOf:
          - type: string
          - type: 'null'
          title: Api Base Url
          description: Optional inference base URL forwarded to `prime eval run --api-base-url`
        api_key_var:
          anyOf:
          - type: string
          - type: 'null'
          title: Api Key Var
          description: Optional API key env var forwarded to `prime eval run --api-key-var`
      type: object
      required:
      - num_examples
      - rollouts_per_example
      title: HostedEvalConfig
      description: Hosted evaluation configuration
    CreateHostedEvaluationResponse:
      properties:
        evaluation_id:
          type: string
          title: Evaluation Id
          description: ID of the created evaluation
        sandbox_id:
          anyOf:
          - type: string
          - type: 'null'
          title: Sandbox Id
          description: ID of the sandbox running the evaluation
        status:
          type: string
          title: Status
          description: Current status of the evaluation
        evaluation_ids:
          anyOf:
          - items:
              type: string
            type: array
          - type: 'null'
          title: Evaluation Ids
          description: List of evaluation IDs if multiple environments were provided
        error:
          anyOf:
          - type: string
          - type: 'null'
          title: Error
          description: Error message if creation failed
      type: object
      required:
      - evaluation_id
      - status
      title: CreateHostedEvaluationResponse
      description: Response after creating a hosted evaluation
    BulkDeleteEvaluationsRequest:
      properties:
        evaluation_ids:
          items:
            type: string
          type: array
          maxItems: 100
          minItems: 1
          title: Evaluation Ids
      type: object
      required:
      - evaluation_ids
      title: BulkDeleteEvaluationsRequest
    UpdateEvaluationRequest:
      properties:
        name:
          anyOf:
          - type: string
          - type: 'null'
          title: Name
          description: Name of the evaluation
        description:
          anyOf:
          - type: string
          - type: 'null'
          title: Description
          description: Description of the evaluation
        tags:
          anyOf:
          - items:
              type: string
            type: array
          - type: 'null'
          title: Tags
          description: Tags for categorization
        metadata:
          anyOf:
          - type: object
          - type: 'null'
          title: Metadata
          description: Additional metadata
        metrics:
          anyOf:
          - type: object
          - type: 'null'
          title: Metrics
          description: High-level metrics summary
        model_name:
          anyOf:
          - type: string
          - type: 'null'
          title: Model Name
          description: Model name
        dataset:
          anyOf:
          - type: string
          - type: 'null'
          title: Dataset
          description: Dataset name
        framework:
          anyOf:
          - type: string
          - type: 'null'
          title: Framework
          description: Framework used
        task_type:
          anyOf:
          - type: string
          - type: 'null'
          title: Task Type
          description: Type of task
        inference_model:
          anyOf:
          - type: string
          - type: 'null'
          title: Inference Model
          description: Prime Inference model ID
        eval_config:
          anyOf:
          - $ref: '#/components/schemas/HostedEvalConfig'
          - type: 'null'
          description: Hosted evaluation configuration
        is_public:
          anyOf:
          - type: boolean
          - type: 'null'
          title: Is Public
          description: Whether the evaluation is publicly shareable by link; setting true without show_on_leaderboard keeps
            it off leaderboards by default
        show_on_leaderboard:
          anyOf:
          - type: boolean
          - type: 'null'
          title: Show On Leaderboard
          description: Whether this public evaluation appears on environment leaderboards
      type: object
      title: UpdateEvaluationRequest
      description: Request to update an existing evaluation
    FeedbackRequest:
      properties:
        message:
          type: string
          maxLength: 2048
          minLength: 1
          title: Message
        category:
          $ref: '#/components/schemas/FeedbackCategory'
        cli_version:
          type: string
          maxLength: 64
          minLength: 1
          title: Cli Version
        run_id:
          anyOf:
          - type: string
            maxLength: 128
          - type: 'null'
          title: Run Id
      type: object
      required:
      - message
      - category
      - cli_version
      title: FeedbackRequest
    FeedbackCategory:
      type: string
      enum:
      - bug
      - feature
      - general
      title: FeedbackCategory
    FinalizeEvaluationRequest:
      properties:
        metrics:
          anyOf:
          - type: object
          - type: 'null'
          title: Metrics
          description: Final metrics to attach to the evaluation
      type: object
      title: FinalizeEvaluationRequest
      description: Request to finalize an evaluation (optional, can include final metrics)
    UpdateEvaluationResponse:
      properties:
        evaluation_id:
          type: string
          title: Evaluation Id
        name:
          type: string
          title: Name
        status:
          $ref: '#/components/schemas/EvaluationStatus'
        updated_at:
          type: string
          format: date-time
          title: Updated At
      type: object
      required:
      - evaluation_id
      - name
      - status
      - updated_at
      title: UpdateEvaluationResponse
      description: Response after updating an evaluation
    GetEvaluationResponse:
      properties:
        evaluation_id:
          type: string
          title: Evaluation Id
        name:
          type: string
          title: Name
        status:
          $ref: '#/components/schemas/EvaluationStatus'
        eval_type:
          $ref: '#/components/schemas/EvaluationType'
        user_id:
          anyOf:
          - type: string
          - type: 'null'
          title: User Id
          description: User ID of evaluation owner
        team_id:
          anyOf:
          - type: string
          - type: 'null'
          title: Team Id
          description: Team ID if evaluation is owned by a team
        environment_ids:
          anyOf:
          - items:
              type: string
            type: array
          - type: 'null'
          title: Environment Ids
        environment_names:
          anyOf:
          - items:
              type: string
            type: array
          - type: 'null'
          title: Environment Names
        suite_id:
          anyOf:
          - type: string
          - type: 'null'
          title: Suite Id
        run_id:
          anyOf:
          - type: string
          - type: 'null'
          title: Run Id
        version_id:
          anyOf:
          - type: string
          - type: 'null'
          title: Version Id
        is_hosted:
          type: boolean
          title: Is Hosted
          default: false
        sandbox_id:
          anyOf:
          - type: string
          - type: 'null'
          title: Sandbox Id
        inference_model:
          anyOf:
          - type: string
          - type: 'null'
          title: Inference Model
        eval_config:
          anyOf:
          - type: object
          - type: 'null'
          title: Eval Config
        error_message:
          anyOf:
          - type: string
          - type: 'null'
          title: Error Message
        logs:
          anyOf:
          - type: string
          - type: 'null'
          title: Logs
        model_name:
          anyOf:
          - type: string
          - type: 'null'
          title: Model Name
        dataset:
          anyOf:
          - type: string
          - type: 'null'
          title: Dataset
        framework:
          anyOf:
          - type: string
          - type: 'null'
          title: Framework
        task_type:
          anyOf:
          - type: string
          - type: 'null'
          title: Task Type
        description:
          anyOf:
          - type: string
          - type: 'null'
          title: Description
        tags:
          items:
            type: string
          type: array
          title: Tags
          default: []
        total_samples:
          type: integer
          title: Total Samples
        avg_score:
          anyOf:
          - type: number
          - type: 'null'
          title: Avg Score
        min_score:
          anyOf:
          - type: number
          - type: 'null'
          title: Min Score
        max_score:
          anyOf:
          - type: number
          - type: 'null'
          title: Max Score
        started_at:
          anyOf:
          - type: string
            format: date-time
          - type: 'null'
          title: Started At
        completed_at:
          anyOf:
          - type: string
            format: date-time
          - type: 'null'
          title: Completed At
        created_at:
          type: string
          format: date-time
          title: Created At
        updated_at:
          type: string
          format: date-time
          title: Updated At
        metadata:
          anyOf:
          - type: object
          - type: 'null'
          title: Metadata
        metrics:
          anyOf:
          - type: object
          - type: 'null'
          title: Metrics
        viewer_url:
          anyOf:
          - type: string
          - type: 'null'
          title: Viewer Url
        is_public:
          type: boolean
          title: Is Public
          description: Whether this evaluation is publicly shareable by link
          default: false
   

# --- truncated at 32 KB (42 KB total) ---
# Full source: https://raw.githubusercontent.com/api-evangelist/prime-intellect/refs/heads/main/openapi/prime-intellect-evaluations-api-openapi.yml
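
For the hosted path, the `CreateHostedEvaluationRequest` and `HostedEvalConfig` schemas above state several numeric bounds that can be checked client-side before calling `POST /api/v1/hosted-evaluations`. The Python sketch below assembles such a body and checks a status string against the `EvaluationStatus` enum. The function names are hypothetical, and treating `COMPLETED`, `FAILED`, `TIMEOUT`, and `CANCELLED` as end states is an assumption, since the specification lists the values without describing their transitions.

```python
from typing import Any, Dict, List, Optional

# Values copied from the EvaluationStatus enum in the spec.
EVALUATION_STATUSES = ("PENDING", "RUNNING", "PROCESSING", "COMPLETED",
                       "FAILED", "TIMEOUT", "CANCELLED")
# Assumption: these four are end states; the spec does not say so explicitly.
ASSUMED_TERMINAL = frozenset({"COMPLETED", "FAILED", "TIMEOUT", "CANCELLED"})


def hosted_evaluation_body(
    environment_ids: List[str],
    inference_model: str,
    num_examples: int,
    rollouts_per_example: int,
    timeout_minutes: Optional[int] = None,
    team_id: Optional[str] = None,
    name: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble a CreateHostedEvaluationRequest body, enforcing the spec's bounds."""
    if not environment_ids:
        raise ValueError("environment_ids needs at least one entry (minItems: 1)")
    if num_examples < -1:
        raise ValueError("num_examples must be >= -1 (-1 means all examples)")
    if not 1 <= rollouts_per_example <= 2048:
        raise ValueError("rollouts_per_example must be in 1..2048")
    if timeout_minutes is not None and not 120 <= timeout_minutes <= 1440:
        raise ValueError("timeout_minutes must be in 120..1440")

    # Only the two required eval_config fields are always sent.
    eval_config: Dict[str, Any] = {
        "num_examples": num_examples,
        "rollouts_per_example": rollouts_per_example,
    }
    if timeout_minutes is not None:
        eval_config["timeout_minutes"] = timeout_minutes

    body: Dict[str, Any] = {
        "environment_ids": list(environment_ids),
        "inference_model": inference_model,
        "eval_config": eval_config,
    }
    # Optional ownership and naming fields are omitted unless provided.
    if team_id is not None:
        body["team_id"] = team_id
    if name is not None:
        body["name"] = name
    return body


def is_finished(status: str) -> bool:
    """Return True if `status` is one of the assumed end states."""
    if status not in EVALUATION_STATUSES:
        raise ValueError(f"unknown evaluation status: {status!r}")
    return status in ASSUMED_TERMINAL
```

A call such as `hosted_evaluation_body(["<environment-id>"], "<model-id>", num_examples=-1, rollouts_per_example=4)` yields a dictionary that can be JSON-encoded and posted with a bearer token. Valid model IDs would come from `GET /api/v1/hosted-evaluations/models`, and a polling loop could use `is_finished` on the `status` field of `GET /api/v1/evaluations/{evaluation_id}`.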