Qubrid AI Inference API

The Qubrid AI Inference API provides a single, OpenAI-compatible endpoint for running 40+ open-source models on NVIDIA GPU infrastructure. By abstracting hardware orchestration behind TensorRT-LLM and Triton Inference Server, the API lets enterprise developers run inference without managing the underlying infrastructure.
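
Because the endpoint is OpenAI-compatible, existing OpenAI client libraries can typically be pointed at it by overriding the base URL. The sketch below uses the official OpenAI Python SDK; the base URL, API key placeholder, and model identifier are illustrative assumptions rather than values taken from the specification — consult the OpenAPI spec and your Qubrid dashboard for the actual ones.

```python
# Minimal sketch: calling the Qubrid AI Inference API via the OpenAI Python SDK.
# The base_url and model id below are assumptions for illustration only.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.qubrid.ai/v1",  # assumed endpoint; replace with the documented URL
    api_key="YOUR_QUBRID_API_KEY",        # placeholder credential
)

response = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",  # example model id; pick one from the catalog
    messages=[
        {"role": "user", "content": "Summarize TensorRT-LLM in one sentence."}
    ],
)

print(response.choices[0].message.content)
```

Since the request and response shapes follow the OpenAI chat completions format, switching an existing application to Qubrid is, under these assumptions, a configuration change rather than a code rewrite.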

OpenAPI Specification

qubrid-ai-inference-openapi.yml