Prime Intellect Inference API
OpenAI-compatible inference API for hosted frontier and open models served at api.pinference.ai. Supports streaming chat completions, the full set of OpenAI parameters (temperature, top_p, max_tokens, logprobs), and returns a `usage` object with input/output token counts and USD cost on every response. LoRA adapters can be served alongside base models via 1-click deployments.
Prime Intellect Inference API is one of 6 APIs that Prime Intellect publishes on the APIs.io network, described by a machine-readable OpenAPI specification.
This API exposes 1 machine-runnable capability that can be deployed as REST, MCP, or Agent Skill surfaces via Naftiko.
Tagged areas include Inference, OpenAI Compatible, Foundation Models, and LLM. The published artifact set on APIs.io includes API documentation, an OpenAPI specification, and 1 Naftiko capability spec.