NVIDIA NIM Vision Language Models API
Vision-language model inference through the standard /v1/chat/completions surface with image inputs (base64 or URL) in the messages payload. Supports NVIDIA NeVA, microsoft/kosmos-2, Phi-3-vision, llama-3.2-90b-vision-instruct, and other VLMs hosted in the NIM catalog.
NVIDIA NIM Vision Language Models API is one of 10 APIs that NVIDIA NIM publishes on the APIs.io network, described by a machine-readable OpenAPI specification.
This API exposes 1 machine-runnable capability that can be deployed as REST, MCP, or Agent Skill surfaces via Naftiko.
Tagged areas include AI, Artificial Intelligence, Vision, Multimodal, and VLM. The published artifact set on APIs.io includes API documentation, an OpenAPI specification, and 1 Naftiko capability spec.