Glama AI Gateway
OpenAI-compatible LLM gateway exposing 90+ models from OpenAI, Anthropic, Google, DeepSeek, Mistral, xAI, Moonshot, Alibaba (Qwen), Cohere, and Perplexity behind a single base URL (`https://gateway.glama.ai/v1`). Drop-in replacement for any OpenAI SDK with prompt caching, load balancing, fallbacks, reasoning effort levels, text streaming, web search/fetch tools, real-time cost analytics, consolidated billing, and no rate limits under 1B tokens/day. ~40ms median overhead and 99% uptime SLO.