OctoStack
OctoStack was OctoAI's self-contained generative-AI serving stack for deploying open-source and customer-trained foundation models inside a customer's own VPC or on-premises environment. Announced in April 2024, it supported NVIDIA GPUs, AMD GPUs, and AWS Inferentia, and bundled request batching tuned for high GPU utilization, fine-tuning, and asset management; OctoAI claimed it delivered 4x better GPU utilization. OctoStack has not been offered as a standalone product since NVIDIA acquired OctoAI; its technology has been absorbed into NVIDIA's inference stack.