TrueFoundry Model Serving API
TrueFoundry's Model Serving capability enables deployment and management of LLM and embedding models on Kubernetes infrastructure, using serving backends such as vLLM and Triton. It provides APIs for deploying models from a community registry of 1000+ configurations, managing inference endpoints, and controlling autoscaling behavior, including scale-to-zero.
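As a rough illustration of what such a deployment request might carry, the sketch below assembles a deployment spec with a backend choice and scale-to-zero autoscaling. This is a minimal sketch assuming a generic JSON/REST style; the function name, field names, and values are illustrative assumptions, not TrueFoundry's actual API schema.

```python
# Hypothetical sketch only: field names and structure are assumptions,
# not TrueFoundry's real deployment schema.

def build_deployment_spec(model_id: str, backend: str = "vllm",
                          min_replicas: int = 0, max_replicas: int = 2) -> dict:
    """Assemble a deployment request body.

    min_replicas=0 expresses scale-to-zero: the serving runtime may
    scale the model down to no running replicas when it is idle.
    """
    return {
        "model": model_id,            # e.g. an entry from the model registry
        "backend": backend,           # serving backend, e.g. "vllm" or "triton"
        "autoscaling": {
            "min_replicas": min_replicas,
            "max_replicas": max_replicas,
        },
    }

spec = build_deployment_spec("meta-llama/Llama-3-8b-instruct")
print(spec["autoscaling"]["min_replicas"])  # 0 → scale-to-zero enabled
```

In practice such a body would be sent to the platform's deployment endpoint over HTTP; the point here is only that scale-to-zero is expressed by allowing the replica floor to reach zero.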