Ray Serve REST API

Ray Serve is a scalable model-serving library built on Ray for building online inference APIs. It supports composable deployments, autoscaling, HTTP ingress, gRPC, WebSockets, and request batching, and it integrates with any ML framework. The Ray Serve dashboard and REST API manage deployments, replicas, routes, and application status.
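
As a rough illustration of reading application status over the REST API, the sketch below builds a status request and summarizes a response. It rests on assumptions that should be checked against the Ray Serve documentation: that the dashboard listens on http://localhost:8265, that application details are served at /api/serve/applications/, and that the response carries an "applications" mapping whose values include a "status" field. The sample payload is hypothetical, and the request is built but not sent.

```python
import json
from urllib.parse import urljoin
from urllib.request import Request

# Assumption: address of the Ray dashboard; 8265 is a commonly used default port.
DASHBOARD_URL = "http://localhost:8265"
# Assumption: REST path that reports Serve application details.
APPLICATIONS_PATH = "/api/serve/applications/"


def build_status_request(dashboard_url: str = DASHBOARD_URL) -> Request:
    """Build (but do not send) a GET request for Serve application details."""
    url = urljoin(dashboard_url, APPLICATIONS_PATH)
    return Request(url, headers={"Accept": "application/json"}, method="GET")


def summarize_applications(payload: dict) -> dict:
    """Map each application name to its reported status string."""
    apps = payload.get("applications", {})
    return {name: app.get("status", "UNKNOWN") for name, app in apps.items()}


# Hypothetical response body, trimmed to the fields used above.
sample = json.loads(
    '{"applications": {"app1": {"status": "RUNNING", "route_prefix": "/"}}}'
)

request = build_status_request()
print(request.get_method(), request.full_url)
print(summarize_applications(sample))
```

To query a live cluster, the request could be passed to urllib.request.urlopen and the body decoded with json.load before calling summarize_applications.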

API entry from apis.yml

name: Ray Serve REST API
description: Ray Serve is a scalable model serving library built on Ray, designed for building online
  inference APIs. Supports composable deployments, autoscaling, HTTP ingress, gRPC, WebSockets, and request
  batching. Integrates with any ML framework. The Ray Serve dashboard and REST API manage deployments,
  replicas, routes, and application status.
image: https://www.ray.io/favicon.ico
humanUrl: https://docs.ray.io/en/latest/serve/index.html
baseUrl: https://ray-serve.example.com
tags:
- Autoscaling
- Inference
- Machine Learning
- Model Serving
- Open Source
- Python
- Ray
properties:
- type: Documentation
  url: https://docs.ray.io/en/latest/serve/index.html
- type: GitHub
  url: https://github.com/ray-project/ray
- type: GettingStarted
  url: https://docs.ray.io/en/latest/serve/getting_started.html
- type: APIReference
  url: https://docs.ray.io/en/latest/serve/api/index.html
contact:
- type: Community
  url: https://discuss.ray.io/
- type: GitHub Issues
  url: https://github.com/ray-project/ray/issues
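
The entry's properties and contact lists are type/url pairs, so a consumer can look up a link by its type. The following is a minimal sketch of that lookup, with part of the entry transcribed by hand into a Python dict so that it needs no YAML parser. The helper name url_for is hypothetical and not part of any apis.yml tooling.

```python
from typing import Optional

# Subset of the entry above, transcribed by hand into a Python dict.
entry = {
    "name": "Ray Serve REST API",
    "properties": [
        {"type": "Documentation", "url": "https://docs.ray.io/en/latest/serve/index.html"},
        {"type": "GitHub", "url": "https://github.com/ray-project/ray"},
        {"type": "GettingStarted", "url": "https://docs.ray.io/en/latest/serve/getting_started.html"},
        {"type": "APIReference", "url": "https://docs.ray.io/en/latest/serve/api/index.html"},
    ],
    "contact": [
        {"type": "Community", "url": "https://discuss.ray.io/"},
        {"type": "GitHub Issues", "url": "https://github.com/ray-project/ray/issues"},
    ],
}


def url_for(entry: dict, section: str, link_type: str) -> Optional[str]:
    """Return the first URL in `section` whose type matches, or None."""
    for item in entry.get(section, []):
        if item.get("type") == link_type:
            return item.get("url")
    return None


print(url_for(entry, "properties", "APIReference"))
print(url_for(entry, "contact", "Community"))
```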