RunPod Serverless

RunPod Serverless provides pay-as-you-go inference endpoints with autoscaling workers, queue-based and load-balanced endpoint types, FlashBoot cold-start optimization, and per-second billing. Each endpoint exposes a URL that accepts request payloads for AI model inference and compute-intensive workloads.

RunPod Serverless is one of three APIs that RunPod publishes on the APIs.io network.

Tagged areas include AI, Autoscaling, GPU, Inference, Serverless, and Workers. The published artifact set on APIs.io includes API documentation.

API entry from apis.yml

aid: runpod:serverless
name: RunPod Serverless
description: RunPod Serverless provides pay-as-you-go inference endpoints with autoscaling workers, queue-based
  and load-balanced endpoint types, FlashBoot cold-start optimization, and per-second billing. Each endpoint
  exposes a URL that accepts request payloads for AI model inference and compute-intensive workloads.
humanURL: https://docs.runpod.io/serverless/overview
baseURL: https://api.runpod.ai/v2
tags:
- AI
- Autoscaling
- GPU
- Inference
- Serverless
- Workers
properties:
- type: Documentation
  url: https://docs.runpod.io/serverless/overview
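
The entry gives a baseURL and says each endpoint exposes a URL that accepts request payloads, but it does not spell out the request shape. The sketch below shows one way such a request might be assembled in Python. Only the base URL comes from the entry. The `/{endpoint_id}/runsync` path, the bearer-token `Authorization` header, the `{"input": ...}` envelope, and the helper name `build_run_request` are assumptions for illustration; check them against the linked documentation before relying on them.

```python
import json
import urllib.request

# baseURL taken from the apis.yml entry above.
BASE_URL = "https://api.runpod.ai/v2"


def build_run_request(endpoint_id: str, api_key: str, payload: dict,
                      operation: str = "runsync") -> urllib.request.Request:
    """Build (but do not send) a POST request for a serverless endpoint.

    Assumptions not stated in the entry: the path layout
    `{baseURL}/{endpoint_id}/{operation}`, bearer-token authentication,
    and wrapping the payload in an {"input": ...} object.
    """
    url = f"{BASE_URL}/{endpoint_id}/{operation}"
    body = json.dumps({"input": payload}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )


# Hypothetical endpoint ID and key, used only to show the resulting URL.
request = build_run_request("my-endpoint-id", "MY_API_KEY", {"prompt": "Hello"})
print(request.full_url)
```

The helper only constructs the request, so it can be inspected without network access; passing the result to `urllib.request.urlopen` would actually send it.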