Triton Inference Server

Triton GRPC API

High-performance gRPC API for model inference with support for streaming and binary tensor data.

Documentation

https://github.com/triton-inference-server/server/blob/main/docs/protocol/README.md

Other Resources

Protocol Buffers

https://github.com/triton-inference-server/common/blob/main/protobuf/grpc_service.proto

Examples

https://github.com/triton-inference-server/client/tree/main/src/python/examples

API entry from apis.yml

name: Triton GRPC API
description: High-performance gRPC API for model inference with support for streaming and binary tensor
  data.
image: https://developer.nvidia.com/sites/default/files/akamai/triton-logo.png
humanURL: https://github.com/triton-inference-server/server/blob/main/docs/protocol/README.md
baseURL: grpc://localhost:8001
tags:
- GRPC
- High Performance
- Inference
- Streaming
properties:
- type: Documentation
  url: https://github.com/triton-inference-server/server/blob/main/docs/protocol/README.md
- type: Protocol Buffers
  url: https://github.com/triton-inference-server/common/blob/main/protobuf/grpc_service.proto
- type: Examples
  url: https://github.com/triton-inference-server/client/tree/main/src/python/examples
contact:
- FN: NVIDIA Triton Team
  email: [email protected]
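
The `baseURL` above is written with a `grpc://` scheme, whereas the Python client in the `tritonclient` package expects a bare `host:port` target. The following is a minimal sketch of a liveness check against that endpoint, not a definitive client. It assumes `tritonclient[grpc]` is installed and a server is listening on the default port; the helper names `grpc_target` and `server_is_live` are hypothetical and are not part of the Triton API.

```python
from urllib.parse import urlsplit


def grpc_target(base_url: str) -> str:
    """Turn a catalog-style URL such as grpc://localhost:8001 into host:port.

    Hypothetical helper: the catalog entry stores the endpoint with a scheme,
    while the gRPC client is given the target without one.
    """
    parts = urlsplit(base_url)
    if parts.scheme != "grpc" or not parts.hostname or parts.port is None:
        raise ValueError(f"expected grpc://host:port, got {base_url!r}")
    return parts.netloc


def server_is_live(base_url: str = "grpc://localhost:8001") -> bool:
    """Ask the server at base_url whether it is live (hypothetical helper)."""
    # Imported lazily so grpc_target() stays usable without tritonclient.
    import tritonclient.grpc as grpcclient

    client = grpcclient.InferenceServerClient(url=grpc_target(base_url))
    return client.is_server_live()
```

Calling `server_is_live()` with no arguments targets the `baseURL` from the entry. The linked examples directory covers full inference requests, including streaming.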