PlayAI Text-to-Speech API

The PlayAI Text-to-Speech API converts text into natural, human-like speech using the PlayDialog 1.0, Dialog 1.0 Turbo, and Play 3.0 Mini models. It supports streaming, voice cloning, and a large catalog of prebuilt voices across many languages and accents.

The PlayAI Text-to-Speech API is one of two APIs that PlayHT publishes on the APIs.io network.

The API is tagged TTS, Streaming, Voice Cloning, Voices, and Multilingual. Its published artifacts on APIs.io are API documentation, a getting-started guide, a sign-up page, an API reference, two SDKs (Python and Node.js), a GitHub repository, pricing, and authentication documentation.

API entry from apis.yml

aid: playht:tts-api
name: PlayAI Text-to-Speech API
description: The PlayAI Text-to-Speech API converts text into natural, human-like speech using the PlayDialog
  1.0, Dialog 1.0 Turbo, and Play 3.0 Mini models. It supports streaming, voice cloning, and a large catalog
  of prebuilt voices across many languages and accents.
humanURL: https://docs.play.ai
baseURL: https://api.play.ai
tags:
- TTS
- Streaming
- Voice Cloning
- Voices
- Multilingual
properties:
- type: Documentation
  url: https://docs.play.ai
- type: GettingStarted
  url: https://docs.play.ai/tts-api
- type: SignUp
  url: https://play.ai/signup
- type: APIReference
  url: https://docs.play.ai
- type: SDK
  url: https://github.com/playht/pyht
- type: SDK
  url: https://github.com/playht/playht-nodejs-sdk
- type: GitHubRepository
  url: https://github.com/playht
- type: Pricing
  url: https://play.ai/pricing
- type: Authentication
  url: https://docs.play.ai
features:
- name: Multiple TTS Models
  description: Choose between PlayDialog 1.0, Dialog 1.0 Turbo, and Play 3.0 Mini for quality vs speed
    tradeoffs.
- name: 200+ Prebuilt Voices
  description: Library of more than 200 prebuilt voices across many languages and accents.
- name: Streaming Output
  description: Real-time streaming audio for sub-second time to first byte.
- name: Voice Cloning
  description: Clone a voice from reference audio for personalized synthesis.
- name: Speech Parameters
  description: Adjust speed, pitch, and volume to fit the use case.
- name: Multiple Audio Formats
  description: Output to common audio formats and bitrates for downstream pipelines.
useCases:
- name: Voice Agents
  description: Power conversational voice agents with low-latency synthesis.
- name: IVR and Phone Bots
  description: Replace recorded prompts and IVR menus with generative speech.
- name: Content Production
  description: Generate narration for video, audiobooks, and explainers.
- name: Accessibility
  description: Add high-quality read-aloud features to applications.
- name: Localization
  description: Localize voiceovers across languages and accents.
integrations:
- name: Twilio
- name: LiveKit
- name: Pipecat
- name: Vapi
- name: Daily
- name: LangChain
- name: OpenAI
- name: Anthropic
- name: Vercel AI SDK
authentication:
- type: API Key
  description: API key authentication with a user ID and secret API key in request headers.
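
The authentication entry above says that a user ID and a secret API key travel in request headers. The following is a minimal sketch of how a client might assemble such a request in Python. The entry does not name the headers, the endpoint path, or the body fields, so `Authorization`, `X-USER-ID`, `/api/v1/tts/stream`, and the `model`, `text`, and `voice` fields are all assumptions to check against the documentation at https://docs.play.ai. The sketch only builds the request and does not send it.

```python
import json
import urllib.request

BASE_URL = "https://api.play.ai"   # baseURL from the apis.yml entry
TTS_PATH = "/api/v1/tts/stream"    # assumed path, not stated in the entry


def build_auth_headers(user_id: str, api_key: str) -> dict[str, str]:
    """Return headers carrying the user ID and secret API key.

    The header names are assumptions; the entry only says both values
    are sent in request headers.
    """
    return {
        "Authorization": f"Bearer {api_key}",
        "X-USER-ID": user_id,
        "Content-Type": "application/json",
    }


def build_tts_request(text: str, voice: str, user_id: str, api_key: str,
                      model: str = "PlayDialog") -> urllib.request.Request:
    """Build (but do not send) a POST request for speech synthesis.

    The body field names and the default model identifier are hypothetical.
    """
    body = json.dumps({"model": model, "text": text, "voice": voice}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + TTS_PATH,
        data=body,
        headers=build_auth_headers(user_id, api_key),
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send the request. Keeping construction separate from sending makes it possible to inspect the headers and body without network access or real credentials.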