Text to Speech API: Natural Voice Synthesis for Developers
What is the Text to Speech API?
The Text to Speech API converts written text into natural-sounding speech audio. You can choose the output format (MP3, WAV, PCM, or Opus) and tune expressiveness and variation through the temperature and top_p parameters.
Use Cases
- Voice Assistants: Power conversational AI applications
- Audiobook Generation: Convert written content to audio
- Accessibility: Make content accessible to visually impaired users
- Video Narration: Generate voiceovers for videos and presentations
- E-Learning: Create audio content for educational platforms
API Parameters
Required Parameters
| Parameter | Type | Description |
|---|---|---|
| `text` | string | Text to convert to speech |
Optional Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `model` | string | `s1` | TTS model: s1, speech-1.6, speech-1.5 |
| `reference_id` | string | (none) | Voice model ID for custom voices |
| `format` | string | `mp3` | Output format: mp3, wav, pcm, opus |
| `temperature` | number | `0.7` | Expressiveness (0-1). Higher = more varied |
| `top_p` | number | `0.7` | Diversity via nucleus sampling (0-1) |
| `latency` | string | `normal` | Trade-off: low, normal, balanced |
| `mp3_bitrate` | number | `128` | MP3 bitrate: 64, 128, 192 kbps |
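The ranges and allowed values in the table can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: `buildTtsBody` and `ALLOWED` are hypothetical names, and the checks only mirror what the table states. The `model` field is passed through unchecked, because the request example in this article uses a value (`audio-tts`) that the table does not list.

```javascript
// Allowed values copied from the parameter table above.
const ALLOWED = {
  format: ['mp3', 'wav', 'pcm', 'opus'],
  latency: ['low', 'normal', 'balanced'],
  mp3_bitrate: [64, 128, 192]
};

// Hypothetical helper: validate options and return a JSON-ready request body.
function buildTtsBody(text, options = {}) {
  if (typeof text !== 'string' || text.length === 0) {
    throw new TypeError('text must be a non-empty string');
  }
  // Enumerated parameters must use one of the documented values.
  for (const [key, allowed] of Object.entries(ALLOWED)) {
    if (key in options && !allowed.includes(options[key])) {
      throw new RangeError(`${key} must be one of: ${allowed.join(', ')}`);
    }
  }
  // temperature and top_p are documented as numbers between 0 and 1.
  for (const key of ['temperature', 'top_p']) {
    if (key in options) {
      const value = options[key];
      if (typeof value !== 'number' || !(value >= 0 && value <= 1)) {
        throw new RangeError(`${key} must be a number between 0 and 1`);
      }
    }
  }
  // Omitted options are left out so the server-side defaults apply.
  return { text, ...options };
}
```

For example, `buildTtsBody('Hello', { format: 'wav', temperature: 0.3 })` returns an object that can be passed to `JSON.stringify`, while `buildTtsBody('Hello', { format: 'flac' })` throws a `RangeError`.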
Pricing
| Usage | Price (USD) | Credits |
|---|---|---|
| Per ~1000 characters | $0.015 | ~3 |
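To get a rough idea of cost before sending text, the table's rate can be applied to the character count. This sketch assumes the price scales linearly with length and that characters are counted the way JavaScript's `length` property counts them; neither is confirmed here, and the table's figures are themselves approximate. `estimateTtsCost` is a hypothetical helper name.

```javascript
// Rates taken from the pricing table (both are approximate).
const USD_PER_1000_CHARS = 0.015;
const CREDITS_PER_1000_CHARS = 3;

// Hypothetical helper: estimate cost for a piece of text.
function estimateTtsCost(text) {
  // Assumption: billing follows string length (UTF-16 code units).
  const chars = text.length;
  const units = chars / 1000;
  return {
    chars,
    usd: units * USD_PER_1000_CHARS,
    credits: units * CREDITS_PER_1000_CHARS
  };
}
```

Under these assumptions, a 2,000-character script comes to roughly $0.03, or about 6 credits.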
How to Use Text to Speech API
Step 1: Create an Account
Sign up at Hypereal to get started.
Step 2: Get Your API Key
Generate your API key from the dashboard.
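Rather than pasting the key into source code, a common approach is to read it from an environment variable. A minimal sketch, assuming Node.js; `HYPEREAL_API_KEY` is a hypothetical variable name chosen for this example, not one the dashboard prescribes, and `authHeaders` is likewise a hypothetical helper.

```javascript
// Build the request headers from an API key kept outside the source code.
// HYPEREAL_API_KEY is a hypothetical variable name used only in this sketch.
function authHeaders(apiKey = process.env.HYPEREAL_API_KEY) {
  if (!apiKey) {
    throw new Error('Missing API key: set HYPEREAL_API_KEY or pass a key explicitly');
  }
  return {
    'Authorization': `Bearer ${apiKey}`,
    'Content-Type': 'application/json'
  };
}
```

The returned object can be passed as the `headers` option of `fetch`.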
Step 3: Make Your API Call
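```javascript
// Send the text to the TTS endpoint and receive the audio as a Blob.
const response = await fetch('https://api.hypereal.com/v1/audio/generate', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY', // replace with your key from the dashboard
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    model: 'audio-tts',
    text: 'Hello! Welcome to our platform. We are excited to have you here.',
    format: 'mp3',
    temperature: 0.7
  })
});

// Stop early on HTTP errors so an error payload is not treated as audio.
if (!response.ok) {
  throw new Error(`Text to Speech request failed: HTTP ${response.status}`);
}

const audioBlob = await response.blob();
```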
Step 4: Handle the Response
The API returns an audio file directly in your specified format (MP3, WAV, PCM, or Opus).
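Because the body is raw audio rather than JSON, it helps to check the HTTP status before treating the bytes as sound. The sketch below shows one way to do that with the standard `Response` object returned by `fetch`. `readAudio` is a hypothetical helper, and the format of error bodies is an assumption: it is not documented here, so the raw text is simply included in the error message.

```javascript
// Read the audio bytes from a fetch Response, failing loudly on HTTP errors.
async function readAudio(response) {
  if (!response.ok) {
    // The shape of error bodies is not documented here, so keep the raw text.
    const detail = await response.text();
    throw new Error(`TTS request failed (HTTP ${response.status}): ${detail}`);
  }
  const bytes = new Uint8Array(await response.arrayBuffer());
  if (bytes.length === 0) {
    throw new Error('TTS response contained no audio data');
  }
  return bytes;
}
```

In Node.js, the returned bytes can be written to disk with `fs.promises.writeFile('speech.mp3', bytes)`. In a browser, wrapping them in `new Blob([bytes], { type: 'audio/mpeg' })` and passing the result to `URL.createObjectURL` yields a URL that an `<audio>` element can play (the MIME type shown applies to MP3 output).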
Best Practices
- Chunk long text - Split very long texts into smaller segments for better quality
- Choose appropriate model - Use `s1` for best quality, older versions for compatibility
- Adjust temperature - Lower for consistent output, higher for more expressive speech
- Select output format - Use MP3 for general use, WAV for editing, Opus for streaming
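The first practice, chunking long text, can be sketched as a small helper that splits at sentence boundaries and packs sentences into segments up to a size limit. This is an illustration only: `chunkText` is a hypothetical name, the 1,000-character default is an arbitrary assumption rather than a documented limit, and the sentence detection is a simple punctuation heuristic.

```javascript
// Hypothetical helper: split text into segments of at most maxChars characters,
// breaking at sentence boundaries where possible.
function chunkText(text, maxChars = 1000) {
  // Split after Latin sentence punctuation followed by whitespace,
  // or directly after CJK sentence punctuation.
  const sentences = text.trim().split(/(?<=[.!?])\s+|(?<=[。！？])/u);
  const chunks = [];
  let current = '';
  for (const sentence of sentences) {
    if (!sentence) continue;
    let piece = sentence;
    // A single sentence longer than the limit is cut at the limit.
    while (piece.length > maxChars) {
      if (current) {
        chunks.push(current);
        current = '';
      }
      chunks.push(piece.slice(0, maxChars));
      piece = piece.slice(maxChars);
    }
    // Append the sentence to the current chunk if it still fits.
    const candidate = current ? `${current} ${piece}` : piece;
    if (candidate.length <= maxChars) {
      current = candidate;
    } else {
      chunks.push(current);
      current = piece;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}
```

Each segment can then be sent as its own request and the resulting audio played in order.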
FAQ
What languages are supported?
The API supports multiple languages, including English, Chinese, and Japanese.
What is the maximum text length?
Text is processed in chunks, so there's no hard limit. Very long texts are automatically segmented.
Can I use custom voices?
Yes, use the reference_id parameter to specify a voice from the voice library.
Why Choose Hypereal?
Access Text to Speech and 100+ other AI models through a single, unified API.
- One API key for all models
- Unified billing across providers
- Competitive pricing with volume discounts
Get Started Free - No credit card required.
