text-to-speech

Fish Audio S2 Pro Text to Speech

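Fish Audio S2 Pro text-to-speech model. Converts text into natural-sounding speech, with support for reference voices, sampling control, chunking, audio format selection, and prosody control.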

MiniMax Sound Design

Generate a personalized voice based on a text description. Returns a `voice_id` that can be used with the T2A text-to-speech API, along with a hex-encoded preview audio sample.
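Because the preview comes back hex-encoded rather than as raw bytes, a client has to decode it before it can be played or saved. The following is a minimal sketch of that step only, not the provider's documented client. The response is assumed to be a dict, and the field name `preview_audio_hex` is hypothetical (only `voice_id` is named in the description above), so check the actual response schema before relying on it.

```python
from typing import Tuple


def decode_preview(response: dict) -> Tuple[str, bytes]:
    """Extract the voice ID and decode the hex-encoded preview audio.

    Assumes the response dict carries `voice_id` and a hypothetical
    `preview_audio_hex` field; the real field name may differ.
    """
    voice_id = response["voice_id"]
    # bytes.fromhex raises ValueError if the string is not valid hex.
    audio_bytes = bytes.fromhex(response["preview_audio_hex"])
    return voice_id, audio_bytes


def save_preview(audio_bytes: bytes, path: str) -> None:
    """Write the decoded audio to disk in binary mode."""
    with open(path, "wb") as f:
        f.write(audio_bytes)


# Illustrative response with made-up values, not real API output.
sample_response = {
    "voice_id": "example-voice-id",
    "preview_audio_hex": b"ID3".hex(),
}
voice_id, audio = decode_preview(sample_response)
```

The returned `voice_id` is what would then be passed to the T2A text-to-speech API; the decoded bytes can be written out with `save_preview(audio, "preview.bin")`, using whatever file extension matches the audio format the service actually returns.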

Google

Gemini 2.5 Flash Text-to-Speech

Google's Gemini series emphasizes multimodal understanding and instruction following while balancing speed and cost for production use. Gemini 2.5 Flash prioritizes low latency and cost-effectiveness, which suits real-time scenarios. Its text-to-speech supports multiple languages and emotional tone control, and can be used for voiceovers, announcements, customer service, and character dialogue. The Instant Inference API offers stable performance, no waiting time, and affordable pricing.

MiniMax

MiniMax Speech 2.8 Turbo Async Text-to-Speech

The MiniMax series is designed for production use and prioritizes stability and predictable output. Its speech synthesis supports multiple languages and emotional tone control, making it suitable for voiceovers, announcements, customer service, and character dialogue. The real-time inference API delivers stable performance with no waiting time and is affordably priced.

MiniMax

Elevenlabs Flash v2.5 Text-to-Speech

The Elevenlabs series is designed for production use and prioritizes stability and controllable output. Its text-to-speech synthesis supports multiple languages and emotional tone control, making it suitable for voiceovers, announcements, customer service, and character dialogue. The real-time inference API delivers stable performance with no waiting time and is affordably priced.

Elevenlabs