Overview
The TTS API converts text into speech audio. It exposes a V2 OpenAI-compatible speech endpoint backed by Hidoba TTS.
Base endpoint:
POST /v2/audio/speech
Voice processor
Use the Hidoba voice processor to manage your voices and look up their IDs.
The API currently supports only the ishikawa model; other models currently work only in calls.
Typical Flow
- Send text, model, voice, and optional audio settings.
- The API validates quota access and calls the model.
- The response returns raw audio bytes in the requested format.
- Usage is recorded automatically for quota billing.
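The flow above can be sketched with the Python standard library. This is a minimal illustration, not an official client: the host name, the Bearer-token Authorization header, and the voice ID are assumptions that do not come from this page. Only the POST /v2/audio/speech path, the field names, and the ishikawa model are taken from it.

```python
import json
import urllib.request

# Assumptions (hypothetical, not from this page): the host name and the
# Bearer-token Authorization header. Replace both with your real values.
BASE_URL = "https://api.example.com"


def build_speech_request(api_key, text, voice, model="ishikawa",
                         response_format=None, speed=None):
    """Build (but do not send) a POST request for /v2/audio/speech."""
    payload = {"model": model, "voice": voice, "input": text}
    # Optional audio settings are only included when the caller sets them.
    if response_format is not None:
        payload["response_format"] = response_format
    if speed is not None:
        payload["speed"] = speed
    return urllib.request.Request(
        url=f"{BASE_URL}/v2/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def synthesize(api_key, text, voice, **options):
    """Send the request and return the raw audio bytes of the response body."""
    request = build_speech_request(api_key, text, voice, **options)
    with urllib.request.urlopen(request) as response:
        return response.read()
```

Calling synthesize("YOUR_API_KEY", "Hello there.", "YOUR_VOICE_ID", response_format="wav") would return bytes that can be written straight to a file opened in binary mode, since the response body is the audio itself.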
Features
- OpenAI-compatible shape: Use model, voice, input, response_format, and speed.
- Formats: Generate mp3, opus, wav, or raw pcm.
- Billing included: Processed characters are billed automatically from provider usage.
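Because the endpoint accepts a fixed set of formats, it can help to check response_format on the client before sending. The helper below is a hypothetical sketch; its name and the idea of reusing the format as the file extension are illustrative choices, not part of the API. Note that raw pcm carries no header, so playback needs the sample rate and sample width to be known separately; this page does not state them.

```python
# The four formats listed under Features.
SUPPORTED_FORMATS = ("mp3", "opus", "wav", "pcm")


def output_filename(stem, response_format):
    """Validate the format and return a file name such as 'greeting.wav'."""
    fmt = response_format.lower()
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(
            f"Unsupported response_format {response_format!r}; "
            f"expected one of {', '.join(SUPPORTED_FORMATS)}"
        )
    return f"{stem}.{fmt}"
```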
Important Considerations
- Maximum input: 2,000 characters per request.
- Streaming: Not supported by this endpoint.
- Quota type: Requires a server-time quota.
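Text longer than the 2,000-character limit has to be split across several requests. The sketch below shows one possible client-side approach; the function name is hypothetical, and it assumes the limit counts characters the way Python's len() does, which this page does not confirm. Each chunk yields a separate audio response, and joining those audio files is outside the scope of this example.

```python
MAX_INPUT_CHARS = 2000  # per-request limit stated above


def split_text(text, limit=MAX_INPUT_CHARS):
    """Split text into chunks of at most `limit` characters.

    Prefers to cut after sentence-ending punctuation, then at a space,
    and falls back to a hard cut when no break point exists.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    chunks = []
    remaining = text.strip()
    while len(remaining) > limit:
        window = remaining[:limit]
        # Last sentence end inside the window, if any.
        cut = max(window.rfind(". "), window.rfind("! "), window.rfind("? "))
        if cut != -1:
            cut += 1  # keep the punctuation mark in the current chunk
        else:
            cut = window.rfind(" ")
        if cut <= 0:
            cut = limit  # no break point found: hard split
        chunks.append(remaining[:cut].strip())
        remaining = remaining[cut:].strip()
    if remaining:
        chunks.append(remaining)
    return chunks
```

Each returned chunk can then be sent as the input field of its own request.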