Single file generation — IVR prompt ready to convert
Batch CSV tab — generate hundreds of MP3s from a spreadsheet
The Asterisk Audio Generator is a web-based tool that converts text to Asterisk-compatible 8 kHz mono MP3 audio files — the exact format required by Asterisk PBX systems for IVR prompts, voicemail, and telephony applications. It supports multiple TTS (text-to-speech) backends selectable per request, enabling English, Luganda, and Swahili voice generation from a single interface, with both single-file and bulk batch workflows.
Producing Asterisk-ready audio traditionally requires manual recording studios or complex audio processing pipelines. This tool eliminates that friction by combining modern neural TTS engines with automatic audio resampling, delivering production-ready MP3 files in seconds. It is built to serve telecom engineers, IVR developers, and anyone building voice-driven applications on Asterisk.
Three TTS backends are integrated, each selectable per generation request:
Local neural model — no internet connection or API key required. Includes 12 US and GB voices (male and female).
Cloud-based TTS powered by Sunbird AI, supporting both English and Luganda — one of the few TTS providers offering Luganda voice synthesis. Requires a free Sunbird API token.
Microsoft Neural cloud TTS with no API key required. Includes 4 Swahili voices covering both Kenya and Tanzania dialects (male and female).
Select a language and voice, type the text, set a filename and playback speed (0.5× to 2.0×), then click Generate. The output MP3 appears in the file list instantly and can be downloaded or deleted directly from the UI.
Upload a CSV file with text and filename columns to generate hundreds
of audio files in one operation. A live progress feed powered by Server-Sent Events shows each
file completing in real time, with per-row error reporting that does not stop the batch. Limits:
500 rows, 2 MB file size, 5,000 characters per text cell.
A Dockerfile and docker-compose.yml are included for one-command
deployment. The containerised build bundles all dependencies including sox, Python packages,
and the Kokoro model files.
Regardless of the TTS backend, all audio passes through the same post-processing pipeline to guarantee Asterisk compatibility:
sox resamples the audio to 8,000 Hz mono and re-encodes it as MP3.media/ and served via the FastAPI app.The tool exposes a clean REST API for programmatic integration:
GET /speakers — full voice roster as JSONPOST /generate — generate a single MP3 filePOST /batch — generate multiple MP3s from a CSV, streamed as SSE eventsGET /files — list all generated filesDELETE /files/{filename} — delete a fileGET /health — health check