Powering your Agentic Voice AI
Built by Applied AI Sweden AB · EU-owned & operated · Self-hosting available
TTS then ASR — tests full pipeline accuracy and latency.
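One way to put a number on round-trip accuracy is word error rate (WER) between the text sent to TTS and the transcript returned by ASR. The page does not say which metric the demo reports, so the sketch below is an assumption, and `word_error_rate` is a hypothetical helper name.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: WER is used here only as an illustrative accuracy metric;
    the demo's actual scoring method is not documented on the page.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference text must contain at least one word")

    # Classic dynamic-programming edit distance, one row at a time.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

A WER of 0.0 means the transcript matched the input text word for word. Punctuation is not stripped in this sketch, so normalize both strings first if the ASR output adds or drops it.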
Record a short speech sample, then use it as the voice for TTS. Both the audio and its transcript are required.
Read one of these sentences naturally. 5-15 seconds works best.
Auto-filled from your recording. Edit if needed.
Fire parallel TTS requests to measure throughput, latency, and time-to-first-byte under load. Click any bar to play the audio.
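A comparable load test can be sketched outside the browser. The example below is a minimal illustration, not the page's own implementation: `fetch` is any callable that yields audio chunks for a text, and `fake_fetch` is a hypothetical stand-in so the sketch runs without network access.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from statistics import median


def timed_call(fetch, text):
    """Run one request and return (ttfb_seconds, total_seconds, byte_count)."""
    start = time.perf_counter()
    ttfb = None
    size = 0
    for chunk in fetch(text):
        if ttfb is None:
            # The first chunk marks time-to-first-byte.
            ttfb = time.perf_counter() - start
        size += len(chunk)
    total = time.perf_counter() - start
    return (ttfb if ttfb is not None else total), total, size


def run_load_test(fetch, texts, concurrency=4):
    """Fire the requests in parallel threads and summarize the timings."""
    wall_start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(lambda t: timed_call(fetch, t), texts))
    wall = time.perf_counter() - wall_start
    return {
        "requests": len(results),
        "throughput_rps": len(results) / wall,
        "median_ttfb_s": median(r[0] for r in results),
        "median_latency_s": median(r[1] for r in results),
        "total_bytes": sum(r[2] for r in results),
    }


def fake_fetch(text):
    """Hypothetical stand-in for a streaming TTS call (no network involved)."""
    time.sleep(0.01)
    yield b"\x00" * 10
    time.sleep(0.01)
    yield text.encode("utf-8")
```

To measure a real deployment, replace `fake_fetch` with a function that streams the HTTP response body in chunks.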
Drop-in replacement for ElevenLabs. Point any ElevenLabs SDK at https://samtal.moln.ai. Powered by OmniVoice (646-language TTS) and Parakeet TDT 0.6b-v3 (25-language ASR) on NVIDIA GPU.
xi-api-key: sk_your_key_here
| voice_id | Name | Language |
|---|---|---|
| spectra-en-default | Nova | English |
| spectra-en-warm | Oliver | English |
| spectra-sv-default | Astrid | Swedish |
| spectra-da-default | Freja | Danish |
| spectra-no-default | Sigrid | Norwegian |
| spectra-fi-default | Aino | Finnish |
| spectra-de-default | Lena | German |
| spectra-fr-default | Camille | French |
| spectra-es-default | Lucia | Spanish |
| spectra-it-default | Giulia | Italian |
| spectra-nl-default | Emma | Dutch |
| spectra-pl-default | Zuzanna | Polish |
| spectra-pt-default | Beatriz | Portuguese |
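The base URL, the xi-api-key header, and a voice_id from the table above can be combined into a plain HTTP request. The sketch below uses only the Python standard library. It assumes the service mirrors ElevenLabs' `POST /v1/text-to-speech/{voice_id}` route and JSON body, as the compatibility claim suggests; `build_tts_request` and `synthesize` are hypothetical helper names, and the exact response format has not been verified against this service.

```python
import json
import urllib.request

BASE_URL = "https://samtal.moln.ai"


def build_tts_request(text, voice_id, api_key):
    """Build (but do not send) a text-to-speech request.

    Assumption: the path and body follow ElevenLabs' text-to-speech endpoint.
    """
    url = f"{BASE_URL}/v1/text-to-speech/{voice_id}"
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"xi-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text, voice_id, api_key):
    """Send the request and return the raw audio bytes from the response."""
    request = build_tts_request(text, voice_id, api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read()
```

For example, `synthesize("Hej, hur mår du?", "spectra-sv-default", "sk_your_key_here")` would request Swedish speech in the Astrid voice, given a valid key.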
| Operation | Latency | RTF |
|---|---|---|
| TTS (GPU, quality 4) | ~35 ms | 0.10 |
| ASR (GPU) | ~70 ms | 0.039 |
| Cached snippet | instant | 0 |
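To read the RTF column, it helps to convert it into processing time. The sketch below assumes the table uses the conventional definition of real-time factor (processing time divided by audio duration), which the page does not state explicitly. The function names are hypothetical.

```python
def processing_seconds(audio_seconds: float, rtf: float) -> float:
    """Estimated compute time for a clip, assuming RTF = processing / duration."""
    return audio_seconds * rtf


def speedup_over_realtime(rtf: float) -> float:
    """How many seconds of audio are handled per second of compute."""
    if rtf <= 0:
        raise ValueError("rtf must be positive (cached results have no compute cost)")
    return 1.0 / rtf


# Under this assumption, 10 s of audio at the RTF values in the table:
tts_time = processing_seconds(10.0, 0.10)   # about 1 s to synthesize
asr_time = processing_seconds(10.0, 0.039)  # about 0.39 s to transcribe
```

The Latency column is a separate figure. It is not derived from RTF in this sketch.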
Set base_url to https://samtal.moln.ai. All endpoints, request formats, and response shapes are identical to ElevenLabs.