Voice

The Voice service adds text-to-speech (TTS) and speech-to-text (STT) capabilities to your characters. Characters can speak their responses aloud and listen to voice input.

Text-to-Speech (Kokoro TTS)

Kokoro is a lightweight, high-quality TTS engine that runs locally. When voice is enabled on a character, you can click the speak button on any message to hear it read aloud.

Streaming TTS

For longer messages, the Voice service uses streaming TTS: it splits the text into sentences and synthesizes each one independently. Audio starts playing as soon as the first sentence is ready (typically 1-2 seconds) instead of waiting for the entire message to be synthesized.

This sentence-by-sentence approach provides low-latency audio output even for long responses.
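The sentence-by-sentence idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Voice service's actual code: split_sentences, stream_tts, and fake_synthesize are hypothetical names, the regex splitter is deliberately naive, and the synthesize callable stands in for whatever TTS engine is used.

```python
import re
from typing import Callable, Iterator

# Split after '.', '!' or '?' when followed by whitespace.
# Deliberately naive: abbreviations such as "e.g. this" are split too.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def split_sentences(text: str) -> list[str]:
    """Return the non-empty sentences in text, in their original order."""
    return [s.strip() for s in _SENTENCE_END.split(text) if s.strip()]


def stream_tts(text: str, synthesize: Callable[[str], bytes]) -> Iterator[bytes]:
    """Yield one audio chunk per sentence as soon as it is synthesized.

    Because this is a generator, the first chunk is available after only
    the first sentence has been synthesized; later sentences are
    synthesized on demand as the caller keeps iterating.
    """
    for sentence in split_sentences(text):
        yield synthesize(sentence)


def fake_synthesize(sentence: str) -> bytes:
    # Hypothetical stand-in for a real TTS call; returns the text as bytes.
    return sentence.encode("utf-8")
```

A caller would iterate over stream_tts(message, synthesize) and hand each chunk to the audio player as it arrives, so playback of the first sentence can overlap with synthesis of the next one if the two run concurrently.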

Voice Selection

Multiple voices are available. Configure the voice per character in the Visual Builder's Voice node settings:

Select from available Kokoro voices
Adjust playback speed

Speech-to-Text (Whisper)

Whisper (via faster-whisper) transcribes spoken audio into text. When voice is enabled, you can record a voice message instead of typing.

The transcribed text is sent as a regular message to the character. Language detection is automatic.
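For readers curious what transcription with automatic language detection looks like at the library level, here is a minimal sketch using faster-whisper directly. It is an assumption about how such a step could be written, not the Voice service's internal code: load_whisper and transcribe_message are hypothetical helper names, and the model size and compute types are example values.

```python
from typing import Any, Tuple


def load_whisper(model_size: str = "small", device: str = "cpu") -> Any:
    """Load a faster-whisper model (requires `pip install faster-whisper`)."""
    # Imported lazily so the rest of this sketch works without the package.
    from faster_whisper import WhisperModel

    # Example choice: float16 on GPU, int8 on CPU to keep memory use low.
    compute_type = "float16" if device == "cuda" else "int8"
    return WhisperModel(model_size, device=device, compute_type=compute_type)


def transcribe_message(model: Any, audio_path: str) -> Tuple[str, str]:
    """Return (text, detected_language) for one recorded voice message."""
    # No `language` argument is passed, so faster-whisper detects it.
    segments, info = model.transcribe(audio_path)
    # `segments` is a lazy iterator; joining it runs the transcription.
    text = " ".join(segment.text.strip() for segment in segments)
    return text, info.language
```

With a model from load_whisper(), transcribe_message(model, "recording.wav") would return the transcript together with a language code such as "en"; the file name here is only a placeholder.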

Using Voice in Strings

When a character has the Voice node enabled:

A speak button appears on each message. Click it to hear the character's response.
A microphone button appears in the input bar. Click it to record voice input.
While a message is being spoken, the speak button shows a pulsing animation.

Voice features work alongside text, so you can mix voice and text input freely.

Enabling Voice

Add the Voice node to your character's canvas in the Visual Builder
Configure voice selection and speed in the node settings
Voice features become available in Strings for conversations with that character

Voice processing runs entirely on your local machine. No audio is sent to cloud services.

Note: TTS quality and speed depend on your hardware. GPU acceleration significantly improves synthesis speed.

Updated on Mar 21, 2026
