This guide covers the core workflow: loading a model, generating audio from text, and saving the result.

Choose a model

KittenTTS provides three models at different size/quality tradeoffs:

Mini

KittenML/kitten-tts-mini-0.8 (80 MB): highest quality

Micro

KittenML/kitten-tts-micro-0.8 (40 MB): balanced speed and quality

Nano

KittenML/kitten-tts-nano-0.8 (56 MB): smallest and fastest

Models are downloaded automatically from Hugging Face on first use and cached locally.
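
To keep the model choice in one place, a small lookup from tier name to model ID can help. This is a minimal sketch that assumes only the three model IDs listed above; `MODEL_IDS`, `resolve_model_id`, and `load_model` are hypothetical helper names, not part of KittenTTS.

```python
# Hypothetical helpers: map a short tier name to the model IDs listed above.
MODEL_IDS = {
    "mini": "KittenML/kitten-tts-mini-0.8",
    "micro": "KittenML/kitten-tts-micro-0.8",
    "nano": "KittenML/kitten-tts-nano-0.8",
}


def resolve_model_id(tier: str) -> str:
    """Return the Hugging Face model ID for a tier name (case-insensitive)."""
    key = tier.strip().lower()
    if key not in MODEL_IDS:
        raise ValueError(f"Unknown tier {tier!r}; expected one of {sorted(MODEL_IDS)}")
    return MODEL_IDS[key]


def load_model(tier: str = "mini"):
    """Load the model for a tier. The first call downloads it; later calls use the cache."""
    # Imported lazily so the lookup itself works without the package installed.
    from kittentts import KittenTTS

    return KittenTTS(resolve_model_id(tier))
```

With this in place, switching from `load_model("mini")` to `load_model("nano")` is a one-word change when you want to compare tiers.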

Generate and save audio

from kittentts import KittenTTS
import soundfile as sf

# Load the model (downloads automatically from Hugging Face)
model = KittenTTS("KittenML/kitten-tts-mini-0.8")

# Generate audio; returns a NumPy array sampled at 24 kHz
audio = model.generate("Hello, world!", voice="Bella")

# Save to WAV file
sf.write("output.wav", audio, 24000)
print("Saved to output.wav")

The generate method returns a NumPy array sampled at 24,000 Hz. You can pass it directly to soundfile, scipy, or any other audio library.
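
Because the output is a plain array of samples at a known rate, simple tasks need no audio library at all. The sketch below computes a clip's duration and writes 16-bit PCM with Python's standard `wave` module. It assumes the array is one-dimensional (mono) with float samples in the range [-1.0, 1.0], which this guide does not state; `duration_seconds` and `write_wav_pcm16` are hypothetical helper names, and a synthetic tone stands in for the output of generate.

```python
import math
import struct
import wave

SAMPLE_RATE = 24000  # Hz, the rate stated for generate()


def duration_seconds(audio, sample_rate=SAMPLE_RATE):
    """Length of a mono clip in seconds: number of samples divided by the rate."""
    return len(audio) / sample_rate


def write_wav_pcm16(path, audio, sample_rate=SAMPLE_RATE):
    """Write mono float samples as 16-bit PCM using only the standard library.

    Assumption: samples are floats in [-1.0, 1.0]; values outside are clipped.
    """
    ints = [int(max(-1.0, min(1.0, float(s))) * 32767) for s in audio]
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)            # mono
        wav.setsampwidth(2)            # 2 bytes = 16 bits per sample
        wav.setframerate(sample_rate)
        wav.writeframes(struct.pack(f"<{len(ints)}h", *ints))


# Stand-in for model.generate(...): half a second of a 440 Hz tone.
tone = [0.5 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE) for n in range(12000)]
print(duration_seconds(tone))  # 0.5
```

Calling `write_wav_pcm16("tone.wav", tone)` would then save the clip. For real model output, soundfile as shown above remains the simpler route; this version is mainly useful where extra dependencies are unwelcome.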

Save directly with generate_to_file

If you do not need the audio array, use generate_to_file to skip the intermediate step:

model.generate_to_file(
    "Hello, world!",
    "output.wav",
    voice="Bella"
)
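
For several pieces of text, the same call can be wrapped in a loop. The following sketch writes one numbered WAV file per non-empty line. It relies only on the generate_to_file(text, path, voice=...) call shown above; `synthesize_lines` and the `line_001.wav` naming scheme are illustrative assumptions.

```python
from pathlib import Path


def synthesize_lines(model, lines, out_dir, voice="Bella"):
    """Write one WAV file per non-empty line and return the paths written.

    `model` is anything with generate_to_file(text, path, voice=...),
    for example a loaded KittenTTS instance.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Drop blank lines so the file numbering has no gaps.
    texts = [line.strip() for line in lines if line.strip()]

    written = []
    for index, text in enumerate(texts, start=1):
        path = out_dir / f"line_{index:03d}.wav"
        model.generate_to_file(text, str(path), voice=voice)
        written.append(path)
    return written
```

A call such as `synthesize_lines(model, open("script.txt"), "clips")` would produce `clips/line_001.wav`, `clips/line_002.wav`, and so on, where `script.txt` is a hypothetical input file.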

List available voices

print(model.available_voices)
# ['Bella', 'Jasper', 'Luna', 'Bruno', 'Rosie', 'Hugo', 'Kiki', 'Leo']

Pass any of these names as the voice argument to generate or generate_to_file.
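
Checking a name against available_voices before generating gives a clearer error than a failed synthesis call. The sketch below shows one way to validate a voice and to render the same sentence in every voice for comparison. `choose_voice` and `audition` are hypothetical helpers; they use only the available_voices attribute and the generate_to_file call shown in this guide.

```python
def choose_voice(model, requested, fallback=None):
    """Return `requested` if the model offers it, else `fallback`, else raise ValueError."""
    voices = list(model.available_voices)
    if requested in voices:
        return requested
    if fallback in voices:
        return fallback
    raise ValueError(f"Voice {requested!r} is not available; choose from {voices}")


def audition(model, text, prefix="sample"):
    """Write one file per available voice (e.g. sample_Bella.wav) for side-by-side listening."""
    paths = []
    for voice in model.available_voices:
        path = f"{prefix}_{voice}.wav"
        model.generate_to_file(text, path, voice=voice)
        paths.append(path)
    return paths
```

Voice names are matched exactly here, so "bella" would not match "Bella"; whether KittenTTS itself is case-sensitive is not stated in this guide.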

Complete example

from kittentts import KittenTTS
import soundfile as sf

model = KittenTTS("KittenML/kitten-tts-mini-0.8")

text = "One day, a little girl named Lily found a needle in her room."
audio = model.generate(text=text, voice="Bruno")

sf.write("output.wav", audio, 24000)
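
The complete example can also be wrapped in a small command-line entry point. This is a sketch rather than a tool that ships with KittenTTS: the argument names and defaults are assumptions, and it uses only the KittenTTS constructor, generate, and sf.write calls shown above.

```python
import argparse


def build_parser():
    """Command-line options mirroring the values hard-coded in the complete example."""
    parser = argparse.ArgumentParser(description="Synthesize speech with KittenTTS.")
    parser.add_argument("text", help="text to speak")
    parser.add_argument("--model", default="KittenML/kitten-tts-mini-0.8", help="model ID")
    parser.add_argument("--voice", default="Bruno", help="voice name")
    parser.add_argument("--output", default="output.wav", help="path of the WAV file to write")
    return parser


def main(argv=None):
    args = build_parser().parse_args(argv)

    # Imported here so that --help works even before the packages are installed.
    from kittentts import KittenTTS
    import soundfile as sf

    model = KittenTTS(args.model)
    audio = model.generate(text=args.text, voice=args.voice)
    sf.write(args.output, audio, 24000)  # 24000 Hz is the rate generate() returns
    print(f"Saved to {args.output}")
```

Calling `main(["Hello, world!", "--voice", "Luna"])` from Python, or calling `main()` at the end of a script, runs the same load, generate, and save steps as the example above.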