TextPreprocessor is a standalone preprocessing pipeline that converts raw text into a spoken form suitable for TTS synthesis. It expands numbers, currencies, abbreviations, and more, while removing elements that a speech model cannot pronounce meaningfully (URLs, HTML tags, etc.). You can use it independently or let KittenTTS.generate() invoke it automatically by setting clean_text=True.

Constructor

from kittentts.preprocess import TextPreprocessor

pp = TextPreprocessor(
    lowercase=True,
    remove_punctuation=True,
    expand_currency=True,
    # ... additional options
)

Number handling

replace_numbers
bool
default:"True"
Convert integers to their spoken form. 42 → "forty-two".
replace_floats
bool
default:"True"
Convert decimal numbers to spoken form. 3.14 → "three point one four".
normalize_leading_decimals
bool
default:"True"
Prefix bare decimals with a zero before expansion. .5 → 0.5 → "zero point five".
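
The three number options above can be illustrated with a small standalone sketch. This is not the KittenTTS implementation: the helper names (`int_to_words`, `float_to_words`, `normalize_leading_decimal`) are hypothetical, and the integer helper only covers 0 to 999 to keep the example short.

```python
import re

_ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
         "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
         "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
_TENS = ["", "", "twenty", "thirty", "forty", "fifty",
         "sixty", "seventy", "eighty", "ninety"]


def int_to_words(n: int) -> str:
    """Spell out an integer in the range 0..999 (hypothetical helper)."""
    if not 0 <= n < 1000:
        raise ValueError("this sketch only handles 0..999")
    if n < 20:
        return _ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return _TENS[tens] + ("-" + _ONES[ones] if ones else "")
    hundreds, rest = divmod(n, 100)
    words = _ONES[hundreds] + " hundred"
    return words + (" " + int_to_words(rest) if rest else "")


def normalize_leading_decimal(text: str) -> str:
    """Turn a bare decimal such as ".5" into "0.5".

    The lookbehind skips dots that follow a digit, letter, or another dot,
    so "3.14" and "e.g." are left alone.
    """
    return re.sub(r"(?<![\w.])\.(\d+)", r"0.\1", text)


def float_to_words(token: str) -> str:
    """Read the integer part as a number and the fraction digit by digit."""
    whole, frac = token.split(".")
    digits = " ".join(_ONES[int(d)] for d in frac)
    return f"{int_to_words(int(whole))} point {digits}"


print(int_to_words(42))                                # forty-two
print(float_to_words("3.14"))                          # three point one four
print(float_to_words(normalize_leading_decimal(".5")))  # zero point five
```

Normalizing the leading decimal first means the float expansion never has to deal with an empty integer part.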

Text expansion options

expand_contractions
bool
default:"True"
Expand English contractions. "don't" → "do not".
expand_model_names
bool
default:"True"
Expand model/product name abbreviations to their spoken form.
expand_ordinals
bool
default:"True"
Expand ordinal numbers. "3rd" → "third".
expand_percentages
bool
default:"True"
Expand percentage expressions. "20%" → "twenty percent".
expand_currency
bool
default:"True"
Expand currency amounts. "$9.99" → "nine dollars and ninety-nine cents".
expand_time
bool
default:"True"
Expand time expressions. "3:45 PM" → "three forty-five PM".
expand_ranges
bool
default:"True"
Expand numeric ranges. "5-10" → "five to ten".
expand_units
bool
default:"True"
Expand measurement units. "5kg" → "five kilograms".
expand_scale_suffixes
bool
default:"True"
Expand scale suffixes. "5M" → "five million".
expand_scientific_notation
bool
default:"True"
Expand scientific notation. "1e6" → "one million".
expand_fractions
bool
default:"True"
Expand fractions. "1/3" → "one third".
expand_decades
bool
default:"True"
Expand decade references. "the 80s" → "the eighties".
expand_phone_numbers
bool
default:"True"
Expand phone numbers digit-by-digit.
expand_ip_addresses
bool
default:"True"
Expand IP addresses digit-by-digit.
expand_roman_numerals
bool
default:"False"
Expand Roman numerals to spoken form. "XIV" → "fourteen". Disabled by default because many uppercase abbreviations would be misidentified.
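
To see why expand_roman_numerals is off by default, consider a naive detector that treats every well-formed uppercase Roman numeral as a number. The sketch below is an illustration only: `roman_to_int` and `roman_candidates` are hypothetical helpers, and the library's actual matching rules are not shown here and may differ.

```python
import re

# Well-formed Roman numerals from 1 to 3999 (thousands, hundreds, tens, units).
_ROMAN = re.compile(r"M{0,3}(CM|CD|D?C{0,3})(XC|XL|L?X{0,3})(IX|IV|V?I{0,3})")
_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def roman_to_int(token: str):
    """Return the value of a well-formed Roman numeral, or None otherwise."""
    if not token or not _ROMAN.fullmatch(token):
        return None
    total = 0
    # Pair each symbol with its successor; a smaller symbol before a larger
    # one is subtracted (the "I" in "IV"), everything else is added.
    for ch, nxt in zip(token, token[1:] + " "):
        value = _VALUES[ch]
        total += -value if _VALUES.get(nxt, 0) > value else value
    return total


def roman_candidates(text: str) -> dict:
    """Map every all-uppercase word that parses as a Roman numeral to its value."""
    found = {}
    for token in re.findall(r"\b[A-Z]+\b", text):
        value = roman_to_int(token)
        if value is not None:
            found[token] = value
    return found


sentence = "Chapter XIV of the DIY guide says I should MIX the CD audio."
print(roman_candidates(sentence))
# {'XIV': 14, 'I': 1, 'MIX': 1009, 'CD': 400}
```

Only "XIV" is meant as a numeral in that sentence, yet the pronoun "I", the word "MIX", and the abbreviation "CD" all parse as valid numerals too.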

Removal options

lowercase
bool
default:"True"
Convert all text to lowercase before processing.
remove_urls
bool
default:"True"
Strip URLs from the text.
remove_emails
bool
default:"True"
Strip email addresses from the text.
remove_html
bool
default:"True"
Strip HTML tags from the text.
remove_hashtags
bool
default:"False"
Strip hashtags (e.g., #KittenTTS) from the text.
remove_mentions
bool
default:"False"
Strip @mentions from the text.
remove_punctuation
bool
default:"True"
Remove punctuation characters after all expansions have been applied.
remove_stopwords
bool
default:"False"
Remove common stopwords from the text. Provide a custom set via stopwords.
stopwords
set | None
default:"None"
Custom set of stopwords to use when remove_stopwords=True. If None, a built-in English stopword list is used.
remove_extra_whitespace
bool
default:"True"
Collapse multiple consecutive whitespace characters into a single space and strip leading/trailing whitespace.
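
As a rough picture of what the removal options do, the following sketch strips HTML tags, URLs, and email addresses with regular expressions and then collapses whitespace. The patterns and the function name `strip_unspeakable` are assumptions made for illustration. They are deliberately simple and are not the patterns KittenTTS uses.

```python
import re

_HTML_TAG = re.compile(r"<[^>]+>")
_URL = re.compile(r"https?://\S+|www\.\S+")
_EMAIL = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")
_WHITESPACE = re.compile(r"\s+")


def strip_unspeakable(text: str) -> str:
    """Remove tags, URLs, and emails, then tidy up the leftover whitespace."""
    # Tags go first so that a URL directly followed by "</p>" does not
    # swallow the closing tag as part of its non-space run.
    text = _HTML_TAG.sub(" ", text)
    text = _URL.sub(" ", text)
    text = _EMAIL.sub(" ", text)
    # Each removal leaves a space behind; collapse runs and trim the ends.
    return _WHITESPACE.sub(" ", text).strip()


raw = "<p>Visit https://example.com or mail  me@example.com today.</p>"
print(strip_unspeakable(raw))  # Visit or mail today.
```

Replacing each match with a space rather than an empty string keeps neighboring words from fusing; the final whitespace pass then cleans up the gaps, which is the job remove_extra_whitespace describes.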

Unicode handling

normalize_unicode
bool
default:"True"
Apply Unicode NFC normalization to the text.
remove_accents
bool
default:"False"
Strip accent marks from characters after Unicode normalization. "café" → "cafe".
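
Both Unicode options can be approximated with Python's standard `unicodedata` module. The sketch below shows one common way to do it and is not necessarily how KittenTTS implements these flags; the function names are hypothetical.

```python
import unicodedata


def normalize_nfc(text: str) -> str:
    """Compose base characters and combining marks into single code points."""
    return unicodedata.normalize("NFC", text)


def strip_accents(text: str) -> str:
    """Drop combining marks by decomposing, filtering, and recomposing."""
    decomposed = unicodedata.normalize("NFD", text)
    kept = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", kept)


# "e" followed by U+0301 (combining acute accent) renders like a single "é"
# but is two code points until it is normalized.
decomposed_cafe = "cafe\u0301"
print(len(decomposed_cafe))                 # 5
print(len(normalize_nfc(decomposed_cafe)))  # 4
print(strip_accents("caf\u00e9"))           # cafe
```

NFC normalization matters because visually identical strings can otherwise differ at the code point level and fail to match the same expansion rules.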

Methods

process()

TextPreprocessor.process(text: str) -> str
Runs the full preprocessing pipeline and returns the normalized text string. TextPreprocessor is also callable — pp(text) is equivalent to pp.process(text).
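
The flag-driven pipeline and the callable shorthand can be pictured with a tiny stand-in class. `MiniPreprocessor` is hypothetical and supports only three of the options listed above, with digits left unexpanded for brevity. It mainly illustrates why punctuation is removed after expansions: the "%" sign has to survive long enough to be expanded.

```python
import re
import string


class MiniPreprocessor:
    """Hypothetical stand-in, not the KittenTTS TextPreprocessor."""

    def __init__(self, lowercase=True, expand_percentages=True,
                 remove_punctuation=True):
        self.lowercase = lowercase
        self.expand_percentages = expand_percentages
        self.remove_punctuation = remove_punctuation

    def process(self, text: str) -> str:
        if self.lowercase:
            text = text.lower()
        if self.expand_percentages:
            # Expansion runs while the "%" sign is still present.
            text = re.sub(r"(\d)%", r"\1 percent", text)
        if self.remove_punctuation:
            # Punctuation is stripped last, after every expansion.
            text = text.translate(str.maketrans("", "", string.punctuation))
        return re.sub(r"\s+", " ", text).strip()

    def __call__(self, text: str) -> str:
        # Calling the instance is shorthand for process().
        return self.process(text)


pp = MiniPreprocessor()
print(pp("Save 20% Today!"))          # save 20 percent today
print(pp.process("Save 20% Today!"))  # save 20 percent today
```

If punctuation removal ran first, "20%" would already be "20" by the time the percentage step looked for it.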

Usage examples

from kittentts.preprocess import TextPreprocessor

# Most expansions are enabled by default
pp = TextPreprocessor()

result = pp("The price is $99.99, a 20% discount.")
# → "the price is ninety-nine dollars and ninety-nine cents a twenty percent discount"

TextPreprocessor is stateless, so you can reuse a single instance across many calls without thread-safety concerns.
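
Because one instance can be shared, batch work can be fanned out over a thread pool. The helper below is a sketch using only the standard library; `process_many` is a hypothetical name, and it accepts any callable that maps a string to a string, so `str.lower` stands in for the preprocessor to keep the example runnable without KittenTTS installed.

```python
from concurrent.futures import ThreadPoolExecutor


def process_many(preprocess, texts, max_workers=4):
    """Apply one shared callable to many texts, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the order of the inputs,
        # regardless of which worker finishes first.
        return list(pool.map(preprocess, texts))


# str.lower is a stand-in for a shared preprocessor instance.
print(process_many(str.lower, ["Hello World", "KittenTTS"]))
# ['hello world', 'kittentts']
```

With the library installed, passing a single `TextPreprocessor()` instance as `preprocess` should work the same way, since the instance is callable.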