piper
A fast, local neural text-to-speech system. Try out and download speech models from https://rhasspy.github.io/piper-samples. More information: https://github.com/rhasspy/piper.
- Output a WAV [f]ile using a text-to-speech [m]odel (assumes the config file is at the model path with .json appended, e.g. path/to/model.onnx.json):
echo 'Thing to say' | piper -m path/to/model.onnx -f outputfile.wav
- Output a WAV [f]ile using a [m]odel and specifying its JSON [c]onfig file:
echo 'Thing to say' | piper -m path/to/model.onnx -c path/to/model.onnx.json -f outputfile.wav
- Select a particular speaker in a voice with multiple speakers by specifying the speaker's ID number:
echo 'Warum?' | piper -m de_DE-thorsten_emotional-medium.onnx --speaker 1 -f angry.wav
- Stream the output to the mpv media player:
echo 'Hello world' | piper -m en_GB-northern_english_male-medium.onnx --output-raw -f - | mpv -
- Speak twice as fast, with huge gaps between sentences:
echo 'Speaking twice the speed. With added drama!' | piper -m foo.onnx --length_scale 0.5 --sentence_silence 2 -f drama.wav
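- Check that the config file exists before synthesizing. This is a minimal sketch built on the first two examples: it assumes the config sits at the model path with .json appended, and config_path_for is a hypothetical helper for illustration, not part of piper:

```shell
# Hypothetical helper (not part of piper): derive the config path
# by appending .json to the model path, as the first example assumes.
config_path_for() {
  printf '%s.json\n' "$1"
}

model="path/to/model.onnx"
config="$(config_path_for "$model")"

if [ -f "$config" ]; then
  # Pass the config explicitly with -c, as in the second example.
  echo 'Thing to say' | piper -m "$model" -c "$config" -f outputfile.wav
else
  # Skip synthesis and report the missing file on stderr.
  echo "Missing config: $config" >&2
fi
```

Passing -c explicitly keeps the command working if the config is later moved away from the model file; only the config variable needs to change.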