PicoTTS Text-to-Speech component

jmattsson
Posts: 35
Joined: Fri Jun 03, 2016 5:37 am

Post by jmattsson » Fri Feb 09, 2024 12:29 am

Hi all,

For quite some time I've been hoping Espressif would provide a TTS engine for English as well as Chinese. It's all well and good to be able to call out to services like OpenAI's TTS, but to me on-the-edge speech interaction needs to function without Internet connectivity. We already get excellent wake word support and great speech command support on-chip. All that's missing for a full round trip of speech interaction is text-to-speech.

I finally got enough time to sit down with my ESP-BOX and port the PicoTTS engine to an ESP-IDF component. While the voice is nowhere near the quality of, say, OpenAI's, it's still remarkably good considering the limited resources it has available. So, if there are people other than myself who have been hungering for a text-to-speech engine on the ESP, have a play with the PicoTTS component. You get a choice between English (UK/US), German, French, Italian and Spanish voices. There's a small example included for the ESP-BOX that you can use to get started. If anyone wants to provide examples for something other than the ESP-BOX, feel free to raise a PR! (Or send me other dev boards =^,^= )

Hope you all find it useful!

Kindly,
/J

CityHunter71
Posts: 1
Joined: Sat Aug 17, 2024 6:06 pm

Re: PicoTTS Text-to-Speech component

Post by CityHunter71 » Sat Aug 17, 2024 6:29 pm

Hi jmattsson,

I'm very happy that we share the goal of having TTS working offline on the ESP32 (I use an S3).

I've been trying different solutions for months and now I'll try yours.

I've tried porting Talkie and TTS from Arduino to the ESP32, but the biggest difficulty is that many of these libraries rely on processor "timing" and handle pauses by counting out delays. Since I want to build a multitasking system, the very concept of WAIT is wrong, so I am implementing functions that produce pauses by inserting silence into the audio buffer instead.
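To make the idea concrete, here is a minimal sketch of such a function. It is only an illustration under assumptions: the buffer is taken to hold 16 kHz mono, 16-bit signed PCM, and the names `append_silence` and `SAMPLE_RATE_HZ` are hypothetical, not part of Talkie, PicoTTS or ESP-IDF.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: 16 kHz mono, 16-bit signed PCM. Adjust to the real format. */
#define SAMPLE_RATE_HZ 16000u

/*
 * Append `pause_ms` milliseconds of silence to `buf`, starting at sample
 * index `*used`. The pause is clamped to the space that is left.
 * Updates `*used` and returns the number of samples actually written.
 */
static size_t append_silence(int16_t *buf, size_t capacity,
                             size_t *used, uint32_t pause_ms)
{
    assert(buf != NULL && used != NULL);

    if (*used >= capacity) {
        return 0; /* buffer already full, nothing to add */
    }

    /* Convert the pause from milliseconds to a sample count. */
    size_t wanted = (size_t)(((uint64_t)SAMPLE_RATE_HZ * pause_ms) / 1000u);
    size_t room = capacity - *used;
    size_t n = wanted < room ? wanted : room;

    /* In signed PCM, silence is simply a run of zero-valued samples. */
    memset(buf + *used, 0, n * sizeof(int16_t));
    *used += n;
    return n;
}
```

The pause then becomes part of the audio data itself: the task that fills the buffer never blocks, and whatever drains the buffer (an I2S writer, for example) plays the pause at the correct length regardless of how the tasks are scheduled.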

We have a lot of work to do and it's nice to share ideas.

Greetings
CityHunter71

aliarifat794
Posts: 124
Joined: Sun Jun 23, 2024 6:18 pm

Re: PicoTTS Text-to-Speech component

Post by aliarifat794 » Sun Aug 18, 2024 9:09 am

Thanks for sharing with us.

jmattsson
Posts: 35
Joined: Fri Jun 03, 2016 5:37 am

Re: PicoTTS Text-to-Speech component

Post by jmattsson » Mon Aug 19, 2024 3:45 am

Hi CityHunter71!

I hope you'll have fun experimenting with PicoTTS! You'll have full control over the audio buffers if you wish, so you can tinker as much as you'd like. I did find that I ran low on CPU cycles if I tried to both sample and generate speech at the same time, but I haven't had time to see whether further optimisation could resolve that.

Good luck, and looking forward to seeing what you come up with!
~Jade
