PicoTTS Text-to-Speech component

jmattsson
Posts: 37
Joined: Fri Jun 03, 2016 5:37 am

Postby jmattsson » Fri Feb 09, 2024 12:29 am

Hi all,

For quite some time I've been hoping Espressif would provide a TTS engine for English as well as Chinese. It's all well and good to be able to call out to services like OpenAI's TTS, but to me on-the-edge speech interaction needs to function without Internet connectivity. We already get excellent wake word support and great speech command support on-chip. All that's missing for a full round trip of speech interaction is text-to-speech.

I finally got enough time to sit down with my ESP-BOX and port the PicoTTS engine to an ESP-IDF component. While the voice is nowhere near the quality of, say, OpenAI's, it's still remarkably good considering the limited resources it has available. So, if there are people other than myself who have been hungering for a text-to-speech engine on the ESP, have a play with the PicoTTS component. You get a choice between English (UK/US), German, French, Italian and Spanish voices. There's a small example included for the ESP-BOX that you can use to get started. If anyone wants to provide examples for something other than the ESP-BOX, feel free to raise a PR! (Or send me other dev boards =^,^= )

Hope you all find it useful!

Kindly,
/J

CityHunter71
Posts: 3
Joined: Sat Aug 17, 2024 6:06 pm

Re: PicoTTS Text-to-Speech component

Postby CityHunter71 » Sat Aug 17, 2024 6:29 pm

Hi jmattsson,

I am very happy that we share the goal of having TTS working offline on the ESP32 (I use an S3).

I've been trying different solutions for months and now I'll try yours.

I've tried porting Talkie and TTS from Arduino to ESP32, but the biggest difficulty is that many of these libraries rely on processor timing and produce pauses by counting delay cycles. Since I want a multitasking system, the very concept of WAIT is wrong, so I am implementing functions that create pauses by inserting silence into the audio buffer instead.
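
To illustrate the silence-insertion idea, here is a minimal sketch in plain C. It assumes 16 kHz, 16-bit mono PCM; the sample rate, the function name append_silence and its parameters are my own illustrative choices and do not come from any of the libraries mentioned, so adjust them to whatever your engine actually produces.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: 16 kHz, 16-bit mono PCM. Change this to match your engine. */
#define SAMPLE_RATE_HZ 16000u

/*
 * Hypothetical helper: writes `ms` milliseconds of silence (zero samples)
 * into `buf`, starting at sample index `used`.
 * Returns the new number of samples in use. If the buffer is too small,
 * the pause is truncated to whatever room is left.
 */
size_t append_silence(int16_t *buf, size_t capacity, size_t used, uint32_t ms)
{
    assert(buf != NULL);

    if (used >= capacity) {
        return used; /* no room left, nothing to do */
    }

    /* Convert the pause length from milliseconds to a sample count. */
    size_t want = (size_t)(((uint64_t)SAMPLE_RATE_HZ * ms) / 1000u);
    size_t room = capacity - used;
    if (want > room) {
        want = room;
    }

    /* For signed 16-bit PCM, silence is simply all-zero samples. */
    memset(buf + used, 0, want * sizeof(int16_t));
    return used + want;
}
```

Because the pause becomes part of the sample stream, the task feeding the audio output never has to block in a delay loop; the output peripheral just plays zeros for the requested duration.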

We have a lot of work to do and it's nice to share ideas.

Greetings
CityHunter71

aliarifat794
Posts: 172
Joined: Sun Jun 23, 2024 6:18 pm

Re: PicoTTS Text-to-Speech component

Postby aliarifat794 » Sun Aug 18, 2024 9:09 am

Thanks for sharing with us.

jmattsson
Posts: 37
Joined: Fri Jun 03, 2016 5:37 am

Re: PicoTTS Text-to-Speech component

Postby jmattsson » Mon Aug 19, 2024 3:45 am

Hi CityHunter71!

I hope you'll have fun experimenting with PicoTTS! You'll have full control over the audio buffers if you wish, so you can tinker as much as you'd like. I did find that I ran low on CPU cycles if I tried to both sample and generate speech at the same time, but I haven't had time to see whether further optimisation could resolve that.

Good luck, and looking forward to seeing what you come up with!
~Jade

lukepshot
Posts: 1
Joined: Fri Oct 11, 2024 5:18 pm

Re: PicoTTS Text-to-Speech component

Postby lukepshot » Fri Oct 11, 2024 5:23 pm

Hi jmattsson,

I too am looking to use offline STT / TTS on my esp32-pico (Atom M5) boards. I was wondering if you have an example project I could reference?

Best,
lukepshot

jmattsson
Posts: 37
Joined: Fri Jun 03, 2016 5:37 am

Re: PicoTTS Text-to-Speech component

Postby jmattsson » Sun Oct 13, 2024 7:26 am

Hi lukepshot,

There is an example available with the component; have a look at the example page: https://components.espressif.com/compon ... anguage=en. It's currently only for the ESP-BOX, so you'd need to provide your own minimal BSP files for your board.

If you want to be fancy, you could make the board/BSP selectable via Kconfig and raise a PR over at https://github.com/DiUS/esp-picotts :D

You might find the pico quite tight on RAM to fit the TTS engine into, but I wish you luck!

Warmly,
~Jade

rama98
Posts: 1
Joined: Sat Nov 02, 2024 6:27 am

Re: PicoTTS Text-to-Speech component

Postby rama98 » Sat Nov 02, 2024 6:33 am

Hi jmattsson,

I am a student trying to use PicoTTS on my ESP32-S3 board, but I am having trouble getting it working based on the ESP-BOX example. Is it possible to use it on an ESP32-S3 with a MAX98357A amplifier over I2S?

Thanks in advance for your response!

jmattsson
Posts: 37
Joined: Fri Jun 03, 2016 5:37 am

Re: PicoTTS Text-to-Speech component

Postby jmattsson » Tue Nov 05, 2024 11:24 pm

Hi rama98,

I would expect it to be possible, but I have no direct experience. You'll need to provide a different mini BSP (Board Support Package) that does the required initialisation and setup of your hardware. Have a look at the files main/bsp/esp-box.h and main/bsp/esp-box.c: the three functions in that header are what you would need to provide a custom implementation of.

You may be able to find an example somewhere on the net already for those. Definitely have a look through the esp-bsp repo (https://github.com/espressif/esp-bsp/) and see if your board is already supported there. If it is, you can probably copy the functions from there.
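
As a rough illustration of the general idea of swapping in board-specific code, here is a sketch of one common pattern: a small table of function pointers that each board fills in. Everything in it is hypothetical, including the names board_audio_ops_t and play_samples and the init/write/deinit split; it is not the actual interface in main/bsp/esp-box.h, so check that header for the real function names and signatures. The "board" below is a host-side stub that only counts samples, where a real implementation would set up the I2S peripheral and amplifier.

```c
#include <stddef.h>
#include <stdint.h>

/* HYPOTHETICAL board abstraction, for illustration only.
 * This is NOT the real esp-picotts BSP header. */
typedef struct {
    int    (*init)(void);                                  /* 0 on success */
    size_t (*write)(const int16_t *samples, size_t count); /* samples accepted */
    void   (*deinit)(void);
} board_audio_ops_t;

/* Host-side stub "board": counts samples instead of driving hardware. */
size_t stub_samples_written;
int stub_ready;

static int stub_init(void)
{
    stub_samples_written = 0;
    stub_ready = 1;
    return 0;
}

static size_t stub_write(const int16_t *samples, size_t count)
{
    if (!stub_ready || samples == NULL) {
        return 0;
    }
    stub_samples_written += count;
    return count;
}

static void stub_deinit(void)
{
    stub_ready = 0;
}

const board_audio_ops_t stub_board = { stub_init, stub_write, stub_deinit };

/* Plays `count` samples through whichever board is passed in.
 * Returns the number of samples the board accepted (0 if init failed). */
size_t play_samples(const board_audio_ops_t *board,
                    const int16_t *samples, size_t count)
{
    if (board->init() != 0) {
        return 0;
    }

    size_t done = 0;
    while (done < count) {
        size_t n = board->write(samples + done, count - done);
        if (n == 0) {
            break; /* board refused more data, stop rather than spin */
        }
        done += n;
    }

    board->deinit();
    return done;
}
```

With this shape, supporting a new board means writing one more ops table, and the code that feeds the synthesised samples stays unchanged.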

Good luck!
