Fl4m3Ph03n1x

What are the best text-to-speech AI generation tools that you can run locally?

Background

Lately I have been on a quest to find a good-quality TTS AI generation tool that runs locally, so I can create audio for some videos I am making.

I have limited knowledge of neural/Bayesian networks, and the area has moved a lot since the last time I studied it in detail, almost a decade ago.

So I am admittedly a newcomer to everything TTS-AI related.

What I tried

At first I tried using online SaaS tools, like ElevenLabs, but the restrictions are massive and I simply cannot pay.

So I moved to local tools. I tried:

The first 3 failed because they are either no longer maintained, require an NVIDIA GPU (which I don’t have), or simply lack enough guides/information online on how to train models with them.

I am currently trying out Piper, but I am having trouble finding voice datasets in the format it requires for training (I only know of a German one, and I need it to be English).

What I need

I am looking for a tool that can create high-quality audio with a male voice, to read lectures. I don’t need it to be super efficient, but I do need it to work without NVIDIA GPUs. Given my novice status here, I would also appreciate it a lot if there were a community that could help me with my questions when setting up or using the tool.

What are the tts-ai tools you would recommend that can fit these requirements?

Marked As Solved

Fl4m3Ph03n1x

I was fairly impressed and ended up using kokoro-tts: hexgrad/Kokoro-82M · Hugging Face

I can’t run it locally (no NVIDIA GPU), but Google Colab works perfectly fine for my needs.
If anyone has a strong enough NVIDIA GPU, I would recommend kokoro.

Also Liked

Fl4m3Ph03n1x

Mozilla TTS has not been updated in 4 years (at least). The quality of the generated sound is rather poor, or at least I was not able to generate human-passable sound with that tool. This is the command I used:

tts --text "If you like to use TTS to try a new idea and like to share your experiments with the community, we urge you to use the following guideline for a better collaboration. (If you have an idea for better collaboration, let us know)" \
  --model_name tts_models/en/ljspeech/neural_hmm \
  --vocoder_name vocoder_models/en/ek1/wavegrad \
  --out_path test.wav
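
For anyone who wants to script that call instead of retyping it, here is a minimal Python sketch. It only reuses the flags and model names from the command above; the helper name `build_tts_command` is made up for illustration, and it assumes the `tts` executable is on your PATH when you actually run the result.

```python
import shlex


def build_tts_command(
    text,
    out_path,
    model_name="tts_models/en/ljspeech/neural_hmm",
    vocoder_name="vocoder_models/en/ek1/wavegrad",
):
    """Return the argument list for one `tts` call (hypothetical helper).

    Only the flags shown in the command above are used.
    """
    return [
        "tts",
        "--text", text,
        "--model_name", model_name,
        "--vocoder_name", vocoder_name,
        "--out_path", out_path,
    ]


cmd = build_tts_command("Hello from a local TTS test.", "test.wav")
# Print a copy-pasteable shell line; nothing is executed here.
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` would execute it. Keeping the arguments as a list avoids quoting problems when the text contains quotes or parentheses.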

OpenVoice, as you very well mentioned, needs to become more mature before it can be used for the purpose I have in mind, as it shares the same poor quality that Mozilla TTS does.

I am currently playing with F5, which even has online tutorials: https://www.youtube.com/watch?v=ASFoTNpkM8o

It seems quite decent. I was able to run it on my local setup as well, which is a big plus. The problem I now face is twofold:

  • I need to find a voice database with male voices in English (have no idea where to find one)
  • I need to then train F5 or whatever tool I use with that voice

As a final step, I will then also need to learn how to get said tool to read whole paragraphs instead of one-liners.
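
One tool-agnostic way to approach the paragraph problem is to split the lecture text into sentence-aligned chunks and synthesize them one at a time. The sketch below is only an assumption about how that could look: `split_into_chunks` and the 300-character limit are invented for illustration, not part of F5 or any other tool, and the sentence splitting is deliberately naive (it will trip on abbreviations like "e.g.").

```python
import re


def split_into_chunks(text, max_chars=300):
    """Split text into sentence-aligned chunks of at most max_chars characters.

    A single sentence longer than max_chars is kept whole rather than
    being cut mid-sentence.
    """
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would overflow: close the current chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


lecture = (
    "Welcome to the lecture. Today we look at text-to-speech. "
    "We start with the basics! Any questions so far?"
)
for index, chunk in enumerate(split_into_chunks(lecture, max_chars=60)):
    # Each chunk would be sent to the TTS tool as one request.
    print(index, chunk)
```

Each chunk can then be fed to whatever tool is in use, and the resulting audio files joined in order.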

Scarlet

You might want to check out VITS or XTTS from Mozilla/TTS, which can run on CPUs and supports training with custom datasets. Another option is OpenVoice (if it matures further) or Voxygen (which has some offline options). For datasets, LibriTTS and Common Voice (filtered for quality) are good English sources. You can also try RHVoice—not the best quality, but flexible for CPU usage.

For community support, the TTS subreddit, GitHub discussions for Mozilla/TTS, and OpenAI TTS communities might be helpful!
