FREE AI Voice Tool: Best Opensource AI Text-to-Speech (TTS) - Amphion Better Than Bark!

Views: 34,660

Comments: 43

@intheworldofai 10 months ago

💓Thank you so much for watching, guys! I would highly appreciate it if you subscribe (turn on the notification bell), like, and comment on what else you want to see! 📅 Book a 1-On-1 Consulting Call With Me: calendly.com/worldzofai/ai-consulting-call-1 🔥 Become a Patron (Private Discord): patreon.com/WorldofAi 🧠 Follow me on Twitter: twitter.com/intheworldofai Love y'all and have an amazing day, fellas.☕ To help and support me, buy a coffee or donate to support the channel: ko-fi.com/worldofai - Thank you so much, guys! Love y'all

@Subcode 10 months ago

Tortoise sounds way better in the example.

@zoldyg7979 9 months ago

The downside of Tortoise is that it only works with Nvidia GPUs.

@ANONYMOUSAdmin232 8 months ago

@zoldyg7979 And it's called Tortoise for a reason... It's super slow.

@kenrock2 10 months ago

After listening to the samples at 10:44, I find the reading in Tortoise more natural and closer to human speech than Amphion's. Second would be ESPNet.

@intheworldofai 10 months ago

I agree with you. Tortoise sounded more realistic compared to Amphion.

@kenrock2 10 months ago

@intheworldofai What amazed me about Tortoise is that it can pick up the type of character and slang from the dialog it is fed. For example, given cowgirl and vampire dialog generated by ChatGPT, each came out with different slang even though both used the same voice model. Truly amazing.

@intheworldofai 10 months ago

@kenrock2 It has way more customization, but it may be hard to configure for the average person. You should also take a look at AudioCraft. It is quite good!

@tammyunay2579 10 months ago

The problem with Tortoise is that it eats too much memory and is slow as fuck. I have an 8GB video card; it's not that high, but it's not that low either.

@NanasNumbers 5 months ago

You're the BEST ever for sharing so much open source info. Thanks so much for all you do!

@intheworldofai 5 months ago

I appreciate that!

@antdx316 10 months ago

How many GB are required to run this?

@Tom77889 3 months ago

Especially for the non-techies like me.

@Tom77889 3 months ago

This looks good, but I wish you would go through the process step by step and a bit more slowly, as we are overwhelmed. Too fast.

@intheworldofai 10 months ago

The Amphion zero-shot TTS NaturalSpeech2 Gradio demo is out.
Demo: huggingface.co/spaces/amphion/NaturalSpeech2
Run with Docker: huggingface.co/spaces/amphion/NaturalSpeech2?docker=true
Duplicate the Space with a private GPU and no queue: huggingface.co/spaces/amphion/NaturalSpeech2?duplicate=true

@SyamsQbattar 4 months ago

Does this local AI voice tool support the Indonesian language?

@rosemarysalem 10 months ago

Love this! I will be checking this tool out!

@xiaojinyusaudiobookswebnov4951 10 months ago

Can you drop the full version of the music at the beginning?

@intheworldofai 10 months ago

If you go on Hugging Face, you will be able to get the full version of the music.

@intheworldofai 10 months ago

[MUST WATCH]:
How to Install AudioCraft FOR FREE - Text-to-Music AI Generator Locally (AudioGen): kzbin.info/www/bejne/p2jLkH6Dh5aSbKs
MusicLM: Create Music With Text Using Google's NEW AI Tool: kzbin.info/www/bejne/iYXXlGCXba-arck
Bark: FREE Opensource Text-To-Speech Ai Tool - Realistic Humanlike Voices: kzbin.info/www/bejne/lZ6Qfqt7pa2Ih7s
OpenCustomGPT: Create Custom GPTs For Coding, Retrival, & Chatbots For FREE!: kzbin.info/www/bejne/Z2i4eHiubbyVepI

@TheUnderscore_ 10 months ago

Hey there, just wondering, is it possible to create sounds and text within the same prompt (i.e. laughter, sighing, etc.)? I've tried different options in the Text-To-Audio demo on Huggingface, but it just seems to read the text literally.

@maxcurrent485 10 months ago

Try using Bark and adding [laughing] to the prompt

@intheworldofai 10 months ago

Hey man! Yeah, you can do it within the same prompt. Maybe try using another Space to generate the sounds.
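
The Bark suggestion in this thread (putting a bracketed cue such as [laughing] in the prompt) can be sketched roughly as follows. This is a minimal sketch, not a tested recipe: `add_cue` and `synthesize` are hypothetical helper names, the output filename is made up, and it assumes the `bark` and `scipy` packages are installed. Bark's README lists cues like [laughter] and [sighs], so the exact token that works best is worth checking there.

```python
def add_cue(text: str, cue: str) -> str:
    """Prefix a bracketed non-speech cue to a prompt, e.g. '[laughter] Hi'."""
    # Accept the cue with or without brackets and normalize it.
    clean_cue = cue.strip("[] ")
    return f"[{clean_cue}] {text.strip()}"


def synthesize(prompt: str, out_path: str = "bark_out.wav") -> str:
    """Generate speech for the prompt with Bark and write it to a WAV file."""
    # Imported lazily so add_cue can be used without Bark installed.
    from bark import SAMPLE_RATE, generate_audio, preload_models
    from scipy.io.wavfile import write as write_wav

    preload_models()                  # downloads and loads model weights on first use
    audio = generate_audio(prompt)    # NumPy array of audio samples
    write_wav(out_path, SAMPLE_RATE, audio)
    return out_path


if __name__ == "__main__":
    # Only builds and prints the prompt; call synthesize(prompt) to produce audio.
    prompt = add_cue("That was the funniest thing I've heard all week.", "laughter")
    print(prompt)
```

Whether the cue is rendered as a sound or read out literally depends on the model, so it may take a few attempts with different cues.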

@intheworldofai 10 months ago

OpenChat UPDATE: Best Opensource 7B Model EVER! Better Than ChatGPT & Mistral!: kzbin.info/www/bejne/jJ3VaIuwa8eBjbM

@LucidFirAI 8 months ago

'sh' is not recognized as an internal or external command, operable program or batch file.

@miaohf 8 months ago

If you use a Jupyter notebook in Colab, you should add "!" before your command. Example: ! ls -ltr
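
The "'sh' is not recognized" message above is what the Windows command prompt prints when no program named sh is on the PATH. A small check like the one below can confirm that before running a shell script. It is only a sketch: `find_posix_shell` and `run_script` are hypothetical helper names, and `install.sh` in the comment is a placeholder, not necessarily the script used in the video.

```python
import shutil
import subprocess


def find_posix_shell():
    """Return the path of the first POSIX-style shell found on PATH, or None."""
    for name in ("sh", "bash"):
        path = shutil.which(name)
        if path:
            return path
    return None


def run_script(script_path):
    """Run a shell script with whichever shell is available, or fail clearly."""
    shell = find_posix_shell()
    if shell is None:
        raise RuntimeError(
            "No 'sh' or 'bash' found on PATH. On Windows, options include "
            "WSL, Git Bash, or running the steps in a Colab notebook."
        )
    return subprocess.run([shell, script_path], check=True)


if __name__ == "__main__":
    # Report what is available; run_script("install.sh") would execute a script.
    print(find_posix_shell() or "no POSIX shell found")
```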

@VaIhalIa 10 months ago

00:01 Amphion is a free AI voice tool for generating audio, music, and speech.
02:25 Amphion is a versatile AI toolkit for creating audio, music, and speech.
04:11 Amphion offers various vocoders and evaluation metrics for top-notch audio signals.
05:56 Amphion can generate visualizations with audio, leading in this new capability.
07:36 Clone the repository and create a Python environment.
09:19 Installation and usage of Amphion for text-to-speech.
10:58 Amphion and Tortoise are compared for text-to-speech capabilities.
12:51 Amphion TTS is in development and improving.

@gaweyn 10 months ago

11:16 "consecrated braid"? Why would they put up that broken example as a sample?

@kiyonmcdowell5603 9 months ago

What's the difference between large language models and text-to-speech?

@randallrulo2109 8 months ago

Large language models, or LLMs, are neural networks stored as files of numeric weights learned from massive text datasets. Those weights let the model interpret text input and generate coherent text responses.

@randallrulo2109 8 months ago

TTS, on the other hand, is a technology that allows a computer to synthesize speech from text input.

@OptaIgin 5 months ago

Does anyone know how I can change Ubuntu's default whisper voice? It only has Default; I want something like Zira and Mike from Windows. lol

@komakaze1 10 months ago

I'm looking for good TTS inference I can run on a CPU or an older AMD GPU, preferably with a huge library of community-trained voices I could download and try out. I just heard about Pheme today (I'm not sure if it's more than a white paper yet). I've heard Tortoise is good but slow. I'm not sure if that's still true, as there seem to be ways to make it faster. SVC2 is more for voice changing; I don't think it can do TTS. I've heard Coqui is quite good. Amphion sounds interesting, as it can generate sounds as well as TTS.

@the42nd 10 months ago

Great article. What is the best voice clone AI tool? I was using Descript, but it had this strange digital garble in it that made it kind of useless.

@X2ytCrystal 5 months ago

Descript was great, but voice selection is really poor. Of all the "good" TTS apps, they probably have the least voices, and out of those, only like 2 are good.

@intelligenceservices 2 months ago

Is this another content creation video about yet another exciting bleeding-edge AI tool that almost nobody will be successful at installing and using, and will become abandonware a year ago? I'm at my wits' end trying to get Coqui, XTTS, or anything to work (for TTS purposes), and it's all just one Groundhog Day of nothing working. My typical day involves adding 10 more GitHub tabs to my browser, git cloning, installing PyTorch for the 50th time, and reading new and exciting errors. Update: Amphion won't even finish installing without going into an install/uninstall loop and finally exiting with errors, which is unsurprising.

@MarcusNeufeldt 10 months ago

🎯 Key Takeaways for quick navigation:
00:00 🌐 Amphion is an open-source text-to-speech model that can generate audio, music, and speech.
01:02 📚 Aimed at supporting reproducible research and helping junior researchers and engineers in audio, music, and speech generation.
01:30 🆓 Amphion is a free, open-source alternative to other text-to-speech models like Bark, with various audio generation capabilities.
03:26 🧠 Amphion's platform allows for studying the conversion of different inputs into audio, not just generating audio but also understanding the process.
05:03 🔍 Unique feature: Amphion offers visualization in audio generation, a feature not commonly found in similar toolkits.
Made with HARPA AI