💓 Thank you so much for watching, guys! I would highly appreciate it if you subscribe (turn on the notification bell), like, and comment what else you want to see! 📅 Book a 1-on-1 consulting call with me: calendly.com/worldzofai/ai-consulting-call-1 🔥 Become a Patron (private Discord): patreon.com/WorldofAi 🧠 Follow me on Twitter: twitter.com/intheworldofai ☕ To help and support me, buy a coffee or donate to support the channel: ko-fi.com/worldofai Love y'all and have an amazing day!
@Subcode 10 months ago
Tortoise sounds way better in the example.
@zoldyg7979 9 months ago
The downside of Tortoise is that it only works with Nvidia GPUs.
@ANONYMOUSAdmin232 8 months ago
@zoldyg7979 And it's called Tortoise for a reason... it's super slow.
@kenrock2 10 months ago
After listening to the samples at 10:44, I find the reading in Tortoise more natural and closer to human speech than Amphion. Second would be ESPnet.
@intheworldofai 10 months ago
I agree with you. Tortoise sounded more realistic compared to Amphion.
@kenrock2 10 months ago
@intheworldofai What amazed me about Tortoise is that it picks up the character and slang to use from the dialog you feed it. For example, given a cowgirl dialog and a vampire dialog generated with ChatGPT, each comes out with different slang even though both use the same voice model. Truly amazing.
@intheworldofai 10 months ago
@kenrock2 It has way more customization, but it may be hard to configure for the average person. You should also take a look at AudioCraft. It is quite good!
@tammyunay2579 10 months ago
The problem with Tortoise is that it eats too much memory and is slow as fuck. I have an 8GB video card; it's not that high, but it's not that low.
@NanasNumbers 5 months ago
You're the BEST ever for sharing so much open source info. Thanks so much for all you do!
@intheworldofai 5 months ago
I appreciate that!
@antdx316 10 months ago
How many GB are required to run this?
@Tom77889 3 months ago
Especially for the non-techies like me.
@Tom77889 3 months ago
This looks good, but I wish you would go through the process step by step and a bit more slowly, as we are overwhelmed. Too fast.
@intheworldofai 10 months ago
The Amphion zero-shot TTS NaturalSpeech2 Gradio demo is out.
Demo: huggingface.co/spaces/amphion/NaturalSpeech2
Run with Docker: huggingface.co/spaces/amphion/NaturalSpeech2?docker=true
Duplicate the Space with a private GPU and no queue: huggingface.co/spaces/amphion/NaturalSpeech2?duplicate=true
@SyamsQbattar 4 months ago
Does this local AI voice tool support the Indonesian language?
@rosemarysalem 10 months ago
Love this! I will be checking this tool out!
@xiaojinyusaudiobookswebnov4951 10 months ago
Can you drop the full version of the music at the beginning?
@intheworldofai 10 months ago
If you go on Hugging Face, you will be able to get the full version of the music.
@intheworldofai 10 months ago
[MUST WATCH]:
How to Install AudioCraft FOR FREE - Text-to-Music AI Generator Locally (AudioGen): kzbin.info/www/bejne/p2jLkH6Dh5aSbKs
MusicLM: Create Music With Text Using Google's NEW AI Tool: kzbin.info/www/bejne/iYXXlGCXba-arck
Bark: FREE Opensource Text-To-Speech Ai Tool - Realistic Humanlike Voices: kzbin.info/www/bejne/lZ6Qfqt7pa2Ih7s
OpenCustomGPT: Create Custom GPTs For Coding, Retrival, & Chatbots For FREE!: kzbin.info/www/bejne/Z2i4eHiubbyVepI
@TheUnderscore_ 10 months ago
Hey there, just wondering, is it possible to create sounds and speech within the same prompt (e.g. laughter, sighing, etc.)? I've tried different options in the Text-To-Audio demo on Hugging Face, but it just seems to read the text literally.
@maxcurrent485 10 months ago
Try using Bark and adding [laughing] to the prompt.
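For anyone who wants to try the Bark suggestion from a script, here is a minimal sketch. It assumes the suno-ai `bark` package and `scipy` are installed; `build_prompt` and `synthesize` are hypothetical helper names made up for this example. Bark's README lists cues such as [laughter], [laughs] and [sighs], so the sketch uses [laughs]; whether the exact tag [laughing] is recognized is not something confirmed here.

```python
# Sketch only: assumes the suno-ai `bark` package and scipy are installed.
# build_prompt and synthesize are illustrative helpers, not part of Bark.


def build_prompt(text: str, cue: str = "laughs") -> str:
    """Append a bracketed non-speech cue such as [laughs] or [sighs] to the text."""
    return f"{text} [{cue}]"


def synthesize(prompt: str, out_path: str = "bark_out.wav") -> None:
    """Generate speech for the prompt with Bark and save it as a WAV file."""
    # Imports live inside the function so build_prompt works without Bark installed.
    from bark import SAMPLE_RATE, generate_audio, preload_models
    from scipy.io.wavfile import write as write_wav

    preload_models()                      # downloads model weights on first run
    audio_array = generate_audio(prompt)  # NumPy array of audio samples
    write_wav(out_path, SAMPLE_RATE, audio_array)


prompt = build_prompt("Well, that did not go as planned.", "laughs")
print(prompt)
```

Calling `synthesize(prompt)` afterwards would write the audio to `bark_out.wav`. It is left out of the snippet because the first run downloads several GB of model weights.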
@intheworldofai 10 months ago
Hey man! Yeah, you can do that within the same prompt. Maybe try using another Space to generate the sounds.
@intheworldofai 10 months ago
OpenChat UPDATE: Best Opensource 7B Model EVER! Better Than ChatGPT & Mistral!: kzbin.info/www/bejne/jJ3VaIuwa8eBjbM
@LucidFirAI 8 months ago
'sh' is not recognized as an internal or external command, operable program or batch file.
@miaohf 8 months ago
If you use a Jupyter notebook in Colab, you should add "!" before your command. Example: ! ls -ltr
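Outside a notebook, the "'sh' is not recognized" error usually means there is no POSIX shell on the PATH, which is the default on plain Windows. A minimal sketch for checking this from Python before running a shell script is below. `find_shell` and `run_script` are hypothetical helper names, and the script path in the usage note is a placeholder, not a file taken from the video.

```python
import shutil
import subprocess
from typing import Optional


def find_shell() -> Optional[str]:
    """Return the path to `sh` or `bash` if one is on PATH, otherwise None."""
    return shutil.which("sh") or shutil.which("bash")


def run_script(script_path: str) -> int:
    """Run a shell script with the detected shell and return its exit code."""
    shell = find_shell()
    if shell is None:
        # On Windows this usually means installing WSL or Git Bash first.
        raise RuntimeError("No sh/bash found on PATH; cannot run shell scripts.")
    return subprocess.run([shell, script_path], check=False).returncode


shell = find_shell()
print(shell or "no sh/bash on PATH")
```

If a shell is found, `run_script("some_script.sh")` would run the script and return its exit code.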
@VaIhalIa 10 months ago
00:01 Amphion is a free AI voice tool for generating audio, music, and speech.
02:25 Amphion is a versatile AI toolkit for creating audio, music, and speech.
04:11 Amphion offers various vocoders and evaluation metrics for top-notch audio signals.
05:56 Amphion can generate visualizations with audio, leading in this new capability.
07:36 Clone the repository and create a Python environment.
09:19 Installation and usage of Amphion for text-to-speech.
10:58 Amphion and Tortoise are compared for text-to-speech capabilities.
12:51 Amphion TTS is in development and improving.
@gaweyn 10 months ago
11:16 "Consecrated braid"? Why would they put up that broken example as a sample?
@kiyonmcdowell5603 9 months ago
What's the difference between large language models and text-to-speech?
@randallrulo2109 8 months ago
Large language models (LLMs) are files containing numeric weights learned from massive datasets, which let natural language processing (NLP) systems interpret text and generate coherent responses to it.
@randallrulo2109 8 months ago
Whereas TTS is a technology that lets a computer synthesize speech from a text input.
@OptaIgin 5 months ago
Does anyone know how I can change Ubuntu's default whisper voice? It only has Default; I want something like Zira and Mike from Windows. lol
@komakaze1 10 months ago
I'm looking for good TTS inference I can run on a CPU or an older AMD GPU, preferably with a huge library of community-trained voices I could download and try out. I just heard about Pheme today (I'm not sure if it's more than a white paper yet). I've heard Tortoise is good but slow. I'm not sure if that's still true, as there seem to be ways to make it faster. SVC2 is more for voice changing; I don't think it can do TTS. I've heard Coqui is quite good. Amphion sounds interesting, as it can generate sounds as well as TTS.
@the42nd 10 months ago
Great article. What is the best voice clone AI tool? I was using Descript, but it had this strange digital garble in it that made it kind of useless.
@X2ytCrystal 5 months ago
Descript was great, but voice selection is really poor. Of all the "good" TTS apps, they probably have the least voices, and out of those, only like 2 are good.
@intelligenceservices 2 months ago
Is this another content creation video about yet another exciting bleeding-edge AI tool that almost nobody will be successful at installing and using, and that will become abandonware a year ago? I'm at my wits' end trying to get Coqui, XTTS, or anything to work (for TTS purposes), and it's all just one Groundhog Day of nothing working. My typical day involves adding 10 more GitHub tabs to my browser, git cloning, installing PyTorch for the 50th time, and reading new and exciting errors. Update: Amphion won't even finish installing without going into an install/uninstall loop and finally exiting with errors, which is unsurprising.
@MarcusNeufeldt 10 months ago
🎯 Key Takeaways for quick navigation:
00:00 🌐 Amphion is an open-source text-to-speech model that can generate audio, music, and speech.
01:02 📚 Aimed at supporting reproducible research and helping junior researchers and engineers in audio, music, and speech generation.
01:30 🆓 Amphion is a free, open-source alternative to other text-to-speech models like Bark, with various audio generation capabilities.
03:26 🧠 Amphion's platform allows for studying the conversion of different inputs into audio, not just generating audio but also understanding the process.
05:03 🔍 Unique feature: Amphion offers visualization in audio generation, a feature not commonly found in similar toolkits.
Made with HARPA AI