How Good is Free AI Voice Cloning Software?

  2,948 views

Doctor Volt


Comments: 12
@treideme 10 months ago
With the other retro videos, I'd expected you to go all the way back to computational linguistics like diphone synthesis and decomposition. Like how they came up with the MBROLA voices ... how will you run your voice clone on your next Z80 project ;)
@BurkenProductions 10 months ago
Funny that you switched to an AI voice mid-video, hoping we wouldn't notice
@BurkenProductions 10 months ago
Damn, no SponsorBlock. Come on guys, the vid is a day old, it should be there by now
@nicksyo 10 months ago
do it yourself dawg
@German_byte 5 months ago
It sounded kind of English but close listening reveals subtle German flavours to intonation.
@CJWarlock 2 months ago
I've suspected that you've used AI generated voice in several parts of the video. Or that your voice tends to get unattractively monotonous (boring) on occasions. ;) EDIT: A typo corrected.
@DoctorVolt 2 months ago
It's both AI voice and natural voice.
@CJWarlock 2 months ago
@@DoctorVolt I suspected as much. :) Of course it's just a neutral observation, not meant to be unpleasant for you. :) I've worked with vocals (edited others' recordings) and trained my own for better clarity and diction, hence I can say a thing or two. :) There are tons of tutorials on the topic on YT, however the most important tip is: stand in front of the mirror, observe how your mouth, lips and cheeks move when you speak fast like you normally do, then modify their muscle work until you get the desired result and remember that observation (the visual aid helps to build the muscle memory faster). Basically, the attractiveness of the vocal lies mainly in the effort of the cheek muscles. You're welcome. :) PS. I doubt AI is able to recreate the many nuances of that facial muscle work. :) Cheers. :)
@DoctorVolt 2 months ago
@@CJWarlock I found that AI nowadays has some basic understanding of what the text is about. So it also adapts its prosody according to the meaning. As a test, I did my last video entirely with an AI voice. The nuances still don't fit at 100%, but maybe still better than if I read my script. But for some reason, humans are still able to tell an AI voice from a natural voice. So many viewers blamed me for using AI.😪
@CJWarlock 2 months ago
@@DoctorVolt Believe it or not, the non-verbal component, be it a mental component or just simply one's passion about the topic, is written aside our 3D space. AI cannot generate such a thing, it can only wrap together some poor scraps that go together with other people's voice samples. Thoughts and emotions are essentially information too, and any information stays written somewhere. In a multi-layered reality, there are separate layers for such things. This topic exceeds the regular, limited understanding of human speech as purely mechanical. Think of spoken words as only anchors for mental and emotional concepts that anchor the wider constructs in the 3D material layer of our reality. Then, think of human speech as a conscious act of free will to bring a certain composition of such concepts into this world. It's like creating a series of specifically tuned "magnets" that pull concepts into our reality. Words are just tools. It's these concepts (non-verbal information) that humans perceive on a subconscious level. Some people are aware of such deeper "mechanics", some only feel it intuitively, others are as yet unaware of such things. For me this is an everyday topic in which I have 10k hours of experience, thus I can easily explain it in layman's terms and with clear metaphors. AI obviously doesn't have such a component as free will (it's a spiritual and mental building block of a human being), as it's only a sophisticated computer program, a tool. A tool can only take what others have generated and make a random patchwork of it. If you had a device similar in principle to those famous glasses in the SF movie "They Live" that would allow you to see such different layers of our reality with your own eyes, you could see that the AI model takes tiny bits of that non-verbal information associated with vocal samples and wraps a random patchwork of such bits. Bits, not whole concepts.
AI-generated speech thus looks on that layer like a totally random mess of tiny scraps of the mental and emotional component associated with the original spoken words. Hence it's not well perceived. Good actors have such skill that they strongly project (emit) mental pictures and feelings of what they say. That's why people in the audience of a theater can feel the emotions and know immediately what the actor had in mind (if they are sensitive enough, and most people who go to dramatic plays are). AI, for obvious reasons (i.e. lack of a biological brain and brain-heart duo), can't transmit such non-verbal information. It's an inanimate tool. Any such information comes from the original voice samples and their respective authors. :) Hope this helps. Cheers. :) PS. I'm fluent in both technical and deeper interesting stuff, thus it's easy for me to grasp such topics and how they intertwine. I hope you can follow my train of thought. In any case, just ask about what I might not have described clearly enough. :)
@BurkenProductions 10 months ago
Not similar at all, the accent was American, not Swiss
@dafoex 10 months ago
Someone else I follow has experimented with AI voice cloning. It's good, but the Americanisms really kill the illusion. He'll be talking in a British accent and suddenly pronounce "Hyundai" as "Hun-day" and not "High-un-dye" as most Brits do.