NVIDIA’s New AI: Stunning Voice Generator!

143,836 views

Two Minute Papers

A day ago

Comments

@noisyninja_za6182 a month ago

What a time to be alive!

@1BlueFunk a month ago

Truly one of the most amazing times in history.

@joeeyaura a month ago

someone has to post this comment every video

@mariolopez-oi2td a month ago

My knuckles are white from holding onto my papers

@DoubleRainbowXT a month ago

I'M SO SICK OF THESE NPC COMMENTS. UNDER EVERY VIDEO I SEE MENTIONING AI THERE'S ALWAYS “WHaT A tImE To bE AlIvE”. IT'S SO STUPID AND MEANS NOTHING.

@mariolopez-oi2td a month ago

@DoubleRainbowXT Shitposting is not a crime

@Loctorak a month ago

Create a sound of 100s of fellow scholars holding on to their papers.

@Yaan_Robotique a month ago

I can't even begin to fathom how beautiful such a sound would be.

@Redlabel0 a month ago

the sound of a stack of papers, followed by 100 scholars gasping, followed by deafening silence. LongHall reverb

@ShivaInu42 a month ago

Hahaha awesome

@wwbim a month ago

Followed by hundreds more fellow scholars squishing their papers

@engineeredtruths8935 a month ago

can someone explain this reference because i do not understand it and it's annoying me haha

@HansMilling a month ago

The true Turing test: when your videos are 100% AI generated and your audience doesn't notice.

@JackC-d9x a month ago

Most people don't notice. They're usually very distracted by other things

@Chocolatecakeicecream a month ago

who says they already aren't 👀

@AdvantestInc a month ago

The demonstration of emotional nuance in synthesized speech is a game changer. Imagine the potential for storytelling and immersive experiences; AI is truly blurring boundaries here.

@thomaskrogh1244 a month ago

Or thousands of incels and gooners create an artificial girlfriend/lover and go deeper into a delusional mindscape.

@hagis23 a month ago

I think humans are better storytellers. The real game changer would be when AI cleans your house and cooks for you while you do all the fun stuff, like creating stories

@OnigoroshiZero a month ago

@hagis23 nah, I've been using small models like Mistral to have them run text-based adventures. Even with all their current limitations, they are at the same level as, if not better than, most experienced humans at storytelling.

@bluerangergr7466 a month ago

@OnigoroshiZero most people want to hear real people telling stories, not robots; it's uncanny

@ringehdingehdurgen a month ago

@OnigoroshiZero honestly I don't trust your ability to determine if that's the case - most people don't understand what makes a great story vs what makes a passable one

@HDL_CinC_Dragon a month ago

1:35 This is the most impressive part to me. An angry voice making a declarative statement like that with those emphasis pauses is super realistic and I'm impressed the AI replicated not only the sound, but the pacing as well. What a time to be alive! 3:54 I think the correct approach here is for Dr. Zsolnai-Fehér to release a hit single song of his own so he can use his song whenever he wants :D

@AGIzero00 a month ago

Let's just hope that Nvidia has the balls to actually release this

@jpgallegoar a month ago

open source it*/ release the weights

@Matthew_Fog a month ago

@jpgallegoar exactly this, open source alllllll the way

@Pygon2 a month ago

I'd guess that both Nvidia and Meta are going to sit on these things until they see how the copyright lawsuits against Suno, etc. pan out. Patents typically only stifle innovation for the benefit of the few "inventors" (who rarely make significant novel changes that would not have eventually been devised by others). In fact, "first to file" is proof that this is the case, as a number of inventors may be working on the same invention with the same ideas, but only the first to file the patent is credited and "owns" the idea despite the efforts of others who may have been days away from, or had already arrived at, the same solution. As much as I understand protecting artists somewhat, copyright isn't significantly better, especially with how much their artistry tends to be appropriated as industry profits by corporations.

@eugeneputin1858 a month ago

This isn't up to Nvidia at all

@jpgallegoar a month ago

@eugeneputin1858 how so? it's their model

@Asterrayx a month ago

I can't wait for 1,000's of AI-Generated Song Slop in my feed!!!

@smugsenko a month ago

It has already begun. (I've seen at least one YT channel posting AI music without saying it is, and somehow no one notices.) I do like that this system has the ability to add to and edit already created music though. I don't like the fact that people enjoy music made from people just telling AI to do everything; I enjoy the human aspect to music and art. Without that I feel like it's pointless in a way. Using AI as a tool to help develop ideas is good but I don't think it should be the idea generator in creative fields. Set AI to solve problems. Art isn't a problem to be solved.

@Asterrayx a month ago

@smugsenko Oh yeah no doubt. It began a while ago actually; it's just that with the better tech it might get a bit harder to notice if it's legit, since some people will genuinely like it and it will get pushed into recommended feeds

@bobbyboyderecords a month ago

You know this video is AI right?

@Mr_Sorus a month ago

Does any of this ever get released?

@HectorCenteno a month ago

The next needed step is for audio AI systems to be able to handle spatial localization of sound. To be able to generate stereo or multichannel audio that is spatially coherent. So far all of them generate only mono audio.

@kirangouds a month ago

Basically beating the Adobe audio to audio paper

@DeltaNovum a month ago

I've tried and paid for Adobe AI audio products. Not only are they far worse than free alternatives, Adobe's upload and processing speeds are atrocious. It ain't cheap either.

@Eysc a month ago

01:06 it's the Tenet soundtrack from the opera scene

@Heroofthesand a month ago

😮😮😮

@imam5623 24 days ago

Really, that sounds so beautiful; the transition between the sounds is really amazing

@Shimulahmed100 a month ago

As a video editor and sound designer.. What a time to be alive

@BobbyMasteria a month ago

0:23 I don't know any of those models, which one corresponds to Udio or Suno?

@TeddyLeppard a month ago

Many years ago I recall seeing a video taken at Skywalker Sound (or ILM) and someone there was demonstrating the sound of a flute mixed with a voice. It was magical. And this was easily more than 20 years ago.

@RavioliFr a month ago

Yeah and now you can do 100 different versions of this in seconds, and if you change your mind, do it with 100 different instruments in the same amount of time

@claudiusbuser a month ago

Did I miss the link in the video description he is talking about at the 4-minute mark? The one about the voice isolation...

@TwoMinutePapers a month ago

Yes, you are indeed right! Thank you and apologies - fixed it in the description. Posting the link here too: kzbin.info/www/bejne/p5uUhKNufcppm5Ysi=ZtSesU1e7jeoN55U&t=63

@phizc a month ago

The voice isolation was very underwhelming. The voice sounds OK, but it missed the lyrics.

@SnoopyDogg101 a month ago

That train one was absolutely beautiful omg

@Perqd a month ago

to be honest it also starts to get scary when it comes to jobs, dating, etc.

@CrispyMuffin2 a month ago

I hate that AI research is getting wasted on replacing jobs that never needed replacing in the first place. So much for that future where mundane work is automated and people can focus on art and other things. Guess we're getting the complete opposite reality, just fuckin depressing

@youarethecssformyhtml a month ago

@CrispyMuffin2 lmao exactly

@crazyfrogextended a month ago

nothing scary about it. I can't wait for an AI waifu that isn't an annoying as f narcissist that wastes money and has tantrums constantly

@danielguzman9847 a month ago

@CrispyMuffin2 it’s simply a matter of complexity unfortunately. The most readily consumable information here is speech, text, art, and the like. Information that doesn’t require rigorous logic or complex relationships, nor a connection to the physical world like robotics, autonomous vehicles, etc. If somehow it were easier to develop machine learning models for complex scientific research than art, speech, and text, it would’ve happened that way

@joa1401 a month ago

@crazyfrogextended So instead of finding an actual thinking, feeling human being who can be an equal partner to you, someone you admire and share interests with, you’d rather download a computer program and instruct it to pretend to love you? You want a make believe girlfriend who looks and talks like a doe eyed anime character and whose entire existence revolves exclusively around pleasing you? Instead of dating an independent person with her own life, goals and inner world, you want software that superficially simulates the facade of your dream girl? a pet wife who wears what you tell her to wear, likes what you tell her to like, thinks what you tell her to think and never speaks out of turn? who has no material needs, no other friends or family, no dreams you can’t delete if you feel they compete with her devotion to you. Rather than your partner making a choice to date you over everyone else because she values you as an individual, you want the shallow, on-demand validation of a glorified tamagotchi whose existence you can entirely remake or erase the moment you become bored with her? How utterly romantic! It’s just like the classic love story ‘I Have No Mouth and I Must Scream’.

@Slayerthegreat010 a month ago

I am looking forward to real time voice generation for virtual characters. When those characters are given general understanding and comprehension tools like in modern LLMs, people will be able to talk naturally to any character and ask them questions, and voice generation will be key to enabling the characters to respond.

@somenygaard a month ago

My favorite day dream about AI involves the future of MMORPGs. We are not too far away from a very small group of people being able to create an incredible high quality game in a very short period of time. The game could be modified by customers changing the genre from one style to another fairly easily with the same tools.

@empatikokumalar8202 a month ago

I would love to use it right away. It would be even better if it was free to try.

@galefraney a month ago

Your videos always make me smile ... so entertaining!!

@davescott7680 a month ago

The funny thing is, you definitely could replace yourself with audio generation (if you're not already), and even if you got found out because it generates something funky, you can be like "woah, you found it out! Good work, I got away with skiving off on a beach for 6 months before you guys got suspicious! Isn't that incredible!?". No backlash from AI usage.

@bokc_nonpopularsalt1011 a month ago

I think people are being short-sighted about it. You can more than likely get your family generational wealth via art, voice, and likeness even tho you have been dead for years.

@xviii5780 a month ago

What is this psychopathic mindset lmao

@RavioliFr a month ago

@xviii5780 So true

@slyseal2091 25 days ago

@bokc_nonpopularsalt1011 or your potential customers pay a passable impersonator of your dead-person-of-choice and do it for a fraction of the price, with all the plausible deniability that is necessary for the system to not collapse under the fact that there's only like 20 actual voice archetypes.

@JerkerMontelius a month ago

Is there a link to a site where we can test Fugatto?

@John-5737 25 days ago

Idk why but this reminds me of parrots trying to imitate sounds, but in a far more enhanced way, using better neural networks than the ones that parrots are using

@tranceemerson8325 a month ago

imagine watching a movie, but each time, it is a little bit different, with how the AI picks the camera angles, and how the actors contribute to the scene, and the specific performance of stunts being different.. but the AI follows a script it is given.

@mshonle a month ago

It shouldn’t be too surprising that it might outperform specialist models on their specialty. In general I think the “scale is all you need” crowd has some blind spots, but I’d agree with them here. When it comes to instrumentation, music, isolation and denoising there are many ways to generate annotated synthetic audio data. I’m sure it cost an enormous amount to train this, but in terms of having a model grounded in music theory we’ve only just begun to scratch the surface.

@kaspernordlund6828 a month ago

Generalist being specialist! What a time to be A I

@makeitraindom1634 a month ago

I don't know what you guys are excited about, I'm terrified!

@punkskaska a month ago

Yeah, I'm surprised how many people are excited about AI today without thinking about the consequences, like the number of people who will lose their jobs. Also how much it will destroy human creativity... Why be talented and learn anything when some guy who doesn't know a thing about music can generate it with AI

@RavioliFr a month ago

@punkskaska Yeah, music means nothing today, from the moment we "crafted" music instruments. Creativity was really better when we just had sticks and stones! Technology kills art, bla bla bla

@punkskaska a month ago

@RavioliFr I think you're looking at it too shallowly. Learning instruments and playing music is a much more important process than you think; it affects neuroplasticity. You have generations of human brains evolving with music, just so someone can give it to AI, not even knowing how it works, to make music. We should support this process and not kill it with AI. Quote: "Playing a musical instrument is the brain equivalent of a full-body workout. Unlike other brain-training activities like chess and sudoku, playing an instrument recruits almost every part of the brain, including regions that process vision, sound, movement, and memory" You can read more online; there are many more benefits to music. I think the human race would benefit more from it than from some computer chip. Also, don't ignore the fact that a lot of musicians already struggle with money. The last thing I would want is for some lazy guy who never invested time to learn anything to benefit more than actual musicians.

@SyoDraws a month ago

That's fair. I'm personally intrigued by the technology, and a bit impressed, but also wary of it

@makeitraindom1634 a month ago

Just to be clear, I'm not scared that the AI will gain sentience; I know it's not capable of that right now. What I am fearful for is the way that we communicate with each other, and that the way we know what's right and what's wrong, what's true and what's false, could be altered.

@SaigeSauce a month ago

I love thinking of the fun uses for this.... but that gets dashed thinking of how much more AI slop is going to be taking over YouTube and other platforms 😭

@clerothsun3933 a month ago

Y'all realise AIs that do better music than this have been around for over a year

@SaigeSauce a month ago

@clerothsun3933 "Better" is very subjective. I'd love to hear some of the songs that are your favorite, because sadly a lot of the AI ones feel really repetitive and then either super simple or way too complicated. Definitely willing to learn more! I also like the backstory to an artist's music journey, the inspiration behind the piece, and the passion that went into making it! AI has none of that (and never can) and will be taking so many opportunities away from budding artists that already have a hard time growing.

@bokc_nonpopularsalt1011 a month ago

@SaigeSauce Sure. Being a fan of the hip hop genre, a lot of AI has much better music and has even revived some genres for me, like post-hardcore.

@cocojinx9193 a month ago

@clerothsun3933 ai will never be better than Death Grips

@Eldorado1239 29 days ago

@SaigeSauce I'm pretty sure AI music won't be a thing for quite some time yet, except for things where the music doesn't really matter. Which might be exactly what you mean by "other platforms" though, if it's something like TikTok or Twitter, "musical memes" are bound to become a thing there. But I don't see AI music infecting quality content on YouTube... the lack of creative control, noise and distortions, and mostly unpleasant results should turn anyone using it into a pariah. Although, it will be a great brainstorming tool. Instead of randomly smashing the piano until you get a catchy tune, you could randomly prompt the AI until you get some ideas and expand upon them on your own in a DAW.

@hippopotamus86 a month ago

2:10 Was that an error?

@Y1001 a month ago

he really tries to make us believe his voice isn't AI lmao

@HDL_CinC_Dragon a month ago

He's been putting the voice intro near the middle of his videos for a little while now. I think he's experimenting with trying to split the video into discrete parts like an attention getter, then intro, then details. The new voice intro location has been jarring and confusing in most of his videos since then, especially when he's talking about a voice synth tool. I always expect him to say "And that's how good this tool is!" immediately after lol. IMO, it would make a little more sense if the attention getter portion of the video was way way shorter and contained even fewer details. As it is right now, it's pretty jarring.

@dzxtricks a month ago

More like an experiment with AI generated voice 😅? It makes me assume this was just simply dragging an audio clip to the wrong place, but it has happened consistently

@hippopotamus86 a month ago

@dzxtricks That's what I thought, clicked and dragged the audio on the timeline by mistake. It doesn't fit in at all there.

@ණChỉYêuMìnhEm a month ago

When this is released to the public, I am going to try... speak in Dr. Károly Zsolnai-Fehér's style "What a time to be alive"

@ordinator. a month ago

Christopher Walken: "Kids are. Talking. By the door!"

@cybergigafactory a month ago

Can we use it already for longer form speech and music? And how expensive is it if it is such a small model?

@novantha1 a month ago

Based on analogous papers and projects: “Were RNNs all we needed?”, “Training Diffusion Models on a Micro Budget”, “Cramming: Training LLMs in a single day” (might have gotten the title a bit wrong), and llm.c. These would all suggest that the model described here could probably be trained on a roughly $1,000-$10,000 budget, given sufficiently optimized training loops. A bit of a problem is that all of the ones I described above had the benefit of existing implementations to work off of, but a lot of the lessons do carry over. A naive implementation (such as was probably done by the researchers) was probably quite a bit more expensive, and could have been anywhere from $15,000 - $100,000 if I had to guess. I also think a “free” training might be possible with recent advances in distributed training, if it were to be done as a volunteer effort (e.g., DiLoCo, DisTrO, etc.). I guess with around 100-200 people willing to chip in compute it could be done in a reasonable time frame.

@SK-gc7xv a month ago

Hello Bot, give me the sound of a joyous nerd like me, who knows more than I do about things, is amazing at teaching, has a deep accent who is super excited about the time to be alive!

@ananthakrishnank3208 8 days ago

Rafael Valle is an incredible personality in the audio AI domain.

@calicops951 a month ago

is there a way to be able to play with Fugatto?

@ThePowerLover a month ago

Not yet.

@kipchickensout a month ago

It was weird hearing a honking train pass by without the Doppler effect

@HiImKyle a month ago

What does the AI Enhanced bit in the top left mean..?

@EarthMightiest a month ago

my input: scream geometrically

@py_ifmain a month ago

Hi, sorry but help me find the weights of this model if there are any. Do you have a link?

@Christiansmoviesnow a month ago

When will this be released?

@arc7498 a month ago

How and when can I use this?

@Ironbrigade a month ago

Making sound effects for games would be amazing

@attlue a month ago

How does the A.I. generating music, pictures, videos, etc make anyone an artist?

@Outmind01 a month ago

It doesn't. Just because someone gave the AI a quirky commission doesn't mean they're the one who created the output "art". Prompt "artists" tweak & twiddle their digital knobs until something comes out they agree with. That said, the use of AI in audio can really be a game-changer. Imagine being a blind person and having your surroundings narrated to you in real time, or having it read text chats or in-game written dialogue with "emotion."

@IronKnee963 a month ago

It doesn't but most people today are lazy lowlifes who love to pat themselves on the back. Besides, it's not like AI started it. Having ghost writers, auto tune etc is all cheating and has been the standard for a while now.

@AustrianEconomist a month ago

@Outmind01 Lmao. You're no different than the people who first said digital music production wasn't real art because "Digital artists tweak & twiddle their digital knobs until something comes out they agree with." Old man screams at clouds energy. Adapt or perish.

@Outmind01 a month ago

@AustrianEconomist Digital music production requires that the person doing it understands something, at least intuitively, about music theory. It requires them to search out and even create samples that fit well together. They have to understand how to use their DAWs and can make and save specific changes. By default, an AI prompt writer can't make their vision come to life. They ask the service to create something, which it does by regurgitating (sometimes barely altered) stuff real people put god knows how much effort into acquiring the skills to make and executing. And it can't even generate the same image twice lol. In this instance, I'd rather give off "old man yells at cloud" energy than "I don't care about anything other than how this latest hyped up tech is going to make me money" energy.

@yesyes-om1po 7 days ago

what's crazy is this is only 2.5 billion parameters, this could fit on almost anyone's PC. I can definitely see AI coprocessors that exist solely to run inference with stuff like this for realtime gameplay or something
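
As a rough check on the "fits on almost anyone's PC" point, here is a minimal sketch of the arithmetic. It assumes only the 2.5 billion parameter figure quoted in the comment above and the standard byte widths of common numeric formats; it counts weights only, so activations and framework overhead would come on top. The function name `weight_memory_gb` is made up for illustration.

```python
# Rough weight-memory estimate for a model with a given parameter count.
# Covers the weights only; activations and runtime overhead are not included.

# Standard storage size, in bytes, of one parameter in each numeric format.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def weight_memory_gb(num_params: int, dtype: str) -> float:
    """Return the size of the weights in gigabytes (1 GB = 10**9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    params = 2_500_000_000  # the 2.5 billion figure quoted in the comment
    for dtype in BYTES_PER_PARAM:
        print(f"{dtype}: {weight_memory_gb(params, dtype):.1f} GB")
```

Under those assumptions the weights alone come to about 10 GB at 32-bit precision, 5 GB at 16-bit, and 2.5 GB at 8-bit, which is within reach of many consumer GPUs at the lower precisions.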

@am_ma a month ago

Could it isolate music from voice in the case of a mono-track song? Meaning, isolate the singer's voice from a 1930s mono record.

@gulllars4620 a month ago

An Olympic swimmer winning in wrestling may be less unlikely than you think if they've had wrestling as a hobby. Swimmers are insanely strong. But to riff on that, there have been TV entertainment programs that pit athletes of different disciplines against each other in a series of challenges, and occasionally their sport comes up, and everyone would expect them to win, but fun stuff can ensue like a top cross country skier beating a top cyclist at bicycling. Both are endurance sports, and skiers may bicycle for cardio during the summer.

@richcolour a month ago

4:17 AI still doesn't know how cats wear headphones

@bakkomm a month ago

How about an English voice synthesizer with emotion, intonation, and pitch customization similar to VOICEPEAK? I really need that kind of tool so bad

@Yourname942 a month ago

Where/when can we use Fugatto?

@pulkitkohli1993 a month ago

How can we use Fugatto?

@coolrazor6835 a month ago

Where is that "AI Enhance" Gradio code? I want it

@CodeWithClark a month ago

2:37 Musicians play instruments in a musical context. You all can become producers or songwriters, but until the thinker can execute what the AI generated, they aren’t musicians. Music producers, or songwriters; sure. I accept that.

@hiddendrifts a month ago

at this rate we're gonna get an open source version of 4o's advanced voice mode within 12 months

@HCG a month ago

Showing an example being played and then not letting us hear it and instead having you just continue talking is a massive annoyance. Either let us hear it or don’t show the footage at all.

@RavioliFr a month ago

Archetype of negativity, aha

@willfrank961 a month ago

He mentions in the video that he couldn't use much of the audio because it would be flagged by YouTube.

@armanrozika a month ago

@willfrank961 why? isn't it AI generated? if YouTube can detect where the audio is coming from/generated from, then it's a bad AI model

@BeverageOfSorts a month ago

@armanrozika specifically the clip he doesn't show is showcasing the extraction of lyrics from a song. We can intuitively guess that the song utilised in the demo is copyrighted on YouTube

@metatron3942 a month ago

Google may have to release a 2-minute paper AI podcast generator

@aggressiveaegyo7679 a month ago

I'm still waiting for a proper audio translator. It looks promising if it would be possible to remove the original voices without drowning out other sounds. And overlay the voiced text from Whisper.

@yusuf_seddighi a month ago

Where can I get it?

@lancemarchetti8673 a month ago

Looks like we're headed for a paperless society pretty quickly.

@csiguszfoxoup a month ago

I don't know if I am more scared or more amazed

@EverydayOkay a month ago

how to get this up and running?

@LobsterMack a month ago

Just waiting for the time when I can name a character in a game and have all the people in that world actually say the name.

@kinex_studio3844 a month ago

And the characters can actually reason for themselves and talk to you as themselves

@RedRisotto a month ago

I wish I was younger to see another 25-30 years of development and progress... However, it's still a good time to be alive. 😉 In the next couple of years, I'm hoping for AI vector files that use AI for optimal node placement and node reduction, proper curve smoothing and control, accurate angles, and zero (stray node) artifacts. It should be doable right now, no!?

@bokc_nonpopularsalt1011 a month ago

If the calculations are correct you'll see 75 years of advancement in a few years.

@assarlannerborn9342 a month ago

depending on your age you might just be able to escape to heaven before the singularity hits...

@vitaliyklitchko6664 a month ago

Okay, how to run it locally?

@fallofseasons a month ago

Is there a way to get early access?

@SatongiFilms a month ago

can it remove "audio jungle" from an audio file? asking for a friend.

@oblikua a month ago

Lol. 😂

@KevinLarsson42 a month ago

Just writing down my thoughts: In the future, people will mostly listen to AI generated music, there will be all-in-one music apps that function similar to YT music, but it generates the songs instead. The apps would have a pause/play, skip, upvote and downvote, and save and randomize, plus a text field. The app will play songs with randomized prompts, people will constantly upvote and downvote or save songs, and this is info which can be used to narrow down the user's preferences for future song generation. The app will create a profile of your tastes, and eventually it can be trained to know which songs you like to automatically save them to a playlist. So in the future, the listeners are the composers and it's their tastes which determine the composition of new songs. I reckon this might lead to a new music revolution.

@kingtasaz a month ago

the issue with this is that none of the music will be very good. Sure it might sound nice to listen to but all ai generated content is incredibly soulless and uninspired.

@BlackoutGootraxian a month ago

@kingtasaz Right now, yes. You talk as if this technology is stagnating. I am 90% sure in 10 years, AI music will be just as good if not better than human music.

@tycho25 a month ago

I'm terrified that art is being taken out of the hands of the creators and being replaced with expensive algorithms owned by big companies.

@theonewhoslost a month ago

@BlackoutGootraxian AI simply cannot replace game music and that's that. What is the point of spending time on something no one made

@BlackoutGootraxian a month ago

@theonewhoslost What a nothingburger response. It can and will get better soon, if you follow AI you know how quickly it's still advancing, even in the music department. Saying "nuh uh" in response just makes you look dumb.

@OpreanMircea a month ago

what would I use this for? uhm.... adding sound to the AI generated porn of course, what else?

@Adrian-ep4qm a month ago

"A person who thinks all the time has nothing to think about except thoughts"

@anandchoure1343 a month ago

You are too stupid to think about the potential of it

@MarioCalzadaMusic a month ago

That will be integrated, imagine infinite Doom but with porn; Just 3D glasses and starvation

@youarethecssformyhtml a month ago

@Adrian-ep4qm cringe

@metroidandroid a month ago

Replacing actors, influencing politics, memes and so on

@An_Arbitrary_Miscellany a month ago

1:32 - cool that they're getting Ashton Kutcher to do their voicing.

@n9ne a month ago

best implementation of voice AI is still that one classic World of Warcraft addon. VoiceOver is the name i think.

@Dafastso a month ago

[SUPERINTENDENT CHALMERS] GOOD LORD! What is happening in there? [PRINCIPAL SKINNER] Aurora Borealis. [SUPERINTENDENT CHALMERS] A---Aurora Borealis? At this time of year! At this time of day! In this part of the country! Localized entirely within your kitchen?!? [PRINCIPAL SKINNER] Yes. [SUPERINTENDENT CHALMERS] May I see it? [PRINCIPAL SKINNER] No.

@patrickzupanc1795 a month ago

Great video, thank you!

@XashA12Musk a month ago

1:29 i used to think your name was Caro Jhon Aaifa as in "Dr. Caro jhon aaifa here"

@solomslls a month ago

can it clone voices?

@DoctorMandible a month ago

How to try?

@xandragonist a month ago

we making brainrot music with this one 🗣🗣🗣🗣🗣🗣 🔥🔥🔥🔥🔥🔥🔥🔥

@JustFor-dq5wc a month ago

It's great, but ... where is it?

@Gcrowan a month ago

The introduction in the middle of the video always feels so out of place, I keep thinking the video reset or auto played the next one by mistake.

@OpreanMircea a month ago

@Gcrowan I think he does this so it's harder for other people to re-upload his content

@thesimplicitylifestyle a month ago

This is absolutely incredible! 😎🤖

@Ikbeneengeit a month ago

The Bitter Lesson is that the generalist usually wins in AI.

@athuldev3401 a month ago

Waiting for the day when Felecia reveals that it's an AI

@sapphyrus a month ago

I truly am glad to live through this era when a lot of sci-fi came out to be true even in our daily, ordinary lives! Or I could say.. WHAT A TIME TO BE ALIVE!

@CrowleyBlack2 a month ago

new meme. what a time to be alive!

@mattshu a month ago

That name reminds me too much of The Sopranos though 😂 “FUGAZI”

@kipchickensout a month ago

Adlibs and vocals without needing to do them?? nice

@ahmetselcuk1400 a month ago

I hope everyone gets a taste of NVIDIA's excellence

@LegitJDG534 Ай бұрын

Can this model do impressions of known people? E.g. wildlife documentary presenters...

@alexdavies7112 Ай бұрын

This seems cool but without it being publicly accessible, I don't really see the point.

@LJay205 Ай бұрын

Does it have to have a 'point'? Research can just be done for the sake of research. This doesn't make sense to me: usually people complain about how everything has to be monetized, but when it isn't, people complain too?

@jduk1818 Ай бұрын

@@LJay205 Yes but it’s not about the research having a point, it’s about the public release of that research being pointless if nothing else becomes public after it. Monetisation has nothing to do with their comment. All this does is say “Here’s a look at the shiny new toys you’ll never get to play with.”

@LJay205 Ай бұрын

@@jduk1818 So would you rather have them keep their research a secret? Of course I'd love to play with this too. But to me it seems silly to declare their work entirely pointless simply because it isn't publicly accessible.

@jduk1818 Ай бұрын

@@LJay205 Sigh. I didn't say their work was entirely pointless, I said releasing a paper on it is pointless unless the actual technology is going to be public at some point. That could be for companies to use or anyone. Until then anything they say is pointless. It's grandstanding, bragging and it's just claims on paper with no way of testing the validity of those claims. I could write a paper about having created a new energy source, one that is cheap and will solve the world's energy problems and then let everybody know. All that is meaningless and unproven until it can be tested independently. So yes, they may as well keep it secret because only telling people is useless. Let me ask you this, what good is research if nobody can use that research?

@LJay205 Ай бұрын

@@jduk1818 I was referring to the original comment, whose wording seems to suggest that they declare the research itself pointless unless the model is made public, which is the stance I argued against. Other than that, I understand your point, but I do think that they are going to release this in the future. I do agree that publishing this paper and then never following up on it would be somewhat strange and induce skepticism, but it still doesn't make the paper meaningless. I think that announcing progress like this still has some value inherently, even if it poses no scientific significance since the model isn't public. To me it seems silly to proclaim that they shouldn't have announced this and kept quiet simply because no one else can derive value from their work at this stage, which isn't even something you can say with certainty.

@DemoNova Ай бұрын

What a time to have ears!

@GWelby Ай бұрын

Can I get you to write a paper on Cascade please? Love, Greg

@LoserDub Ай бұрын

This is all available on Suno right now

@xTerminatorAndy Ай бұрын

what alive to be a time

@neilmacmusic Ай бұрын

definitely not as hifi sounding as UDIO or SUNO but it probably will be next week!

@BluishGreenPro Ай бұрын

Does AI generated audio sound slightly noisy or echoey to anyone else? It just doesn’t sound “clean” to my ears

@Twentizz Ай бұрын

I want sound made during seggs 😂

@kadencoates8774 Ай бұрын

Yeah, during the train to orchestra example I was thinking I would much rather use two real recordings and blend between them in an audio editing software. I think generative AI like this is good to get a quick and dirty example of what you're looking for, to use as a placeholder while you create the real thing later.

@pandoraeeris7860 Ай бұрын

It'll improve rapidly. Just two more papers down the line!

@Deviloc1 Ай бұрын

Yeah, the video title calls it "Stunning", I call it "meh". Yes, two papers down the line it might be useful. In its current implementation I call it useless. Just like most current generative AI, it's amazing that it can do what it can do, but what it can do isn't actually to a quality level that is useful currently. Also, I feel like there are a dozen AI tools already released that can do all of this (AI voice tools, AI SFX tools, AI music tools, all producing similar meh results currently). In order for any of these tools to actually be useful, the quality still needs to improve by a huge amount and we need tools that allow for iterative modification to results (one-shot is almost never going to give me the final result I want, I need to be able to fine tune the results to get exactly what I want).

@bohdanvakulenko4266 Ай бұрын

Audio engineers are about to have a pay cut

@franciscomagalhaes7457 Ай бұрын

You know what I think this sort of generative AI is extremely valuable for? Tabletop RPGs. One of the most time and energy (and sometimes money) consuming activities for a game master is sprucing things up with artwork and sound effects. With generative image models I'm already seeing people taking advantage of them. Sound would be a phenomenal addition.

@DuckGia Ай бұрын

Still tuning models.

@marcshawn Ай бұрын

Hot Take: NVIDIA does not open source (Apache, MIT, GPL) enough of their models.

@debasishnandy5124 Ай бұрын

I wanna join Nvidia fr and make such cool contributions to AI

@애옹이도둑 Ай бұрын

As technology advances like this, people tend to worry, but sound designers will probably welcome it. It can drastically improve their workflow. For example, they could use AI to pre-generate sounds, present them to clients, and if approved, synthesize them with actual sounds and voices. Of course, designers who are less skilled than AI will likely be left behind. But this is simply history repeating itself.