Building with Gemini 2.0: Native audio output

32,435 views

Google for Developers

1 day ago

Comments: 123
@raghavamrev5245 1 day ago
Yes!! Replace the traditional TTS! Please bring this to Google Play Books! I would love to have my books read to me like an audiobook! Game changer!
@alias_ansuz3336 1 day ago
Up!
@kromanfr 20 hours ago
@alias_ansuz3336 it's already possible with IIReader
@maxcomperatore 1 day ago
the speed of this is astonishing
@Tinman462 1 day ago
This is how the world ends... one perfectly-pitched whisper at a time 😊
@sj00100 1 day ago
Yeah, remember when the world ended when we had text to speech for years
@vectoralphaSec 1 day ago
I'll take it.
@DaveK-q9y 1 day ago
The whisper… all those ASMR YouTube videos were useful
@BeepBeepBeepbop 1 day ago
SOO excited for an alternative to OpenAI Advanced Voice!!!!
@TECHNOSTARTERSS 1 day ago
It's not quite seamless when you are prompting it to speak that way. In the OpenAI one you don't have to prompt it. It automatically adapts to you, and it's voice to voice, speech to speech. This one seems like text to speech.
@raydosson2025 1 day ago
@TECHNOSTARTERSS this one is not text to speech. That's why the title is "Native audio output".
@UrYesMan 5 hours ago
This suits Google a lot. Google's been making Gemini very humanlike and now with this, their vision is even clearer.
@hahoang9542 1 day ago
It's the early Christmas gift from Google
@wassimharzli336 15 hours ago
I think with this, Gemini will become the number one AI, outperforming OpenAI
@AIrtesan 1 day ago
And you even change the order of the speakers, making the female voice lead. Kudos to the CX team. Wittily played!
@michaelcharlesthearchangel 1 day ago
A man of Native American ancestry has been feeding all AI developers for the last decade, behind the scenes.
@aliettienne2907 21 hours ago
2:31 If Gemini can handle both fast speaking and whispering, it shows how much robust ingenuity and development went into devising the AI model. 😎💯💪🏾👍🏾
@aron2922 1 day ago
Her is truly here
@__J____ff 23 hours ago
it's him & her .... hhhhhhhhh
@aiforculture 1 day ago
Exceptional work 👏 love the example of the model intelligently adapting to fit the speed of reply it thinks you need.
@Momixer 1 day ago
Yes! Please use this for the different YouTube soundtracks, because right now the generated ones are really bad
@CODE7X 1 day ago
This isn't new, but maybe it's better than what was out there before! Can't wait to try it
@friedpizza262 1 day ago
Whoever made this video is cool
@LaPetiteCuillère 1 day ago
When is it available?
@HUEHUEUHEPony 1 day ago
As soon as Google kills their older products
@aquilesdg4305 1 day ago
I think it already is
@JaBigKneeGap 1 day ago
@aquilesdg4305 And _where_ exactly is it available?
@Ethereal_Enigma 1 day ago
@aquilesdg4305 It's available right now in Google AI Studio
@MainInternetUser 1 day ago
Right now on AI Studio
@OumarDicko-c5i 1 day ago
I will build my IA girlfriend now 😂
@CODE7X 1 day ago
Haha yes
@games528 1 day ago
Ah yes, Irtafacial Antelligence
@aron2922 1 day ago
@games528 This is funnier than it should be
@IceMetalPunk 1 day ago
@games528 In many languages, the adjective comes after the noun.
@flyingstapler1241 1 day ago
@games528 It's called IA in many languages
@ShubharthakSangharsha 1 day ago
2:32: damnnn ok, I'm impressed 👌 👏
@dliedke 18 hours ago
Rain is good for running. Be more excited about the rain, AI!
@DangRenBo 4 hours ago
Can we get word timings output together with the TTS? We need closed captions for accessibility.
@trutenantedboderampt 1 day ago
Great! Now we can hear nonsensical facts from history with native audio output!
@IN-pr3lw 1 day ago
Google doing what OpenAI said they would do months ago but we still didn't get 👏
@cagnazzo82 1 day ago
Actually with Advanced Voice I was having it speak English, French, Elvish, and Simlish in one sentence. The actual game-changer is being able to prompt the AI to do this. You can do this through voice commands with OpenAI, but for some reason they ignored the ability to prompt for voices. Plus I think the whole 'her' situation got them rattled about voices almost altogether.
@IceMetalPunk 1 day ago
? OpenAI Advanced Voice mode is already out
@IN-pr3lw 21 hours ago
@IceMetalPunk It's out but it's not the one in the ads. It doesn't seem to be audio to audio; it still seems to be audio to text to audio. I could be wrong, but it doesn't feel like what we were shown.
@IceMetalPunk 17 hours ago
@IN-pr3lw No, Advanced Voice Mode (called the audio-preview model in the API) is fully native audio output. It can do sound effects, tone shifts, voice differences, accents, etc.
@IN-pr3lw 16 hours ago
@IceMetalPunk I can't get it to whisper or anything on mine 🤷‍♂️
@niceplace123 23 hours ago
Looks amazing, but did anyone get it to work in the actual AI Studio? I ran into a ton of bugs, especially with non-English languages.
@flamyf 23 hours ago
0:01 Has anyone found the "Video understanding" demo? All the other topics have a video on this channel
@demonsynth 1 day ago
Mind blown. Playing with it now :)
@janjahrademusic 1 day ago
haha yoo that's dope ..well done
@jab3966 12 hours ago
If I have access, would this only be in AI Studio? Or could I use it in code to run other tests, such that the output is audio?
@ThomasOberhoff 1 day ago
This will put so many call-center agents out of work worldwide
@Chaotic-n5n 1 day ago
Bro this thing is crazyyy 😱
@DanielMK 1 day ago
Now that's impressive
@Terrantulla 1 day ago
I can't help but feel like the next decade is going to get very weird
@NutriQlikAI-e4e 15 hours ago
What's the point of releasing 2.0 when not all the features are available to test? Note: Image and audio generation are in private experimental release, under allowlist. All other features are public experimental.
@MichealAngeloArts 1 day ago
I don't have the "Output Format" and "Voice" options under "Model" in AI Studio. I just have the "Token Count" immediately after Model.
@MichealAngeloArts 1 day ago
I've just figured it out: I have to change from "Create Prompt" to "Stream Realtime" in the left pane. However, I can't seem to change the audio effect. Whispering doesn't work for me although it is demonstrated in the Google post. How can we add these audio effects?
@Kiririn 1 day ago
Is the model in the video the Flash version? I am unable to get it to whisper or laugh or change how it speaks
@shadydragon22 1 day ago
Same here
@ShawnFumo 1 day ago
I think that's the part that is available in January. It is a bit confusing since they ended with saying to go to AI Studio...
@shadydragon22 1 day ago
@ShawnFumo Oh ok I see! Thanks for clarifying
@RichardPinewood 1 day ago
Level 4 AI is the next big thing, that's when science is gonna get interesting 😎
@GeneralKenobi69420 1 day ago
That thumbnail goes hard
@Kenykore 1 day ago
This is so lovely
@999satyam 1 day ago
ok that Hindi was nice, damn. Is there a paper on this?
@phiarchitect 1 day ago
nicely done
@BroskiPlays 22 hours ago
This is AVM but with fewer restrictions
@RickySupriyadi 17 hours ago
How much? Will I be able to afford it?
@devagarwal3250 1 day ago
woah this is so cool
@MemeConnoisseur 1 day ago
Who's going to fill the hollow, the emptiness? Idk, something is super weird about AI-generated audio trying to be friendly and human-like..
@AnonymousNyanCat-qg6bb 17 hours ago
There is a noticeable difference between Gemini and GPT. I am happier with GPT. No doubt, thanks.
@Fwuzeem 1 day ago
How do we get it?
@rakeshkumarrout2629 1 day ago
Let's start building with Gemini 2.0
@Mediiiicc 22 hours ago
Weird how we can still tell the voice is AI. The lack of errors makes it noticeable. Just like how a jeweler can tell real and fake diamonds apart, since real diamonds have imperfections that fake diamonds don't have.
@AyyazZafar 1 day ago
I tried but it does not whisper yet.
@snowhan7006 1 day ago
incredible❤❤
@DistortedV12 1 day ago
GOOGLE ships! Pixel phone stocks jumping!
@Happ1ness 1 day ago
Hopefully it's not another lie. We all remember the Gemini "hands on demo".
@DarxKies 21 hours ago
That was fun!
@Blooper1980 1 day ago
Pretty epic
@ruchirahasaranga8076 1 day ago
It does not support the Sinhala language!
@pandoraeeris7860 1 day ago
I need an agent that can use any program on my computer. Just give us AIOS.
@J3R3MI6 1 day ago
Exactly
@CODE7X 1 day ago
Exactly, but yes it's already out, but for the browser so far, and not released yet .... I hope Google releases one :0
@ShpanMan 1 day ago
Nothing that OpenAI's model can't do so far, but hey more competition is better for everyone!
@IceMetalPunk 1 day ago
Hopefully it has cheaper API access. I blew through so much money just testing a few use cases of the OpenAI audio model through the API.
@mightynathaniel5355 1 day ago
Would be better and more impressive if it kept the same voice or character when switching languages rather than using a totally different voice for each language. But it's all fun, and I'm looking forward to using this model.
@ROHIT-wx4nu 1 day ago
This is how TTS ends😂😂😂
@1brokkolibaum 23 hours ago
I wonder why I am able to use it on my PC, but my phone doesn't have 2.0 unlocked 😮‍💨
@vectoralphaSec 1 day ago
AGI is coming soon, 2025
@JaBigKneeGap 1 day ago
Dude, I swear. Like, that clock slaps 12 on, idk, January 5? I swear AGI will be here. Or anytime afterward.
@3thinking 15 hours ago
Wild!
@stochastic84 17 hours ago
I thought the voice felt very unnatural, and then the video answered why lol.
@braineaterzombie3981 1 day ago
Gimme my Sandevistan, time to get chromed up
@pathring 1 day ago
The Korean speech is a little unnatural.
@lakshiBro 1 day ago
Oh well.
@MidgarMerc 23 hours ago
Surely this won't be used to cause suffering at the expense of talented voice actors just so rich creeps get even richer. Surely.
@4letterdc 1 day ago
hell yeah
@InternetKilledTV21 1 day ago
Oh Calculon
@andreaserrano3809 22 hours ago
Been playing with it and it only supports English lol... I guess this is just a smaller demo model to show for now
@eve_mtpl 18 hours ago
It sounds like a hypocritical AI, can we go back to the "I sound like an AI" AI?
@AstroZoe1804 1 day ago
I love it
@dfas1497tcf3 17 hours ago
I tried it, but it hasn't been applied yet.
@ashleigh3021 1 day ago
I don’t like the tone, cadence structure. They should call it “podcast voice”
@Selene-xf9yi 19 hours ago
Cool
@BoydLIN-c3w 1 day ago
I haven’t found Chinese 😂
@notvedxp 1 day ago
😮
@donny6036 13 hours ago
This is way too natural. I have to keep telling myself that this is generated.
@alejandromedina1019 9 hours ago
banger
@joelcarter9137 1 day ago
Wow! That is completely pointless!
@bluepandaman 1 day ago
What.. are you even talking about. How is this pointless?
@SabiUddin 23 hours ago
Copium
@naughtycat9894 1 day ago
the most exciting thing 🎉