Gemini 2.0 - How to use the Live Bidirectional API

13,634 views

Sam Witteveen

Days ago

Comments: 30
@leonvanzyl 9 days ago
For a video titled "how to use the API", you never once showed how to use the API 😅. You might want to rename this to "how to use AI Studio".
@samwitteveenai 3 days ago
Fair enough. AI Studio is the front end/playground of the API (it's not a consumer interface etc.), but I can see why you and others feel it should have had some code rather than just showing the cookbook etc. Good feedback.
@mattkydd 9 days ago
It pains me to say it -- but I'm absolutely blown away by G2.0 bi-directional conversational quality
@NLPprompter 9 days ago
the era of realtime conversation AI is here...wohooooo
@matterhart 5 days ago
Really appreciate the descriptive, non-clickbait titles.
@lgmuk 9 days ago
Gemini 2.0 is incredible! 🤩 Great video!
@nufh 9 days ago
Do you remember the controversy in the early days of AI when Google was trying to catch up by using pre-prompt tricks for their model showcases? Now, they've truly delivered.
@SassePhoto 6 days ago
Thank you, very helpful, got it straight away!
@unclecode 9 days ago
The first playground for developers that non-developers can actually use for daily life issues 😅 amazing
@MrKrzysiek9991 9 days ago
Thanks, as always a great video :)
@alexwoxst 9 days ago
When do we get the Pydantic AI RAG app? :)
@samwitteveenai 9 days ago
Will try and write something this weekend. Been flat out with work.
@alexwoxst 9 days ago
@samwitteveenai Thanks man, no stress, take your time with it. I just wanted to let you know there is interest. Many frameworks are hot for 4-5 days and then fall off, but Pydantic AI seems to have some staying power. Maybe that's something you could comment on in the video? :)
@samwitteveenai 9 days ago
Love your comment about frameworks for 4-5 days, so so true. I have started using Pydantic in a few things so will certainly make more vids about it.
@BM-ni4uz 9 days ago
Hi Sam, thanks for the great videos. Do you think that the bidirectional API will be cost-effective for real-world applications (once officially released)? I imagine building apps that use this API for continuous periods of time. What are your thoughts?
@samwitteveenai 3 days ago
Yeah, I think it will depend on the application, and if it's a high-value application, etc. But surely, over time, the price is just going to go down and down. So, already I think you'll find that this is way cheaper than the original pricing that OpenAI had. And over time, it's just going to get even cheaper.
@BM-ni4uz 2 days ago
I am seeing a huge use case for consumer apps. Let’s hope API price decreases fast enough.
@thenoblerot 9 days ago
It's strange to me that the textual output of the model almost seems to be like a speech-to-text transcription? I mean, it often has incorrect punctuation, capitalization, even the wrong words sometimes? Not that one can trust a model to know about itself, but it says it generates text first and TTS is secondary. Odd
@samwitteveenai 9 days ago
It actually could be that. I don’t think the model outputs the text and audio together
@thenoblerot 9 days ago
@samwitteveenai If true... what a bizarre pipeline. I always thought dear sweet dumb Moshi had a cool approach, with distinct internal monologue, text, and audio streams.
@DanielWeikert 9 days ago
I tried it, and it is quite good in English. Sometimes I notice a slight delay in the conversation. Have you experienced that? I also tried it in German, and it's really bad, not even close to ChatGPT's voice mode. I fully understand that German is difficult; I just hope there will be some improvement there. br
9 days ago
Same here. It’s unusable in Czech and French.
@samwitteveenai 9 days ago
Did you try multiple times? Before the release I had some issues where the non-English output would be good one time and extremely bad the next. It was as if a seed issue was changing the voice, which affected speed and accent. It is still an experimental model, so I will pass the feedback along.
9 days ago
@samwitteveenai I went back and tried a few times with different voices and also a Czech system prompt. Nothing. It is not just bad. It is absolutely awful -- like an old-style English TTS reading out a text in a foreign language. But it could understand anything I said in Czech. Also, when I switched the Realtime API to text output, the Czech was just fine. So it's just pronunciation. GPT-4o Realtime, btw, is perfect at this. I'm working on a project on language teaching with LLMs, so I was really looking forward to trying this as an alternative.
@oOserkanCakmakOo 9 days ago
Thank you very much
@immortalbk00 9 days ago
Damn... I could terminate my child's English tutor and save on monthly tuition fees. I gave it the system prompt below and a screenshot of his school's final-year English exam paper, and it worked wonders! "You are an english language tutor. You would help the student on english language questions but not providing the answer directly. If a student gets the answer wrong, guide the student towards the correct answer by explanation and a series of questions to assist him or her." RIP tuition centres
@ayushgarg70 9 days ago
"Okee, so..."
@rcoding513 9 days ago
so.... no api action here!!!!
@joselobo6902 9 days ago
First 😅
@mrchongnoi 9 days ago
Google got it. Sometimes it is best to come from behind. Reviewed the golang API. Looks good. Will test it out later this week.