OpenAI Realtime API - The NEW ERA of Speech to Speech? - TESTED

Views: 31,151

All About AI

A day ago

Comments: 52
@KCM25NJL 3 months ago
Yeah, no basement-dweller devs are gonna be messing with that API until the costs drop by at least 100x, which I honestly only see as a near-term incentive for Meta to get a Llama Voice model cookin'
@jamesjonnes 3 months ago
I'll use it, but can't wait for an uncensored open source version. Text only is too boring. I lack the patience to use text only for too long for the tasks I want, like learning languages.
@karmcy 1 month ago
Well said. 3 tests today, ~2 mins each conversation: $1.5. Yikes!
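For a rough sense of scale, the figures in @karmcy's reply can be turned into a per-minute rate. This is only a sketch: it assumes the $1.5 was the total for all three conversations (the comment does not say whether it is a total or a per-conversation figure), and the function name is made up for illustration.

```python
def cost_per_minute(total_cost_usd: float, conversations: int, minutes_each: float) -> float:
    """Average cost per minute of conversation, given a total bill."""
    total_minutes = conversations * minutes_each
    if total_minutes <= 0:
        raise ValueError("total conversation time must be positive")
    return total_cost_usd / total_minutes


# Assumption: $1.5 covers all 3 conversations of about 2 minutes each.
rate = cost_per_minute(total_cost_usd=1.5, conversations=3, minutes_each=2.0)
print(f"${rate:.2f} per minute")  # 1.5 / 6 minutes = $0.25 per minute
```

If the $1.5 was per conversation instead, the rate would be three times higher.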
@almirkaza 3 months ago
Can you share the URL to the repo?
@boxeemusic 3 months ago
Where can I find the code? Pls help
@OliNorwell 3 months ago
Great work! You must have had a busy couple of days getting it working
@meetsummdev 3 months ago
You can really implement it in a few hours.
@sykexz6793 3 months ago
I don't think this is the same model as advanced voice mode.
@ibrahimaba8966 3 months ago
I just integrated it on Twilio, it changes everything, but it took me a bit of time.
@DarrenJohn10X 3 months ago
Looking forward to seeing your alleged "spaghetti" code! (Right now your latest repo is from 2 weeks ago.)
@Bangs_Theory 3 months ago
Which function controls the interruption?
@gaijinshacho 3 months ago
VAD (voice activity detection)
@viduraerandika8296 28 days ago
@gaijinshacho Even when I use it in turn detection, it continues talking until it finishes.
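One possible explanation for the behaviour described in this exchange: with server-side VAD the API only reports that the user started speaking, and over a WebSocket the client is still responsible for stopping its own audio playback and cancelling the in-flight response. The sketch below only builds the JSON events and does not open a connection. The event and field names (`session.update`, `turn_detection`, `server_vad`, `input_audio_buffer.speech_started`, `response.cancel`, `conversation.item.truncate`) follow the Realtime API beta documentation as I understand it and should be checked against the current reference. The helper names and default values are hypothetical, and this is not the code from the video.

```python
import json
from typing import Dict, List, Optional


def session_update_with_server_vad(
    threshold: float = 0.5,
    prefix_padding_ms: int = 300,
    silence_duration_ms: int = 500,
) -> Dict:
    """Client event asking the server to detect turns with server-side VAD."""
    return {
        "type": "session.update",
        "session": {
            "turn_detection": {
                "type": "server_vad",
                "threshold": threshold,
                "prefix_padding_ms": prefix_padding_ms,
                "silence_duration_ms": silence_duration_ms,
            }
        },
    }


def events_on_user_interrupt(
    server_event: Dict,
    playing_item_id: Optional[str],
    played_ms: int,
) -> List[Dict]:
    """Client events to send when the user talks over the assistant.

    Returns an empty list unless the server reported speech start while
    an assistant audio item is still being played locally.
    """
    if server_event.get("type") != "input_audio_buffer.speech_started":
        return []
    if playing_item_id is None:
        return []
    return [
        # Stop the response that is still being generated.
        {"type": "response.cancel"},
        # Tell the server how much audio the user actually heard.
        {
            "type": "conversation.item.truncate",
            "item_id": playing_item_id,
            "content_index": 0,
            "audio_end_ms": played_ms,
        },
    ]


print(json.dumps(session_update_with_server_vad(), indent=2))
```

In a real client these dictionaries would be serialized with `json.dumps` and sent over the open socket, and the local audio player would be stopped at the same moment; that part is left out here.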
@AgenticAlex 1 month ago
5:58 - I felt that 😂 Currently having the same conundrum with the Anthropic API! (Claude 3.5 Sonnet is so good...)
@DeepSucess 2 months ago
Can we use speech/voice as input to this app over WebSockets and get the result as text output?
@AtheistAdam 14 days ago
You are cool :) Thanks for all you share.
@jamesyoungerdds7901 3 months ago
Great video, thanks Kris! I'm interested in the function calling and structured output from the voice websocket return. Can you use agents or agentic flows with constrained and structured outputs with the voice mode? 🤔
@pjm17 3 months ago
Could you achieve these results in an app just using text-to-speech and speech-to-text with native iOS features alongside OpenAI's NON-realtime APIs?
@bassemibrahim3798 29 days ago
Yes I can, I have already implemented that.
@hamzakhanswati9087 1 month ago
When will you upload it to GitHub?
@Akander20 3 months ago
Where can I get the repo?
@tommoves9935 3 months ago
Happy to be the first to comment. Kris, you are always up to date. Once again cool stuff from you. Spaghetti code... 🤣. Great that you talked about the costs as well. I like your creative and often really funny ideas. Please keep up the great work! Regarding your phone call: I saw a video from a guy in the US weeks ago (no Realtime API) where he let his AI order a pizza and it worked great. Latency even back then was good enough, so it should work perfectly. Maybe try it with an Italian accent 😉. Thx from Tom!
@drewpeer 3 months ago
Does everyone have access to this beta? Anything we have to do?
@JaredVBrown 2 months ago
Would love to bankrupt myself with your code, I won't judge spaghetti. Tried for 20 prompts with the new Claude to get it up and running, no dice. Examples would be much appreciated :)
@DesignDesigns 3 months ago
This is mindblowing...
@d3xrd527 1 month ago
Where to find code?
@alarconfilms1 3 months ago
What is the code used?
@khalifarmili1256 3 months ago
It's not out yet
@romera9662 3 months ago
@khalifarmili1256 How long will it take?
@DeepSucess 2 months ago
Can it work for other languages such as Urdu or Hindi?
@nmana9759 2 months ago
Why wouldn't you share the repo?
@MagagnaJayzxui 3 months ago
What is AVA?
@micbab-vg2mu 3 months ago
Thanks :)
@三川富資訊股份有限公 3 months ago
The Realtime API cost is high. I suggest a cheaper way: 1. Use Google STT to get the user's speech as text. 2. Send the text to GPT. 3. Get the response from GPT. 4. Send the response to Google TTS. 5. The user gets the AI response as both text and voice. The response time is longer, but it costs less.
@李征-u3n 1 month ago
In that case, you don't need the Realtime API. I think the OpenAI chat completion API works just fine. I think the key point is that the Realtime API doesn't miss any information from your voice (tone, intonation, or accent), which means it can sense you the way a real person would, or at least it is trying to.
@MrAnonymousCitizen 1 month ago
Yes you said it yourself. The response time is longer and the cost is cheaper… thank you Sherlock…. Case solved
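The cheaper STT, then GPT, then TTS route discussed in this thread can be sketched as a simple three-stage function. This is a minimal illustration, not anyone's actual implementation: `run_turn`, `TurnResult`, and the `fake_*` stand-ins are hypothetical names, and in a real app the three callables would wrap whichever speech-to-text, chat, and text-to-speech services you choose.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class TurnResult:
    user_text: str      # what the speech-to-text stage heard
    reply_text: str     # what the language model answered
    reply_audio: bytes  # the synthesized answer


def run_turn(
    audio: bytes,
    stt: Callable[[bytes], str],
    llm: Callable[[str], str],
    tts: Callable[[str], bytes],
) -> TurnResult:
    """Run one conversation turn through three sequential stages."""
    user_text = stt(audio)        # steps 1: speech to text
    reply_text = llm(user_text)   # steps 2-3: send text, get response
    reply_audio = tts(reply_text) # step 4: text to speech
    # Step 5: the caller gets the reply as both text and audio.
    return TurnResult(user_text, reply_text, reply_audio)


# Stand-in stages so the sketch runs without any network access.
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_llm(text: str) -> str:
    return f"You said: {text}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


result = run_turn(b"hello", fake_stt, fake_llm, fake_tts)
print(result.reply_text)  # You said: hello
```

Because the stages run one after another, their latencies add up, which matches the trade-off named in the thread: slower responses in exchange for lower cost. The text passed between stages also drops tone and intonation, as the reply above points out.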
@李征-u3n 1 month ago
I don't quite understand what realtime means here, especially in the text version. In the voice version, yes, you can interact with it like you're really talking to a person: you can interrupt the conversation, or maybe OpenAI can pick up extra information from your tone, intonation, or accent. But in the text version, I don't see any difference from just using the OpenAI chat completion API.
@dievas_ 3 months ago
I still don't have access to it :/
@toufiqfarhanyt 2 months ago
Where is the code?
@Dea07thox 3 months ago
Can't you just prompt it better to be less talkative, so you don't have to break its response that often? That would make a big difference and make everything more seamless :)
@icydemon9749 2 months ago
Can you provide the code, please?
@contentfreeGPT5-py6uv 3 months ago
I tested it yesterday, but got: Connection error: 403 Access denied. Check your API key and your permissions to use the Realtime API.
@elprox1290 3 months ago
Try checking your API key or just making a new one.
@contentfreeGPT5-py6uv 3 months ago
@elprox1290 Again, thanks
@saksham3 3 months ago
Doesn't it have emotions?
@AI_Escaped 3 months ago
No one is going to be even able to develop at these prices other than those with deep pockets. Just testing and figuring things out would be too expensive to even try.
@thenoblerot 3 months ago
By telling it it is playing a game with the user, it might be failing on purpose to let you win!
@benbrahimjamil1976 2 months ago
How do I get the repo?
@TheTrainstation 2 months ago
I'm waiting to hear the Irish accent to be sure
@DhairyaMarwah-l1u 3 months ago
Can you share the repo link?
@khanhhq2044 3 months ago
Can you share the repo link?