OpenAI Now Has a Text-to-Speech API - Testing and Overview

Views: 8,970

Jarods Journey

Comments: 52
@megaaziib 1 year ago
The voice is good, but it's not at ElevenLabs level. ElevenLabs voices can vary their intonation according to text and punctuation (very good for a roleplay chatbot). The OpenAI voice is similar to Microsoft Edge TTS, which you can use for free.
@KeystoneScience 1 year ago
So excited for this. I hope they come out with a nice zeroshot cloning model to use that is at a similar price point.
@Jarods_Journey 1 year ago
This would be pretty cool! Dunno if they're thinking about any liability for that though lol
@Mowgi 1 year ago
That chat assistant in your own example sure did use a lot more words than necessary :P Your face at 7:55 🤣 Looks exciting, keen to see where it goes from here. As always, interested to see if they make any advances with accents.
@Jarods_Journey 1 year ago
Haha ofc. Accents will hopefully be coming, that'll be a nice addition!
@shawn4990 1 year ago
First time catching your content Jarod... great video. Easy subscribe. In reading the comments, I noticed many users are finding the limitations of OpenAI's software. Like anything else, early adopters/users of new technology share in the often painful growth of such tech while its development matures and evolves. Many, though, see this as a glass half empty rather than half full... either way, in my humble opinion, our future looks bright.
@Jarods_Journey 1 year ago
I agree, I'm generally optimistic on these things. Looking at the same time last year or the year before, the "AI" growth has been awesome, so it's just developing and maturing
@myte1why 1 year ago
Well, for Turkish I can say it sounds like a non-native speaker talking, but not bad. To explain: with native Turkish speakers, words tend to flow more smoothly, whereas non-native speakers tend to put more emphasis on individual words. The situation here is the same. That is to say, it's quite good, but like someone whose native language is different.
@stevecato 1 year ago
Not really clear where OpenAI want to go with this being so late to the party. I've been using EdgeTTS, Piper and looking forward to your 11Labs replacement - can't really see why I would pay OpenAI to use this one.
@Jarods_Journey 1 year ago
Always room for competition, so I think it's about time they got into the field. Quality-wise it's not bad, as it's still relatively lifelike for many languages, but I wouldn't use it for personal stuff as I've got Tortoise + RVC. But the creator of Tortoise is part of OpenAI, sooo I can see things getting pretty good over this next year
@doingtime20 1 year ago
I tried it in Spanish, my native language, and honestly it's pretty good. The only issue is that all the voices speak like an American with a US English accent. The best one is Nova, very little accent.
@Urfriendinfinance 11 months ago
I saw your Tortoise TTS video. What's the best AI voice right now? I heard you should use Tortoise TTS and then RVC.
@Jarods_Journey 11 months ago
I haven't heard anything really much better than Tortoise TTS to RVC. XTTS v2 is also good. ElevenLabs if you don't mind paying.
@Urfriendinfinance 11 months ago
@Jarods_Journey thanks for the quick reply. I was reading your about page and noticed you graduated from Chico. I also graduated from there, in 2010.
@stan-zm3ep 1 year ago
thank you Jarod for all your contributions, u deserve a PhD in TTS :)
@Jarods_Journey 1 year ago
Haha thanks!
@ИванКоряко-ц5б 1 year ago
And what about streaming? Can you start getting audio back before the full response has been generated?
@piersonmarks 1 year ago
Yes - streaming is available via chunked transfer encoding. You can just stream the response from OpenAI to a buffer/file
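To make the reply above concrete, here is a minimal sketch of streaming the audio to a file as it arrives, using only the Python standard library. It assumes the speech endpoint is `https://api.openai.com/v1/audio/speech`, the model name `tts-1`, the `nova` voice, and an API key in the `OPENAI_API_KEY` environment variable. The helper names (`build_speech_request`, `iter_chunks`, `write_chunks`, `stream_speech_to_file`) are made up for illustration, not part of any library.

```python
import json
import os
import urllib.request
from typing import BinaryIO, Iterable, Iterator

# Assumed endpoint for OpenAI text-to-speech requests.
SPEECH_URL = "https://api.openai.com/v1/audio/speech"


def build_speech_request(text: str, api_key: str, voice: str = "nova",
                         model: str = "tts-1") -> urllib.request.Request:
    """Build the POST request; nothing is sent until urlopen() is called."""
    payload = {"model": model, "voice": voice, "input": text}
    return urllib.request.Request(
        SPEECH_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def iter_chunks(stream: BinaryIO, chunk_size: int = 4096) -> Iterator[bytes]:
    """Yield the body piece by piece instead of reading it all at once."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


def write_chunks(chunks: Iterable[bytes], sink: BinaryIO) -> int:
    """Write each chunk as soon as it is available; return total bytes."""
    total = 0
    for chunk in chunks:
        sink.write(chunk)
        total += len(chunk)
    return total


def stream_speech_to_file(text: str, path: str, api_key: str) -> int:
    """Hypothetical end-to-end helper: request speech and stream it to disk."""
    request = build_speech_request(text, api_key)
    # http.client decodes chunked transfer encoding transparently, so
    # response.read(n) returns audio bytes as they arrive.
    with urllib.request.urlopen(request) as response, open(path, "wb") as out:
        return write_chunks(iter_chunks(response), out)


if __name__ == "__main__":
    key = os.environ.get("OPENAI_API_KEY")
    if key:
        stream_speech_to_file("Hello from the TTS API.", "speech.mp3", key)
```

The same `iter_chunks` loop could feed an audio player or a network socket instead of a file; writing to disk is just the simplest sink to show.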
@Jordan-tr3fn 1 year ago
Cool video! How did you write in Japanese like that at 7:38? I'm learning Japanese but find it difficult to write without another virtual keyboard
@Jarods_Journey 1 year ago
You can install the Microsoft keyboard input, and then when you type, you just use romaji. Search around online to see how you can get it up and running :)!
@Jordan-tr3fn 1 year ago
@Jarods_Journey thank you!
@nufh 1 year ago
The Indonesian is quite spot on.
@wisnunugroho9958 1 year ago
Is this paid, bro? When it's run locally like this..
@juancalara3997 1 year ago
Interesting! Is there a way to add this to a custom GPT?
@kaeuzita 1 year ago
The AI speaks other languages like an American trying to speak other languages. The accent lol
@rickarroyo 1 year ago
09:29 - In Portuguese it doesn't sound like Brazil or Portugal, it sounds strange 😑 But your like is guaranteed :))
@MaisnerProductions 1 year ago
Great update. Interesting.
@Jarods_Journey 1 year ago
Thanks mate :)!
@andreimariano2674 1 year ago
I am a native Filipino speaker and I would say OpenAI always knows how to play its cards, even with multilingual TTS. Really excited for this.
@naturalbeauty19964 1 year ago
How can I add any voice, for example Iron Man's voice from Marvel, or any other person's?
@Jarods_Journey 1 year ago
Not possible, I don't think OpenAI wants to be liable for voice cloning lol
@mythaimusic39 1 year ago
I played with the API yesterday, and let me tell you, I got nearly perfect French with it. I tried every voice and they are all nearly perfect, with just a (very) slight English accent here and there. That's far from your demo, where the French is horrible, so I guess that's because it is speaking several languages in the same prompt?!
@Jarods_Journey 1 year ago
Yep! You're on the money there. I noticed the same thing afterwards: when you use one language, it's much better. It's probably a transformer model that was inferring an English accent, or a mixture of accents, when I used it in the example
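One way to act on this observation is to send one request per language instead of mixing several languages in a single prompt. The sketch below only builds the request bodies and assumes the text has already been split by language by hand. The function name `per_language_payloads` is hypothetical, and the `model`, `voice`, and `input` fields are assumed to match the OpenAI speech endpoint.

```python
from typing import Dict, List, Tuple


def per_language_payloads(
    segments: List[Tuple[str, str]],
    voice: str = "nova",
    model: str = "tts-1",
) -> List[Tuple[str, Dict[str, str]]]:
    """Return one (language, payload) pair per non-empty segment.

    Each payload contains text in a single language, so the model never
    sees mixed-language input within one request.
    """
    payloads = []
    for language, text in segments:
        text = text.strip()
        if not text:
            continue  # skip empty segments rather than sending blank input
        payloads.append((language, {"model": model, "voice": voice, "input": text}))
    return payloads


if __name__ == "__main__":
    demo = [("fr", "Bonjour tout le monde."), ("ja", "こんにちは。")]
    for language, payload in per_language_payloads(demo):
        print(language, payload["input"])
```

The language label is kept alongside each payload so the resulting audio files can be named or ordered per language; whether this fully removes the accent is untested here.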
@tacototo8933 1 year ago
Dutch was all right, but it pronounced "eten" (eat) incorrectly. Nonetheless, this is amazing
@bobbyboe 1 year ago
German = bad, French = bad, Italian = bad, the voice in your app = very good 😊 I wonder if you could enable your audiobook app to speak German... and make it installable for non-coders like me 🙏 I would like to train my own voice and be able to do recordings without setting up my audio setup.
@YumiSumire 1 year ago
Vietnamese is really accurate
@SpudHead42 1 year ago
Android only so far for the ChatGPT TTS?
@Jarods_Journey 1 year ago
I'm actually not sure, I'd assume it's available on iOS as well
@Airbender131090 1 year ago
Can we run it locally?
@Jarods_Journey 1 year ago
Yes, you just need to call the API
@jerrythefeared 1 year ago
French, German and Dutch were horrendous. Korean started fine, but the other three quarters were butchered really badly :/ edit: Russian was good!
@mikeiavelli 1 year ago
French 9:06 ; 10:18 --> Sadly, the French version was really bad. Maybe it's better with another voice?
@danielmonge2318 1 year ago
I'm from Brazil and let me just say... That Portuguese was terrible lol. It had a heavy American accent. I've played around and I also like Nova the best. When done in isolation, with a text input that is 100% Portuguese, it does a lot better. It does have a countryside accent to it, but at least it sounds like a native. I don't like the fact that it is too fast. You can control the speed by sending "speed" with a float from 0.25 to 4.0, where 1.0 is normal speed, but that's just an audio post-processing step, like changing the speed on a YouTube video.
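As a rough illustration of the "speed" setting described in this comment, the sketch below builds a request body with a validated speed value. The 0.25 to 4.0 range and the 1.0 default come from the comment. The field names and the `tts-1` model name are assumed to match the OpenAI speech endpoint, and `speech_payload` is a hypothetical helper.

```python
from typing import Dict, Union

# Range quoted in the comment above; 1.0 is normal speed.
MIN_SPEED = 0.25
MAX_SPEED = 4.0


def speech_payload(text: str, voice: str = "nova", speed: float = 1.0,
                   model: str = "tts-1") -> Dict[str, Union[str, float]]:
    """Build a JSON-serializable request body with a checked speed value."""
    if not MIN_SPEED <= speed <= MAX_SPEED:
        raise ValueError(
            f"speed must be between {MIN_SPEED} and {MAX_SPEED}, got {speed}"
        )
    return {"model": model, "voice": voice, "input": text, "speed": float(speed)}


if __name__ == "__main__":
    # Slow the voice down slightly for a Portuguese-only input.
    print(speech_payload("Olá, tudo bem?", speed=0.85))
```

Checking the range on the client side just avoids a rejected request; it says nothing about how the speed change is applied on the server.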
@Jarods_Journey 1 year ago
Yeah, I noticed too that if you use English and then have it switch to another language, it might start speaking English with a tinge of an accent, or vice versa. They're probably running a transformer model behind their TTS as well, like Tortoise, so this could be affecting the output haha.
@WanJieMing 1 year ago
You don't eat dairy ?? WHY SO WEAK ?
@bluevisor 11 months ago
What kind of Asian uses calculator for 1.5x30?
@gamersgabangest3179 1 year ago
I am Italian and the Italian accent is not perfect
@m_a_p 1 year ago
Most of the non-English TTS sounds like an American trying more or less hard to speak the language. German and French are especially terrible.
@MarcoManzo 1 year ago
In German it has a strong English accent. Not great
@Airbender131090 1 year ago
Russian is very bad… crazy English accent