Is OpenAI's o1 model a breakthrough or a bust?

Views: 14,886

Steve (Builder.io)

Days ago

Comments: 43

@burger-guy-99 a month ago

Steve, I think your channel is underrated 👏

@LeelaSankharM 28 days ago

Fireship sent me here.. watched most of the videos.. subscribed! And turned on all notifications!! No BS.. very straightforward, you are very underrated Steve!!

@devlisandro a month ago

This is what I need instead of the no-brainer LLM cope. Great video, greetings from Argentina!

@jeffwads a month ago

2:10. For this type, I just round up or down to the closest whole number. So 40 + 50 and then just subtract 3.

@nonstopper a month ago

What app were you using for the intro clip? EDIT: NVM

@dogmaticwonder a month ago

What was the answer?

@Steve8708 a month ago

Builder.io

@messi-r7m7r 28 days ago

I didn't test Claude, but o1 is such a big upgrade in coding. I've only tested it for 2-3 days so far, but it's great.

@richardantao3249 a month ago

Great references made to Daniel Kahneman and the Dot Com Bubble.

@didiercatz a month ago

o1 / o1 pro is not available through API though, right?

@Steve8708 a month ago

Unfortunately that's correct, so I had to do the comparison with o1-preview. I took one of the faster runs for the video (sometimes it took as much as 90 seconds before anything displayed). I cross-checked it in ChatGPT with the latest o1 and the same prompts, and got similar times as in the video: ~30 seconds of waiting before output, and similar code output. More details in the blog post.

@Anas_Alaqeel a month ago

Not yet, but GitHub Copilot is already using it somehow!

@didiercatz a month ago

@ as far as I know the new o1 is up to 50% faster than the preview

@Steve8708 a month ago

@didiercatz yup, it's mostly the delay time, and that's why I took a video of a 30 second delay (which is on the very fast end for o1-preview, not the 90 seconds that is more common). That better matched the results I was seeing in ChatGPT with o1 using the same prompts as our app uses (averaging around a 30 second delay for "thinking time", like in the video).

@freeaccstuff153 29 days ago

Gpt definitely helped my project life

@GKuul a month ago

I love 'up to date' videos

@augmentos a month ago

Can you do a full code-only Pro review against o1, Gemini and Sonnet? There are almost no exclusively code-focused videos on this yet. I have watched them all lol

@Sicxpac a month ago

There are drawbacks to o1, but it's definitely a massive step up for other things

@RsZ789 a month ago

Steve, isn't the fact that Dario left OpenAI an explanation for the stagnation?

@npc-drew a month ago

I find system 2 thinking useful for research. I have done it twice now, together with Perplexity and a Google search, and o1 was able to give more in-depth meaning to my queries.

@mnageh-bo1mm a month ago

Very on-point videos, keep it up

@Egal0190 29 days ago

Why is QwQ called a "copycat" of o1, when neither of them invented chain-of-thought, and in the same sentence it was said that CoT has been around for years? o1 isn't open source, so Alibaba obviously couldn't "copy" o1. Also, QwQ had obviously been in development long before o1 went public, and they were released only weeks apart - far too little time to copy and publish a whole model in response to the release of the other one. Maybe it wasn't meant this way in this video, because the guy seems like a nice guy with no malign intentions, but I have noticed on a lot of other US channels that everything from a CN company is automatically labeled as a copy and inferior, while everything from the US is always labeled original and superior. In reality CN has way more new patents per year compared to the US. Which is also quite logical, because CN has around ten times as many STEM graduates per year compared to the US. If there has been some kind of insider information that QwQ was an actual copy of o1, please update me.

@evaneoskowar8859 a month ago

I tested both on highly complex coding tasks. While o1 is a significant improvement, Claude 3.5 Sonnet still outperforms it AND is much faster

@LadderVictims 29 days ago

hey, i like ur pfp can u tell me her name?

@alonzoperez2470 28 days ago

Do a video on Gemini 2.0 coding

@szebike a month ago

The core technology is useful; the hype and insane overvaluation of companies without proper results is not. At the moment it's still a huge gamble, and it leans towards stagnation. I assume it will take a similar trajectory to the internet: first it's too clunky to be useful, years later with better hardware and adoption it becomes cool, then big tech companies will ruin it again with over-regulation and greed.

@Cagatay-gf8mp a month ago

I love Claude

@CommentGuard717 27 days ago

I wouldn't say it's "marginally better". For me and my prompts it's a lot better, like 30 minutes to 30 seconds better

@Toxicflu a month ago

I think you made your video too early... You should have waited a week until after all of OpenAI's December releases. ;)

@ronanru a month ago

This video sounds like it's AI generated

@junaidahmed765 a month ago

o1 is already a beast. Have tested it extensively.

@jambazz 18 days ago

You kind of failed at talking about the o1 model though, or at least not enough

@BadWithNames123 a month ago

o1 is a preview of o2 and o3 ;) And no.. progress is not slowing at all

@sid4579 a month ago

Stop with the cope lmao, it is slowing. LLMs aren't going too far

@SimonNgai-d3u a month ago

First time watching this channel?

@Sindigo-ic6xq a month ago

@sid4579 then you will have no problem betting 1k that this clearly won't be true within 18 months

@John4343sh a month ago

@sid4579 It isn't cope lol. Wtf are you talking about? All of the data points towards exponential scaling, especially now that we are using TTT.

@anthonybanks6772 a month ago

o1 is so much better and more accurate than Claude. Yes, it can be slow, but the quality of the output is 10x better. Remember, quality is King 👑

@James1787Madison 29 days ago

Quantum computing will get us around these problems you are talking about.

@poisonza a month ago

I think comparing a reasoning model with a non-reasoning model is not fair. GPT-4o / GPT-4o mini versus Claude Sonnet / Haiku is a more adequate comparison.

@SearchingForSounds a month ago

It's just an LLM. One has a hidden token for hiding the CoT yap they fine-tuned. It's just as easy for Claude to do the same. They have done a great job selling a "reasoning" model to idiots
