BREAKING: OpenAI's new O3 model changes everything

Views: 129,380

1 day ago

Comments: 898

@Evoleo 22 hours ago

The use of the logarithmic scale on the graph looks intentionally confusing, making it look at first glance like there is some sort of linear progress, while there is actually a 100x cost increase for a 3x result

@iz5808 21 hours ago

Yes, but the graph looks juicy for non-savvy people

@petargolubovic5300 21 hours ago

Actually it's even worse... 1000x cost increase compared to o1. Around 100x increase is compared to the low compute of the o3 model.

@boccobadz 21 hours ago

Grifters gonna grift. Altman is desperate because Google Veo 2 destroyed their Sora and Google does AI as a side gig lol

@markuscwatson 20 hours ago

Log scale is pretty standard

@wlockhart 20 hours ago

@@markuscwatson Yes it is, it's standard for displaying an exponentially growing quantity. Unfortunately in this case it's the cost per compute, not compute per cost.
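
The log-scale point in this thread can be sketched numerically. This is a minimal illustration, not the chart's actual data: `log_axis_position` and `base` are hypothetical names, and the multipliers are round numbers chosen to match the "100x" and "1000x" figures quoted above. It shows that every 10x jump in cost moves a point the same distance along a log10 axis, which is why a 100x increase can look like two modest steps.

```python
import math


def log_axis_position(cost, base_cost):
    """Distance from base_cost along a log10 axis, measured in decades."""
    return math.log10(cost / base_cost)


base = 1.0  # hypothetical reference cost, arbitrary units
for multiplier in (1, 10, 100, 1000):
    position = log_axis_position(base * multiplier, base)
    print(f"{multiplier:>5}x cost -> {position:.1f} decades along the axis")
```

On a linear axis the same four points would sit at 1, 10, 100 and 1000 units, so the last one would dwarf the others.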

@hicoop 1 day ago

Cryptobros and ai enthusiasts competing for the biggest carbon footprint

@winfredj9820 1 day ago

both destroying everything

@attilakovacs6496 1 day ago

Right... because we were so damn environmentally friendly until now.

@excessivelysalty_81 1 day ago

@@attilakovacs6496 You're not wrong, but now we're just speedrunning it.

@kwinzman 1 day ago

I wanted to reply to you but youtube keeps shadowbanning my comments. Discourse is not possible on this platform.

@MajorasWrath1 1 day ago

@@kwinzman don't say stupid things then

@wilfredomatute7697 1 day ago

Are you saying: it's cheaper to just pay a developer to do the task?

@alex-rs6ts 1 day ago

For now

@wlockuz4467 1 day ago

Exactly, haha

@SimonNgai-d3u 1 day ago

Certainly as of Dec 2024 💀💀 But not for long 💀💀

@lucidityFX 1 day ago

You think it's gonna stay like this?

@smile4cs 1 day ago

@@lucidityFX explain what you gain from AI taking over human jobs.

@derFeind 23 hours ago

But can it push directly to master?

@vaolin1703 17 hours ago

Finally someone asking the real questions

@NikolayTrofimov-y5q 7 hours ago

@sanjaycse9608 2 hours ago

😂

@byailen 38 minutes ago

Could you?

@alexaka1 1 day ago

Maybe I'm insane but all I see is a 500% performance increase for 2000% more cost. This is a plateau just zoomed in super close.

@Cat-vs7rc 1 day ago

the cost will decrease over time. do we run our code on 1950s mainframes?

@zeppelin2689 1 day ago

@@Cat-vs7rc yes, we still do

@lystic9392 1 day ago

Yes. But let's say it takes a thousand times that to get to ASI. You think that would be useless or not an amazing breakthrough? We may be able to find ways to get costs down, especially if we can have very intelligent AI help us do it. It's getting to ASI at all, that would be the primary challenge.

@WiseWeeabo 1 day ago

1) The cost will come down. 2) Intelligence stops scaling for a few months and suddenly everyone thinks we've hit a plateau. Just zoom out more.

@astrovation3281 23 hours ago

It's not brute forcing though
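
The percentages that open this thread (@alexaka1's "500% performance increase for 2000% more cost") can be turned into multipliers with a short sketch. It assumes the figures are read literally as percent increases; `multiplier_from_percent_increase` is a hypothetical helper name, and the commenter may have meant 5x and 20x instead.

```python
def multiplier_from_percent_increase(percent):
    """Convert a percent increase into a multiplier, e.g. +100% -> 2.0x."""
    return 1 + percent / 100


performance_multiplier = multiplier_from_percent_increase(500)   # +500% -> 6x
cost_multiplier = multiplier_from_percent_increase(2000)         # +2000% -> 21x

# How much more each unit of performance costs after the change.
cost_per_unit_performance = cost_multiplier / performance_multiplier

print(f"performance: {performance_multiplier:.0f}x")
print(f"cost: {cost_multiplier:.0f}x")
print(f"cost per unit of performance: {cost_per_unit_performance:.1f}x")
```

Under the looser "5x for 20x" reading the ratio would be 4x rather than 3.5x; either way cost grows faster than performance in these figures.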

@rekulator 1 day ago

High compute was 172x low compute, so roughly $3500 per task, totalling $350000

@Scott_Stone 18 hours ago

That's 10 Eastern European devs hired for a year. The value generated, especially with a lower-cost LLM, is staggering in comparison.

@sadscientist9995 12 hours ago

250k

@harounhajem7972 12 hours ago

@@Scott_Stone To solve 10 problems, wow, great investment

@Katatonya 8 hours ago

@@harounhajem7972 I bet you're some Harvard professor with that insanely intelligent comment.
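
The arithmetic behind @rekulator's figures at the top of this thread can be checked with a short sketch. All three inputs are taken from that comment as stated (172x, roughly $3500 per task, $350000 total); the variable names are hypothetical, and the derived values are only what those numbers imply, not official pricing.

```python
# Figures quoted in the comment (approximate, as stated there).
HIGH_TO_LOW_COMPUTE_RATIO = 172
HIGH_COMPUTE_COST_PER_TASK = 3500   # USD
TOTAL_COST = 350_000                # USD

# What the quoted numbers imply.
implied_low_compute_cost_per_task = HIGH_COMPUTE_COST_PER_TASK / HIGH_TO_LOW_COMPUTE_RATIO
implied_task_count = TOTAL_COST / HIGH_COMPUTE_COST_PER_TASK

print(f"implied low-compute cost per task: ${implied_low_compute_cost_per_task:.2f}")
print(f"implied number of tasks: {implied_task_count:.0f}")
```

So the comment's total assumes about 100 tasks at a low-compute price of about $20 each.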

@Zoulz666 1 day ago

No wonder the machine will need to turn us into batteries in the future. They need to power their AI. 😅

@bafhaf 1 day ago

IIRC the original idea was that humans were supposed to be used as computing power, not a battery as that would be inefficient, therefore now it's more of a question: What's even the point of a human? An answer worth 2k I guess.

@pastuh 1 day ago

Mobile phone with spy chips already working..

@kimlau4285 23 hours ago

bro watched too much matrix

@EugeniaLoli 22 hours ago

As someone else replied, the original idea of the directors was that we were used for our consciousness, computing power, not batteries. The Hollywood bosses intervened and asked the writers to change it, because supposedly the viewers won't understand that idea.

@stevesmith4901 17 hours ago

The machines used humans as batteries in the Matrix because humans had scorched the sky. Solar energy is far more abundant than human energy.

@meetit5949 1 day ago

But they spent over $1 million to solve all Arc Prize tasks ($3000+ per task).

@matrinoxtm 1 day ago

It’s crazy it can solve it but certainly not feasible yet. But it feels like it is within reach if they just lower costs and speed it up

@muhammedyaseenkc8769 1 day ago

🤣

@somdudewillson 1 day ago

TBF end-user cost =/= the cost they pay. I'm not 100% sure if the figures provided are end-user cost or actual internal cost.

@ignaciosavi7739 1 day ago

How much do you want to spend to be human? I'm asking because only humans could get a high % on ARC. Until now

@imeakdo7 1 day ago

@@ignaciosavi7739 Being human is cheaper for now. Once that changes is when AI will start to change things

@gearboxworks 1 day ago

Can you just stop with the "changes everything" clickbait titles? 🤨 "Changes everything we thought we knew about AI" would be okay, but "changes everything" is just empirically false and makes you look like a scammer.

@MACD69 1 day ago

Sure it's not going to change things overnight, but where will this lead us in 2 years? A lot of potential for major change

@user-td5gy2fh3p 1 day ago

I agree. I’m so tired of all the BS on YouTube. Soon I will stop watching it because it’s mostly garbage content like this video now.

@Swobsify 23 hours ago

@@MACD69 Then write "potentially major changes" instead....

@redacted.redacted 23 hours ago

🤓☝

@jason_v12345 23 hours ago

I read it as if the scope of the expression is implied

@Thrashmetalman 1 day ago

I work in AI research and no, it is not AGI

@xewi60 22 hours ago

People struggle with the definition of agi so how can you be this categorical

@damianlewis7550 22 hours ago

@@xewi60 it failed on some of the simplest ARC tasks that a 9 year old child can do. AI Explained did a nice piece about how good and still how flawed o3 is.

@DanielTSasserII 21 hours ago

@@Thrashmetalman I work in AI research as well. Most people really don't understand the difference between the two. It seems the general public has a misconception that one can become the other.

@psyjax2 21 hours ago

@@DanielTSasserII A misconception encouraged by the big ai companies.

@iz5808 21 hours ago

@@DanielTSasserII But what do you make of the new model? Does it get us closer to the AGI point? Does o3 have something really good and new in its foundation?

@kelton5020 1 day ago

It's not just the hardware. As soon as the papers for previous GPT models were released, lots of super smart people optimized the process by orders of magnitude. I think we could drop the hardware overhead by a significant amount if OpenAI's models and process were open.

@imeakdo7 1 day ago

But it won't happen. Because they want maximum gain for their investors as required by US law. I also don't think there is much to optimize, since AI researchers at companies like OpenAI mostly implement ideas about AI that have existed for some time

@AlucardNoir 1 day ago

It wouldn't, but you're free to delude yourself. Even if you have the AI, the weights and the training data, you can't just optimize the shit out of it. You need the hardware to run the training again to actually change the weights. As for the open source models of GPT, you might want to look into HOW those hardware requirements were lowered. There's a reason OpenAI isn't doing that, and it's because it limits the AI's capabilities. They could if they wanted to. That's the mini, low, med and high thing in the chart. It's the same with O3's low and high. The ways to "optimize" it to run on more limited hardware are known, not some trade secret limited to "open" models. They're not used because they result in inferior end products. As of now we have entire AI families that can run on anything from your phone to a supercomputer. And guess what? The bigger the model, the better the results. None of these companies will hamper their best models; that's why the other models exist. You're never going to run O3 on your phone. There might be an AI as powerful as O3 that needs a lot fewer resources that you will be able to run on your phone in a few years - maybe months - but it won't be O3. It will be the next big thing. And you can bet that when that happens there will be something above it that uses as many resources as O3 needed to achieve these results, or more.

@kelton5020 1 day ago

@AlucardNoir there were limits before too, but smart people found lots of ways to optimize training and inference significantly. It would be wrong to assume they've fully optimized everything at this point.

@kaldogorath 1 day ago

At this point we can probably ask it to make itself more efficient.

@fate2784 1 day ago

afaik OpenAI hired a lot of those super smart ppl who optimised the models....

@akam9919 1 day ago

Thanks, but I'm still holding my beer. Benchmarks are benchmarks, and we know how easy it is to screw shit up non-maliciously with JS frameworks. I need functionality to work for more than just squares. Also, is this REALLY taking your job? Realistically, no. As to the safety thing... honestly... this is mostly marketing and monopoly protection.

@mr.rabbit5642 9 hours ago

It's taking over the minds of business people who understand jack shit about the technology but are in charge of your job, and that's much worse

@Kyle-w6m 6 hours ago

@@mr.rabbit5642 This. This is honestly what people are not talking about. The reality is it really COULDN'T replace engineers but you have some fuckstick CEO in charge with his 2 years of McKinsey consulting telling him that he can just pay Marge in marketing the same to build the application "cause AI."

@Lemmy4555 16 hours ago

I do not believe a single word from OpenAI until the model is out there and I can try it firsthand. o1 was quite a disappointment

@lolilollolilol7773 3 hours ago

Exactly. Their modus operandi is maximum BS to get press and influencers to hype their product.

@Thial92 15 hours ago

Always remember 90% marketing, 10% function.

@RoyerAdames 1 day ago

What happened to O2? O1 just came out, and now we are talking about O3. I feel like I missed a season or something

@youngreda4410 1 day ago

They skipped o2 to avoid a copyright conflict with the British enterprise

@cam3042 1 day ago

there’s a UK telecoms company called O2 + they’re leaning into having terrible naming conventions, are the reasons they stated in the video

@inorganicphosphate9755 1 day ago

They can't name it o2 because the name is already owned by some UK firm.

@attilakovacs6496 1 day ago

It's copyrighted by a company. They had to skip it to avoid lawsuits.

@gearboxworks 1 day ago

@@youngreda4410 - They should have thought of that before they named it "o1." 😲

@rahulthomas8383 1 day ago

Isn't the only cost of GPU farms electricity? $2000 is still too much per task. The human brain is so efficient that gigawatts of electricity are needed to match it.

@NamedSoni 1 day ago

Exactly, we just need to optimize our brain.

@matten_zero 1 day ago

The amount of time and training needed for human brains to solve these problems is immense.

@imeakdo7 1 day ago

It won't stay that way for long. Investments are pouring into silicon photonics and photonic computing

@diamond_s 1 day ago

Human brain efficiency is greatly exaggerated. Brain does far less compute than many believe but it uses better algorithms. Better algorithms make it far better.

@paradoxalJohn 1 day ago

@@diamond_s Algorithms such as...?

@awsmith1007 1 day ago

You're reading it wrong. The cost per task on the ARC benchmark was ~$5000 per task.

@XNaos 1 day ago

That doesn’t matter, give it a year and it will be as cheap as GPT-4o

@wlockuz4467 1 day ago

@@XNaos What are you even saying? If they could optimize it that well they would've already done it.

@XNaos 1 day ago

@@wlockuz4467 This takes time, the last iteration was just 3 months ago!!!! It isn't as simple as just optimizing it. Their priority is to take the capabilities to their limit, disregarding cost, and afterwards make it economically viable. But they just haven't hit a wall in terms of capability; that's why it is so expensive and not yet optimised. Why bother optimising a model if the next iteration is just 3 months away, with 5 times the performance?

@ominousplatypus380 1 day ago

@@XNaos You're making it sound way more straightforward than it actually is. Why was Sora demoed almost a year ago but it's still prohibitively expensive to use? Why didn't they make it as cheap as 4o yet if it's so easy? Don't get me wrong, I'm very bullish on AI in the mid- to long term but at the moment the cost is a significant issue.

@HCforLife1 1 day ago

@@XNaos I would like to see real-life performance. So far in my professional programming field I still see o1 hallucinating even on simpler tasks. We will need to see where this goes. If AI takes our jobs, this means we hit the era of neo-feudalism. Good luck with that. From that point on AI will take more and more jobs each year, with fewer and fewer jobs being created. Cool times coming - fighting for resources will be literal 😅

@UnFiltered1776 1 day ago

I'm interested to see how photonic processors get integrated at scale. That alone could save a lot on energy costs, especially as _new_ datacenters get populated with a high proportion of hardware dedicated to AI/ML computation.

@GodbornNoven 13 hours ago

The thing with photonics is that although they are efficient, the lasers used to create the light itself consume a large amount of energy. Also, logic gates and their size are a big problem. There's much to do in regards to photonics, but I still believe it is a strong contender for revolutionizing classical computing

@KiffgrasConnaisseur 20 hours ago

If you have to ask if it's "AGI" the answer is no. Also what happened to "let us save the planet?" Guess that's officially off the table now?

@lost4468yt 17 hours ago

Why is the answer no? We benchmark humans still, and we have GI?

@llamerr 16 hours ago

@@lost4468yt Because it doesn't scream "let me out"

@xVancha 13 hours ago

@@llamerr A bit human-centric, no? Maybe an AGI would wonder why we weren't screaming the same...or even, "let me in".

@llamerr 12 hours ago

@@xVancha we actually do scream "let me in" - there are a lot of movies and books about digital immortality. and we do scream "let me out" - all of the science is basically about discovering our place in universe and what universe is. and why we need to know it? because we want out, to become immortal once again. yeah, last one was half a joke, but only half.

@alex-rs6ts 6 hours ago

@@llamerr Why should AI be conscious?

@laztheripper 22 hours ago

"Hey guys, I was wrong" One minute later: "It won't scale like this anymore, it's impossible!" How unaware can you be?

@veryCreativeName0001-zv1ir 19 hours ago

we went from "it's not possible to make" to "it's not possible to scale" in under 2 years man

@Sindoku 23 hours ago

I still don’t think this is a large leap, though it certainly seems bigger than GPT-4 to O1. Sure, it’s doing more complex tasks, but the real world is so much more complex than even the tasks tested. Even real world problems that are seemingly simple are still too much.

@GodbornNoven 13 hours ago

No one cares what you think or what you believe, we care about facts, and the facts are that this is a large improvement over the previous iteration. That's it.

@YouhaBaha 11 hours ago

No matter how much ai improves, you will still be stuck in your safe coping bubble.

@LiamL763 9 hours ago

Is the real world really that much more complex? It's complex in totality, but on an individual basis it often isn’t. In fact an absolute majority of humans aren't even entrusted to perform "complex" duties.

@notactuallyarealperson2267 2 hours ago

@@LiamL763 The real world is absolutely more complex. Even on the individual basis. One may think that the majority of humans can't handle complex tasks, but not dying on a daily basis is fairly complex in and of itself. The amount of continuous data we have to process in realtime (~80ms of delay) is much much more than a discrete context window. Just walking and keeping straight is a complex task with years and years of research behind the recent artificial solutions. Even working walmart is made complex by reality. Planning how to go from stocking shelves, to dealing with a shitty customer, to working a cash register, to driving home are all fairly complex tasks considering reality is in the way. You need to do all that in a timely enough manner to keep your manager happy while trying to carve out downtime for yourself. You could get cut off on the road, the cash register may not always open or it could get stuck in the open position. Someone could come in trying to rob the store. I find a lot of people who share the view that "reality isn't so complex" have not left their comfort bubble in a long long time.

@rjawiygvozd 1 day ago

O5 will be a god that requires human sacrifices to grant wishes

@fatihrime 23 hours ago

😭

@sunnybwaj 3 hours ago

@@fatihrime 🤣

@jarrodhroberson 17 hours ago

It is still just generating output that someone else has already created and replacing some things. When it generates something genuinely new and unique, then post about it with clickbait titles like this; until then, stop with the lies

@liminal27 1 day ago

Go easy on yourself. You weren't "wrong" a few months ago, you're a YouTuber and a web developer who mostly doesn't know what he's talking about, remember?

@iz5808 21 hours ago

Webdevs should have learned by now to know their place

@sunnybwaj 3 hours ago

@@iz5808 Mere commenters should maybe learn their place even more so.

@sunnybwaj 3 hours ago

@@iz5808 On a more serious note, we owned you and your jobs all of these past 20 years. Now AI's gonna own us all. So maybe it's time we all begin to re-think our place. :)

@Slashscreen 1 day ago

I still maintain that we are in the "vacuum tubes and mainframes" era of AI, and we need to rethink how we are using these models. If we do that, in the future we will look back at this time period with horror

@paradoxalJohn 1 day ago

Check out the research people are doing on "wetware" and AI. Look up Cortical Labs.

@pppparanoidddd 1 day ago

Well yeah humans can get high scores on ARC fueled by a banana, not hundreds and thousands of dollars of energy. We have so much space to optimise ahead

@Wppsamsung2024 23 hours ago

Vague ass

@mircorichter1375 22 hours ago

"We" don't do anything. People are diverse and different people want to do different things with it. From Cyberpunk Armageddon, over Porn Robots to Fully automated communism... It will all happen. No need to "we" us

@goldsucc6068 19 hours ago

you still think that you are thinking. what a cope. in fact, you are writing clown comments to look smart

@noah12121 1 day ago

The AI on the graph costs 50000% more than a stem graduate to complete the tasks and even then has an error rate that is 1000% higher than the humans???

@josephvictory9536 1 day ago

This isn't a good takeaway. In 8 years that's gone. I think it's pretty clear this is a watershed moment. We have spent countless billions on fusion and have yet to see a single fusion reactor, even as a proof of concept. While more was certainly spent on AI, it is undoubtedly involved at the root in so many businesses and lines of work due to sheer convenience of workflow. Money will keep flowing to this, tech will continue to advance rapidly and hardware will continue to get both vastly more powerful and cheaper. I had doubts before, even with o1. But consider that o3 is closed source and the world's most brilliant minds have yet to have a go at optimization. We are at the beginning of an era, like getting to see the internet being born, or the first shitty overpriced command line computers with green on black monitors.

@Manwith6secondmemory 1 day ago

Price will drop, we're in the early stages

@RyluRocky 1 day ago

No, the average human scored 64% on the ARC AGI test.

@RishabhSharma-dj7oh 1 day ago

If coping was an olympic sport.

@beace4436 1 day ago

@@RishabhSharma-dj7oh ur indian lol

@justthisguyyouknow666 18 hours ago

To me, OpenAI's "AGI" is the new FSD. It works most of the time, until it really doesn't.

@jamaliseven 1 day ago

It literally shows (Tuned) in the benchmark results. So o3 was tuned to this specific problem. Why wouldn't he mention that in the video?

@vectorhacker-r2 1 day ago

Doesn’t fit the narrative

@RishabhSharma-dj7oh 1 day ago

It wasn't "tuned" for anything. You are thinking of fine-tuning; what "tuning" here refers to is high and low test-time compute.

@greatfate 1 day ago

The fact that it managed to literally reach 2700+ on codeforces is enough to show that this isn’t just memorizing shit. This is the real fucking deal

@ClowdyHowdy 1 day ago

@@vectorhacker-r2 your dumb comment doesn't fit the narrative, because neither of you know what you're talking about.

@aldierygonzalez7249 1 day ago

Yes, but also, that means it is tunable to a topic like this. People are excited because you couldn't teach a fish to do this task; it is nearing spontaneous pattern recognition that is scarily close to the unknown of the human brain

@jambalaya974 11 hours ago

I am HIGHLY suspicious of OpenAI gaming the Arc AGI benchmark.

@echobucket 1 day ago

I dunno. I’m at the point where I don’t trust these “benchmarks” that are created by OpenAI themselves

@thailux6494 19 hours ago

Good thing this isn’t an OpenAI benchmark then. It's from the ARC foundation

@Adam-nw1vy 13 hours ago

@@thailux6494 But in the video they say they're partnering with ARC, so this is sus tbh.

@sevi2949 1 day ago

Theo went from "AI is not going to take our jobs" to "I am really concerned about this" real quick. Thanks Theo!!!

@SayanMitraepicstuff 1 day ago

About time

@NeoKailthas 1 day ago

Yeah at least he has the integrity to admit he was wrong when presented with the evidence. I was pulling my hair out for the last year saying you guys are missing it. You are letting your emotions control your judgement.

@edmonddantes6443 1 day ago

it’s cool he’s actually updating unlike prime

@SayanMitraepicstuff 1 day ago

@@edmonddantes6443 Yeah, not sure why people were convinced by prime and theo on the progress of AI - their arguments were mainly "I don't think it will happen cause I feel like it won't happen." I obviously like their channels - just hard disagree on their AI takes (or pre-o3 AI takes, rather).

@jefferylou3816 1 day ago

@@NeoKailthas Agreed (tho I always held ur opinion)

@tothespace2122 21 hours ago

Don't clickbait like this... Your content is actually good so it doesn't need clickbait titles like these. I am getting more and more annoyed by clickbait from this channel and will just stop watching at one point.

@Sivenruot 1 day ago

The question on ARC is NOT just about the score, but how you achieve it as well. If the "technique" is to memorize all those "new challenges" to brute-force your way in, well, you didn't achieve intelligence whatsoever and it's just marketing bullshit.

@phen-themoogle7651 1 day ago

You can't exactly brute force it since the problems are not the same each time and are completely novel for all tests. Even if they have similar data from millions of ARC-type problems, it doesn't mean they would reach the same solutions. That being said, ARC does have some pattern recognition and limited types of problems, so it's only a very, very small test of intelligence. I would like to see o3 perform IRL, embodied in a humanoid and having to do spatial reasoning blind tests in 3D space. That'll be a much more practical test for our world. You could be right that it's marketing BS, but it could also be a milestone... with many more to come before true AGI.

@Cat-vs7rc 1 day ago

read what it is before commenting. this comment is exactly why people prefer LLMs over humans.

@megasticky8968 23 hours ago

@@Cat-vs7rc You are alone on this 😂, how can you prefer LLMs over humans?

@mircorichter1375 22 hours ago

Arc can't be memorized. Data is private

@mircorichter1375 22 hours ago

@@megasticky8968 No, we are many

@dntwantgglplus 1 day ago

everyone should assume AGI will mean an even bigger wealth gap.

@HCforLife1 1 day ago

Neo feudalism. And what's worse, with the replacement of people in terms of office jobs the world's markets will collapse. This might be the first time where technological revolution will be only taking jobs instead of producing.

@mattburgess5697 23 hours ago

Everything will

@silotx 22 hours ago

AGI will not mean a wealth gap. If the 1% truly has no need for workers, then they will build heavily AI-guarded 1% paradise cities, and outside of them there will be a Mad Max type society.

@michaelbaker2718 21 hours ago

Fortunately, AGI is still very far away.

@veryCreativeName0001-zv1ir 19 hours ago

@@michaelbaker2718 You don't need AGI to put people out of jobs. Simple things like the Boston Dynamics robo dog + a simple image model + a shit ton of sensor arrays have put a few safety inspectors out of work (not all safety inspectors, but a reduction in inspectors)

@GodbornNoven 13 hours ago

I disagreed with you when you said AI isn't gonna keep improving and I made fun of you for it. But I have to say, O3 is not AGI, it is close, just very far away; its ability to generalize to unseen data and grasp patterns is a lot weaker than a human's, so I don't consider it AGI. But as I stated previously, we don't need AGI for AI to be revolutionary and change the very foundations of many fields.

@Aves_1 1 day ago

Why would they blank out the cost of the more expensive version?

@blue-obsidian 1 day ago

that shit will break your bank

@yzhishko 1 day ago

because no one in the whole world would spend so much money to solve a useless problem.

@joeyvonfeldt1979 1 day ago

If they showed that completing a couple tasks that are relatively easy for humans would cost an engineer's monthly salary, it would kill the hype

@jamesarthurkimbell 20 hours ago

It's called O3 because it adds three more zeroes to the price

@KarlHeinz-56 20 hours ago

@@jamesarthurkimbell love it 😄

@SamuelKarani 19 hours ago

Didn't watch the video. Just read the comments. And they invalidate the title of the video completely

@xc13z829 7 hours ago

Come on Theo, you don't need click bait images! Of course it isn't AGI. AI hype is just AI hype. Hard pass.

@alex-rs6ts 6 hours ago

Just hype, until it replaces you

@blubblurb 3 hours ago

@@alex-rs6ts Until then it's hype. If I had listened to AI bros I wouldn't have a job right now. But I still have one and make good money.

@nottheevil 22 hours ago

I mean, as long as they convince the government it's something they need, they'll have that hardware and money

@twisterrjl 1 day ago

do you know what a graphic made by a company about their product means to me? NOTHING

@adam_ie 1 day ago

It's clickbait

@Cat-vs7rc 1 day ago

the arc challenge is from a third party. gpt 3.5 could tell you that.

@ChristianKolbow 1 day ago

It's just like Apple. - 15 times faster - 12 times more efficient - 20 times more nonsensical

@mircorichter1375 22 hours ago

It was tested by the Arc team, not openai

@GodbornNoven 13 hours ago

Alright buddy no one cares

@farmersneed 15 hours ago

I no longer call these technologies AI, but just call them LLMs. They are not worth the untold billions of dollars that are being invested in them. They are a performance improvement tool at best for talented enough individuals that can figure out when the LLM is hallucinating (lying).

@GodbornNoven 13 hours ago

O3 is not an LLM.

@kc12394 9 hours ago

@@GodbornNoven Ok, this is not a helpful answer. What is it then? A recursively prompted LLM? Genuinely curious.

@itsdakideli755 12 hours ago

The goalpost-moving and copium in the comments is insane.

@alex-rs6ts 6 hours ago

"It can't solve poverty and cure cancer? Useless"

@jonasRaymondl 4 hours ago

Ikr lol

@urisinger3412 18 hours ago

AI seems to be really bad at writing any sort of code generator; it always mixes up real variables and generated variables, and a lot of times it just makes stuff up

@ChristopherRucinski 1 day ago

Does the news seem dead about this? The thing I have realized is that my feed on Twitter is fairly empty of o3 news. The first nugget of info I saw on it, I thought was fake because there was no news about it on Twitter. I finally saw a bit more, but the hype seems dead???

@softdevstuff1008 23 hours ago

They know that if they say what the truth really is, it's the same as announcing an alien invasion. People are dumb in general; the developer community is the top 5% of people, with among the best reasoning capabilities. They can reason about it. Normal people? They will just push for "killing the alien".

@Sminelo 15 hours ago

Hype is dead because this is so ridiculously expensive it's totally unusable for mundane tasks, and it's still too stupid for anything more complex that you might want to use an AI for. There's a sweet spot where the task is simple enough for an AI to solve, and boring/mundane enough for a human to want to delegate it. This model is not in that sweet spot, it costs way too much. The model also shows that big improvements apparently require exponentially scaling costs, which is terrible news for the AI business in general. We're also apparently no closer to AGI, because this model doesn't seem to do anything new, it's just better at the tasks previous models could also do. There's also speculation this is just a hail mary from Altman, to throw everything at the wall, cost be damned, in order to generate another headline. With AI enthusiasm waning, the funding is drying up too.

@A2Fyise 17 hours ago

What if the model was trained on these specific puzzles to hype up the market? The ARC-AGI benchmark is a very nuanced test; it's nothing an average human would fail to solve. There are random easy problems that LLMs still fail to solve that humans can

@alex-rs6ts 16 hours ago

It wasn't. There is no reason to lie like this. People will get access to the model soon and figure out how good it is

@funkytaco1358 12 hours ago

OpenAI is falling behind. At this point, I think OpenAI is trying to contract with the DoE public sector.

@virtual5754 1 day ago

Are we entering dark age of technology from 40k lore?

@someguy8443 1 day ago

Ikr

@DisFunctor 1 day ago

I always knew I was born too early

@sunnybwaj 3 hours ago

On the contrary. Truly a golden age of technology, but a dark age for the old, the bleak and the uninspired millions who punch keyboards for a living.

@MrKrzysiek9991 10 hours ago

OMG, this o3 presentation shows how bad people are at reading charts. The chart's x-axis is logarithmic, so the cost is 3200 per task and the whole evaluation cost was > 1 mln USD.

@alex-rs6ts 6 hours ago

And still they promised o3 mini in January and o3 soon after. You don't make that kind of promise without a plan. We will probably see a big drop in costs within the next few months

@MrKrzysiek9991 1 hour ago

@@alex-rs6ts Well, the model was fine-tuned on public data, so it's difficult to compare with other results. Plus, I would not be that sure about the price, as the o1 price has not dropped an inch since the release of o1-preview.

@poornoodle9851 10 сағат бұрын

I sell donuts. How will this help me sell more donuts? If it doesn’t, it’s useless to me. If it causes me to sell less donuts it’s a threat. If it helps me sell the same amount of donuts at a lower cost it may be useful. Which is it?

@sidgillespie5879 11 сағат бұрын

Boring. Stop the hype. Just make the product and we gonna use it and pay our subscription. I'm tired of this Ai hype. I really don't care about the increase of its increasings.

@vavilon7109 10 сағат бұрын

Thank you so much for fixing the janky sound of the original video stream from OpenAI.❤

@yankotliarov9239 22 сағат бұрын

ARC tests are also solved by... a brute-force algorithm. If it takes this much compute, is it actually intelligent or just brute force with extra steps?

@lost4468yt 17 сағат бұрын

You can brute force any computable function by just running the busy beaver function. The model isn't doing either (especially not busy beaver, as that literally grows faster than anything ever).

@joannot6706 17 сағат бұрын

You don't understand what brute force means. Brute force means you can try again and again. Here, if you try and the answer is wrong, you don't get a do-over. So no, the rules of ARC-AGI mean it can't be solved by brute force. Just because you could technically win the Go world championship eventually by playing only random moves, if you hypothetically had a lot of time available, doesn't mean that the game of Go is solved by brute force.

@vasiliychernov2123 16 сағат бұрын

@@joannot6706 But is it better than brute force in terms of cost? If you could redo tasks, would brute force consume a bigger amount of computing power? Is it comparable? Or cheaper?

@cherubin7th 18 сағат бұрын

I would not be surprised if OpenAI manually put that knowledge into the model to beat that benchmark.

@vaolin1703 17 сағат бұрын

There’s no knowledge to put in. The benchmark is different every run. Otherwise it would be very easy to claim the $1 million.

@Justashortcomment 12 сағат бұрын

The actual test data is secret.

@Naz-h8z 16 сағат бұрын

This channel got so boring.
- breaking news in title
- fancy thumbnail
- playing other people's videos
- reading other people's X posts
No value from Theo these days

@lightgaming1142 19 сағат бұрын

AGI Clickbait Cover Found on 21.12.2024

@evlosolve 18 сағат бұрын

If these models were this good at solving real problems and cost effective, we wouldn’t have jobs. We still have jobs.

@orcofnbu 11 сағат бұрын

i do not see any improvement over 4o-mini in terms of algorithm

@iamc24 Күн бұрын

I still think AI is a terrible path to continue down, and I'm not even talking about the potential for AI to revolt. I mean the extreme hardware and power usage and the associated environmental impacts, like carbon emissions. Then, there's all the generative crap that will inevitably be used to fully replace professions and downsize the work force, leading to a further widening wealth gap. In my opinion, the negatives of AI far outweigh any benefit we can gain from it, and it's not even close.

@user-on6uf6om7s Күн бұрын

The energy costs to run Twitter and Facebook are much higher and AI actually produces something of value, unlike those platforms. The advances made by these large models also fuel advance in models which can run locally on your PC. But yes, the job loss is a legitimate concern that needs a proper response.

@Manwith6secondmemory Күн бұрын

Commercial aircraft emit hundreds of thousands of times more co2 annually than all current AI combined

@travistarp7466 Күн бұрын

the number one thing to happen is a widening wealth gap. “you will own nothing and be happy” is exactly where we are headed if people dont wise up

@user-on6uf6om7s Күн бұрын

@@travistarp7466 If the fact that I'm happy is implied in the premise, I'm good with that. Private ownership makes sense for certain things but the main technology, the AI, should be thought of as a public resource that everyone can access, though there are justifiable and less justifiable reasons to distribute that access unequally.

@phen-themoogle7651 Күн бұрын

You are looking at it short-term like 2-5 or maybe 10 years max. When AI reaches AGI-ASI or any stage with intelligence beyond us , it will easily help us solve any energy problem we have and clean up the environment making the Earth like it was thousands or millions of years ago. We could even terraform new planets and live literally anywhere. You are just looking at the issue from a small rock that we live on, look at the billions of shiny things in the sky at night. AI will speed up every aspect of technological progress...cancers, aids, all diseases could be eradicated and millions of people won't have to suffer anymore and can go back to a normal life. You know how many millions of people are suffering that don't need to if we have the technology or knowledge to fix problems? AI will help people that can't help or take care of themselves too, the elderly for example, disabled, until AI comes up with the best cures and treatments that no human doctor could ever fathom. AI could make EVERYTHING millions or billions of times better. You're only looking at a very limited view. Did you know that 70% of people are miserable at their jobs? Sure it sucks that some people might lose their jobs to less creative AI tools, but it could also ENHANCE professionals so that they improve their creations and design, and save them thousands of hours potentially too. There will be more opportunities for growth as humans than ever before.

@prestigealanazi2993 Күн бұрын

Blah blah blah. Are they saying AI can't do abstract zero-shot pattern recognition? Because they're selling computing power instead of quality increases or innovation, and they use buzzwords like AGI for profit when "Open AI" was in their name. Don't get me wrong, AGI is a subfield, but I tend to be on the other side, the one that believes AGI isn't a thing. Even if we achieve AGI, they don't model how the knowledge breaks down (black box).

@RancorSnp 14 сағат бұрын

I'll believe it when I see it. GPT-4 was a significant improvement over 3.5, but 4o was plain weaker than 4, and o1 is better than 4o, but that's not a very high bar to clear. Whether the o1 pro version is really better than o1 I don't know, because ChatGPT is barely worth the $20; the price would imply that the increase in power (if it exists) comes at an unviable cost. So either o3 is not substantially stronger than o1 or its computing costs make it irrelevant. I don't care about numbers and statistics. Until I see o3 and it can actually do things that have a real impact on my workflow, I don't believe any of the hype.

@Arc.M 19 сағат бұрын

**sigh** **eye roll** **head shake** OpenAI talks a lot to barely say anything. Reminds me of apples.

@Gunrun808 21 сағат бұрын

Big respect for this video. How many more leaps will we need before people in these comments understand that their timelines are wildly off? GPT-4-level AI was prohibitively expensive just over a year ago. Now, with the release of the new Gemini model, it is basically free, and much better than GPT-4. Saving on costs was not the goal with this model. You build the capabilities first, and optimise second. Users like you and me are not the primary customer of AI. Industry is. And they can afford the cost and the need for a large facility to house the GPUs. It just has to be cheaper than the collective human wages.

@lost4468yt 17 сағат бұрын

They won't learn. You can point out that 5 years after GPT-2, you can now even train the model on consumer hardware in just an hour. The argument always boils down to "ok but where we are now is as best anything can ever get". People are emotionally invested in AI never hitting human levels. Every time they will try and point out exceptions, and when those are filled they just move the goalposts.

@couchtourist256 Күн бұрын

ever since theo been doing these product review/sponsorships "I actually invested in this blahblah" I have cared less and less

@chasingdaydreams2788 12 сағат бұрын

"Ai isnt going to keep improving" ~ Theo 4 months ago

@samgee500 14 сағат бұрын

Once we have photonic computers that are able to utilize all 64 wavelengths of light simultaneously they will easily be able to make these larger models much cheaper to run. It's just physics. No matter how advanced your computer architecture is, electrons will always be slower and less efficient than photons. It's on the same level as going from vacuum tubes to microprocessors in terms of efficiency.

@GodbornNoven 13 сағат бұрын

Hahaha, there are a lot more than 64 frequencies to utilize. You have thousands, if not potentially millions, of channels simultaneously, so even if you only had around 100000 transistors on a single chip, that would still represent a monumental improvement in computing. I'm excited for the future.

@Justashortcomment 12 сағат бұрын

All 64 wavelengths? :)

@Reachad Күн бұрын

The Curious Case of the Hype Machine.

@matthieu875 Күн бұрын

LLMs reaching AGI is pure cape. Whoever believes this never dug for more than 2 days.

@DanielTSasserII 23 сағат бұрын

They are hitting saturation with their benchmarks so I'm excited to see what the ARC-AGI benchmark can produce.

@emilemil1 17 сағат бұрын

I don't know how these tasks are laid out exactly, but from the examples given it seems to be limited to coloring pixels in a low resolution grid. In other words while the input here is very varied, the actual size of the input is very small and also discrete, and there is a clear correct solution. In other words I don't know if this really translates to real world problem solving performance where the inputs and outputs are much more complex.

@Justashortcomment 12 сағат бұрын

They are contained in JSON files with grids of integer values.
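
For readers who want to see what that looks like, here is a minimal Python sketch. It assumes the layout used by the public ARC repository (a "train" and a "test" list of input/output grid pairs); the tiny task and the `invert` rule below are made up for illustration and are not taken from the benchmark.

```python
import json

# A made-up task in the assumed ARC layout: grids are lists of rows of integers.
TASK_JSON = """
{
  "train": [
    {"input": [[0, 1], [1, 0]], "output": [[1, 0], [0, 1]]},
    {"input": [[1, 1], [0, 0]], "output": [[0, 0], [1, 1]]}
  ],
  "test": [
    {"input": [[0, 0], [1, 1]], "output": [[1, 1], [0, 0]]}
  ]
}
"""


def invert(grid):
    """Hypothetical candidate rule: swap 0 and 1 in every cell."""
    return [[1 - cell for cell in row] for row in grid]


def solves_all(pairs, rule):
    """True if the rule reproduces the expected output for every pair."""
    return all(rule(pair["input"]) == pair["output"] for pair in pairs)


task = json.loads(TASK_JSON)

# A solver infers the rule from the training pairs, then is scored on the test pairs.
fits_training = solves_all(task["train"], invert)
passes_test = solves_all(task["test"], invert)

print("rule fits training pairs:", fits_training)
print("rule passes test pairs:", passes_test)
```

Scoring is an exact match on the whole output grid, so a near miss with one wrong cell counts as a failure.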

@camarotheboss2854 Күн бұрын

"New OpenAI model changes everything!!11!!1!!1" - I heard it few times before xd

@Cat-vs7rc Күн бұрын

And every time it did. Everyone I know uses LLMs in their daily tasks.

@Sammysapphira 9 сағат бұрын

ChatGPT has literally fundamentally changed the entire world. Keep being a boomer, you'll gladly get left behind.

@diadetediotedio6918 6 сағат бұрын

@@Cat-vs7rc They are still using them in their daily tasks, so nothing changed here.

@cmoullasnet Күн бұрын

We may be having a bit of an “over-automation” moment here. Like when automakers decided to try and replace all humans with robots and quickly learned that humans are cheaper and better at some tasks. Can’t say for sure, but I think we are fast approaching this point. Time will tell.

@fburton8 22 сағат бұрын

Mark: "You take each of these yellow squares, you count the number of colored kinda squares there, and you create a border of that width" Greg: "That... that is exactly it" Right... 🤔

@godofwar8262 5 сағат бұрын

AI truth is like the layers of an onion: the more you peel, the more lies you see.

@outscope23 21 сағат бұрын

Another bs hype video, making some naive plebs going wild in comment sections. Thanks

@lost4468yt 17 сағат бұрын

What exactly is wrong with it?

@0xb1sh0p8 Күн бұрын

Didn't watch..but the answer is No, it's not AGI

@blakelmj Күн бұрын

"We're close to reaching AGI" says every researcher ever.

@Cat-vs7rc Күн бұрын

ok. also dont use llms and live like a neanderthal.

@iz5808 21 сағат бұрын

@Cat-vs7rc chill, it must even be able to do your job completely. But it's not AGI yet

@tedarcher9120 18 сағат бұрын

Depends on the definition

@0xb1sh0p8 8 сағат бұрын

@@Cat-vs7rc where did I say i don't use LLMs? I don't need to listen to this guy to know the answer. AGI doesn't exist yet.

@JohnEGledhill Күн бұрын

wrong bro. still expensive. AI gets better but also more expensive. so nothing new here

@EVanDoren Күн бұрын

Don't call it AI

@alex-rs6ts Күн бұрын

Cope

@JohnEGledhill Күн бұрын

@ dope

@xVancha 13 сағат бұрын

Isn't this true of literally everything? Shit, the other day I watched Babish eating a $250 melon. Probably the best melon humanity's ever made, but also more expensive.

@sbowesuk981 20 сағат бұрын

Could be a moon landing moment for AI. That's great in the sense that a critical milestone was reached when no expense was spared. That's not so good because no expense had to be spared to reach that milestone. This announcement shows that AGI is possible, but we're still a long way off actually having access to in-home consumer AGI that's affordable and useful. We'll get there, but by then what will the giant corporations and governments have?

@measureman168 15 сағат бұрын

Agents creating their own tools and dispatching new agents to solve a problem is essentially at least 18 months old, and it tracks pretty well that OpenAI manages to release the model at this time, as that's about the timeline to collect the usage data from those systems and the more advanced ChatGPT sessions. If we define AGI without any energy consumption KPIs, then we're at AGI. If we define AGI including a comparison to the human brain at 15 W, we're not even on the same playing field.

@medooazmi 22 сағат бұрын

Bro, you've got to be kidding me. This releases while I'm playing through Detroit: Become Human, and I am scared right now.

@diamond_s Күн бұрын

The hardware overhang exists; this o3 approach is closer to brute force than to brain-like algorithms. It is likely that with as little as 1 to 10 peta-ops, a superhuman score in real time at low cost is possible.

@iz5808 21 сағат бұрын

I do agree, but it all starts from brute-force approximations. Then the good algorithms and theorems come in.

@jewlouds 21 сағат бұрын

I'm scared of o3 because I'm going to ask it something and it's going to cost me $200

@sshmru 23 сағат бұрын

*early access for marketing reasons to obscure actual performance

@GrebnevNikita 12 сағат бұрын

Is there a test to see if this AGI is a warehouse full of minimal wage workers in India again?

@yzhishko Күн бұрын

Well, well, well. But those tests could be leaked to previous models, because they've been used to score arc-agi previously, and as we all know OpenAI requires input to be in a raw format not encrypted. Am I wrong?

@alex-rs6ts Күн бұрын

Yes, you are wrong. They weren't leaked. The questions are secret.

@yzhishko Күн бұрын

@@alex-rs6ts hmm... The questions are secret, but they become available to OpenAI once you put a question through their API. In addition, there are even publicly available datasets that you can train on and use to come up with thousands more examples that follow similar patterns. Nothing prevents them from using those questions to create a dataset to train on.

@phen-themoogle7651 Күн бұрын

@@yzhishko yes, ARC has a lot of holes potentially, but their model could've still done reasoning to boost consistency for the problems. Like we recognize millions of patterns naturally without realizing we have that data in us too (from just being alive and surviving). Having only thousands of examples is nothing, so the model would have to be insanely smart to reason through novel problems with only limited amounts of data. It could just have enough data to generalize on the rules, but what's impressive is that it has the capabilities to spatially reason accurately where everything is. I tested some previous models (like Claude/Gemini/GPT4 etc) on ARC and they couldn't even recognize what colors the squares were properly, or figure out where things were at. Even if it's a gimmick, the model is still improving in multimodality and several ways for sure.

@RomeTWguy 22 сағат бұрын

Most likely they scraped the ARC benchmark through their API, generated a bunch of similar reasoning chains, and then fine-tuned the model on them.

@yzhishko 18 сағат бұрын

Yes. This is exactly what happened. LLMs need that type of data to be trained on. Otherwise it would not be better than previous models.

@dr.drakeramoray789 21 сағат бұрын

tldw version for the people who need it: no, their new "AI" models arent really groundbreaking, but their marketing is. why people always fall for this i will never understand

@lost4468yt 17 сағат бұрын

In what way is this not groundbreaking?

@dr.drakeramoray789 15 сағат бұрын

@@lost4468yt because they used over 300k $ worth of gear and computing power to solve the same problem ranjeet and mutahar here will do for free while they are taking the shit on the street. this is not only not groundbreaking, its literally useless. even if the all the benchmarks are correct (meaning if you arent falling for the same shit they say every 2 months) and they did reach top 200 programmer level, for that money you could actually hire a top 200 programmer for a year. also sam altman is there so its probably overhyped af

@lost4468yt 15 сағат бұрын

@@dr.drakeramoray789 you realise you can't fake the benchmarks? They're done by an external company, and the questions are private. Not to mention the price will come down like crazy? GPT-2 cost $40k to train 5 years ago, now it's $20 (and this doesn't even implement a ton of newer optimisations). What costs $3k now could easily cost $0.20 in a few years.

@lost4468yt 15 сағат бұрын

@@dr.drakeramoray789 What's your argument here? That this cannot improve from here? $3k now is likely $3 in a few years. Also this test was done independently by a third party, how could they cheat?
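
As a rough check on the numbers in this exchange, here is a minimal Python sketch. It assumes, purely for illustration, that costs keep falling at the constant exponential rate implied by the quoted GPT-2 figures ($40k to $20 over 5 years); nothing in the thread establishes that o3 will follow the same curve, and the function names are hypothetical.

```python
import math


def annual_decline_factor(old_cost, new_cost, years):
    """Constant per-year divisor that takes old_cost to new_cost in `years`."""
    return (old_cost / new_cost) ** (1 / years)


def years_to_reach(start_cost, target_cost, factor_per_year):
    """Years needed to fall from start_cost to target_cost at that rate."""
    return math.log(start_cost / target_cost) / math.log(factor_per_year)


# Quoted in the thread: GPT-2 training cost went from $40k to $20 in 5 years.
gpt2_factor = annual_decline_factor(40_000, 20, 5)

# Quoted in the thread: $3k per task falling to $3, or to $0.20.
years_to_3_dollars = years_to_reach(3_000, 3, gpt2_factor)
years_to_20_cents = years_to_reach(3_000, 0.20, gpt2_factor)

print(f"implied yearly cost reduction: {gpt2_factor:.2f}x")
print(f"years for $3k -> $3: {years_to_3_dollars:.1f}")
print(f"years for $3k -> $0.20: {years_to_20_cents:.1f}")
```

Under that assumption the implied rate is roughly 4.6x cheaper per year, which puts $3 per task about four and a half years out and $0.20 a bit over six years out.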

@nil7118 19 сағат бұрын

My dude. Keep out of the AI field, you don't know anything about it and are being dunked on, in every sub. Do you even know what you're talking about lol?

@xVancha 13 сағат бұрын

Sub?

@ianstuart341 18 сағат бұрын

I would expect that the dollar-per-task figure has been calculated to include some small portion of the total training cost. Surely when arriving at a retail cost they have to consider the initial training costs and expected total usage.

@calholli Күн бұрын

I hope you at least got 5 figures for this little ad

@UN7X Күн бұрын

What makes you say that?

@ernestomotta5178 Күн бұрын

@@UN7X the fact that they bought the fucking Nobel Prizes

@vytah 20 сағат бұрын

That'd be, like, three o3 prompts

@michalthemichal3550 18 сағат бұрын

@@vytah LOL

@Justashortcomment 12 сағат бұрын

It seems that Gemini 2 is also an absolute game changer as a multimodal digital assistant. I don’t think it’s anywhere near this ballpark as a reasoner, but its vision (video streaming) capabilities are surprisingly powerful.

@rpc4626 Күн бұрын

I remember when they announced sora.... Then they released it. Sam Altman is as trustworthy as an announcement of the US about any country having mass destruction weapons. AGI is really close... In geological terms, about 500 or 700 years, but we will get there eventually I guess.

@jacobdalamb Күн бұрын

you made no sense

@rpc4626 Күн бұрын

@@jacobdalamb Tangamandapio!

@rizizum Күн бұрын

Electronic computers have existed for like 60 years, and the industrial revolution has been going on for about 300 years. What makes you think 500 years is a good estimate?

@Rust_Rust_Rust Күн бұрын

Really close. 500 to 700 years. Pick one...

@emperorpalpatine6080 Күн бұрын

@@jacobdalamb he doesn't need to make sense . He swapped his brain with an LLM running on an arduino

@korozsitamas Күн бұрын

With all these compute needs, will pro users get unlimited access to o3 low, or will you need a new $2000 subscription tier for that? Hopefully it's either unlimited or a generous limit (more than 50 per week :-)).

@joeyvonfeldt1979 Күн бұрын

Man, if 1 task costs $20 of compute I don't think pro will get you access at all

@HCforLife1 Күн бұрын

Probably more likely to be $10k per month if greatly optimized.

@Ohmriginal722 Күн бұрын

I say we're still where we were. If AGI is soon to exist but is extremely hardware-limited even for basic tasks, even on big corporate supermachines, then we won't have AGI, not really. We're still waiting on new algorithms, developed through contests like ARC-AGI, to create the future we're looking for.

@phen-themoogle7651 Күн бұрын

Yeah, but the wait might not be too long. They could develop dozens of benchmarks within 1-6 months, and all of them could get crushed within 6 months to 2 years. We might not even need some of them to get crushed and the goal post is moved further and they are more like ASI benchmarks than AGI at some point too. Also could reach AGI before we have the benchmarks to measure it properly, and it's already here by 2026, who knows. I just know that we are still making serious progress. 88% of way to AGI by Dr. Alan Thompsons conservative AGI meter now. Went up 4% after o3 was announced. On average goes up 1% a month so it's probably 12 months away. December usually goes up 2 to 5%+ too even if some slower months before it, noticed a pattern the past couple years with technology releases during December. I would bet on some baby AGI system before 2026

@Ohmriginal722 Күн бұрын

@ I doubt it, the algorithmic changes necessary I’m betting are still 5 years out at least. We won’t have baby AGI before 2030. I’m of the opinion that most experts are wildly underestimating the problem still and Francois and yann have more accurate estimates

@Ymirheim 13 сағат бұрын

Still don't know what the test is cause I skipped over the part where you let the people selling the thing explain to us why the thing is great. Guess my ad-blocker missed covering that part.

@isaac10231 17 сағат бұрын

Incredibly impressive.

@sigmaking-r4b 16 сағат бұрын

bruh we will go to work in macdonalds why u happy☠

@MrHarry37 20 сағат бұрын

Pretty sure it's more terrifying, given that we aren't even close to knowing how to control those things, and they seem to get smarter super fast

@huntermacias2023 18 сағат бұрын

mark is so excited hahah he's like "yeah I actually looked at the slides before this meeting"

@Trials_By_Errors Күн бұрын

Sammy Boy keep Pumping Company. But in the end this bubble is going to pop.

@Estonado Күн бұрын

This is ai not crypto, the development here has real world changes (major ones) , they'll very likely outrun the pump and create agi before the dump period, compare it with Google instead of something like ftx

@winfredj9820 Күн бұрын

still better than your useless startup sid

@gr8b8m85 Күн бұрын

That's like saying the 'computing bubble' is going to pop. You have no idea what you're talking about.

@rozburg Күн бұрын

Wishful thinking

@Thamium 21 сағат бұрын

I’ve been hearing that for years at this point lol

@aeronwolfe7072 22 сағат бұрын

the compute and power wasted on these models... wish someone would hurry up and get those optical processors working... or something better

@WayOfTheCode 16 сағат бұрын

This test is actually a great test of human intelligence. The main differentiating factor for human intelligence is abstraction and composition. And the ARC test is a pretty good measure of whether AI is able to do it.