The use of the logarithmic scale on the graph looks intentionally confusing, making it look at first glance like there is some sort of linear progress, while there is actually a 100x cost increase for a 3x result.
@iz580821 сағат бұрын
Yes, but the graph looks juicy for non-savvy people
@petargolubovic530021 сағат бұрын
Actually it's even worse... a 1000x cost increase compared to o1. The roughly 100x increase is relative to the low-compute o3 model.
@boccobadz21 сағат бұрын
Grifters gonna grift. Altman is desperate because Google Veo 2 destroyed their Sora and Google does AI as a side gig lol
@markuscwatson20 сағат бұрын
Log scale is pretty standard
@wlockhart20 сағат бұрын
@@markuscwatson Yes it is, it's standard for displaying an exponentially growing quantity. Unfortunately in this case it's the cost per compute, not compute per cost.
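To make the log-scale point in this thread concrete, here is a minimal sketch. Only the 100x multiplier comes from the comments above; the $20 baseline cost is a made-up illustrative value and `log_axis_position` is a hypothetical helper. It shows that a 100x cost jump takes up just two tick marks on a base-10 axis, while the same jump on a linear axis is enormous.

```python
import math


def log_axis_position(cost: float) -> float:
    """Return where a cost value lands on a base-10 logarithmic axis."""
    return math.log10(cost)


# Illustrative value only (assumption): not taken from the chart.
baseline_cost = 20.0
# The "100x cost increase" mentioned in the comment.
expensive_cost = baseline_cost * 100

# Distance between the two points on each kind of axis.
gap_on_log_axis = log_axis_position(expensive_cost) - log_axis_position(baseline_cost)
gap_on_linear_axis = expensive_cost - baseline_cost

print(f"log-axis gap: {gap_on_log_axis:.1f} decades")
print(f"linear-axis gap: {gap_on_linear_axis:.0f} dollars")
```

On the log axis the two points sit two decades apart regardless of the baseline, which is why the chart can read as steady progress at first glance.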
@hicoopКүн бұрын
Cryptobros and ai enthusiasts competing for the biggest carbon footprint
@winfredj9820Күн бұрын
both destroying everything
@attilakovacs6496Күн бұрын
Right... because we were so damn environment friendly until now.
@excessivelysalty_81Күн бұрын
@@attilakovacs6496 You're not wrong, but now we're just speedrunning it.
@kwinzmanКүн бұрын
I wanted to reply to you but youtube keeps shadowbanning my comments. Discourse is not possible on this platform.
@MajorasWrath1Күн бұрын
@@kwinzman don't say stupid things then
@wilfredomatute7697Күн бұрын
Are you saying: it's cheaper to just pay a developer to do the task?
@alex-rs6tsКүн бұрын
For now
@wlockuz4467Күн бұрын
Exactly, haha
@SimonNgai-d3uКүн бұрын
Certainly as of Dec 2024 💀💀 But not for long 💀💀
@lucidityFXКүн бұрын
You think it's gonna stay like this?
@smile4csКүн бұрын
@@lucidityFX explain what you gain from AI taking over human jobs.
@derFeind23 сағат бұрын
But can it push directly to master?
@vaolin170317 сағат бұрын
Finally someone asking the real questions
@NikolayTrofimov-y5q7 сағат бұрын
XD
@sanjaycse96082 сағат бұрын
😂
@byailen38 минут бұрын
Could you?
@alexaka1Күн бұрын
Maybe I'm insane but all I see is 500% performance increase for 2000% more cost. This is a plateau just zoomed in super close.
@Cat-vs7rcКүн бұрын
the cost will decrease over time. do we run our code on 1950s mainframes?
@zeppelin2689Күн бұрын
@@Cat-vs7rc yes, we still do
@lystic9392Күн бұрын
Yes. But let's say it takes a thousand times that to get to ASI. You think that would be useless or not an amazing breakthrough? We may be able to find ways to get costs down, especially if we can have very intelligent AI help us do it. It's getting to ASI at all, that would be the primary challenge.
@WiseWeeaboКүн бұрын
1) The cost will come down. 2) Intelligence stops scaling for a few months and suddenly everyone thinks we've hit a plateau. Just zoom out more.
@astrovation328123 сағат бұрын
It's not brute forcing though
@rekulatorКүн бұрын
High compute was 172x low compute, so roughly $3500 per task, totalling $350000
@Scott_Stone18 сағат бұрын
That's 10 Eastern European devs hired for a year. The value generated, especially with lower-cost LLM is staggering in comparison.
@sadscientist999512 сағат бұрын
250k
@harounhajem797212 сағат бұрын
@@Scott_Stone To solve 10 problems, wow, great investment
@Katatonya8 сағат бұрын
@@harounhajem7972 I bet you're some Harvard professor with that insanely intelligent comment.
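The arithmetic behind the estimate in this thread can be sketched as follows. Only the 172x ratio comes from the comment; the $20 low-compute cost per task and the count of 100 tasks are assumptions back-derived from the commenter's rounded figures ($3500 and $350000), not confirmed numbers.

```python
# Ratio of high-compute to low-compute cost, as stated in the comment.
HIGH_TO_LOW_COMPUTE_RATIO = 172

# Assumptions (not stated in the comment): back-derived from its rounded totals.
ASSUMED_LOW_COST_PER_TASK_USD = 20.0
ASSUMED_TASK_COUNT = 100


def high_compute_cost_per_task(low_cost_usd: float, ratio: int) -> float:
    """Scale the low-compute per-task cost by the compute ratio."""
    return low_cost_usd * ratio


per_task = high_compute_cost_per_task(
    ASSUMED_LOW_COST_PER_TASK_USD, HIGH_TO_LOW_COMPUTE_RATIO
)
total = per_task * ASSUMED_TASK_COUNT

print(f"per task: ${per_task:,.0f}")  # $3,440, which the comment rounds to $3500
print(f"total:    ${total:,.0f}")     # $344,000, which the comment rounds to $350000
```

Under these assumptions the commenter's figures are a rounding-up of about 2%; a different per-task baseline would shift both numbers proportionally.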
@Zoulz666Күн бұрын
No wonder the machine will need to turn us into batteries in the future. They need to power their AI. 😅
@bafhafКүн бұрын
IIRC the original idea was that humans were supposed to be used as computing power, not a battery as that would be inefficient, therefore now it's more of a question: What's even the point of a human? An answer worth 2k I guess.
@pastuhКүн бұрын
Mobile phone with spy chips already working..
@kimlau428523 сағат бұрын
bro watched too much matrix
@EugeniaLoli22 сағат бұрын
As someone else replied, the original idea of the directors was that we were used for our consciousness, computing power, not batteries. The Hollywood bosses intervened and asked the writers to change it, because supposedly the viewers won't understand that idea.
@stevesmith490117 сағат бұрын
The machines used humans as batteries in the Matrix because humans had scorched the sky. Solar energy is far more abundant than human energy.
@meetit5949Күн бұрын
But they spend over $1 million to solve all Arc Prize tasks ($3000+ per task).
@matrinoxtmКүн бұрын
It’s crazy it can solve it but certainly not feasible yet. But it feels like it is within reach if they just lower costs and speed it up
@muhammedyaseenkc8769Күн бұрын
🤣
@somdudewillsonКүн бұрын
TBF end-user cost =/= the cost they pay. I'm not 100% sure if the figures provided are end-user cost or actual internal cost.
@ignaciosavi7739Күн бұрын
How much do you want to spend to be human? I'm asking because only humans could get a high % on ARC. Until now.
@imeakdo7Күн бұрын
@@ignaciosavi7739 being human is cheaper for now. Once that changes is when AI will start to change things
@gearboxworksКүн бұрын
Can you just stop with the "changes everything" clickbait titles? 🤨 'Changes everything we thought we knew about AI" would be okay, but "changes everything" is just empirically false and makes you look like a scammer.
@MACD69Күн бұрын
Sure it's not going to change things overnight, but where will this lead us in 2 years? A lot of potential for major change
@user-td5gy2fh3pКүн бұрын
I agree. I’m so tired of all the BS on YouTube. Soon I will stop watching it because it’s mostly garbage content like this video now.
I read it as if the scope of the expression is implied
@ThrashmetalmanКүн бұрын
I work in AI research and no. it is not AGI
@xewi6022 сағат бұрын
People struggle with the definition of agi so how can you be this categorical
@damianlewis755022 сағат бұрын
@@xewi60 it failed on some of the simplest ARC tasks that a 9 year old child can do. AI Explained did a nice piece about how good and still how flawed o3 is.
@DanielTSasserII21 сағат бұрын
@@Thrashmetalman I work in AI research as well. Most people really don't understand the difference between the two. It seems the general public has a misconception that one can become the other.
@psyjax221 сағат бұрын
@@DanielTSasserII A misconception encouraged by the big ai companies.
@iz580821 сағат бұрын
@@DanielTSasserII but what do you make of the new model? Does it get us closer to the AGI point? Does o3 have something really good and new in its foundation?
@kelton5020Күн бұрын
It's not just the hardware. As soon as the papers for previous GPT models were released, lots of super smart people optimized the process by orders of magnitude. I think we could drop the hardware overhead by a significant amount if OpenAI's models and process were open.
@imeakdo7Күн бұрын
But it won't happen, because they want maximum gain for their investors, as required by US law. I also don't think there is much to optimize, since AI researchers at companies like OpenAI mostly just implement ideas about AI that have existed for some time.
@AlucardNoirКүн бұрын
It wouldn't, but you're free to delude yourself. Even if you have the AI, the weights and the training data, you can't just optimize the shit out of it. You need the hardware to run the training again to actually change the weights. As for the open source models of GPT, you might want to look into HOW those hardware requirements were lowered. There's a reason OpenAI isn't doing that, and it's because it limits the AI's capabilities. They could if they wanted to. That's the mini, low, med and high thing in the chart. It's the same with O3's low and high. The ways to "optimize" it to run on more limited hardware are known, not some trade secret limited to "open" models. They're not used because they result in inferior end products. As of now we have entire AI families that can run on anything from your phone to a supercomputer. And guess what? The bigger the model, the better the results. None of these companies will hamper their best models; that's why the other models exist. You're never going to run O3 on your phone. There might be an AI as powerful as O3 that needs a lot fewer resources that you will be able to run on your phone in a few years - maybe months - but it won't be O3. It will be the next big thing. And you can bet that when that happens there will be something above it that uses as many resources as O3 needed to achieve these results, or more.
@kelton5020Күн бұрын
@AlucardNoir there were limits before too, but smart people found lots of ways to optimize training and inference significantly. It would be wrong to assume they've fully optimized everything at this point.
@kaldogorathКүн бұрын
At this point we can probably ask it to make itself more efficient.
@fate2784Күн бұрын
afaik open ai hired a lot of those super smart ppl who optimised the models....
@akam9919Күн бұрын
Thanks but I'm still holding my beer. Benchmarks are benchmarks, and we know how easy it is to screw shit up non-maliciously with JS frameworks. I need functionality to work for more than just squares. Also, is this REALLY taking your job? Realistically no. As to the safety thing... honestly... this is mostly marketing and monopoly protection.
@mr.rabbit56429 сағат бұрын
It's taking over the minds of business people who understand jack shit about the technology but are in charge of your job, and that's much worse
@Kyle-w6m6 сағат бұрын
@@mr.rabbit5642 This. This is honestly what people are not talking about. The reality is it really COULDN'T replace engineers but you have some fuckstick CEO in charge with his 2 years of McKinsey consulting telling him that he can just pay Marge in marketing the same to build the application "cause AI."
@Lemmy455516 сағат бұрын
I do not believe a single word from OpenAI until the model is out there and I can try it firsthand. o1 was quite a disappointment
@lolilollolilol77733 сағат бұрын
Exactly. Their modus operandi is maximum BS to get press and influencers to hype their product.
@Thial9215 сағат бұрын
Always remember 90% marketing, 10% function.
@RoyerAdamesКүн бұрын
What happened to o2? o1 just came out, and now we are talking about o3. I feel like I missed a season or something
@youngreda4410Күн бұрын
they skipped o2 to avoid copyright issues with the British company
@cam3042Күн бұрын
there’s a UK telecoms company called O2 + they’re leaning into having terrible naming conventions, are the reasons they stated in the video
@inorganicphosphate9755Күн бұрын
They can't name it o2 because the name is already owned by some UK firm.
@attilakovacs6496Күн бұрын
It's copyrighted by a company. They had to skip it to avoid lawsuits.
@gearboxworksКүн бұрын
@@youngreda4410- They should have thought of that before they named it "o1." 😲
@rahulthomas8383Күн бұрын
Isn't the only cost of GPU farms electricity? $2000 is still too much per task. The human brain is so efficient that gigawatts of electricity are needed to match it.
@NamedSoniКүн бұрын
Exactly, we just need to optimize our brain.
@matten_zeroКүн бұрын
The amount of time and training needed for human brains to solve these problems is immense.
@imeakdo7Күн бұрын
It won't stay that way for long. Investments are pouring into silicon photonics and photonic computing
@diamond_sКүн бұрын
Human brain efficiency is greatly exaggerated. Brain does far less compute than many believe but it uses better algorithms. Better algorithms make it far better.
@paradoxalJohnКүн бұрын
@@diamond_s Algorithms such as...?
@awsmith1007Күн бұрын
You're reading it wrong. The cost per task on the ARC benchmark was ~$5000 per task.
@XNaosКүн бұрын
That doesn’t matter, give it a year and it is as cheap as Gpt4-o
@wlockuz4467Күн бұрын
@@XNaos What are you even saying? If they could optimize it that well they would've already done it.
@XNaosКүн бұрын
@@wlockuz4467 This takes time, the last iteration was just 3 months ago!!!! It isn't as simple as just optimizing it. Their priority is to take the capabilities to their limit, disregarding cost, and afterwards make it economically viable. They just haven't hit a wall in terms of capability; that's why it is so expensive and not yet optimised. Why bother optimising a model if the next iteration is just 3 months away, with 5 times the performance?
@ominousplatypus380Күн бұрын
@@XNaos You're making it sound way more straightforward than it actually is. Why was Sora demoed almost a year ago but it's still prohibitively expensive to use? Why didn't they make it as cheap as 4o yet if it's so easy? Don't get me wrong, I'm very bullish on AI in the mid- to long term but at the moment the cost is a significant issue.
@HCforLife1Күн бұрын
@@XNaos I would like to see real-life performance. So far in my professional programming field I still see o1 hallucinating even on simpler tasks. We will need to see where this goes. If AI takes our jobs, this means we hit the era of neo-feudalism. Good luck with that. From that point on AI will take more and more jobs each year, with fewer and fewer jobs being created. Cool times coming - fighting for resources will be literal 😅
@UnFiltered1776Күн бұрын
I'm interested to see how photonic processors get integrated at scale. That alone could save a lot on energy costs, especially as _new_ datacenters get populated with a high proportion of hardware dedicated to AI/ML computation.
@GodbornNoven13 сағат бұрын
The thing with photonics is that although they are efficient, the lasers used to create the light itself consume a large amount of energy. Also, logic gates and their size are a big problem. There's much to do in regards to photonics, but I still believe it is a strong contender for revolutionizing classical computing
@KiffgrasConnaisseur20 сағат бұрын
If you have to ask if it's "AGI" the answer is no. Also what happened to "let us save the planet?" Guess that's officially off the table now?
@lost4468yt17 сағат бұрын
Why is the answer no? We benchmark humans still, and we have GI?
@llamerr16 сағат бұрын
@@lost4468yt Because it doesn't scream "let me out"
@xVancha13 сағат бұрын
@@llamerr A bit human-centric, no? Maybe an AGI would wonder why we weren't screaming the same...or even, "let me in".
@llamerr12 сағат бұрын
@@xVancha we actually do scream "let me in" - there are a lot of movies and books about digital immortality. and we do scream "let me out" - all of the science is basically about discovering our place in universe and what universe is. and why we need to know it? because we want out, to become immortal once again. yeah, last one was half a joke, but only half.
@alex-rs6ts6 сағат бұрын
@@llamerr Why should AI be concious?
@laztheripper22 сағат бұрын
"Hey guys, I was wrong" One minute later: "It won't scale like this anymore, it's impossible!" How unaware can you be?
@veryCreativeName0001-zv1ir19 сағат бұрын
we went from "it's not possible to make" to "it's not possible to scale" in under 2 years man
@Sindoku23 сағат бұрын
I still don’t think this is a large leap, though it certainly seems bigger than GPT-4 to O1. Sure, it’s doing more complex tasks, but the real world is so much more complex than even the tasks tested. Even real world problems that are seemingly simple are still too much.
@GodbornNoven13 сағат бұрын
No one cares what you think or what you believe, we care about facts, and the facts are that this is a large improvement over the previous iteration. That's it.
@YouhaBaha11 сағат бұрын
No matter how much ai improves, you will still be stuck in your safe coping bubble.
@LiamL7639 сағат бұрын
Is the real world really that much more complex? It's complex in totality but on an individual basis, it often isn’t. In fact an absolute majority of humans aren't even entrusted to perform "complex" duties.
@notactuallyarealperson22672 сағат бұрын
@@LiamL763 The real world is absolutely more complex. Even on the individual basis. One may think that the majority of humans can't handle complex tasks, but not dying on a daily basis is fairly complex in and of itself. The amount of continuous data we have to process in realtime (~80ms of delay) is much much more than a discrete context window. Just walking and keeping straight is a complex task with years and years of research behind the recent artificial solutions. Even working walmart is made complex by reality. Planning how to go from stocking shelves, to dealing with a shitty customer, to working a cash register, to driving home are all fairly complex tasks considering reality is in the way. You need to do all that in a timely enough manner to keep your manager happy while trying to carve out downtime for yourself. You could get cut off on the road, the cash register may not always open or it could get stuck in the open position. Someone could come in trying to rob the store. I find a lot of people who share the view that "reality isn't so complex" have not left their comfort bubble in a long long time.
@rjawiygvozdКүн бұрын
O5 will be a god that requires human sacrifices to grant wishes
@fatihrime23 сағат бұрын
😭
@sunnybwaj3 сағат бұрын
@@fatihrime 🤣
@jarrodhroberson17 сағат бұрын
It is still just generating output that someone else has already created and replacing some things. When it generates something genuinely new and unique, then post about it with clickbait titles like this. Until then, stop with the lies.
@liminal27Күн бұрын
Go easy on yourself. You weren't "wrong" a few months ago, you're a YouTuber and a Web developer who mostly doesn't know what he's talking about, remember?
@iz580821 сағат бұрын
Webdevs should have learned by now to know their place
@sunnybwaj3 сағат бұрын
@@iz5808 Mere commentors should maybe learn their place even more so.
@sunnybwaj3 сағат бұрын
@@iz5808 On a more serious note, we owned you and your jobs all of these past 20 years. Now AI's gonna own us all. So maybe it's time we all begin to rethink our place. :)
@SlashscreenКүн бұрын
I still maintain that we are in the "vacuum tubes and mainframes" era of AI, and we need to rethink how we are using these models. If we do that, in the future we will look back at this time period with horror
@paradoxalJohnКүн бұрын
Check out the research people are doing on "wetware" and AI. Look up Cortical Labs.
@pppparanoiddddКүн бұрын
Well yeah humans can get high scores on ARC fueled by a banana, not hundreds and thousands of dollars of energy. We have so much space to optimise ahead
@Wppsamsung202423 сағат бұрын
Vague ass
@mircorichter137522 сағат бұрын
"We" don't do anything. People are diverse and different people want to do different things with it. From Cyberpunk Armageddon, over Porn Robots to Fully automated communism... It will all happen. No need to "we" us
@goldsucc606819 сағат бұрын
you still think that you are thinking. what a cope. in fact, you are writing clown comments to look smart
@noah12121Күн бұрын
The AI on the graph costs 50000% more than a stem graduate to complete the tasks and even then has an error rate that is 1000% higher than the humans???
@josephvictory9536Күн бұрын
This isn't a good takeaway. In 8 years that's gone. I think it's pretty clear this is a watershed moment. We have spent countless billions on fusion and have yet to see a single fusion reactor, even as a proof of concept. While more was certainly spent on AI, it is undoubtedly involved at the root in so many businesses and lines of work due to sheer convenience of workflow. Money will keep flowing to this, tech will continue to advance rapidly and hardware will continue to get both vastly more powerful and cheaper. I had doubts before, even with o1. But consider that o3 is closed source and the world's most brilliant minds have yet to have a go at optimization. We are at the beginning of an era, like getting to see the internet being born, or the first shitty overpriced command line computers with green on black monitors.
@Manwith6secondmemoryКүн бұрын
Price will drop, we're in the early stages
@RyluRockyКүн бұрын
No, the average human scored 64% on the ARC AGI test.
@RishabhSharma-dj7ohКүн бұрын
If coping was an olympic sport.
@beace4436Күн бұрын
@@RishabhSharma-dj7oh ur indian lol
@justthisguyyouknow66618 сағат бұрын
To me, OpenAI's "AGI" is the new FSD. It works most of the time, until it really doesn't.
@jamalisevenКүн бұрын
It literally shows (Tuned) in the benchmark results. So o3 was tuned to this specific problem. Why wouldn't he mention that in the video?
@vectorhacker-r2Күн бұрын
Doesn’t fit the narrative
@RishabhSharma-dj7ohКүн бұрын
It wasn't "tuned" in the sense you are thinking of (fine-tuning); "tuning" here refers to high and low test-time compute.
@greatfateКүн бұрын
The fact that it managed to literally reach 2700+ on codeforces is enough to show that this isn’t just memorizing shit. This is the real fucking deal
@ClowdyHowdyКүн бұрын
@@vectorhacker-r2 your dumb comment doesn't fit the narrative, because neither of you know what you're talking about.
@aldierygonzalez7249Күн бұрын
Yes, but also, that means it is tunable to a topic like this. People are excited because you couldn't teach a fish to do this task; it is nearing spontaneous pattern recognition that is scarily close to the unknown of the human brain
@jambalaya97411 сағат бұрын
I am HIGHLY suspicious of OpenAI gaming the Arc AGI benchmark.
@echobucketКүн бұрын
I dunno. I’m at the point where I don’t trust these “benchmarks” that are created by OpenAI themselves
@thailux649419 сағат бұрын
Good thing this isn’t an OpenAI benchmark then. But of the ARC foundation
@Adam-nw1vy13 сағат бұрын
@@thailux6494 But in the video they say they're partnering with ARC, so this is sus tbh.
@sevi2949Күн бұрын
Theo went from Ai is not going to take our jobs, to "I am really concerned about this" real quick. Thanks Theo!!!
@SayanMitraepicstuffКүн бұрын
About time
@NeoKailthasКүн бұрын
Yeah at least he has the integrity to admit he was wrong when presented with the evidence. I was pulling my hair out for the last year saying you guys are missing it. You are letting your emotions control your judgement.
@edmonddantes6443Күн бұрын
it’s cool he’s actually updating unlike prime
@SayanMitraepicstuffКүн бұрын
@@edmonddantes6443 Yeah, not sure why people were convinced by prime and theo on the progress of AI - their arguments were mainly "I don't think it will happen cause I feel like it won't happen". I obviously like their channels - just hard disagree on their AI takes (or pre-o3 AI takes, rather).
@jefferylou3816Күн бұрын
@@NeoKailthas Agreed (tho I always held ur opinion)
@tothespace212221 сағат бұрын
Don't clickbait like this... Your content is actually good so it doesn't need clickbait titles like these. I am getting more and more annoyed by clickbait from this channel and will just stop watching at one point.
@SivenruotКүн бұрын
The question on ARC is NOT just about the score, but how you achieve it as well. If the "technique" is to memorize all those "new challenges" to brute-force your way in, well, you didn't achieve intelligence whatsoever and it's just marketing bullshit.
@phen-themoogle7651Күн бұрын
You can't exactly brute force it since the problems are not the same each time and completely novel for all tests. Even if they have similar data from millions of ARC-type problems it doesn't mean they would reach the same solutions. But that being said, ARC does have some pattern recognition and limited types of problems, so it's only a very very small test of intelligence. Would like to see o3 perform IRL embodied in a humanoid that has to do spatial reasoning blind tests in 3D space. That'll be a much more practical test for our world. You could be right that it's marketing BS, but it could also be a milestone... and many more to come before true AGI.
@Cat-vs7rcКүн бұрын
read what it is before commenting. this comment is exactly why people prefer LLMs over humans.
@megasticky896823 сағат бұрын
@@Cat-vs7rc you are alone on this 😂, how can you prefer LLMs over humans
@mircorichter137522 сағат бұрын
Arc can't be memorized. Data is private
@mircorichter137522 сағат бұрын
@@megasticky8968 no we are many
@dntwantgglplusКүн бұрын
everyone should assume AGI will mean an even bigger wealth gap.
@HCforLife1Күн бұрын
Neo feudalism. And what's worse, with the replacement of people in terms of office jobs the world's markets will collapse. This might be the first time where technological revolution will be only taking jobs instead of producing.
@mattburgess569723 сағат бұрын
Everything will
@silotx22 сағат бұрын
AGI will not mean a wealth gap. If the 1% truly has no need for workers, then they will build heavily AI-guarded 1% paradise cities, and outside of them there will be a Mad Max type society.
@michaelbaker271821 сағат бұрын
Fortunately, AGI is still very far away.
@veryCreativeName0001-zv1ir19 сағат бұрын
@@michaelbaker2718 you don't need AGI to put people out of jobs. Simple things like the Boston Dynamics robo dog + a simple image model + a shit ton of sensor arrays have put a few safety inspectors out of work (not all safety inspectors, but a reduction in inspectors)
@GodbornNoven13 сағат бұрын
I disagreed with you when you said AI isn't gonna keep improving and I made fun of you for it. But I have to say, O3 is not AGI; it is close, just very far away. Its ability to generalize to unseen data and grasp patterns is a lot weaker than a human's, so I don't consider it AGI. But as I stated previously, we don't need AGI for AI to be revolutionary and change the very foundations of many fields.
@Aves_1Күн бұрын
Why would they blank out the cost of the more expensive version?
@blue-obsidianКүн бұрын
that shit will break your bank
@yzhishkoКүн бұрын
because no one in the whole world would spend so much money to solve a useless problem.
@joeyvonfeldt1979Күн бұрын
If they showed that completing a couple tasks that are relatively easy for humans would cost an engineer's monthly salary, it would kill the hype
@jamesarthurkimbell20 сағат бұрын
It's called O3 because it adds three more zeroes to the price
@KarlHeinz-5620 сағат бұрын
@@jamesarthurkimbell love it 😄
@SamuelKarani19 сағат бұрын
Didn't watch the video . Just read the comments. And they invalidate the title of the video completely
@xc13z8297 сағат бұрын
Come on Theo, you don't need click bait images! Of course it isn't AGI. AI hype is just AI hype. Hard pass.
@alex-rs6ts6 сағат бұрын
Just hype Until it replaces you
@blubblurb3 сағат бұрын
@@alex-rs6ts Until then it's hype. If I had listened to AI bros I wouldn't have a Job right now. But I still have and make good money.
@nottheevil22 сағат бұрын
I mean, as long as they convince the government it's something they need, they'll have that hardware and money
@twisterrjlКүн бұрын
do you know what a graphic made by a company about their product means to me? NOTHING
@adam_ieКүн бұрын
It's clickbait
@Cat-vs7rcКүн бұрын
the arc challenge is from a third party. gpt 3.5 could tell you that.
@ChristianKolbowКүн бұрын
It's just like Apple. - 15 times faster - 12 times more efficient - 20 times more nonsensical
@mircorichter137522 сағат бұрын
It was tested by the Arc team, not openai
@GodbornNoven13 сағат бұрын
Alright buddy no one cares
@farmersneed15 сағат бұрын
I no longer call these technologies AI, but just call them LLMs. They are not worth the untold billions of dollars that are being invested in them. They are a performance improvement tool at best for talented enough individuals that can figure out when the LLM is hallucinating (lying).
@GodbornNoven13 сағат бұрын
O3 is not an LLM.
@kc123949 сағат бұрын
@@GodbornNoven Ok, this is not a helpful answer. What is it then? A recursively prompted llm? Genuinely curious.
@itsdakideli75512 сағат бұрын
The goal post moving and copium in the comments is insane.
@alex-rs6ts6 сағат бұрын
"It can't solve poverty and cure cancer? Useless"
@jonasRaymondl4 сағат бұрын
Ikr lol
@urisinger341218 сағат бұрын
AI seems to be really bad at writing any sort of code generator. It always mixes up real variables and generated variables, and a lot of the time just makes stuff up
@ChristopherRucinskiКүн бұрын
Does the news seem dead about this? The thing I have realized is that my feed on Twitter is fairly empty of o3 news. The first nugget of info I saw on it, I thought was fake because there was no news about it on Twitter. I finally saw a bit more, but the hype seems dead???
@softdevstuff100823 сағат бұрын
They know that if they say what the truth really is, it's the same as announcing an alien invasion. People are dumb in general; the developer community is the top 5% of people, with some of the best reasoning capabilities. They can reason about it. Normal people? They will just push for "killing the alien".
@Sminelo15 сағат бұрын
Hype is dead because this is so ridiculously expensive it's totally unusable for mundane tasks, and it's still too stupid for anything more complex that you might want to use an AI for. There's a sweet spot where the task is simple enough for an AI to solve, and boring/mundane enough for a human to want to delegate it. This model is not in that sweet spot, it costs way too much. The model also shows that big improvements apparently require exponentially scaling costs, which is terrible news for the AI business in general. We're also apparently no closer to AGI, because this model doesn't seem to do anything new, it's just better at the tasks previous models could also do. There's also speculation this is just a hail mary from Altman, to throw everything at the wall, cost be damned, in order to generate another headline. With AI enthusiasm waning, the funding is drying up too.
@A2Fyise17 сағат бұрын
What if the model was trained on these specific puzzles to hype up the market!! The ARC AGI benchmark is a very nuanced test; it's nothing an average human would fail to solve. There are random easy problems that LLMs still fail to solve that humans can
@alex-rs6ts16 сағат бұрын
It wasn't There is no reason to lie like this. People will get access to the model soon and figure out how good it is
@funkytaco135812 сағат бұрын
OpenAI is falling behind. At this point, I think OpenAI is trying to contract with the DoE public sector.
@virtual5754Күн бұрын
Are we entering dark age of technology from 40k lore?
@someguy8443Күн бұрын
Ikr
@DisFunctorКүн бұрын
I always knew I was born too early
@sunnybwaj3 сағат бұрын
On the contrary. Truly a golden age of technology, but a dark age for the old, the bleak and the uninspired millions who punch keyboards for a living.
@MrKrzysiek999110 сағат бұрын
OMG this o3 presentation shows how bad people are at reading charts. The chart's x axis is logarithmic, so the cost is $3200 per task and the whole evaluation cost was > 1 mln USD.
@alex-rs6ts6 сағат бұрын
And still they promised o3 mini in January and o3 soon after You don't make that kind of promise without a plan We will probably see a big drop in costs within the next few months
@MrKrzysiek9991Сағат бұрын
@@alex-rs6ts Well, the model was fine-tuned on public data so it's difficult to compare with other results. Plus I would not be that sure about the price, as the o1 price has not dropped an inch since the release of o1-preview.
@poornoodle985110 сағат бұрын
I sell donuts. How will this help me sell more donuts? If it doesn’t, it’s useless to me. If it causes me to sell less donuts it’s a threat. If it helps me sell the same amount of donuts at a lower cost it may be useful. Which is it?
@sidgillespie587911 сағат бұрын
Boring. Stop the hype. Just make the product and we gonna use it and pay our subscription. I'm tired of this Ai hype. I really don't care about the increase of its increasings.
@vavilon710910 сағат бұрын
Thank you so much for fixing the janky sound of the original video stream from OpenAI.❤
@yankotliarov923922 сағат бұрын
ARC tests are also solved by ... brute force algorithms. If it takes this much compute, is it actually intelligent or just brute force with extra steps?
@lost4468yt17 сағат бұрын
You can brute force any computable function by just running the busy beaver function. The model isn't doing either (especially not busy beaver, as that literally grows faster than anything ever).
@joannot670617 сағат бұрын
You don't understand what brute force means. Brute force means you can try again and again. Here, if you try and the answer is wrong, you don't get a do-over. So no, the rules of ARC-AGI mean it can't be solved by brute force. Just because you could technically win the Go world championship by playing only random moves, given unlimited time, doesn't mean the game of Go is solved by brute force.
@vasiliychernov212316 сағат бұрын
@@joannot6706 But is it better than brute force in terms of cost? If you could redo tasks, would brute force consume more computing power? Is it comparable? Or cheaper?
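To put the brute-force question in numbers, here is a minimal sketch that counts how many distinct output grids a blind guesser would have to choose from. It assumes 10 possible cell colors and an attempt limit of 2, both of which are illustrative assumptions rather than figures stated in this thread; the function names are hypothetical.

```python
def candidate_grids(height, width, colors=10):
    """Number of distinct grids of the given size (colors is an assumed palette size)."""
    return colors ** (height * width)

def blind_guess_odds(height, width, attempts, colors=10):
    """Chance that a fixed number of distinct blind guesses contains the right grid."""
    return attempts / candidate_grids(height, width, colors)

small = candidate_grids(3, 3)      # a tiny 3x3 output grid
large = candidate_grids(10, 10)    # a modest 10x10 output grid
odds = blind_guess_odds(3, 3, attempts=2)

print(f"3x3 grids: {small:.1e}")
print(f"10x10 grids: {large:.1e}")
print(f"odds of a blind hit on 3x3 with 2 attempts: {odds:.1e}")
```

Under these assumptions the search space grows exponentially with the number of cells, so the cost comparison asked about above depends heavily on how much a solver can prune before it commits to an answer.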
@cherubin7th18 сағат бұрын
I would not be surprised if OpenAI manually put that knowledge into the model to beat that benchmark.
@vaolin170317 сағат бұрын
There’s no knowledge to put in. The benchmark is different every run. Otherwise it would be very easy to claim the $1 million.
@Justashortcomment12 сағат бұрын
The actual test data is secret.
@Naz-h8z16 сағат бұрын
This channel got so boring. -breaking news in title -fancy thumbnail -playing some other people video -reading other people x post No value from Theo these days
@lightgaming114219 сағат бұрын
AGI Clickbait Cover Found on 21.12.2024
@evlosolve18 сағат бұрын
If these models were this good at solving real problems and cost effective, we wouldn’t have jobs. We still have jobs.
@orcofnbu11 сағат бұрын
i do not see any improvement over 4o-mini in terms of algorithm
@iamc24Күн бұрын
I still think AI is a terrible path to continue down, and I'm not even talking about the potential for AI to revolt. I mean the extreme hardware and power usage and the associated environmental impacts, like carbon emissions. Then, there's all the generative crap that will inevitably be used to fully replace professions and downsize the work force, leading to a further widening wealth gap. In my opinion, the negatives of AI far outweigh any benefit we can gain from it, and it's not even close.
@user-on6uf6om7sКүн бұрын
The energy costs to run Twitter and Facebook are much higher and AI actually produces something of value, unlike those platforms. The advances made by these large models also fuel advance in models which can run locally on your PC. But yes, the job loss is a legitimate concern that needs a proper response.
@Manwith6secondmemoryКүн бұрын
Commercial aircraft emit hundreds of thousands of times more co2 annually than all current AI combined
@travistarp7466Күн бұрын
The number one thing to happen is a widening wealth gap. “You will own nothing and be happy” is exactly where we are headed if people don't wise up.
@user-on6uf6om7sКүн бұрын
@@travistarp7466 If the fact that I'm happy is implied in the premise, I'm good with that. Private ownership makes sense for certain things but the main technology, the AI, should be thought of as a public resource that everyone can access, though there are justifiable and less justifiable reasons to distribute that access unequally.
@phen-themoogle7651Күн бұрын
You are looking at it short-term like 2-5 or maybe 10 years max. When AI reaches AGI-ASI or any stage with intelligence beyond us , it will easily help us solve any energy problem we have and clean up the environment making the Earth like it was thousands or millions of years ago. We could even terraform new planets and live literally anywhere. You are just looking at the issue from a small rock that we live on, look at the billions of shiny things in the sky at night. AI will speed up every aspect of technological progress...cancers, aids, all diseases could be eradicated and millions of people won't have to suffer anymore and can go back to a normal life. You know how many millions of people are suffering that don't need to if we have the technology or knowledge to fix problems? AI will help people that can't help or take care of themselves too, the elderly for example, disabled, until AI comes up with the best cures and treatments that no human doctor could ever fathom. AI could make EVERYTHING millions or billions of times better. You're only looking at a very limited view. Did you know that 70% of people are miserable at their jobs? Sure it sucks that some people might lose their jobs to less creative AI tools, but it could also ENHANCE professionals so that they improve their creations and design, and save them thousands of hours potentially too. There will be more opportunities for growth as humans than ever before.
@prestigealanazi2993Күн бұрын
Blah blah blah. Are they saying AI can't do abstract zero-shot pattern recognition? Because they are selling computing power instead of quality improvements or innovation, and they use buzzwords like AGI for profit when "Open" AI was in their name. Don't get me wrong, AGI is a subfield, but I tend to be on the side that believes AGI isn't a thing. Even if we achieve AGI, they don't model how the knowledge breaks down (black box).
@RancorSnp14 сағат бұрын
I'll believe it when I see it. GPT-4 was a significant improvement over 3.5, but 4o was plain weaker than 4, and o1 is better than 4o, but that's not a very high bar to clear. Whether the o1 pro version is really better than o1 I don't know, because ChatGPT is barely worth the $20; the price would imply that the increase in power (if it exists) comes at an unviable cost. So either o3 is not substantially stronger than o1, or its computing costs make it irrelevant. I don't care about numbers and statistics. Until I see o3 and it can actually do things that have a real impact on my workflow, I don't believe any of the hype.
@Arc.M19 сағат бұрын
**sigh** **eye roll** **head shake** OpenAI talks a lot to barely say anything. Reminds me of apples.
@Gunrun80821 сағат бұрын
Big respect for this video. How many more leaps will we need before people in these comments understand that their timelines are wildly off? GPT-4-level AI was prohibitively expensive just over a year ago. Now, with the release of the new Gemini model, it is basically free, and much better than GPT-4. Saving on costs was not the goal with this model. You build the capabilities first and optimise second. Users like you and me are not the primary customer of AI. Industry is. And they can afford the cost and the need for a large facility to house the GPUs. It just has to be cheaper than the collective human wages.
@lost4468yt17 сағат бұрын
They won't learn. You can point out that 5 years after GPT-2, you can now even train the model on consumer hardware in just an hour. The argument always boils down to "ok but where we are now is as best anything can ever get". People are emotionally invested in AI never hitting human levels. Every time they will try and point out exceptions, and when those are filled they just move the goalposts.
@couchtourist256Күн бұрын
ever since theo been doing these product review/sponsorships "I actually invested in this blahblah" I have cared less and less
@chasingdaydreams278812 сағат бұрын
"Ai isnt going to keep improving" ~ Theo 4 months ago
@samgee50014 сағат бұрын
Once we have photonic computers that are able to utilize all 64 wavelengths of light simultaneously they will easily be able to make these larger models much cheaper to run. It's just physics. No matter how advanced your computer architecture is, electrons will always be slower and less efficient than photons. It's on the same level as going from vacuum tubes to microprocessors in terms of efficiency.
@GodbornNoven13 сағат бұрын
Hahaha theres a lot more than 64 frequencies to utilize. You have thousands if not potentially millions of channels simultaneously and so even if you only had around 100000 transistors on a single chip that would still represent a monumental improvement in computing. I'm excited for the future.
@Justashortcomment12 сағат бұрын
All 64 wavelengths? :)
@ReachadКүн бұрын
The Curious Case of the Hype Machine.
@matthieu875Күн бұрын
LLMs reaching AGI is pure cope. Whoever believes this never dug into it for more than 2 days.
@DanielTSasserII23 сағат бұрын
They are hitting saturation with their benchmarks so I'm excited to see what the ARC-AGI benchmark can produce.
@emilemil117 сағат бұрын
I don't know how these tasks are laid out exactly, but from the examples given it seems to be limited to coloring pixels in a low-resolution grid. In other words, while the input here is very varied, the actual size of the input is very small and also discrete, and there is a clear correct solution. So I don't know if this really translates to real-world problem-solving performance, where the inputs and outputs are much more complex.
@Justashortcomment12 сағат бұрын
They are contained in JSON files with grids of integer values.
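For anyone wondering what "JSON files with grids of integer values" looks like in practice, here is a small sketch. The `train`/`test` layout with `input`/`output` pairs is assumed from the public ARC task files, and the task itself (mirroring each row) is made up for illustration; it is not a real benchmark item.

```python
import json

# A made-up task in the assumed layout: each pair maps an input grid to an output grid.
task_json = """
{
  "train": [
    {"input": [[1, 0], [2, 0]], "output": [[0, 1], [0, 2]]},
    {"input": [[0, 3, 3]], "output": [[3, 3, 0]]}
  ],
  "test": [
    {"input": [[4, 0, 0], [0, 5, 0]], "output": [[0, 0, 4], [0, 5, 0]]}
  ]
}
"""

def mirror(grid):
    """Candidate rule: flip every row left to right."""
    return [list(reversed(row)) for row in grid]

def solves(rule, pairs):
    """True if the rule reproduces the expected output for every pair."""
    return all(rule(pair["input"]) == pair["output"] for pair in pairs)

task = json.loads(task_json)
fits_train = solves(mirror, task["train"])
fits_test = solves(mirror, task["test"])

print("rule fits the demonstrations:", fits_train)
print("rule fits the held-out test pair:", fits_test)
```

The point of the format is that the solver only sees a handful of demonstration pairs and has to infer the rule before producing the test output, which ties back to the question above about how small and discrete the inputs are.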
@camarotheboss2854Күн бұрын
"New OpenAI model changes everything!!11!!1!!1" - I heard it few times before xd
@Cat-vs7rcКүн бұрын
and everytime it did. everyone i know uses llms in their daily tasks.
@Sammysapphira9 сағат бұрын
ChatGPT has literally fundamentally changed the entire world. Keep being a boomer, you'll gladly get left behind.
@diadetediotedio69186 сағат бұрын
@@Cat-vs7rc They are still using them in their daily tasks, so nothing changed here.
@cmoullasnetКүн бұрын
We may be having a bit of an “over-automation” moment here. Like when automakers decided to try and replace all humans with robots and quickly learned that humans are cheaper and better at some tasks. Can’t say for sure, but I think we are fast approaching this point. Time will tell.
@fburton822 сағат бұрын
Mark: "You take each of these yellow squares, you count the number of colored kinda squares there, and you create a border of that width" Greg: "That... that is exactly it" Right... 🤔
@godofwar82625 сағат бұрын
AI truth is like onion layers: the more you peel, the more lies you see.
@outscope2321 сағат бұрын
Another bs hype video, making some naive plebs going wild in comment sections. Thanks
@lost4468yt17 сағат бұрын
What exactly is wrong with it?
@0xb1sh0p8Күн бұрын
Didn't watch..but the answer is No, it's not AGI
@blakelmjКүн бұрын
"We're close to reaching AGI" says every researcher ever.
@Cat-vs7rcКүн бұрын
ok. also dont use llms and live like a neanderthal.
@iz580821 сағат бұрын
@Cat-vs7rc Chill, it might even be able to do your job completely. But it's not AGI yet.
@tedarcher912018 сағат бұрын
Depends on the definition
@0xb1sh0p88 сағат бұрын
@@Cat-vs7rc where did I say i don't use LLMs? I don't need to listen to this guy to know the answer. AGI doesn't exist yet.
@JohnEGledhillКүн бұрын
wrong bro. still expensive. AI gets better but also more expensive. so nothing new here
@EVanDorenКүн бұрын
Don't call it AI
@alex-rs6tsКүн бұрын
Cope
@JohnEGledhillКүн бұрын
@ dope
@xVancha13 сағат бұрын
Isn't this true of literally everything? Shit, the other day I watched Babish eating a $250 melon. Probably the best melon humanity's ever made, but also more expensive.
@sbowesuk98120 сағат бұрын
Could be a moon landing moment for AI. That's great in the sense that a critical milestone was reached when no expense was spared. That's not so good because no expense had to be spared to reach that milestone. This announcement shows that AGI is possible, but we're still a long way off actually having access to in-home consumer AGI that's affordable and useful. We'll get there, but by then what will the giant corporations and governments have?
@measureman16815 сағат бұрын
Agents creating their own tools and dispatching new agents to solve a problem is a technique that is at least 18 months old, and it tracks pretty well that OpenAI manages to release the model now, since that's about the time needed to collect usage data from those systems and from the more advanced ChatGPT sessions. If we define AGI without any energy consumption KPIs, then we're at AGI. If we define AGI by comparison with the human brain at 15 W, we're not even on the same playing field.
@medooazmi22 сағат бұрын
Bro you've got to be kidding me, this releases while im through playing detroit become human, and i am scared right now
@diamond_sКүн бұрын
The hardware overhang exists; this o3 approach is closer to brute force than to brain-like algorithms. A superhuman score in real time at low cost is likely possible with as little as 1 to 10 peta-ops.
@iz580821 сағат бұрын
I do agree, but it all starts from brute-force approximations. Then the good algorithms and theorems come in.
@jewlouds21 сағат бұрын
I'm scared of o3 because I'm going to ask it something and it's going to cost me $200
@sshmru23 сағат бұрын
*early access for marketing reasons to obscure actual performance
@GrebnevNikita12 сағат бұрын
Is there a test to see if this AGI is a warehouse full of minimal wage workers in India again?
@yzhishkoКүн бұрын
Well, well, well. But those tests could be leaked to previous models, because they've been used to score arc-agi previously, and as we all know OpenAI requires input to be in a raw format not encrypted. Am I wrong?
@alex-rs6tsКүн бұрын
Yes, you are wrong They weren't leaked. The questions are secret
@yzhishkoКүн бұрын
@@alex-rs6ts Hmm... The questions are secret, but they become available to OpenAI once you put the question through their API. In addition, there are even publicly available datasets that you can train on to come up with thousands more examples that follow similar patterns. Nothing prevents them from using those questions to create a dataset to train on.
@phen-themoogle7651Күн бұрын
@@yzhishko yes, ARC has a lot of holes potentially, but their model could've still done reasoning to boost consistency for the problems. Like we recognize millions of patterns naturally without realizing we have that data in us too (from just being alive and surviving). Having only thousands of examples is nothing, so the model would have to be insanely smart to reason through novel problems with only limited amounts of data. It could just have enough data to generalize on the rules, but what's impressive is that it has the capabilities to spatially reason accurately where everything is. I tested some previous models (like Claude/Gemini/GPT4 etc) on ARC and they couldn't even recognize what colors the squares were properly, or figure out where things were at. Even if it's a gimmick, the model is still improving in multimodality and several ways for sure.
@RomeTWguy22 сағат бұрын
Most likely they scraped the ARC benchmark through their API, generated a bunch of similar reasoning chains, and then fine-tuned the model on them.
@yzhishko18 сағат бұрын
Yes. This is exactly what happened. LLMs need that type of data to be trained on. Otherwise it would not be better than previous models.
@dr.drakeramoray78921 сағат бұрын
tldw version for the people who need it: no, their new "AI" models arent really groundbreaking, but their marketing is. why people always fall for this i will never understand
@lost4468yt17 сағат бұрын
In what way is this not groundbreaking?
@dr.drakeramoray78915 сағат бұрын
@@lost4468yt because they used over 300k $ worth of gear and computing power to solve the same problem ranjeet and mutahar here will do for free while they are taking the shit on the street. this is not only not groundbreaking, its literally useless. even if the all the benchmarks are correct (meaning if you arent falling for the same shit they say every 2 months) and they did reach top 200 programmer level, for that money you could actually hire a top 200 programmer for a year. also sam altman is there so its probably overhyped af
@lost4468yt15 сағат бұрын
@@dr.drakeramoray789 you realise you can't fake the benchmarks? They're done by an external company, and the questions are private. Not to mention the price will come down like crazy? GPT-2 cost $40k to train 5 years ago, now it's $20 (and this doesn't even implement a ton of newer optimisations). What costs $3k now could easily cost $0.20 in a few years.
@lost4468yt15 сағат бұрын
@@dr.drakeramoray789 What's your argument here? That this cannot improve from here? $3k now is likely $3 in a few years. Also this test was done independently by a third party, how could they cheat?
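A back-of-the-envelope check of the cost argument in this thread, using only the numbers quoted above (GPT-2 training going from $40k to $20 in 5 years, and a task costing about $3k today). Applying a training-cost trend to per-task inference cost is an assumption made here purely for illustration, and the function names are hypothetical.

```python
import math

def yearly_decline_factor(start_cost, end_cost, years):
    """Constant yearly factor that turns start_cost into end_cost over the given years."""
    return (start_cost / end_cost) ** (1 / years)

def years_to_reach(current_cost, target_cost, factor):
    """Years needed to fall from current_cost to target_cost at that yearly factor."""
    return math.log(current_cost / target_cost) / math.log(factor)

# Figures quoted in the comments above.
factor = yearly_decline_factor(40_000, 20, 5)

# How long $3k per task would take to reach $3 and $0.20 at the same pace.
years_to_3_dollars = years_to_reach(3_000, 3, factor)
years_to_20_cents = years_to_reach(3_000, 0.20, factor)

print(f"implied decline: about {factor:.1f}x cheaper per year")
print(f"$3k -> $3: about {years_to_3_dollars:.1f} years")
print(f"$3k -> $0.20: about {years_to_20_cents:.1f} years")
```

Whether per-task cost follows the same curve as the GPT-2 training figure is exactly what the two sides of this thread disagree on; the sketch only shows what the quoted numbers imply if it does.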
@nil711819 сағат бұрын
My dude. Keep out of the AI field, you don't know anything about it and are being dunked on, in every sub. Do you even know what you're talking about lol?
@xVancha13 сағат бұрын
Sub?
@ianstuart34118 сағат бұрын
I would expect that the dollar-per-task figure was calculated to include some small portion of the total training cost? Surely when arriving at a retail cost they have to consider the initial training costs and the expected total usage.
@calholliКүн бұрын
I hope you at least got 5 figures for this little ad
@UN7XКүн бұрын
What makes you say that?
@ernestomotta5178Күн бұрын
@@UN7X the fact that they bought the fucking Nobel prizes
@vytah20 сағат бұрын
That'd be, like, three o3 prompts
@michalthemichal355018 сағат бұрын
@@vytah LOL
@Justashortcomment12 сағат бұрын
It seems that Gemini 2 is also an absolute game changer as à multimodal digital assistant. I don’t think it’s anywhere near this ballpark as a reasoner but its vision (video streaming) capabilities are surprisingly powerful.
@rpc4626Күн бұрын
I remember when they announced sora.... Then they released it. Sam Altman is as trustworthy as an announcement of the US about any country having mass destruction weapons. AGI is really close... In geological terms, about 500 or 700 years, but we will get there eventually I guess.
@jacobdalambКүн бұрын
you made no sense
@rpc4626Күн бұрын
@@jacobdalamb Tangamandapio!
@rizizumКүн бұрын
Electronic computers have existed for like 60 years, and the industrial revolution has been going on for about 300 years. What makes you think 500 years is a good estimate?
@Rust_Rust_RustКүн бұрын
Really close. 500 to 700 years. Pick one...
@emperorpalpatine6080Күн бұрын
@@jacobdalamb he doesn't need to make sense . He swapped his brain with an LLM running on an arduino
@korozsitamasКүн бұрын
With all these compute needs, will pro users get unlimited access to o3 low or you will need a new $2000 subscription tier for that? Hopefully it's either unlimited or a generous limit (more than 50 per week :-)).
@joeyvonfeldt1979Күн бұрын
Man, if 1 task costs $20 of compute I don't think pro will get you access at all
@HCforLife1Күн бұрын
Probably more likely to be $10k per month if greatly optimized.
@Ohmriginal722Күн бұрын
I say we're still where we were. If AGI is soon to exist but is extremely hardware-limited even for basic tasks, even on big corporate supercomputers, then we won't have AGI, not really. We're still waiting on new algorithms, developed through contests like ARC-AGI, to create the future we're looking for.
@phen-themoogle7651Күн бұрын
Yeah, but the wait might not be too long. They could develop dozens of benchmarks within 1-6 months, and all of them could get crushed within 6 months to 2 years. We might not even need some of them to get crushed and the goal post is moved further and they are more like ASI benchmarks than AGI at some point too. Also could reach AGI before we have the benchmarks to measure it properly, and it's already here by 2026, who knows. I just know that we are still making serious progress. 88% of way to AGI by Dr. Alan Thompsons conservative AGI meter now. Went up 4% after o3 was announced. On average goes up 1% a month so it's probably 12 months away. December usually goes up 2 to 5%+ too even if some slower months before it, noticed a pattern the past couple years with technology releases during December. I would bet on some baby AGI system before 2026
@Ohmriginal722Күн бұрын
@ I doubt it, the algorithmic changes necessary I’m betting are still 5 years out at least. We won’t have baby AGI before 2030. I’m of the opinion that most experts are wildly underestimating the problem still and Francois and yann have more accurate estimates
@Ymirheim13 сағат бұрын
Still don't know what the test is cause I skipped over the part where you let the people selling the thing explain to us why the thing is great. Guess my ad-blocker missed covering that part.
@isaac1023117 сағат бұрын
Incredibly impressive.
@sigmaking-r4b16 сағат бұрын
bruh we will go to work in macdonalds why u happy☠
@MrHarry3720 сағат бұрын
Pretty sure it's more terrifying, given that we aren't even close to knowing how to control those things, and they seem to get smarter super fast
@huntermacias202318 сағат бұрын
mark is so excited hahah he's like "yeah I actually looked at the slides before this meeting"
@Trials_By_ErrorsКүн бұрын
Sammy Boy keep Pumping Company. But in the end this bubble is going to pop.
@EstonadoКүн бұрын
This is ai not crypto, the development here has real world changes (major ones) , they'll very likely outrun the pump and create agi before the dump period, compare it with Google instead of something like ftx
@winfredj9820Күн бұрын
still better than your useless startup sid
@gr8b8m85Күн бұрын
That's like saying the 'computing bubble' is going to pop. You have no idea what you're talking about.
@rozburgКүн бұрын
Wishful thinking
@Thamium21 сағат бұрын
I’ve been hearing that for years at this point lol
@aeronwolfe707222 сағат бұрын
the compute and power wasted on these models... wish someone would hurry up and get those optical processors working... or something better
@WayOfTheCode16 сағат бұрын
This test is actually a great test of human intelligence. The main differentiating factor for human intelligence is abstraction and composition. And the ARC test is a pretty good measure of whether AI is able to do it.