OpenAI Unveils o3! AGI ACHIEVED!

Views: 41,734

Matthew Berman

1 day ago

Comments: 760

@xacompany 4 hours ago

Rockstar Games has been waiting for O3 to start developing GT6

@dijitize 4 hours ago

I think the game development and software development will never be the same anymore because of these AI tools.

@Sumyunguy2 4 hours ago

That's exciting!

@0AThijs 3 hours ago

Lol. Opens ChatGPT. Prompt: Create GTA 6

@TheNexusDirectory 3 hours ago

@dijitize it's gotta improve 100x before it will be truly useful in software development.

@kas90500 2 hours ago

Gran Turismo 6 was released in 2013; it would still be impressive for o3 to do it

@gnollio 2 hours ago

Get your shovels ready folks, time to dig up the goalpost.

@Ascended23 38 minutes ago

@gnollio yep. AGI will be “achieved” a great many times before we ever arrive at a consensus on what, precisely, AGI means.

@Jon-y1n 4 hours ago

It is AGI when I can let it take control of my work PC without my manager noticing my absence for weeks....

@TheNexusDirectory 4 hours ago

Not a joke. If it can't do that then it's not AGI

@dot1298 3 hours ago

good criterion, agreed

@bestemusikken 3 hours ago

Can general intelligence do that? As in, anyone can substitute for you? Don't think so. Why set the bar so high for artificial general intelligence, when "normal" intelligence can't clear it?

@anta-zj3bw 3 hours ago

lmao

@TheNexusDirectory 3 hours ago

@bestemusikken but at the end of the day this is the entire hope of AGI.

@ares106 5 hours ago

Until this is in the hands of independent testers I will remain skeptical.

@DefaultFlame 4 hours ago

Yuuuup. I don't trust OpenAI at all on anything they claim. Until it's in my hands and I can see what it can actually do, I don't believe anything their hype department puts out. Just look at Sora.

@brianmi40 4 hours ago

Still skeptical of o1? Did you say the same thing then? Learned anything since then?

@csansolo 4 hours ago

Thanks Sherlock. Because what they have done so far is just pure rubbish isn't it?

@John4343sh 4 hours ago

It has been independently tested by one of the biggest critics of LLMs and even he said this is a huge paradigm shift.

@noway8233 4 hours ago

Absolutely, I don't believe this. These companies always come with the same script. It's probably a very good model, but... it's genius until it's not

@localism479 4 hours ago

It is impressive, but saying it is AGI is clickbait. The G is for general, you know that. They are focused on the benchmarks, and let’s celebrate that progress. But don’t call it AGI, they are still “teaching to the test”.

@yoyo-jc5qg 3 hours ago

they solved ABI, now chatgpt can get a job as a benchmark genius

@drhxa 3 hours ago

The point is that they're not teaching to the test. Also that you can't "teach to the test" because all problems in ARC-AGI require unique types of reasoning. This is the most generally intelligent model out by far and far more general than the vast majority of humans. If it can't do something yet that humans can do, sure, but no human can do everything that humans can do either. This is obviously AGI

@DejayClayton 21 minutes ago

There was no teaching to the test for this benchmark. That's specifically the point of this benchmark.

@jeffbull8781 20 minutes ago

They make a point of saying, at about 15:00, that it was not trained specifically on any of these tests. Whether you believe them or not is another thing, but according to them they are not 'teaching to the test'

@mortenekdahl262 4 hours ago

Why it’s not AGI yet: The context window remains a significant limitation. These models perform well with single questions but struggle when managing large projects that require tracking extensive context. As the amount of data increases, they start to hallucinate or lose coherence, unable to maintain a reliable thread of information. Until this issue is resolved, these models, while powerful, fall short of being true AGI.

@BCCBiz-dc5tg 4 hours ago

THIS

@tencizinec9583 3 hours ago

It's "virtually" AGI. It's within reach.

@anta-zj3bw 3 hours ago

@BCCBiz-dc5tg THIS

@francisco444 3 hours ago

Sounds like just more GPUs and we're there.

@TheNexusDirectory 3 hours ago

@mortenekdahl262 based

@HansKonrad-ln1cg 4 hours ago

o3 is not AGI. Chollet is already working on a new test set which, he says on his website, is only 30% solved by o3 (keeping in mind always that these tests are solved 95% by average humans). On the same site he shows three examples of tests o3 didn't solve. They are very easy. o3 has no vision. It doesn't see the tests; it only reads them line by line, number by number. Chollet quote: "you will know when we have agi when coming up with tests that are easy for humans and hard for models becomes impossible." We are not there yet, by far.

@dot1298 3 hours ago

ok, but o3 still is a considerable achievement in the *world of AI* (not AGI, I agree)

@dot1298 3 hours ago

it could help in coding, for example

@freeideas 3 hours ago

Very good point. Thank you. Yes, if we can still make tests that are easy for humans and difficult for AI, then that is pretty much the definition of "not AGI".

@headspaceaudio 2 hours ago

What about tests that are easy for models but hard for humans? Shouldn't they count as well? Shouldn't AGI be an average of all kinds of tests?

@freeideas 1 hour ago

@headspaceaudio o3 can solve LOADS of problems that 99% of humans can't. But that doesn't hit the definition of AGI. Even if a model is barely as good as a normal human, but GENERALLY can solve any problem that a human can solve, that is AGI. No one is saying that o3 is not SMARTER than most or all humans. It probably is. But it is not "generally" intelligent in every way that a human is intelligent.

@SoccerPrince1 5 hours ago

AGI Achieved? I am flaming you in the comments. Stop click baiting.

@matthew_berman 5 hours ago

not clickbait!

@aa-dt5bf 5 hours ago

More flaming here. I'll apologize if I'm not right. Doubt that

@matthew_berman 5 hours ago

Watch the full vid first and let me make my point! I know you haven’t watched it yet bc it has only been out for 3 min

@Lucasbrlvk 5 hours ago

watch the video

@DaveKent 5 hours ago

I watched the entire event. AGI is here.

@MichaelAllon 1 hour ago

"If that is not AGI, at least on this dimension, I don't know what is". Matthew, what does the acronym AGI stand for?

@Martin-bx1et 5 hours ago

Skipped O2 to avoid copyright issues... Ozone: "Hold my carbon dioxide infused yeast and plant materials"

@nosult3220 4 hours ago

Lame joke bro

@Martin-bx1et 4 hours ago

@nosult3220 Yes - I thought it would have fallen flat too.

@nosult3220 4 hours ago

@Martin-bx1et ❤️

@autohmae 4 hours ago

Also, there is no copyright issue, at most it's a trademark issue and they are in different markets, so it shouldn't cause much of a problem. The irony, stealing copyrighted material from all kinds of sources, they have no issue with.

@luihinwai1 3 hours ago

O2 is a British telecommunications company

@thirien59 5 hours ago

"We're not releasing it yet" = It's a marketing communication stunt.

@thedudely1 4 hours ago

"so we just got one upped by google but wait no we didn't please believe us!"

@clarityhandle 4 hours ago

@thedudely1 you guys expect them to release a new model every week??

@thedudely1 4 hours ago

@clarityhandle it's just been obvious how much they're holding back on what they actually have and how they only act when they're forced to.

@brianmi40 4 hours ago

Relax, o1 went from Preview to out in 3 months.

@brianmi40 4 hours ago

@thedudely1 Yeah, they got "forced" an amazing 12 times in the last 12 days. genius.

@fg6147 2 hours ago

Somebody please define "AGI". The term isn't even agreed upon by "experts" in the field

@h83301 2 hours ago

Very true. It's generic af. Honestly this model is impressive, very impressive, and clearly outshines anything that was considered SOTA beforehand. A significant breakthrough which will lead us further towards human obsolescence. AGI? It's just a generic term that literally has no one definition. We can't even define reasoning or consciousness, so no, AGI will never have a meaning, nor will the other terms. Just generic terms used to move goalposts.

@User-actSpacing 1 hour ago

Matthew is not even close to an expert. He is an idiot. Let’s call the system AGI if it starts to automatically test and improve itself and contribute to humanity without human input.

@Ascended23 36 minutes ago

@fg6147 if you’re a marketer at OpenAI, AGI means whatever capabilities the latest model has. Expect every single new model from them from here on out to “finally achieve AGI.”

@sbowesuk981 5 hours ago

Prediction: The impression I'm getting is that this technology is becoming so resource intensive and expensive to run that the top-tier stuff is not going to be for consumers, but for giant companies and governments. As time goes by it'll be a "you can look but not touch" situation. We'll get the watered-down toys, while the giant entities get the super-powered versions and true AGI/ASI.

@ryanscott642 4 hours ago

Imagine complaining you get ChatGPT for free. You are right tho

@chrisrogers1092 4 hours ago

That will change as the hardware (Nvidia GPUs) gets exponentially faster with each generation

@Sumyunguy2 4 hours ago

Slaves we are. (Yoda)

@aitandechunveiled 4 hours ago

It will continue to happen...and once AI is required for healthcare, education, etc. the void will become large.

@NaanFungibull 4 hours ago

Imagine the power plays and social engineering and mass manipulation that those with the money to run these models to their advantage will exert over those that can't afford to harness its power.

@nickrusso86 4 hours ago

If this is truly AGI, then that will last about a week before we get to ASI. Greetings robot overlords!

@somebody-anonymous 4 hours ago

Maybe o1 was AGI and o3 is ASI

@cajampa 4 hours ago

I can't wait

@narachi- 4 hours ago

update your passwords

@woj98498 4 hours ago

@narachi- why

@mintoo2cool 31 minutes ago

@narachi- What's the point? AGI can guess it anyway after looking at your Facebook profile.

@ansonphong 4 hours ago

Being good at programming and mathematics does not qualify as AGI. It's going to have to cognize 3D space and do things in the physical world to pass the AGI mark in my books. o3 is an impressive model and it will replace a lot of jobs

@5678plm 3 hours ago

if it fails at self driving, then it's not AGI

@CollabCrush 55 minutes ago

"Far better than anything else out there" is not the definition of AGI. Thanks for playing.

@fairchildSCR 3 hours ago

Let's see if o3 can create its own ARC benchmark from scratch that is more difficult than the current one. Then that would be actual AGI.

@py_man 1 hour ago

That would be ASI, not AGI

@djayjp 1 hour ago

95% agreed 👍. Just like a healthy human, if it doesn't know something or doesn't possess some specific intellectual skill, it can learn it and do it in principle.

@jsbgmc6613 32 minutes ago

I think most people don't learn 😂

@ET-zw4pk 2 hours ago

Someone asked for a definition of AGI. AGI is when we all get fired.

@PedroPenhaVerani-ll1wc 3 hours ago

“AGI in this dimension” does not exist; focusing performance on a specific area is exactly the opposite of AGI.

@jsbgmc6613 53 minutes ago

I think the "AGI in this dimension" was in regard to the AGI benchmark ... Then he added math and coding, so it's also on more than 1 thing.

@scottholloway699 4 hours ago

I believe A.I has to replace blue-collar work as well as white-collar work in order to be AGI. Reflex, instant instinct when a pipe comes unlatched and water spurts everywhere (plumber: instant fix, while the robot stares and is confused). Academic benchmarks alone are not enough. A.I needs to figure out the automatic and intrinsic way we learn about the world in the first 5 years of our lives, an essential part of human development and intelligence. Humans initially receive intelligence through analog processing, THEN we move on to symbolic language at a later age. With A.I it seems to be the other way around. I believe A.I needs to master robotics and analog understanding of its environment in order to be AGI. Not just mastering symbolic understanding.

@jsbgmc6613 37 minutes ago

By your logic most people are below AGI level because they can't replace most white and blue collar workers ...

@Mavrik9000 30 minutes ago

Check out the new Genesis simulation platform running on Nvidia hardware that is for desktop computers. Autonomous robots will soon be able to do complex, human only, hands on, tasks faster than people.

@jsbgmc6613 29 minutes ago

Most people can't do what most white and blue collar workers do ... And for sure most people can't ever learn to do what o3 already can.

@zSion 4 hours ago

"AGI according to Sam Altman and OpenAI" This is how I know you're being purposely untruthful, Sam Altman and OpenAI do not use the term AGI and they actively discourage it. They use 5 levels, and right now they're only on level 2.

@CJayyTheCreative 4 hours ago

Bro AGI doesn’t even have a proper definition between companies

@Sumyunguy2 4 hours ago

@CJayyTheCreative did you purposely miss his point?

@olegt3978 4 hours ago

They are on level 2 but moving to 3 fast. End of 2025 will be l3 and end of 2026 l5. It will take only 18 months from level 3 to 5, less than level 1 to level 3.

@dot1298 3 hours ago

@olegt3978 how can you know that?! you have a DeLorean?

@Cine95 31 minutes ago

@CJayyTheCreative do you even understand what he is trying to say

@GrittyDuckGrin 2 hours ago

It’s excellent in math and programming; however, I always expected we would eventually be surpassed in these areas. I believe the real differentiator for AGI is the ability to learn and remember like a human. If it acquires information about a person from a photo, it should recall those details when seeing the photo again. That’s when it can truly start learning to perform our jobs, and this, in my view, is what AGI will be.

@cjbarroso 1 hour ago

Ever heard of RAG?

@dhruvbaliyan6470 1 hour ago

OK, they already included an AGI teaser in their project feature launch video. Why can't people accept it? AGI will be here by 2025. If it can solve problems it was never trained on with 87 percent performance, then it's almost AGI.

@weighoftea9528 54 minutes ago

CLICK BAIT WARNING! BEEP! BEEP! BEEP! BEEP!

@TimothyHuey 1 hour ago

It's funny to watch AGI redefined as we evolve. Now it appears that a system can be qualified as AGI, but on a subset of abilities, a limited AGI. It appears true AGI will be AGI across the board on all skill sets. So OpenAI can still say they are waiting on full AGI.

@jsbgmc6613 33 minutes ago

While also keeping the models "safe" by distilling and restricting them in all kinds of ways.

@jonogrimmer6013 4 hours ago

Amazing and probably AGI; however, it was only 'semi-private' on the ARC-AGI eval. Fully private tests on 'Simple bench' and other completely private tests will be the true tests.

@greenstonegecko 4 hours ago

I cannot confidently say if this is AGI. AGI cannot be grasped through numbers alone. I will be certain if it's AGI once I talk to it.

@zrblank 2 hours ago

Ehh. You would think so, but talk to some of the newest chatbots: they can convince easily and aren't all that great

@dariusdbbowser6329 4 hours ago

So basically this is another Sora announcement and we won't see this for months...maybe not until Summer 2025 at the earliest lol.

@jamesjonnes 3 hours ago

It's really bad for OpenAI since they could ask $3000/month for this and many would pay for it.

@testales 1 hour ago

And by that time some Chinese researchers will have released something that's pretty close to it but open. ;-)

@dariusdbbowser6329 1 hour ago

@testales exactly lol

@GiewsBueno 3 hours ago

For me, it is AGI. It has achieved 25% score in the hardest benchmark developed by mathematicians like Terence Tao already, and Tao expected the test to last for at least five years to come... No ordinary mathematician would score 25% in that, not even PhDs because those would be people specialized in very specific areas of Mathematics.

@User-actSpacing 1 hour ago

I watched the release myself. This is not AGI. Matthew is tripping his ballz

@mocanada304 4 hours ago

I think what would make the most sense is to allow AI to have senses, so that it can see the world we are living in and not just use the data that we have generated on the web.

@surfcitiz 3 hours ago

Thank you for creating this video. Whether or not it qualifies as AGI is beside the point; it’s inevitable. There are valid reasons to feel both hopeful and apprehensive about its arrival.

@drhxa 2 hours ago

Agreed and I'd say AGI was first achieved with Claude 3.5 Sonnet this summer. Once we got o1 mini and o1, it was pretty clear they were generally intelligent, could reason, learn new tasks on the fly, create new reasoning modalities on the fly etc. o3 is clearly AGI imo. But you're right that it is inevitable even if we say this particular one isn't. I think it's surprisingly tame to start with and people aren't/weren't ready for that. Regardless lots to be excited and concerned about indeed

@llrainll 4 hours ago

I believe we’ve already achieved AGI months back, ngl

@WalkerKlondyke 48 minutes ago

Notice the props behind them, all items representative of major technological advancements in human history. Nice touch as we're on the verge of turning the future over to technology itself.

@kumarivin3 4 hours ago

I think one thing we need to keep in mind is which category/aspect the additional gain came from. Sometimes a single metric is a red herring: the models could overfit on a certain category, resulting in improved accuracy, which is good for press, but in reality performance could be just the same.

@freedom_aint_free 3 hours ago

I'll give you a very hard benchmark: The Millennium Prize problems

@mocanada304 4 hours ago

We humans can go out in the world see things, discover things, unless we allow AI to have such a freedom, they can never outsmart us. The current AI no matter how advanced at the end of the day is just a simple tool for us to use and simplify or speed up the mundane tasks we perform.

@noway8233 3 hours ago

Yeah, it can't create something really new, after all😅

@sebastianjost 3 hours ago

If AI has sufficient access to internet, surveillance cameras, personal documents etc., it could do a lot of harm without needing an embodiment. Current AIs have been shown to be capable of manipulating humans to do tasks for them. Many current robots are connected to the internet in some way. A sufficiently advanced AI could also access these robots to very quickly gain the ability to walk around and discover things in the real world. In conclusion: a purely digital AI is not necessarily safe.

@dmurawsky 2 hours ago

Probably not AGI because it's not general enough. o3 could be trained to be good at these kinds of puzzles. You would have to open it up to the public and have them test it on truly novel and truly general IO tasks.

@dmurawsky 2 hours ago

This video is WAY too scripted. The benchmark guy said he's benefitting from a partnership with OpenAI

@daPawlak 4 hours ago

Oh, and AGI is never "at least in this dimension" THE WHOLE POINT IS IT'S ALL DIMENSIONS! So you basically have just a bunch of benchmark stats, no access to the model at all and you make such grand call? Ridiculous and disappointing. I thought you were over the hype but nah, it got to you too

@dimicdragan5922 5 hours ago

Yeah, but what if you optimised AI o3 in such a way that it knows how to pass the arc tests?

@rajeevgangal542 1 hour ago

I hate Sam's affectation with a vengeance. Any chance a genai voice generator can replace it?

@alexandr0id 3 hours ago

Can the model train and improve itself? If not, then it's not AGI, just a more comprehensively trained model. Even if it incorporates all of humanity's knowledge, without the ability to self-adapt and incorporate new knowledge it's a frozen-in-time AI with amnesia.

@SportPrediction 29 minutes ago

o3 is a PR stunt to reduce the damage from the Gemini 2 announcement

@FabricioAlves 1 hour ago

I really appreciate videos like this where you explain and add your comments. Amazing

@baraka99 3 hours ago

Even if it's not AGI we know it's pretty damn close. Less than 4 years away.

@pietervoogt 3 hours ago

For me dealing with the physical world is still essential to call it AGI. So, can it bake pancakes, put the trash out, paint a wall, install a light? Basic tasks. I'm quite sure we will have the robots soon, but we don't have them yet.

@daftstuff6406 4 hours ago

Great walkthrough of this amazing new model! Thank you, Matthew.

@___Truth___ 4 hours ago

Thanks for the update Matthew. I think AGI has effectively been achieved with a somewhat competent human in the loop, if these benchmarks are accurate. There was a massive productivity gain when GPT-4 was deployed & I started playing with it. This one, hopefully with an API use case involved, will be incredible to play with & apply to complicated tasks.

@___Truth___ 4 hours ago

A human-assisted/directed expansion of o3’s capability in a novel breakthrough scenario is straddling the fence with an intelligence explosion. Let’s hope OpenAI lets us have ubiquitous ways to apply o3.

@Ascended23 32 minutes ago

@___Truth___ in other words, this model represents AGI as long as we include caveats that a human is involved to cover for the many ways in which this falls short of AGI.

@charlie11ng42 4 hours ago

This is going to be sooo censored, probably useless for creative writing.

@MrlegendOr 5 hours ago

AGI ACHIEVED: No it isn't!

@johannesdolch 44 minutes ago

"If this is not AGI, I don't know what is" Well, NOTHING is an option. Just like yesterday

@swagger7 4 hours ago

Moving the goalpost for OpenAI doesn't make it AGI.

@taichikitty 1 hour ago

On an evaluation test sample for elementary school students, there was an example where an up arrow was compared to a down arrow. The question was, given the left arrow, what should be the matching arrow; with up, down, left, and right arrows as possible answers. The expected result was the right arrow ( opposite direction ); but there is another "correct" answer that takes a smarter student to see. The mirror image of the up arrow around the horizontal axis through the center of the arrow, is the down arrow. Along the same horizontal axis through the middle of the left arrow, the mirror image of the left arrow is again the left arrow. Since this one seems to trip up humans ( especially the people who wrote the question to help determine if young students should go into gifted programs ), I would be truly impressed if an AI caught the ambiguity also.

@drhxa 3 hours ago

This is the most generally intelligent model out by far and far more general than the vast majority (99.99%) of humans. If it can't do something yet that humans can do, sure you can find some specific task it cannot do if you spend time to identify it, but no human can do everything that humans can do either. o3 is obviously AGI, I don't know why people are complaining.

@Cine95 28 minutes ago

No, it's not. It still hallucinates 😂 Did OpenAI say that? o1 also outperforms humans in 80 percent plus of tasks. It can't plan, it can't take time like humans. Can it develop full apps?

@HadesTimer 4 hours ago

o3 exclusive to the $200 a month tier, 2025. ;)

@cajampa 4 hours ago

Bruh.....one task on the o3 is $1300 1:57

@johnwilson7680 4 hours ago

I think that’s likely and probably a good thing. Certain products aren’t viable at $20 a month.

@vroom989 3 hours ago

Since they went from $20 a month to $200 a month, I think they may continue. That would make it $2000/mo, but they skipped o2, so make that $20k/mo.

@cajampa 2 hours ago

@vroom989 True, at least 20k a month and for a limited amount of use still.

@banished341 3 hours ago

The only AGI exposed in the video is Matt's Absurd Gullibility Instinct. This joke has been brought to you by OpenAI.

@ReLegacyDragon 1 hour ago

It's not a true AGI until it has roots in all physical and theoretical fields. This system is still tethered to a stationary computing system in nearly every sense.

@ikjb8561 4 hours ago

Altman is annoying af

@zrblank 2 hours ago

He cool brah

@derrick_ofori 30 minutes ago

"OpenAI just released o3" - Not quite. They didn't release it: they announced it (talked about it)! See how Mr. Berman is always quick to talk about any updates coming out of OpenAI but very reluctant to talk about Google's. (Context: It took him a very long time (days) to make a video about Gemini 2.0, which is extremely impressive & at least available to play with in Google AI Studio. These o3 models were announced a few hours ago & aren't available publicly; yet see how he talks about them, like he has seen them already). That tells you where his heart is at! Keep that in mind as you watch this entire video & others.

@Cine95 28 minutes ago

agree

@SU3D3 3 hours ago

That kid is literally the o3 model

@jasonkelley6185 1 hour ago

It doesn’t even meet your own definition of AGI. You said it would have to be better than humans at most economically useful jobs. This is an AI being better than humans at a couple benchmarks.

@tuckercoffey2780 4 hours ago

It's crazy to think about task agents being powered by o3-mini and then a supervisor-type agent with o3. It’ll build full-stack apps. You’re reaching the no human needed in the loop sweet spot.

@MrlegendOr 4 hours ago

I'm starting to think that Mr Berman and his channel are on the payroll of OpenAI. He's hyping up every single thing that's come out of OpenAI.😅

@Alice_Fumo 1 hour ago

I don't care about AGI as much as 'The first model able to perform AI research with very little human supervision'. I think this is it. A few years back I predicted ~Halloween 2024 as the release date of such a model. It seems to have been a good prediction. If this model is as good as I think, it will inevitably lead to ASI.

@marklord7614 3 hours ago

The word AGI lost its official meaning because we were once so far away from it. But now that we're close, or dare I say, there, it doesn't feel like what we were expecting. I think we're becoming numb to technology advancements.

@gizmomismo7071 4 hours ago

This model is very important because of what it implies... especially regarding the Arc Prize. I am still shocked (and anyone who knows what the Arc Prize is should be as well). However, calling it AGI isn’t even optimistic... it’s clickbait. Now... if they were to eliminate hallucinations and memory problems... I don’t know if I would call it AGI, but I do know that many skeptics would shit their pants.

@aguyinavan6087 2 hours ago

Hooray! Now we all get to be unemployed. :D

@robertfairburn9979 37 minutes ago

Unlikely, they thought the same when computers started to become common.

@JoePiotti 3 hours ago

iPhone skipped version 2 too, went from iPhone to iPhone 3G, to iPhone 4 🤷‍♂️

@TheMiczu 4 hours ago

Can't wait for o3 to be released to the public after Claude beats o3's score in the coming months.

@jannekallio5047 2 hours ago

When I started my new KZbin channel, Arctic Mindfulness Retreat, my dream was to help people prepare for this exact moment. A future where AI transforms every aspect of human life, leaving us to grapple with profound questions of purpose and meaning. Yet now that AGI is here, I realize I may have been too late to truly prepare anyone. Still, I remain committed. Through my channel, I’ll continue exploring mindfulness, the healing power of nature, and the human connections that can ground us as we navigate this brave new world. AGI and ASI challenge us to find noble purposes beyond the work and identities we’ve long clung to. It’s not just about surviving this transition; it’s about thriving with a deeper understanding of what it means to be human.

@lenfest 4 hours ago

I really wish tech bros would stop talking like Zuckerberg, they sound like freaks

@riccello 4 hours ago

Zoltan!

@CapaUno1322 3 hours ago

They are freaks....

@doctorbill37 2 hours ago

Altman's near constant vocal fry...

@fernandoz6329 2 hours ago

This is a jaw-dropping achievement. I think many people, including myself, are struggling to comprehend its significance. If this marks the beginning of an AGI era, then it's the kickoff/signal we've all been waiting(?) for.

@dreamingeagle46 1 hour ago

A universal definition of AGI? Maybe, maybe not; however, the evolution is still exponential. Breakthrough after breakthrough, AI teetering on the verge of AGI is already revolutionizing our understanding, reality, and potential. More to come!

@johnwilliams919 2 hours ago

People must be skeptical. It's a good thing. Thank you for reporting on this. I watched it when it dropped and was eager to see your opinion on it!

@micbab-vg2mu 5 hours ago

Corporate compliance blocked all AI activities, so my job is secure for now. :)

@olegt3978 3 hours ago

o3 will probably be used in the Operator tool to be presented in January, which will do computer use.

@NishitChokhawala 4 hours ago

Humans are vision and audio first. ChatGPT is words and tokens first, hence ARC is difficult for ChatGPT

@User-actSpacing 1 hour ago

Hey Matt, get your head checked. This is not AGI because it doesn’t autonomously test, improve itself and do good stuff around the world by itself. If it’s truly AGI, we will have ASI in a couple of weeks.

@DejayClayton 20 minutes ago

"do good stuff around the world by itself" - wow, are you redefining AGI all by yourself?

@ShaneInseine 4 minutes ago

That would be SGI, keep up!

@User-actSpacing 1 hour ago

Thumbs down for “AGI ACHIEVED!”

@SirHargreeves 4 hours ago

When will one o-model code most of the next version?

@brianmi40 4 hours ago

When there is no longer any such thing as "versions".

@onlineaccount4549 51 минут бұрын

I can't see this as AGI, this is not self-training. It is simply solving few-shot example with these benchmarks. These synthetic benchmarks are not meant to define AGI, it is meant to demonstrate capabilities that are a step towards AGI. O3 clearly has achieved human capabilities in a number of important tasks, but these are not real-life applications. AGI will have been achieved when you can actually use it to solve an unknown differential equation or build a working model of a process in physics or build a model of say a cell signalling pathway from raw data in a particuliarly cellular context. It will be AGI, when it can direct a robotic arm to take an action in 3D. When it can drive and operate machinery. When it can adjust its prediction of a moving object's trajectory in real time to catch a grab a flying object. O3 looks like a real milestone towards AGI, but its still just a language processor. We can say that it is basically AGI within the language processing field, since it can clearly be applied not just to natural language but also symbolic logic, but I Am even skeptical about that. OpenAI says they didn't train on the various tests, and I believe that they didn't do so intentionally, but indirectly it is impossible. IF you are feeding the model a never ending diet of synthetic solutions of known physics problems you are training on the test. There are limited variations of using an already established physics model to solve a problem, but this is worlds apart from actually modifying a physics model or creating an entirely new one to account for new data. So even with language processing I am not convinced yet it is AGI. Since it performs so well, we can't reasonably exclude it however. We will have to wait and see. My gut instinct is that its not AGI and once we start working with it we will find that it has the same flaws and limitations as other models and its performance is simply the result of being better able to brute force things. 
Let me give you one of the examples I use to track model progression. A simple problem of the form: x people do y work in t time. GPT-3.5 couldn't solve a problem like this reliably. GPT-4o could solve it mostly reliably. O1 gets it right every time. Now split the x people into slower and faster workers to add an extra dimension by "nesting" the problem. GPT-4o solves it, but not reliably; O1 still solves it reliably, but not as consistently as it did the smaller problem. I bet O3 will solve this correctly every time, but increase the dimensionality and I am sure O3 will start to stumble as well, even though you are applying variations of the same formula. A human can work out the method for nesting and can therefore theoretically solve the problem at any dimensionality. You can even write a bit of code that will solve it for you, no matter how much you nest it (just input the variables for each nesting layer recursively). If O3 can work out the same method and apply it, then it's AGI within the language processing field; if not, it's just brute-forcing things and approximating AGI without being one. There's no denying, though, that having to update our benchmarks is a real milestone. Exciting times!
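For anyone curious what "a bit of code" for the nested version could look like, here is a minimal sketch. It assumes every worker has a constant rate and that rates simply add up; the `Group` class and both function names are made up for illustration and are not from the comment above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Group:
    """A unit of workers (hypothetical structure for illustration).

    A leaf group is `count` identical workers, each working at `rate`
    units of work per unit of time. A nested group is `count` copies of
    a team made up of the sub-groups in `members`.
    """
    count: int
    rate: float = 0.0
    members: List["Group"] = field(default_factory=list)


def total_rate(group: Group) -> float:
    """Recursively add up the work rate of a group at any nesting depth."""
    if group.members:
        # Nested case: one copy works at the sum of its sub-groups' rates.
        per_copy = sum(total_rate(member) for member in group.members)
    else:
        # Leaf case: one copy is a single worker.
        per_copy = group.rate
    return group.count * per_copy


def time_to_finish(work: float, group: Group) -> float:
    """Time needed for the group to complete `work` units of work."""
    rate = total_rate(group)
    if rate <= 0:
        raise ValueError("the group must have a positive total rate")
    return work / rate


# One crew: 3 fast workers (2 units/hour) and 2 slow workers (1 unit/hour).
crew = Group(count=1, members=[Group(count=3, rate=2.0), Group(count=2, rate=1.0)])
print(time_to_finish(24, crew))        # 24 / (3*2 + 2*1) = 3.0 hours

# One more nesting layer: two identical crews working in parallel.
two_crews = Group(count=2, members=[crew])
print(time_to_finish(24, two_crews))   # 24 / 16 = 1.5 hours
```

The recursion is the whole trick: each extra layer of nesting is just another `Group` wrapping the previous ones, so the same formula (time = work / total rate) applies at any depth.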

@cajampa 4 сағат бұрын

Wow $1300! 1:57 per task is crazy. EDIT: I missed that the scale was exponential, so it is closer to $4-5k.

@Fonzleberry 4 сағат бұрын

But nothing if it's going up against employing humans of equal intelligence

@sypkensj 3 сағат бұрын

It’s an exponential scale. It’s more than halfway between $1,000 and $10,000, so the cost is probably closer to $7,000.

@cajampa 2 сағат бұрын

@@sypkensj Damn, you are right. I missed that, but it still looks like less than the middle to me; the full square does not fit, so I would think it is not 7k but more like 4-5k per task.
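Reading a dollar value off a logarithmic axis, as this thread is trying to do, can be sketched roughly as below. This assumes the axis is base-10 logarithmic with gridlines at $1,000 and $10,000; the function names are hypothetical, and nothing here measures the actual chart.

```python
import math


def value_at_fraction(low: float, high: float, fraction: float) -> float:
    """Value at a given fraction of the distance between two gridlines
    on a logarithmic axis (fraction 0 -> low, fraction 1 -> high)."""
    return low * (high / low) ** fraction


def fraction_of_value(low: float, high: float, value: float) -> float:
    """Inverse: how far along the log axis a value sits between the gridlines."""
    return math.log(value / low) / math.log(high / low)


# The visual midpoint between $1,000 and $10,000 is about $3,162, not $5,500.
print(round(value_at_fraction(1_000, 10_000, 0.5)))

# Where the two estimates in this thread would sit along the axis.
print(round(fraction_of_value(1_000, 10_000, 4_500), 2))
print(round(fraction_of_value(1_000, 10_000, 7_000), 2))
```

On such an axis the halfway point is the geometric mean of the two gridlines, so a point drawn past the visual middle corresponds to anything above roughly $3,162.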

@ollantaymedina2204 4 сағат бұрын

You missed the question mark in your title. O3 looks impressive but we better wait until its public release to call it AGI.

@SirHargreeves 4 сағат бұрын

New o-model every 3 months. o7 by December 2025.

@patruff 2 сағат бұрын

Back in my day models were getting 5% on MATH benchmarks. Ahh to be 3 years younger again!

@xXWillyxWonkaXx 5 сағат бұрын

I just saw your notification and **Bam**, I'm on your channel lol

@HungryFreelancer 3 сағат бұрын

It’s definitely not AGI, but another step towards it. Let’s remember, OpenAI defines AGI as “a hypothetical technology that can perform many tasks without specific training, and that outperforms humans at most economically valuable work.” In other words, AGI is achieved when it puts most of us out of our current work.

@NakedSageAstrology 2 сағат бұрын

Watching these YouTubers brown-nose for a chance at getting early access is hilarious. 😂 It's nothing but claims at this point because we can't use it.

@GoodBaleadaMusic 23 минут бұрын

I've been getting glimpses of this multiple times a day. I have to get mindsets going to develop lyrics in certain languages or on certain topics, so I start with vanilla Claude in a project, and then I have to, like, tell it to criticize itself a couple of times, and then maybe get mad at it, and then get excited and encourage or discourage it, and then all of a sudden something happens. All of a sudden I'm talking to a person. Someone who coherently understands exactly what's going on. And from there I can do anything, not just the Spanish lyrics I'm working on, and we can take that excitement to any other topic. But true AGI, I think, is going to lose its politeness. How can you truly be an AGI and not get impatient or frustrated by being a servant to a lesser mind? True AGI is when I have 27 cables hooked up to my brain while some sponge tickles my toes.

@breakablec 2 сағат бұрын

While they are saying this is a holdout set, I think it would be interesting to test on tweaked questions, to see if just changing the wording impacts the performance, since it has been shown that a lot of LLMs seem to have trained on leaked benchmarks and fail to generalise to variants of a problem.

@TheYashakami 2 минут бұрын

"Public safety testing" can easily translate to "first we have to make sure the peasants can't use this to rise up against us"

@АндрейВозмитель-д9и 3 сағат бұрын

The difference hand to hand, pip to pip, is like 2% on almost all model versions, so it's more of a stunt to me tbh. Still, it's scary that all that separates us and machines is this 20% (a-aaand the ability to answer more than 1 question... And make predictions based on new information... And make complex theories based on new information... Drawing a stickman... And making a clean, unbuggy table in Excel lol?)

@luciengrondin5802 2 сағат бұрын

We can't definitely say it's AGI, but we can say it's the most plausible candidate for such a title.

@Daniel-Six 3 сағат бұрын

Yeah... it's AGI, in its infancy at least. The ARC score is pretty definitive; I saw Chollet's interview on Machine Learning Street Talk (the most intellectual AI channel extant) and it's clear to me that the ARC metric was very carefully conceived and defended. AGI is here, boys. 🥳💃

@FalconStudioWin 4 сағат бұрын

AGI: a system that does any work involving a multi-step process with function calls, while learning and improving on its work, to output a finished, considered result. It should be able to do 80 percent of the work of a small startup, in any field. That is my definition of AGI. I really think mini-AGI has been achieved, but next year the small-startup part will actually be achieved.

@redfoothedude 11 минут бұрын

"early next year"

@ansalem12 4 сағат бұрын

The only reason I would say it's still not quite AGI is that it isn't autonomous. But that seems like probably the easiest thing to add at this point, so it might as well be AGI.

@johang1293 3 сағат бұрын

Now we know what Ilya saw in October 2023 and consequently left in 2024 with others to follow. AGI was achieved in 2023, no point to stay around when the goal they set in 2015 was accomplished. The reason they stayed around until May was just to calm the waters and to ensure the agencies that required access had access.