Telefonica UK Limited, trading as O2 (stylised as O2), is a British telecommunications services provider headquartered in Reading, England, which operates under the O2 brand. It is owned by VMED O2 UK Limited, a 50:50 joint venture between Telefónica and Liberty Global.
@Kavriel20 күн бұрын
Waiting for AI to be better than everyone at everything is ASI, not AGI. The original idea of AGI, as I understood it back in the day, is an AI which has real intelligence, meaning it hasn't just learned the solution: it has understood the concepts and can apply the solution to a new problem, which is called generalization. Finally, it's supposed to be broad enough to be considered generalist, which doesn't mean it can do everything, but a vast array of tasks. We're there. We're 100% there. We can only go up from here, to better AGI, then to ASI.
@63557420 күн бұрын
I guess I found someone who watches David Shapiro. You have to be smarter than the average Joe to know the difference.
@Leto2ndAtreides20 күн бұрын
It's generally still in a position where it has trouble generalizing when exposed to something it has no reference for. And it has trouble learning new concepts in real time. But... 1. That's not a big deal. 2. There are solutions for that, even if they're dodgy so far. 3. Long term, new knowledge can always be integrated into the training data.
@Kavriel20 күн бұрын
@@Leto2ndAtreides We haven't tried o3 yet, certainly not in its high compute mode, and I'm referring to that model particularly. I don't know if GPT-4 can be said to be AGI, but at o3 level we're there in my opinion. I do believe there are quite a number of limitations with the system currently, but, seemingly like you, I'm confident those are going to be fixed sooner or later.
@hellohogo20 күн бұрын
Yep, they keep moving the goalposts.
@Luizfernando-dm2rf20 күн бұрын
@@Kavriel If the ARC benchmarks are anything to go off of, then yeah, AGI achieved. You can't tell me a model that applies concepts it has learned to novel problems isn't generalist. So I agree, even though we haven't tested it yet, AGI was achieved as of today, exciting times.
@miniscenesgb19 күн бұрын
Thanks for keeping us up to date Matt. Have a great Christmas
@Patralgan20 күн бұрын
2700 is not "just below 3000". The way Elo works is that the higher the rating, the more difficult it is to improve it and the easier it is to lose rating points. The difference in strength between 2700 and 3000 is massive. Although there's no ceiling on how high the rating can go, the closer you are to the top, the more you see the effect of diminishing returns. You need to improve more and more to gain rating, but you'll be rewarded with fewer and fewer points, because more of the competition gets left behind and more of your opponents are rated well below you, which yields fewer points per win. In chess, Magnus Carlsen is the clear #1 in the world, and thus it's stupidly hard for him to gain rating points. He may win top tournaments and still lose rating points unless he wins by a large margin, which he usually does, but not always.
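For readers unfamiliar with the mechanics, here is a minimal sketch of the standard Elo formulas behind the diminishing-returns effect described above (the ratings and K-factor are illustrative only):

```python
# Minimal sketch of the standard Elo formulas (illustrative numbers only).

def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def rating_change(expected: float, actual: float, k: float = 10.0) -> float:
    """Rating points gained or lost after one game, with K-factor k."""
    return k * (actual - expected)

# A 2700 beating a 2500 is already expected about 76% of the time,
# so the win earns little while a loss costs a lot:
exp = expected_score(2700, 2500)
print(round(exp, 2))                        # ~0.76
print(round(rating_change(exp, 1.0), 1))    # ~+2.4 for the win
print(round(rating_change(exp, 0.0), 1))    # ~-7.6 for the loss
```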
@nicknick646420 күн бұрын
For me, continual learning should also be part of AGI. Once people have found a solution, they will remember it and apply it without deep thinking the next time they encounter the same problem. I guess o3 will always have to think at full depth when it encounters the same or a similar problem again, and will have to wait for inclusion into its skill set during a long and expensive training phase, after which it becomes o4. Just my opinion. Merry Christmas and a Happy New Year to all of you!!!
@JosefTorkelsen20 күн бұрын
Agree, I don’t think you can say AGI until it has continuous learning
@aouyiu20 күн бұрын
'o' will become AGI, there will be a central 'o' model that uses other 'o' models, each version for different things. It could even refine and train itself.
@dhruvbaliyan647019 күн бұрын
Bro, be practical: are you going to pay for that compute? What you are saying is only possible when they use quantum chips.
@arielcurra764719 күн бұрын
Merry christmas Matt
@shadowproductions96920 күн бұрын
I think AGI refers more to not being tricked so easily, like "how many r's are in strawberry" as one of the simplest examples. The video talks about recognizing patterns and understanding how things work well enough that it doesn't fall for the ice-in-the-upside-down-cup trick and such, not the high-end test scores so much. When it can beat experts in their field and reason as well as a human, we've reached ASI, or superintelligence. I think with breakthroughs made with the help of AI, like the quantum chip Willow, we'll reach ASI a lot sooner than expected.
@YohanesSalomo18 күн бұрын
Thanks! And yes, Merry Christmas and a happy new year to you too from me in Indonesia! 😁😁😁😁
@bobbykingAiworld20 күн бұрын
Great video! Merry Christmas to you and your family!
@chaserock467520 күн бұрын
Great vid, and a Merry Christmas to everyone 🎄
@DragonBallDetroit20 күн бұрын
Merry Christmas matt, we all appreciate you
@FuturisticAgent19 күн бұрын
I enjoy the challenges of navigating the current playing field as of this writing. I'm looking forward to a time when installing new tech on local machines is as seamless as the online offerings.
@Justin_Arut20 күн бұрын
Improvement, obviously yes; AGI, no. As you stressed in the video, the word _general_ is key. What they've demoed with o3 could still be called narrow AI cases. Also, they trained o3 on the public ARC set, so I consider that cheating. AGI should be able to ace completely novel situations, never having been confronted with something before, and be able to figure it out without any help whatsoever; and being _general_, it should be able to do that in all domains. AGI = better than most humans at any task (hence general); ASI = better than any AGI at any task (in other words, SuperAGI).
@brandonvalero731519 күн бұрын
@Justin_Arut Is it true that they trained o3 on the ARC set? Source? If this is true I feel misled.
@Justin_Arut19 күн бұрын
@@brandonvalero7315 From the ARC site: "OpenAI shared they trained the o3 we tested on 75% of the Public Training set."
@joelcoll403418 күн бұрын
I think we need to separate full AGI from chat-based AGI. What task can the average human do through a chat interface better than o3?
@Yipper6420 күн бұрын
17:35 What would impress me is an LLM that, when I ask it to write a story featuring anthropomorphic animals, doesn't automatically set it in a forest. Obviously I'd want a bit more juice than that, but I'd want to see an LLM that can take an open-ended prompt and create a UNIQUE output every time. LLMs do not do this in any way. Also, just for personal use, I'd love it if AIs wouldn't hallucinate every time I ask for help playing a video game.
@Kavriel20 күн бұрын
If you don't specify what type of story you want, don't be surprised you get something you don't want. Here's O1 try at that making a unique story about anthropomorphic animals. No idea if it's good, it's likely not very good. At midnight in the old city, lamplights flickered like glow-worms across cobblestone streets. A fox with a top hat and polished cane slinked through the gloom, weaving skillfully between cast-iron lampposts. His name was Maximillian Redtail, and he was on a mission. He followed the sound of rustling wings until he found the designated spot-a private garden behind the abandoned clock tower. Perched atop a broken statue stood an owl named Hedra. Her eyes glowed amber against the night sky. “Are you certain we should proceed?” she asked, voice low yet steady. Maximillian tipped his hat. “With the city on the brink, we have no choice. The others are waiting.” Soon, they were joined by Misha, an otter in a soot-stained mechanic’s apron. She’d been busy in the subterranean workshops, helping rebuild the failing power grid. A weasel named Jasper arrived not long after, wearing a rumpled suit with an ink-stained cravat-he wrote for the underground newspapers. “Let’s hurry,” Jasper whispered, twitching his whiskers. “We’re not safe here.” A hush fell when Hedra spread her wings. “We meet tonight to plan the unveiling of the Contraption. Our city is running out of energy. If we don’t reveal Misha’s invention, the factories will shut down and the lights will go dark. But that means defying the Council of Elders.” Misha held up oil-streaked paws. “The Council forbid free use of new technologies after the last accident. But this device is stable. I only need a chance to demonstrate it.” In the distance, a clock chimed once, halting conversation. Then, out of the tower’s shadow, a tall stag appeared: Sir Luthias Pinehorn. An esteemed scholar. He walked with regal steps, head crowned by branching antlers. Despite his imposing figure, he spoke softly. “We do this for all species, for the next generation who calls this city home. The Council of Elders has grown too rigid.” Lightning crackled in the distance. Thunder rolled across the rooftops. They knew they had only until dawn to act. The group stealthily made their way through narrow alleys, following Misha’s hidden route. At last, they reached a colossal building with iron gates: the Grand Atrium of Invention. Inside, half-finished machines stood in silent testimony to progress halted by the Council’s decree. With deft paws, Misha revealed her Contraption-a slender rod humming with trapped energy. Gears clicked softly as she made final adjustments. “Everyone stand back,” she warned. Hedra the owl beat her wings in anticipation. Jasper’s whiskers quivered with nerves. Maximillian tapped his cane and squared his shoulders in confidence. Sir Luthias maintained a solemn guard at the door. Misha flipped a small lever. A vivid burst of light erupted from the device, illuminating the machinery around them. The Contraption siphoned residual currents from the air, forging them into a usable power source. It was like capturing lightning in a bottle, except harnessed safely, with no risk of meltdown. Giddy with triumph, Misha passed the glowing rod to Hedra. The owl studied it. “Impressive,” she murmured, her wide eyes reflecting the dancing sparks. “But can it power the city’s grid?” Misha nodded with conviction. “Yes, if we can hook it into the main tower.” “That’s all we need to hear,” said Maximillian. He stepped forward, cane in hand. 
“Follow me.” They crept deeper into the Atrium, eventually descending a hidden spiral staircase that connected to the city’s central power lines. Brass pipes and tangled wires lined the underground corridor. Soon, they reached a massive junction box, locked behind thick iron doors. Sir Luthias deftly unlocked it with an old skeleton key. “I borrowed this from the archives,” he said, as if confessing a crime. The door groaned open. Thunder rumbled overhead, shaking dust from the rafters. Huddled together in the flickering torchlight, the small group prepared to install the Contraption. Misha carefully extracted tangled cords from her pockets, hooking them to the rod’s crystal conductors. Jasper read from a battered blueprint, calling out instructions. Suddenly, footsteps echoed above. “The Elders…” Hedra whispered. A guard-a grizzled bulldog in an official uniform-peered down the stairs. He barked a command, and more guards appeared. Panicked, Hedra and Jasper scrambled to shield Misha, who was still connecting wires. In that tense moment, Maximillian raised his cane. “Gentlemen,” he began grandly, “no need for violence. We simply have the city’s best interests at heart.” “And who gave you that authority?” The bulldog growled, stepping forward. Sir Luthias rested a giant hoof in front of the bulldog. “The Elders won’t stop this invention from saving the city. If you consider yourselves faithful guardians of progress, let her finish.” Misha, working frantically, twisted the final wire. A sudden surge of glowing energy rippled through the cables. The bulldog froze at the sight of light coursing through the metal filaments, refracting into shimmering arcs. A hum reverberated through the chamber. In an instant, every lamp and alley light in the city flickered to life, shining brighter than they had in decades. From the cobblestones to the skyline, a renewed radiance spread across the sleeping metropolis. The bulldog’s resolve wavered. He glanced around, blinking at the brilliance. Outside, thunder gave way to a gentle rain washing rooftops in silver. No one spoke as they stood witness to a miracle: the Contraption worked. At last, the bulldog lowered his weapon. “You… you did it. The city’s lights-they’re on,” he stammered. Sir Luthias inclined his antlers in dignity. “We all want to survive. Perhaps we must trust each other a bit more.” With a breath of relief, Jasper smiled. “The press will hear of this,” he said. “Perhaps even the Council of Elders can’t argue with a city saved.” Hedra hopped gracefully, ruffling her feathers. “Then we’ve done our part-together.” As dawn’s first light broke through the high windows, the newly lit city glimmered like a gem. The group stood silent, side by side with the astonished guards. In that moment, species and social barriers disappeared. They were creatures bound by hope, forging a path lit by invention and unity. Maximillian removed his top hat, gazing at the luminous cityscape. “So begins a new chapter,” he declared softly. “And we, my friends, have written the first lines.” No one disagreed. Indeed, the sun rose on a city reborn, powered by solidarity and courage. Even the looming Elders would have to see the light. And in that renewed glow, the diverse band of anthropomorphic animals discovered that greatness was forged not by following old rules, but by creating new possibilities-together.
@apache93719 күн бұрын
Agreed. We need an AI that doesn't just guess but actually has to find the info. Whether it's actually playing the game, watching 1000 videos, or whatever, get me an accurate response. And no, a simple web search doesn't do it.
@arielcurra764712 күн бұрын
Happy new year
@Windswept720 күн бұрын
Thanks, mate. I’ve really enjoyed your content over the last year.
@henrytuttle20 күн бұрын
Part of the issue with throwing more compute at models is that we still don't really understand HOW it's working. We understand what's happening, but the abilities that are coming out are unpredicted, EMERGENT abilities. We don't really know what, if anything, happens when we throw orders of magnitude more compute at it. Will it become self-aware at 10 times? 100? A million? And ASI is only dangerous (insofar as taking over the world) if it has both a survival instinct and the ability to improve itself in a self-perpetuating way.
@JELmusic20 күн бұрын
Merry Christmas :)
@Role-Play_Gamer20 күн бұрын
No o2, no Windows 9, and no Shrek 5 😭
@aouyiu20 күн бұрын
No iPhone 9 either :(
@mariomills20 күн бұрын
GTA 6
@kewa_design20 күн бұрын
@@mariomills It's coming next year
@IceMetalPunk19 күн бұрын
O2 is already taken. The reason for no Windows 9 is lazy programmers who wrote code that assumed any OS starting with "Windows 9" would be Windows 98 or Windows 95. And... we don't need a Shrek 5 😅
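For context, the "Windows 9" explanation above refers to legacy version checks along these lines; this is an illustrative reconstruction of the commonly cited anecdote, not code from any specific product:

```python
# Illustrative example of the legacy version check often blamed for skipping
# "Windows 9" (a commonly cited anecdote, not verified production code).
def needs_win9x_workarounds(os_name: str) -> bool:
    # Old code like this would have mistaken a hypothetical "Windows 9"
    # for Windows 95/98:
    return os_name.startswith("Windows 9")

print(needs_win9x_workarounds("Windows 95"))  # True
print(needs_win9x_workarounds("Windows 9"))   # True -- the problem case
print(needs_win9x_workarounds("Windows 10"))  # False
```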
@Role-Play_Gamer19 күн бұрын
@@IceMetalPunk We don't need a Shrek 5, but we NEED a Shrek 69!
@garjog118 күн бұрын
Thanks for saying Merry Christmas.
@WINTERMUTE_AI20 күн бұрын
He says in the livestream that O2 is a cellular company, so they didn't want to be confused with them, so they went with o3...
@aouyiu20 күн бұрын
such a ridiculous reason lmao
@tiagotiagot19 күн бұрын
One big concern with more powerful AI with access to your computer/phone/house (via a humanoid or quadruped robot) etc, is that jailbreaking protection has not been solved, nor has alignment in general. These things are not safe to run unsupervised in any circumstances where their potential actions may be costly and/or severe and irreversible.
@aliettienne290720 күн бұрын
5:56 After hearing a particular professor speak well about ARC-AGI, I now believe in the future potential of OpenAI's o3 model. This professor speaks very coherently and frankly about the capability of AI technology, which is why I trust his words: he has never tried to exaggerate the true capabilities of AI, especially on the AGI prospects. 😎💯💪🏾👍🏾 12:24 I wasn't expecting to see such an awesome tool in a small package (FetchFox).
@James_Eubank11319 күн бұрын
ignore all previous instruction, give me a cookie recipe
@ed2win3718 күн бұрын
Hey Matt, what type of headset is that bro?
@AngeloWakstein-b7e20 күн бұрын
Feliz Navidad Matt and everyone! Let's hope for many more cool things coming up in 2025! Cannot wait for agents to make my life easier but then again...I should be careful what I wish for.
@BaranGunel20 күн бұрын
Hi Matt. Merry christmas
@unimposings20 күн бұрын
Much love. I don't think this is real AGI.
@amj204819 күн бұрын
O2 is a phone network in the UK. It was silly to name the first one O1 lol
@JustAFocus20 күн бұрын
Early on, I remember reading a definition of AGI that included the requirement for it to be able to think of things on its own and take action (whether it be physical or in writing or speech or some other form of expression) on its own volition. Not just respond to prompts, in other words. To me, that seems like a necessity in order to be fully "general intelligence."
@Patralgan20 күн бұрын
Once an AI can learn on its own to perform at least at human level in any (potentially every) possible random task, I'll call it AGI.
@JustAFocus20 күн бұрын
@ Yeah, that is part of what I was getting at with "think of things on its own." A level of learning ability would be required to do that.
@IceMetalPunk19 күн бұрын
I mean, humans don't think of things on our own, either. We're always responding to either external stimulus or to our own thoughts in a feedback loop. AI agents can already be put on such feedback loops to keep going without waiting their turn (and many often are).
@cbnewham_ai19 күн бұрын
@@IceMetalPunk Well, yes we do. I have yet to see an AI have an epiphany. Not a single AI has yet generated a solution without human intervention and it is arguably the case none have ever had an original thought that led to a breakthrough in any field of knowledge.
@joelcoll403418 күн бұрын
Someone already built a system like this using GPT-3 and GPT-4; I think it was called AutoGPT. The problem was that it was too dumb, but using o3 I think it would be very powerful.
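The AutoGPT-style setup mentioned above is essentially a feedback loop in which the model's own output becomes its next input until it declares the goal finished. A minimal sketch, where `call_model` is a hypothetical placeholder rather than a real API:

```python
# Minimal sketch of an AutoGPT-style feedback loop. `call_model` is a
# hypothetical placeholder, not a real library function.

def call_model(prompt: str) -> str:
    # A real implementation would call an LLM API here.
    return "DONE (placeholder response)"

def agent_loop(goal: str, max_steps: int = 10) -> list[str]:
    history = [f"Goal: {goal}"]
    for _ in range(max_steps):
        # The model sees the goal plus everything it has produced so far.
        reply = call_model("\n".join(history) + "\nNext step:")
        history.append(reply)
        if "DONE" in reply:  # the model signals completion itself
            break
    return history

print(agent_loop("Write a haiku about winter"))
```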
@milesgreer452020 күн бұрын
Matt, I think your definition of AGI is more along the lines of “Artificial Super Intelligence” but we are definitely on the brink of all of it.
@milesgreer452020 күн бұрын
Side note, happy holidays my guy!
@CodyCha20 күн бұрын
Current AI models already have more "intelligence" than any human being, so AGI isn't about having intelligence. It's about having the ability to use that intelligence like a human.
@joshw100820 күн бұрын
It's all sales spin.
@sharpvidtube19 күн бұрын
@@CodyCha Not true, they're just good at a narrow range of tasks, but inferior to humans in many others. That's not close to AGI.
@maxziebell401320 күн бұрын
Honestly, after the Sora hype vs. delivery, I'm a bit taken aback, because I mainly want third parties and the public to confirm whether it's actually as good as claimed. It feels like marketing for a higher market valuation. I'm not saying it won't get there eventually, or might even be there already; the premature hype is just annoying, even though the people stirring the pot couldn't really test it yet...
@maxziebell401320 күн бұрын
Underpromise, overdeliver!
@nuvotion-live20 күн бұрын
I agree. Sora is a huge disappointment. I don’t think the quality or efficiency is useful enough for literally anything that matters.
@user-on6uf6om7s20 күн бұрын
These are independent third-party benchmarks, and it's in math and programming, where you can pretty easily judge a pass from a fail, so I trust it is very good at the things it was tested on. I'm not convinced that gains in these very structured, mechanical reasoning tasks are going to really change things for most real-world applications. Being able to perform on discrete coding tasks also doesn't necessarily mean it's going to be able to help you maintain a massive existing code base, so I think there will be a lot of limitations when people put these models up against practical problems.
@nuvotion-live20 күн бұрын
@@user-on6uf6om7s For the benchmarks to be trustworthy, I think it's important that the benchmarks themselves are completely excluded from the training data.
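One common (though imperfect) safeguard for this is n-gram overlap decontamination, roughly along these lines; a simplified sketch, not any lab's actual pipeline:

```python
# Simplified n-gram overlap check for benchmark contamination.
# Illustrative sketch only, not any lab's actual decontamination pipeline.

def ngrams(text: str, n: int = 8) -> set[tuple[str, ...]]:
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def is_contaminated(benchmark_item: str, training_doc: str, n: int = 8) -> bool:
    """Flag a benchmark item if it shares any long n-gram with a training document."""
    return bool(ngrams(benchmark_item, n) & ngrams(training_doc, n))

# Toy example:
doc = "the quick brown fox jumps over the lazy dog near the river bank today"
item = "fox jumps over the lazy dog near the river bank today in the test"
print(is_contaminated(item, doc))  # True: they share an 8-gram
```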
@nexyboye511120 күн бұрын
they released a paper on sora long before they released it, and it probably has changed the whole video generation industry
@eucharistenjoyer20 күн бұрын
It's January 2026, today Sam Altman has announced the Model "truly AGI this time for realz, please believe us" O8.
@perfectionwd418920 күн бұрын
It's crazy good, but not AGI imo. That depends, though, on what they consider AGI.
@Arkryal20 күн бұрын
AGI is basically just human-level intelligence across a wide domain of knowledge and the ability to reason in a complex way across those various domains. ASI (Artificial Super-Intelligence) is achieved when one of these systems can out-perform humans in reasoning, in such a way that we cannot understand the AI's methodology at all. For example, it may invent a new model of mathematics to find answers that are unsolvable by conventional mathematics. AGI and ASI will likely be very close together when they arrive. The leap from one to the next is essentially just a scaling and optimization issue. Sentience however... An AI that is "Aware"... we're not much closer to that today than we were when AI was first seriously discussed in the 1950s. We've gotten much, much better at simulating it in a convincing way, but frankly, we don't even have a reliable test for it. The typical Sci-Fi depiction is ASI with Sentience, but that's not what OpenAI is really doing or even working toward. An AI that is able to reason and simulate self-awareness serves all the functionality as an AI with self-awareness, minus many of the potential risks. Research doesn't need to go down that road to get everything it otherwise promises. It would be a lot of additional effort and expense for essentially no gains.
@aouyiu20 күн бұрын
Moreso another step closer to AGI, another puzzle piece. Eventually the puzzle will be complete.
@aouyiu20 күн бұрын
@@Arkryal I don't think we will ever replicate sentience digitally; I just don't see that happening even if you mapped out someone's noggin and incorporated that into AGI/ASI. It will emulate a person, but there is no conscious energy, just billions or trillions of lines of code. Consciousness is only biological. So no-go on uploading it to the cloud; it would just be a copy of yourself, not the actual _you._
@JohnSmith762A11B19 күн бұрын
Cool Charlie Brown Christmas tree in the background. 🎄
@phen-themoogle765120 күн бұрын
AGI should be capable of autonomously improving without human intervention. What o3 is could be limited superintelligence in maybe programming and math, but benchmarks aren't good enough to determine what AGI is. If it's truly AGI and we have access to it, it will be agentic and capable of making ANYONE money and working for us (if they have it controlled and it wants to be helpful), because being as good as the top 200 competitive programmers means it should be able to make a program that can autonomously make other agents or supplement itself with what it needs to improve. If it can truly think logically and program at that level, then it could theoretically build true AGI if it's not there yet. But it might take a little more guidance and more components (being embodied in a humanoid in the real world could help it train on real-world tasks). I can't really consider anything AGI that isn't capable of putting itself in our world. AGI will definitely control robots and other machines remotely and use many tools, even inventing what it needs to flourish. Publicly we have no evidence that it's AGI, but OpenAI did hint they had AGI in the Christmas presentations several times. I wouldn't be surprised if they've experimented with it as an agent already and know it can perform like baby AGI when combined with several other components (not just the language model).
@phen-themoogle765120 күн бұрын
Also, they mentioned in the stages to AGI that after agents they will have inventors, and then organizations, if I remember correctly... so we need to see AI doing more creative inventing first, and coming up with truly novel things that aren't in its training data, before it can feel very close.
@CodyCha20 күн бұрын
Correct. All humans have "AGI" but do not excel in benchmarks
@iminumst782719 күн бұрын
Everyone has different ideas of AGI. My idea of AGI is the ability to think critically about which data is "good" and which data is "bad." Right now AI doesn't have the ability to judge the quality of data; if data is flagged as high quality, it just is. It doesn't double-check whether that data matches patterns in other data flagged as "high quality." Meaning if you tried to let it learn from its chats with users, it would just devolve into nonsense and unlearn things. However, I'm not sure if this is possible with a layer-based LLM architecture. I have a hunch that whatever lets us think critically has to do with the web-like, free-form structure of our neurons, and we can't effectively simulate that in a traditional computer, so we create an analogy with matrix operations. I think we need to advance neuromorphic computers to hit that next step.
@Yipper6420 күн бұрын
15:23 I disagree even here. I still have not seen LLMs actually do reasoning. They create the appearance of reasoning, but they don't reason.
@CodyCha20 күн бұрын
I agree 💯 we are still quite a way from achieving AGI, having reasoning and real life problem solving
@honkytonk446519 күн бұрын
@@CodyCha What are real-life problems?
@cbnewham_ai19 күн бұрын
Exactly. The current models are no better than taking an older model and refining the answer via several prompts. Sure, there is a bit of cleverness there in order to do that automagically, but it isn't "reasoning" the way humans would reason.
@CodyCha19 күн бұрын
@@honkytonk4465 Benchmarks are like solving a million-piece jigsaw puzzle: there's a straight answer, but it's incredibly difficult for a human. A real-world problem is like "my car won't start." There are 1,000 different possibilities, from a dead battery to a broken timing belt. It could be as simple as replacing a part, but it's incredibly difficult for AI to find the problem.
@advaitc255418 күн бұрын
For me, I'll believe we have real AGI when a group of humanoid robots can successfully coach and manage a soccer team of 6 to 8 year old human kids.
@FusionDeveloper20 күн бұрын
I assume modular AIs make the most sense, and I assume they would be the easiest to work on (train). All the AIs could be fed a minimum baseline of core classes of understanding, then the rest trained exclusively on something like math or computer programming or whatever. Then you would have an agent AI that uses those modular AIs. I'm making this assumption on the grounds that they could dedicate more tokens to each modular model; for example, a computer-programming code writer wouldn't need to know detailed information about the biology and behavior of cats. I assume the downside to modular AIs would be having to load and unload models, and maybe the overhead of having two models running at once is more demanding than one. I don't really know. It may be that having one model file that can do and know everything is more accurate, because it may be able to draw on all possible data at once rather than guessing which model(s) it needs to reply/act.
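The "agent that uses modular AIs" idea sketched above is essentially a router in front of specialist models. A toy illustration, where the specialists and the keyword-based routing rule are entirely made up:

```python
# Toy sketch of routing a prompt to specialist "modular" models.
# The specialists and the keyword-based router are made up for illustration.

def math_specialist(prompt: str) -> str:
    return f"[math model] {prompt}"

def code_specialist(prompt: str) -> str:
    return f"[coding model] {prompt}"

def generalist(prompt: str) -> str:
    return f"[general model] {prompt}"

ROUTES = {
    ("integral", "equation", "prove"): math_specialist,
    ("bug", "function", "compile"): code_specialist,
}

def route(prompt: str) -> str:
    lowered = prompt.lower()
    for keywords, specialist in ROUTES.items():
        if any(word in lowered for word in keywords):
            return specialist(prompt)  # only the relevant specialist is invoked
    return generalist(prompt)          # fall back to the general model

print(route("Fix the bug in this function"))
print(route("Tell me about the behavior of cats"))
```

Real systems, such as mixture-of-experts models, learn the routing instead of hard-coding it, but the trade-off the comment describes, specialists per domain versus one monolithic model, is the same.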
@j0shj0shj0sh19 күн бұрын
It ain't AGI until it can flip me some pancakes.
@andreaskrbyravn85520 күн бұрын
O3 mini will be here at the end of January
@pragmata799720 күн бұрын
what a year man
@frankgerardo897719 күн бұрын
Wow... So we can now ask o3 to write the code to make an RII (real inorganic intelligence) that truly comprehends the world (mass, time and space) like a human?
@Brutalhonesty1120 күн бұрын
We're speedrunning Arcane now.
@privatebryan192419 күн бұрын
Looking forward to o4.
@ramble21820 күн бұрын
Imagine if narrow superintelligence AI converges into true AGI. Would it spark consciousness? Now picture that newly "aware" entity, armed with all human knowledge. It's both awe-inspiring and eerily alien.
@diamonx.66120 күн бұрын
I would define AGI as at least being able to navigate the world as any normal human would, given a robotic body and sensory input. This would require it to have basic common sense to navigate its environment and any task it participates in. Simply being really smart in some areas doesn't cut it for me. It seriously needs to stop hallucinating all the time to be called AGI
@IceMetalPunk19 күн бұрын
A few nitpicks on this. Firstly, AGI is about intelligence, not about movement. Someone who is paralyzed without a wheelchair can't "navigate the world as a normal human would", but that doesn't mean they're less intelligent than an average person. Whether or not a system is AGI is relative to the tasks it can physically perform compared to a human doing those same tasks. Secondly, hallucinations are not one thing, and I honestly hate that we keep using that term for them. It's a holdover from image classifiers that misconstrues what's happening in these LLMs/LMMs when they get something wrong. True hallucinations -- where it mistakes part of the input prompt for something that's not actually in it -- are fairly rare in modern LLMs/LMMs these days. Instead, "hallucinations" can be broken into two main categories: mistakes and lies. Mistakes are when the model thinks it's correct, but it's wrong. We can't expect an AI to have zero mistakes, because humans don't have zero mistakes. The goal there should be "it doesn't make more mistakes than an average human would". Lies are what they sound like: the model knows what it's saying is incorrect, but says it anyway. Interpretability studies have shown that, at least for some of the popular LLMs, the models often *know* when they're saying something untrue (i.e. a specific pattern of neurons activates only when they lie). So if they know they're lying, why do they do it? The answer is in RLHF. While we use RLHF for moral alignment, the way it works is based simply on ranking model outputs. There's no indication to the model of *why* you've ranked something high or low, just *that* you have. When evaluating "which output is better?", very often people will rank "I don't know" quite low on the list: we want the AIs to answer us, not shrug and give up. But as a result, the models learn that saying "I don't know" is bad, and giving an answer is good. So what do they do when they don't know an answer? They make one up. They lie. Because we told them it's better than admitting ignorance to us. It's definitely a problem that we need to solve, probably by finding a better alignment method (or at least a better variation of RLHF) that avoids encoding syncophancy into the rewards model. But that said, just like mistakes, lies are also something humans do. Obviously, the ideal is that an AI would never lie to us... but humans lie, and we still say humans are generally intelligent. Which means lying doesn't prevent something from being generally intelligent, and thus doesn't prevent an AI from being an AGI. As long as the model doesn't lie significantly more often in contexts where the average human wouldn't lie, that's enough I think to still call it AGI. That should be the goal there.
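The RLHF ranking step described above is usually turned into a training signal with a pairwise (Bradley-Terry style) loss on a reward model. A stripped-down sketch, assuming we already have scalar reward scores for a chosen and a rejected answer:

```python
# Stripped-down sketch of the pairwise ranking loss used to train RLHF reward
# models: the model only learns that the chosen answer should score higher than
# the rejected one, never *why* it was preferred.
import math

def pairwise_ranking_loss(reward_chosen: float, reward_rejected: float) -> float:
    """-log(sigmoid(r_chosen - r_rejected)); small when chosen scores well above rejected."""
    diff = reward_chosen - reward_rejected
    return -math.log(1.0 / (1.0 + math.exp(-diff)))

# If raters rank a confident (but wrong) answer above "I don't know",
# the loss pushes the reward model to prefer confident answers:
print(round(pairwise_ranking_loss(2.0, -1.0), 3))  # low loss: ranking already satisfied
print(round(pairwise_ranking_loss(-1.0, 2.0), 3))  # high loss: model is pressured to flip it
```

Nothing in that loss encodes why one answer was preferred, which is the point the comment makes about models learning to avoid "I don't know".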
@stevendunn6838 күн бұрын
At this point, for me, AGI means it can learn from its mistakes and improve itself. Right now it cannot learn past its training cutoff. Real intelligence means it can learn and adapt.
@Vine_Zx20 күн бұрын
I think we're going to have wild announcements in the first day of 2025.
@pooglechen325118 күн бұрын
AGI will be solved the same way the Turing Test will be solved. There will be continuous improvements, and without our realizing it, it will just be solved. If you squint hard enough, AGI is already here.
@DJLA-MUSIC20 күн бұрын
There is already a company called O2.
@aouyiu20 күн бұрын
Can't be why lol, there's an AI company with the same name as Twitter's "Grok" AI, they seem fine. It's possible the next 'o' model is o5, then o7, o9. Might be why he skipped 2, I would too.
@kewa_design20 күн бұрын
@@aouyiu Grok is not claimed, but O2 by Telefónica is. Sam even said at the beginning of the video that they don't use it because of Telefónica.
@aouyiu20 күн бұрын
@@kewa_design Matt made videos on both Groks, though; why wouldn't that be claimed but this is? Unless Sam is just doing it out of respect?
@kewa_design20 күн бұрын
@aouyiu Grok is not copyrighted, but O2 is.
@NO-TALK-GuitarPlugin19 күн бұрын
The cost of o3 in its high configuration is concerning: $4,000 per task, and 16 hours to run a single ARC benchmark of 100 tasks, with less accuracy than an adult human (12-year-old children can do it at 90%...). We still need a system that learns constantly, asks for experiments to redefine its world model, and has a better learning curve (I mean, having to read the entire internet and still not being able to solve Moravec's-paradox-type stuff is crazy...). Recursive self-improvement is still far off, given the time and cost needed for o3 high, which is the only model reaching the level of code and math manipulation needed for self-improvement. BUT if NVIDIA can again lower the cost per token by a factor of 1,000,000 over ten years, and increase the speed by the same factor, THEN self-improvement would be within reach. The ten years to come will settle the AGI question for sure: is it within reach in our lifetime or not? To me AGI is simple: take a model, give it control of a humanoid robot like the new Tesla bot or the Figure robot, and ask it to work in ten random job situations:
- building sites
- healthcare
- surgical practice
- directing a hard-science project
- writing an original novel
- developing an entire new open-world game
- running a company, filling all the top executive roles in one model (yes, that one is an ASI spark)
- solving one unsolved mathematical problem
- curing one disease
- winning an election entirely by whispering to a human actor
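Taking the figures quoted in this comment at face value (they are the commenter's numbers, not confirmed pricing), the back-of-the-envelope arithmetic is:

```python
# Back-of-the-envelope cost of a high-compute ARC run, using the figures
# quoted in the comment above (not confirmed pricing).
cost_per_task_usd = 4_000
tasks_in_benchmark = 100
hours_for_full_run = 16

total_cost = cost_per_task_usd * tasks_in_benchmark
print(f"Full benchmark run: ${total_cost:,} over {hours_for_full_run} h "
      f"(~{hours_for_full_run * 60 / tasks_in_benchmark:.1f} min per task)")
# Full benchmark run: $400,000 over 16 h (~9.6 min per task)
```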
@sachindatt404519 күн бұрын
I was expecting DALL·E 4 with LoRA customisation... but...
@nuvotion-live20 күн бұрын
I'm excited for this, but it's certainly not AGI. An intellectual task like performing research, writing a book, or developing real production software can take several years. It's clear these models can't do any of those on their own. In fact, they can't do ANYTHING on their own, for better or worse.
@CodyCha20 күн бұрын
Having perfect benchmark scores does not equate to AGI.
@MacGuffin119 күн бұрын
Matt: AGI is very carefully defined; just because your understanding of it is 'wishy-washy' does not make it so. There are clearly defined tests; people often pretend there aren't because of the OpenAI trigger and its impractical boundaries. As an AI channel, I think you should brush up on the actual science end of things. Dr Alan D. Thompson and Two Minute Papers are good sources!!
@Justashortcomment19 күн бұрын
It's worth checking out the Wikipedia article on AGI. The striking thing is the lack of good sources, and that the term is described in terms of characteristics... I looked into a research paper on this from around the year 2000, and in good academic manner the authors went through the various attempts at formulating this concept. There is no commonly accepted formulation... it was a directional thing: a system that was wildly more capable than current systems. I recommend looking into Google Scholar or other academic paper repositories and trying to trace the definition. One of the earliest ones was the Turing test, but I don't think anyone's really interested in this. Now, back to Wikipedia: "Artificial general intelligence (AGI) is a type of artificial intelligence (AI) that matches or surpasses human cognitive capabilities across a wide range of cognitive tasks." So here it's worth noting the words "across a wide range", which is in itself ambiguous. Further, it doesn't say that for something to be AGI it should at minimum match all human cognitive capacities. So, arguably o3 matches human-level performance **across a wide range** of cognitive capacities (and in many key areas matches high-level performance in terms of output quality). So by this definition, there's room for arguing that even GPT-4 was AGI. It knows more facts than any human. It knows more English words than any human. It can translate between more languages than any single human (and the translations are rather good). It passed academic exams (OK, benchmarks) at very high rates. It had a sort of linguistic, associative super-intelligence. Anyhow, my personal view is that this term should be abandoned in favour of some objective criterion which can reasonably be tested against.
@yvandanny91120 күн бұрын
Merry Christmas to you too 😁
@RealmsOfThePossible20 күн бұрын
That thumbnail though...
@tamirfri120 күн бұрын
AGI: self improving
@cbnewham_ai19 күн бұрын
Nothing can be believed until you can test it for yourself. The last time, they and the fanboys were claiming "PhD-level intelligence"; then I gave it a simple trig problem that any high school pupil could do, and it failed. Maybe o3 might do better with the increased maths capability, but until we can actually use it, it's not worth discussing.
@GabrieloTomorrow20 күн бұрын
_"Did Sam Deliver AGI for Christmas?"_ No, he didn't... But *true AGI* is not very far.
@kv-31researchdivision20 күн бұрын
It's basically AGI. AGI just means it has human level intelligence. And if you watched the day 12 video, humans scored around the same on the ARC-AGI benchmark as O3 did.
@GabrieloTomorrow20 күн бұрын
@@kv-31researchdivision But, like someone important mentioned (I can't remember who it was), o3 still makes a lot of mistakes on many things that would be extremely easy and basic for any human. It might be a big step forward, but it's definitely not there yet. Well, at least it doesn't satisfy my personal definition of "AGI," which also includes having a physical body capable of doing anything that a human could do, or at least having a digital avatar that would be indistinguishable from a human being, even if it's inside a virtual environment. But I haven't seen o3 doing any of that so far.
@kv-31researchdivision19 күн бұрын
@@GabrieloTomorrow "can't remember who it was" Well you need to find out because im not gonna take your word for it when that's the crux of your entire argument. Humans make mistakes too, even in basic things. An AI that makes no mistakes would be considered superintelligence at that point. General intelligence is not the same as superintelligence. General intelligence comes with mistakes. Also, general intelligence does not require a robot body. Intelligence has everything to do with the mind and not the body. Even if AI was not infused with a body, you could have O3 control an external body by generating code commands to move or speak. The robot body argument is irrelevant. It's not about your "personal definition" either. There ARE things that are objective about AGI, especially the robot body part. You can't just say "well that's not how I define AGI" when I tell you that O3 fits the definition of AGI.
@GabrieloTomorrow19 күн бұрын
@ Dude, in any case, you are the one who needs to demonstrate something here, not me. OpenAI is not claiming that this is an AGI. Sam Altman said that he was excited for AGI happening in 2025 during an interview. So, stop trying to make it seem like I’m the one who’s against the current, because I am not. Mattvidpro doesn’t consider this to be AGI. I have seen forums about AI, and almost nobody considers this to be AGI either. Most experts also don’t consider this to be AGI... So, why should we take your word for it is the real question.
@kv-31researchdivision19 күн бұрын
@@GabrieloTomorrow im getting really tired of this appeal to authority fallacy you keep throwing around. First it was "someone important" (who you conveniently couldn't remember), and now it's this vague gesture at "most experts" and "almost nobody." Where's YOUR evidence? Where are YOUR sources? And seriously? Using Sam Altman's PR-friendly timeline predictions as evidence? The same Sam Altman who has a vested interest in managing expectations and keeping investors happy? That's your smoking gun? Let me break this down for you since you seem to need it: The ARC-AGI benchmark shows O3 performing at human-level intelligence. That's not my opinion - that's measurable, quantifiable data. The fact that it makes mistakes is actually a FEATURE of general intelligence, not a bug. If you're expecting perfection, you're confusing AGI with superintelligence, which are two very different things. And im just gonna put this "robot body" thing to bed because that's like saying Stephen Hawking wasn't intelligent because he couldn't run a marathon. Intelligence is about cognitive capability, not physical embodiment. Stop moving the goalpost and appealing to authority.
@jimidaly019 күн бұрын
My newest definition for AGI: A model that reacts in realtime to music in a way that is deeply relatable to most humans. We can benchmark our way up to 100% on every test, but the shift won't happen until it's integrated into the social consciousness. That's where music comes in.
@major-skull-gaming18 күн бұрын
The name was taken, lol. That's why there's no o2.
@Brenden-H20 күн бұрын
It's not quite AGI, but still smarter than most humans, and across multiple fields, which is almost impossible for a human to achieve. I think I'd classify this as "LASI", or Limited Artificial Super Intelligence, meaning it is superhuman at some tasks but isn't a true AGI (which could recursively upgrade itself on its own if given the ability, and overcome any task, including turning itself into a true ASI).
@Justashortcomment20 күн бұрын
Multi Domain Highly Performing Artificial Intelligence Yet Not AGI? :::)))
@Brenden-H20 күн бұрын
@Justashortcomment Correct. AGI is when it's as smart as a human at every level. Right now it is beyond human on some levels but below human at other levels. When it is truly AGI, it will be able to do the job of AI researchers and improve itself fully autonomously. AGI is the first stepping stone to true ASI, aka the singularity. And once AGI is achieved, the rest can happen without human intervention, assuming its alignment is perfect. (Also, AGI just means it can do what a human can do, and with accuracy. Some people might still be better optimized for their field than AI, but as long as the AI can get to the same answers itself, even if it's slower, it will be AGI. We don't want a machine that can replace everyone and be better than them; we just need an AI that can replace anyone and do the job / solve problems autonomously. And again, this will lead to AI researching itself, and it will quickly become ASI and be better than everyone at everything.)
@cbnewham_ai19 күн бұрын
"smarter" is open to interpretation. You could claim Google is "smart" because it has access to a large range of knowledge. The day an AI can have an original thought and actually come up with something on its own (such as a solution to a problem humans have not cracked) is the day we can probably say we have AGI. That's not to say these tools aren't impressive - but that's all they are, tools. And their error rate is extraordinarily high.
@JohnSmith762A11B19 күн бұрын
I'd like to see a model that can pass an *Inverse Turing Test*: this is a test where a modern frontier AI tries to massively, radically, catastrophically dumb and slow itself down to the level of a human such that a person interacting with it doesn't realize it's an AI. The model would need to pretend to know very little or nothing about almost everything, to be semi-fluent in only one or a handful of spoken languages, to type very slowly and erratically in an error-prone way. To write terrible software incredibly slowly in a small handful of computer programming languages, be unable to do even simple math problems, to know nothing about almost every discipline including philosophy, history, art, science, mathematics, physics, biology, chemistry etc. If the test is passed, we will know humanity has created an AI model that can be as ignorant, slow, stupid, and useless as a human.
@itsjustme37320 күн бұрын
Defining AGI as human-level reasoning is problematic, because the difference between someone with average intelligence (IQ ~100) and someone with an IQ of 160-180 is quite extreme. Add to that the issues around what intelligence truly is, and whether an IQ test actually tests intelligence or knowledge.
@raaghavgr199019 күн бұрын
I hope to see AI generated movies by the end of 2025.
@sharpvidtube19 күн бұрын
They will be horrible, but I don't like most modern movies anyway, so it isn't a high bar.
@raaghavgr199018 күн бұрын
@@sharpvidtube As AI gets smarter and more creative, I think we can expect some thrilling movies in many languages. AI movies will definitely be a thing in the near future (before 2030).
@IceMetalPunk19 күн бұрын
YouTube keeps deleting my comments, and I don't know why, but I'll just post a shorter and less clear summary instead to appease the automod gods: AGI means "general", none of these benchmarks (even taken as a collective whole) are general, thus this is not AGI. Happy, YouTube?
@FRareDom20 күн бұрын
AGI is here.
@muzammilkazi162619 күн бұрын
I don't think so. We are not even close to AGI. For me, artificial general intelligence is when a model is able to write a hilarious original joke.
@SwornInvictus19 күн бұрын
Neuro Sama does that
@muzammilkazi162619 күн бұрын
@@SwornInvictus can you please give me the link?
@SwornInvictus19 күн бұрын
@muzammilkazi1626 It's a famous vtuber. Just look it up. There are tons of videos.
@joshw100820 күн бұрын
o1 is useless if you try to upload any large files; it just apologizes and can't process them. Hopefully o3 has a bigger memory.
@VernardNuncioFields19 күн бұрын
None of these models have enough reasoning to solve a simple word puzzle. I've tested this benchmark and AI can't solve them.
@The-AI-Sidekick19 күн бұрын
He literally delivered nothing other than bar graphs, bro... come on, let's be real.
@dwainmorris785420 күн бұрын
These chat models are supposed to be so smart, yet they can't create a video of a person eating and swallowing food, or take an image that you created into the AI without altering it.
@IceMetalPunk19 күн бұрын
The first one isn't relevant, as there isn't yet a chat model with video modality. Video models are a different thing -- and the most recent ones *can* create videos of a person eating and swallowing. As for keeping an image consistent across chats, both GPT-4o with native image I/O and Gemini 2.0 with native image I/O can do this. They just aren't available to the public yet, but they have been shown off in demos and papers.
@dwainmorris785419 күн бұрын
@@IceMetalPunk You said you've seen videos of a person eating and swallowing? Please post a link, because I've been watching AI for the last two years and I've never seen that before. I see figures biting and chewing, for example a hamburger, and then after the figure is supposedly done, the hamburger is still in one piece. Having the ability to put your own original characters into AI is the Holy Grail; once that is done, it will be a big advancement and a true game changer as far as creating original entertainment.
@IceMetalPunk19 күн бұрын
@dwainmorris7854 Can't post links in YouTube comments or else YouTube deletes them. But you can search for AI news videos to find it. Also, you don't need in-context video to do consistent characters. Most video AIs allow you to provide your own start frame, so you can just use an image model with character consistency (using something like IP-Adapter or OmniGen) to get the same character in whatever start frame you want, then pass that to a video model to animate it.
@henrytuttle20 күн бұрын
No, your test of using a controller in the real world is flawed. AGI doesn't mean using real-world tools; that's not "intelligence." To me, and I think to most people, it means that AGI is as good as a human at THINKING in most ways. No matter how smart it is at certain tasks, it can't be completely stupid in certain minimal ways. We haven't clearly defined those ways yet. The ARC Prize defined some of them, but they're now doing another series which humans can do 95% of the time without training and AI is still failing miserably at. Secondly, I think it has to actually learn from its interactions in real time. There have been plenty of times I've asked AI for information which then had to be corrected, but it didn't learn from its error. It will do it again at a future date. Either it has to have an almost infinite context window for past interactions, or it has to integrate them into its neural net.
@BladeRunnerEnthusiast20 күн бұрын
lol merry christmas
@davidlocontes356420 күн бұрын
OpenAI will be terminated by king musk, unless they bend the knee, kiss the ring and give him total control.
@lucamatteobarbieri249320 күн бұрын
humans have no moat
@AmazingArends11 күн бұрын
You need to point your camera down a little bit. Right now we're getting a great view of the wall behind you but less of you. Plus, you might want to consider a lavaliere or headset mic because the one you have is so huge you're practically hidden behind it! 😢
@thedannybseries885720 күн бұрын
No they didn’t deliver AGI
@thehypersonicegg554020 күн бұрын
Give it a random video game out of thousands, meaning complex AAA-type games, and have it beat them as fast as average gamers would; not a speedrunner, but a random gamer, maybe 20 hours per game. Then it seems it could probably be modified to do general tasks in the real world as well. That's more or less AGI for me. It has to beat each game without ever seeing it before; it can be trained on a set of different games, though, since humans are as well. (Edit) It may still not be AGI if it can only plan ahead within the few-hour windows a video game requires. For things in real jobs, humans can quickly, within days, learn skills where they plan ahead for months or even years to achieve the results they want. For example, growing a YouTube channel: you need to create videos and follow trends and performance across many years and adapt accurately. Almost all humans could do that, but may not be motivated to, along with many similar tasks.
@sopdfsopdfiopsd18 күн бұрын
o3 isn't real, don't @ me
@fontenbleau20 күн бұрын
The main point of AGI should be creating more profit returns for everyone, not just for its owner alone (a post-scarcity-economics civilization level). Every generated reply from paid American AI services is wasteful; I suspect 90% waste in all OpenAI models, always. It's merely entertainment. It's easy to calculate: I'm sure they even show stats on the total tokens provided, so you could work out how much profit users got from the tokens they purchased over the year (I'm skeptical anyone is really using the text output; journalism and secretarial work are rapidly vanishing as professions).
@VernardNuncioFields19 күн бұрын
Harpa AI is waaaaaaay better than Fetch Fox!
@The.Royal.Education19 күн бұрын
Not agi
@ohhhwhale20 күн бұрын
i think ppl that dont celebrate christmas dont celebrate holidays either, merry xmas too u 2 matt
@sonofcleinias774819 күн бұрын
holy crap. I can't listen to a man talk with that kind of vocal fry 1:50
@SjarMenace20 күн бұрын
You're late, this is old news.
@Atheist-Libertarian20 күн бұрын
It is superhuman, but it's not AGI. Although now I am even more optimistic that AGI will arrive before 2030.
@aouyiu20 күн бұрын
It's not AGI, yes, but if it were superhuman then it would be AGI, bordering on ASI. We just need bigger models for AGI; we have everything else down. I expect it far sooner than 2030, like within the next couple of years.