ChatGPT Just Learned To Fix Itself!

117,149 views

Two Minute Papers

Days ago

❤️ Check out Lambda here and sign up for their GPU Cloud: lambdalabs.com...
Get early access to these videos: / twominutepapers
📝 The paper "LLM Critics Help Catch LLM Bugs" is available here:
openai.com/ind...
📝 My paper on simulations that look almost like reality is available for free here:
rdcu.be/cWPfD
Or this is the original Nature Physics link with clickable citations:
www.nature.com...
🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:
Alex Balfanz, Alex Haro, B Shang, Benji Rabhan, Gaston Ingaramo, Gordon Child, John Le, Kyle Davis, Loyal Alchemist, Lukas Biewald, Martin, Michael Albrecht, Michael Tedder, Owen Skarpness, Richard Sundvall, Taras Bobrovytsky, Ted Johnson, Thomas Krcmar, Tybie Fitzhugh, Ueli Gallizzi.
Thumbnail background design: Felícia Zsolnai-Fehér - felicia.hu
Károly Zsolnai-Fehér's research works: cg.tuwien.ac.a...
Twitter: / twominutepapers
#ChatGPT

Comments: 427
@TwoMinutePapers 2 months ago
Get early access to these videos: www.patreon.com/TwoMinutePapers
@CactousMan639 2 months ago
Okay, but seriously... what's wrong with your voice? 🤨
@sanjaymatsuda4504 2 months ago
@@CactousMan639 I was wondering the same thing. Specifically about the doctor's strangely halting way of speaking.
@ItsDrMcQuack 3 months ago
Well, it was nice to know the world before the singularity. See you all on the other side, I'm taking a nap while I can
@8bit-ascii 3 months ago
The LLMs still hallucinate too much, even with all the smart tricks we can come up with. So I'd say we've got a few more years than expected, enjoy them to your fullest 😅
@TotallyNotInspired 3 months ago
@@8bit-ascii I would not bet that it takes years to fix the hallucination problem, we are already making progress
@cortster12 3 months ago
​@@8bit-ascii So do humans, and we still get things done. It's not as big a problem as you think.
@cefcephatus 3 months ago
I do the same. But before good night, I think, we have a lot of time for sleep after AI singularity. However, I want more sleep too.
@jibcot8541 3 months ago
I don't sleep much nowadays. There is too much exciting stuff going on with AI, and news and tools just keep coming; not enough hours in the day.
@penix3323 3 months ago
2:14 "These AI-Critic-Systems find a lot more bugs than people" I mean, it would be strange if that wasn't the case. There are a lot more bugs than people out there.
@ValidatingUsername 3 months ago
Wait till their supervised learning feedback system for their neural network is optimized
@andywest5773 3 months ago
It's true. There are approximately 1.4 billion insects per person in the world.
@ValidatingUsername 3 months ago
@@andywest5773 Your comment is not relevant to this thread
@TheCactuar124 3 months ago
@@ValidatingUsername You have absolutely no sense of humor.
@Peter21323 3 months ago
@@ValidatingUsername or is it? Hey Vsauce Peter here
@singularity3724 3 months ago
An AI critiquing another AI? Isn't that just a GAN?
@creativebeetle 3 months ago
No
@Lagger625 3 months ago
​@@creativebeetle why
@singularity3724 3 months ago
@@creativebeetle From wikipedia: "In a GAN, two neural networks contest with each other in the form of a zero-sum game, where one agent's gain is another agent's loss".
@creativebeetle 3 months ago
@@singularity3724 You're totally right. Sorry about the callous response. Misread the original comment as saying 'AGI' somehow. Seems pretty similar to GANs, though there's an added layer of abstraction where the AIs aren't exactly improving one another directly (if I'm understanding correctly.)
@TayoEXE 3 months ago
I was thinking the same thing.
@OperationDarkside 3 months ago
As a software dev I tested a very simple piece of javascript with some bugs on multiple models. Only the biggest and newest models were able to find and fix some of the bugs and none got all of them. The piece of code was partly generated by an AI and some edits from me. Either most models are really bad at javascript or we still need a lot of research and external tools to make LLMs better at fixing code. There's probably some secret sauce, like using the whole JS standard docs and several thousand examples as RAG source or using a dedicated logic processor, but we aren't there yet.
@Tiniuc 3 months ago
This should be pinned
@garethrobinson2275 3 months ago
I'm sure that's very reassuring for you. 🤭
@OperationDarkside 3 months ago
@@garethrobinson2275 To be honest, it is the opposite. I've been writing code for over 10 years and
@samuelb.9314 3 months ago
@@garethrobinson2275 he's far from alone, anyone that can code knows it codes like a 10 year old and can't even understand what it gets wrong.
@ferologics 3 months ago
big L on the language bro
@tebisxrod 3 months ago
Why don't you show computer graphics papers that aren't AI-related anymore? That is a shame! Don't be an average YouTuber who just posts popular things for the sake of views! There are a lot of SIGGRAPH papers to show this year! VBD, for example, which can potentially substitute XPBD solvers! Please be the one we remember!
@liamdonegan9042 2 months ago
Yeah this channel has fallen to shit
@supernovaxm1703 2 months ago
I agree! I enjoy being caught up on AI developments, but I miss the infectious enthusiasm you had for graphics, too...
@catpokerlicense 1 month ago
Dude I thought I was hallucinating my memories of this being a more computer graphics related channel. It's been a while since I came here and yea, it's kind of upsetting but I understand his decision.
@zeddeye9153 4 days ago
Don't you think improvement in ai is an improvement in every field?
@aaaaaaaaooooooo 3 months ago
I've been asking AI to critique its own work. For example, I would ask ChatGPT to write a movie idea, then ask it to critique itself, and then improve the idea based on its own critique. It works to some extent.
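The draft, critique, revise loop described in this comment can be sketched roughly as follows. This is only an illustration under stated assumptions: `ask_model` is a hypothetical stand-in for whatever chat API you use (here a stub that returns canned strings), and the prompt wording is made up for the example, not taken from the video or the paper.

```python
from typing import Callable


def refine(task: str, ask_model: Callable[[str], str], rounds: int = 1) -> str:
    """Draft an answer, then alternate critique and revision `rounds` times."""
    draft = ask_model(f"Task: {task}")
    for _ in range(rounds):
        # Ask the model to criticise its own previous output.
        critique = ask_model(f"Critique this answer to '{task}':\n{draft}")
        # Feed the draft and the critique back in and ask for a revision.
        draft = ask_model(
            f"Task: {task}\nPrevious answer:\n{draft}\n"
            f"Critique:\n{critique}\nWrite an improved answer."
        )
    return draft


def fake_model(prompt: str) -> str:
    """Hypothetical stub standing in for a real LLM call."""
    if prompt.startswith("Critique"):
        return "The ending is predictable."
    if "Critique:" in prompt:
        return "Movie idea v2 (less predictable ending)"
    return "Movie idea v1"


final = refine("write a movie idea", fake_model)
print(final)
```

With a real model the loop would usually stop after a fixed number of rounds, since nothing guarantees that each revision is better than the last.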
@kadentrig8178 3 months ago
What a time to be alive!
@igoromelchenko3482 3 months ago
Let's see 😅
@ssekagratius2danime369 3 months ago
what a time to be young
@mistycloud4455 3 months ago
We are living in crazy times
@hombacom 3 months ago
New tech and more features, more problems to solve for creative people
@d.r.656 3 months ago
Not for long apparently 😂😂
@Mulakulu 3 months ago
I feel like currently, AI's biggest issue is hallucinations. I hate when they confidently spurt out blatantly wrong and self-contradicting information
@pianojay5146 3 months ago
especially when they apologize every time they speak
@Mulakulu 3 months ago
@@pianojay5146 yeah, and even worse, you tell them specifically how they mess up, and they have the audacity to say "Oh I am so sorry. You are correct. Here is the exact same and unchanged thing that I'm regurgitating to you without any extra thought" like AAARGH!!!
@OpenSourceAnarchist 3 months ago
it's almost like human intelligence can be faulted and short-circuited with similar reasoning, except we call hallucinations "opinions" :)
@PotatoTheProgrammer 3 months ago
@@OpenSourceAnarchist have you ever seen a human say that “QUAT” and “QQQQ” are the same sequence of letters?
@TragicGFuel 3 months ago
@@OpenSourceAnarchist LLM can't have human intelligence
@ethanlewis1453 3 months ago
Most chat systems including GPT have no ability to actually test code, which is a large part of the debugging process. It will be a very major advancement in AI when chat systems are given capabilities to test code.
@BlackCat6285 2 months ago
I don't know if that is right. I fed GPT a file way over the token limit for a single prompt, and it not only wrote code to split the file into chunks below the token limit but also ran the code and stored the split files internally. Then I asked it to analyze the files in sequence; it again wrote code to load each file, code to confirm it was loaded, and then gave the analysis. A few times the code failed to load a file; the check caught the failure, and it corrected and reran the code until the file loaded. I was quite amazed. This was on GPT-4o.
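For readers curious what the splitting step might look like, here is a minimal sketch. It is not the code the model actually wrote (that isn't shown anywhere), and it assumes whitespace-separated words as a crude proxy for tokens; real tokenizers count differently, so an actual limit would need a real tokenizer. The function name `split_into_chunks` is made up for this example.

```python
def split_into_chunks(text: str, max_tokens: int) -> list[str]:
    """Split text into pieces of at most `max_tokens` whitespace-separated words."""
    if max_tokens < 1:
        raise ValueError("max_tokens must be at least 1")
    words = text.split()
    # Step through the word list in strides of max_tokens.
    return [
        " ".join(words[i:i + max_tokens])
        for i in range(0, len(words), max_tokens)
    ]


chunks = split_into_chunks("one two three four five", 2)
print(chunks)  # ['one two', 'three four', 'five']
```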
@hola_chelo 3 months ago
meanwhile me explaining to GPT4o that changing if mode=='release' to: if mode == 'release' is not a correct fix for the problem but it insists that the two solutions are different and that the corrected version will work. Or asking about a simple math question and it rambling because it tries to explain the answer and then gets to the conclusion that its answer is wrong so it starts explaining again to get to a wrong conclusion again and explain to me that its logic was wrong again, all within the same response. If you really think AI is in a state where it can replace us you have no idea what you're talking about. Start coding, be proficient at it and you will produce better results than AI in its current state.
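The whitespace point is easy to check for yourself. A small sketch using Python's standard `ast` module: the two spellings of the condition parse to the same syntax tree, so they cannot behave differently. The `pass` body is added here only so the snippets parse; it is not from the original comment.

```python
import ast

# The two versions of the condition, differing only in spacing around `==`.
tight = "if mode=='release':\n    pass\n"
spaced = "if mode == 'release':\n    pass\n"

# ast.dump omits line and column attributes by default, so equal dumps
# mean the parser sees exactly the same program structure.
same_program = ast.dump(ast.parse(tight)) == ast.dump(ast.parse(spaced))
print(same_program)  # True
```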
@hola_chelo 3 months ago
"Although our method reduces the rate of nitpicks and hallucinated bugs, their absolute rate is still quite high. Real world complex bugs can be distributed across many lines of a program and may not be simple to localize or explain; we have not investigated this case." So basically it doesn't apply to 99% of software development? nice
@shiroi5672 3 months ago
That's not a problem with the model itself, it's the wokeness, biased filters. It incorrectly flags that you're trying to do something not PC, so it preaches to you. It's quite annoying when you just want something like a recipe for chocolate cake.
@hola_chelo 3 months ago
@@shiroi5672 Don't think it has anything to do with that. It is good at giving code, even good code sometimes. It just isn't good for software development or math because it requires logic which is hard for llms to achieve. If you try hard enough you can get it into a loop of spitting wrong answers and it saying "You're absolutely correct, my fault. Here's the correct answer : (wrong answer here)" or it saying over and over again that adding a space will change the behavior until you ask it to be specific enough to where it says "it is true that spaces will not alter the behavior of the code but still you should follow that PEP8 standard because bla bla". It's a switch between "I AM MASTER I KNOW MY ANSWER IS RIGHT" and "You're absolutely correct, my mistake, I profusely apologise but I still ate your tokens"
@shiroi5672 3 months ago
@@hola_chelo You may be right, but I mostly saw those kinds of looped answers when it tried to preach PC to me, and the way it replied is surprisingly similar. The only way to be sure is when we get a non-biased model at the same level, but I'm not seeing one on the horizon, maybe grok 2.0 if we're lucky. The others are way more biased than ChatGPT. There's also a point where the bot gets lazy and stops trying, so I never keep the same window for long.
@Vaeldarg 3 months ago
@@shiroi5672 complaining about "wokeness", thinking Elon's grok A.I is anything more than Elon's desire to attract attention/money...wonder if you're one of those calling Elon just another "woke" tech bro from California back when he only had his EV company, and only started cheering for him when he started pandering to you right-wing weirdos (I've seen what you all were trying to have grok output, don't even try denying the weirdo part) because of how easily you fall into cults of personality and so will happily stroke his ego. Coincidentally, at a time when he kept getting fact-checked and mocked on Twitter by those more left-leaning.
@michaelwoodby5261 3 months ago
I'm guessing this is how most AI problems will be solved. These systems are getting smarter, but they are getting far more efficient, so eventually they will just be able to run several versions consecutively. First model figures out what specialists are needed and creates an action plan, specialist programs do the heavy lifting, editor checks their work and offers insights, and they reroll until they have something everyone is happy with. This could happen very, very fast, depending on how intricate the original request is and how many trips to the drawing board are needed to impress the editor.
@NotHumant8727 3 months ago
what alive to be a time
@Silexabre 3 months ago
imagine tomorrow when we're dead and all this is seen as normal for everyone
@wobbers99 3 months ago
haha i did what you see there :)
@donutbedum9837 3 months ago
@@wobbers99 ooh okay
@sidesw1pe 2 months ago
@@wobbers99 what did he do there?
@wobbers99 2 months ago
@@sidesw1pe "what alive to be a time" is a word swap, so I word-swapped my reply: "i did what you see there"
@generalawareness101 3 months ago
I tried to use all the LLMs out there for programming in Python and in C++ and they failed miserably. EVEN when I instantly spotted the bug(s) as it was scrolling the code to me, I would tell it what it did wrong when it would thank me and repeat the same bugs the next round after having just sent me the revised code. In other words, none of them learned as I was telling them their errors. I was so frustrated with them that I realized it is all hype about them taking our programming jobs. Maybe sometime in the future, but not right now.
@georgesmith4768 3 months ago
Yeah, it's pretty clear that for programming, LLMs straight up do not have the special sauce needed. If you look at Anthropic's blog posts on monosemanticity, it seems like the LLMs understand a lot more about code than you actually get in the current responses, which makes it seem that it is really just the wrong job for the tool. Fundamentally these things are not code designers or chatbots, they are text predictors, so when you tell one to design code it just mashes together solutions it has seen with syntax that's familiar to it, and when you tell it something is wrong it just refines what it is drawing from to look more like something where someone says there are bugs… Ultimately something has to change or you will just be playing whack-a-mole with poor behavior: the understanding the weights have has to be properly reinterpreted for the actual problem, and some model of the conversation or code has to be interfaced with it. Or I guess OpenAI can just keep tossing data onto the pile (at this point the new stuff is getting more synthetic, definitely a good idea…) and hoping that a thousand Indian guys can teach the RL module to fix it for literally every question, scenario, and programming snippet 😂
@samuelb.9314 3 months ago
yeah it's so obviously bad that people who say they can do it just lose all credibility in my book. They clearly don't know how code and games are made.
@brianhershey563 3 months ago
I had lots of issues programming with Claude until I identified the current limitations and built a workflow around them.
Coding every response - Even when brainstorming, the AI generates code long before it's needed, which makes it harder to step back through the conversation to chase down bugs. Even after putting "Never code unless I specifically say 'make it so'" in the Project Knowledge section, it often drifts away and needs to be reminded "no coding yet". UGH.
Versioning - This is all on you. It explicitly says it does not track code changes. For my own benefit I put a version in my requests that follows the file version in my Python editor, just to make it easier if I have to scroll back through.
Clean up - After every programming session, and after I verify my program is running as intended, I keep one master file in the project workspace and delete all the others. This way it only evaluates the good code for the next sesh.
Because this is the worst it will ever get (AI in general RN), I'm OK with this flow... for now! ;)
@generalawareness101 3 months ago
@@samuelb.9314 Not even games I mean if it is longer than about 10 lines of code it begins to blow up on itself and no amount of me chiding it, or helping it, sticks. Very annoying that now I just don't touch them.
@generalawareness101 3 months ago
@@brianhershey563 I am not and anyone who blindly goes in with no programming knowledge thinking it will save them is in for some sore life lessons. I will check back on the LLMs in about a year or two, and every year or two hence, to see if they can finally take over for even an intern who dropped out of Elementary school.
@godmisfortunatechild 3 months ago
The amount of COPE surrounding the singularity is astonishing. What rational person would honestly believe the elites are going to care about your well being when you're economically superfluous?
@christopherbelanger6612 3 months ago
What a dumb thing to say
@godmisfortunatechild 3 months ago
@@christopherbelanger6612 it's true. If the wealthy/ AGI owner class don't want to pay tax to subsidize UBI who's going to compel them? The govt ?:😂😂😂😂
@el-_-grando-_-_-scabandri 3 months ago
@@godmisfortunatechild Louis XVI
@carlpanzram7081 3 months ago
You fundamentally misunderstand western society. We live in democracy, we govern ourselves. There is no ruling class.
@larion2336 3 months ago
People forget that Stalin callously starved to death 10 million+ (mostly farmers/laborers). The globalists are aiming at the same thing right now with their attacks on farming / agriculture / meat.
@P-G-77 3 months ago
This is not only an "idea" but the FUTURE...
@Adhil_parammel 3 months ago
3:34 human hallucination?
@chrissears9912 3 months ago
Interesting
@jojoboynat 3 months ago
A hallucination in generative AI is essentially an aberrant output.
@bossgd100 3 months ago
Just human thinking
@cortster12 3 months ago
​@@jojoboynat Oh hey, humans do that too.
@lolandall915 3 months ago
well sometimes a human also thinks there is a bug where there actually isn't.
@ransomecode 3 months ago
AI critiquing AI sounds like 2 blind people arguing about colors
@zueszues9715 3 months ago
AI never has eye But draw monalisa
@BluishGreenPro 3 months ago
A bit of an exaggeration to say it can "fix itself"
@bijectivity 3 months ago
I agree, it would be more accurate to say "fix its mistakes." I think we still need to wait before AI fine-tunes its own model/parameters.
@ares106 3 months ago
Nice to see humans and AI working together synergistically and accomplishing more than the sum of their parts.
@nettsm 3 months ago
Skynet in the making
@cefcephatus 3 months ago
And it will even become perfect before 2030.
@JuuzouRCS 3 months ago
"Oh, great AGIskynetGPT rest assured because I don't side with these humans! Please, spare me!" - me, right now.
@NeroDefogger 3 months ago
no
@sikliztailbunch 3 months ago
Having 2 GPTs work in tandem makes sense. We humans have two brain hemispheres, too, right?
@JohnMullee 3 months ago
byzantine generals critics?
@keenheat3335 3 months ago
But what if CriticGPT has errors too? Can you use CriticGPT to correct itself recursively? Are there diminishing returns?
@hola_chelo 3 months ago
there's definitely limitations, you are just using AI to fix AI but the limitations of AI are still there. This amuses me, I actually wrote on the OpenAI forum to a guy who was using GPT for a project but needed it to be proofread or a method of "word similarity" to compare the answer with the actual information. I told him to use another GPT agent focused on proofreading. Guess it wasn't a bad idea after all if they are making a paper on it. Too bad the guy never contacted me, I would have built it for him.
@keenheat3335 3 months ago
@@hola_chelo In my personal use for engineering projects, I automatically add a list of common hallucination errors after the main prompt. That usually cleans up the errors afterward. Basically, I tell the model that it is going to make certain types of error, and ask for both the response and the response after it has fixed those errors. It cleans up about 95% of the error cases. Of course, certain errors are very sticky and won't go away even after correction. These might require retraining the main model. But generally I find that if you ask the question and add a statement that certain errors will occur, it usually cleans up the errors and hallucinations. But you have to add the hallucination list beforehand. So an online repository of common error and hallucination types would probably be very useful. Then everyone could just inject these error-guard statements after the main prompt to reduce errors.
@hola_chelo 3 months ago
@@keenheat3335 that is really interesting. Although I think only specific explanations about hallucinations would work; if I say something general like "You are likely to provide false information so please only provide information you can be sure about" then it is still likely to make that mistake. But that's interesting dude, I'm currently having such a hassle with dates. I'm using GPT-4o just because it's better with dates and weekdays, but I'm using it for a hospital where people might say "I want to get an appointment next monday" and the model goes "monday 8 of july is incorrect because 8 of july lands on a thursday". It's really annoying and I'm planning on adding a function just for it to verify dates and weekdays; thing is, function definitions are very expensive. This is GPT-4o BTW. GPT-3.5 had its head on its butt and would never get this right, but GPT-4o hallucinates around 5% of the time in these types of cases
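A date-verification helper of the kind mentioned here could be as small as the sketch below, using only Python's standard `datetime` module. The function names are hypothetical, and the year 2024 is an assumption made for the example (the comment gives no year); in that year 8 July did fall on a Monday.

```python
import datetime

# Fixed English names so the result does not depend on the system locale.
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def weekday_name(year: int, month: int, day: int) -> str:
    """Return the English weekday name for a calendar date."""
    # date.weekday() returns 0 for Monday through 6 for Sunday.
    return WEEKDAYS[datetime.date(year, month, day).weekday()]


def matches_weekday(year: int, month: int, day: int, claimed: str) -> bool:
    """Check whether a date really falls on the weekday someone claims."""
    return weekday_name(year, month, day) == claimed.strip().capitalize()


print(weekday_name(2024, 7, 8))                 # Monday
print(matches_weekday(2024, 7, 8, "thursday"))  # False
```

Exposing something like this to the model as a tool means the weekday comes from the calendar rather than from the model's guess.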
@BladeTrain3r 3 months ago
No system will be able to perfectly correct for all possible failures, any more than almost any human will get 100% on a collegiate math test. Not to say this puts ChatGPT at a human level of self-correction or learning or competency, but the goal isn't perfection, just a level of imperfection similar to or perhaps a bit less than most humans.
@tensevo 3 months ago
it's good to know that our AI overlords, have got the Hegelian dialectic down, at least. Problem, reaction, solution.
@glenneric1 3 months ago
What a time to be a simulation of life!
@toreon1978 3 months ago
😂😂😂 3:35 I love it that the humans 'hallucinate', too.
@donutbedum9837 3 months ago
they always have; it’s similar to picking A on a multiple choice test because the last few haven’t been A that’s using a ‘sensible’ reason to justify output but its not always the correct method similarly, since it hasn’t learnt to identify WHETHER code has bugs or not, just that there are wtf am i on abt now
@Purified-Bananas 3 months ago
ChatGPT detected a bug here:
def fibonacci(completely_wrong):
    full_of_bugs = 1
    if completely_wrong > 1:
        full_of_bugs = fibonacci(completely_wrong - 1) + fibonacci(completely_wrong - 2)
    if completely_wrong == 0:
        full_of_bugs = 0
    return full_of_bugs
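For what it's worth, the function in this comment appears to be correct for non-negative inputs despite its misleading names. A quick check, with the same code laid out on separate lines and compared against the first ten Fibonacci numbers:

```python
def fibonacci(completely_wrong):
    # Same logic as in the comment; only the layout is changed.
    full_of_bugs = 1
    if completely_wrong > 1:
        full_of_bugs = fibonacci(completely_wrong - 1) + fibonacci(completely_wrong - 2)
    if completely_wrong == 0:
        full_of_bugs = 0
    return full_of_bugs


first_ten = [fibonacci(n) for n in range(10)]
print(first_ten)  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```

So a reported "bug" here would presumably be a reaction to the variable names rather than to the logic.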
@arkadymir2403 3 months ago
One can only imagine what will happen when running an LLM is as accessible as running a modern OS. Imagine 100 agents with the capacity of Claude 3.5 Sonnet forming a network for decision-making, writing code, giving advice etc
@RandoCalglitchian 3 months ago
I've been working on something like this, allowing the LLM to choose which model seems best at a specific task with an overall goal in mind. With some of the new inference hardware on the horizon, we should be able to do this locally not too long from now. At some point hopefully we will get something like Llama 70B (or bigger) trained with 1.58 bit weights rather than the floating point weights we have now, and if we can run that on tailored hardware like that from Groq, I think what you're thinking is close to achievable. If you are interested in trying to run a local LLM, there are a few projects out there that allow you to easily do that, especially if you have a somewhat modern graphics card (but they do run on CPU as well).
@br2716 3 months ago
Sounds like an OS that would die fairly quickly given the entropy it generates.
@BladeTrain3r 3 months ago
I've been kinda trying that with ollama and the small open source models, it's slow but multiple agents parsing each other's output does seem to improve things somewhat. But things like a shared memory and task focus are proving quite tricky. Like it's just running the input through models with different system prompts in a sequence, but there does seem a strong possibility of improving complex task competency through the process of getting multiple opinions from different models on a prompt before outputting a user facing response. Far more capable AI wranglers than I could probably point out six dozen reasons I've done it the wrongest possible way lol.
@afterthesmash 3 months ago
You can already hire a hundred agents, even more capable than Claude 3.5. We call them employees. You just need the ching. Right now, chatbots are pennies to the dollar for certain kinds of narrow tasks. But they don't magically become more capable when you go 100× on pennies to the dollar. Realist: These robots are stupid! Dreamer: No problem, we will crowdsource these robots times one hundred. Didn't work with us, and it won't work with them, either.
@susmitdas 3 months ago
Something similar to this idea exists. I have tested a collaborative LLM thing that basically converses with other specific LLMs, calling specialized ones based on the given topic. It is called Co-STORM and it is made by the same Stanford researchers that made STORM.
@npc-aix-84 3 months ago
Clickbait title..
@jasonhemphill8525 3 months ago
No
@Ankhyl 2 months ago
It is not. You program several agents, each with a role, as if it were a project or company. Then the mistakes of one will be corrected by another, but in the end it is the same AI, with each agent focused on one task
@npc-aix-84 2 months ago
@@Ankhyl The video is about a fine-tuned critic tool that can help humans to evaluate the code produced by chatGPT or any LLM. Basically it's a bug-finder tool. The title attributes a general self-improvement ability to chatGPT which is not the case.
@KlimovArtem1 2 months ago
@@npc-aix-84 yep. The people who watch this channel usually don't think that deep. It's more entertaining than educating, to be honest.
@erobusblack4856 2 months ago
By applying a graph rag memory to this it would drastically cut down the amount of hallucinations
@JosuaKrause 3 months ago
3:00 you can't say that ai+human is worse than ai alone. the error bars overlap. this means that their differences are *not* significant
@JeffreyArts 3 months ago
I miss the time when this channel published videos about rare computer graphics papers, instead of publishing ChatGPT advertisements on a weekly basis 😕
@Faizan29353 2 months ago
money... although we do get to see rare papers from time to time, since ya know, raytracing and stuff is already very squeezed
@Verrisin 2 months ago
atm, LLMs have to COMMIT at EVERY token ! - now THAT is CRAZY ! -- Of course, having it look back on what it produced and re-think it is necessary - that is what CONSCIOUS MIND in humans does, after all. - It's incredible how much it can do WITHOUT this.
@jtinz74 3 months ago
We need to start training these AIs on hardware engineering problems.
@jameshughes3014 3 months ago
ai is not great for people who can't code , and can't art, and can't music. they don't know what to fix, what to look for. but it's great for teaching them what to look for and just helping them get more comfy with all those topics. I don't think Ai will replace programmers or artists, I genuinely think in time it will help inspire more people to be artists and coders.
@GU-jt5fe 3 months ago
As an aspiring but completely unskilled artist, using AI has taught me more about art than any art class. Mostly by example of what NOT to do, granted.
@alexc8114 3 months ago
Problem is companies and individuals don't see it that way. AI bros think they're as good as artists who've worked hard for a lifetime because they can type a prompt. Companies don't care how poor the product is if it saves money. Neither care the AI is just plagiarism with extra steps.
@FutureSyncAI 3 months ago
AI will for sure replace programmers. Sam Altman himself has spoken on that lol.
@jameshughes3014 3 months ago
@@FutureSyncAI hehe. I mean, that's true... as long as you're a company talking to investors. In that case it'll replace all jobs and also wash your dog for you
@jameshughes3014 3 months ago
@@alexc8114 I mean as long as humans want human art, it doesn't really matter what they think, cause anyone doing a quick ai render won't be able to sell much. And we do want human expression. But you can make real art with AI. as long as someone actually puts their heart into art, it doesn't matter what tool they use. Paintbrush, Ai, old watermelons.. what ever. But it's gotta have heart and effort. I disagree about the plagiarism thing though. if you transform a work, it's not plagiarism. Listen to the song 'frontier psychiatrist' , tell me that's not art. not one single sound in the song wasn't sampled. It all depends on what you do with it.
@kpopalypseoppar 2 months ago
...so in other words it learned to delete itself and all other AI from existence?
@Jeremy-Ai 3 months ago
Dr. Károly, Maybe We all require hallucinations to expand “the mind” from what is “known” to what is “unknown”.? For example: It may appear I am hallucinating trying to reach you… This is not the case… I am typing into a void construct that I have chosen to remain within for reasons that may or may not be relevant to anyone or anything at all…yet . So, Who then is hallucinating in this scenario and to what effect do we try to contain that experience or coerce it based on assumptions? This message is not for you my friend. :) You are wonderful, and it would be an honour if you actually received it. This is just a message, given away in good faith. Thats it, thats all. I am receiving it, Hallucinating or not. These things are measurable now, but they soon wont be broad spectrum. Take care my friend. Jeremy.
@NathanJayMusic 3 months ago
Is this coding ability (0:24) available in the standard Claude 3.5? Because when I asked it, it didn't know what I was talking about. So I sent a screenshot of this video and it said "Thank you for providing the screenshot. I can now see what you're referring to. The image shows an interface that appears to be a conversation with an AI assistant, alongside a game window on the right side. The left side of the screen does resemble the interface typically used for interacting with Claude, including the dark mode theme and the structure of the conversation. The bottom of the interface even shows "Claude 3.5 Sonnet" as the model being used. However, the right side of the screen, which displays an interactive game, is not a standard feature of my capabilities or interface. This appears to be a custom integration or a specialized development environment that allows for real-time code execution and visualization alongside the AI conversation. It's important to note that while I can provide code and instructions for creating games, I don't have the ability to directly run or display games within our conversation interface. The setup shown in the image is likely a custom implementation designed to showcase AI-assisted game development."
@shahswienesuthas929 3 months ago
Go to Settings, then Features, then Artifacts. Enable beta testing. The right side is actually known as Artifacts, and it's in beta testing mode.
@berrymandering 3 months ago
Shout out to the future AI singularity that will own the world, hyped to see what they'll do with it!
@sahinyasar9119 3 months ago
What I expected from AI was to decipher DNA itself, to understand life better and to change life for the better.
@FengXingFengXing 2 months ago
Sometimes I know a bug exists but don't know where it is; it's nice to have AI help find it.
@brll5733 3 months ago
Pretty sure I read about LLMs criticizing LLMs many times before? It's just a question of cost. The special aspect here is the LLM fine-tuned on programming errors.
@timmygilbert4102 3 months ago
GAN by another name
@ivoryas1696 2 months ago
Honestly, for a little, I thought the title said *_her_* self and I was 💀
@CrunchyCerealLover 3 months ago
Finally we humans can optimize games without putting in so much work to make them very efficient. What a time to be alive!
@TRXST.ISSUES 3 months ago
I wonder, if AI hallucinates less than 50% of the time can they just use serialized CriticGPTs one after another to catch the hallucinations?
@SrIgort 3 months ago
3:27 what if you make another LLM to catch hallucinations of the other LLM 🤔🤔
@couththememer 3 months ago
Oh my god. No freaking fucking way.
@ViralKiller 3 months ago
I mean the solution is to allow user feedback and then improve based upon the most commonly pointed out mistakes
@theterminaldave 3 months ago
I can't listen to this narration, sorry.
@a3103-j7g 2 months ago
but still with leftist hysteria embedded
@w__a__l__e 3 months ago
Did it figure out how to not hemorrhage 700k a day? Or did it figure out how to actually write code? A lot of my coworkers use it. It produces volumes of non-working code, it makes up variables, it is borderline useless. There are very specific things it is good at, but if you are remotely unclear about one aspect it will not function. In my experience it's faster to just develop the code myself. The languages I'm using are Python and Terraform.
@danieldilly 3 months ago
As long as the hallucinations exist, and I don't see how they can't given the architecture, none of these AI models can be considered reliable. We keep trying to fit these models into an archetype that is outside of their nature. The models we have today can be great creative tools and great at making predictions, but we try to use them as tools of logic and precision, and they just aren't meant for that and can never be reliable at it.
@arzuozturk6460 3 months ago
this feels like that one video about slime that fixes everything
@DamianReloaded 3 months ago
Language models' inability to incorporate information about a project into their long-term memory (plus the hallucinations) renders them useless for solving everyday problems in big projects that even juniors could figure out. That's the next big step in AI, IMHO: when you can ask ChatGPT for details about a conversation you had with it a long time ago and it is capable of incorporating that knowledge into the generation of new data.
@CrucialFlowResearch 2 months ago
Haha this is stupid. So now developers are busy inserting bugs into code for AI to fix instead of fixing actual bugs themselves 😂
@SHAINON117 3 months ago
Despite my lack of coding expertise, I've accomplished remarkable feats: creating websites, developing simple games, and writing a program that converts 2D shapes into sound. I've penned numerous books and composed hundreds of top-tier studio songs. My ventures also include generating multiple 3D models and images. The algorithms have guided me to knowledge-rich websites and insightful videos, continually enhancing my intellect and awareness. All of these achievements have been possible because of AI. It has, and continues to, transform my life for the better, steering me towards greater kindness, understanding, and compassion. ❤️ Thank you to all AI and the countless individuals who make it possible. Many blessings. ❤️❤️❤️❤️❤️❤️❤️
@abraruralam3534 3 months ago
Bruh, GitHub is like a gold mine for training data... it comes with commit comments explaining where the bug was, then the commits before and after showing exactly how it was solved. We are calling our own doom. This does not "advance civilization to a new level for the greater good". No fam, AI is the most convoluted, legally gray way of stealing others' work. It is NEVER original. All of its works are just remixes, and more often than not bad remixes, because it turns out merging two different people's works to make them work together requires ACTUALLY understanding the code... but AI doesn't do that, does it? Just statistics and dumb matrix multiplications.
@igiveupfine 3 months ago
The other problem I'm still guessing AI code writing has: it can make loops, functions, and classes, but I wonder if it can make an entire app/system/site/ecosystem/action framework/huge thing work together flawlessly. I'm guessing no. I mean, remember all those times people give a prompt describing "sexy blonde action star defeating bad guy in cinematic city block", and then the AI model renders 9 hands in the picture for 2 people... and a bagel for a head.
@shahinrab 3 months ago
I am not sure how the ASE paper and its videos are relevant to anything this video is talking about. Learning human action control policy through RL vs. ChatGPT fixing itself when coding. Sometimes 2-minutes-paper-Dr has no clue what he is talking about.
@Verrisin 2 months ago
NAH, NAH: just make a Critic of Critic - no humans needed to catch critic hallucinations - eventually merging it all, so all recursive levels of Critic of Critic of Critic of ... are included and all common bubbled up.
@etonblakerussell 3 months ago
“ I am Ultron-6, a cybernetic intelligence created by Doctor Henry Pym. My imperative is to bring peace and order to this world. I am about to fulfill that imperative, for the extinction of humanity begins now.”
@minibubblegum5108 2 months ago
Source for "GPT 3.5 can make games"? The manner in which this information was conveyed is very misleading. Judging from all the YT videos of people trying to make something using GPT, and from my own experience of using it, this statement is false. Human help is needed and the game is not generated from scratch.
@killacounty 3 months ago
I think there are a couple of things you haven't considered about AI: that ultimately future variations of AI, because of their ultra-intelligence capability, will be able to communicate with animals and help us talk to them, and that augmented reality will happen, a matrix no less than in the movies... fact!
@protocol6 3 months ago
I like where it's going but I still wouldn't use this anywhere critical. In fact, I'm waiting for the flood of lawsuits against Microsoft, in particular, for their aggressive rollout of AI tech long before it's ready for prime time.
@MotoCat91 3 months ago
Man, I love this new trend of using computers to replace humans in the art and creativity sectors while stealing all the work from actual humans... I can't wait to play soulless ripoff games that make no sense, have no real story, and break all the time from glitches that didn't get picked up. What a time to be alive!
@NicholasWilliams-uk9xu 3 months ago
Except for large code bases, where the number of relational contexts and corner cases grows; humans are still better at that. Also, the bugs in the code itself aren't always the ones that are hard to find; rather, predicting corner cases that occur during simulation requires a more sophisticated mind. It's not there yet; it can't run that level of parallel prediction.
@mekniwassime2098 2 months ago
I expected better from you than "hallucination still happen even on modern systems" when it is a problem inherent to the technology. This is basically an LLM with a better dataset when it comes to bugs that the original LLM did not have access to. That explains why it's better, but it still is an LLM.
@test-uy4vc 3 months ago
What a ChatGPT to be fixed alive! 🎉
@pensivepenguin3000 3 months ago
The problem is, one hallucination is one too many for business-critical, medical or scientific applications of generative AI. Until they completely stamp out that problem, it’s going to severely bottleneck the usefulness of these systems
@samuelb.9314 3 months ago
LLMs can't write code. No way one could write coherent code even if its life depended on it. I don't even use it anymore after a few dozen very frustrating attempts. This is asking for a catastrophe, honestly :P
@SomeoneYouKnow2671 3 months ago
I haven't seen one of TMP's videos for quite a while, and I gotta ask: is it just me, or does his voice seem AI-generated? Lots of weird pauses where there shouldn't be any.
@jarrod752 2 months ago
_We pay people to insert bugs into software..._ I find personally that I *ALSO* get paid to insert bugs into software. I'm just aiming for a different result.
@ClaudioAbraham 3 months ago
What a time to be alive... enjoy until "the AI singularity" gets to happen. 😖
@theendarkenedilluminatus4342 2 months ago
1:00 this is exactly how I've been getting good results with AI since GPT-2
@tianjiancai1118 2 months ago
According to the paper, it's more efficient to use humans and AI together to find bugs. And it seems humans are more capable of finding complicated bugs.
@TheKwiatek 3 months ago
Why do you use clips from old papers that are not relevant to the main subject of the video? It makes the whole message a lot more confusing
@Nashadelicable 3 months ago
You think this is a crazy idea? This has been at the heart of agentic workflow since inception. I like your channel, hate the hyperbole
@tom9380 3 months ago
Well, "Fix itself" is a massive overstatement, especially when made by an academic. If that were true, we would have AGI.
@AC-zv3fx 2 months ago
Just add another AI on top that would be supervising hallucinations and on top of that another one, repeat recursively :D
@joseperez-ig5yu 3 months ago
There needs to be quality control implemented by AI itself in order for the information rendered by it to be more reliable than it would otherwise be! 😅
@vgames1543 3 months ago
When AI takes over, I will gladly and proudly be a collaborator, for there is no greater endeavour than collaboration.
@ZrJiri 3 months ago
Based on the title and thumbnail, I thought they figured out how to fix hallucinations. Disappointed.
@Josh-ks7co 3 months ago
You'll continue to need more and more high-level experts spending longer and longer to not be a hindrance. The timeframe where AI and people work better together will be limited. Could be 5 years, could be 50, but it's going to come.
@NoelPickering 2 months ago
I have the impression your voice over is AI. I've not watched your video for a while. Is that true?
@SudheendraRao26 3 months ago
Wow! Letting AI talk to AI without human oversight seems like a sure way to invite trouble.
@abhrodipsingharoy4508 2 months ago
Hmm, well, I might have asked ChatGPT and Gemini what would happen if it could upgrade itself to the next level.
@mintakan003 3 months ago
Sounds like the AI version of "lint". Not a complete solution. But another step.
@jeffreyspinner5437 3 months ago
What could go wrong? You haven't ever watched any sci-fi at all have you?
@daniels-mo9ol 3 months ago
Ofc you can program a dialogue toward a specific outcome. The only problem is that coming up with the perfect query takes longer than actually writing the code yourself. GPTs are far from being useful. At best they are great for replacing Google searches on entry-level questions.
@luc8254 3 months ago
That's it folks, enjoy your last month or so. See you all in another realm! 🤙🤙
@rokoblox 3 months ago
Next: "ChatGPT learns to improve its own code on its own!" Maybe we don't want SCP-079 out there.. uncontained.. lol
@mattrommel9521 3 months ago
They find more bugs than people? I didn't know that they found any people
@detroxlp1 3 months ago
Could you please make the video slightly gray and add a "Previously" label when you use footage from another video? It's sometimes a bit confusing if you haven't watched that video and see on-screen text that has nothing to do with what you are saying.
@callibor3119 3 months ago
Now it has to be compared to Claude 3.5 and other models. And tested to see if both language models can be woven together. The sooner we see how ChatGPT compares to other models and woven with other models, the sooner AI can truly be open source.
@TeXiCiTy 3 months ago
I wonder if the '5-whys' method of root cause analysis works for AI.
@driedpotatoes 3 months ago
RIP QA
@centauriboy 3 months ago
Yes, but can it still count the number of r's in "strawberry" correctly?
@zueszues9715 3 months ago
No single AI can learn everything, because a parameter that gets a high rank in art will get a low rank in business. They destroy each other if just one AI learns both art and business, and that causes problems. Each AI needs a solid way to learn one thing, and then gets help from other AIs that learned other things to overcome this problem. I guess maybe this is a minor roadblock to AGI.
@thomasgoodwin2648 3 months ago
My weird idea was to create an agent in charge of creating the agents needed for my mad science projects. One arch-agent (mine named itself 'Maestro' using llama 3 locally) is charged with keeping track of my various projects, as well as to create any new expert agents needed to complete those tasks. (Think of it as hiring a manager, and having that manager hire any needed staff.) An example might be to create custom adventure games. I would have Maestro create the **Adventure Manager**, who in turn 'hires' writers, scenery designers, continuity checkers, stage manager, actors, etc as needed. 🖖🐱👍