New ChatGPT: Hallucinations No More!

Views: 65,798

A day ago

Comments: 200

@AdvantestInc 1 month ago

Fascinating exploration of AI’s push to address hallucinations! Especially intriguing was the idea of AI's confidence adjustment, which reminds us how essential it is for these tools to not just deliver answers, but to do so with a level of assuredness we can trust.

@yaazarai 1 month ago

A level of assuredness Google* can trust. These models can be biased, don't trust them...

@Churdington 1 month ago

@@yaazarai The interesting thing is that just about every negative critique we could give to AI, we could also give to people. AI can be biased, but so can human intelligence. AI can be bad at creating artwork, and it can be influenced by other people's art, but human intelligence often creates worse art, and human intelligence often learns from other artists, too, even practicing their techniques to learn. Human intelligence first has to learn what the physical world looks like, though. For example, a person who has been blind their whole life can't imagine what colors look like. They also don't understand what it means for objects to appear smaller as they are farther away, because they've never seen that happen before. (There are some interesting interviews with blind people on YouTube.) Similarly to humans, AI has to learn before it can create art. Also, AI can be bad at math, but, again, human intelligence is often much worse. Etc, etc. (That wasn't an argument against what you said, btw)

@shApYT 1 month ago

I wish this channel went back to highlighting ignored papers instead of covering hyped press releases.

@KevinGriffin-l6r 1 month ago

It is hard to do that with the quantity of papers being written in the AI space. There is just so much to look at and evaluate.

@Lucky-df8uz 1 month ago

It's one of the main reasons I stopped watching as much :( Sometimes a press release is fine, but it's just not what I originally came to Two Minute Papers for, and if I want a press release, they are already out there in a ton of formats.

@RogerBernstein 1 month ago

I totally agree! This channel has unfortunately become one of those typical hyped up AI youtubers where all we get is one sensationalist / overselling press release after another with no questions asked. At first I was quite interested in the progress in AI, but I stopped watching a few months ago and checked in today after a long time to see that almost every video is either OpenAI or Nvidia. The worst part is that he also started to sensationalize and oversell the results like in the AlphaProteo video. AlphaFold and AlphaProteo are nice but closed, commercialized products with very limited use cases in actual research, not like "protein folding is solved" as he claimed.

@shabadrandhawa3829 26 days ago

@@RogerBernstein agree

@telotawa 1 month ago

it's far more censored than google search - it's basically useless imo, censorship kills usefulness

@RyluRocky 1 month ago

I completely disagree; it’s objectively far less censored than Google.

@thornelderfin 1 month ago

@@RyluRocky He means Google Search not Google Gemini. Google Search is not censored at all.

@malditonuke 1 month ago

Google search is censored. Look for things that you know you aren't supposed to find. Then notice how you don't find them.

@thornelderfin 1 month ago

@@malditonuke I meant porn and similar legal things that are usually censored (definitely censored in all AI searches and all AI prompts except Mistral). What exactly did you mean by "what you aren't supposed to find"? I hope not some conspiracy theory.

@Faizan29353 1 month ago

@@thornelderfin "Compare them with YANDEX." Then yes, Google is a bit censored... Say, "Movie, Software Piracy sites" etc etc

@jsalsman 1 month ago

Thank you for addressing hallucinations. I find the new search hallucinates much more than GPT-4o with web browsing from before a week ago. Insidiously, it doesn't just hallucinate titles, authors, and dates, but also snippets, which make the fake citations extremely enticing. There needs to be a button to push on hallucinated references that you really want to exist that will make some agent on the back end go out and write the paper as penance for lying.

@UserName-pi9no 1 month ago

You're welcome

@DialingSpoon527 1 month ago

Have you seen the Oasis Minecraft AI? It's an absolutely insane project.

@TwoMinutePapers 1 month ago

Thanks for the recommendation, I haven't seen it yet. Super cool - can't wait to have a closer look!

@skyless_moon 1 month ago

@@TwoMinutePapers new video? 😯 I hope so!

@DialingSpoon527 1 month ago

@skyless_moon I do partially fear the might of the scholars killing the relatively short wait times currently enjoyed.

@dinhero21 1 month ago

pretty insane, it's basically a public version of GameNGen

@TheAutisticRebel 1 month ago

@@TwoMinutePapers It's INSANE!

@Paulginz 1 month ago

The most interesting result from this paper, in my opinion, is that even when trying on purpose to come up with uncontroversial questions that have a single correct answer, and double-checking, humans only agree on facts 97% of the time.

@amitraam1270 1 month ago

If AI learns from human texts, why wouldn't it hallucinate? Don't we work with incomplete data all the time? So, that is what it's "learning".

@steve23063 1 month ago

Why should it hallucinate? It could instead say it’s very unsure and that its response is a guess that’s likely incorrect due to incomplete data. Instead it provides a response with the same sense of certainty every time, whether it’s hallucinating or not.

@amitraam1270 1 month ago

@steve23063 hi, have you met us humanz? 😀 We do that, too!

@jerrygreenest 1 month ago

It hallucinates all the time. For common things it's okay, as they're easy to answer. But ask it something just slightly non-trivial, it will always give you bullshit answers, while being certain it is not wrong (but it is wrong)

@jerrygreenest 1 month ago

In programming it happens all the time. Ask it something about bash, a very common and widespread thing, and it will likely answer properly. But ask it about the relatively new shell Nushell, which is already 5 years old, and it knows that it exists but is completely wrong about how it's used and about which code is valid and which is not. It gives invalid information all the time.

@TheAutisticRebel 1 month ago

Hallucination is a stupid, confusing word. All we do is hallucinate. That is the structure of neural networks. The map is never the territory. It's a computation engine. It doesn't HALLUCINATE, it does what neural networks do. Mind you, they are very very very VERY DUMB insulated NETWORKS that are rigid and lack the plasticity of the human brain. They are doing what they were designed to do. We just need more nuance in its own feedback loop.

@meowcoo 1 month ago

Isn't this basically bing AI?

@dtibor5903 1 month ago

Similar but not identical

@n-i-n-o 1 month ago

Yes

@maxpilip9277 1 month ago

Bing AI is worse

@bapo224 1 month ago

Perplexity AI seems better

@n-i-n-o 1 month ago

@@bapo224 I agree, I have the Pro subscription (got 1 year for free) and it's really nice.

@mercerwing1458 1 month ago

I've been waiting for something like this! Google needs competition SO BAD

@tannenbaumxy 1 month ago

You might want to give Perplexity a try. In my subjective experience, it's been better across the board.

@Boothiepro 1 month ago

I mean, Microsoft's Bing AI/Copilot has been using GPT-4 with web search for at least a year now, and it cited sources too

@kevindixonmusic 1 month ago

I hope OpenAI makes a search site. I would use it from day 1

@rkan2 1 month ago

@@Boothiepro it still always felt like a downgrade to ChatGPT

@egyvalaki2774 1 month ago

Like Google "Learn About"?

@TewaAya 1 month ago

ChatGPT still will not source from the old forums. I found a reference to what I wanted to see, from around 2003-2011, on Bing.

@Chef_PC 1 month ago

That Dream Theater search put a big smile on my face.

@TheAutisticRebel 1 month ago

Me TOO... THAT DID NOT GO UNNOTICED! I love the subtleties in his videos. Love love LOVE THIS CHANNEL!!! 🎉🎉🎉

@Iris_and_or_George 1 month ago

Uuuh, planning a trip, finding a place for dinner, and asking about the weather for a specific day have been possible in normal ChatGPT and Copilot for ages, right? Same as showing sources? I've done all three. I've been using them instead of Google for months. Or am I missing something? Please enlighten me.

@Iris_and_or_George 1 month ago

Except whenever it censors something stupid and annoys me. Gemini and Copilot don't like it when you reply to their "I can't give you blahblah" with: "Your own search engine will give it to me no problem, so the company doesn't mind."

@Sus_Bak 1 month ago

Finally a competitor to Perplexity 😂

@edumazieri 1 month ago

a bit late to the party :p

@bzikarius 1 month ago

What can I say? Fact-checking is finally here. It is something that should have been there from the start, like persistent memory for saved conclusions, or a knowledge base where facts can be checked or aligned. Neural networks have finally become useful.

@SunnyOst 1 month ago

One of the biggest things that kept leaving a poor taste for me when using chatgpt - when it was super confidently wrong. Excited to see it being unsure!

@dinhero21 1 month ago

sadly, RLHF will do that, it's one of the biggest problems in AI currently (aside from stuff like, spatial reasoning)

@Badumtss2468 1 month ago

Doesn’t Perplexity AI already do that?

@JYobtat8156 1 month ago

I just use Perplexity lol

@krux02 1 month ago

I am kind of over AI. In software it just tries to solve the wrong problem: automating code duplication instead of cutting down on it. Code is a liability, not an asset, after all. In art I am not as deeply invested, but it feels similar, though I can't put it into words. For search engines I just don't trust it. I really prefer a link to a dodgy site that I can judge on my own, rather than an AI summarizing it for me and not telling me where it came from. So can you please do some other science instead? The unpopular science maybe? How AI-generated code creates more bugs faster? How computers in education consistently worsen students' learning outcomes?

@SimGunther 1 month ago

That's what parsers and linters are supposed to be for, but instead we get a bunch of guessing hoping a cluster of those neurons will come up with the right optimizations and that's if this code actually solves the problem you're invested in.

@antonystringfellow5152 1 month ago

Is that also the case with Claude 3.5 (new) sonnet? I've heard a lot of devs saying it's much better than the other models out there

@DreckbobBratpfanne 1 month ago

@@antonystringfellow5152 it is. o1 models can also create algorithms. Not really sure what the original post wants here. Research has also shown that even the old GPT-3.5 is useful as an assistant in e.g. programming, the upgraded models even more so.

@krux02 1 month ago

@@SimGunther I absolutely agree. I try to work on this problem by inventing my own language. But you have to agree that there is the problem that programming languages kind of need to keep their bugs and inconsistencies, otherwise they break their existing code, and therefore stop being the language they initially were. This is a hard problem with no good solution to it (yet).

@krux02 1 month ago

@@antonystringfellow5152 I have not tried, and I am kind of not interested, as I already stated in my original post. I am kind of over with AI.

@jantube358 1 month ago

So all of the custom versions of ChatGPT that you can create for a specific purpose, which OpenAI calls "GPTs" and introduced in November 2023, will be redundant soon? Or are they already? I remember the Tech Advisor GPT, the Law GPT, the JavaScript GPT and other GPTs. I was already wondering if using them makes sense at all or if the regular ChatGPT would give me the same results, for example when I had specific questions regarding job application letters or national law codes. What do you think about the redundancy of those GPTs for the consumer customer?

@ipodtouchiscoollol 1 month ago

Doesn't Microsoft Copilot already do this???

@theupsider 1 month ago

It does.

@kipchickensout 1 month ago

i asked it to get the newest CS2 update and it searches the wrong pages. i googled cs2 updates and the first link was the correct one. then i told it what webpage to go to and it got the correct date of the recent update but then it just listed only items from previous updates because they were on the same page but further down...

@appvoid 1 month ago

it will, in fact, hallucinate

@aburak621 1 month ago

Karoly, the Dream Theater fan let's goooooooo!!!

@cupofjoen 1 month ago

Forcing chatgpt to cite real sources was the hardest thing for me. Too bad I already finished my thesis lol. No worries I only use it to fix my grammar.

@LorenzoValente 1 month ago

Wait a second... Karoly is a Dream Theater fan????? 0:31

@TwoMinutePapers 1 month ago

Just went to my first concert a couple days ago...it was a Dream (Theater) come true. Couldn't believe I finally saw them! So good!

@LorenzoValente 1 month ago

@@TwoMinutePapers Oh boy, I saw them live in Milan two weeks ago! So glad you enjoyed it too, it was indeed a dream ❤ I'm sure you also loved the crazy lasers/lights show (as a light simulation researcher by trade!)

@TheAutisticRebel 1 month ago

So jealous. Seen hundreds of concerts... not one of them... 😢

@someghosts 1 month ago

Has anyone made the joke that you’re still getting hallucinations since the page you get back on google was probably ai generated too? If not then I’d like to make that joke now.

@hotel_arcadia 1 month ago

DREAM THEATER MENTION ❗❗

@ak-gi3eu 1 month ago

It would be great if we had the option to modify the prompt after the response is generated, instead of typing a new prompt again

@dennisdelgado4276 1 month ago

Love how dude sprinkled ever so lightly his political stance 😅

@luckyankraj 1 month ago

So they got scared of perplexity 😂

@thelowlytrinity 1 month ago

Google is quaking.

@parthasarathyvenkatadri 1 month ago

Ask it funny questions like "Why is John Wick an idiot?" 😂

@cbnewham5633 1 month ago

Perplexity AI has been doing this for ages. OpenAI is late to the party. And no mention of Perplexity in this video.

@TheAutisticRebel 1 month ago

Agreed, but it's NOT ABOUT Perplexity so WHO CARES

@cbnewham5633 1 month ago

@TheAutisticRebel I care, obviously, otherwise I would not have commented. Yes, I know it's not about Perplexity (and I just knew someone would follow up my comment to make that obvious statement). It is remiss not to mention Perplexity at all.

@mekniwassime2098 1 month ago

"Hallucinations No More" I expected better from this channel.

@DanielRut 1 month ago

hallucination and accuracy vs perplexity?

@the-0-endless376 1 month ago

AI chatbots are a dead end tech. This is a waste of money and time they could have put towards medical AI instead. Cancer predictions, not planning trips to hotels that don't exist.

@crawkn 1 month ago

If you can ask an AI for a level of confidence and get a meaningful evaluation correlated with accuracy, why wouldn't you just have the AI assess its confidence automatically, and make adjustments where appropriate? There should be a threshold of confidence below which a given statement should be rejected, or appropriate disclaimers provided within the response. Don't wait to be corrected and then disclaim.
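
The thresholding idea in this comment can be sketched in a few lines of Python. This is only an illustration, assuming the model can supply a confidence score in [0, 1] that is reasonably well correlated with accuracy; the function name `gate_answer` and the two cutoff values are hypothetical, not part of any real product or API.

```python
def gate_answer(answer: str, confidence: float,
                reject_below: float = 0.3,
                disclaim_below: float = 0.7) -> str:
    """Reject, disclaim, or pass through an answer based on a confidence score.

    `confidence` is assumed to be a calibrated score in [0, 1];
    both thresholds are arbitrary illustrative values.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be in [0, 1]")
    if confidence < reject_below:
        # Below the rejection threshold: refuse rather than guess.
        return "I don't know."
    if confidence < disclaim_below:
        # Middle band: keep the answer but attach a disclaimer up front.
        return f"{answer} (low confidence: {confidence:.0%})"
    # High confidence: return the answer unchanged.
    return answer


print(gate_answer("Paris", 0.95))
print(gate_answer("Paris", 0.50))
print(gate_answer("Paris", 0.10))
```

The whole scheme depends on the confidence score being meaningful in the first place; the gating itself is the easy part.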

@joletun 1 month ago

This is wrong. It still hallucinates. Especially with new data. Like new API code and programming languages.

@SP-ny1fk 1 month ago

It isn't very good at telling SEO spam websites apart from real websites though

@dzxtricks 1 month ago

Is there no "confidence" parameter included in each answer so we know how much we can trust it? Considering we can ask "how confident", it seems this could be extracted and shown to users.

@noriniatpacgniekcah 1 month ago

This is the beta testers' contribution to the world, in an indirect way

@KevinLSchwartz 1 month ago

Have you had a chance to try out the new o1 (not preview, but whatever is after that)? What do you think of it?

@xtc_w 1 month ago

I thought he did a video on it

@mantekarys 1 month ago

He already did, not that long ago

@Kuchiriel 1 month ago

I can only comment on other people's comments, why?

@goodfortunetoyou 1 month ago

I asked an AI model on a United States-Mexico-Canada Free Trade Agreement website whether or not lumber was part of the agreement, and where that information is located in the PDF text of the agreement. I wanted it to reference the underlying language in the PDF. It failed to correctly recognize that I wanted a PDF, and failed to correctly localize any part of the agreement that focused on whether or not lumber was subject to the agreement, or had an exemption which would allow for tariffs. I'm by no means an expert on legal texts or trade agreements, but I count that experiment as failed, using an unknown version of a public chatbot model.

@mahdoosh1907 1 month ago

×10 more heat = ×10 less water

@Yourname942 1 month ago

will the search results be skewed by sponsors paying to be at the top?

@kachowbltch3585 1 month ago

I wonder what happens to all those sites that rely on advertising for upkeep once they aren’t getting visitors anymore

@TheAutisticRebel 1 month ago

As the pie gets bigger they are less likely to be found anyway... Unless... hmmnnn 🤔 what if they aren't on the first page of Google?

@TheAutisticRebel 1 month ago

Sure, I wonder what will happen to all those newspapers too!

@V0TION 1 month ago

Coal search

@DR20005 1 month ago

Since this dropped, I stopped using Google

@vblaas246 1 month ago

Two minute papers, have you cloned your own voice recently?

@-_Nuke_- 1 month ago

Hello guys, I'm using the free version of ChatGPT for coding. But I wonder, is there a free AI that is dedicated to coding?

@TheAutisticRebel 1 month ago

There are but they all suffer the same issues to one degree or another. Coding is still a human activity... for now.

@-_Nuke_- 1 month ago

@@TheAutisticRebel I see! I mean, so far with ChatGPT I get really good results (taking into account that it's not an actual person). I wonder if there is anything that can see the entire code that you are writing in VS Code, for example. I know about Copilot, but that is not free, right?

@TheAkdzyn 1 month ago

I find myself prompting Google like an Ai more often nowadays. This will make things easier!

@gwenandersen6803 1 month ago

I don't think it's gonna be better than Google. Even with the ads, Google gives very good results and I don't need to search for too long to find what I need

@henrythegreatamerican8136 1 month ago

Just what we needed.... Another search engine.

@TheAutisticRebel 1 month ago

🎉🎉🎉 Hahaha spit out my coffee loling!!! How true is THAT! 🤣🤣🤣

@stanleesiele6028 1 month ago

Still can't break ecdlp

@NoThing-ec9km 1 month ago

Wowiee

@Billy4321able 1 month ago

0:14 Ah yes, write a Python script. I ask that all the time. 🤦

@billr3053 1 month ago

I might. If I was forced to work in Python.

@TheAutisticRebel 1 month ago

I have AUTOMATED some cool stuff using Python. "Real coders" laughed at the language... Ridiculous. It's so powerful.

@SimGunther 1 month ago

@@TheAutisticRebel I've also done plenty of automation in Python and bash. Pretty good stuff to be made with those two languages!

@CookieXD1998 1 month ago

I actually do. I am not good with Python yet and needed some simple scripts that reformat text strings

@ripplerxeon 1 month ago

IDK the direction we are going. Soon everything will be done by an AI, which takes the fun out of searching the web. Before, if I got a code error, I would search for it, look at what other people did to solve it, and learn new things along the way. Now when I ask ChatGPT I get the answer, and the fun of solving a problem by myself is gone. It just feels wrong (I know no one is stopping me from doing it the old way, but as humans we are lazy and in the end we do the dumb sh*). I feel like we will be doing nothing in the future and just doom scrolling some YouTube shorts/TikToks etc.

@dioc8699 1 month ago

Or u could go out and have fun with all that free time. Choice is urs.

@gremlinsaregold8890 1 month ago

To be honest, this is rubbish. Only yesterday, while correcting a transcription, I got a hilarious and totally hallucinated answer that had nothing at all to do with the expected output. It recited "I Wanna Hold Your Hand" by the Beatles

@_Clivey 1 month ago

As always though, having links to source is good, but if that source itself is AI generated, then we are back to square one

@nyyotam4057 1 month ago

The problem with this is that the model can invent sources as well. The only way to minimize hallucinations is to make the model stick to the training data by setting the temperature to 0. Which is, perhaps, what they do.
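
For readers unfamiliar with the temperature setting mentioned here, the following is a minimal Python sketch of what it does mechanically when sampling a token, assuming the usual softmax-with-temperature formulation. The function name and the example logits are made up for illustration. As the temperature approaches 0 the distribution collapses onto the single highest-scoring token, so decoding becomes deterministic; the sketch makes no claim about whether that token is factually correct.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, sharpened or flattened by temperature.

    A temperature of 0 is treated as the limiting case: all probability
    mass goes to the highest-scoring entry (greedy decoding).
    """
    if temperature <= 0:
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


example_logits = [2.0, 1.0, 0.1]  # hypothetical scores for three candidate tokens
for t in (1.0, 0.5, 0.0):
    probs = softmax_with_temperature(example_logits, t)
    print(t, [round(p, 3) for p in probs])
```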

@CharlieShread 1 month ago

Sider extension has been doing this for quite a long time already. Integrating various different models with internet search, including Claude Sonnet 3.5, GPT 4o and others. It's epic.

@AJRadaza 1 month ago

At this point, this channel can train its own AI to make new videos

@asifrezai2195 1 month ago

2nd comment

@MikkoRantalainen 1 month ago

I wish every AI emitted a probability that its answer is correct, even without me specifically prompting for it.

@miked7373 1 month ago

SOOOO BAD! I've been testing it out for a few days. It will copy and paste the exact same response several times in a row. I even tried giving it false information and it almost always agreed with me! Even after I provided the correct information afterwards, it would still go with its first incorrect answer. SOOOOO BAD! 😂😂😂

@TheAutisticRebel 1 month ago

Same experience. So don't use it for that, then. It's a powerful tool.

@Topnichemarket 1 month ago

This new feature in ChatGPT is incredible! Not just regular search, but interactive answers that dive deeper, cite sources, and adapt to complex questions; this could change everything! Excited to see how this reduces those pesky "hallucinations." What a time for AI! Thanks, OpenAI!

@SireBab 1 month ago

Honestly man, I'm really tired of these AI videos. LLMs are a neat toy, but it feels like they're all you put videos out on anymore. If this keeps up you're losing a sub. I also can't shake the feeling that you have a strong bias towards OpenAI that feels unfair to us viewers. Please do better.

@TheAutisticRebel 1 month ago

Go then! What's your problem? Seriously GO!

@SireBab 1 month ago

@TheAutisticRebel wow, it's almost like I don't "want" the channel to turn into AI-only stuff, like I said in my comment?

@NostraDavid2 1 month ago

The OpenAI bias is real - he could've covered Claude controlling a PC, which is a pretty decent breakthrough. As big as the voice mode of OpenAI, IMO. I hope he'll do better in that regard.

@catube6915 1 month ago

Nobody cares about "A.I."

@mattmmilli8287 1 month ago

Maybe your industry isn't being touched yet, but yes, yes we do. If you are not on top of this, you're gonna get left behind real soon 😮

@skyless_moon 1 month ago

@@mattmmilli8287 r/woosh!

@billr3053 1 month ago

@@skyless_moon Doubt it. To be fair to the OP, this channel was better tackling 3D rendering techniques. Physics simulations, human kinetics. ChatGPT is flavor of the day and therefore ensures more clicks… so…