Fascinating exploration of AI’s push to address hallucinations! Especially intriguing was the idea of AI's confidence adjustment; it reminds us how essential it is for these tools to not just deliver answers, but to do so with a level of assuredness we can trust.
@yaazarai 1 month ago
A level of assuredness Google* can trust. These models can be biased, don't trust them...
@Churdington 1 month ago
@@yaazarai The interesting thing is: just about every negative critique we could give to AI, we could also give to people. AI can be biased, but so can human intelligence. AI can be bad at creating artwork, and it can be influenced by other people's art, but human intelligence often creates worse art, and human intelligence often learns from other artists, too, even practicing their techniques to learn. Human intelligence first has to learn what the physical world looks like, though. For example, a person that has been blind their whole life can't imagine what colors look like. They also don't understand what it means for objects to appear smaller as they are farther away, because they've never seen that happen before. (There are some interesting interviews with blind people on YouTube.) Similarly to humans, AI has to learn before it can create art. Also, AI can be bad at math, but, again, human intelligence is often much worse. Etc, etc. (That wasn't an argument against what you said, btw)
@shApYT 1 month ago
I wish this channel went back to highlighting ignored papers instead of covering hyped press releases.
@KevinGriffin-l6r 1 month ago
It is hard to do that with the quantity of papers being written in the AI space. There is just so much to look at and evaluate.
@Lucky-df8uz 1 month ago
It's one of the main reasons I stopped watching as much :( Like, sometimes a press release is fine, but it's just not what I originally came to Two Minute Papers for, and if I want a press release, they are already out there in a ton of formats.
@RogerBernstein 1 month ago
I totally agree! This channel has unfortunately become one of those typical hyped up AI youtubers where all we get is one sensationalist / overselling press release after another with no questions asked. At first I was quite interested in the progress in AI, but I stopped watching a few months ago and checked in today after a long time to see that almost every video is either OpenAI or Nvidia. The worst part is that he also started to sensationalize and oversell the results like in the AlphaProteo video. AlphaFold and AlphaProteo are nice but closed, commercialized products with very limited use cases in actual research, not like "protein folding is solved" as he claimed.
@shabadrandhawa3829 26 days ago
@@RogerBernstein agree
@telotawa 1 month ago
it's far more censored than google search - it's basically useless imo, censorship kills usefulness
@RyluRocky 1 month ago
I completely disagree; it’s objectively far less censored than Google.
@thornelderfin 1 month ago
@@RyluRocky He means Google Search not Google Gemini. Google Search is not censored at all.
@malditonuke 1 month ago
Google search is censored. Look for things that you know you aren't supposed to find. Then notice how you don't find them.
@thornelderfin 1 month ago
@@malditonuke I meant porn and similar legal things that are usually censored (definitely censored in all AI searches and all AI prompts except Mistral). What exactly did you mean by "what you aren't supposed to find"? I hope not some conspiracy theory.
@Faizan29353 1 month ago
@@thornelderfin "Compare them with YANDEX" Then yes google is a bit censored... Say "Movie, Software Piracy sites" etc etc
@jsalsman 1 month ago
Thank you for addressing hallucinations. I find the new search hallucinates much more than GPT-4o with web browsing from before a week ago. Insidiously, it doesn't just hallucinate titles, authors, and dates, but also snippets which make the fake citations extremely enticing. There needs to be a button to push on hallucinated references that you really want to exist that will make some agent on the back end go out and write the paper as penance for lying.
@UserName-pi9no 1 month ago
You're welcome
@DialingSpoon527 1 month ago
Have you seen the Oasis Minecraft AI? It's an absolutely insane project.
@TwoMinutePapers 1 month ago
Thanks for the recommendation, I haven't seen it yet. Super cool - can't wait to have a closer look!
@skyless_moon 1 month ago
@@TwoMinutePapers new video? 😯 I hope so!
@DialingSpoon527 1 month ago
@skyless_moon I do partially fear the might of the scholars killing the relatively short wait times currently enjoyed.
@dinhero21 1 month ago
pretty insane, it's basically a public version of GameNGen
@TheAutisticRebel 1 month ago
@@TwoMinutePapers It's INSANE!
@Paulginz 1 month ago
The most interesting result from this paper, in my opinion, is that even when trying on purpose to come up with uncontroversial questions that have a single correct answer, and double-checking, humans only agree on facts 97% of the time.
@amitraam1270 1 month ago
If AI learns from human texts, why wouldn't it hallucinate? Don't we work with incomplete data all the time? So, that is what it's "learning".
@steve23063 1 month ago
Why should it hallucinate? It could instead say it’s very unsure and that its response is a guess that’s likely incorrect due to incomplete data. Instead it provides a response with the same sense of certainty every time, whether it’s hallucinating or not.
@amitraam1270 1 month ago
@steve23063 hi, have you met us humanz? 😀 We do that, too!
@jerrygreenest 1 month ago
It hallucinates all the time. For common things it's okay, as they're easy to answer. But ask it something just slightly non-trivial, it will always give you bullshit answers, while being certain it is not wrong (but it is wrong)
@jerrygreenest 1 month ago
In programming it happens all the time. Ask it something about bash, a very common, widespread thing, and it will likely answer properly. But ask it about the relatively new shell Nushell, which is 5 years old already, and it knows about its existence overall but is completely wrong about how it's used, which code is valid and which is not. Invalid knowledge all the time.
@TheAutisticRebel 1 month ago
Hallucination is a stupid, confusing word. All we do is hallucinate. That is the structure of neural networks. The map is never the territory. It's a computation engine. It doesn't HALLUCINATE, it does what neural networks do. Mind you, they are very very very VERY DUMB insulated NETWORKS that are rigid and lack the plasticity of the human brain. They are doing what they were designed to do. We just need more nuance in its own feedback loop.
@meowcoo 1 month ago
Isn't this basically bing AI?
@dtibor5903 1 month ago
Similar but not identical
@n-i-n-o 1 month ago
Yes
@maxpilip9277 1 month ago
Bing AI is worse
@bapo224 1 month ago
Perplexity AI seems better
@n-i-n-o 1 month ago
@@bapo224 I agree, I have the Pro subscription (got 1 year for free) and it's really nice.
@mercerwing1458 1 month ago
I've been waiting for something like this! Google needs competition SO BAD
@tannenbaumxy 1 month ago
You might want to give Perplexity a try. In my subjective experience, it's been better across the board.
@Boothiepro 1 month ago
I mean, Microsoft's Bing AI/Copilot has been using GPT-4 with web search for at least a year now, and it cited sources too
@kevindixonmusic 1 month ago
I hope OpenAI makes a search site. I would use it from day 1
@rkan2 1 month ago
@@Boothiepro it still always felt like a downgrade to ChatGPT
@egyvalaki2774 1 month ago
Like Google "Learn About"?
@TewaAya 1 month ago
ChatGPT still will not source from the old forums. I found a reference to what I wanted to see, from around 2003-2011, on Bing.
@Chef_PC 1 month ago
That Dream Theater search put a big smile on my face.
@TheAutisticRebel 1 month ago
Me TOO... THAT DID NOT GO UNNOTICED! I love the subtleties in his videos. Love love LOVE THIS CHANNEL!!! 🎉🎉🎉
@Iris_and_or_George 1 month ago
Uuuh, planning a trip, finding a place for dinner, and asking about the weather for a specific day have been possible in normal ChatGPT and Copilot for ages, right? Same as showing sources. I've done all three. I've been using them instead of Google for months. Or am I missing something? Please enlighten me.
@Iris_and_or_George 1 month ago
Except whenever it censors something stupid and annoys me. Gemini and Copilot don't like it when you reply to their "I can't give you blahblah" with "your own search engine will give it to me no problem, so the company doesn't mind".
@Sus_Bak 1 month ago
Finally a competitor to Perplexity 😂
@edumazieri 1 month ago
a bit late to the party :p
@bzikarius 1 month ago
What can I say? Fact-checking is finally here. It is something that should have been there from the start, like persistent memory for saved conclusions, or a knowledge base where facts can be checked or aligned. Neural networks are finally becoming useful.
@SunnyOst 1 month ago
One of the biggest things that kept leaving a poor taste for me when using chatgpt - when it was super confidently wrong. Excited to see it being unsure!
@dinhero21 1 month ago
sadly, RLHF will do that, it's one of the biggest problems in AI currently (aside from stuff like, spatial reasoning)
@Badumtss2468 1 month ago
Doesn’t Perplexity AI already do that?
@JYobtat8156 1 month ago
I just use Perplexity lol
@krux02 1 month ago
I am kind of over with AI. In software it just tries to solve the wrong problem: automating code duplication instead of cutting down on it. Code is a liability, not an asset, after all. In art I am not as deeply invested, but it feels similar there, though I can't put it into words. For search engines I just don't trust it. I really prefer a link to a dodgy site that I can judge on my own, rather than an AI summarizing it for me and not telling me where it came from. So can you please do some other science instead? The unpopular science maybe? How AI-generated code creates more bugs faster? How computers in education consistently worsen students' learning outcomes?
@SimGunther 1 month ago
That's what parsers and linters are supposed to be for, but instead we get a bunch of guessing hoping a cluster of those neurons will come up with the right optimizations and that's if this code actually solves the problem you're invested in.
@antonystringfellow5152 1 month ago
Is that also the case with Claude 3.5 (new) sonnet? I've heard a lot of devs saying it's much better than the other models out there
@DreckbobBratpfanne 1 month ago
@@antonystringfellow5152 It is. o1 models can also create algorithms. Not really sure what the original post wants here. Research has also proven that even the old GPT-3.5 is useful as an assistant in e.g. programming, the upgraded models even more so.
@krux02 1 month ago
@@SimGunther I absolutely agree. I try to work on this problem by inventing my own language. But you have to agree that there is the problem that programming languages kind of need to keep their bugs and inconsistencies; otherwise they break existing code and therefore stop being the language they initially were. This is a hard problem with no good solution to it (yet).
@krux02 1 month ago
@@antonystringfellow5152 I have not tried, and I am kind of not interested, as I already stated in my original post. I am kind of over with AI.
@jantube358 1 month ago
So all of the custom versions of ChatGPT that you can create for a specific purpose, which OpenAI calls "GPTs" and which were introduced in November 2023, will be redundant soon? Or are they already? I remember the Tech Advisor GPT, the Law GPT, the JavaScript GPT and other GPTs. I was already wondering if using them makes sense at all or if the regular ChatGPT would give me the same results, for example when I had specific questions regarding job application letters or national law codes. What do you think about the redundancy of those GPTs for the consumer?
@ipodtouchiscoollol 1 month ago
doesn't Microsoft Copilot already do this???
@theupsider 1 month ago
It does.
@kipchickensout 1 month ago
i asked it to get the newest CS2 update and it searches the wrong pages. i googled cs2 updates and the first link was the correct one. then i told it what webpage to go to and it got the correct date of the recent update but then it just listed only items from previous updates because they were on the same page but further down...
@appvoid 1 month ago
it will, in fact, hallucinate
@aburak621 1 month ago
Karoly, the Dream Theater fan let's goooooooo!!!
@cupofjoen 1 month ago
Forcing chatgpt to cite real sources was the hardest thing for me. Too bad I already finished my thesis lol. No worries I only use it to fix my grammar.
@LorenzoValente 1 month ago
Wait a second... Karoly is a Dream Theater fan????? 0:31
@TwoMinutePapers 1 month ago
Just went to my first concert a couple days ago...it was a Dream (Theater) come true. Couldn't believe I finally saw them! So good!
@LorenzoValente 1 month ago
@@TwoMinutePapers Oh boy, I saw them live in Milan two weeks ago! So glad you enjoyed it too, it was indeed a dream ❤ I'm sure you also loved the crazy lasers/lights show (as a light simulation researcher by trade!)
@TheAutisticRebel 1 month ago
So jealous. Seen hundreds of concerts... not one of them... 😢
@someghosts 1 month ago
Has anyone made the joke that you’re still getting hallucinations since the page you get back on google was probably ai generated too? If not then I’d like to make that joke now.
@hotel_arcadia 1 month ago
DREAM THEATER MENTION ❗❗
@ak-gi3eu 1 month ago
It would be great if we had an option to modify the prompt after generation, instead of typing a new prompt again.
@dennisdelgado4276 1 month ago
Love how dude sprinkled ever so lightly his political stance 😅
@luckyankraj 1 month ago
So they got scared of perplexity 😂
@thelowlytrinity 1 month ago
Google is quaking.
@parthasarathyvenkatadri 1 month ago
Ask it funny questions like "why john wick is an idiot ?" 😂
@cbnewham5633 1 month ago
Perplexity AI has been doing this for ages. OpenAI is late to the party. And no mention of Perplexity in this video.
@TheAutisticRebel 1 month ago
Agreed, but it's NOT ABOUT Perplexity so WHO CARES
@cbnewham5633 1 month ago
@TheAutisticRebel I care, obviously, otherwise I would not have commented. Yes, I know it's not about Perplexity (and I just knew someone would follow up my comment to make that obvious statement). It is remiss not to mention Perplexity at all.
@mekniwassime2098 1 month ago
"Hallucinations No More" I expected better from this channel.
@DanielRut 1 month ago
hallucination and accuracy vs perplexity?
@the-0-endless376 1 month ago
AI chatbots are a dead end tech. This is a waste of money and time they could have put towards medical AI instead. Cancer predictions, not planning trips to hotels that don't exist.
@crawkn 1 month ago
If you can ask an AI for a level of confidence and get a meaningful evaluation correlated with accuracy, why wouldn't you just have the AI assess its confidence automatically, and make adjustments where appropriate? There should be a threshold of confidence below which a given statement should be rejected, or appropriate disclaimers provided within the response. Don't wait to be corrected and then disclaim.
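The thresholding idea in the comment above can be sketched in a few lines of Python. This is only an illustration, not how ChatGPT works: the `Claim` type, the `filter_claims` helper, the confidence scores, and both threshold values are hypothetical, and it assumes a calibrated per-statement confidence is available in the first place.

```python
from dataclasses import dataclass


@dataclass
class Claim:
    text: str
    confidence: float  # hypothetical self-reported score in [0, 1]


def filter_claims(claims, reject_below=0.3, hedge_below=0.7):
    """Drop very low-confidence claims and add a disclaimer to uncertain ones."""
    kept = []
    for claim in claims:
        if claim.confidence < reject_below:
            continue  # too unreliable to show at all
        if claim.confidence < hedge_below:
            kept.append(f"(unverified, confidence {claim.confidence:.0%}) {claim.text}")
        else:
            kept.append(claim.text)
    return kept


# Made-up example statements with made-up scores.
claims = [
    Claim("Paris is the capital of France.", 0.99),
    Claim("The paper was published in 2019.", 0.55),
    Claim("The author also wrote a sequel.", 0.10),
]
result = filter_claims(claims)
for line in result:
    print(line)
```

The hard part is not the filter but getting scores that actually correlate with accuracy; the sketch simply takes them as given.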
@joletun 1 month ago
This is wrong. It still hallucinates. Especially with new data. Like new API code and programming languages.
@SP-ny1fk 1 month ago
It isn't very good at telling seo spam websites apart from real websites though
@dzxtricks 1 month ago
Is there no "confidence" parameter included with each answer so we know how much we can trust it? The fact that we can ask "how confident are you?" means it could actually be exposed as information for users.
@noriniatpacgniekcah 1 month ago
This is the beta testers' contribution to the world, in an indirect way.
@KevinLSchwartz 1 month ago
Have you had a chance to try out the new o1 (not preview, but whatever is after that)? What do you think of it?
@xtc_w 1 month ago
I thought he did a video on it
@mantekarys 1 month ago
He already did, not that long ago
@Kuchiriel 1 month ago
I can only comment on other ppl's comments, why?
@goodfortunetoyou 1 month ago
I asked an AI model on a United States-Mexico-Canada free trade agreement website whether or not lumber was part of the agreement, and where that information was located in the PDF text of the agreement. I wanted the PDF to reference the underlying language. It failed to recognize that I wanted a PDF, and failed to locate any part of the agreement that addressed whether lumber was subject to the agreement or had an exemption that would allow for tariffs. I'm by no means an expert on legal texts or trade agreements, but I count that experiment as failed, using an unknown version of a public chatbot model.
@mahdoosh1907 1 month ago
×10 more heat = ×10 less water
@Yourname942 1 month ago
will the search results be skewed by sponsors paying to be at the top?
@kachowbltch3585 1 month ago
I wonder what happens to all those sites that rely on advertising for upkeep once they aren’t getting visitors anymore
@TheAutisticRebel 1 month ago
As the pie gets bigger they are less likely to be found anyway... Unless... hmmnnn 🤔 what if they aren't on the first page of Google?
@TheAutisticRebel 1 month ago
Sure, I wonder what will happen to all those newspapers too!
@V0TION 1 month ago
Coal search
@DR20005 1 month ago
Since this dropped, I stopped using Google
@vblaas246 1 month ago
Two minute papers, have you cloned your own voice recently?
@-_Nuke_- 1 month ago
Hello guys, I'm using the free version of ChatGPT for coding. But I wonder, is there a free AI dedicated to coding that I can use instead?
@TheAutisticRebel 1 month ago
There are but they all suffer the same issues to one degree or another. Coding is still a human activity... for now.
@-_Nuke_- 1 month ago
@@TheAutisticRebel I see! I mean, so far with ChatGPT I get really good results (taking into account that it's not an actual person). I wonder if there is anything that can see the entire code that you are writing in VS Code, for example. I know about Copilot, but this is not free, right?
@TheAkdzyn 1 month ago
I find myself prompting Google like an Ai more often nowadays. This will make things easier!
@gwenandersen6803 1 month ago
I don't think it's gonna be better than Google. Even though there are ads, Google gives very good results and I don't need to search too long for what I need.
@henrythegreatamerican8136 1 month ago
Just what we needed.... Another search engine.
@TheAutisticRebel 1 month ago
🎉🎉🎉 Hahaha spit out my coffee loling!!! How true is THAT! 🤣🤣🤣
@stanleesiele6028 1 month ago
Still can't break ECDLP
@NoThing-ec9km 1 month ago
Wowiee
@Billy4321able 1 month ago
0:14 Ah yes, write a Python script. I ask that all the time. 🤦
@billr3053 1 month ago
I might. If I was forced to work in Python.
@TheAutisticRebel 1 month ago
I have AUTOMATED some cool stuff using Python. "Real coders" laughed at the language... Ridiculous. It's so powerful.
@SimGunther 1 month ago
@@TheAutisticRebel I've also done plenty of automation in Python and bash. Pretty good stuff to be made with those two languages!
@CookieXD1998 1 month ago
I actually do. I am not good with Python yet and needed some simple scripts that reformat text strings
@ripplerxeon 1 month ago
IDK about the direction we are going. Soon everything will be done by an AI, which takes the fun out of searching the web. Before, if I got a code error, I would search for it, look at what other people did to solve it, and learn new things along the way. Now when I ask ChatGPT I get the answer, and the fun of solving a problem by myself is gone. It just feels wrong (I know no one is stopping me from using the old method, but as humans we are lazy and in the end we do the dumb sh*). I feel like we will be doing nothing in the future, just doomscrolling YouTube Shorts, TikToks, etc.
@dioc8699 1 month ago
Or u could go out and have fun with all that free time. Choice is urs.
@gremlinsaregold8890 1 month ago
To be honest, this is rubbish. Only yesterday, while correcting a transcription, I got a hilarious and totally hallucinated answer that had nothing at all to do with the expected output. It recited "I Wanna Hold Your Hand" by the Beatles.
@_Clivey 1 month ago
As always though, having links to source is good, but if that source itself is AI generated, then we are back to square one
@nyyotam4057 1 month ago
The problem with this is that the model can invent sources as well. The only way to minimize hallucinations is to make the model stick to the training data by setting the temperature to 0. Which is, perhaps, what they do.
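For readers unfamiliar with the temperature setting mentioned above, here is a minimal Python sketch of temperature-scaled softmax sampling probabilities. It is a textbook illustration under stated assumptions, not OpenAI's implementation: the function name `softmax_with_temperature` and the three logits are made up. It shows only that lowering the temperature concentrates probability on the top-scoring token (temperature 0 is treated as greedy decoding); whether that token is factually right is a separate question.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into sampling probabilities at a given temperature."""
    if temperature <= 0:
        # Treat temperature 0 as greedy decoding: all mass on the top score.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
for t in (1.0, 0.5, 0.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```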
@CharlieShread 1 month ago
Sider extension has been doing this for quite a long time already. Integrating various different models with internet search, including Claude Sonnet 3.5, GPT 4o and others. It's epic.
@AJRadaza 1 month ago
At this point, this channel can train its own AI to make new videos
@asifrezai2195 1 month ago
2nd comment
@MikkoRantalainen 1 month ago
I wish every AI emitted probability for the answer to be correct even without me specifically prompting for it.
@miked7373 1 month ago
SOOOO BAD! I've been testing it out for a few days. It will copy and paste the exact same response several times in a row. I even tried giving it false information and it almost always agreed with me! Even after I provided the correct information afterwards, it would still go with its first incorrect answer. SOOOOO BAD! 😂😂😂
@TheAutisticRebel 1 month ago
Same experience. So don't use it for that, then. It's a powerful tool.
@Topnichemarket 1 month ago
This new feature in ChatGPT is incredible! Not just regular search, but interactive answers that dive deeper, cite sources, and adapt to complex questions. This could change everything! Excited to see how this reduces those pesky "hallucinations." What a time for AI! Thanks, OpenAI!
@SireBab 1 month ago
Honestly man, I'm really tired of these AI videos. LLMs are a neat toy but it feels like it's all you put videos out on anymore. If this keeps up you're losing a sub. I also can't shake the feeling that you have a strong bias towards OpenAI that feels unfair to us viewers. Please do better.
@TheAutisticRebel 1 month ago
Go then! What's your problem? Seriously GO!
@SireBab 1 month ago
@TheAutisticRebel wow, it's almost like I don't "want" the channel to turn into AI-only stuff, like I said in my comment?
@NostraDavid2 1 month ago
The OpenAI bias is real - he could've covered Claude controlling a PC, which is a pretty decent breakthrough. As big as the voice mode of OpenAI, IMO. I hope he'll do better in that regard.
@catube6915 1 month ago
Nobody cares about "A.I."
@mattmmilli8287 1 month ago
Maybe your industry isn't being touched yet, but yes, yes we do. If you are not on top of this, you're gonna get left behind real soon 😮
@skyless_moon 1 month ago
@@mattmmilli8287 r/woosh!
@billr3053 1 month ago
@@skyless_moon Doubt it. To be fair to the OP, this channel was better tackling 3D rendering techniques. Physics simulations, human kinetics. ChatGPT is flavor of the day and therefore ensures more clicks… so…