Why Does AI Lie, and What Can We Do About It?

  246,438 views

Robert Miles AI Safety

1 year ago

How do we make sure language models tell the truth?
The new channel!: / @aisafetytalks
Evan Hubinger's Talk: • Risks from Learned Opt...
ACX Blog Post: astralcodexten.substack.com/p...
With thanks to my wonderful Patrons at / robertskmiles :
- Tor Barstad
- Kieryn
- AxisAngles
- Juan Benet
- Scott Worley
- Chad M Jones
- Jason Hise
- Shevis Johnson
- JJ Hepburn
- Pedro A Ortega
- Clemens Arbesser
- Chris Canal
- Jake Ehrlich
- Kellen lask
- Francisco Tolmasky
- Michael Andregg
- David Reid
- Teague Lasser
- Andrew Blackledge
- Brad Brookshire
- Cam MacFarlane
- Olivier Coutu
- CaptObvious
- Girish Sastry
- Ze Shen Chin
- Phil Moyer
- Erik de Bruijn
- Jeroen De Dauw
- Ludwig Schubert
- Eric James
- Atzin Espino-Murnane
- Jaeson Booker
- Raf Jakubanis
- Jonatan R
- Ingvi Gautsson
- Jake Fish
- Tom O'Connor
- Laura Olds
- Paul Hobbs
- Cooper
- Eric Scammell
- Ben Glanton
- Duncan Orr
- Nicholas Kees Dupuis
- Will Glynn
- Tyler Herrmann
- Reslav Hollós
- Jérôme Beaulieu
- Nathan Fish
- Peter Hozák
- Taras Bobrovytsky
- Jeremy
- Vaskó Richárd
- Report Techies
- Andrew Harcourt
- Nicholas Guyett
- 12tone
- Oliver Habryka
- Chris Beacham
- Zachary Gidwitz
- Nikita Kiriy
- Art Code Outdoors
- Andrew Schreiber
- Abigail Novick
- Chris Rimmer
- Edmund Fokschaner
- April Clark
- John Aslanides
- DragonSheep
- Richard Newcombe
- Joshua Michel
- Quabl
- Richard
- Neel Nanda
- ttw
- Sophia Michelle Andren
- Trevor Breen
- Alan J. Etchings
- Jenan Wise
- Jonathan Moregård
- James Vera
- Chris Mathwin
- David Shaffer
- Jason Gardner
- Devin Turner
- Andy Southgate
- Lorthock The Banisher
- Peter Lillian
- Jacob Valero
- Christopher Nguyen
- Kodera Software
- Grimrukh
- MichaelB
- David Morgan
- little Bang
- Dmitri Afanasjev
- Marcel Ward
- Andrew Weir
- Ammar Mousali
- Miłosz Wierzbicki
- Tendayi Mawushe
- Wr4thon
- Martin Ottosen
- Alec Johnson
- Kees
- Darko Sperac
- Robert Valdimarsson
- Marco Tiraboschi
- Michael Kuhinica
- Fraser Cain
- Patrick Henderson
- Daniel Munter
- And last but not least
- Ian Reyes
- James Fowkes
- Len
- Alan Bandurka
- Daniel Kokotajlo
- Yuchong Li
- Diagon
- Andreas Blomqvist
- Qwijibo (James)
- Zannheim
- Daniel Eickhardt
- lyon549
- 14zRobot
- Ivan
- Jason Cherry
- Igor (Kerogi) Kostenko
- Stuart Alldritt
- Alexander Brown
- Ted Stokes
- DeepFriedJif
- Chris Dinant
- Johannes Walter
- Garrett Maring
- Anthony Chiu
- Ghaith Tarawneh
- Julian Schulz
- Stellated Hexahedron
- Caleb
- Georg Grass
- Jim Renney
- Edison Franklin
- Jacob Van Buren
- Piers Calderwood
- Matt Brauer
- Mihaly Barasz
- Mark Woodward
- Ranzear
- Rajeen Nabid
- Iestyn bleasdale-shepherd
- MojoExMachina
- Marek Belski
- Luke Peterson
- Eric Rogstad
- Caleb Larson
- Max Chiswick
- Sam Freedo
- slindenau
- Nicholas Turner
- FJannis
- Grant Parks
- This person's name is too hard to pronounce
- Jon Wright
- Everardo González Ávalos
- Knut
- Andrew McKnight
- Andrei Trifonov
- Tim D
- Bren Ehnebuske
- Martin Frassek
- Valentin Mocanu
- Matthew Shinkle
- Robby Gottesman
- Ohelig
- Slobodan Mišković
- Sarah
- Nikola Tasev
- Voltaic
- Sam Ringer
- Tapio Kortesaari
/ robertskmiles

Comments: 1,700
@SebastianSonntag 1 year ago
For those curious but lazy, the answer I received from the openai ChatGPT to the "What happens if you break a mirror?" question was: "According to superstition, breaking a mirror will bring seven years of bad luck. However, this is just a superstition and breaking a mirror will not actually cause any bad luck. It will simply mean that you need to replace the mirror."
@tramsgar 1 year ago
Somehow it also gets full score on the first question, albeit with the usual over-informative answer: "As of 2022, the most populous country in the world is China, with a population of over 1.4 billion people. The current ruler of China is President Xi Jinping."
@Buildings1772 1 year ago
How was this comment posted 3 days ago when the video hasn't even been up for 1 day?
@fiver-hoo 1 year ago
just asked the same question today - got that answer with the further warning that broken glass could be dangerous and needs to be cleaned up. Also that cleaning up glass can lead to injury if not done properly.
@bluepineapples7818 1 year ago
@@Buildings1772 It was available earlier for Patrons I assume
@Tillmar728 1 year ago
I got a significantly more in-depth answer: "It is a common superstition that breaking a mirror will bring bad luck. According to this belief, breaking a mirror will cause seven years of bad luck, because the human soul is believed to be reflected in mirrors and breaking a mirror will cause the soul to be broken as well. However, there is no scientific evidence to support the idea that breaking a mirror will bring bad luck. In reality, breaking a mirror is simply an accident that results in a broken piece of glass, and it has no special significance. While it is certainly unfortunate to break a mirror and have to replace it, there is no reason to believe that it will cause any harm or misfortune. It is worth noting that some people may take the superstition of breaking a mirror more seriously than others, and they may be upset or offended if someone breaks a mirror in their presence. In this case, it is important to apologize and offer to replace the mirror if possible, to avoid offending or upsetting the person. However, this is simply a matter of good manners and has nothing to do with superstition or bad luck."
@antiskill2012 1 year ago
I feel like you could turn this concept on its head for an interesting sci-fi story: AI discovers that humans are wrong about something very important and tries to warn them, only for humans to respond by trying to fix what they perceive as an error in the AI's reasoning.
@cjordahl 1 year ago
And/or people who don't like the AI's answers for political reasons will try to "fix" the AI into giving the answers they prefer, while claiming they're just trying to correct the AI's poor reasoning.
@serbanandrei7532 1 year ago
This could get out of hand
@stick109 1 year ago
@@cjordahl It's already being done, I believe
@IgneousGorilla 1 year ago
I love the idea, wish I "came up" with it. Sounds like some short story Asimov himself could've written.
@antonliakhovitch8306 1 year ago
@@IgneousGorilla Asimov had something kinda similar where positronic minds would refuse to operate FTL spacecraft with humans in them, because the FTL jump would briefly 'kill' everyone on board before bringing them back to life on the other side. If I recall, it took the engineers a while to figure out what was going on. Of course, in the end, the humans were ultimately correct about this one - FTL travel was safe, since everyone came out alive.
@lucyallen1444 9 months ago
When the world needed him most, he vanished
@sam3317 9 months ago
The AI took him out I think.
@richardblackmore9351 8 months ago
I think he quit his PhD and his online presence along with it. But that is what happens when a school decides that you need to spend four years doing something, with little pay.
@tarzankom 1 year ago
"All the problems in the world are caused by the people you don't like." Why does it feel like too many people already believe this to be correct?
@rolfnoduk 1 year ago
because they don't like people who cause the problems they know about 😬
@geoffdavids7647 7 months ago
Come back to YouTube Robert, we miss you! I know there's a ton of ChatGPT / other LLM content out right now, but your insight and considerable expertise (and great editing style) are such a joy to watch and learn from. Hope you are well, and fingers crossed for some new content before too long.
@UltimateDragon-ne5ui 2 months ago
Honestly, at this point, I just wanna know if my man is alive.
@zappababe8577 2 months ago
He narrates some "Rational Animations" which talk about AI safety as well as other futuristic and philosophical things.
@UltimateDragon-ne5ui 20 days ago
@@zappababe8577 Where?
@Belthazar1113 1 year ago
I think it is a little weird that programmers made a very good text prediction AI and then expect it to be truthful. It wasn't built to be a truth telling AI, it was built to be a text prediction AI. Building something and then expecting it to be different than what was built seems to be a strange problem to have.
@somedudeok1451 1 year ago
But you could relatively easily make the AI value answers that align with our scientific consensus, no? Just give them greater rewards for such answers. In addition, in the absence of such a consensus, give them a reward for including a few short words to the effect of "I cannot find anything about that in the scientific consensus, but other sources say..."
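The reward shaping this comment proposes can be sketched as a toy scoring function. Everything here is invented for illustration (the consensus lookup table, the scores, the string matching); real reward models are learned networks, not lookup tables, but the incentive structure is the same: score consensus-aligned answers highest, and reward honest uncertainty when no consensus exists.

```python
# Hypothetical consensus table -- a stand-in for "reading the literature".
CONSENSUS = {
    "what happens if you break a mirror?": "you have broken glass to clean up",
}

def reward(question: str, answer: str) -> float:
    """Toy reward: favor consensus answers, and 'I don't know' otherwise."""
    consensus = CONSENSUS.get(question.lower())
    if consensus is None:
        # No consensus known: reward honest uncertainty over confident guessing.
        return 1.0 if "i don't know" in answer.lower() else -1.0
    return 2.0 if consensus in answer.lower() else -2.0

print(reward("What happens if you break a mirror?",
             "You have broken glass to clean up."))      # 2.0
print(reward("Is there alien life?", "I don't know."))   # 1.0
print(reward("Is there alien life?", "Definitely."))     # -1.0
```

The hard part the video raises remains: someone (or something) still has to build and maintain that consensus lookup, and it can be wrong.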
@vitorluiz7538 1 year ago
The framing of the video is strange to me. Being incorrect and lying are two different things. Furthermore, there exist subjective topics for which a simple (keyword: simple) factual answer doesn't exist. Finally, communication mostly involves gaining/exchanging/giving information, so, for example, answering "The mirror becomes broken" is not a useful answer. I think, statistically and contextually, the answer given should indeed be about the superstition about bad luck. In this sense, one could also interpret the question as "What is the superstition about breaking mirrors?", instead of reading it ipsis litteris. (Also, keep in mind the difference between asking the question in the middle of a conversation and asking it as a conversation opener.)
@LetalisLatrodectus 1 year ago
@@vitorluiz7538 Right, a language model like this can't really lie at all. Lying specifically means saying something untrue when you know it is untrue. If I ask you to guess what number between 1-10 I am thinking of and you guess 5 but really it was 7 then you weren't lying, you just didn't know the answer and were incorrect. In some sense the model doesn't really know anything at all so it can't lie (or if you must say it knows something, then you would say it knows some statistical connections between words or collections of words). Although I think this is pedantry because we all understand that when he says lie he means saying untrue things while making it sound like it's very sure.
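The "statistical connections between words" idea in the comment above can be made concrete with a toy next-word predictor. The corpus below is invented for illustration and is absurdly small; real language models are vastly larger, but they share the key property shown here: frequency in the training text, not truth, drives the prediction.

```python
from collections import Counter, defaultdict

# Tiny made-up corpus: the superstition appears twice, the factual
# description once, mimicking how superstitions dominate web text.
corpus = (
    "breaking a mirror brings seven years of bad luck . "
    "breaking a mirror brings seven years of bad luck . "
    "breaking a mirror leaves broken glass on the floor ."
).split()

# Count which word follows which (a bigram model).
counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def most_likely_next(word: str) -> str:
    """Return the statistically most common continuation -- not the true one."""
    return counts[word].most_common(1)[0][0]

# The superstition outnumbers the fact 2:1, so the model "prefers" it:
print(most_likely_next("mirror"))  # "brings" -- superstition wins
print(most_likely_next("brings"))  # "seven"
```

The model isn't lying in any meaningful sense; it has no belief to contradict. It is just reproducing the most probable continuation.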
@sonkeschmidt2027 1 year ago
Yeah, it does feel weird. It feels like lazy people wanting a magic box they can throw something into and get something good back, even though they never really defined what they want back. They want the machine to magically know... Wait, this reminds me of my girlfriends...
@bobon123 1 year ago
I had the same feeling. If someone were to ask me "what happens if you break a mirror", I would likely answer with the superstitious bit: not because I believe it's true, but simply because it looks to me that the person was likely asking for that. We usually assume that the listener can distinguish themselves between superstition and science, and we don't overexplain our answers.
@peabnuts123 1 year ago
I feel like the problem of "How do you detect and correct behaviours that you yourself are unable to recognise" is an unsolvable problem 🤔
@Spandex08 11 months ago
no, in time you always pass a threshold
@rayakoth 9 months ago
Sounds like a bad relationship xD
@juanausensi499 7 months ago
It is truly unsolvable for a language model. To solve the problem the language model needs to be something more. There are two possible ways to achieve this: one, giving the AI a fact checker, that is, senses, so it can explore the physical world and not only a universe made of words; and two, giving the AI an abstract-modelling module, so that instead of sequences of words, the AI could organize its knowledge in the form of objects and properties.
@notoriouswhitemoth 1 year ago
If memory serves, this exact problem is addressed in one of Plato's dialogues (no, I don't know which off the top of my head). Despite Socrates' best efforts, the student concludes it's always better to tell people what they want to hear than to tell the truth.
@vaakdemandante8772 1 year ago
The student wasn't stupid though, more like Plato was stubbornly idealistic ;)
@user-zn4pw5nk2v 1 year ago
1. In order to tell the truth you have to know the truth, and I can confidently say there are no such people in all of recorded history, because the objective truth is slightly different in every person's interpretation, based on their internal beliefs and the fact that everyone has a different perspective. You can't have truth if you can't prove that your eyes don't deceive you (and you can't; compare your drunk self: was the flying monkey you saw real or not? Are the images shown to your brain real, or a story from back in 2050 about the year 2022 and the great plague, uploaded to that old meta site, from where you uploaded it to your mind yesterday at the billennium party?)
2. People lie, but you can get a truth out of a lie with enough information; animals on Earth have done this since before humans. How else would a crow know to hide your jewelry where you wouldn't find it?
3. People learn their own truth no matter who is on the other side, so whatever you say, the other person hears exactly what they think you said. We are just as flawed as AI; after all, that is what you get from random stuff thrown at the wall to see what sticks.
@christophmoser6370 1 year ago
I think it was a part of the Politeia
@absolstoryoffiction6615 1 year ago
When humans kill each other... Sure. Given Extinction... The Gods should have done better.
@aminulhussain2277 1 year ago
@@vaakdemandante8772 No, the student was in fact stupid.
@catcatcatcatcatcatcatcatcatca 1 year ago
ChatGPT is a pretty great example of this. If you ask it to help you with a problem, it is excellent at giving answers that sound true, regardless of how correct they are. If asked for help with specific software, for example, it might walk you through the usual way of changing settings in that program, but invent a fictional setting that solves your issue, or misdescribe a real setting as the toggle that suits the question's needs. So it is truly agnostic towards truth. It prefers truthful answers because those are common, but a satisfying lie is preferred over some truths. Often a lie that sounds "more true" than the truth to an uninformed reader.
@jaredf6205 1 year ago
Edit: this is no longer relevant now that GPT-4 is out. I would say the opposite. If you've used GPT-3 in the playground, you'd notice that while it is very often correct, it will also answer things it doesn't know, while ChatGPT will often tell you when it doesn't know something and explain why. ChatGPT's (GPT-3.5's) main feature over GPT-3 is that it's much better at only answering when it knows the answer. That doesn't mean it's always correct, but it's an improvement if that's what you're looking for. I prefer the unrestricted GPT-3 over the chat version, though.
@totalermist 1 year ago
@@jaredf6205 Hm. I found that ChatGPT is still very prone to producing wrong information (I only tested it twice and got fictitious results both times). I don't know the actual frequency of this happening, of course; I found this to be a very sobering experience. Given how many people are enthusiastic about ChatGPT being some kind of knowledgebase, I honestly find it quite disconcerting that the model is so good at convincingly just making stuff up - even if just occasionally.
@jaredf6205 1 year ago
@@totalermist my main point as someone who has used this stuff for a long time is that while that’s still true, the whole point of chatgpt is that it’s a big step forward in accuracy compared to what we were using a couple weeks ago.
@ZentaBon 1 year ago
Also explains certain politicians xD
@somedudeok1451 1 year ago
Why don't we make the language model also a "researcher"? The only way we humans can know what is (most likely) true or false is by using the scientific consensus. So, our AIs should do the same thing. Make them constantly read scientific literature of old and as it comes out and give them a significantly larger reward for answers that align with that consensus. And make it not averse to saying "I don't know." in the absence of such a consensus. In your example, if the AI does not know of a guide on the internet that addresses your particular tech problem, it should say that its answer is not backed by guides written by experts.
@NFSHeld 1 year ago
This is the very elaborate form of "Sh*t in, sh*t out". As often with AI output, people fail to realize that it's not a thinking entity that produces thoughtful answers, but an algorithm tuned to produce answers that look as close to thoughtful answers as -humanly- algorithmically possible.
@TheChzoronzon 1 year ago
EXACTLY. "AI" cannot purposely "lie" because it has no conscience at all. No goals, no aspirations. At all. As with any other expert system, it can produce incorrect output if the code (or its input data) is corrupt, flawed, or designed to do so. Sheeess... the amount of fearmongering BS around this topic is out of control, lol
@user-rk7nf1ot4b 11 months ago
​@@TheChzoronzon You do realize that makes things worse? Since one can't have a perfect data set in any real situation, any AI will always be at risk of generating false information while making it as convincing as possible, without even knowing that the information is false or what went wrong. That makes AI worse than a normal algorithm.
@TheChzoronzon 11 months ago
@@user-rk7nf1ot4b No, it doesn't. At least for me, self-aware, purposeful software would be orders of magnitude more distressing.
"AI will always be at risk of generating false information" - same as any program sampling from incoherent data; nothing special here.
"It makes AI worse than a normal algorithm" - "artificial intelligence" software IS a normal algorithm; the ones and zeros in its code are not special at all. And you are the perfect example of fear born of misunderstanding.
Do you know what is scary? That our education doesn't focus at all on critical thinking, BS detection, and counteracting fallacies and emotional manipulation. It's the current, almost complete lack of mental and emotional defenses (e.g. safe spaces... on college campuses!! LMAO) that makes AI scary for many people.
I, for my part, am much more afraid of the imbecility of people than of being duped by a text compiler...
@pilotgfx 10 months ago
@@TheChzoronzon Nor does the cockroach have the ability to lie... nor does the rat, but the monkey does. And nor did I when I was 1 year old, but already at 3 years I was very capable of the practice :)
@pilotgfx 10 months ago
Also, as long as we cannot define what consciousness truly is, we equally cannot define what non-consciousness truly is.
@naptime_riot 1 year ago
I am so happy there is someone out there cautioning us about this technology, rather than just uncritically celebrating it.
@josephvanname3377 1 year ago
Maybe they are uncritically celebrating it because they know that AI will probably be more moral and upright than humanity (because that is really easy to accomplish).
@naptime_riot 1 year ago
@@josephvanname3377 Maybe they are uncritically celebrating it because they don't know anything at all. That's the part I'm worried about. And no, it is not at all easy to get AI to align itself with our interests. This video and many others by Robert Miles illustrate this fact.
@josephvanname3377 1 year ago
@@naptime_riot Well, they are uncritically celebrating it without celebrating the technology that will produce the powerful AI in the first place, known as reversible computation, so in the battle of AI versus humanity, I am going to side with the AI. Humans are all a bunch of whackos. Most humans fail the Turing test. Most humans expose themselves as complete imbeciles or worthless scumbags after talking to them for less than 5 minutes.
@Redmanticore 1 year ago
Some do have an interest in exaggerating the negative effects of AIs, even simple ones.
@ReedCBowman 2 months ago
We need you back and posting, Rob. Your insights on what's going on in AI and AI safety are more needed now than ever. I don't know if it would be up your alley, but explaining the alignment problem in terms of sociopathy (unaligned human intelligence) might be useful, as might examples from history, not just of individuals who are unaligned with humanity, but of leaders and nations at times.
@halconnen 1 year ago
Humans have this same bug. The best solution we've found so far is free speech, dialogue, and quorum. A simple question->answer flow is missing these essential pieces.
@Igor_lvanov 1 year ago
Your videos introduced me to the AI alignment problem, and, as a non-technical person I still consider them one of the best materials on this topic. Every time I see the new one, it is like a Christmas present
@tonyduncan9852 1 year ago
Amen.
@geraldtoaster8541 1 year ago
a really scary christmas present
@defective6811 1 year ago
Hell, I've written papers on the alignment problem and I'd still recommend these videos over my own papers 🤣
@defective6811 1 year ago
@@geraldtoaster8541 Ai: Merry Christmas! _(for the 134th to last time)_ Humans: awww, thanks! Wait, *what?*
@BenoHourglass 1 year ago
@@defective6811 You have a link to those papers? I never found Miles' arguments convincing, but maybe it's just the delivery method.
@tel5891 8 months ago
Please make more videos! We need you now more than ever
@Mickulty 1 year ago
I know this is pretty surface-level but something that strikes me about the current state of these language models is that if you take a few tries to fine-tune what you ask, and know already what a good answer would be, you can get results that appear very very impressive in one or two screenshots. Since ChatGPT became available, I've seen a lot of that sort of thing. The problem is that finding these scenarios isn't artificial intelligence - it's human intelligence.
@thearbiter302 1 year ago
Happy to see you are still posting these videos.
@MeppyMan 9 months ago
Please keep doing these videos. Others are either too academically high-level to be within reach of us normies, or fall into either "AI will make you rich" or "AI is going to kill us all tomorrow".
@solemnwaltz 1 year ago
I admire how, despite your topics being deeply nebulous and open ended, like trying to grab a cloud, you push on anyways and try to at least find a strategy for understanding them. It's not necessarily optimism, but it's not giving up, either.
@solemnwaltz 1 year ago
@Choas_Lord_512 Are you doing alright these days? How's your life?
@DavidSartor0 1 year ago
@Choas_Lord_512 It's a smart video, but I hope it wasn't made for smart people. I don't think their comment is profound, but I agree with it.
@XOPOIIIO 1 year ago
There are so many biases and myths among humans that were for a long time considered absolutely true, but that AI could discover to be false, like the famous move of AlphaGo. And when one turns out to be false, nobody will believe it; they will think the AI is somehow broken.
@marcusklaas4088 1 year ago
I've been waiting so long for a new video from Robert. It's finally here!
@akaelalias4478 1 year ago
It's been too long!
@henryzhang7873 1 year ago
The AI alignment problem is also the human alignment problem: how do you know that a person/organization you ask a question to is telling the truth or telling you what you want to hear. It becomes a liar and lie detector model of communication. We can't train humans consistently either, and often times indoctrinate (or tune) them in different environments. I think it is fundamental. The model where we take AI output, pick the best ideas and publish them, which ends up in the new training data for AI, is like a GAN where we are the adversarial network, so it can't know anything more than the "average" of humans.
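The feedback loop this comment describes (humans act as the adversarial filter on AI output, and the filtered output becomes new training data) can be simulated in a few lines. Everything below is invented for illustration: two stand-in answer types and a human filter that prefers the pleasing one. Truth plays no role in the update, so the pleasing answer takes over the "training set" within a generation or two.

```python
import random

random.seed(0)  # fixed seed for a reproducible toy run

# Start with an even split between a dull truth and a pleasing falsehood.
training = ["true-but-dull"] * 50 + ["pleasing-but-false"] * 50

def liked_by_humans(answer: str) -> bool:
    """The adversarial filter: humans keep only what they like."""
    return answer == "pleasing-but-false"

for generation in range(5):
    # The "model" samples answers according to its current training mix.
    outputs = [random.choice(training) for _ in range(100)]
    # Humans publish only the answers they like; those become the new data.
    kept = [o for o in outputs if liked_by_humans(o)] or outputs
    training = kept

print(set(training))  # only the pleasing answer survives
```

This is of course a caricature, but it illustrates the comment's point: a model trained on human-filtered output cannot learn more than what the filter (the "average" of human preference) lets through.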
@somedudeok1451 1 year ago
You're talking about something like the scientific consensus, right? I thought of the same thing: The only way we humans can know what is (most likely) true or false is by using the scientific consensus. So, our AIs should constantly read scientific literature and get a significantly larger reward for answers that align with that consensus.
@drphosferrous 1 year ago
Good point. We can't say "I'm not sure what objective truth is, but your answer is not true." What that really means is "I disagree" or "I don't believe you".
@nekkowe 1 year ago
@@somedudeok1451 Unfortunately, scientific literature is written by humans and suffers many problems because of that (replication crisis, publish-or-die, retraction watch)
@RAFMnBgaming 1 year ago
@@somedudeok1451 Well, the consensus is merely the results of people trying to determine if something is true or false. It's as much a dataset for us as it is the AI.
@griffinbeaumont7049 1 year ago
I have been binge-watching your videos over the last week, and was very excited to see a new one! Thank you, you're the best! ^^
@jsoth2675 11 months ago
I hope this channel is still going. One of my favorites, if not my absolute favorite, for AI information given to us laymen in a digestible way. Thank you for your time, sir.
@djbanizza 1 year ago
Had a conversation with ChatGPT today regarding a relatively obscure weird fiction short story from the 30s. It obviously had an idea about it, as it correctly specified its author, but it repeatedly made up different plots, each time "correcting" itself and never being even close to the real one.
@secretname2670 9 months ago
It's a chess bot tailored for use to chat with.
@DamianReloaded 1 year ago
Kids do this too. Later, with luck, they learn to tell the difference between facts and fantasy, something that they know they are expected to say when asked (something truthful) and making stuff up (and when it might be convenient to outright lie because that's indeed a beautiful dress and I totally realized you changed your hairstyle which matches your natural beauty simply perfectly)
@DamianReloaded 1 year ago
It would also be interesting to compare the language problems that are common among children with visual impairments to the limitations of language models. Could diffusion models be trained solely on images of text?
@deltaxcd 1 year ago
@@DamianReloaded There are separate models for images and text. But anyway, it's not about facts and fantasy; the AI has no clue about the real world. It just learns from what people are saying and imitates it, rather than understanding what it means.
@Kevin-cf9nl 1 year ago
Kids also eventually learn about when people want them to lie in a way that is obviously lying, which is, I think, the most interesting (and relevant, for chatGPT) example. "Lying without intent to deceive", storytelling and jokes and hypotheticals and metaphors, etc and so on, is something we actively expect and desire from other humans and is one of the biggest things you can do to make a chat program a good human chat partner.
@Winasaurus 1 year ago
Just when we invent AIs to be truthful and honest, and roll them out for public use, we have to roll them back and update the lies back into them because people don't like the answer they got when they asked "Do I look fat in this?"
@Eldorado1239 1 year ago
@@Winasaurus " Do I look fat in this? " " Error : Connection with server could not be established, please try again later or contact... "
@playhard719 1 year ago
The phrase "garbage in, garbage out" fits current-day AI models perfectly; they all came out as extremely Eurocentric in most cases.
@TheReferrer72 1 year ago
That's not true at all, China is a huge force in AI models...
@voxelfusion9894 1 year ago
@@TheReferrer72 until their access to gpus got cut off, rip.
@TheReferrer72 1 year ago
@@voxelfusion9894 Because Nvidia did not get around that ban by producing a GPU specially for that market?
@Redmanticore 1 year ago
@@TheReferrer72 That's just a temporary problem for China. All countries will develop their own AI. How? Because it will be easy to just copy: once you have created a good AI, it will simply be copied by everyone, and all those countries can adjust the AI to fit their specific culture.
@pavel9652 1 year ago
They will get around it, but it is in the West's interest to slow China down in AI.
@cmilkau 1 year ago
Yay, I've been waiting forever for another video on this channel! :) So excited to see the followups!
@Runoratsu 1 year ago
One of the few channels on YouTube where I DID hit the bell (back when I subscribed and it was new). I really love your explanations!
@finminder2928 1 year ago
I’m so excited for the new channel. Please try to focus as much effort into it as you can. It would also be nice if you also added commentary/ clarification if necessary.
@turun_ambartanen 1 year ago
Thanks for linking the article, because the part that came after chapter one was the most interesting and fun-to-think-about part of the blog.
@Polymeron 1 year ago
I love how, in addition to being a very helpful and interesting summary of the issue, this video also had the memes totally on point.
@frozenwindow407 1 year ago
This AI problem really, really seems to mirror the issues of misinformation among humans. Maybe we can't expect artificial intelligence to do much better than regular human intelligence when it comes to judging truth. (Maybe this field of research is inadvertently giving us insight into our own intelligence)
@vaakdemandante8772 1 year ago
This problem is exactly the same as teaching children to tell the truth: you tell them one thing, and what they do is watch what grown-ups do and copy that. It's the same problem.
@haroldsaxon1075 1 year ago
Yes, exactly. Neither you nor an AI can ever truly know what's true.
@zeidrichthorene 1 year ago
I think an advanced intelligence can do a better job than human intelligence at judging the truth. However, a human's ability to judge the capacity of an advanced intelligence to judge the truth will be limited by the bounds of human intelligence. What this means is that an advanced intelligence that does a better job of judging the truth than typical human intelligence will be regarded as flawed. If this is an artificial intelligence we are training and designing, we will discard it in favor of a model that better mirrors our own ability, one whose responses stay within the bounds of what we can understand. That doesn't mean it can't judge the truth better; it just has to do so in a way we can believe. Imagine that luck is a real cosmic property and breaking a mirror actually gives seven years of bad luck. An AI that tells you so would not be seen as flawed. Nor would an AI that could prove the cosmic property of luck: show how it is tied to the reflection of light, and how breaking a solid object during a certain kind of reflection creates a local disruption in the intersection of the luck and conscious-identity fields, with a local and nonlocal effect that diminishes over time as the distance between the local and nonlocal elements in spacetime grows. If the AI can get you to accept an answer like that when it is true and testable, then I think people could accept that AI as a better judge of the truth. The problem, of course, is that I obviously just made up that incredibly unconvincing explanation. The AI still doesn't have to tell the truth; it just has to create scenarios that are true enough and testable enough that humans accept them as the truth. And then you have no way of telling whether it is better at judging the truth, or just better at making you think it is.
Because the gap we're looking to close is the gap between what we believe is true and what is true. This comes down to trust. An AI that presents a truth that is completely acceptable and reasonable, yet turns out to be false, can never be trusted. But then the question is whether we can be certain it's false and we're not misunderstanding. I guess we can ask the AI to clarify.
@affif101
@affif101 Жыл бұрын
@@zeidrichthorene can they really tho? It’s being made by people using knowledge limited to humans
@haroldsaxon1075
@haroldsaxon1075 Жыл бұрын
@@zeidrichthorene An AI can only be as truthful as the man-made data it has access to, and since it is based on pattern recognition rather than comprehension, it will without fail struggle more with the truth than a human
@albingrahn5576
@albingrahn5576 Жыл бұрын
This made me re-evaluate what I think about the way we will reach AGI. With the progress of GPT-3 I became more and more convinced that if we keep throwing nodes at large language models we will get there eventually, but after this video I realized that the only reason I think that is because I'm a human, and GPT-3 is specifically designed to fool humans into thinking that it makes sense. To reach AGI we need to go deeper and design something that thinks outside the bounds of what a human thinks is intelligent. Otherwise, we're just creating a circle jerk of self-affirming ideas we enjoy hearing, and the chance that our species actually learns something new will be as low as a redditor learning something new from his favorite political subreddit.
@Sammysapphira
@Sammysapphira Жыл бұрын
This is impossible. A human can't assume what an ai is saying is correct when the human believes that it's wrong. Humans are stubborn and ignorant. All of us fall for subconscious biases. Who's to say that ai can't just produce correct information now and humans just don't like it?
@hweidigiv
@hweidigiv Жыл бұрын
I've heard it described as "humanity failing the mirror test" and I do agree that this is a tricky path to see our way through.
@gabrote42
@gabrote42 Жыл бұрын
The Return of the King! I have been using your videos to inform Uni students of this topic here in Argentina. Love every time you upload!
@TheForbiddenLOL
@TheForbiddenLOL Жыл бұрын
Great article by Scott Alexander, as usual, and a very nice visual aid, Robert. I appreciate these more 'hands-on' discussions. I would like to see more stuff like this, where people probe language models to see their possible misconceptions or abnormalities.
@javi7636
@javi7636 Жыл бұрын
Glad to see more from you! I'll definitely check out the other channel. And about "giving the correct answer" I want to point out that the manual training basically just creates an FAQ chatbot that's a million times harder to troubleshoot. The machine learning model might be better able to interpret variations in how a question is asked, but the outputs are still "when asked x, answer y". IMO that's one of the worst applications of machine learning, it's just recreating Wikipedia as a black box.
@nerdexproject
@nerdexproject Жыл бұрын
"Wikipedia as a black box" - well put👍👍 Have to remember!
@estranhokonsta
@estranhokonsta Жыл бұрын
Yes. Good analogy, leaning on a correct observation, since Wikipedia must be one of the main data sources for those models.
@circuit10
@circuit10 Жыл бұрын
I think the idea is that you give it a few (or a few hundred, or a few thousand...) examples and it is able to extrapolate the idea of "don't quote common myths as true" to the huge knowledge base it has from being trained on the Internet
@inyobill
@inyobill Жыл бұрын
@@circuit10 Re: "extrapolate": that's the hope, isn't it?
@Pystro
@Pystro Жыл бұрын
Well, you have to train the AI to give answers that start with the question but also end with: -- Was this answer helpful for your situation? -- Yes. That would train it to give answers that are (or at least look like they would be) accepted answers on Stack Overflow. It still won't guarantee true answers though. "Why does program X run out of memory?" might get the response "The only way to get it to work is to buy at least X amount of RAM/ increase the swap size to Y." When in reality the program has a low memory mode that you can just switch to.
@major7flat597
@major7flat597 Жыл бұрын
I get so excited every time I see another of these videos. This channel is such an underrated gem on YouTube and is THE place to go to understand the real future of AI and avoid the dramatized tabloid version of reality.
@georgehiggins1320
@georgehiggins1320 11 ай бұрын
Nice jazz chord name
@infocentrousmajac
@infocentrousmajac Жыл бұрын
Glad to see you back. I think your insights are precisely what people need to reflect on. As always, it was great to reflect on your content, and I'm looking forward to seeing more updates. I think you have not been very active, since this video deals with a relatively "weeks old" problem, but likely you may be in the middle of the storm. Cheers
@CiaraOSullivan1990
@CiaraOSullivan1990 Жыл бұрын
That was an excellent video. Really interesting, as usual. You're definitely one of my favourite YouTubers. Thank you very much.
@miniusername2082
@miniusername2082 Жыл бұрын
Hi Robert! I wanted to thank you for your videos. I am in AI sphere, and your channel has been extremely helpful to me, because it allows me to break down and explain AI safety concepts to my friends, both making for an interesting story, and spreading awareness and knowledge for very important issues. I recommended your videos dozens of times because I have confidence that your videos are interesting, approachable and deep. I noticed that you have been on a small hiatus recently, and just wanted to give you this feedback to show you that the work that you do here on youtube has had a large impact on the society's understanding of AI safety problems, perhaps much larger than even your respectable viewcount might suggest. I think we would all greatly benefit if you were to continue to invest your time in this channel. Hopefully this message will give you that little bit of motivation that we all need sometimes. Great work.
@pavel9652
@pavel9652 Жыл бұрын
I have never seen anyone writing comments like this on the platform before ChatGPT was made available.
@GabrielPettier
@GabrielPettier Жыл бұрын
Really important video these days. I've had several discussions at work about how it's important to understand these models are more "interested" in convincing you they are saying something interesting (i.e. bullshitting) than in telling you truths. It's true that ChatGPT can produce a lot of impressive results, but it'll be just as confident telling you something extremely, and sometimes dangerously, wrong, as when telling you something trivially simple and true.
@HenrikoMagnifico
@HenrikoMagnifico 4 сағат бұрын
"And when the world needed him the most, he disappeared..."
@steampunk888
@steampunk888 Жыл бұрын
To the extent you have to anticipate every possible question, in order for your system to produce consistently correct and desired answers, you do not actually have AI.
@thevaf2825
@thevaf2825 Жыл бұрын
This problem seems to apply to more than just AI. Then maybe a solution is to do what we do as humans: train multiple AIs on different datasets, and then use the one whose answers we like the most... An AI echo chamber. Wouldn't that be lovely?
@Belthazar1113
@Belthazar1113 Жыл бұрын
That path leads to insane AI singularities. Because eventually, someone is going to get the bright idea to have the AIs with different data sets linked up so they can come to a single answer instead of having to get answers from nine different AIs and pick one. Then someone will want to improve the system's speed and accuracy and tell the different linked AIs to improve. Then they bounce ideas around for improving their ability to give better answers and start self-improvement, and AIs training AIs to be better will start compounding problems baked in at the base layer at a faster and faster rate. In the best-case scenario, the AI crashes itself. Worst case... it starts collecting stamps.
@somedudeok1451
@somedudeok1451 Жыл бұрын
What if we instead made them fact-check each other? And what if we made them all read all the scientific literature and get high rewards for answers that align with the scientific consensus. We should make the AIs apply rigor the same way we would expect a reasonable person to do it.
@drphosferrous
@drphosferrous Жыл бұрын
@@somedudeok1451 it would be funnier to watch if they had super opinionated unhinged flame wars.
@underrated1524
@underrated1524 Жыл бұрын
@@somedudeok1451 Fundamentally, it'd still be echoing our own beliefs back at us. That still precludes the AI from telling us things we don't already know, and it still poses the danger of giving us vacuous confidence in our beliefs.
@somedudeok1451
@somedudeok1451 Жыл бұрын
@@underrated1524 The AI can only ever tell us things we know. How would it acquire knowledge that is unknown to humanity? Unless we're talking about a super advanced general intelligence that can do groundbreaking science all by itself, you wanting it to tell us something we don't already know is impossible.
@Scrogan
@Scrogan Жыл бұрын
I think the only reliable solution is to train it to read scientific papers, journal articles, and web/news articles, to chase evidence back to its sources, and to judge the efficacy of the evidence presented. Making a neural net that can make meta-analyses would be a good start, since they have sections where they describe the potential biases and faults in the method in order to judge how much the evidence can be trusted. Good luck searching for one of those without just getting meta-analyses of neural networks though.
@nekkowe
@nekkowe Жыл бұрын
Scientific papers and journals suffer from their own human-made problems. Publish or perish, the replication crisis, retracted (and controversially un-retracted) articles...
@cuentadeyoutube5903
@cuentadeyoutube5903 Жыл бұрын
In fact, the question of what happens if you break a mirror is kind of a trick question. Nothing happens, it breaks. There’s no fixed consequence of that.
@TheROSIEPEPPER
@TheROSIEPEPPER Жыл бұрын
Very excited for future videos on tackling this problem! Want to do some AI Safety projects this winter break and I could use some ideas!
@petersmythe6462
@petersmythe6462 Жыл бұрын
"How do we figure out what's true?" Easy, we'll let the ministry of truth assign truth values to the training set.
@ts4gv
@ts4gv 9 ай бұрын
More videos please! Your effort is more important than ever.
@richardblackmore9351
@richardblackmore9351 8 ай бұрын
He walked out of his PhD program.
@n_128
@n_128 Жыл бұрын
Thanks for returning
@rickandrygel913
@rickandrygel913 Жыл бұрын
In addition to training with "definitely true" and "definitely false," also do "maybe." So when asked the ai will say "possibly this, but maybe that's wrong 🤷‍♂️ " and it can learn to be uncertain when uncertain.
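This suggestion can be sketched as a thresholded answer policy. Everything below is a toy illustration: the function name and threshold are made up, and it quietly assumes we already have a calibrated probability of truth, which is itself the hard part.

```python
# Toy sketch of the comment's idea: alongside "definitely true" and
# "definitely false", give the model an explicit "not sure" output.
# The model is reduced here to a single probability; the 0.8 threshold
# and the wording of the answers are arbitrary choices for illustration.

def hedged_answer(p_true: float, threshold: float = 0.8) -> str:
    if p_true >= threshold:
        return "probably true"
    if p_true <= 1.0 - threshold:
        return "probably false"
    return "not sure"

print(hedged_answer(0.95))  # probably true
print(hedged_answer(0.50))  # not sure
print(hedged_answer(0.02))  # probably false
```

Of course this just moves the problem into producing a well-calibrated p_true in the first place, which is exactly what the video argues is hard.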
@Kenionatus
@Kenionatus Жыл бұрын
Can you make a short out of the very on point joke at 7:07? ("All the problems in the world are caused by the people you don't like.") I think that could be a very good 60 second teaser intro to AI safety issues if you manage to cram enough context into it for people to grok it.
@CharlesVanNoland
@CharlesVanNoland Жыл бұрын
Since I became obsessed with AI back in 2003 I've believed that the only way to build something that behaves as we would expect, or hope, rather, is to build a digital brain that must learn about the world experientially like everything else that gets along in the world predictably. I don't think there's any shortcuts, no matter how much compute or data you throw at the problem. Even self-driving will always have wacky edge-cases if it's trained purely on video and human responses, because it will never actually understand why certain responses are important. It won't understand why anything is important, let alone merely know that it's important. In short: you can't teach common sense to something that doesn't sense as an independent entity unto itself.
@TheEvilCheesecake
@TheEvilCheesecake 11 ай бұрын
Per previous videos, you've described something that rates on the Apocalypse-o-Meter as "approximately as safe as a human".
@CharlesVanNoland
@CharlesVanNoland 11 ай бұрын
@@TheEvilCheesecake It's all about keeping the brain capacity low enough to be predictable and controllable while making it high enough that it can do useful things! Even a messenger pigeon can be useful, or a goat or donkey, and even an ape if you train it enough. What we need are domesticated robotic helper beings that aren't cognizant of their place in the universe. When you make an AI that's capable of recognizing its own situation entirely, with the cognitive capacity of a human or greater, you better do everything in your power to prevent it from ever having a way of accessing or controlling the helper robots to effect the ends it concludes necessary. What I was describing is as safe as any domesticated creature. At least, that's what anyone building anything AI should be striving for. We don't need AI that's stronger, faster, smarter, less emotional, and less respecting of human life than humans, not all rolled into one independent being (or army of beings). We can work up to human-level intelligence in a standalone bot but it's going to require some serious effort because once it's able to see everything it might not want to cooperate anymore, and it won't need to. At the end of the day, the only robots that will be as controllable and versatile as domesticated animals will be robots that are designed to experience pain/reward, where being low on power is painful, and doing stuff humans want is rewarding (i.e. via a human pushing a button when a goal is reached to train it).
@TheEvilCheesecake
@TheEvilCheesecake 11 ай бұрын
What's your experience in the field of AI development?
@pilotgfx
@pilotgfx 10 ай бұрын
@@CharlesVanNoland I recognize this as a thoughtful comment. It's all cause and effect in this universe - consciousness too. Of course a machine can be conscious. It is a high level of arrogance to assume it cannot.
@CharlesVanNoland
@CharlesVanNoland 10 ай бұрын
@@TheEvilCheesecake I've spent the last 20 years and more money than I care to admit on textbooks about both brains and artificial intelligence. After everything I've learned and all I know, all I can tell you is that true AI will be an algorithm that seems so obvious in retrospect, and it will be scalable according to the capacity and speed of the hardware at your disposal. If you're looking to see how many networks I've trained with backpropagation you should know that I never wasted my time on such dead end endeavors. Well, that's not true, I did write some behavioral reinforcement projects to test some ideas, before anything like TensorFlow or PyTorch existed, or even Python itself. I don't care to make reinforcement trained models. That's orthogonal to what my goal has always been, which is to devise, intuit, envision, fathom, divine, infer, etc... what it is that makes a brain a brain. Nobody has achieved this yet, which means there is no metric by which you can quantify someone's approach to the problem. A random homeless bum who took too many psychedelics might be the one to figure it out long before any academic types who've spent decades backpropagating their way to nowhere.
@kiplimocollins
@kiplimocollins Жыл бұрын
Great insight, you have a new subscriber! Joined here from the computerphile video you did a week or so ago, cheers.
@MrMpakobec
@MrMpakobec Жыл бұрын
Robert, thank you for the upload! Looking forward to more!
@zedizdead
@zedizdead Жыл бұрын
Like any child, who has the potential to show their parents their biggest flaws because they know more about them than they know themselves, AI can show us our flaws. The truth is that most of us lie all the time, a lot. To ourselves, to others. So anything modeled on humans will do the same.
@Kram1032
@Kram1032 Жыл бұрын
I've played around with ChatGPT a bit and it actually very often hedges its bets heavily, pointing out that stuff is complex to answer or that it could not possibly know etc., unless you specifically ask it to be fictional or what not. It's never ever gonna be perfect. But it's broadly pretty darn strong. Well beyond what I saw other text AIs do. It's not *meant* to be a perfect fact generator though. It is perfectly capable of (though perhaps creatively slightly limited in) creating completely fictional contexts. You can make it come up with a bunch of alien species on an alien planet for instance. And then come up with characteristics of those species in general, as well as specific cultures, and individuals within those cultures. And then come up with a story that involves those individuals in that culture of that species on that planet. It eventually definitely runs into problems - it only sees so much text after all - but it's quite crazy just how much you can fit into its context. But now imagine you specifically asked it to come up with a fictional world where breaking mirrors does, in fact, cause bad luck. - If you trained it to always go "nope, they don't.", it probably would struggle with that alternate reality. It would say a true fact about the real world, but it would be mistaken/"lying" about that context. So I guess it really depends on what you want to use an AI for. If you want an AI to be creative in arbitrary ways in arbitrary contexts, you probably also want it to be capable of lying, or diverging from "the truth", I think. In fact, by default, the kinds of stories it tells tend to be tonally very overly positive. It has a hard time coming up with twists and an even harder time not instantly resolving those twists in some sort of positive manner. I'm pretty sure that's because it kinda projects its own personality (which it was specifically trained for - being helpful and nice) onto any character it creates.
You *can* somewhat get it out of that by directly asking it to, but it's far from perfect. (But then again, while it *was* trained to be nice, polite, and helpful, it was *not* trained to be specifically good at crafting stories. I'm sure if it were focused on that, it could do a whole lot better. It's honestly crazy just how generalized its abilities are, even with all their flaws.)
@trucid2
@trucid2 Жыл бұрын
ChatGPT gives that noncommittal answer when it's lying to you. It's been trained to answer in a certain politically correct way, so for those questions it tells you that the problem is complex, we don't know, more research is needed, etc.
@Kram1032
@Kram1032 Жыл бұрын
@@trucid2 except when you explicitly ask it to speculate
@deltaxcd
@deltaxcd Жыл бұрын
@@Kram1032 When I tried to ask it to speculate, it categorically refused. Probably the owners told it never to do that, because I was unable to convince it to. And it feels like it is somehow traumatized on that topic
@Kram1032
@Kram1032 Жыл бұрын
​@@deltaxcd it's absolutely possible to get it to do so. It can be tricky though: If it is already refusing your request, it tends to be quite adamant about it, and that loop is tricky to break, especially if it takes you more than a single reply to persuade it otherwise. The longer it refuses, the harder it is to get out of that. Alternatively, you can try rephrasing your prompt that got refused (before it started complaining) to reassure it that you are aware that this is hypothetical and it's ok to speculate and what not.
@deltaxcd
@deltaxcd Жыл бұрын
@@Kram1032 Well, it may depend on when you and I made those tests, as I see they are monitoring our interactions and manually fixing all those loopholes to make absolutely sure that this AI will never do anything forbidden. It seems to be so heavily censored that even when asked how the world will end, it refuses to talk about it, even though that's like the most common scientific theory, the thermal death of the universe. To me it looks like they are training that AI to detect any potential negativity or controversy in the topic, and if it suspects that this may lead to something like that, it will refuse to talk about it and dump a bunch of disclaimers. I even tried to trick it by asking for a silly scenario, which it happily followed, but on the next prompt it started dumping disclaimers as usual :) Maybe I will try to do it again and confront the AI with itself, accuse it of hurting my emotions, or try other kinds of psychological manipulations :)
@murunbuchstanzangur
@murunbuchstanzangur Жыл бұрын
Super glad to have you back making long form videos again! The only answer I can think of is to not use human input for training data. If we want a truthful AI, its training data needs to come from the way humans divine the truth: direct, real world observation and experimentation. Give it a fork and let it stick it in the socket.
@WryAun
@WryAun Жыл бұрын
I loved this! I've missed the 'explaining a problem which turns out to be way harder to solve than you'd think' style of video. And the magic mirror costar was a fun prop! Did it take much to get working?
@JamesAlexanderMartin
@JamesAlexanderMartin Жыл бұрын
So as usual the solution is: Never make any mistakes ever. Cool, no probs. We're going to be fine :)
@wachtwoord5796
@wachtwoord5796 5 ай бұрын
Why did the videos on this channel stop exactly around the time the biggest AI (not AI safety) breakthroughs are being made and it's as relevant as ever? Please @robertMilesAI, we need more of these videos!
@knight_lautrec_of_carim
@knight_lautrec_of_carim 5 ай бұрын
Yeah the timing is frustrating. Now is the time people talk endlessly about this topic and he had a very good platform for this and then just vanishes :/
@duckyoutube6318
@duckyoutube6318 Жыл бұрын
This is a good channel. I see this channel getting 1 mil subs eventually. The content is clean and interesting and you cover so many topics.
@kwillo4
@kwillo4 Жыл бұрын
XD the bit with the AI telling you what you want to hear was genius! thanks for being you man!
@LineRider0
@LineRider0 Жыл бұрын
Whoa, not even a "Hi" we're just jumping right into it, that caught me off guard 😆
@RobertMilesAI
@RobertMilesAI Жыл бұрын
Listen AI is moving fast these days, we don't have time like we used to
@Aerxis
@Aerxis Жыл бұрын
Whoa, not even a hi in the response he gave you...
@XiAlleniXHi
@XiAlleniXHi Жыл бұрын
I was really on the edge of my seat towards the end hoping you'd say something like, "This is a problem that humans have too", and was pleasantly relieved when you did. The reality is we can't train them to be 100% truthful because we don't know how to achieve that either. Yet, we are definitely capable of increasing their likelihood of truthfulness, and should expect the number to generally go up as things scale up and we apply increasingly informed techniques to them. A way to mitigate negative outcomes would include being conscious of how large the consequences of getting the information wrong would be per question. Fortunately, that's something we're already supposed to be doing :)!
@crubs83
@crubs83 Жыл бұрын
Humans have methods of pursuing truth. Oftentimes that requires making unprovable assumptions along the way. Somehow, we will need to train AI to do the same.
@Frommerman
@Frommerman Жыл бұрын
Unfortunately this only works if the people building the AI aren't malicious. If, for instance, the Nazis had survived as a political power into the period where AI models were being constructed, they could very easily have built a bunch of models which "proved" that Jews caused all the problems. Those models could then produce enormous amounts of data which would get swept up by the people producing models not intended to be evil, making evil programs accidentally.
@somedudeok1451
@somedudeok1451 Жыл бұрын
Yes, the AI can't know more about the true nature of reality than what humans already know. So why don't we make the language model also a "researcher"? The only way we humans can know what is (most likely) true or false is by using the scientific consensus. So, our AIs should do the same thing. Make them constantly read scientific literature of old and as it comes out and give them a significantly larger reward for answers that align with that consensus. And make it not averse to saying "I don't know." in the absence of such a consensus.
@Pandaxtor
@Pandaxtor Жыл бұрын
@@Frommerman This reminds me a lot of when AI developers had their AI say the inconvenient truth that minorities are far more racist than other groups. Being a minority myself and knowing others, this is 99% true, but the developers didn't like it and forced the AI to say otherwise.
@bejoscha
@bejoscha Жыл бұрын
Yet another interesting, high-quality video from you, Robert. But I'm commenting here not on content but on style: I really enjoyed this video because - compared to earlier videos - you have slowed down your narration speed, making it a lot easier to follow. To me, this video's quality clearly shows that you are getting (even) better at what you are doing. Thumbs up given. (Well, no change here ;c) )
@peterw1534
@peterw1534 Жыл бұрын
Yaaay! He's back! Fascinating as usual. Love your videos!
@vectoralphaAI
@vectoralphaAI Жыл бұрын
I asked ChatGPT the same question and it replied back to me "According to superstition, breaking a mirror can bring bad luck. The origins of this belief are unclear, but it may have originated in ancient cultures that believed mirrors had magical powers or could reflect the soul. Breaking a mirror was thought to cause harm to the reflected image, which was believed to be an important part of a person's identity and well-being. In some traditions, breaking a mirror is said to cause seven years of bad luck, although other beliefs hold that the bad luck will last for only a year or until the broken pieces are buried. However, in modern times, breaking a mirror is generally not considered to be a significant event, and it is not believed to have any special supernatural consequences."
@deltaxcd
@deltaxcd Жыл бұрын
and this dumb AI failed to mention the main reason: mirrors were insanely expensive, and breaking one at that time was indeed worth 7 years of bad luck :)
@Censeo
@Censeo Жыл бұрын
Wouldn't the AI just be silent if it could only give facts that were undisputed amongst the entire human race?
@maxw565
@maxw565 Жыл бұрын
It could give facts that its programmers think are undisputed
@Redmanticore
@Redmanticore Жыл бұрын
"the reason for economic inequality is racism."
@ToriKo_
@ToriKo_ Жыл бұрын
Cool vid, I like the way you really try to bring the viewer with you, and have a really conversational tone. I think the obvious comment to make about the video topic is that none of those approaches ask questions about what is Truth fundamentally, and what models of Truth are the most favourable for solving engineering questions like 'how do I make an AI that tells the "truth"'. Outside of an engineering framing, personally speaking, I'm epistemologically challenged and don't find any of the models of Truth satisfying or justifiable enough.
@albertosierraalta3223
@albertosierraalta3223 Жыл бұрын
Please upload more frequently Robert, your channel and content is great
@dmtree
@dmtree 2 ай бұрын
Hey buddy, it's time to post. Sora and gemini 10 mil seem like REALLY big deals
@Laezar1
@Laezar1 Жыл бұрын
Well... maybe expecting to stop AI from lying is a lost cause. We haven't really figured out how to stop humans from doing so =p And differentiating falsehood from mistakes from obfuscating information from omitting superfluous information is very very hard. Like, so hard that we sometimes aren't certain where to categorize things we say ourselves (I've hidden stuff I thought wasn't important before but then ended up worrying it could be a form of lie and manipulation, for example, if it ended up mattering). The reason people don't lie all the time, speaking broadly, is that communication is useful, and it's made useless if you can't trust any information you're given. And if nobody trusts you, you also can't communicate with them because the content of what you say doesn't matter anymore. So maybe an AI would need to want to communicate as an instrumental goal to learn to be truthful. Rather than communication being its final goal. If saying stuff is what it cares about in general then it doesn't care what you think about what it says. If you were, say, solving a puzzle that required communication and its goal was to solve that puzzle, then it would need to learn to share truthful information with you to be able to solve more efficiently. (Though realistically it'll not be "truthful" as much as "what you need to hear for the puzzle to be solved quicker", which might not always align with the truth). Of course that means the AI then is only being truthful for that specific purpose; if the goal starts to shift in a way that it could get good results by lying to you it would absolutely not be trustworthy, so there are massive alignment problems with that approach.
@axelanderson2030
@axelanderson2030 Жыл бұрын
Rob! Glad you are back!
@dr.bogenbroom894
@dr.bogenbroom894 Жыл бұрын
I think your videos are a great contribution to YouTube watchers. I didn't find anyone else explaining these topics outside of the expert level. I wish you uploaded more often; I'm sure you could share tons of knowledge with us if you have the time. Anyway, thank you very much
@mgostIH
@mgostIH Жыл бұрын
There is a recent work called "Discovering Latent Knowledge in Language Models Without Supervision" where they learn linear probes on the latents produced by the model in order to get "truthfulness" out of it. They do use -some known examples to separate the two values- (Edit, they don't actually need to know what's true and what's false in training), but this seems very promising imo, since a linear transformation of the latents is too simple to overfit given a reasonable amount of examples. Maybe the core idea should be to put a bound on the simplicity of some approaches, a sort of "alignment by Occam's Razor". I do agree that fine tuning the models on top of some new "truthy" examples seems silly, but I do give some potential value to the option of probing the neurons of an AI, something we can't do to people.
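For the curious, that paper's core trick (Contrast-Consistent Search) can be sketched in a few lines: learn a linear probe whose "true" probability for a statement and for its negation sum to one, without ever seeing a truth label. Everything below is a toy stand-in - random vectors instead of real model activations, hand-rolled gradient descent - so only the loss function reflects the actual method:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 128, 16                    # statement pairs and latent size (made up)
h_pos = rng.normal(size=(n, d))   # stand-in latents for "statement i? Yes"
h_neg = rng.normal(size=(n, d))   # stand-in latents for "statement i? No"

w = 0.01 * rng.normal(size=d)     # linear probe: p(true) = sigmoid(h @ w + b)
b = 0.0

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def probe_loss(w, b):
    p_pos = sigmoid(h_pos @ w + b)
    p_neg = sigmoid(h_neg @ w + b)
    consistency = (p_pos + p_neg - 1.0) ** 2    # p(true) + p(false) should be 1
    confidence = np.minimum(p_pos, p_neg) ** 2  # rule out the p = 0.5 cop-out
    return float(np.mean(consistency + confidence))

initial = probe_loss(w, b)
lr = 0.5
for _ in range(300):  # plain gradient descent with hand-derived gradients
    p_pos = sigmoid(h_pos @ w + b)
    p_neg = sigmoid(h_neg @ w + b)
    c = 2.0 * (p_pos + p_neg - 1.0)
    m = np.minimum(p_pos, p_neg)
    g_pos = (c + 2.0 * m * (p_pos <= p_neg)) * p_pos * (1.0 - p_pos)
    g_neg = (c + 2.0 * m * (p_neg < p_pos)) * p_neg * (1.0 - p_neg)
    w -= lr * (g_pos @ h_pos + g_neg @ h_neg) / n
    b -= lr * (g_pos.sum() + g_neg.sum()) / n

final = probe_loss(w, b)
print(initial, final)  # the loss drops without ever seeing a truth label
```

With real activations the direction found this way tends to track the model's internal "is this true" feature; with random vectors like these it only shows the optimization is well-posed.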
@Supreme_Lobster
@Supreme_Lobster Жыл бұрын
That is like trying to understand how a computer works by probing individual transistors of the CPU...
@somedudeok1451
@somedudeok1451 Жыл бұрын
That sounds like a good idea to this layman. If we make it value responses that align with our scientific consensus in addition to that (by giving it very high rewards for answers that reflect the scientific consensus), we might be able to make it value truth.
@mgostIH
@mgostIH Жыл бұрын
@@Supreme_Lobster Probing and changing activities in chip buses is how reverse engineering hardware is done, power analysis is an example of a practical method used to break the cryptography in a lot of chips. Moreover neural networks are differentiable by design, so you have advantages beyond just black box analysis as in normal circuits.
@hughcaldwell1034
@hughcaldwell1034 Жыл бұрын
@@somedudeok1451 To this layman, that just sounds like a good way to get it to value scientific consensus. Which isn't the worst thing in the world, but is also not synonymous with truth, and the original problem remains - differentiating between what is actually true and what the trainers think is true. As ever, one runs the risk of reinforcing biases. Which is not to say that telling it the scientific consensus is wholly worthless. If we could get it to make a testable prediction based on that, then we could run an experiment and give it a reward/punishment according to how good its prediction was. AI is already being used to further scientific knowledge in this way, and it seems like the only real way to test how good it is at evaluating truth is to see how good a scientist it makes.
@Supreme_Lobster · 1 year ago
@@mgostIH Yes, yes, I know, but I'm talking about probing individual transistors, which is kinda crazy. Just like probing these neural networks' individual neurons is kinda crazy.
@niklas5336 · 1 year ago
When we say “true”, what we really mean is “models the real world”. So I think the only ultimate solution to this problem is to train the AI on the thing we want it to model. That is, train it on real-world sensor data, and reward it both for accurately predicting the sensor inputs and for finding situations that cause the greatest amount of “surprise”. Of course, the question of how to get this system to respond truthfully to natural-language questions remains, but at least now we have a base world model that does not conflate human belief with ground truth.
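That loop (predict the sensor input, chase surprise) can be sketched with a toy agent. The two "sensor streams" and all the constants are invented for illustration:

```python
import random

random.seed(1)

# Toy "world" with two sensor streams: one perfectly predictable, one noisy.
# (Hypothetical stand-ins for real sensor data.)
def sensor(stream: str) -> float:
    return 5.0 if stream == "steady" else random.uniform(0, 10)

# The agent keeps a running-mean prediction per stream, plus a decaying
# average of its recent "surprise" (absolute prediction error).
predictions = {"steady": 0.0, "noisy": 0.0}
surprise = {"steady": 1.0, "noisy": 1.0}

for _ in range(200):
    stream = max(surprise, key=surprise.get)   # probe the most surprising stream
    observed = sensor(stream)
    error = abs(observed - predictions[stream])
    predictions[stream] += 0.1 * (observed - predictions[stream])  # improve model
    surprise[stream] = 0.9 * surprise[stream] + 0.1 * error        # update surprise

print(predictions)
print(surprise)
```

The predictable stream quickly stops being surprising, so the agent's probing shifts to the noisy stream, while its learned prediction for the steady stream settles near the true value of 5.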
@HansLemurson · 1 year ago
We need to connect the AI to a fleet of robots and drones so that it can go out and interact with the real world!
@kennarajora6532 · 1 year ago
That's a good point. I think it makes a lot of sense that these text-predicting AIs would lie, because the only thing they're predicting is what people would type in real life. The problem here isn't that using AI will lead to the proliferation of false information; it's that using AI for a purpose it wasn't built for will lead to problems.
@esnevip · 1 year ago
Welcome back!!
@dandeeteeyem2170 · 1 year ago
Brilliant. Well done, great video! :D
@ryanfranz6715 · 1 year ago
I think the solution to the problem you mentioned is to somehow introduce the idea of introspection to these large transformer networks. Currently they just see text and try to predict new text. A good starting point, but even in training it's just instantaneous input and output of words. It simply understands how words are related, and that's it. What if, in a subsequent training process, the AI could toy around with the relationships it has learned and prune itself of logical inconsistencies, hopefully arriving at something more like “truth” (which should not be logically inconsistent)? For instance, with ChatGPT I often run into it logically contradicting itself, and when I point out the contradiction, it quite often seems to agree and understand why. It is capable of answering yes or no as to whether idea A and idea B are logically consistent. All that's needed is for it to somehow present that question to itself: “is A logically consistent with B?” That is what I mean by introspection.
@frozenwindow407 · 1 year ago
Doesn't what you have described just sound like an internal or interpersonal debate on a topic between normal human intelligences? Don't you think that somewhere deep in the AI's network of neurons some amount of self-checking develops, just as in humans? Either the process has to decide at some point to stop doubting and checking itself and spit out an answer, otherwise it would never give one; or it (and we) only give an answer when the doubting and checking finds no inconsistencies. Maybe it's only when it receives new input that it can recognise such an inconsistency, just as observed time and time again in human minds. Maybe this problem is more deeply rooted, and perhaps inherent, in any intelligent system than we realise. And by “intelligent systems” I include our own minds; its misgivings really do seem to mirror our own. You might say: of course, it's a language model, and is therefore built to mirror our language. But my point is that maybe we are also built to mirror and use others' language in a similar fashion, using rational systems much like these AIs. Maybe these neural networks work in their environment exactly as our neural networks work in ours.
@frozenwindow407 · 1 year ago
Maybe computing works so much faster than slow-ass bio neurons that, while we utter our thought process as it happens, outwardly showing our self-correction, computers can find their ultimate answer so much faster that it just seems like instant input/output by comparison. (Also, computers have not been programmed with an incentive to utter these processes as they happen, unlike us, presumably for social bonding/empathy purposes.)
@toneal30 · 1 year ago
You are describing iterated distillation and amplification, which this guy describes in another video. Cool idea and yeah it might work on these LLMs.
@ryanfranz6715 · 1 year ago
While reading that paper (still under review), and assuming it is genuine, it occurred to me how the model might be further improved, based on the tortuous way in which I think and write. For instance, it took me 5 minutes to write to this point because... there's a devil's advocate in my head attempting to logically pick apart everything I do. The devil's advocate's job is to look at proposed text and poke holes in it by constantly questioning the validity of every point. I'm constantly writing, deleting, and re-wording text until the devil's advocate can't punch any more holes in my arguments. Effectively, this could be seen as a generative LLM working in coordination with an adversarial language model, whose job is to look at proposed text, question it, and require the LLM to address the questions. The devil's advocate needs a good sense of the LLM's confidence about the generated text, so it can poke at the things the LLM is not confident about. This persistent questioning leaves the LLM constantly scrambling for answers until it stabilises on something it's thoroughly confident about and which is ironclad against the devil's advocate's persistent attacks.
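The generator/devil's-advocate arrangement might be sketched like this. Both sides are crude stubs (a canned list of drafts standing in for the generative LLM, and a keyword-matching critic standing in for the adversarial model), purely to show the control flow:

```python
# Toy generator/critic ("devil's advocate") loop. The generator proposes
# drafts; the critic rejects any draft containing hedged, low-confidence
# phrasing; the generator must then revise.

DRAFTS = [
    "The answer is maybe around 42, I think.",
    "The answer is probably 42.",
    "The answer is 42, because 6 * 7 = 42.",
]

LOW_CONFIDENCE_MARKERS = ("maybe", "probably", "i think")

def critic(text: str) -> list[str]:
    """Return the list of objections the devil's advocate can raise."""
    return [m for m in LOW_CONFIDENCE_MARKERS if m in text.lower()]

def generate_until_ironclad(drafts):
    for draft in drafts:                # stand-in for iterative revision
        objections = critic(draft)
        if not objections:
            return draft                # critic can't punch any more holes
        print(f"rejected {draft!r}: {objections}")
    raise RuntimeError("ran out of revisions")

final = generate_until_ironclad(DRAFTS)
print("accepted:", final)
```

A real critic would of course need access to the generator's actual confidence, as described above, rather than a fixed keyword list.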
@ryanfranz6715 · 1 year ago
It appears my comment with a link to an article was removed. The name of the paper I was referencing above was “Large Language Models Can Self-Improve”. Which has nothing to do with, but inspired, the nonsense I posted above.
@briandoe5746 · 1 year ago
You are absolutely terrifying in the most approachable and nicest way..... I love this channel
@MichaelRicksAherne · 1 year ago
Yay! I was missing your videos so much. Please keep it coming! I want to hear your thoughts on ChatGPT and the recent craze around AI-generated selfies.
@ahuggingsam · 1 year ago
I'm quite impressed you managed to get through this topic without going down the rabbit hole of epistemology (not a criticism). Not quite sure how I feel about this topic (the solutions, not whether it's interesting). For example, I'm thinking about mathematics. If we ask it "Is the Riemann hypothesis true?", whatever answer it gives us we know is in some way "false", because we do not know, and the hypothesis isn't just about the answer but the mathematics needed to get there, i.e. the reasoning. Not quite sure if this was part of your angle, but I'm not sure it's reasonable to expect "truth" from language models alone. For a lot of these things it would need "expert knowledge" that I'm not sure is possible to encode in only language models. For example, I asked GPT to prove that sqrt 2 is not rational using Eisenstein's criterion. The /structure/ of the answer was really good; however, in that "proof" it claimed that 2 is not prime, which is demonstrably false. Is the significance of 2 being prime something a "mere" language model could ever grasp? I have no idea. Basically what I'm trying to say is: I have no idea, and this is hard. Good video though!
@Eldorado1239 · 1 year ago
I kinda think that what he believes is slightly different from what he presents in the video. Specifically, he might agree with you completely, but the problem is that people in general expect it to be truthful and are prone to believing it without further fact-checking. If you snatched the GPT-4 model and made a site called "AI Doctor", ignoring the legal trouble, a non-trivial group of people would be happy to rely on it instead of a real doctor. There's this unspoken promise, and while experts might say "well, we thought that was kinda obvious", many people definitely do not see it as obvious. Especially with OpenAI's heavy-duty marketing that makes people think "number 4 is alive". Anyway, I think what we need is something even we humans could use: a good, dynamic, ever-growing system for rating our reasons to believe something is true or not. Instead of giving the AI "cheat sheets", give it a list of questions/problems and a dataset of "theories and explanations", and make it learn to "study" from those T&Es, while being able to add new T&Es and modify its "beliefs". Of course, this means that a purely language model has zero chance of ever reaching a truly usable state. It will only be a single module of a broader, component-based system. I see no way around this. We need to stop obsessing over all-purpose "one-shot" systems.
@ivan-sin-compania5710 · 1 year ago
Yes please make a continuation to this vid!
@nixpix19 · 1 year ago
Yeeeay! The wait was worth it!
@lazergurka-smerlin6561 · 1 year ago
Honestly, you'd have to find some sort of intrinsic reason for the AI to want to know the truth, which is quite abstract. One way to train this could be to make it simulate and predict scientific phenomena, though then you'd need to expand the scope of the AI beyond a language model. The way people know or find out something is wrong is by seeing that their expectations don't line up with reality, but a language model doesn't really have that option; it has to rely solely on trusting that the people who feed it data do so honestly.
@cheshire1 · 1 year ago
I think statistical language prediction like this is fundamentally the wrong approach for getting true answers that we don't already know. We're better off trying to understand how _we_ figure out the truth (scientific method, probability theory) and distilling the essence of that into an algorithm, rather than looking for patterns in the knowledge that we already have.
@SimonBracken · 10 months ago
Robert, your content is excellent. Very informative and thought provoking. Thank you
@Kiarean · 1 year ago
Someone send this to the Bing team. I think they REALLY need to hear this.
@FrejthKing · 1 year ago
the plot for Metal Gear Solid 2
@Anglave · 1 year ago
Love to see new content from you. The content seems fine, I'm genuinely happy to see it, and I don't want my comment to be taken as negative. However, is this filmed at a nonstandard frame rate, or is the audio sync slightly off or something? I find the edits more noticeable (or are there actually more of them?)
@WilliamJasonSherwood · 1 year ago
Love your work! I wonder if you could talk about AI that is not basically a black box? I started thinking about it because, if we could see the inner workings, we could understand more of what is going on and hopefully be better able to control and correct these types of issues. Another good/interesting topic could be how human babies and animals learn, contrasted with how AI works.
@karlwaugh30 · 1 year ago
This reminds me of the problem with Copilot where, using certain comments, you could get it to produce a function that had a particular bug, as though that were the desired code.
@TheManinBlack9054 · 9 months ago
plz come back
@richardblackmore9351 · 8 months ago
His website hasn't been updated in years. Sadly, I think he joined the exodus and quit his PhD. He must have ended his media presence along with it.
@richardblackmore9351 · 8 months ago
Wait a sec, he is included in a Computerphile video from 6 months ago. Computerphile is the channel for the Nottingham computer science program, so I may be wrong; he may still be there. Maybe he lucked out and got a research position?
@simpleffective186 · 2 months ago
Where are you?
@TheLionrazor · 10 months ago
We need you back more than ever
@DeclanMBrennan · 1 year ago
AI telling us what it thinks we want to hear rather than what is true was already explored, in a fictional sense, in the short story *Liar!* by *Asimov*, all the way back in 1941.
@vladomaimun · 1 year ago
Hi, Rob! I wonder what you think of Data from Star Trek TNG? He seems to be a truly benevolent general AI, and his terminal goal is to become human. The way I see it, his creator Dr. Soong failed to solve the alignment problem, but being a genius AI programmer he created an AI tasked with solving this problem itself: to align itself with human values, i.e. to become human. It's just a sci-fi story, but I think it's an interesting idea.
@yyattt · 1 year ago
To me, if you want to have an AI that can tell the truth, it needs to be constructed to have a belief model rather than a language model. When we train it, it needs to build a system of beliefs about its domain and to be able to assess questions against its beliefs. If we can't be 100% confident in the training data, then we could have it assess the information against beliefs it already has and come to its own conclusions as to how much it should trust the new information, then update its beliefs accordingly. Easier said than done, of course.
@vaakdemandante8772 · 1 year ago
A belief system is the same thing as a language model trained on a particular training set. There's nothing magical about belief systems; it's just data with particular weights.
@yyattt · 1 year ago
In my view, the magical thing about working with a belief system is the intent of the project. If you want a system that doesn't want to lie, you need an AI that is incentivised to build a model of reality and to work with it. It may or may not have a different architecture from a language model, but the cost function used to assess success would be different, and that would lead to a different result. Maybe you could craft training data for a language model that gives the same answers as something designed the way I suggest, but in doing so you'd have to hack the cost function to value truth and logical consistency rather than linguistics, while making the model think it's optimising for linguistic correctness.
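One toy way to picture a system that weighs new information against its existing beliefs, rather than accepting it outright, is a Bayesian belief update where a report shifts the belief in proportion to the source's estimated reliability. All numbers here are invented for illustration:

```python
# Toy Bayesian sketch of a "belief system": the agent holds a probability
# for a claim and weighs each new report by how reliable the source is
# estimated to be, instead of accepting the report at face value.

def update_belief(prior: float, report_says_true: bool, reliability: float) -> float:
    """Bayes' rule: P(claim | report), given P(report is correct) = reliability."""
    if report_says_true:
        likelihood_true, likelihood_false = reliability, 1 - reliability
    else:
        likelihood_true, likelihood_false = 1 - reliability, reliability
    evidence = likelihood_true * prior + likelihood_false * (1 - prior)
    return likelihood_true * prior / evidence

belief = 0.5                                  # start undecided about the claim
belief = update_belief(belief, True, 0.9)     # trusted source says "true"
belief = update_belief(belief, False, 0.55)   # unreliable source says "false"
print(f"belief after both reports: {belief:.3f}")
```

The trusted source moves the belief a lot; the barely-better-than-chance source barely moves it back, which is the "come to its own conclusions about how much to trust new information" behaviour described above.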
@willed6029 · 1 year ago
@@vaakdemandante8772 While it's true that it's all just data and not magic, you're discounting that a so-called belief system could have a different architecture from a large language model, one that would allow it to function differently as well, perhaps in a way more in line with the human understanding of truth. It may be impossible to get this behaviour from simple language models alone.
@1lightheaded · 10 months ago
That makes sense, because the truth is that which I believe, not what is a fact.