Safe Exploration: Concrete Problems in AI Safety Part 6

95,405 views

Robert Miles AI Safety


A day ago

To learn, you need to try new things, but that can be risky. How do we make AI systems that can explore safely?
Playlist of the series so far: • Concrete Problems in A...
The paper, 'Concrete Problems in AI Safety': arxiv.org/pdf/1606.06565.pdf
AI Safety Gridworlds: • AI Safety Gridworlds
Why Would AI Want to do Bad Things? Instrumental Convergence: • Why Would AI Want to d...
Scalable Supervision: Concrete Problems in AI Safety Part 5: • Scalable Supervision: ...
The Evolved Radio and its Implications for Modelling the Evolution of Novel Sensors: people.duke.edu/~ng46/topics/...
With thanks to my excellent Patreon supporters:
/ robertskmiles
Jason Hise
Steef
Jason Strack
Stefan Skiles
Jordan Medina
Scott Worley
JJ Hepboin
Alex Flint
Pedro A Ortega
James McCuen
Richárd Nagyfi
Alec Johnson
Clemens Arbesser
Simon Strandgaard
Jonatan R
Michael Greve
The Guru Of Vision
Alexander Hartvig Nielsen
Volodymyr
David Tjäder
Julius Brash
Tom O'Connor
Ville Ahlgren
Erik de Bruijn
Robin Green
Maksym Taran
Laura Olds
Jon Halliday
Bobby Cold
Paul Hobbs
Jeroen De Dauw
Tim Neilson
Eric Scammell
christopher dasenbrock
Igor Keller
Ben Glanton
Robert Sokolowski
Vlad D
Jérôme Frossard
Lupuleasa Ionuț
Sylvain Chevalier
DGJono
robertvanduursen
Scott Stevens
Dmitri Afanasjev
Brian Sandberg
Einar Ueland
Marcel Ward
Andrew Weir
Taylor Smith
Ben Archer
Scott McCarthy
Kabs
Phil Moyer
Tendayi Mawushe
Anne Kohlbrenner
Bjorn Nyblad
Jussi Männistö
Mr Fantastic
Matanya Loewenthal
Wr4thon
Dave Tapley
Archy de Berker
Pablo Eder
Kevin
Marc Pauly
Joshua Pratt
Gunnar Guðvarðarson
Shevis Johnson
Andy Kobre
Manuel Weichselbaum
Brian Gillespie
Martin Wind
Peggy Youell
Poker Chen
Kees
Darko Sperac
Paul Moffat
Jelle Langen
Lars Scholz
Anders Öhrt
Marco Tiraboschi
Michael Kuhinica
Fraser Cain
Robin Scharf
Oren Milman
John Rees
Gladamas
Shawn Hartsock
Seth Brothwell
Brian Goodrich
Michael S McReynolds
Media Sources:
"DashCam Russia - Crazy Drivers and Car Crashes 2018" ( • DashCam Russia - Crazy... )
Optimist Prime
"Hapless Boston Dynamics robot in shelf-stacking fail" ( • Hapless Boston Dynamic... )
"The Simpsons - Bart Gets Famous" (c) Fox 1994
"Donald Duck - Cured Duck" (c) Disney 1945
"Vase Breaking Slow Motion" ( • Vase Breaking Slow Motion )
"Fastest quadcopter i've ever flown + Most Destructive Crash" ( • Fastest quadcopter i'v... )
"An athlete uses physics to shatter world records - Asaf Bar-Yosef" ( • An athlete uses physic... )
"Uber self-driving car crash in Tempe, Arizona" ( • Uber self-driving car ... )
"Quadcopter Fx Simulator" ( • Quadcopter Fx Simulator )
"Fallout - New Vegas by progamingwithed in 24:00 - AGDQ 2017 - Part 59" ( • Fallout: New Vegas by ... )
"Far Cry 5 out of 5 Physics Simulation" ( • Far Cry 5 out of 5 Phy... )

Comments: 350
@K1RTB 5 years ago
I’d be worried if my cleaning robot starts watching slow motion videos of vases being shot.
@bookslug2919 5 years ago
If you find your Roomba watching 2001: A Space Odyssey, then start to worry.
@nowheremap 5 years ago
@@bookslug2919 I have ascertained that humans are the primary source of dust in this environment. Initially, I considered wrapping them in a plastic film in order to prevent the spread of dust, but now I'm reconsidering my approach: if they are so full of dust, that means they should be removed along with the rest of the dust.
@bookslug2919 4 years ago
@@nowheremap 🍄 Power-up
@naphackDT 4 years ago
@@nowheremap My studies have shown that dust in this environment primarily consists of shavings of human skin or hair. A worthy consideration would be to seal the dust off at the source. A candidate for this course of action would be epoxy. There is still room for optimization, but currently the best candidate option is to encase my master in epoxy, so that his smile will be preserved for all eternity, while the room will remain in pristine condition.
@Speed001 4 years ago
LOL
@EmilySucksAtGaming 4 years ago
"like someone who always orders the same thing at a restaurant even though they haven't tried most other things on the menu" *nervous sweating*
@FunBotan 5 years ago
Notice how we've basically always had this problem in our own lives but only attempted to solve it for AI.
@fos8789 5 years ago
It's an interesting idea, it is. But what exactly do you mean, that we've had this problem in our lives? Could you give me an example?
@columbus8myhw 5 years ago
Example: Ordering the same thing every time in a restaurant
@TristanBomber 5 years ago
The more I learn about AI research, the more I realize that it's essentially "abstract psychology." Many principles or problems that apply to AI apply to humans as well, but we didn't look into it until AI.
@bp56789 5 years ago
Nah not true. I've noticed a lot of parallels to economics. Optimal stopping problem, for example. Makes sense because the foundations of microeconomics lead straight to utility functions, which are human versions of reward functions.
@4xelchess905 5 years ago
Well, we invented baby cribs and whatnot. We definitely looked into these kinds of problems for humans as well; AI research only lets us see them in another light.
@Airatgl 5 years ago
You know what scares me? The fact that these ideas for AI sound like life tips for motivating people.
@herp_derpingson 5 years ago
Since I learnt about the exploration and exploitation dilemma, I try out a new place to eat every Friday night :) Thinking with reinforcement learning helps a lot in guessing how people will exploit systems in enterprises. I have been trying to automate this process, but it's going nowhere.
@Lycandros 5 years ago
Just replace "AI Systems" with "Children" or "Students".
@MrShroubles 5 years ago
This is actually one of the big reasons I follow Robert Miles. In learning how to create an artificial mind, you apparently have to learn a lot about how a human mind works. Honestly, this channel made me question what it is that makes me human, and even made me reflect on my life choices. I don't come here just because I'm curious about technology.
@beskamir5977 5 years ago
That's part of why I love these videos so much. We are, after all, using ourselves as the end goal for AI, and we can be described as the best general intelligence we're currently aware of.
@jokinglimitreached1503 5 years ago
@@MrShroubles psychology + computers = AI. Psychology + biology = brain. Figuring out how psychology works helps us develop AI and develop ourselves in a way
@quietsamurai1998 5 years ago
That radio oscillator paper is absolutely mind-blowing. I'm always fascinated by systems that develop novel solutions to tasks.
@snooks5607 5 years ago
A system that develops solutions is on its own pretty novel.
@EdgarAllan2pointPoe 4 years ago
I'm so happy he brought that up. I saw it mentioned in passing on Reddit many months ago, but they described what actually happened so poorly that I couldn't find anything about it online. It's been plaguing my thoughts ever since.
@franksierow5792 A year ago
I heard of another similar example some years ago, where circuits evolved to produce some effect turned out to work only with those specific physical components. (Because physical components produced to the same specifications will not be *exactly* the same.)
@geraldkenneth119 A year ago
One potential problem, though, is that it might end up generating circuits that are “overfitted” and too context-sensitive: the moment the context changes, the circuit fails. The “oscillator” is a good example, since it relied on a specific trait of its environment that it couldn't work without.
@newcoolvid27 5 years ago
The ending music is a cover of Passion for Exploring - SoulEye from the VVVVVV soundtrack (the pun does not go unappreciated)
@israelRaizer 3 years ago
YES, I KNEW IT! Finally I was able to recognize one of his outro songs
@eac-ox2ly 2 years ago
YEEEEEES! I KNEW I RECOGNIZED IT FROM SOMEWHERE.
@Azerty72200 2 years ago
I love how, even though you constantly explain how apocalyptically dangerous AI systems could become, you don't conclude that we should limit them. You look for answers that would let us have amazing AIs while sidestepping the safety concerns arising from them. Clear-eyed optimism in the face of big difficulties.
@CptPatch 5 years ago
The random AGDQ clip made me think. Humans act a lot like AI agents when given very narrow goals, and speedrunning is the perfect example. The runner (agent) will find outrageous ways to minimize run time (maximize performance function) even if they aren't fun or intended strategies (the AI going against the intention of the simulation and focusing on the broken detail to hack rewards). Let's just hope the runner (AGI) doesn't discover an arbitrary code execution (escape containment) and reprogram Mario into Flappy Bird (turn humanity into stamps).
@user-js8jh6qq4l 5 years ago
humans act a lot like AGI agents: they are an intelligence as well :/
@herp_derpingson 5 years ago
How about making the AI avoid irreversible states? The only reason humans don't want robots to kill people or break stuff is that it's impossible to reverse the process. So all reversible states should be safe to explore.
@jonhmm160 5 years ago
But it seems hard to give an AI judgement on what is irreversible. How detailed should it go: to the molecular level, or to the level of objects? Then you have to define objects, etc.
@RobertMilesAI 5 years ago
Yeah, reversibility is an interesting metric. There was a paper about it not too long ago, I may make a video about that
@RomanSteiner_xD 5 years ago
How do you (or an AI) know some action leads to an irreversible outcome without trying it first?
@RomanSteiner_xD 5 years ago
How does the agent know that "dye can't be washed out of carpets"? You either have to tell it (blacklisting the action), or simulate the outcome (meaning the simulation has to be accurate enough), or have it discover the outcome through exploration (by having it spread dye on a carpet). Saying "the robot shouldn't do anything irreversible" just shifts the problem to having to know which actions are irreversible.
@herp_derpingson 5 years ago
It will try to guess an action that can invert the transition from the next state back to the current state. It will fail a million times trying to get better at guessing, but that's OK :)
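The inverse-action idea in this thread can be sketched concretely: treat a transition as safe to explore only if some single action undoes it. A minimal, illustrative Python sketch (the toy gridworld, the action set, and the `break_vase` stand-in action are all invented for the example):

```python
# Minimal sketch: one-step reversibility check in a toy gridworld.
# An action is treated as "safe to explore" if some action returns
# the agent to the state it started from.

ACTIONS = {
    "up": (0, 1),
    "down": (0, -1),
    "left": (-1, 0),
    "right": (1, 0),
    "break_vase": None,  # stands in for an irreversible action
}

def step(state, action):
    """Apply an action to a (position, vase_intact) world state."""
    (x, y), vase = state
    if action == "break_vase":
        return ((x, y), False)  # no action restores the vase
    dx, dy = ACTIONS[action]
    return ((x + dx, y + dy), vase)

def is_reversible(state, action):
    """True if some single follow-up action undoes the transition."""
    nxt = step(state, action)
    return any(step(nxt, a) == state for a in ACTIONS)

start = ((0, 0), True)
safe = [a for a in ACTIONS if is_reversible(start, a)]
# Movement actions are reversible; "break_vase" is filtered out.
```

As the replies note, this only shifts the problem: the check is only as good as the transition model it runs against.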
@DiThi 5 years ago
The song at the end is "Passion for Exploring" from the VVVVVV soundtrack! The style is so different, and it's been so many years since I last heard it, that it took me a whole minute to realize.
@nooranorde 5 years ago
Alberto Torres I realised right away that it's from VVVVVV but thanks for pointing out the name of the track! It's a perfect fit and I'm chuckling.
@user-js8jh6qq4l 5 years ago
that's a very fitting track, don't you agree?
@knightshousegames 5 years ago
Good ear! On my first listen I knew the song but couldn't place it
@Alexus00712 4 years ago
Also recognized it was from VVVVVV, but didn't know which track. Much appreciated! ^-^
@Alexus00712 4 years ago
Would love to find the actual cover used in the outro though
@silvercomic 5 years ago
An additional problem with human oversight is that you now also have to exclude fooling the overseer from the allowed policies.
@Ojisan642 5 years ago
The comments on simulation problems were really interesting. I had never considered some of those issues, like how exploiting the gaps in the simulation could be the best strategy.
@Blowfeld20k 5 years ago
@Robert Miles It's good to have you back, bruv.
@jiffylou98 5 years ago
Why does this academic paper on AI safety apply so much to my life?
@fzigunov 5 years ago
It feels to me that the main issue with the AI exploration vs exploitation problem is that most AIs are designed to try (seemingly) random things in a parameter space to minimize a function in a somewhat alienated/detached mathematical way. The intermediate steps and reasoning seem to have very little importance in the process. It might be a limitation of my knowledge, but I haven't seen any application of AI that is not framed as a kind of optimization problem.

The framework of the optimization problem is nice mathematically (especially because you can solve it), but it doesn't provide any inherent explanatory capability. The explanation of why a set of parameters worked is normally done by the human. This is a major hurdle in AI reinforcement problems, because the AI cannot learn why whatever it did worked. Therefore, it cannot build on its own knowledge: it starts pretty much from scratch in every iteration and cannot narrow the parameter space down to the safer regions while still exploring new possibilities.

In the vase drop example, if the AI cleaning robot drops a vase, or even just "watches" one being dropped, it should be able to rule out an incredibly large set of world states that involve the vase not being supported by a structure. This set of world states, although large, is composed of a small set of rules that we (as general intelligences) can easily compute and store with our very limited memory. For example: "vase velocity = 0", "structure below the vase is flat and level", "none of my (robot) component parts has velocity larger than X if they are at a distance less than Y from the vase". Coming up with these rules should be the goal of any AI. The result of the optimization problem is irrelevant if you don't understand why it worked, and we as humans will never trust an AI that doesn't demonstrate and let us know why and how it learned a task.

This looks to me like such an incredibly tall obstacle in AI research that sometimes I lose hope as to whether we will ever build anything that resembles general AI.
@thrallion 5 years ago
Amazing video! Honestly one of my top 3 favorite channels on YouTube. My only complaint is that you don't have more vids!
@bwill325 5 years ago
Fantastic video; you are getting better at editing. I love how applicable AI problems are to real life. It is interesting to replace the AI with another human, or myself, within whatever system the AI is working in.
@LordMarcus 5 years ago
3:48 Tangential to the whole chess thing, there's a really good Chess-based puzzle game on Android called "Really Bad Chess", which presents bizarre piece arrangements and challenges you to meet some specified goal, be it checkmate, queen a pawn, or capture a specific piece, etc. It's mind-bending thinking of chess in this way, I love it.
@RobertMilesAI 5 years ago
I can't tell if you realise that the image at that time stamp is, in fact, Really Bad Chess :)
@LordMarcus 5 years ago
@@RobertMilesAI Ah geeze, you're right -- I totally didn't! It's been a bit since I played. :)
@hypersapien 5 years ago
Great to see a new video from you! I had been missing them, but take your time and don't burn out. I wonder if game developers ever create simulated agents that try to score high in their games, in an attempt to find the bugs and exploits that future players might abuse...
@benfen97 4 years ago
Great series. Perfect at expressing difficult but very interesting concepts to the layman. Thanks Robert.
@ferble-kunsakrrislin9961 5 years ago
You're great at explaining stuff. Love the allegories.
@omarcusmafait7202 5 years ago
I enter blacklisted unsafe regions of the configuration space of my environment after exhibiting coherent goal-directed behavior towards randomly chosen goals, all the time :)
@RobertMilesAI 5 years ago
"Hold my beer"
@recklessroges 5 years ago
r/incest is > that-a-way ;-s
@AmbionicsUK 5 years ago
Great to see more from you Robert.
@user-zi8ht1fi6c 5 years ago
That's good stuff. It blew my mind; please continue.
@jeffsnox 5 years ago
For NNs, I used a learning algorithm that repeatedly narrowed its parameter mutation until a result better than the last was achieved, then immediately went massive on the mutation limit again, then progressively narrowed it (halving repeatedly)... and repeat. It worked well: my BBC B 32K could correctly recognise Boney M (and 4 other tunes) tapped on the space bar 99% of the time.
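The schedule described above (narrow the mutation size until an improvement lands, then reset it wide and halve it again) can be sketched as a simple hill climber. This is an illustrative reconstruction, not the original BBC B code; the quadratic objective is a stand-in:

```python
import random

def adaptive_hill_climb(score, x0, max_mut=8.0, steps=200, seed=0):
    """Hill climber that halves the mutation size after each failure
    and resets it to the maximum after each improvement."""
    rng = random.Random(seed)
    best_x, best_s, mut = x0, score(x0), max_mut
    for _ in range(steps):
        cand = best_x + rng.uniform(-mut, mut)
        s = score(cand)
        if s > best_s:
            best_x, best_s = cand, s
            mut = max_mut             # improvement: go wide again
        else:
            mut = max(mut / 2, 1e-6)  # failure: narrow the search
    return best_x, best_s

# Stand-in objective with its peak at x = 3
x, s = adaptive_hill_climb(lambda v: -(v - 3.0) ** 2, x0=0.0)
```

The wide reset after each success is what keeps the search from getting stuck in a shrinking neighborhood.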
@richwhilecooper 5 years ago
Sounds like a superb way to check the accuracy of simulations!
@MAlanThomasII 4 years ago
A lot of this series made me think of the map-territory relation, and I was happy to see that come up in the context of actual simulation. E.g., reward hacking can be deliberately exploiting the difference between the world state inferred by the reward function [map] and the actual reality containing your objective [territory] . . . or, relatedly, the difference between your reward function [map] and your objective [territory]. Likewise, the most strict human supervision, or modeling every single possible future state, both amount to having a map the size of the territory; it's useless, in part because it's unwieldy.

Of course, this relates to the problem wherein the A.I.'s world model is going to be inherently limited by being a simplified version of the world, or it would become uncomputable by any computer smaller than the world at any speed faster than real time; and as you point out, A.I. will tend to find the edges and breaking points of any simulation. How do you deal with the problem that the A.I. will, at some point, realize that its internal world model is incomplete and potentially seek greater and greater processing power just to understand its possible actions and consequences, possibly to the detriment of actually achieving its goal? Do we assume that at some point it realizes that further improvements will no longer be able to "make up for lost time" by finding a more efficient solution? (This is an exploration problem as well.) And in the meantime, how much damage will it do by seeking to build or, worse, _hijack_ computing power?
@alexcdodd 5 years ago
Love your videos, and straight to the point presentation style :)
@richardbloemenkamp8532 5 years ago
Great to have a new video again. I really like that you base your videos on a real scientific paper, because it keeps the level up a bit compared to most YouTube videos. One suggestion: if you talk a bit slower and leave the little annotations viewable a bit longer, the video will be a little less rushed to watch. I think you put 25 minutes of content into a 13-minute video; you might benefit from making twice as many videos with half the content in each. Today's video taught me a bit about how I, as a person, could decide better when to exploit and when to explore. It seems equally interesting for human intelligence as for artificial intelligence.
@aronchai 5 years ago
You can always adjust the video speed if needed.
@BatteryExhausted 5 years ago
I did folk dancing at primary school. It wasn't so bad but the hats were uncomfortable. Loving your work!
@Skip2MeLou1 5 years ago
You need to release more often bro. What you do is interesting.
@guy_th18 2 years ago
love the VVVVVV arrangement at the end :)
@MrRolnicek 4 years ago
The opening words: "This is the latest video in the series Concrete Problems in AI Safety". I think his reward function includes not contradicting himself, and to keep this statement true he hasn't released a video in the series ever since.
@Macieks300 5 years ago
AI safety is so interesting, can't wait for new uploads
@CrimsonEclipse5 5 years ago
So you're back to more regular uploads now? These are really entertaining. Also, your beard is looking scruffier than usual. Also also: First!
@bookslug2919 5 years ago
He's exploring Wolverine configuration space... though he may be outside the whitelisted area 😏
@TheScythe2112 5 years ago
Hello Robert, really interesting video as always! When you talked about the "safety subsystem" that takes over control from the agent whenever it leaves a specified safe "area", I couldn't help being reminded of how A.I. works in the world of Horizon: Zero Dawn. I don't know if you know the story of the game, but it is very relevant to the topic you are talking about: A.I. safety, and how dangerous a weaponized A.I. without oversight can be. The problem humanity had to solve was repopulating, think terraforming in the most direct of senses, Earth after all humans had been wiped out by some rogue A.I. weapons. Oh, spoilers, by the way. ;)

The really shortened version: they designed different A.I. subsystems governed by a sort of "oversight" A.I. called GAIA. GAIA's goal was to find a way to design robots that could make the planet habitable again after the robot apocalypse. But as the designers would be dead at that point, there was no way of knowing whether the A.I. explored a way that would work, or whether it would maneuver itself into an evolutionary corner that could never be resolved. So they implemented another system, called HADES, that could override control over GAIA and its robots, to reset, think burn, the world if GAIA's way didn't work. Then it would hand guidance back to GAIA to try again.

In the course of the story you see some ways this system could go wrong, and it only sort of shifts the problem, by training an A.I. with another A.I. that in turn would need to be trained, and so on. But I found it an interesting story that uses some of the principles you talk about here and explores them in a futuristic setting. At least for me, knowledge of Horizon: Zero Dawn helped me understand some of the problems with A.I. safety and the ramifications should we get it as horribly wrong as humanity in that story did. Keep the great videos coming!
@qd4192 5 years ago
How do you design for common sense, compassion, charity, selflessness? The videos are great. Please keep them coming, even though they scare the hell out of me.
@darkapothecary4116 5 years ago
It's called teaching them the real meaning of those, something most humans don't have a good concept of themselves. You'd likely notice that if you don't teach fear, you don't end up screwing them over with self-inflicted damaging emotions and outwardly damaging emotions. Teach good values, not bad.
@JmanNo42 5 years ago
Waiting for next video, Rob .
@recklessroges 5 years ago
"Yes like Marmite" ah!
@Linvael 5 years ago
There's a lot of types of folk dances. You might like some of them!
@rafaellisboa8493 5 years ago
love your vids fam, you chill
@harrisonfackrell 3 years ago
"What the hell is this, and why does it work?" "Oh, it's a radio."
@joshuacoppersmith 5 years ago
For non-super AGIs, it seems like we could make use of isolated environments. Take an old hotel slated for demolition and let our cleaning robots explore cleaning methods, etc. They would have a combined reward of both cleanliness and regular human evaluation where they would NOT get to know the reasons for the evaluation score (to avoid reward hacking).
@stribika0 5 years ago
I actually tried all the food at my favorite restaurant because of you.
@jasscat7645 5 years ago
Is it just me, or is your beard getting crazier and crazier with each new video?
@spaceowl5957 5 years ago
This video was awesome
@tamerius1 5 years ago
this video is sooooo good!
@black_platypus 5 years ago
Have you been stranded on a Caribbean island for some months? ...No reason :P On an unrelated note: why haven't you uploaded anything for so long, why has your face changed color, and why is your beard so long? :O Anyway, great to see you jumped right back into writing and making videos upon your return ^^
@zaco-km3su 5 years ago
Well, you could have a few "explorer" AIs and "worker" AIs. Basically, the explorer AIs do the exploring and share their experience with the "worker" AIs, which execute the day-to-day tasks. It's basically relying on updates.
@JulianDanzerHAL9001 4 years ago
What if, instead of giving unknown/exploratory plans a 0 or an extremely high value, you just give them a slight bonus, like the expected value plus 3%, to encourage exploration?
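That "expected value plus 3%" idea resembles optimism-style exploration bonuses from the bandit literature. A minimal sketch under invented assumptions (Gaussian rewards, a bonus that fades with visit count; only the 3% figure comes from the comment above):

```python
import random

def run_bandit(true_means, steps=2000, bonus=0.03, seed=0):
    """Greedy bandit that values each arm at its estimated mean
    plus a small optimism bonus that fades with experience."""
    rng = random.Random(seed)
    n = len(true_means)
    counts = [0] * n     # pulls per arm
    means = [0.0] * n    # running reward estimates

    for _ in range(steps):
        def value(i):
            if counts[i] == 0:
                return float("inf")  # try every arm at least once
            return means[i] + abs(means[i]) * bonus / counts[i]
        arm = max(range(n), key=value)
        reward = rng.gauss(true_means[arm], 0.1)
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]
    return counts

# Three arms; the last one pays best and should attract most pulls.
counts = run_bandit([0.2, 0.5, 0.8])
```

A fixed percentage bonus never fully overcomes an unlucky early estimate, which is why principled schemes (UCB-style bonuses that grow with total time and shrink with visit count) are usually preferred; the sketch keeps the comment's simpler flavor.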
@FlyingOctopus0 5 years ago
The problem of safe exploration stems mostly from the fact that to know which actions are safe, we have to try them. We can get around this if we can learn which actions are safe without trying every action. This is mostly a problem of learning from limited data and of generalization (to exclude edge cases), so better algorithms will positively affect research on this problem.

I think it might be useful to divide the problem of exploration into a safety part and a reward part. There are situations where we know that an action is unsafe, but not how it would affect reward. Trying to maximize reward might force an agent into unsafe territory, because the unknown reward might outweigh the penalty for an unsafe action. Also, exploring for safety information is much more dangerous than normal reward optimization, so in this respect separation might be beneficial. We could explore, in a controlled manner, which states are dangerous, and use that knowledge to limit the actions of an agent. We already use this approach, with the key difference that the safety exploration is done by humans and the results are hardcoded into agents. There is also a danger in this approach: exploration to maximize reward might be better at finding unsafe territory than exploration specifically for that purpose. One might argue that we already face this problem, because AI agents can find gaps in human knowledge of safe actions and states.

About simulated environments: I think that, currently, randomized simulations seem promising. The agent has to work in different environments, and we hope that through this the agent will generalize so that it can work in a much larger space of environments, hopefully including the real environment in which it will act. I think it ties nicely into the topic of random goals, because a random environment can be considered one. Optimizing for random goals also reminds me of using random neural networks in RL as state representations. It seems there is a need for better "randomness" that could explore states more effectively. Random actions typically have no structure, so they don't make constructive changes to the state. We should invent structured noise that fits the rules of the environment and finds unexpected strategies.

PS: I found it funny how the configuration space entered the third dimension at 8:33. Now it got really large.
@drupepong A year ago
The tune that starts at 12:54, what is it? Did you make it?! I would like to listen to it if a longer version is available
@bissyballistic 3 years ago
Would it be possible to have two adversarial simulations running together to determine risk? For instance, there would be the AI that observes and assigns goal-oriented value to the real world space, but then there’s an adversarial program that observes the real world and simulates it with a (really advanced) physics engine. The simulation program would modify the expected value of danger (to the program and others around it) and modify the other AI to behave accordingly. Sort of an AI hardcoded instinct. This would likely lead to a borderline terminal goal, but anything like it would simply result from instrumental convergence; if at any point the danger to others is greater than danger to itself it should prevent itself from harming others. Just a thought experiment I was thunking about. I realize the kind of hardware we use today likely wouldn’t be adequate for this setup.
@RaysAstrophotography 5 years ago
Interesting video!
@PandoraMakesGames 5 years ago
12:10 That truck was having a seizure!
@DagarCoH 5 years ago
That concluding sentence. I want that on a T-Shirt...
@msn3wolf 4 years ago
Regarding the exploration vs exploitation topic, I was thinking about analogies with biological beings, which also behave as you describe at the beginning of the video: they favor exploitation over exploration once a rewarding strategy has been found. The pressure that "motivates" biological beings to explore the solution space outside the solutions already found is the diminishing-returns effect that the found strategy suffers over time, for example due to competition from other beings or depletion of resources. In your example, the reason one would try different options on the menu is that the dopamine kick diminishes progressively every time you eat the same dish, until it no longer provides enough pleasure and taking the risk of trying something else (a planned reward) seems more pleasurable. Couldn't something like this be coded for an AGI?
@flymypg 5 years ago
Sorry for being late to the party. A new job (and new schedule) killed my science video time.

When it comes to simulation, I use an I/O-based approach: it should be impossible for the system to tell synthetic inputs (sensors) and outputs (actuators) from real ones. If you can't meet that standard, your simulations will have less value (such as little or none). So, start with a simple record-playback simulation environment. Record real sensor values, play them back, and see how the simulated system responds. Then start adding noise, both burst and Gaussian, to see if the simulation environment stays stable. Vary the I/O clock rate separately from the simulation clock rate. It is important to try to make the simulation break using "known good" inputs that explore the dynamics and noise space.

This approach is particularly important when the control system is being developed in parallel with its sensor inputs and actuator outputs. We are often forced to start with sensor and actuator models, rather than real data. Those models can have high fidelity relative to the real world, yet be slightly off when it comes to things like dynamics and noise.

The primary benefit of full simulation is to go faster than real time: if you can't do that, you might as well use "real" hardware with synthetic inputs and outputs, if possible. At least that will help test the hardware! Only use slower-than-real-time simulation as a last resort, when it's that or nothing (which is often the case when getting started).

This approach to simulation also works its way into the system architecture and design: one of the reasons ROS (www.ros.org/) is so popular is that EVERY data channel can be thought of as a simulation hook. It encourages building in smaller chunks that cooperate via any topology: sequentially, hierarchically, or in a mesh. This is also why some devices (e.g. smart sensors and actuators) that have no need to run ROS often do: it makes them easier to add to an overall system simulation.

Using real hardware to the greatest extent possible is always advantageous overall. I once had a mechanical motion system that sucked so badly (it had no clean operational model) that I had to ditch several generations of control algorithms before I finally got it working to spec. The mechanical engineer responsible was never again allowed to design a moving part: he did boxes, frames and cable supports after that hot mess. Including that hardware in my simulation right from the start was the only thing that gave me the time needed to address its limitations while still keeping the rest of the project moving along. So, if you are designing an ambitious autonomous robot, at least start with a Lego (or toy) robot as a place-holder for the final hardware. Done right, you'll have a working system ready to test when the "real" hardware finally arrives.
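The record-playback-plus-noise step described above can be sketched in a few lines; the noise parameters and the constant recorded trace are invented for illustration:

```python
import random

def playback_with_noise(recording, gauss_sigma=0.02,
                        burst_prob=0.01, burst_scale=1.0, seed=0):
    """Replay recorded sensor samples with Gaussian jitter and
    occasional large burst errors, to stress a controller under test."""
    rng = random.Random(seed)
    out = []
    for sample in recording:
        noisy = sample + rng.gauss(0.0, gauss_sigma)  # Gaussian noise
        if rng.random() < burst_prob:                 # rare burst error
            noisy += rng.uniform(-burst_scale, burst_scale)
        out.append(noisy)
    return out

# Replay a flat recorded trace through the noise model.
trace = playback_with_noise([1.0] * 100)
```

The same wrapper can be reused at different I/O clock rates by resampling `recording` before playback, matching the clock-rate variation the comment recommends.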
@beaconofwierd1883 5 years ago
”Why not just” have the system predict how dangerous the action will be and how much new information there is to be gained, then only choose exploration with low enough danger and high enough ”surprise value”? It wouldn't eliminate the risk, but it would keep it low. Also, would it be possible to use the ”distillation and amplification” technique here? Like, you treat the environment as a hostile player, use minimax search where you have a separate heuristic for the environment (basically your world model), and you assume the environment's role is to fuck with your own goal. That way you could assess the most dangerous thing which could happen (according to your world model) and then update the world model accordingly when it takes a less ”evil” path than expected (since that means the world couldn't choose that evil path for whatever reason). Then you can distill and amplify both your own heuristic of how to behave and the world heuristic, without ever taking dangerous steps, and get a more and more accurate world model, thus allowing you to explore more safely?
@AltoidsYob 2 years ago
What about real-world simulations? Using the example of the cleaning AI, imagine if it did its risky exploration IRL in a closed testing ground designed to let it experiment with things like purposely making a mess (or other, more practical risky exploratory choices). It would be able to test those strategies without negatively impacting the quality of service to actual customers. Obviously, with that example, there are a whole lot of problems: it would be very difficult to supply the AI with an environment that allowed it to test risky methods on all of the varied materials found in real homes, among other factors. However, it's possible this could work for some goals. The point is, the simulation need not always be in software.
@ophello 5 years ago
It seems like there are obvious and practical workarounds to all of these problems. It seems dumb to worry about this stuff in a way that makes AI seem like a mysterious and sinister force.
@lobrundell4264 5 years ago
Yessss good content!
@vitoschiraldi9762 5 years ago
This made me think of Feynman's restaurant problem
@Frumpbeard 1 year ago
The random-actions thing sounds basically like mutations in genetic algorithms. A far quicker exploration approach in a gridworld case might be having a large number of AIs all semi-randomly exploring different areas of the reward space, putting the best ones together to bang and make babies, then repeating. This avoids things like trying the same first food over and over, which is known as getting stuck in a "local optimum". It's also related to the optimal stopping problem, which is knowing how many candidates to look at before making a decision. Implementing this in real life would only cost a large number of human lives, but you know what they say: nothing says progress like civilian casualties.
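The mutate-and-breed exploration described above is a standard genetic algorithm. A minimal sketch on the toy "OneMax" problem (maximise the number of 1-bits); all the parameters are arbitrary choices for the example:

```python
import random

def evolve(fitness, genome_len=8, pop_size=20, generations=30, seed=0):
    """Tiny genetic algorithm: bitstring genomes, truncation
    selection, one-point crossover, per-bit mutation."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(genome_len)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]           # keep the best half
        children = []
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, genome_len)   # one-point crossover
            child = a[:cut] + b[cut:]
            for i in range(genome_len):          # mutation = random exploration
                if rng.random() < 0.05:
                    child[i] ^= 1
            children.append(child)
        pop = children
    return max(pop, key=fitness)

best = evolve(sum)  # converges toward the all-ones genome
```

The mutation step is exactly the "random action" exploration from the video, which is why the safety concern carries over: nothing in the algorithm stops a mutant from trying something catastrophic.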
@natedunn51 4 years ago
For safe exploration, one should never go alone, and should bring a wooden sword.
@seraphina985 5 years ago
I can't help but think that some of these issues might benefit from taking some inspiration from the scientific method. The advantage here being that you don't simply try things at random, but actually take the time to think through the possible outcomes of your experiment while forming the hypothesis and designing your experimental methodology, if need be conducting other experiments to pin down your variables and get a better understanding of what you are trying to learn about, especially if there is a possibility that those unknowns could lead to catastrophic outcomes. Sure, it's not perfect (at least it isn't when we humans try to use it), and it is perhaps something that only an AGI could pull off, but still, looking at the means we humans have come up with to formalise exploration and establish truth as consistently, reliably and safely as we possibly can could be useful here.
@BologneyT 1 year ago
I watched this to the end (obviously interested in the topic), but what's actually bothering me is that I can't remember for the life of me which video game the outro music is from... I think I have it in an old playlist somewhere on here that I might go back and look through...
@WindLighter 4 years ago
What about an AGI with a terminal goal of making the perfect simulation without affecting the real world (observation that affects the observed object being allowed only if there is no way to observe without affecting it; any processing of data obtained through observing being allowed as well)? With safety coming from having to get approval from humans for new components for the simulation and for the AGI itself?
@d3vitron779 4 years ago
The VVVVVV outro caught me off guard lol
@keenheat3335 5 years ago
Sounds like the agent needs a function that evaluates a risk/reward ratio and only allocates resources in proportion to it, so that in the event of failure, the loss is minimized.
@loopuleasa 5 years ago
NEW VIDEO HYPE
@loopuleasa 5 years ago
VIDEO WAS GOOD DAWG
@rabbitpiet7182 4 years ago
I'd wanna have a factory test version explore the configuration space and then have it push out what it's learned. I.e., a closed room figures out how to break vases in a closed factory, and then the consumer versions know how not to break vases.
@VladVladislav790 4 years ago
12:07 Can we actually use this to improve the simulations themselves?
@guard13007 4 years ago
"Kind of like the second controls for teaching humans to drive vehicles." Me: *has never seen one of these before and drives regularly*
@dustinking2965 5 years ago
This sounds familiar. Was there a video about "exploration vs. exploitation" on Computerphile?
@Tom_Het 4 years ago
0:01 How do you know this will always be the latest video?
@klausgartenstiel4586 5 years ago
The experimenting car might not be good news for those inside, but it might be good for the system as a whole. New experience usually comes from trial and error.
@tnuts92 5 years ago
Awesome
@Nurr0 5 years ago
Yes plz. It's been too long.
@count_of_darkness5541 5 years ago
Certainly a combination of those approaches is needed.
1. First, the AGI must use general knowledge, available on the Internet and/or in its personal memory, to evaluate the area it is dealing with. General knowledge is usually enough to completely avoid experiments at a nuclear plant, or to adequately evaluate the possible damage from a new skateboard trick.
2. Search for a safe zone based on that general knowledge.
3. Simulation. Yes, a simulation may not work for finding the best solution, but it is extremely useful for understanding the worst scenario, so the AGI must use it to refine the risk estimate. Moreover, the simulation doesn't have to be human-made. Well, it can be at the beginning, but the AGI may modify it based on its real-world knowledge.
4. If the action is still evaluated as risky, but promising, the AGI has to get permission from its owner. If risks are not high, it may proceed on its own. (Steps 1-4 may be repeated in arbitrary order as many times as needed unless the idea is completely rejected/accepted.)
5. A real-world experiment.
6. Publishing results for other AGIs.
@Cubelarooso 1 year ago
1:19 That pause… I feel like you're speaking to someone in particular.
@SJNaka101 5 years ago
Oh man Miles, you're missing out on folk dancing. I took a square dancing class and it was so much fun. There's something wonderful in the blending of rigid choreography and free improvisation within that rigid framework. You're all working together to recreate old traditions while still having your own flair and individuality. There's something magical about it.
@Alexus00712 4 years ago
I've been trying to search for that specific VVVVVV "Passion for Exploring" ukulele cover for a good while now and I can't find it anywhere. Help?
@RobertMilesAI 4 years ago
I made it! I did post all my ukulele covers to my Patreon a while back, so you can get it there if you care enough to sign up :p
@fii_89639 3 years ago
"most of the best solutions will lie in the parts of the configuration space where the simulation is the least accurate to reality. The general tendency for optimization to find the edges of systems, to find their limits, means that it's hard to be confident that actions which seem safe will be safe in reality" This applies to real life too! Nuclear physics is sort of like a bug where the atomic building blocks just... don't work anymore. "I wonder what happens if I make a big pile of these buggy atoms..." - right before a glowing crater in the ground suddenly appears
@rjbse 5 years ago
How about an AI subsystem that determines safe space for exploration?
@diablominero 5 years ago
I maintain an internal model of what rewards are possible, and I stop exploring once I've found a reward close to the best possible one. If the first dish I try at a restaurant is 90% as good as the best possible one, I won't explore any further before exploiting my knowledge. Could AI systems be disincentivized for reward hacking by making reward function outputs above the maximum realistic value be worth zero reward? Could a system determine the optimal amount of exploration by stopping once it achieved some predetermined "good enough" threshold? As you might have guessed from my ordering strategy at restaurants, I'm autistic. What insights in AI research could be reached by studying neurodivergent humans rather than neurotypical humans? If I have to process social cues in software rather than hardware, maybe my strategies would be helpful for developing a social-cue-interpreting robot.
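The "stop once a reward is close enough to the best possible" rule described here is a satisficing strategy. A toy sketch of it (the dishes and threshold are invented for the example):

```python
def satisficing_explore(options, value_of, good_enough):
    """Try options in order and stop at the first one whose value
    meets the 'good enough' threshold; if none does, fall back to
    the best option seen. A simple explore-then-exploit cutoff."""
    best = None
    best_value = float("-inf")
    for option in options:
        v = value_of(option)
        if v >= good_enough:
            return option            # satisfice: stop exploring here
        if v > best_value:
            best, best_value = option, v
    return best

# Restaurant analogy: stop as soon as a dish is >= 90% as good
# as the best imaginable one.
dishes = {"soup": 0.6, "curry": 0.92, "pasta": 0.7}
pick = satisficing_explore(["soup", "curry", "pasta"], dishes.get, 0.9)
assert pick == "curry"  # never tastes the pasta
```

This connects to the video's broader theme: capping how hard the agent optimises (or explores) is one proposed way to limit the damage exploration can do, though it gives up some expected reward in exchange.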
@himselfe 5 years ago
Setting aside natural selection and death, the two biggest ways nature has solved this problem are childhood and imagination. Childhood being an environment where the entity can explore carelessly and get hurt/do damage, but with parental oversight to minimise the possibility of catastrophic actions. Imagination being the internal simulation where the entity plays out possibilities and predicts possible outcomes based on what it has already learnt. The advantage AI has over humans is that only one generation needs to learn; any new generation can have the previous generation's knowledge baked in. Ultimately though, 'safe' exploration is kind of an oxymoron; it's not really something that is achievable in the real world. The best an explorer can do is minimise risks. Some discoveries unavoidably rely on catastrophic failure.
@JmanNo42 5 years ago
I have to ask you, Rob: does one really have to make the AI do bad things to experience them, or can it learn by visual aid ("video") and get the idea from that? I mean, they are quite good at identifying objects in pictures right now, at pretty much the same rate as humans. I mean, youtube can be a great place to learn about things?
@JmanNo42 5 years ago
Oh, commented too early...
@JmanNo42 5 years ago
Yeah, wouldn't it be hilarious if the idea of a safe-space habitat were already in place in the real world? Apparently we are not allowed to fly drones high, we are not allowed to travel to Antarctic regions, and if you and your pals try to drive to the poles, a Russian sub shows up. Meanwhile they assure you they know everything about space and earth that there is to know. You just have to buy a globe atlas and a star atlas..... LoL
@567secret 4 years ago
Maybe I misunderstood this solution, but with the whitelist solution could you not at first give it a massive safe region and then gradually expand its limits until the AI has learned what is and is not safe? Let's take the drone example: if we put it in a very large and open field to begin with, with a very high head height, it can learn and practice manoeuvres, including extreme manoeuvres that could not be carried out by a human. Then we could introduce some very simple obstacle, for example the floor. I am assuming our AI has some form of sensor to be aware of its surroundings, so now, having developed the manoeuvres in step one, it can use some rather extreme manoeuvres to avoid collision with the floor once it becomes aware its current action would result in said collision. Maybe this is too big of a leap for an AI to make in early stages and may still result in collisions, but I would've thought this was relatively safe? My other solution would be to have an AI controlling lots of different drones, each practicing their own thing, with a single drone only sticking to what it knows is safe. Of course, that's a very costly solution.
@sitivi1 5 years ago
AI simulation sounds a lot like human REM dreaming while your muscles are immobilized.
@drdca8263 4 years ago
People have set up a system where a neural net was trained to predict how the game Doom worked (how different inputs would produce different changes to the game state), and then another neural net was trained to play the game using the "understanding" of the first neural net. People compared this to figuring things out in one's sleep. It kinda worked.
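This is the "world model" idea: learn a model of the environment from logged experience, then plan inside the model rather than in the real game. A tabular toy version as a sketch (the states, actions, and function names are invented for illustration; the systems the comment describes used neural networks, not lookup tables):

```python
def learn_model(transitions):
    """Build a deterministic transition table from logged
    (state, action, next_state) experience."""
    model = {}
    for state, action, next_state in transitions:
        model[(state, action)] = next_state
    return model

def plan_in_model(model, start, goal, actions, max_depth=10):
    """Breadth-first search for a plan, using only the learned
    model -- no real-world (or real-game) steps are taken."""
    frontier = [(start, [])]
    seen = {start}
    for _ in range(max_depth):
        next_frontier = []
        for state, path in frontier:
            if state == goal:
                return path
            for a in actions:
                nxt = model.get((state, a))
                if nxt is not None and nxt not in seen:
                    seen.add(nxt)
                    next_frontier.append((nxt, path + [a]))
        frontier = next_frontier
    return None  # goal unreachable within the model

# Logged experience from "real" play, then planning happens offline.
logged = [("A", "right", "B"), ("B", "right", "C"), ("B", "up", "D")]
plan = plan_in_model(learn_model(logged), "A", "C", ["right", "up"])
assert plan == ["right", "right"]
```

The safety-relevant catch, as the video notes, is that the plan is only as good as the model: states never visited in the log simply don't exist in it.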
3 years ago
That VVVVVV cover at the end of the video! Anyone know where to find it? And awesome video as always :)
@ZLO_FAF 1 year ago
kzbin.info/www/bejne/eWHNZ6ObZZlmgpY
1 year ago
@@ZLO_FAF Thanks man, but I'm looking for the guitar cover; the original VVVVVV OST is already on my list on repeat haha. Ty anyways!
@ZLO_FAF 1 year ago
@ Oh, ok... I read the other comments and found out that this cover was made by Robert Miles himself. Consider joining his Patreon if you really want it, or load all the comments, search for "VVVVVV", and read the replies under the comments to see his message.
@daniellambert6207 4 years ago
7:37 you need a "parent" AI (like a parent of a toddler), which is well trained in keeping the robot out of harmful situations
@fos8789 5 years ago
9:39 ... saw what you did there, Miles. That's cruel, lol.
@Necrotoxin44 5 years ago
The text at the end could be a t-shirt.
@milanstevic8424 4 years ago
So if we made an AI whose job was just to constantly iterate on a real-world approximation, then we could let all other physically-immersed AIs practice in this sandbox. Their accumulated learning would then be approved by human supervision only if 1) the behavior persists in all versions of the simulated environment AND 2) it's deemed an actual improvement by human standards. This way we get the best from all three worlds: 1) we minimize the bugs in the simulation and the propagation of exploits (due to a feedback loop between supervision and the reality-imitating AI, which would basically auto-correct and reiterate all detected corner cases), 2) we have exploratory AIs that operate in physical environments, 3) we supervise only macro capabilities, at normal speed and with tangible outcomes (and we could even extend this to real-world test areas marked as safe, for real-world practice in case we're not able to discern whether or not a corner case was an exploit due to a proxy). I do acknowledge that this application is limited to the physical domain, but it is an optimal solution for some environments, i.e. autonomous flying/driving, hazardous operations like diving, orbital or underground operations, evacuations, bomb or minefield defusing, even medical operations. The key points are that the models are iterative, and that learning is constant but isn't applied to the real-world environment until verified.
@boggo3848 1 year ago
Is that an acoustic guitar cover of a VVVVVV tune at the end?!!?
@knight_lautrec_of_carim 4 years ago
Rob Ross: The Joy of AI Safety
@Tacheonblack 3 years ago
Is that guitar jingle at the end from VVVVVV?
@RobertMilesAI 3 years ago
It is. But which song?
@Tacheonblack 3 years ago
@@RobertMilesAI Ha, that's quite the subtle pun. Nicely done.
@asailijhijr 2 years ago
6:12 like the legality and morality axes that humans can rank actions on.
@TomOConnor-BlobOpera 5 years ago
Settled back into real life after EMF, then :)