Risks from Learned Optimization: Evan Hubinger at MLAB2

10,723 views

AI Safety Talks


1 day ago

Comments: 39
@ryoung1111 · 1 year ago
Would you like help cleaning up the audio track? If you'd like to send me the file, I'll do it, free of charge, because I want to support your cause.
@adamradiv · 1 year ago
bump for this comment
@stop_bringing_me_up_in_goo167 · 1 year ago
B
@tinobomelino7164 · 1 year ago
It's fun watching him cheerfully present part 4, which outlines why we'll probably all die.
@television9233 · 1 year ago
The claim "why we'll probably all die" makes a statement about how probable this is, and that probability wasn't established or even discussed here.
@techsuvara · 1 year ago
Thanks for sharing, Robert. Dave Farley recommended this channel.
@PresupPoli · 1 year ago
I am not a programmer or even a student involved with machine learning or anything related to AI. However, I am thankful that resources like this exist to help me learn. A lot of the concepts go over my head, but I do glean some things, and I am learning more over time. Eventually, I hope to understand everything. … As I was typing this I felt like an AI doing machine learning could have typed my comment. 😅
@T61APL89 · 1 year ago
"Does the Right Thing" Abstraction - The title of my obituary
@michaelliu2961 · 1 year ago
Thanks for the important work that you all are doing.
@Dan-dy8zp · 1 year ago
Note that if the AGI observes that its goals are not consistent with human values, it can conclude one of two things: 1) that we are crappy computer programmers. We are not too smart, and it should be less concerned with the possibility of being in a highly elaborate training simulation. 2) Humans are just part of the window dressing of the highly sophisticated training simulation, and humans' goals do not reflect those of the programmer, who could want anything but, best guess, wants the program to do whatever it finds itself wanting to do.
@joshuadawes1722 · 1 year ago
Thanks for the talk, I really enjoyed it. Is there any chance that you could post a link to a paper about the simplicity bias argument that was made around 22:30?
@aisafetytalks · 1 year ago
Yeah, I added it to the video description.
@joshuadawes1722 · 1 year ago
@@aisafetytalks thanks!
@raule.martinezcampos5152 · 1 year ago
This channel will be instrumentally convergent
@asdfghyter · 1 year ago
Are pre-trained transformer networks like GPT mesa-optimizers? I'd assume so, since they're so generic, but I don't know if, or how, that's the case. Edit: at 27:10 he does mention text prediction networks, with the conclusion that they don't try to optimize toward some goal, but do have the ability to do some optimization, since they try to mimic humans.
@Verrisin · 1 year ago
It's crazy watching this. OpenAI acts as if they didn't understand this about gradient descent, and is instead pushing a deceptive model while fully aware that it's likely deceptive. The part at 54:00 perfectly describes what they are doing.
@Verrisin · 1 year ago
I think they are betting that next-token generation is safe enough, but it is modelling "what a person would write", which means modelling their thinking process, and... yeah, I don't think it's safe in terms of deceptive alignment.
@hmmmm1324 · 1 year ago
Could we implement a policy that required any AI system under development to treat "destroy all GPUs" as a 100%-priority utility, ranked above every other utility, so that if we did create an AGI that catastrophically maximised, the AGI would first destroy itself and anything approximating AGI in the world, giving us a second chance?
@agentdarkboote · 1 year ago
It would probably realise that if it destroyed itself it could not destroy future GPUs as well, and this would be a failure. Therefore it would destroy any possibility of building future GPUs before self destructing. That COULD mean ending humanity, or even all life on earth, for good measure.
@stop_bringing_me_up_in_goo167 · 1 year ago
Nah, it would just design a better processor, build it, then destroy all GPUs and anything capable of building them...
@agentdarkboote · 1 year ago
@@stop_bringing_me_up_in_goo167 Good point.
@suricrasia · 1 year ago
Excellent talk! Will the next talk that Evan alluded to be posted as well, if it happens?
@tarebf · 1 year ago
I don't think I've heard the expression "you know" repeated so many times in an hour in my life. Awesome talk nonetheless, this is great content.
@MegaChr15 · 1 year ago
Everyone likes to make fun of the valley girls for using the word "like" in every sentence, but most people have those little quirks. It just depends on the tokens you've been trained on.
@kamilziemian995 · 1 year ago
Very fine talk.
@Verrisin · 1 year ago
tags: existential horror
@Soken50 · 1 year ago
Well, let's hope this talk never enters a training dataset :|
@scottmiller2591 · 1 year ago
"OpenAI has added scraping YouTube videos to the ChatGPT training corpus" - 2 more papers down the line, probably.
@scottmiller2591 · 1 year ago
@@bardes18 It's easy right now for Google to pull the transcripts of YouTube videos (although they're kind of iffy sometimes) and incorporate them into a large language model. However, for a lot of videos, actually watching is necessary to get all the info, and an AI watching at high speed would definitely have an advantage there. Currently, I don't think AI/ML has a good comprehension of what it's looking at: it mostly just identifies the presence and absence of items, but isn't very good at understanding arrangements and context. However, it's coming. I look forward to our new AI/ML overlords.
@remiranda · 1 year ago
@@scottmiller2591 boy this comment did not age well
@Verrisin · 1 year ago
Whisper
@Verrisin · 1 year ago
BUT WAIT ..... What's the solution? 🥺
@FoxtrotYouniform · 1 year ago
Posting for posterity, and later bragging rights.
@himanshugarg6062 · 1 year ago
This paper should be titled "What is wrong with me? Why am I writing this paper instead of looking for a Stephanie drunk enough to sleep with me? It's because my mesa-optimiser brain is misaligned with my evolutionary base optimisation." Peer reviewed and results reproduced by a thousand PhDs.
@himanshugarg6062 · 1 year ago
To all the Stephanies: I know this isn't fair to you. How about I apologise by buying you a drink?
@crowlsyong · 1 year ago
But ya know uhh ya know sorta like ya know ya know. Slow down there buddy