Owain Evans - AI Situational Awareness, LLM Out-of-Context Reasoning

Views: 2,233

The Inside View


Days ago

Comments: 10
@TheInsideView 3 months ago
OUTLINE
01:12 Owain's Research Agenda
02:25 Defining Situational Awareness
03:30 Safety Motivation
04:58 Why Release A Dataset
06:17 Risks From Releasing It
10:03 Claude 3 on the Longform Task
14:57 Needle in a Haystack
19:23 Situating Prompt
23:08 Deceptive Alignment Precursor
30:12 Distribution Over Two Random Words
34:36 Discontinuing a 01 sequence
40:20 GPT-4 Base On the Longform Task
46:44 Human-AI Data in GPT-4's Pretraining
49:25 Are Longform Task Questions Unusual
51:48 When Will Situational Awareness Saturate
53:36 Safety And Governance Implications Of Saturation
56:17 Evaluation Implications Of Saturation
57:40 Follow-up Work On The Situational Awareness Dataset
01:00:04 Would Removing Chain-Of-Thought Work?
01:02:18 Out-of-Context Reasoning: the "Connecting the Dots" paper
01:05:15 Experimental Setup
01:07:46 Concrete Function Example: 3x + 1
01:11:23 Isn't It Just A Simple Mapping?
01:17:20 Safety Motivation
01:22:40 Out-Of-Context Reasoning Results Were Surprising
01:24:51 The Biased Coin Task
01:27:00 Will Out-Of-Context Reasoning Scale
01:32:50 Checking If In-Context Learning Works
01:34:33 Mixture-Of-Functions
01:38:24 Inferring New Architectures From ArXiv
01:43:52 Twitter Questions
01:44:27 How Does Owain Come Up With Ideas?
01:49:44 How Did Owain's Background Influence His Research Style And Taste?
01:52:06 Should AI Alignment Researchers Aim For Publication?
01:57:01 How Can We Apply LLM Understanding To Mitigate Deceptive Alignment?
01:58:52 Could Owain's Research Accelerate Capabilities?
02:08:44 How Was Owain's Work Received?
02:13:23 Last Message
@Max-bh1pl 3 months ago
Finally, a new episode! I've been eagerly waiting for this!
@MrCheeze 3 months ago
We're so barack
@human_shaped 3 months ago
Really very interesting. It's good to let AIs know how they're being tested so they can take that into consideration too. Thanks for the transcript ;)
@simonstrandgaard5503 3 months ago
great interview
@TheJokerReturns 2 months ago
I'd like to see if we can coordinate on podcasts. How can we best reach you?
@bilalchughtai_ 3 months ago
banger