What does AI believe is true?

1,925 views

Samuel Albanie

1 day ago

Comments: 19
@farrael004
@farrael004 1 year ago
I'm really glad to see such a well-researched video about this paper instead of clickbait headline reporting that glosses over the more interesting details. Adding that second paper at the end also helps to show that CCS is not the silver bullet for LLM hallucination that some might believe it to be after reading the original paper.
@SamuelAlbanie1
@SamuelAlbanie1 1 year ago
Thanks @farrael004!
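(For context on the CCS method mentioned above: Contrast-Consistent Search fits a small probe to a model's hidden states for a statement and its negation, asking the two probabilities to be consistent and confident rather than relying on any truth labels. Below is a minimal sketch of that objective; it assumes PyTorch, the probe architecture and variable names are illustrative, and the per-prompt normalisation from Burns et al.'s paper is omitted.)

```python
# Minimal sketch of the CCS (Contrast-Consistent Search) objective, loosely
# following Burns et al., "Discovering Latent Knowledge in Language Models
# Without Supervision". Names and the probe architecture are illustrative.
import torch
import torch.nn as nn


class CCSProbe(nn.Module):
    def __init__(self, hidden_dim: int):
        super().__init__()
        # Linear probe mapping a hidden state to a probability of "true".
        self.probe = nn.Sequential(nn.Linear(hidden_dim, 1), nn.Sigmoid())

    def forward(self, x_pos, x_neg):
        # x_pos: hidden states for the "statement is true" completions
        # x_neg: hidden states for the negated completions
        return self.probe(x_pos), self.probe(x_neg)


def ccs_loss(p_pos, p_neg):
    # Consistency: the probe should satisfy p(x+) close to 1 - p(x-).
    consistency = (p_pos - (1 - p_neg)) ** 2
    # Confidence: penalise the degenerate solution p(x+) = p(x-) = 0.5.
    confidence = torch.min(p_pos, p_neg) ** 2
    return (consistency + confidence).mean()
```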
@Subbestionix
@Subbestionix 1 year ago
I'm glad I found your channel :3 This is extremely interesting
@SamuelAlbanie1
@SamuelAlbanie1 1 year ago
Thanks @Subbestionix!
@TheThreatenedSwan
@TheThreatenedSwan 1 year ago
Are you reading David Rozado? I've noticed that chat AI has gotten better at keeping things consistent: it doesn't give one answer to one thing and then a completely contradictory answer to other things, even when they depend on the former. But this only seems to work linguistically. You can also ask things in a different mode, for example analytically, where you ask it to examine data, analyze it and then make a statement; yet when you ask for what should be the same thing in other ways, it gives you a completely different answer. Similarly, the framing can produce one answer, even if the model generally falls back on whatever is politically correct for it. It would be nice if it could establish what exactly is meant in material terms (what is communicated, not merely what the words are) and also establish Bayesian priors to draw more extended conclusions, but I don't see how this could be done for GPT and other chatbot-style models.
@quantumjun
@quantumjun 1 year ago
It might be interesting if they could do True/False and Yes/No at the same time to check the consistency.
@SamuelAlbanie1
@SamuelAlbanie1 1 year ago
Do you mean to supervise the model to predict this as well? One idea could be as follows: if the authors trained a regressor that could predict yes/no from the normalised features, then they would have evidence that this signal is leaking. So instead, they could learn a projection and then use a trick from domain adaptation (reversing gradients) to ensure that the projected features contain no information about yes/no labels.
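(A minimal sketch of that gradient-reversal idea, assuming PyTorch; the module names, dimensions and the training-step outline are illustrative assumptions rather than code from the paper.)

```python
# Sketch: learn a projection of the normalised features while an adversarial
# yes/no classifier, attached through a gradient-reversal layer (as in
# domain-adversarial training), pushes the projection to discard yes/no signal.
import torch
import torch.nn as nn


class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; negates the gradient on the backward pass."""

    @staticmethod
    def forward(ctx, x):
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -grad_output


class YesNoInvariantProjection(nn.Module):
    def __init__(self, feat_dim: int = 1024, proj_dim: int = 128):
        super().__init__()
        self.proj = nn.Linear(feat_dim, proj_dim)   # learned projection
        self.adversary = nn.Linear(proj_dim, 2)     # tries to predict yes/no

    def forward(self, feats):
        z = self.proj(feats)
        # The adversary sees z through the reversal layer: minimising its loss
        # trains the adversary to predict yes/no, while the reversed gradient
        # trains the projection to remove that information.
        yes_no_logits = self.adversary(GradReverse.apply(z))
        return z, yes_no_logits


# Sketch of a training step (main_loss and yes_no_labels are assumed inputs):
# z, logits = model(normalised_feats)
# loss = main_loss(z) + nn.functional.cross_entropy(logits, yes_no_labels)
```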
@juliangawronsky9339
@juliangawronsky9339 1 year ago
Interesting work. In my understanding, it is trying to gauge validity, or logic, rather than the soundness of a concept or the objective nature of a claim.
@SamuelAlbanie1
@SamuelAlbanie1 1 year ago
Thanks for sharing your perspective. My interpretation of the work is that the goal is to infer which claims the model "thinks" are true, in an unsupervised manner.
@jobobminer8843
@jobobminer8843 1 year ago
Thanks for the video
@SamuelAlbanie1
@SamuelAlbanie1 1 year ago
Thanks for watching!
@JustAThought01
@JustAThought01 1 year ago
Reality is generated by random events. Knowledge is defined to be logically related, non-random facts.
@SamuelAlbanie1
@SamuelAlbanie1 1 year ago
An interesting philosophical perspective!
@younesprog2629
@younesprog2629 1 year ago
What about the LLMs used by the CIA, NSA or DARPA? They're classified projects.
@SamuelAlbanie1
@SamuelAlbanie1 1 year ago
Unfortunately (or perhaps fortunately), I don't know much about the LLMs of the CIA and NSA...
@younesprog2629
@younesprog2629 1 year ago
@SamuelAlbanie1 What I'm trying to say is: how can we verify the data they're using?
@XeiDaMoKaFE
@XeiDaMoKaFE 1 year ago
Yeah, let's base AI on the current peer-reviewed consensus nonsense rather than on the actual truth of the scientific method.
@SamuelAlbanie1
@SamuelAlbanie1 1 year ago
I suspect modern large language models (GPT-4, Claude, etc.) are often trained on large collections of peer-reviewed articles, so they will pick up on these. But I'm not sure I understand your comment (the focus of this work is on trying to determine what the AI thinks is true).
@XeiDaMoKaFE
@XeiDaMoKaFE 1 year ago
@SamuelAlbanie1 My focus is on the root of the problem: who decides what the consensus truth is among humans in the first place, versus the actual truth in the real world. AI could very well use the principles of logic to determine whether something is true or not by starting from the fundamentals instead of the assumptions. For example, when you ask whether Michelson-Morley means there is no aether, or means there is no static aether on a moving Earth, it is trained to pretend the consensus is the truth instead of looking into the actual roots of Michelson-Morley and relativity to understand that the interference of the light can also mean a moving aether on a stationary Earth. My point is that they will never make AI actually solve problems about truth.