What does AI believe is true?

1,925 views

Samuel Albanie

1 day ago

Comments: 19
@farrael004
@farrael004 1 year ago
I'm really glad to see such a well-researched video about this paper instead of clickbait headline reporting that glosses over the more interesting details. Adding that second paper at the end also helps to show that CCS is not the silver bullet for LLM hallucination that some might believe it to be after reading the original paper.
@SamuelAlbanie1
@SamuelAlbanie1 1 year ago
Thanks @farrael004!
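(For context on the CCS method mentioned above: Contrast-Consistent Search fits a small probe to a model's hidden states for a statement and its negation, asking the two probabilities to be consistent and confident rather than relying on any truth labels. Below is a minimal sketch of that objective; it assumes PyTorch, the probe architecture and variable names are illustrative, and the per-prompt normalisation from Burns et al.'s paper is omitted.)

```python
# Minimal sketch of the CCS (Contrast-Consistent Search) objective, loosely
# following Burns et al., "Discovering Latent Knowledge in Language Models
# Without Supervision". Names and the probe architecture are illustrative.
import torch
import torch.nn as nn


class CCSProbe(nn.Module):
    def __init__(self, hidden_dim: int):
        super().__init__()
        # Linear probe mapping a hidden state to a probability of "true".
        self.probe = nn.Sequential(nn.Linear(hidden_dim, 1), nn.Sigmoid())

    def forward(self, x_pos, x_neg):
        # x_pos: hidden states for the "statement is true" completions
        # x_neg: hidden states for the negated completions
        return self.probe(x_pos), self.probe(x_neg)


def ccs_loss(p_pos, p_neg):
    # Consistency: the probe should satisfy p(x+) close to 1 - p(x-).
    consistency = (p_pos - (1 - p_neg)) ** 2
    # Confidence: penalise the degenerate solution p(x+) = p(x-) = 0.5.
    confidence = torch.min(p_pos, p_neg) ** 2
    return (consistency + confidence).mean()
```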
@Subbestionix
@Subbestionix 1 year ago
I'm glad I found your channel :3 This is extremely interesting
@SamuelAlbanie1
@SamuelAlbanie1 1 year ago
Thanks @Subbestionix!
@TheThreatenedSwan
@TheThreatenedSwan 1 year ago
Are you reading David Rozado? I've noticed that chat AI has gotten better at keeping things consistent: it doesn't give one answer to one thing and then a completely contradictory answer to other things, even when they depend on the former. But this only seems to work linguistically. You can also ask things in a different mode, for example analytically, where you ask it to examine data, analyze it and then make a statement; yet when you ask for what should be the same thing in other ways, it gives you a completely different answer. Similarly, the framing can produce one answer, even if the model generally falls back on whatever is politically correct for it. It would be nice if it could establish what exactly is meant in material terms (what is communicated, not merely what the words are) and also establish Bayesian priors to draw more extended conclusions, but I don't see how this could be done for GPT and other chatbot-style models.
@quantumjun
@quantumjun 1 year ago
It might be interesting if they could do True/False and Yes/No at the same time to check the consistency.
@SamuelAlbanie1
@SamuelAlbanie1 1 year ago
Do you mean to supervise the model to predict this as well? One idea could be as follows: if the authors trained a regressor that could predict yes/no from the normalised features, then they would have evidence that this signal is leaking. So instead, they could learn a projection and then use a trick from domain adaptation (reversing gradients) to ensure that the projected features contain no information about yes/no labels.
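(A minimal sketch of that gradient-reversal idea, assuming PyTorch; the module names, dimensions and the training-step outline are illustrative assumptions rather than code from the paper.)

```python
# Sketch: learn a projection of the normalised features while an adversarial
# yes/no classifier, attached through a gradient-reversal layer (as in
# domain-adversarial training), pushes the projection to discard yes/no signal.
import torch
import torch.nn as nn


class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; negates the gradient on the backward pass."""

    @staticmethod
    def forward(ctx, x):
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -grad_output


class YesNoInvariantProjection(nn.Module):
    def __init__(self, feat_dim: int = 1024, proj_dim: int = 128):
        super().__init__()
        self.proj = nn.Linear(feat_dim, proj_dim)   # learned projection
        self.adversary = nn.Linear(proj_dim, 2)     # tries to predict yes/no

    def forward(self, feats):
        z = self.proj(feats)
        # The adversary sees z through the reversal layer: minimising its loss
        # trains the adversary to predict yes/no, while the reversed gradient
        # trains the projection to remove that information.
        yes_no_logits = self.adversary(GradReverse.apply(z))
        return z, yes_no_logits


# Sketch of a training step (main_loss and yes_no_labels are assumed inputs):
# z, logits = model(normalised_feats)
# loss = main_loss(z) + nn.functional.cross_entropy(logits, yes_no_labels)
```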
@juliangawronsky9339
@juliangawronsky9339 1 year ago
Interesting work. In my understanding, it is trying to gauge validity, or logic, rather than the soundness of a concept or the objective nature of a claim.
@SamuelAlbanie1
@SamuelAlbanie1 1 year ago
Thanks for sharing your perspective. My interpretation of the work is that the goal is to infer which claims the model "thinks" are true, in an unsupervised manner.
@jobobminer8843
@jobobminer8843 1 year ago
Thanks for the video
@SamuelAlbanie1
@SamuelAlbanie1 1 year ago
Thanks for watching!
@JustAThought01
@JustAThought01 1 year ago
Reality is generated by random events. Knowledge is defined to be logically related, non-random facts.
@SamuelAlbanie1
@SamuelAlbanie1 1 year ago
An interesting philosophical perspective!
@younesprog2629
@younesprog2629 1 year ago
What about the LLMs used by the CIA, NSA or DARPA? They're classified projects.
@SamuelAlbanie1
@SamuelAlbanie1 1 year ago
Unfortunately (or perhaps fortunately), I don't know much about the LLMs of the CIA and NSA...
@younesprog2629
@younesprog2629 1 year ago
@SamuelAlbanie1 What I'm trying to say is: how can we verify the data they're using?
@XeiDaMoKaFE
@XeiDaMoKaFE 1 year ago
Yeah, let's base AI on the current peer-reviewed consensus nonsense rather than on the actual truth of the scientific method.
@SamuelAlbanie1
@SamuelAlbanie1 1 year ago
I suspect modern large language models (GPT-4, Claude, etc.) are often trained on large collections of peer-reviewed articles, so they will pick up on these. But I'm not sure I understand your comment (the focus of this work is on trying to determine what the AI thinks is true).
@XeiDaMoKaFE
@XeiDaMoKaFE 1 year ago
@SamuelAlbanie1 My focus is on the root of the problem: who decides what the consensus truth is among humans in the first place, versus the actual truth in the real world. AI could very well use the principles of logic to determine whether something is true or not by starting from the fundamentals instead of the assumptions. For example, when you ask whether Michelson-Morley means there is no aether, or means there is no static aether on a moving Earth, it is trained to pretend the consensus is the truth instead of looking into the actual roots of Michelson-Morley and relativity to understand that the interference of the light can also mean a moving aether on a stationary Earth. My point is that they will never make AI actually solve problems about truth.