Mechanistic Interpretability - NEEL NANDA (DeepMind)

33,885 views · 1 day ago

Machine Learning Street Talk

80000hours.org/mlst
Visit our sponsor 80,000 Hours - grab their free career guide and check out their podcast! Use our special link above!
Support us! / mlst
MLST Discord: / discord
Twitter: / mlstreettalk
In this wide-ranging conversation, Tim Scarfe interviews Neel Nanda, a researcher at DeepMind working on mechanistic interpretability, which aims to understand the algorithms and representations learned by machine learning models. Neel discusses how models can represent their thoughts using motifs, circuits, and linear directional features which are often communicated via a "residual stream", an information highway models use to pass information between layers.
Neel argues that "superposition", the ability of models to represent more features than they have neurons, is one of the biggest open problems in interpretability, because it thwarts our ability to understand models by decomposing them into individual units of analysis. Despite this, Neel remains optimistic that ambitious interpretability is possible, citing examples like his work reverse-engineering how models do modular addition.
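The superposition picture is easy to make concrete. Below is a minimal, illustrative sketch (our addition, not Neel's code): it packs 100 sparse features into a 40-dimensional "layer" using random, non-orthogonal directions, then reads them back with dot products.

```python
# Minimal sketch of superposition: store more sparse features than there are
# dimensions ("neurons") along random, nearly-orthogonal directions, then
# recover the active ones with dot products. Illustrative toy only.
import numpy as np

rng = np.random.default_rng(0)
n_features, n_dims = 100, 40          # more features than dimensions

directions = rng.normal(size=(n_features, n_dims))
directions /= np.linalg.norm(directions, axis=1, keepdims=True)

# A sparse input: only 3 of the 100 features are active.
x = np.zeros(n_features)
x[[4, 17, 42]] = 1.0

# The layer's activation is the sum of the active feature directions.
hidden = x @ directions               # shape (40,)

# Dot products against each direction recover the active features, up to
# small interference terms caused by the directions not being orthogonal.
scores = directions @ hidden
print("recovered:", sorted(np.argsort(scores)[-3:]))   # typically [4, 17, 42]
```

The readback works because random high-dimensional directions interfere only weakly; as more features become active at once the interference grows, which is why superposition leans so heavily on feature sparsity.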
Key areas of discussion:
* Mechanistic interpretability aims to reverse engineer and understand the inner workings of AI systems like neural networks. It could help ensure safety and alignment.
* Neural networks seem to learn actual algorithms and processes for tasks, not just statistical correlations. This suggests interpretability may be possible.
* 'Grokking' refers to the phenomenon where neural networks suddenly generalize after initially memorizing. Understanding this transition required probing the underlying mechanisms.
* The 'superposition hypothesis' suggests neural networks represent more features than they have neurons by using non-orthogonal vectors. This poses challenges for interpretability.
* Transformers appear to implement algorithms using attention heads and other building blocks. Understanding this could enable interpreting their reasoning.
* Specific circuits like 'induction heads' seem to underlie capabilities like few-shot learning. Finding such circuits helps explain emergent phenomena.
* Causal interventions can isolate model circuits. Techniques like 'activation patching' substitute activations to determine necessity and sufficiency (see the sketch after this list).
* We likely can't precisely control AI system goals now. Interpretability may reveal if systems have meaningful goal-directedness.
* Near-term risks like misuse seem more pressing than far-future risks like recursive self-improvement. But better understanding now enables safety later.
* Neel thinks we shouldn't "over-philosophize". The key issue is whether AI could pose catastrophic risk, not whether it fits abstract definitions.
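To make activation patching concrete, here is a hedged sketch using TransformerLens, the open-source library Neel maintains. The API calls are written from memory and the prompt, layer, and position choices are illustrative assumptions rather than the episode's method, so verify against the library docs before relying on it.

```python
# Sketch of activation patching: cache activations from a "clean" prompt,
# splice one of them into a "corrupted" run, and measure how much of the
# clean behaviour returns. API from memory -- check the TransformerLens docs.
from transformer_lens import HookedTransformer, utils

model = HookedTransformer.from_pretrained("gpt2")

clean = model.to_tokens("When John and Mary went to the store, John gave a drink to")
corrupt = model.to_tokens("When John and Mary went to the store, Mary gave a drink to")

_, clean_cache = model.run_with_cache(clean)   # cache every clean activation

def patch_resid(resid, hook, pos=-1):
    # Overwrite the corrupted run's residual stream at one position with
    # the clean run's activation, then let the forward pass continue.
    resid[:, pos, :] = clean_cache[hook.name][:, pos, :]
    return resid

layer = 6  # illustrative choice; in practice you sweep every layer and position
hook_name = utils.get_act_name("resid_pre", layer)
patched_logits = model.run_with_hooks(corrupt, fwd_hooks=[(hook_name, patch_resid)])

# If patching here restores the clean answer (" Mary"), this activation
# carries the circuit; score it as a logit difference between the names.
mary, john = model.to_single_token(" Mary"), model.to_single_token(" John")
print((patched_logits[0, -1, mary] - patched_logits[0, -1, john]).item())
```

Sweeping that patch over every layer and token position is how the "Interpretability in the Wild" paper in the pinned references localizes the indirect-object-identification circuit.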
What do YOU think? Let us know in the comments!
Neel Nanda: www.neelnanda.io/
/ @neelnanda2469
Pod version: podcasters.spotify.com/pod/sh...
TOC
00:00:00 Intro
00:03:57 Discord questions
00:09:41 Chapter 1: Grokking and superposition
00:32:32 Grokking start
01:07:29 How do ML models represent their thoughts
01:20:30 Othello
01:41:29 Superposition
02:31:09 Chapter 2: Transformers discussion
02:41:06 Emergence
02:44:07 AI progress
02:57:01 Interp in the wild
03:09:26 Chapter 3: Superintelligence/XRisk
Transcript: docs.google.com/document/d/1F...
Refs: docs.google.com/document/d/11...
See refs in pinned comment!
Interview filmed on May 31st 2023.
#artificialintelligence #machinelearning #deeplearning

Comments: 94
@MachineLearningStreetTalk (6 months ago)
References:
* Toy Models of Superposition dynalist.io/d/n2ZWtnoYHrU1s4vnFSAQ519J transformer-circuits.pub/2022/toy_model/index.html kzbin.info/www/bejne/iGTRk4udgtOJp7M (Nanda)
* Supermasks in Superposition arxiv.org/abs/2006.14769
* Interpretability in the Wild arxiv.org/pdf/2211.00593.pdf
* Actually, Othello-GPT Has A Linear Emergent World Representation (Nanda) www.lesswrong.com/s/nhGNHyJHbrofpPbRG/p/nmxzr2zsjNtjaHh7x thegradient.pub/othello
* A Mathematical Framework for Transformer Circuits transformer-circuits.pub/2021/framework/index.html kzbin.info/www/bejne/gYeYmJWFoq2VoLc (Nanda)
* Attention Is All You Need arxiv.org/abs/1706.03762
* A Structured Self-Attentive Sentence Embedding openreview.net/forum?id=BJC_jUqxe
* Distributed Representations of Words and Phrases and their Compositionality (Mikolov) arxiv.org/abs/1310.4546
* Deep Residual Learning for Image Recognition arxiv.org/abs/1512.03385
* Attribution Patching: Activation Patching At Industrial Scale (Nanda) www.neelnanda.io/mechanistic-interpretability/attribution-patching
* In-context Learning and Induction Heads transformer-circuits.pub/2022/in-context-learning-and-induction-heads/index.html kzbin.info/www/bejne/mnTOgoSPrLWrmq8 (Nanda)
* The Quantization Model of Neural Scaling arxiv.org/pdf/2303.13506.pdf
* Interpreting Neural Networks to Improve Politeness Comprehension aclanthology.org/D16-1216/
* Progress Measures for Grokking via Mechanistic Interpretability (Nanda) arxiv.org/abs/2301.05217 kzbin.info/www/bejne/f3nMnH-Cbbp_l5Y twitter.com/NeelNanda5/status/1616590887873839104
* Grokking paper arxiv.org/abs/2201.02177
* A Toy Model of Universality arxiv.org/abs/2302.03025 twitter.com/bilalchughtai_/status/1625948104121024516
* A Circuit for Python Docstrings in a 4-Layer Attention-Only Transformer www.lesswrong.com/posts/u6KXXmKFbXfWzoAXn/a-circuit-for-python-docstrings-in-a-4-layer-attention-only
* Fodor, J. A., & Pylyshyn, Z. W. (1988). Connectionism and Cognitive Architecture: A Critical Analysis. psycnet.apa.org/record/1989-03804-001
* Maximal Update Parametrization (μP) and Hyperparameter Transfer (μTransfer) github.com/microsoft/mup
* Spline Theory of NNs proceedings.mlr.press/v80/balestriero18b/balestriero18b.pdf
* Counterarguments to the Basic AI X-Risk Case (Katja Grace) www.lesswrong.com/posts/LDRQ5Zfqwi8GjzPYG/counterarguments-to-the-basic-ai-x-risk-case
* The Alignment Problem from a Deep Learning Perspective (Ngo) arxiv.org/abs/2209.00626
* Superintelligence: Paths, Dangers, Strategies (Bostrom) www.amazon.co.uk/Superintelligence-Dangers-Strategies-Nick-Bostrom/dp/0199678111
@jordan13589 (6 months ago)
Neel’s facial expressions and hand gestures are 10/10 on the likability scale 🤗
@MachineLearningStreetTalk (6 months ago)
Some Quotes:
"The empirical question of whether language models do this and the theoretical question of could they do this are two different things. In my view, the theoretical question is nonsense."
"Models can be thought of as ensembles of shallow paths, with a trade-off between having more computation and better memory bandwidth."
"Models, in my perspective, have linear representations more than geometric representations."
"The model does not align features with neurons and superposition is a mechanistic hypothesis for why both of these phenomena occur."
"An ensemble of shallow paths is a good way to think about models, and there's a trade-off between having more computation and better memory bandwidth."
"Emergence is when things happen suddenly during training and go from not being there to being there fairly rapidly in a non-convex way, rather than gradually developing."
"Language models predict the next token. They learn effective algorithms for doing this within the constraints of what is natural to represent within transformer layers."
"The key thing to be careful of when probing is, is your probe doing the computation, or does the model genuinely have this represented?"
"A lot of my motivation for this work comes from I care a lot about risk and alignment and how to make these systems good for the world."
"There's lots of things that a sufficiently capable model could do that might be pretty destabilizing to society."
"I guess I mostly just have the position of --man! It sure is kind of concerning that we have these systems that could potentially pose risks, but you don't know what they do and decide to deploy them."
"I really want a better and more scientific understanding of emergence. Why does that happen? Really understanding particularly notable case studies of it."
"I believe emergence is often underlain by the model learning some specific circuit or some small family of circuits in a fairly sudden phase transition that enables this overall emergent thing."
@sandropollastrini2707 (6 months ago)
For me this is in the top ten interviews ever. Dense of interesting information, a lot to listen and understand, and definitely some different from usual points of view.
@nowithinkyouknowyourewrong8675 (6 months ago)
Yeah, and great production
@ArchonExMachina (6 months ago)
The topic is certainly of utmost importance in the current state of AI technology, where we have a breakthrough scheme with surprising results that relies on "spooky emergence".
@bgtyhnmju7 (4 months ago)
Four hours, and I coulda listened for twice as long. Some great ideas, discussions, and perspectives. And some good humour. Thanks for the video.
@ConceptsWithCode (6 months ago)
As others have already mentioned, this is one of the most substantive interviews to date on how we can interpret what is happening inside a neural network. A striking aspect of this interview, in addition to the valuable insights into this 'inscrutable pile of linear algebra' (Neel's alias for a neural network), is the brutal honesty that, in the end, we may never be able to fully interpret what is happening, but at least we need to try. Lastly, whether you struggle or breeze through it, this is one of those lectures that feels well worth the time. Thank you for sharing this conversation.
@manslaughterinc.9135 (6 months ago)
And they didn't say Black Box once!
@AICoffeeBreak (5 months ago)
The birds chirping in the background are offering such a relaxing listening experience! 😊
@cs-vk4rn (1 month ago)
One of my all time favorite interviews! Thank you!
@oncedidactic (6 months ago)
In love with everything Neel* has to say 30 minutes in- can’t believe there’s so much more to enjoy!!
@alexijohansen (6 months ago)
This was a great deep dive, letting Neel just share/riff… fantastic. I learned so much. As for the Robert Miles interview I remember having the impression that he was pushed much harder than other guests to Justify his views, which seemed unfair. That said, I am super grateful for your work, it’s fantastic… this channel is absolutely my favorite, please don’t take any criticism too seriously.
@proximo08 (6 months ago)
I think ill have to rewatch that interview many times
@RickeyBowers (6 months ago)
Thank you so much for the effort - these long streams help get me into the headspace!
@gulllars4620 (3 months ago)
First off, kudos for hosting such a great podcast; this interview in particular, I feel, shows the depth of the interviewer. I'm late in catching up on some of the episodes on this channel, but I've got to say this was a great episode. Neel can talk concisely and informatively, in a way that is easy to follow, about pretty heavy interpretability concepts. I'm definitely going to go check him out for more stuff. For example, the clean summaries of grokking vs. general phase transitions, with the cleanup being critical, or superposition of infrequent or composite concepts in both knowledge and computation. Also the concept of potentially universal learned circuits (or families of such) being composited and combined, including induction heads in attention for in-context learning. Just so much stuff in here that 4 hours didn't feel long at all. If anything, the final hour or so about AGI/ASI risk and x-risk was a bit less dense, but still a good section to listen to. The first 3 hours are incredibly dense with information.
@diga4696 (6 months ago)
Thank you for reading your YouTube comments, and thank you for such a great video and such great content. Cherished every minute of Tim and Neel sharing their worldly perspectives with us in these turbulent times. Up next, more amazing world views from Prof. Friston & Dr. Wolfram - better make some popcorn!
@ArchonExMachina (6 months ago)
I also love reading the comments here, and I think this is one of the best communities. Very little spam, and much enthusiasm. It's also great that the comments are read and responded to by the team, and that feedback and inspiration are taken from them.
@stevengill1736 (6 months ago)
Agreed....cheers!
@mallow610 (6 months ago)
4 hrs?! I need my popcorn
@ArchonExMachina (6 months ago)
I would LOVE it if MLST could do some sort of intermittent little "knowledge bomb" videos here and there for those of us engaged in separate fields but "morbidly curious" about the subject. Perhaps explain, every now and then, some term or concept that might not be familiar to all of us. Even curation, linking or reposting some seminal, well-formed key lectures on the topic, would be highly appreciated in this vast landscape of knowledge and ideas.
@AICoffeeBreak (5 months ago)
1:32:00 Loved the point that, given a bounded context length, there is still a finite number of inputs a model can receive, so philosophically LLMs are not very different from AlphaGo.
@_obdo_ (6 months ago)
Great conversation - well done!
@mlock1000 (6 months ago)
"But... non-linear probing is *particularly* sketchy" - so good. Neel Nanda's thinking about all this is so measured, informed, and logically sound. That combined with being deeply insightful and still willing to embrace the beauty and wonder and go searching for it is such a breath of fresh air. Thanks for this, such a fascinating subject. Has the potential for shedding light on some pretty heavy stuff, this is truly undiscovered country. I'd argue we can put goals in, we just can't know what goal we are encoding. And it won't be whatever goal you/we think it is. The geometric structures people found made my hair stand on end when I stumbled across the work. The toy models paper is one of the coolest things of all time, wraps up so many subjects in one neat experiment and being able to run the code is too good. If it helps anyone else, two personal rules that help me as I try to understand all this: Whatever you think it's doing, it's probably doing something different. Whatever you think it is, it's probably something else.
@alertbri (6 months ago)
Best intro yet 🙏👍
@ShawhinTalebi (6 months ago)
This is awesome! I didn’t know mechanistic interpretability was a thing, but it satisfies the physicist in me 😁
@SjS_blue (3 months ago)
What a marathon. Listened to the whole thing on a country drive and was not bored.
@M0481 (5 months ago)
I love how Neel is able to achieve these amazing results and then refer to them as "cute" xD. Great interview, learned a lot!
@simonstrandgaard5503 (5 months ago)
Great interview
@paxdriver (6 months ago)
Algo bumps by comment engagements are my fist bumps.
@oncedidactic (6 months ago)
Verily! Engagement promulgated
@Robert_McGarry_Poems (6 months ago)
I think I follow, but some of the jargon is over my head. You did a great job trying to mitigate that; it's my own lack of knowledge of the actual terms used in the industry. A perfect dialectics-based conversation.
@Robert_McGarry_Poems (6 months ago)
This conversation makes me think of the "autistic agent" I tried to describe. Very very interesting stuff.
@isajoha9962 (6 months ago)
A very interesting and more in-depth viewpoint on analysing neural networks, LLMs, etc. Thanks. 👍
@isajoha9962 (6 months ago)
Kind of more fascinating that LLMs can't handle a limited, tiny prompt context and lose their way, when they can manage so much around the context. 🤔😅
@FranAbenza (6 months ago)
To talk about a periodic table of circuits reminds me a bit of the RNA World Hypothesis: basic structures that don't dissipate, but rather serve as building blocks of more complex structures (1:05:24). It also makes me think of a fuzzy relation between knot theory and the topology created by NNs, with the main difference that these structures are probabilistic/noisy, in contrast to the platonically perfect topologies found in the knot theory periodic table.
@billyf3346 (6 months ago)
The one question I want to ask Neel that I don't believe gets addressed in this interview: "If everyone switched over from transformer models to neuro-symbolic models from the beginning, would that just automatically render all known mech interp trivial and solved, or would there still be remaining mysteries? And does Neel think mech interp and neuro-symbolic research could work hand in hand toward the same final goals in the end?" I hope I can get this answered. Thanks. ☮
@Niohimself (5 months ago)
It's an interesting question. The field is definitely missing out on more neuro-symbolic research. I can only provide a historical note: for purely symbolic systems - automated theorem provers - the results might still not be interpretable. For example, in 2016 a supercomputer was used to automatically generate a proof (a mathematically rigorous and precise explanation, which you can read and understand as a human, of why something must, by logic, be true) for the problem of Boolean Pythagorean triples. The proof is in the form of text that you can read to know what the prover program is thinking and why it's thinking it, every step of the way. The catch? This proof is 200 TB in size. It would take several lifetimes for a human to even read that text. It is completely inscrutable to us mortals. Why is that a problem? Since it is a difficult problem that doesn't yield easily to common ways of solving it, mathematicians hoped it would eventually be solved by more clever, more powerful techniques than we have now. Well, now the "AI" solved it, and we don't know how it did it, despite every nut and bolt in the system being completely observable. It may be that there is no clever scheme here and the prover simply brute-forced every single possibility. The answer is somewhere in that proof, which we can't meaningfully read because it's so humongous.
@u2b83 (6 months ago)
I've always wondered how to prevent "sub-circuits" from being overwritten. I can imagine some kind of gradient gating, or separate loss function, dedicated for maintaining these sub-circuits. I wonder for really large models if sufficiently large context is an effective gradient gating mechanism.
@knowledgepower3558 (4 months ago)
I woke up to hearing this conversation and it had me dreaming some crazy shit 😂. Y'all have a new subscriber; eventually I'll figure out what y'all are talking about 😊
@willd1mindmind639 (6 months ago)
Humans understand language as a dynamic, recurrent context with multiple potential vectors that can overlap and be used as a kind of "abstract model" in which language functions. For example, if you say "Is Paris in", that would be interpreted as having many potential contexts based on the specific sequence of tokens used, where the contexts apply to each token in the sentence: "Is" meaning "exists", "exists at a certain location", "exists in a certain state" and so forth; "Paris" could be person/place/thing; and "in" could be a spatial or logical reference. Not to mention each token has a context of whether it is a legitimate word, meaning it exists in a dictionary as part of a given language, and whether the sequence of tokens follows a valid set of grammar rules, which is what makes words not equal to tokens in the sense of language understanding. (I.e., if a language model is trained on lorem ipsum texts, it only has a model of lorem ipsum patterns, which is not a knowledge of Latin.)

That kind of dynamic context is not found in large language models, because simply predicting the next token is not the same as internally generating the correct context corresponding to the sequence of words given. This is part of what is measured by reading comprehension tests and, in a more advanced sense, logic tests. Another good example of human language context is that most people have their own internal dictionary of terms, ad hoc and based on a general ability to convey the meaning of a specific word, which does not rely on rote memorization of a specific dictionary text. That context is also not covered by token prediction algorithms. Dictionaries themselves express this idea of multiple contexts being dynamically evaluated and updated at once, in that any word in a dictionary can have multiple meanings.

Part of the problem with large language models is that many of the designers refuse to measure these systems with these more rigorous types of evaluations. The whole point of large language models was originally a research endeavor: to understand how these systems behave as more and more data is given to them, and to quantify the actual side effects, behaviors, benefits and functioning of such models and how they can be used for solving specific types of problems. However, since these efforts cost so much money, most of this work is now privatized, and those side effects and behaviors are waved off, as if there were no need to go into detail about the issues such research should be revealing. Not to mention the resulting systems are built on a flawed a priori theory: that being given random text from a near-limitless source will make a near-omniscient program that transcends simple functions such as plain language. That is false. First, language itself is a model that has to be learned; second, learning a language does not automatically impart advanced knowledge of the world or anything else beyond language. Just because somebody knows French doesn't mean they know advanced chemistry. That is simply not how it works in humans. A more reasonable test of any kind of language model should therefore stay strictly within the domain of language and language-related functionality before going into more esoteric domains.

So, as mentioned before: if I train a language model on specific texts, can that model reliably and accurately answer questions about those texts? And, more critically, how much training data is required to achieve a certain degree of accuracy? Similarly, what would the data and compute cost be for a language model that can reliably translate between languages, and how do the complexity and cost increase as more languages are added? Is there any loss of semantics or comprehension as the number of languages goes up?
@bobopokomono-nu3gv (2 months ago)
Excellent interview, amazing discoveries. We need more science and fewer commercial applications of AI!
@joebowbeer (6 months ago)
10:05 a nun's first curry
@oncedidactic (6 months ago)
Same 😂
@JM-xd9ze (6 months ago)
Thanks for putting this together. I can't help but feel that the average earth citizen would be shocked to know that the best and brightest in the field admit there's a reasonable chance that AI will be a terminal event in the next decade, give or take. Most people have NO idea.
@_ARCATEC_ (5 months ago)
Position in the Rotation of a monoid Group or Monster Group.
@ArchonExMachina (6 months ago)
2:26:48 "Tokenizers are f*d". I have been thinking this as well. I just have this feeling that it would make way more sense if tokens would be at the word level. A part of a word as a token seems like noise in the system. You give meaning, an emphasis to something that is arbitrarily cut from a larger whole, and actually does not have proper meaning in itself. Then, as he describes, things, meanings, connections get built around that, and I would guess in clunky ways. It seems some way a bit nasty, like a "hack". I guess it originates from performance aspects of encoding tokens in a small format. I would bet it plays a part in the bad understandability of what these models do. I will at some point attempt getting into this stuff in the technical sense when I have time, currently just curious. If I build my own little model tech, I will attempt the word-level tokenizing. Again, someone might shoot my thoughts right off as wrong, and that is fine also, I genuinely have no clue atm, just hunches by my own tech experience as a dev.
@cs-vk4rn (1 month ago)
At 3:08:23, Neel Nanda's article on getting started as a mech interp researcher is referenced, and it's mentioned it will be in the description, but I do not see it there. I'm very interested and would love to read that post. Can you please link it?
@MachineLearningStreetTalk (1 month ago)
www.neelnanda.io/mechanistic-interpretability/quickstart
@cs-vk4rn (1 month ago)
@MachineLearningStreetTalk Wonderful! Thanks! Perhaps you'll see some interpretability papers from me in the future, and know you've been a key inspiration 😌
@churde (6 months ago)
The subtitles butchered "mech interp" the whole time; I'm like, who the f is this Mcinturff guy?? Awesome episode once again!
@alexijohansen (6 months ago)
Thanks for the great show. I would also like to take this opportunity to point you to Michael Levin, an American biologist whose work is fascinating and has a lot of overlap with the discussions you're having. Maybe an interesting match.
@MachineLearningStreetTalk (6 months ago)
Thanks! We have had Michael on, but will do again soon I'm sure 🙏
@DeruwynArchmage (6 months ago)
I really feel like the transformer is insufficient to properly represent everything we want to represent. Some things *are* recursive in nature. You can’t just unroll the loop all the way. I feel like we need a heterogeneous network where some of it is transformers and some of it is other things. For example, arbitrarily deep mathematics, even addition with a lot of digits can’t be fully represented in a linear network. You need the ability to loop. I also feel like the network kind of needs to be allowed to maintain an internal state, kind of an inner monologue where it can think about something for a bit before outputting the next token. On the bright side, this internal monologue could be embeddings so that it could be translated into English and we could actually get some insight into what’s going on inside.
@tanmaykhattar3489 (6 months ago)
What is the name of the music in the intro?
@MixedRealityMusician (6 months ago)
Flight of the Bumblebee by Nikolai Rimsky-Korsakov
@oncedidactic (6 months ago)
Totally respect and relish Neel's pragmatic slant. But, it is easy to dismiss philosophy until you need it.
@MatthewKowalskiLuminosity (6 months ago)
I will help no worries. :)
@stevengill1736 (6 months ago)
Yes, on the Earth today... there was a time I felt like I was at the leading edge of things, but now I'm a septuagenarian, and my 80 kilohours are just about up. But I like The Long Now Foundation, and it looks like you youngsters have things well in hand.... hopefully... ;*[}
@matthewpublikum3114 (6 months ago)
If there are similar circuits within deeper layers, and access to residual streams, doesn't that indicate transformers can be recursive?
@FranAbenza (6 months ago)
kinda fractal
@orterves (6 months ago)
Even if everything was perfect and every one was healthy and happy, and it was going to be that way for millions of years - at some point, it all comes to an end. You can't beat thermodynamics, at some point entropy wins. What does it matter if that happens in a million million years, or 100? What need do you really have to invent a new religion around "effective altruism" and "long term good"? (the reasons are at least apparent when the most effective altruism is of course to give rich people our money but I digress) All we can truly do, is strive to be our best selves at all times and otherwise live until we die.
@KevinKreger (6 months ago)
I think it needs some math or signal processing tools while it's training so it doesn't have to re-invent the sine table and DFT every time.
@_ARCATEC_ (5 months ago)
1:08:00
@abaybektursun (6 months ago)
Why such heavy video filter?
@siarez (6 months ago)
I didn't get "the knowledge is based" joke @2:15:10
@zerothprinciples (6 months ago)
1. Learning language and facts is separable from learning behaviors (the latter is supervised learning + RLHF).
2. Animals and humans have been shaped by evolution in a Darwinian competition and are therefore competitive, hoarding, and self-preserving. This is conserved as instincts in our DNA.
3. Learning about racism doesn't make you a racist. Learning racist behavior from family and peers makes you a racist, because it matches the built-in prejudices in your DNA.
4. LLMs and AIs have not evolved. They are created by intelligent design (researchers and engineers) and therefore have inherently neutral, non-competitive behaviors.
5. We cannot guarantee non-evil, non-racist children, because they have their instincts and DNA. But we can guarantee that our AIs learn any behavior we want, and nothing else in the way of behaviors.
6. Evil humans abusing AI is still a problem. The article on my Substack called "AI Alignment is Trivial" hints at one strategy.
7. In a future Substack article I'll discuss more directly how AI can moderate Moloch.
@Achrononmaster (6 months ago)
@12:30 "semantics"? A lot of random junk thrown together "artfully" (linalg stuff) and trained can implement an exact algorithm, say, 80% of the time. So 80% of the time it is interpretable, and even discoverably so --- but by whom? By us! You need a mind to interpret junk, or art, or whatever. Once we interpret, a ChatGPT then can too, but not in the same qualia-filled way as us.
@whemmakatatt5311 (6 months ago)
When YouTube turns into GPT-6
@user-bs9wq1lk4o (6 months ago)
OTHELLO is a game - why do you not get it right?
@palfers1 (2 months ago)
In higher dimensions, most vectors are mutually (nearly) orthogonal. You guys don't seem to appreciate this.
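The comment's claim takes only a few lines of numpy to check. This quick, illustrative sketch samples random unit vectors and measures their pairwise cosine similarities as the dimension grows:

```python
# Random unit vectors become nearly orthogonal as dimension grows --
# the geometric fact that makes superposition viable. Illustrative demo.
import numpy as np

rng = np.random.default_rng(0)
for d in (2, 10, 100, 10_000):
    v = rng.normal(size=(1000, d))
    v /= np.linalg.norm(v, axis=1, keepdims=True)
    cos = v @ v.T                            # pairwise cosine similarities
    off_diag = cos[~np.eye(1000, dtype=bool)]
    print(f"d={d:>6}: mean |cos| = {np.abs(off_diag).mean():.3f}")
# Typical overlap shrinks like 1/sqrt(d): roughly 0.64 in 2-d,
# roughly 0.008 in 10,000-d.
```

That near-orthogonality is exactly what the superposition discussion in the episode leans on: a d-dimensional space has only d exactly-orthogonal directions, but exponentially many almost-orthogonal ones.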
@alertbri (6 months ago)
🙋
@jordan13589 (6 months ago)
Glad to see Neel push back on your tired x-risk takes (little prestige or money is attached to AI risk). Here Tim, you dropped this: (e/acc)
@discipleofschaub4792 (6 months ago)
Are you kidding? He even got a sponsorship playing up the x-risk tripe at the start of the video. A lot of money is attached to playing up these sci-fi risks.
@didack1419 (6 months ago)
I'm not a fan of arguing for conspiracies on this stuff, given that many of these people were consistently arguing for the possibility of AI x-risks years before OpenAI and Anthropic emerged or got all the money they have. And now you also have people like Hinton, Bengio and Andrew Yao signing letters on AI risks, who are not working in any for-profit organisations or linked to EA, and who are also putting themselves in an unpopular faction.
@durden91tyler (6 months ago)
Mindblowing how small the audience for this channel is. And infuriating. Where are all of you in real life?
@retrofuturism (6 months ago)
Please don't put these effects and borders on in post production. It makes it hard to watch
@MachineLearningStreetTalk (6 months ago)
Sorry, I massively overexposed everything on the day. The footage was almost unwatchable without processing it like this. I learned a lot from this experience!
@akmonra (6 months ago)
@MachineLearningStreetTalk Haha, I was thinking "Neel looks like he's been painted"
@benshums (6 months ago)
Ah cool. Trippy.
@ArchonExMachina (6 months ago)
@MachineLearningStreetTalk I thought it looked kinda cool, and I only listen during exercise anyways.
@matt0sai (6 months ago)
Not a problem here, watched multiple times and only appreciate your hard work! Ganbatte!
@BingiQuinn (6 months ago)
The interview is amazing, but the video stylization is jarring and really makes it ugly. Again, great interview though
@kabirkumar5815 (3 months ago)
Please don't add these filtering effects.
@nixedgaming (6 months ago)
Sam Harris 🙄🤢🤮
@oncedidactic (6 months ago)
That sonorous expository drawl tho
@tinypenises7159 (6 months ago)
mechanistic interoperability of effective altruism "and ... yeah"