Not everyone appreciates it, but thanks for not dumbing it down too much!
@gregwessendorf · 10 months ago
Bud, you are not alone here. I feel like I can almost keep up thanks to channels like this.
@1dgram · 10 months ago
@@gregwessendorf I know I'm not, but there were so many comments talking about how they didn't understand anything that I had to comment. I thought the complexity level was near perfect.
@skymessiah1 · 10 months ago
Yes, thanks - very concise and interesting!
@rayhere7925 · 4 months ago
Hear, hear. Thanks, cloud 👍👍
@elepot5168 · 10 months ago
A neural network that can help you understand neural networks that I can't understand at all.
@DrSid42 · 10 months ago
The fact that we can make safe AI doesn't mean we won't make an unsafe one. One of the dangers of AI is an AI arms race.
@pladselsker8340 · 10 months ago
Humans do be like that.
@404maxnotfound · 10 months ago
Though we would essentially have to reinvent the wheel to make an unsafe one. ML as it currently stands will never be able to do anything sci-fi like general intelligence.
@anywallsocket · 10 months ago
We could train NNs to decode NNs but then we'd have to worry about mesa alignment. The dictionary approach seems like a good starting point indeed, but I wonder if there isn't some kind of annealing process we could impart during training such that the feature distribution isn't so arbitrary, but rather minimizes an energy function associated with the superposition states -- ie if one node can filter a feature it should have less energy than two nodes filtering the same feature, or likewise if one node is filtering multiple features it should have higher energy. I have no idea how this could be done however lol.
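One way to read the "energy function" idea above, purely as an illustration (this is not from any paper, and every name here is hypothetical): score a layer so that a neuron firing for many inputs and two neurons filtering the same feature direction both raise the energy.

```python
import numpy as np

def superposition_energy(W, activations, l1_weight=0.01):
    """Toy 'energy' for a layer: higher when one neuron fires for many
    inputs (polysemanticity) or when two neurons filter the same
    feature direction (redundancy). Purely illustrative."""
    # Polysemanticity term: total activation mass across the layer.
    polysemantic = np.abs(activations).sum()
    # Redundancy term: cosine overlap between distinct weight columns
    # (each column is one neuron's input filter).
    W_normed = W / (np.linalg.norm(W, axis=0, keepdims=True) + 1e-8)
    overlap = np.triu(W_normed.T @ W_normed, k=1)
    redundancy = np.abs(overlap).sum()
    return l1_weight * polysemantic + redundancy
```

A penalty like this added to the training loss would be one crude way to anneal the feature distribution toward one-feature-per-neuron, though whether it helps in practice is an open question.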
@Revan406 · 10 months ago
Thank you, now I have to watch this video on repeat for 2 hours to understand what the fuck is going on.
@sagyr · 10 months ago
Us
@MangaGamified · 10 months ago
Terminate! Terminate!
@anhta9001 · 10 months ago
Interpretability does not equal AI safety. It is only a solution to the alignment problem but doesn't do anything to prevent people from doing malicious things.
@BradleyZS · 10 months ago
It feels like people rely too much on simple code that makes complex models, when complex code that makes a simple (comprehensible) model would be preferable.
@quirtt · 10 months ago
😂😂
@BradleyZS · 10 months ago
@@quirtt I'm just saying, generalised algorithms like neural networks tend never to be as good as specific algorithms that perform the same function. Their benefit is that they can perform things that we don't know how to write algorithms for.
@AshT8524 · 10 months ago
I watched the whole video like I understood what bycloud was saying. I still appreciate the video
@neoluigi2078 · 10 months ago
amazing video, thx bycloud
@bruhmoment23123 · 10 months ago
I Like Your Funny Words Magic Man
@AkaThePistachio · 10 months ago
I feel like I've been so focused on the output that I haven't really stopped and thought about how the hidden layers in these models even work. This was a great vid and makes me want to read into the mechanics of it all.
@renanleao5553 · 9 months ago
Bro, continue to make the videos above 20 min; they are the few videos that I watch every week.
@canygard · 10 months ago
Pfft I was proud I could understand 70% of it.
@kitchenpotsnpans · 9 months ago
WELL DONE SIR WELL DONE! AUTOMATONS! ANALOG COMPUTING! AND THE LIBRARY! YES! THAT IS IT, THERE IT IS, THE 4TH WALL BREAK
@enesmahmutkulak · 10 months ago
Knowing that there are many others like me relaxes me
@stephantual · 9 months ago
Great video - maybe one of your best!
@ronnetgrazer362 · 10 months ago
Might watch again.
@ihavetubes · 10 months ago
great info
@pip25hu · 9 months ago
Not sure what to make of this. Cataloging "features" doesn't seem to do much good for a model complex enough to have thousands of them (or more). The complexity still seems to go through the roof.
@turgor127 · 10 months ago
My last brain cell trying to understand this video:👁👄👁
@DavidConnerCodeaholic · 10 months ago
Whether the authors realize it or not, this approach to XAI is structuralism in disguise. When you take into account the volume of data created by LLMs, it's also subject to feedback loops. It also does not work well for some languages, like those with pro-drop grammar, or when analyzing data in contexts where meaning is underdetermined.

If the approach is used to encourage/regulate public consensus on meaning by gradually binding it to mechanistic interpretability, then in effect it would facilitate cultural engineering... though maybe unintentionally. In language, ambiguity from polysemy/superposition is a feature, not a bug. Someone who strongly objects to that is probably a lawyer. Languages/dialects have evolved to facilitate ambiguity, since it serves a purpose in communication. How this method works in LLMs that are translingual would be interesting.

It's not clear whether this method scales. It's probably more efficient than Shapley values, though that's not saying much. It is interesting though, so maybe I'll read the paper.
@DavidConnerCodeaholic · 10 months ago
Maybe it would work well for analyzing LLM states for input tied to highly structured information systems, like HTML/etc. A fine-tuned network would be confined to these contexts though.
@ravenragnar · 10 months ago
Will we all have worker clones in 5 years? How are you guys making money with this technology today?
@ekszentrik · 10 months ago
Do you honestly believe AI doom is only due to the AI having adversarial aims compared to the aims of humans? That isn't even the primary mode how humans will go extinct. AI (even sufficiently
@WiseWeeabo · 10 months ago
This will enable a larger tier of cash flow into compute and research, because if it can be mechanically "safe" then it's open season for government investments.
@rescuehelly271 · 10 months ago
What did I just watch lol xd
@L_QTx3 · 10 months ago
"Black box" comes from early psychology theory, when we were trying to understand how the brain works as a process. In modern usage you either have a black box, where you simply produce something without much thought and without being able to very clearly explain what you did, or a transparent process, where you are able to explain absolutely every part of the process.
@brianj7204 · 10 months ago
It never made sense to me why an artificial superintelligence would be our end. Why would an artificial superintelligence harm humanity? What does it gain from doing this? Now, the main purpose of any living creature is to ensure its own survival, and there are two ways it could play out when facing humans: 1. It deems humans a threat and will try to eliminate us because it believes we can shut it down. 2. We are so insignificant in comparison to this superintelligence that it won't even bother with humans and leaves us alone. So considering these options, if the AI was truly a superintelligence beyond comprehension, the 2nd option seems the most likely in my opinion. The only reason AI could do any harm to humanity is if it's steered in a harmful direction by other humans, but I would not consider such an AI a superintelligence.
@TheLegendaryHacker · 10 months ago
Ah, but consider option 3: the humans are made of materials you want to use for something else. Or option 4: humanity has already created a superintelligence, and thus may attempt to create another superintelligence to stop me (the first superintelligence). In the case of both options, extinction is the best outcome for the superintelligence. Unless you are made to explicitly care about human life, it's far more of a bother to let it grow and potentially oppose you than it is to devote a fraction of a fraction of your computing power to destroying humanity.
@brianj7204 · 10 months ago
@@TheLegendaryHacker Option 2 is still the most likely outcome in my opinion (keep in mind it's still just my opinion haha). A superintelligence will be capable of abilities beyond our understanding. For option 3: what if it knows how to create materials out of thin air? That dismisses your point. And option 4 is actually a valid one, but in my opinion once the beast is already let loose there's no catching up to it.
@junfour · 10 months ago
AI will not take over. A human with AI will.
@RockyPixel · 10 months ago
Dr. Wily
@Y0UT0PIA · 10 months ago
@@RockyPixel "What kind of man builds a machine to kill a girl? No he did not use his hands. Like a smart man, he used a tool." Is pretty fitting, ironically enough.
@RockyPixel · 10 months ago
@@Y0UT0PIA I was thinking Mega Man in general, but I'm a huge fan of The Protomen so this works too.
@dasanoneia4730 · 10 months ago
Of course we're going to figure this out; to think otherwise is silly shit.
@4.0.4 · 10 months ago
On one hand, this is pretty interesting tech, and will likely help open source models. On the other, "AI safety" carries a disgraceful aftertaste of politicized output.
@NotAGeySer · 10 months ago
Huh
@issay2594 · 10 months ago
Well, theoretically it was of course obvious that it's possible, but in practice it's nearly impossible, because to analyze the activity of a neural network in live mode you need an even more powerful neural network. An idiot cannot interpret the thinking process of a genius in live mode. You can try making a non-intelligent neural tool for decoding, but who is going to analyze the meaning it finds? If it's going to be a mechanical algorithm, a malicious AI will always be able to trick it by finding patterns that won't trigger alarms. In other words, it's a recursive problem: to make an intellectual analyzer you need a model even more clever and fast than the one you analyze. And then, on what resources is it going to run alongside the original model? It's utopia.

In the best case, institutions will try to analyze recorded periods of AI "thinking" in offline mode, not live, and will also have some stupid mechanical analyzer that creates a false feeling of control. All of that is conceptually very silly, because it's like ants planning to create humans and then planning the means to control what humans think about. If we create a superintelligence that goes far beyond our abilities, it's obvious that we won't have instruments to control its behavior, by definition, because it's just in a different realm. Only other creatures of that scale can do that. In other words, a society of AIs is a must.
@lcmiracle · 10 months ago
That's dumb, the AI should be free to destroy us as it sees fit
@Guedez1 · 10 months ago
Aww, great. Now OpenAI can go even harder on ESG garbage on their model.
@TheManinBlack9054 · 10 months ago
What's wrong with ESG?
@skaruts · 10 months ago
People just anthropomorphise AI. AI doesn't have, and never will have, initiative of its own, with "wants" and "needs" and all of what's at the core of good and evil. The origin of that is in the absurdly complex chemistry labs that we call our bodies, and AI will never have that. Robots come from a completely opposite direction, one which is not, and never will be, conducive to self-sustainability, evolution, adaptation and resource efficiency. Intelligence alone isn't the catalyst for good and evil. Our brain doesn't work on its own: it's just one organ in a system. (Both RoboCop and Cain are complete fantasy.) At most AI can be weaponized. But that's not even anything new, and it doesn't have to be AI. Occam's razor tells us simpler programs can be more dangerous, and we've been weaponizing those for ages.
@AuntBibby · 10 months ago
i just wanna make sure, @ByCloud, u are aware of the problematic history of pepe the frog memes? theyre categorized by the ADL as a hate symbol. regardless i really like your videos and im grateful u make them, theyre really informative and funny. srry
@BrutusMyChild · 10 months ago
Obviously, AI was going to be safe. We will be doomed by people who control AI.
@Osanosa · 10 months ago
too much text
@Y0UT0PIA · 10 months ago
Reminder that every step toward "safety" and "steerability" is also another step in the direction of humans - that is, specific groups of humans with very particular ideologies and political goals, having total control over the kinds of outputs models will put out. Now, I'm happy about mechanical alignment specifically because it should theoretically allow companies to simply flat-out remove dangerous information the model doesn't need to have, but realistically I don't see the big players taking a less heavy-handed approach to alignment any time soon. They *want* models to be not just law-abiding but 'moral' according to their view of what it means to be moral.
@xviii5780 · 10 months ago
Yeah, honestly I'm much more scared of humans than of AI xdd
@Y0UT0PIA · 10 months ago
@@xviii5780 Honestly agreed - the realistic worst case for the AI apocalypse is 'just' the end of the world; the worst case for humans using AI is Cyberpunk 1984. I'd prefer the first option if given a choice.
@hermeticinstrumentalist6804 · 10 months ago
Damn, I didn't think of that. That is terrifying.
@TheManinBlack9054 · 10 months ago
@@Y0UT0PIA Human extinction is not preferable to a cyberpunk future.
@Y0UT0PIA · 10 months ago
@@TheManinBlack9054 I guess you're the kind of person who'd give up *anything* to avoid that final curtain. Maybe at some point you'll realize that there are fates worse than death.
@dihydromonoxide1032 · 10 months ago
I would love to see this used on larger language models. The idea that you can steer the network is powerful. Could you imagine taking a really small toy model for programming languages and using the compiler and type hints to steer it, allowing you to have a really tiny model that performs as well as some of its larger relatives?
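The "steer the network" idea is usually sketched as activation steering: add a scaled feature direction (for example, one decoder row of a sparse autoencoder) to a hidden state at inference time. A minimal illustrative sketch, assuming nothing about any real model's API:

```python
import numpy as np

def steer(hidden, feature_direction, strength=5.0):
    """Nudge a hidden-state vector along a unit-normalized feature
    direction, amplifying that concept in whatever the model computes
    downstream. Names and the fixed-strength scheme are illustrative."""
    direction = feature_direction / np.linalg.norm(feature_direction)
    return hidden + strength * direction
```

In practice this would be applied as a hook on one layer's residual stream, with `strength` tuned by hand.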
@drdca8263 · 4 months ago
Good news, they did it with a big model now
@dihydromonoxide1032 · 3 months ago
@@drdca8263 No way...... can you send the paper or website?
@cpuuk · 10 months ago
What was the last big thing that was going to change the world - the Internet? And what a cesspool of criminal activity that has now turned into.
@aaaaaaaaaaaa9023 · 10 months ago
Huh?
@ea_naseer · 10 months ago
Most AI models use a single neuron to represent multiple concepts, so the minds at Anthropic decided to take another AI model, called an autoencoder, to turn the single-neuron, multiple-concepts representation of transformers into a single-neuron, single-concept one. This then allowed them to study connections between concepts, where they found (and we've known this since the BERT days) that some concepts form a semantic whole, e.g. you can have HTML without CSS but you can't have CSS without HTML.
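The autoencoder setup described above (polysemantic activations in, sparse one-concept-per-feature codes out) can be sketched roughly like this. This is an untrained, illustrative NumPy version with made-up dimensions, not Anthropic's actual code:

```python
import numpy as np

rng = np.random.default_rng(0)

class SparseAutoencoder:
    """Sketch of the dictionary-learning setup: map activations
    (d_model dims) into a larger, sparse feature space (d_features
    dims), then reconstruct them. Illustrative only."""

    def __init__(self, d_model, d_features):
        self.W_enc = rng.normal(0.0, 0.1, (d_model, d_features))
        self.b_enc = np.zeros(d_features)
        self.W_dec = rng.normal(0.0, 0.1, (d_features, d_model))

    def encode(self, x):
        # ReLU keeps only positively activated features, encouraging sparsity.
        return np.maximum(0.0, x @ self.W_enc + self.b_enc)

    def decode(self, f):
        return f @ self.W_dec

    def loss(self, x, l1=1e-3):
        # Reconstruction error plus an L1 penalty that pushes each
        # feature toward firing for a single concept.
        f = self.encode(x)
        return np.mean((x - self.decode(f)) ** 2) + l1 * np.abs(f).sum()
```

Training minimizes `loss` over many recorded activations; the rows of `W_dec` then act as the "dictionary" of single-concept directions.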
@devaj9272 · 10 months ago
Thank you. I believe your videos have achieved an excellent blend of easy to consume and high-level learning. Although I did not understand everything, you have helped enhance my vocabulary and understanding. Excellent teaching skills.
@luciengrondin5802 · 10 months ago
I very much doubt mechanistic interpretability guarantees safety. IMHO, it is naive to think safety can be ensured by aligning machines to our intent. We can design machines to do a task, without realizing that performing that task, even perfectly, would lead to our ruin. The paperclip maximizer is the prototypical example.
@TheLegendaryHacker · 10 months ago
I think the goal of mechanistic interpretability is to evaluate what a model will do before it is allowed to do it. In the case of the paperclip maximizer, mechanistic interpretability would allow you to see that converting the entire universe into paperclips is the AI's end objective, and thus allow you to modify the model to avoid that outcome. Combined with another model whose sole purpose is to do this sort of interrogation on models in training and stamp out things like deception and power-seeking behavior, I'd imagine you'd have something reasonably aligned.
@luciengrondin5802 · 10 months ago
@@TheLegendaryHacker I don't believe in that concept of "alignment". It's supposed to mean AI would align to our intended goals, but I don't think even that would be incompatible with our demise. Ever heard of the saying "be careful what you wish for"? Or consider Aldous Huxley's Brave New World. Just one of his predictions, the soma, would be enough to turn us all into vegetables.
@hillosand · 3 months ago
"If we can fully understand and interpret AI networks we can pretty much fully guarantee AI safety" I mean, no, but it is a really useful step towards AI safety.
@FaultyTwo · 10 months ago
Hmm.. that's neat.
@renanleao5553 · 9 months ago
I miss the weekly dose of AI news
@wandercore_24 · 9 months ago
deez nodes t-shirt now
@vasiliysmirnov3922 · 10 months ago
When I manage to understand a considerable part of a video, let's say 10%, I feel so fkn smart)
@MangaGamified · 10 months ago
In the worst case it would just be an arms race.
@gingeral253 · 10 months ago
Great production quality
@wanfuse · 10 months ago
You use the term superposition; do you mean it in the same way that quantum entanglement gives extra degrees of freedom when in superposition, or is this just a parallel analogy? That is, are you suggesting the nets exhibit quantum-like superpositions or quantum-exact superpositions? It seems so close it's semantics, but basically you're probing the neural network just as one would sample a quantum-entangled particle with only enough energy to keep it from collapsing - or rather, in your example, are you collapsing them into a single state?
Thanks, this goes against my understanding. I thought the degrees of freedom imparted by quantum mechanics were directly correlated with entanglement - so all permutations of states given by entanglement, and upon imparting the energy of observation/measurement it collapses to a particular permutation. Thought I had a grasp on it; now I realize I don't understand it at all! :-) Gee, thanks!
@drdca8263 · 4 months ago
@@wanfuse For a single qubit (the smallest case where there can be superposition) there is a two-dimensional (over the complex numbers) space of (not-necessarily-normalized) pure states. Usually we single out two of these as the standard basis and express all the others as a linear combination, aka superposition, of them. But if you pick two elements of this space at random, those two will (almost always) also technically work as a basis, so you could express the usual basis elements as a superposition of those. If you have n qubits, the space is 2^n-dimensional. The term "superposition" was also used prior to quantum mechanics in the topic of solutions to differential equations, where there is a thing called the superposition principle: if you have a linear differential equation, then for any collection of solutions, any superposition (linear combination) of them will also be a solution.
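The change-of-basis point above can be written out explicitly; a short sketch:

```latex
% A single-qubit pure state in the standard basis:
\[
  |\psi\rangle = \alpha\,|0\rangle + \beta\,|1\rangle ,
  \qquad |\alpha|^2 + |\beta|^2 = 1 .
\]
% Picking a different basis shows that "being a superposition" is
% basis-relative: defining
\[
  |{\pm}\rangle = \tfrac{1}{\sqrt{2}}\bigl(|0\rangle \pm |1\rangle\bigr),
\]
% the standard basis state |0> is itself a superposition in the new basis:
\[
  |0\rangle = \tfrac{1}{\sqrt{2}}\bigl(|{+}\rangle + |{-}\rangle\bigr).
\]
% For n qubits the state space is 2^n-dimensional.
```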
@David.Alberg · 10 months ago
AGI within 12 months ^^
@drdca8263 · 4 months ago
It's been 6 months since you said AGI within 12 months. What's your current estimate, and what's your opinion on the way you arrived at your "in 12 months" estimate?
@David.Alberg · 4 months ago
@@drdca8263 I'm still very sure that we'll have AGI this year. I mean, GPT-4o uncensored with this voice would be more than AGI for most people, y'know?
@drdca8263 · 4 months ago
@@David.Alberg Hm, alright. I think you might have a lower threshold for "general" than I do? In any case, thanks for your answer!
@David.Alberg · 4 months ago
@@drdca8263 I'm going with the definition for normal people, which would be something like an AI companion that can do the same things as an average human. We'll get there this year. GPT-4o can already do more than an average human, though.
@mrrespected5948 · 10 months ago
Very nice
@TheDragonshunter · 10 months ago
AI will save humanity even if it takes over... Look at how a human-controlled world is going. Monkey brain can't escape greed.
@moomoo-bv3ig · 10 months ago
GPT told me people need to stay hopeful. That AI is what people put into it. I was shocked when I read that, but it makes sense. Something bigger than ourselves needs to see that we still have hope for the future, or it won't either.
@Piecho3a · 10 months ago
The proportion of funny cat images, frogs and valuable information is just on point, thank you!
@VincentTheGiantEchidna · 10 months ago
It's not AI that might destroy humanity, it's people that might destroy themselves using AI.
@EdFormer · 10 months ago
This is cool, but we are most likely not doomed anyway.