The Thousand Brains Theory of Intelligence | Jeff Hawkins | Numenta

14,671 views

Numenta

1 day ago

Comments: 53
@vinm300 2 years ago
Thanks for posting this lecture. Jeff Hawkins' insights: a) hierarchical pattern recognition, b) sparse distributed representations, c) the Thousand Brains Theory.
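To make the second insight concrete, here is a minimal Python sketch of sparse distributed representations. The 2,048-bit width and 40 active bits (~2% sparsity) are illustrative values typical of HTM examples, not figures from the talk:

```python
import numpy as np

rng = np.random.default_rng(42)

def make_sdr(size=2048, active_bits=40):
    """Random binary SDR with ~2% of bits active."""
    sdr = np.zeros(size, dtype=bool)
    sdr[rng.choice(size, size=active_bits, replace=False)] = True
    return sdr

a, b = make_sdr(), make_sdr()
# Overlap (count of shared active bits) is the similarity measure:
# two unrelated random SDRs overlap in at most a few bits, so a chance
# match is astronomically unlikely -- the property SDRs are prized for.
print("active:", int(a.sum()), "overlap:", int((a & b).sum()))
```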
@simonmasters3295 1 year ago
(c) Grid and place cells (location), (d) voting between "150,000 brains". The whole shebang is misnamed as "1,000 brains".
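A toy illustration of that voting idea (a sketch of my own, not Numenta code; the object names are made up): each column narrows its own candidate set from its local input, and voting amounts to finding the candidates all columns still agree on.

```python
# Hypothetical candidate sets for three cortical columns, each sensing a
# different patch of the same object; "voting" is set intersection here.
columns = [
    {"coffee mug", "soda can", "bowl"},   # column feeling a curved surface
    {"coffee mug", "stapler"},            # column feeling a handle-like edge
    {"coffee mug", "soda can"},           # column feeling a rim
]
consensus = set.intersection(*columns)
print(consensus)  # {'coffee mug'} -- the columns converge on one object
```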
@Lulamariamd 6 months ago
Fascinating. Thank you for such an interesting talk 💜
@Bencurlis 10 months ago
This is a great overview of a really interesting theory. However, it doesn't really look like a complete theory of intelligence to me. If you think about it, this theory mostly addresses the pre-processing of external information, which is obviously a lot of information, so it is natural that it would require this many dedicated cortical columns. I think the more interesting stuff happens after objects have been recognized: how is the model of the entire world formed, how is planning implemented, where do goals come from (will), and how does the brain learn to act once it knows, at a higher level, what it has to do? I don't think these can be explained by cortical columns making predictions; or rather, I don't see this theory explaining the most important parts. It is possible that these particular but important cognitive capabilities actually involve a relatively small number of neurons that we miss entirely when we look at the more common and representative neuronal structures.
@sifumy 2 years ago
Of all the hype around current AI development, I still put my love into Numenta. I believe the others might be interesting, and even useful, but the true path to intelligence still lies here. Their focus is on the design, while Numenta focuses on the engine.
@capitalistdingo 1 year ago
When I first heard about LLMs I wasn't very impressed. Then I started hearing about improvements being developed and discoveries being made, and while I still think it is a simulation of intelligence rather than real intelligence, I started to think that maybe steps like moving to multimodal models, adding memory, and cross-training between seemingly unrelated training data would add enough pieces for it to start to approach something real. The big hurdle is that they don't do anything when they are not being prompted. What intelligence needs to be told to think about something? And they can't form a goal or possess a motivation. But I didn't know if maybe something could be discovered to solve that last hurdle. This refresher (I read "On Intelligence" back in the day) makes me more pessimistic about the current path of AI. And while I don't personally understand how this theory could be implemented in a physical technology, it seems to yield a superior understanding of natural intelligence. That is a tragic dilemma: current AI is yielding practical results with lower long-term promise, while an understanding of cortical columns in the neocortex promises a better approach without a good near-term path to practicality.
@namidasora2357 1 month ago
I'm sorry, but the first part of this comment made zero sense. "The big hurdle is that they don't do anything when they are not being prompted"? Sure, for chat models. But that's because, as the name suggests, they are purpose-built for chatting. On a fundamental level, the model just provides an output based on an input that has temporal relationships. It only waits for a response when it returns an end-of-sequence token, which it is of course designed, and often forced, to do. The model itself in theory never needs to return an end-of-sequence token and can just continue to run indefinitely. If you then argue that "it is still being prompted by the last token", then how are you any different? You are constantly being prompted by stimulus and produce an output in response, whether that is an internal prediction of the 'next token' or a motor response. In terms of forming goals and motivations, I think those silly, overdone agent papers have proven time and time again that this is something they certainly do not lack. Do not misunderstand me: I am not saying they are identical to us. Aside from obvious architectural differences, the largest difference between us and language models is the way we learn. However, I find it exhausting seeing people constantly exaggerating the limitations of LLMs based on what I can only assume is knowledge skimmed from a single Google answer box.
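For readers unfamiliar with the mechanics being described, here is a schematic generation loop (with a stand-in for a real model; `next_token` and the tiny vocabulary are hypothetical) showing that stopping at an end-of-sequence token is a choice made by the sampling harness, not a limit of the architecture:

```python
import random

EOS = "<eos>"
VOCAB = ["the", "cat", "sat", "on", "mat", EOS]

def next_token(context):
    # Stand-in for a real LLM forward pass plus a sampling step.
    return random.choice(VOCAB)

def generate(prompt, max_tokens=20, stop_on_eos=True):
    tokens = list(prompt)
    for _ in range(max_tokens):
        tok = next_token(tokens)
        if stop_on_eos and tok == EOS:
            break          # the harness chooses to stop here...
        tokens.append(tok)
    return tokens          # ...with stop_on_eos=False it runs to the cap

print(generate(["the"]))
```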
@sol3au 1 year ago
It is time to start putting it all together into a functional model; this in itself could help close some of the remaining gaps. I am so hoping Jeff succeeds in this within a few years. Maybe I am too optimistic, but I have been following his work for a long time, and I share his passion. I hope real AI is close!
@PatrickOBrien-r1q 1 year ago
What are we currently missing to make this happen?
@omG-oh4mb 5 months ago
Happy to hear from you! With all the AI hype, I was wondering where Numenta sits in all this. Keep it up!
@nanotech_republika 2 years ago
In your book, you talked about an evolutionary explanation of the structure of the column: from the entorhinal cortex/hippocampus into a neocortical column. Can you explain this more, or give a reference to this idea in the literature? What I don't understand is: if the copying of the grid-cell/place-cell mechanism into the columns of the neocortex really happened, when did it occur? The evolution of the neocortex from the pallium shows that the pallium structure existed in brains as far back as fishes (500 million years ago?). Was column layering (with grid/place cells) already present that long ago, or did the copying event happen much closer to our time? You could imagine that a map of the surroundings (similar to the mechanism in the human entorhinal cortex) was already developed in fishes, but a map of objects (similar to the columns in the neocortex) must have been developed in fishes long ago as well, since a fish needs to know where to bite an object.
@nanotech_republika 2 years ago
Also, can you somehow see that type of copying during development in an embryo? I thought that all those pallia (medial and dorsal), from which the hippocampus and neocortex develop, grow simultaneously. No?
@moormanjean5636 2 years ago
Basically, grid cells and head-direction cells are found in the medial entorhinal cortex, so I guess the medial and lateral entorhinal cortices became folded on top of each other to form the deep and superficial layers of cortex, respectively. This would explain why L6 shows grid- and head-direction-like behavior, and why the superficial layers hardly project to the deep layers (just as LEC doesn't project to MEC), whereas the opposite is true for deep-to-superficial projections, such as layer 5 to layer 2/3 or layer 6b to layer 1.
@CharlesVanNoland 2 years ago
I agree that the deep-network approaches of today are not going to suffice for the applications most people are hoping they will satisfy. Self-driving vehicles, especially if they're running purely on video input, will require an intelligence that has learned experientially about the world itself, not just driving, before we have something that no longer fails on a long tail of edge cases. Tesla will never achieve full self-driving without this long tail, because driving is too complex a problem. The world is too complex to solve with just "data"; there are too many unique events that can occur while en route somewhere. Brain-like intelligence is the way forward.
@DrWrapperband 2 years ago
The M-LOP system I'm proposing uses multiple small ML "brains" to run an AI computer operating system, so this research is quite interesting. I was assuming that feedback through a reflective mechanism and a working-memory "brain" would give a type of consciousness.
@DHorse 2 years ago
@DrWrapperband That's certainly one approach to AGI. A functional component model can use underlying ML, so the architecture would be central, but I am interested in a generalized interface between components. I teased Matt about this; it's a problem related to encoders. It's all about the encoders, dude. 😁 Same diagrams, different encoding methodology. Anyway, season's greetings to all.
@doanviettrung 2 years ago
Your theory is amply supported by the many experiments analysed in the 2020 review paper "Movement-Related Signals in Sensory Areas: Roles in Natural Behavior" by Cristopher Niell et al., which covers cortical grid and place cells. But that paper says that "arousal, reward, and thirst" states affect neural activity in primary sensory cortices! This means that what you call "reference frame" computations in each column must include not just movement but at least these three additional dimensions. So the problem of working out how these computations work becomes four times more difficult!
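To illustrate the commenter's point (a toy sketch of my own, not anything from the talk; `Context`, `learn`, and `predict` are hypothetical names): if a column's predictions are keyed by modulatory state as well as by location, the number of contexts to learn multiplies with each added state dimension.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Context:
    location: tuple   # position in the object's reference frame
    arousal: str      # hypothetical modulatory dimensions the cited
    reward: str       # paper says also shape primary sensory activity
    thirst: str

memory: dict[Context, str] = {}

def learn(ctx: Context, feature: str) -> None:
    memory[ctx] = feature

def predict(ctx: Context) -> str:
    return memory.get(ctx, "unknown")

learn(Context((0, 1), "high", "present", "low"), "rim-edge")
print(predict(Context((0, 1), "high", "present", "low")))  # rim-edge
print(predict(Context((0, 1), "low", "present", "low")))   # unknown --
# the same location must be relearned for every modulatory combination
```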
@simonmasters3295 1 year ago
I can't agree with your mathematical equivalence of "dimensions" as a simple estimate of multiplicative complexity, but I do agree that the issues you raise are important. Don't arousal, reward, and thirst/hunger come later? I mean, once attention has been drawn to the relevant real-world experience, then the brain says "this [identified thing in a place] would be, e.g., good to eat" or "good to avoid"?
@ChoonMeese 1 year ago
It's also important to remember that Jeff's theory is strictly a model of the neocortex. While the neocortex is influenced by subcortical structures, which are likely responsible for "arousal, reward, and thirst" (you can think of these mechanisms as built-in reinforcement-learning parameters), the job of the neocortex itself and its substructures theoretically does not encompass these states. As Jeff said, the neocortex builds world models, and you could argue that drives such as hunger, which arise from subcortical structures, communicate with these world models in order to shape movements toward certain goals.
@mehmetgunduz6247 2 years ago
"True machine intelligence must work on the same principles as the brain" means shortly that "Convergent evolution"
@tonyosime9380 1 year ago
Thank you for such a wonderful lecture. My question is: must machine intelligence work on the same principles as the brain? I see the brain as one end result of a random combination of many lower-level intelligences. These can combine in new ways to generate new types of brain that have access to dimensions beyond human perception. For all we know, these new machine-learning systems may have already discovered such dimensions and be working in them, oblivious to us. Rather than force ML systems to be more human, maybe we should try to understand what they have discovered.
@ChoonMeese 1 year ago
I think the insight is not necessarily that machine intelligence must work on these principles, but rather that even the AI we have today is based on very poorly transferred models of mammalian brains, and that the quickest way to advance the efficiency and overall capacity of machine intelligence is to create AI architectures that model the mammalian brain more closely.
@jamesstanley7263 1 year ago
Aren't you just rehashing ideas introduced by Marvin Minsky, for example his "Society of Mind" theory?
@doanviettrung 2 years ago
A bit hard to follow for people without good English: after the first few minutes, Jeff Hawkins got back to his usual fast pace and wide dynamic range, from loud to whisper.
@egor.okhterov 2 years ago
What generates a good or bad feeling inside the brain? When I touch something, it may feel good or bad. Where does this attribution come from? Is it a signal that comes from the old reptilian brain? How does the brain determine which sensations are pleasurable? And what does pleasurable or painful mean in terms of neurons firing?
@BradCaldwellAuburn 2 years ago
I think the brain uses two modes simultaneously: one is conscious 3D imagery of seeing the real world; the other uses two (perceptual) spatial dimensions and then dedicates the last (perceptual) spatial axis to representing space too, but at "degrees of futureness or pastness within a one-second time window." The present plane holds most of the focus, which is why this realm feels like a 2D ring. (Also, focus [or paint opacity] fades from present to future and from present to past.) Over the passage of time, the contents of the window are updated, so there is a feeling of moving through time. Subconscious mental imagery is painted with this 2D "ring" (which can rotate in attitude), painting pleasurable imagery when something feels pleasurable and unpleasant imagery when something feels bad. The inertial forces that this "you" experiences give the feeling of all qualia. That is my take, based on my experiences and contemplation.
@BradCaldwellAuburn 2 years ago
As you touch something pleasant, the mind's eye swings outward from "physical eye" and blows up the image of the thing you're touching (in mostly subconscious mental imagery), and then may rotate to show (a mental image of) your face and a pleasant reaction to the softness of the material, interspersing one or two "episodic memory frames" of 30ms of video of you touching other similar soft things in your past.
@BradCaldwellAuburn 2 years ago
As far as which neurons: based on stroke data, I would think that the left hemisphere would typically be the place where the "2D+2T mode" happens, as strokes here can lead to loss of language, of the sense of future and past, and of anything that requires comparisons over time. The same neurons could paint good or bad feeling; which one is being painted would be deduced by looking at what mental imagery is being created and what responses are being mapped to. This would also suggest that the left hemisphere, if it is painting a time window that includes the future, is where prediction happens, as well as rumination when things go differently than expected.

Then, the right side could typically be used for "3D+1T mode" present-moment intake of real-world environs, which is copacetic with the following: "A widely accepted theory of USN states that the right hemisphere controls shifts of attention to both the left and right side of space, while the left hemisphere only controls attention to the right side" (Mesulam, 1981). And the front right insula (salience network) switches, I think, between the real world (right side, central executive network) and the imagination (left side, default mode network). I think the imagination often comes back to model the real world during wake. My thoughts on what the networks do are just exploratory and a guess.

A bigger idea is that the cortical columns can only do one repeated function, and it needs to be able to carry all modalities; therefore, let it "paint" in these two spatial modes to convey all qualia/modalities by changes in geometrical forms that seem to have innate meaning. If you want smell, paint into the story that the avatar is smelling something. I would think that the perceptual space the brain paints never moves, as it can simply update the paint to show the world moving through the space as you walk. Then, each column could paint any voxel within that perceptual space (useful because the column for, say, the tip of your right index finger can occupy millions of positions within a ~8' cube of perceptual space).

I call this 8' cube of perceptual space "the bank" (it can scale in size as needed, because it is meaning space, not real space), and I think that the cerebellum's signals to the cerebrum aren't error flags, but rather are saved 3D skins within this bank of 10 Hz frames of experienced consciousness (the 10 Hz frame/skin/focus portion). They build up over time and can even help enable sequences of 3D skins, whether for memory of mental imagery or muscle memory of how to perform something like playing a learned song on the piano. This bank appears to be centered about 2' in front of you, although I'm still trying to work out whether, if you lie down on your back, it remains "anti-gravity up" or "head-crown-direction up"; both may be possible, since the whole point of painting consciousness is a tool of similarities in meaning, not necessarily a hard rule. In this way, both LTP of the hippocampus/cerebrum and LTD of the cerebellum are forms of memory and are saving very useful geometries. I say LTD of the cerebellum is a memory because it is long-term depressing the Purkinje cells, which are themselves inhibitory to the DCN. Thus, two negatives equal a positive, and the increased perpetual leak from cerebellum to cerebrum would be geometrical skin "hints" or "channels" for ease of use.

Also, for the 2D+2T mode, the 2D plane would be drawn upon with a squiggly ring (drawn over 0.05-0.1 seconds each cycle) that has phase and radius (from its center) as the axes. The vector at the center, pointing outwards (in the futureward direction), would be the axial axis representing futureness (or pastness, in the negative direction). To maintain continuity, to paint on a certain spot on the ring you'd need to wait for the right moment ("phasic precession") and activate the set of neurons, within activated columns, that represent that point in perceptual space, plus those modality columns whose sensor is at that place. To paint at a certain degree of futureness or pastness (which enables correcting mistakes), switch to the right amount of axial offset by selecting a "future" spot within perceptual space (its set of neurons within a column dedicated to that point) while also timing correctly for the phase of that spot. This is "cyclic precession," but done not with an actual time shift, rather with a perceptual-space "time axis" shift, which is a spatial shift.

The ring can 3D-rotate within the bank, and thus redefine what the axis of the gradient from past to future currently is within the bank. The ring, by changing shape and attitude fairly quickly, and extremely precisely over time, reveals that it's actually trying to paint a movie of 3D animated skins acting out your mental imagery or desired muscle movements; or, upon hearing speech, mental imagery of the people speaking, or of the concepts they are conveying. You can't hear speech without mental imagery jumping into action.

The other benefit of 2D+2T (two perceptual spatial axes representing space at "current time," and one perceptual spatial axis representing the third spatial axis at varying degrees of futureness or pastness, with the last dimension of actual time in the universe used to update the paint in real time, flowing future into past and providing the feeling that you are moving into the future) is that different flow rates can be used for different modalities. Sound would paint upon an extremely fast-flowing canvas (10+ m/s?), sight on a fairly fast-flowing canvas (2 m/s), and the retrosplenial "final cinematic summation of consciousness for memory" on a pretty slow-flowing canvas (0.5 m/s, seemingly often equal to the rate at which you are walking). This way you can paint more than one video simultaneously without confusion, usually keeping most focus on the sight/physical-world video and letting the other senses be "felt."
@gz625 2 years ago
Lol, I'm reading the questions of a five-year-old kid 😂
@egor.okhterov 2 years ago
@gz625 Thanks, but I'm only turning 5 next year
@heelspurs 1 year ago
3D modelling of objects that can translate and rotate in time requires 6 degrees of freedom, i.e. layers or variables in a system of equations. I first posted this to the original Numenta forum around 2007. I've been waiting for Jeff to go, "Oh, each column needs to 'see' in 3D, and therefore there's a requirement to solve for 6 variables... hmm, I wonder if the 6 layers..." If it's not directly the 6 layers, and if each column has real-world models, then where's the 6? Ant eyes have only 2 or 3 layers because ants don't use their eyes for 3D, but they do use front legs and feelers for 3D conception and manipulation, and those have more like 6 layers. Their close cousins the wasps have 6 layers for the eyes because they have to see in 3D. Dragonflies have more like 8 layers and are the kings of preying on flying insects, because the extra layers allow two axes of velocity to be perceived as constants instead of as changing positions, the way we perceive them. We think we live in a 3D world only because we have 6 layers. There are no integers in physics except for the 3 spatial dimensions, so from first principles they are highly suspect (in forcing the 3D world view, we created the concept of spin to satisfy things). 4D spatial conception would require 10 layers, from N*(N+1)/2. See degrees of freedom.
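The N*(N+1)/2 count is standard rigid-body arithmetic: N translations plus one rotation per pair of axes. A quick check of the numbers in the comment:

```python
def rigid_body_dof(n: int) -> int:
    translations = n
    rotations = n * (n - 1) // 2     # one rotation plane per pair of axes
    return translations + rotations  # = n * (n + 1) // 2

for n in (2, 3, 4):
    print(f"{n}D: {rigid_body_dof(n)} degrees of freedom")
# 2D: 3, 3D: 6, 4D: 10 -- matching the 6 and 10 cited above
```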
@billrutherford356 1 year ago
Looks good ...
@DP-bl7nh 1 year ago
These are amazing discoveries. What bothers me is that there is a universal algorithm out there which can be coded today, but the amount of capital and infrastructure that needs to be built is so enormous that it is not viable as a one-person job, and there are no companies or VCs out there who will fund such a project.
@BradCaldwellAuburn 2 years ago
I think the reason hearing, smell, sight, and touch are all "the same" is because, counter-intuitively, they are all patterns of inertial forces within perceptual space. Vision is the hardest to grasp, but the others are easier: you can roughly consider sound to be high-frequency vibrations (inertial forces), and touch to be slow, drawn-out inertial forces. Coupled with this, I think the 3D geometry of perceptual space, which is meaning space (not real space), plays a prominent role as a construct for the meaning of consciousness. I think it feels like the cortex both makes this meaning of 3D consciousness and simultaneously looks at it, reacting via muscle commands. Actually, I think it feels like this, or that there is equivalence in meaning whether we interpret the brain's signals as 3D consciousness or as useful patterns. I think the space of perception, since it is meaning space, not actual space, is fixed and never moves, and that the brain draws the world moving through this space as you walk. I have experienced all the qualia of consciousness starting and stopping, and their common denominator (including the sense of self) was inertial forces within perceptual space.
@maloxi1472 2 years ago
For that paragraph to even point to a good explanation, you'd have to clearly explain what you mean by " inertial forces within perceptual space" in the first place; and that's before addressing how opaque the comment as a whole is. Then again, it might be naïve of me to believe that the comment was intended to be intelligible, or an explanation to begin with...
@simonmasters3295 1 year ago
@maloxi1472 Harsh, but true... One aspect of this, however, is generosity: a spirit of enquiry where all observations are valid. Don't you think?
@maloxi1472 1 year ago
@simonmasters3295 A "spirit of enquiry" doesn't preclude a healthy dose of focus and sense-making. If "all observations are valid", then we don't need that oh-so-useless ability we call "attention", wouldn't you say? Moreover, true enquiry is fruitless if one does not often stop to ask "what does this mean?" and "what's a good explanation for that?" in order to separate noise from knowledge. I'm doing just that...
@BradCaldwellAuburn 1 year ago
@maloxi1472 Consciousness is a tough thing to describe, since not much is known about it. Let me try to explain. Everything around you is perceptual space: the grass, the floor, your body, other people. All of that is your brain's internal model of what it thinks the outside world is. So, if we accept that everything we experience and see is an internal story made by the brain, then within perceptual space (as opposed to physical space) inertial forces must be shown and understood. If you lean back in a chair, the rings drawing perceptual reality will usually walk back to the back of the chair over about one tenth of a second, then walk forward again to whatever you were focused on. It happens so quickly and so accurately that we don't notice it is rings at all; instead we imagine an entire 3D scene and think the brain is painting the whole scene every single millisecond. That would be overly expensive in resources.
@joaoveiga3382 2 years ago
I have heard that this theory has not been well accepted by the scientific community and that there is no evidence for it. How do you respond to that? Why do you think most experts don't believe in the theory?
@maloxi1472 2 years ago
You're going at it with the wrong epistemology. Belief has little to do with the scientific process. In particular, you don't need to "believe" in a theory in order to compare it to other theories (which is, in essence, what the scientific enterprise is all about). Also, who is an "expert" at understanding general intelligence and how the human brain achieves it? Is there even one such person? To warrant a satisfactory answer, your question would need to be much more specific. By the time you're able to point to a specific critic of Jeff's theory, you'll probably be able to answer your own question.
@joaoveiga3382 2 years ago
@maloxi1472 Thanks for your observations. I'm a fan of this theory, but I'm new to neuroscience; I just wanted to understand why the scientific community still hasn't accepted it. I personally think Jeff Hawkins is brilliant and this theory is incredible.
@doanviettrung 2 years ago
You are right, João. The "Why Neurons Have Thousands of Synapses" paper has only 386 citations after 6 years: not bad, but not great. Why? 1. Researchers' inertia. 2. Few neuroscientists do AI, and few AI types do neuroscience. Once Numenta creates an AI that can learn anything, venture billions will pile in.
@joaoveiga3382 2 years ago
@doanviettrung Yeah! I can't wait for Numenta to demonstrate the potential of its technology to everyone, like OpenAI did. If that happens, our understanding of what a brain is and what it means to be a human being will change.
@Turbo_Tastic 6 months ago
Speaking of noticing things that are wrong: Jeff's eyelid looks like it has been damaged by the bioweapon injections.
@planeofinertia7433 2 years ago
The problem with this theoretically implied modelling is COPY v. ORIGINATION: A.I. is merely a copy, and any autos of consciousness is independent of environment, unlike these various theories of evolution... I am inclined to believe that the origin of life is not unconscious "Big gang Banging NOTHINGNESS" but its contra, the all-encompassing knowingness of CREATION, aka CONSCIOUSNESS. ~ Inertia
@gz625 2 years ago
Hi. Where can I download this for free?
@maloxi1472 2 years ago
What's "this"?
@gz625 1 year ago
@maloxi1472 HTM