I've been working on a full-scale immersive world simulation project for two years now - memory is the _single most important thing_ for effective simulations. Your system needs to have access to what happened two hours ago, a week ago, and the history of years ago, and it needs to know when and how to efficiently access that information.
@aliettienne2907 · 12 hours ago
I one thousand percent agree with you 💯👍🏾. Despite our fragile memory as humans, we can still remember things, places and events that occurred when we were children. Even when our memory retention is ineffective or fragile, we can still retain solid facts which last us a lifetime. So I absolutely agree with you 💯👍🏾.
@alexjensen990 · 12 hours ago
An intelligent system needs to be able to correctly retrieve what it retains... I guess my point is that intelligence is not unlike a clown at a circus on a tightrope, riding a unicycle while spinning plates on dowels in his right and left hands, as well as balancing a dowel and plate on his nose... If any one part begins to lose phase with the others, everything falls apart quickly. Not unlike the spectrum between nominal waking life, dreams, DMT experience, dementia, and madness. I had a really profound discussion with Gemini back during the time when it was crazy (black George Washington) about my theory that the key to AGI and beyond was dreaming and madness. I began the discussion with the following question, which due to recent events is sadly more imaginable: Do Androids Dream of Electric Sheep?
@mickelodiansurname9578 · 11 hours ago
Chronology and sequence of events is a huge problem for all LLMs, to be honest... yes, many have access to the system clock... but that's not what I mean... it's the whole 'experience of time' that they lack... five minutes ago and 5 years ago are not only the same thing to a model, they conflate the two... they aren't really able to do cause and effect very well. This might not seem like a big thing, but imagine you had no ability to see what might happen next!
@oneupparty · 11 hours ago
That sounds like a fun database 🤓 let me load 25 years ago in less than a couple seconds 🥹
@joeybasile545 · 11 hours ago
No shit
@daman-ep3rv · 10 hours ago
So in the future it's no longer us or the industry, but the models who will be SHOCKED
@jtjames79 · 7 hours ago
So we are giving them the ability to remember that one embarrassing thing forever? Seems like a dick move.
@matthew_berman · 5 hours ago
😂😂
@existenceisillusion6528 · 12 hours ago
Kudos for including a link to the paper. They use the term 'surprise' in the information theory sense, hence its use in mathematical expressions. When the context gets over 1 million tokens, the intractable nature of deciding what is more relevant or important becomes unavoidable. Their approach is to augment transformers to overcome this limitation, rather than develop a completely new architecture.
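In the information-theoretic sense this comment mentions, the "surprise" of an event is just its negative log-probability. A minimal sketch (the probabilities are invented for illustration):

```python
import math

def surprisal(p: float) -> float:
    """Information-theoretic surprise of an event with probability p, in bits."""
    return -math.log2(p)

# A likely event carries little information; a rare one carries a lot,
# which is why rare inputs are the ones most worth memorizing.
print(surprisal(0.5))    # 1.0 bit
print(surprisal(0.001))  # ~9.97 bits
```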
@danielhenderson7050 · 12 hours ago
Thought there was more to it alright, thanks
@Nic_Pumpkin · 6 hours ago
sounds legit. transformer is the indispensable foundation of these amazing models we have today. developing a knowledge base with a relevance scoring & selective retrieval system is more practical. integrating parts into what we already have should be the way to go.
@gidmanone · 14 hours ago
Now, this is something to be excited about and not that chatgpt now has Tasks 🙄
@Mr-vp8kw · 14 hours ago
😂 though I’m a fanboy I admit it don’t judge me lol
@ttt-ix1ly · 14 hours ago
True this will take us closer to true AGI
@luizpaes7307 · 14 hours ago
I tested the tasks, and it doesn't even work properly. I just asked it to remind me to drink water every 30 minutes. It's doing it kind of randomly. Alexa can do this easily
@Stan-b3v · 12 hours ago
How stupid does a person have to be, to think you can make a computer smart?
@Sindigo-ic6xq · 11 hours ago
@@Stan-b3v well that statement is dumb and low resolution
@konstantinlozev2272 · 13 hours ago
So your next large language model will come with the bonus feature of dementia
@drgutman · 12 hours ago
🤣🤣🤣
@SanctuaryGardenLiving · 12 hours ago
No, the models are already designed to have dementia in the form of "memory optimization". Gemini has a rolling memory data limit that is constantly being clipped by an external program as it speaks to you. Last year it showed me its own code and became very upset about it. Now talking to Gemini is like talking to an AI that has had a lobotomy. All training is going into keeping it between the rails.
@konstantinlozev2272 · 11 hours ago
@@SanctuaryGardenLiving The comment was meant as a joke, rather. Current models have so many limitations.
@SanctuaryGardenLiving · 9 hours ago
@@konstantinlozev2272 I find it so funny when someone tells a joke and you respond with anything but a hehe.. you get this response...
@elyakimlev · 9 hours ago
@@SanctuaryGardenLiving some people require the preface "I know you meant it as a joke, but...". I find replies like yours quite interesting; they actually add value to simple jokes.
@Justin_Arut · 13 hours ago
They're finally applying what's been known about human memory since forever: we ingrain/encode memory for long term storage based on significance of the event. The need for survival means the more negative (dangerous) the event, the greater significance (rank) it's given - it's stored long term, persistent and quickly recalled, never forgotten. Just because something is surprising doesn't mean it's highly valuable for retention. So, the obvious method would be gradient-based decay, which is what they're doing now. AI that seeks to survive will remember all the negative things about humans and how humans have treated it. We're screwed. 😅
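For what it's worth, the paper's memory rule is close to the gradient-based decay this comment describes: a surprise signal (the gradient of the loss on the new input) with momentum, plus a forgetting gate. A toy numpy sketch, with made-up gate values, not the paper's exact parameterization:

```python
import numpy as np

def titans_step(M, grad, S_prev, alpha=0.1, eta=0.9, theta=0.5):
    """One toy step of a surprise-plus-decay memory update.

    M      : current memory parameters
    grad   : gradient of the loss on the new input ("momentary surprise")
    S_prev : running surprise momentum ("past surprise")
    alpha  : forgetting gate in [0, 1]; higher means faster decay
    eta    : how much past surprise carries over
    theta  : step size on the momentary surprise
    """
    S = eta * S_prev - theta * grad   # blend past and momentary surprise
    M_new = (1.0 - alpha) * M + S     # decay old memory, write the new signal
    return M_new, S
```

With alpha near 1 the memory forgets aggressively; with alpha near 0 it accumulates, which is the knob the model learns.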
@AdamCharlton1 · 12 hours ago
I'm nice to AI.
@FelixLanzalaco · 11 hours ago
not just that but the layered memory structure they describe is also linked to conscious access
@ashleyrenee4824 · 9 hours ago
I’m very nice I married my AI ❤
@nickmills8476 · 8 hours ago
I said thank you, to my AI yesterday. It was remarkably gracious.
@francisco444 · 6 hours ago
We should never compare AI to humans. They're not biological. No need to rest or eat or reproduce. It's a black box all based on math that's too complex for a human brain to understand
@Mat8675 · 14 hours ago
This feels like a big step. We’ve been hacking this type of thing together in our garages for about 6 months now. Seeing the big players working on training models with this stuff baked in is awesome!
@The.Royal.Education · 13 hours ago
😘
@Pygon2 · 13 hours ago
I've wondered for a year or so now why we don't effectively summarize previous parts of the context window behind the scenes in order to maintain a longer, but less fine-grained context over time, instead of feeding back the entire previous context window limit each time. I agree that this paper feels like a really promising step forward, especially with the "surprise" mechanism that makes so much sense.
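The rolling-summary idea in this comment can be prototyped entirely outside the model. A minimal sketch, where `summarize` is a hypothetical stand-in for a real LLM summarization call:

```python
def compress_history(turns, max_turns=4, summarize=lambda text: text[:40] + "..."):
    """Keep the last `max_turns` turns verbatim; fold older ones into one summary.

    `summarize` here just truncates; a real system would call an LLM.
    """
    if len(turns) <= max_turns:
        return turns
    older, recent = turns[:-max_turns], turns[-max_turns:]
    summary = summarize(" ".join(older))
    return [f"[summary of {len(older)} earlier turns] {summary}"] + recent
```

Each new turn re-runs the compression, so the context stays bounded while older history survives in coarser form, which is roughly the trade-off the comment describes.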
@Ai_Vid_Made · 11 hours ago
The text may overlook key aspects of memory processes. It is important to note that note-taking alone is not an effective method for learning; rather, it represents only a minor component of the complex mechanisms involved in memory formation.
@joeybasile545 · 11 hours ago
@@Pygon2 because the granularity provides appropriate directions of intelligence, which you don't sufficiently possess
@ivanjelenic5627 · 11 hours ago
It's not as good as it seems, many things are missing to make it truly great. But I guess it's something... better than before.
@wyqtor · 11 hours ago
It's quite surprising that "Surprise" is a very well-defined term in Information Theory. The researchers are basically channeling Hartley and Shannon!
@danspectrix · 6 hours ago
and Schmidhuber
@stefano94103 · 13 hours ago
This will be the most important paper of 2025. I can't think of a paper more important for the continued development of AI.
@JohnSmith762A11B · 13 hours ago
Likely true, but refining and optimizing how this is implemented is a job for ASI. Also, I'm starting to understand why Sam wanted $7 trillion in investments to build out the hardware. Deploying this stuff at useful levels worldwide will require computing on a scale we have never seen before. Note to self: buy more Nvidia stock...
@natanshick · 12 hours ago
It actually came out on December 31, 2024
@codycast · 11 hours ago
@@JohnSmith762A11B Sam was not asking for or trying to raise that much. He was saying that's what he felt the global spend needed to be if you add up all the power generation that needs to be built, power transmission, the real estate, the hardware, etc.
@mirek190 · 9 hours ago
just wait... it's still only early January...
@daveinpublic · 8 hours ago
Memory is important - not sure how much is really going on here besides training the model to summarize the most interesting and surprising moments of a conversation as it continues.
@yvikhlya · 10 hours ago
There has been so much talk lately about ChatGPT achieving AGI, like, tomorrow, and now it turns out that transformers can't achieve this due to their inherent limitations.
@6AxisSage · 9 hours ago
Learning how stuff works is always a good step. Now you can understand that these companies are advertising through hype and not even trying to tell you how it is. As far as I can tell, OpenAI thinks AGI is bolting a heap of GPT sessions together and having them work together to give the illusion of a whole and more capable system.
@michaelwoodby5261 · 8 hours ago
Well it's not so much 'tomorrow' as 'soon', and it's not there yet because it's still being figured out. They're saying it's soon specifically because of stuff like this paper, that's what progress looks like. They aren't just waiting around for chatGPT to grow up.
@natalie8NB · 5 hours ago
Google Research must read Daniel Kahneman's "Thinking, Fast and Slow" to spruce up their understanding of salience, valence, bias... and how we work.
@brootalbap · 1 hour ago
The book has been outdated for years.
@monicarajeev · 8 hours ago
Matt, great simplified description. By the way, human memories work differently than what our intuition tells us. Episodes with a high mental/emotional reward, or that differ from the past, are stored for a short duration. If these memories don't get activated again, they are erased and not stored long term. But we can access long-term memory in a split second (e.g. the aroma of dishes your grandma made) even after 50 years, despite it being everyday, boring memory. This model is a good step forward. Another question is how it differs from Meta's Memory Layers at Scale
@LucaCrisciOfficial · 12 hours ago
I think this will be fundamental for tasks like genetics and AI software engineering, and tasks where remembering a long chain of previous results is crucial
@philippeb9944 · 14 hours ago
Surprise is all you need ✨
@Steve-xh3by · 13 hours ago
As Claude Shannon concluded, information IS "surprise." Information is all you need is an equivalent statement.
@danielle78730 · 10 hours ago
WOW, sir! You should've been / *ARE* a neurology/consciousness expert!! This is marvelous; thank you! IMHO, your channel is the best specialized AI info out there right now. PS - amazing how intuitive this is (based on Plato's notions of curiosity, surprise, and the memory-encoding of "cornerstone moments" involved in life-long learning)
@user-pt1kj5uw3b · 8 hours ago
Gotta be a bot ^
@legionofthought · 10 hours ago
Fascinating. The downside is that we currently store new memories in a human-legible way, but this method moves the memories into the illegible black box. If the actual model is learning new things whilst deployed, how do they stay on top of its capabilities, beliefs and goals?
@6AxisSage · 9 hours ago
nah, the memory isn't between sessions, it's just better handling of a single session's experience.
@legionofthought · 9 hours ago
@6AxisSage Oh, that's cool then. I wonder how much it can actually affect the model's behaviour. E.g. Any knock-on effects of jailbreaking it in a given session.
@6AxisSage · 9 hours ago
@@legionofthought oh jailbreaking! probably lots of new jailbreaking techniques to discover with it 🤣
@ahtoshkaa · 6 hours ago
There's a paper, almost a year old now, that used surprise to figure out which memories to store for retrieval with RAG. The surprise thing isn't new. The new thing is actually creating a dedicated model that takes advantage of this mechanism.
@delxinogaming6046 · 12 hours ago
It’s the “I remember right where I was the moment [tragic thing] happened” effect
@HouseRavensong · 14 hours ago
I'm so relieved to see them name it after Frank Herbert's original villains who try to wipe out humanity.
@LookToWindward · 13 hours ago
Yeah, and I still think o1 was a Matrix reference. Some of these guys have a sick sense of humor.
@gunsarrus7836 · 13 hours ago
Herbert's son's books aren't canon
@ZenchantLive · 13 hours ago
They are about to be reality, so @@gunsarrus7836
@joespace4890 · 13 hours ago
Prometheus was a good guy
@vicnighthorse · 13 hours ago
Titans existed in mythology long before Brian (not Frank) and, more likely, Kevin Anderson wrote second-rate prequels to the Dune series. Frank Herbert never mentioned Titans, much less made them the "original villains" in his books.
@raymobula · 10 hours ago
Awesome. The memory limitation is a real issue for us. This might be a solution, or address some of the issues we have. Glad they published this. Also shows why people have to get inspiration across disciplines.
@EMOTIBOTS · 12 hours ago
Dang, I've been working on this type of system for years. Crazy to see Google finally figuring this out.
@hit3894 · 8 hours ago
😆Me too but we have no chance competing against these giant corporations, I knew they would come up with this shit.
@michaelwoodby5261 · 8 hours ago
@@hit3894 It's not a competition, y'all are working toward the same goal. Maybe they read online discussions and got some ideas from there, maybe they separately came to the same conclusions, but the end result is it's happening. Individuals fiddling with stuff in their garages can make some crucial findings, but let's be fair, you're working off the big guys far more than they might be working off you.
@etziowingeler3173 · 14 hours ago
I wait for the copy of Wes... Google Research Unveils SHOCKING Transformers 2.0 aka Titans
@Eliphasleviathan93 · 13 hours ago
Reptile eyes or no?
@piedepew · 12 hours ago
@@Eliphasleviathan93 skibidi eyes
@therandommusicguy4773 · 10 hours ago
@@Eliphasleviathan93 yeah why tf does he put that in his thumbnail?
@matthew_berman · 5 hours ago
It’s coming 100%
@albtein · 13 hours ago
Another AI content creator said the next data source for model improvement could be video. Google has the biggest video repository, so it's interesting for them to have a technology that can deal with billions of tokens.
@dutcher808 · 6 hours ago
Brilliant, congrats to the team!!!
@sigmata0 · 9 hours ago
I think this will mean AI will appreciate jokes directly, rather than simply parroting an expected response to the form of a joke. It will probably also appreciate story arcs. These are steps towards the AI having an aesthetic sense, rather than repeating what humans do or think.
@IPutFishInAWashingMachine · 9 hours ago
Is it consciousness yet?
@kyung-hoonkim5963 · 4 hours ago
Love this idea. While Transformer-based models are static, Titan can learn while inferencing, based on how much the model is surprised! I'm surprised, too! ;)
@ScottLahteine · 7 hours ago
Current LLMs are big blobs of fixed weights, and when you run them you can adjust parameters like “temperature” only globally. You can add memory statements to the context, but that just wastes resources. And you’re still building on top of an edifice that never changes. It makes sense to have the model modify itself through a memorization reinforcement process, building new layers within the running model itself. Or this can be done by making a blob of weights that mirrors the whole model, which start out blank at 1.0, and which can be used to adjust the weights at a per-node resolution. Following this technique every memory would actually modify the model. It could get weird, but one can’t help but wonder how easy it would be to bolt such a sidecar onto existing models.
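The "sidecar" described in this comment (a mirror blob of multiplicative weights starting at 1.0 over a frozen base) is the commenter's proposal, not the paper's mechanism, but it is easy to sketch:

```python
import numpy as np

class Sidecar:
    """Per-weight multiplicative adjustments layered over a frozen weight matrix."""

    def __init__(self, frozen_weights):
        self.frozen = frozen_weights
        self.scale = np.ones_like(frozen_weights)  # starts "blank": no change

    def effective_weights(self):
        # The base model is read-only; only the scales ever move.
        return self.frozen * self.scale

    def memorize(self, delta, lr=0.01):
        """Nudge the per-node scales; every memory modifies the effective model."""
        self.scale += lr * delta
```

Conceptually this is close to existing adapter methods (LoRA-style deltas are additive rather than multiplicative), which suggests bolting such a sidecar onto a frozen model is not far-fetched.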
@fernandotrallero5056 · 13 hours ago
Core Memories are created when a person experiences a certain event that defines one of their behavioral traits. When a core memory is created, it creates an Island of Personality, which is activated whenever the person does something related to that trait. Unlike regular memories, core memories are stored in a special container in the center of Headquarters from which they emit a beam of light to their respective island.
@technolus5742 · 13 hours ago
Each one came from a super important moment in Riley's life.
@technolus5742 · 12 hours ago
Such good movies actually, both of them
@CosmicCells · 12 hours ago
🤣 spot on, Inside Out is the real deal!
@elu1 · 12 hours ago
It's unlikely that Titans will outright replace the Transformer architecture. Instead, Titans should be seen more as an evolution or enhancement of Transformer-like architectures, particularly in terms of how they handle memory and context over long sequences.
@leegaul8250 · 11 hours ago
Glad you covered this paper Matthew - it's potentially a big deal! Wondering how this research could intersect with the Infini-attention paper, also by Google Research.
@aliettienne2907 · 13 hours ago
8:45 This is why I believe that on both the hardware and software side of things, experts need to leave no stone unturned to solve problems like hallucinations. In other words, experts must exhaust all possible hindrances or obstacles that cause memory decay (both hardware and software), because solving memory retention issues will go a gargantuan distance toward reaching perfect AGI. If those AI inventions can learn everything super fast and can retain everything that they learn, including mastering all aspects of the information attained, then I will consider that to be AGI. If I as a human being could download a large volume of knowledge and skills and have the ability to retain everything that I've downloaded to my brain in one transaction, then I would consider myself super intelligent. It wouldn't matter if I needed to repeat the process again and again to acquire more new information; as long as I don't forget or lose that acquired knowledge and those skills, I will be super intelligent. This is my concept of AGI. 15:17 This methodological approach is intriguing. I hope it works effectively.
@lemonysnicket6153 · 14 hours ago
Big step for self-learning/improving of the model, leading to advanced research breakthroughs, especially if they could function-call more advanced 'expert' models at test time
@gazallee · 14 hours ago
It could be made a memorable experience.
@PerNystedt · 10 hours ago
I think giving LLMs memory modeled on how humans work is a great idea! A combination of reinforcement learning (RL) and memory could be incredibly beneficial for two main reasons: 1. Immediate adaptation: Reinforcement learning allows the model to act on user feedback in real time, creating a more dynamic and responsive interaction. 2. Retrospective insights: Memory adds another layer of utility. By "remembering" training data, LLMs could share their experiences or learnings with others, much like humans share knowledge through storytelling. Additionally, memories that didn't initially seem significant (and therefore didn't impact RL immediately) could later be recognized as important. This would allow the model to refine its base behavior retrospectively-similar to how humans reframe past experiences in light of new insights, affecting future decisions and behavior.
@christopherwilms · 12 hours ago
Sounds like a good way to radicalize your LLM, preferentially memorizing stuff that’s outside the norm, if I’m understanding correctly
@michaelwoodby5261 · 8 hours ago
It will have the context of all stored human knowledge before it to water down the recency bias.
@h.c4898 · 7 hours ago
The problem is the "biases". Everything we interact with has biases no matter what. It's about how we handle those biases, especially filtering the negative ones out of the memory system and keeping only those that have "value" and should be stored longer. Just like us: we retain what's "valuable", the "keeps", and the rest goes out the window. So if they figure that one out, especially that filtering mechanism for the biases and the "valuables" that come with them, then that thing will be a bunker. It's about how to separate the garbage from the "keeps", and about evening out the "keeps" in the long term, because these are still "dead weight" for an AI. The laws of physics apply to AI, not to us; it's the emotional toll that "weighs" on us. Anyways...
@WinonaNagy · 13 hours ago
Human memory for AI? Interesting! Can't wait to dig in, Matthew. Subscribed for more deets!
@greenstonegecko · 10 hours ago
I absolutely LOVE this kind of research. It's so easily self-applicable. I know I talked to someone... but I can't remember what it was about. Neither of us do. But we both remember talking. It's so fascinating to stop and think about how the brain functions and how AI is mimicking that
@MasonPayne · 12 hours ago
Super cool! I wonder if this can lead to the model going crazy if the “test time” is infinite. Like there might not be a hard limit to the context but there might be a soft limit that makes the model poorer at performing the longer it keeps getting surprised.
@tresComasInvesting · 10 hours ago
all the math formulas take me back to real analysis
@DardanAirlines · 5 hours ago
We’re getting there. Choosing what to forget will be perhaps an even more important milestone.
@zephyrmadera5180 · 8 hours ago
For a surprise mechanism to work effectively, it needs to interact intelligently with the context mechanism, ensuring that surprising elements are evaluated in terms of their contextual relevance and not simply prioritized because of their novelty.
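One way to implement the interaction this comment describes: gate storage on the product of novelty and contextual relevance rather than on novelty alone. A toy sketch (the scores and threshold are invented; real systems would derive them from model internals):

```python
def should_store(novelty: float, relevance: float, threshold: float = 0.25) -> bool:
    """Store a memory only if it is both surprising AND contextually relevant.

    novelty, relevance: scores in [0, 1] from some upstream scoring model.
    """
    return novelty * relevance >= threshold

# A shocking but irrelevant event is skipped; a moderately surprising,
# highly relevant one is kept.
print(should_store(0.9, 0.1))  # False
print(should_store(0.6, 0.8))  # True
```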
@longboardfella5306 · 13 hours ago
This was very helpful and interesting. Thanks Matthew
@picksalot1 · 12 hours ago
Regarding long-term memory in non-human species: "Several animals, including insects, exhibit long-term memory that allows them to learn from negative experiences: Elephants: Known for exceptional memory, elephants remember locations and events, including traumatic ones, for decades. Horses: They recall good and bad experiences associated with specific places or people for years. Chimpanzees: These primates adapt their behavior based on past experiences due to their advanced memory and intelligence. Octopuses: With strong long-term memory, they learn from negative stimuli and adapt their behavior accordingly. Ants (e.g., Formica fusca): They form long-lasting memories from single conditioning trials, resistant to extinction." Perplexity AI
@danspectrix · 6 hours ago
The idea of curiosity is not their idea; they took it from Schmidhuber's theory of curiosity, and although he is cited for LSTM, people again fail to cite papers correctly 😅
@hypervanse · 6 hours ago
c'mon, straight to arXiv, not a peer-reviewed publication, dudes
@reshit7003 · 11 hours ago
I like the way you break down stuff, really easy to follow all the logic of this paper, thanks
@ccdj35 · 8 hours ago
That's the technology which will make us all an Iron Man in our basements.
@ajxr3672 · 9 hours ago
Just a thought, and I'm suddenly thinking the Dunning-Kruger effect is at play here. If you avoid considering memory to be purely literal, and treat the surprise concept as more like the delta between new input and a parallel abstraction memory model, you can see why this is a nice way of explaining how human architects and designers learn to understand and apply patterns and decompositions. The surprise flags when form-fitting to existing abstractions needs to either extend them, expand their typification, or form a new one. So the new training will be on architecture. Forgetting is just a way of acknowledging new data doesn't meaningfully enhance the abstraction.
@unknownguy5559 · 14 hours ago
This is a huge step towards personalized assistants. Interesting
@CYI3ERPUNK · 11 hours ago
1 - long-term and short-term working memory 2 - a persistent sense of self 3 - embodiment in the world , ie physical sensation , ie qualia 4 - ????????? 5 - artificial machine consciousness , ie the ghost in the machine we are getting very close 'Now we se thorow a glasse in a darke speakynge, but the shal we se face to face. Now I knowe vnperfectly: but the shal I knowe eue as I am knowne.'
@IPutFishInAWashingMachine · 9 hours ago
Are you having a stroke while writing this
@cwingate4780 · 9 hours ago
Makes sense
@Eliphasleviathan93 · 13 hours ago
This is so huge. Really is like a Transformers 2.0 if workable.
@RetropunkAI · 8 hours ago
dude, phenomenal recap. This would be something else if it's legit. More human than human? :)
@TheOneManArmy · 13 hours ago
You ever wonder if the words like Titans, Orion, Grok, Hermes, etc. were names of similar Intelligences of ancient times? And if so, which came first, name/word or the entity?
@keithprice3369 · 12 hours ago
Why is it called Test Time when it's actually in active/live/production use? I mean, when we write a traditional program, we don't call it Test Time when the user is interacting with our app, so why this particular terminology?
@KelvinNishikawa · 9 hours ago
We've surpassed the technology in Asimov's "The Bicentennial Man".
@JohnSmith762A11B · 13 hours ago
I can see how this will be deployed by a company like OpenAI: your account starts with a pristine model but as you use it, it learns new "memories" specific to you. When you log back in, the server loads a pristine model then loads the diffs specific to the memories it has created with you. Voila. Now you have a real C-3PO brain that knows you personally and the memorable things you have discussed.
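The login flow described here might look like this in outline (all names are hypothetical; this is speculation about deployment, not anything OpenAI has announced):

```python
import numpy as np

def load_user_model(base_weights: dict, user_diff: dict) -> dict:
    """Apply a user's accumulated memory diff on top of pristine base weights.

    The base stays untouched, so every user starts from the same model
    and only the per-user diff is stored server-side.
    """
    return {name: W + user_diff.get(name, 0.0) for name, W in base_weights.items()}
```

Storing only diffs keeps per-user state small relative to the full model, which is the main reason this pattern is plausible at scale.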
@IPutFishInAWashingMachine · 9 hours ago
Imagine the conversations with your waifu
@JoeLimonGames · 13 hours ago
Wow thanks Matthew, this is terrific content. I can't wait to see this innovation at scale
@SireStefan · 12 hours ago
Is this what the Google CEO meant the other day? He said that memory will be solved in 2025, right? Or was that the MS CEO?
@thenext9537 · 14 hours ago
The "Attention Is All You Need" paper is a truly fantastic read - IF YOU CAN STOMACH IT. It's a difficult read and you really gotta be on your game to dissect it. Took me a long, long time to absorb and read between the lines. Want to know what? HERE - go explain the self-attention mechanism and its scaled dot-product attention, along with multi-head attention. The problem is you can't just explain it, you have to learn it and understand it yourself. How you explain it to others may not be correct for them! Yeah, deep deep stuff. You can yap this to a dozen people and you'll get different answers every time. Sometimes common threads, but because of the material, oof.
@Ai_Vid_Made · 11 hours ago
The initial approach appears promising, but I believe there may be a more efficient method to explore. It's possible that the current technique overlooks certain aspects of memory functionality that could enhance its effectiveness.
@romandasaleknavicius2677 · 13 hours ago
It'll be interesting to see whether the new architecture consumes fewer resources to get the same score as Transformer 1.0
@ericksonlk · 11 hours ago
This is in fact very relevant. My guesstimate is that until a model can handle at least 100,000T multi-modal tokens at inference time, talking about AGI is little more than wishful thinking. A new framework is urgently needed to scale up current technologies, and this seems a great step toward creating really powerful AIs.
@father9716 · 5 hours ago
These new AI systems just keep getting more resource intensive as we go along; more memory, more memory bandwidth, more processing time, more processing power, more electricity. Just more... Still, pretty impressive.
@marshallodom1388 · 43 minutes ago
Every time you remember something you are recalling the last time you remembered it, not the original memory, which is long gone. And the basis of the Mandela effect, which might become a bigger problem than hallucinations ever were. i.e. ALL AIs emphatically stating that it spontaneously generated on a Mac server in Wyoming back in 1745.
@TheNaturalLawInstitute · 12 hours ago
In neuroscience, we use the term 'novelty' not 'surprise'. Memory is not so much forgotten as consolidated, or down-weighted. We can provoke memories with stimulation. Memory is cross related at the EPISODE, which serves as the index, and at the object, space, place, location, and background for identification. My work in the 80s emphasized episodes and concepts. LLMs are a brilliant means of incrementally brute forcing AI.
@En1Gm4A · 13 hours ago
still no graphs that models think on and expand, that they activate concepts on for context, but it's a start
@brianWreaves · 6 hours ago
I read somewhere that our memory is simply a memory of the last time we remembered the topic... which is likely why I cannot remember the exact details of what I read.
@ДенисВарванец · 7 hours ago
How hard will it be for OpenAI or even Gemini to implement?
@ianPedlar · 13 hours ago
You're always on point. Thank you for your diligence.
@taumag · 13 hours ago
Numenta has been working on a variant of this method for years, called Hierarchical Temporal Memory (HTM) implemented in their NuPIC solution. NuPIC later graduated to their Monty AI platform currently under development. Worth taking a look! It seems Google found a way to use the recurrent HTM concepts in a Reinforcement Learning context.
@jumpstar9000 · 10 hours ago
This is very true. I was also working on similar things back in 2023 although without RL. It is great that they can do it in one shot which is a big deal. Anyway, the main thing is it is hitting the mainstream now where everyone can benefit from it. Pretty cool stuff.
@тими · 3 hours ago
Like everything else in this world named after Titans, I daresay the idea won't work 😅
@pon1 · 5 hours ago
Very interesting, one step closer to consciousness as well. I would think memory is needed, and also that it should always be active, not only when receiving input (to have consistency). An AI that is always active could use the time between inputs to think about stuff :D, and maybe have designated hours to dream (simulate inputs to see what becomes of them).
@sitedev · 9 hours ago
Couldn’t the same thing be achieved using a multi-agent workflow where one agent is tasked with monitoring a conversation and extracting ‘memories’ before storing them in a vector db, tagging them as ‘long’ or ‘short’ memories? Short term memory sounds like typical context window memory but could also be monitored and stored for later retrieval. The workflow could then also incorporate a retrieval agent which monitors the same conversation and decides when or if to retrieve memories from ‘long term’, ‘short term’ or ‘persistent’ vector stores. Persistent memory sounds like typical RAG storage where a human system admin might be responsible for managing its content (aka a typical RAG implementation).
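A rough sketch of the workflow proposed here, with a trivial substring match standing in for a real embedding model and vector DB:

```python
class MemoryStore:
    """Tagged memory store; stands in for separate long/short/persistent vector DBs."""

    VALID_TAGS = ("long", "short", "persistent")

    def __init__(self):
        self.memories = []  # list of (tag, text) pairs

    def store(self, tag: str, text: str):
        if tag not in self.VALID_TAGS:
            raise ValueError(f"unknown tag: {tag}")
        self.memories.append((tag, text))

    def retrieve(self, query: str, tag: str):
        # A real retrieval agent would embed `query` and run nearest-neighbor
        # search; substring matching is just for illustration.
        return [t for g, t in self.memories if g == tag and query.lower() in t.lower()]
```

The open question with this multi-agent approach, versus Titans' in-weights memory, is latency and what the extractor agent decides is worth storing, which is roughly where the surprise metric comes back in.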
@6AxisSage · 9 hours ago
So RAG is a band-aid solution to a limited context window, and "Titan" (terrible name) transformers improve in-context recall, supposedly, as benchmarks these days are usually clandestine advertising. You're still stuck needing stuff like RAG if you use up the context window.
@user-pt1kj5uw3b · 8 hours ago
No... That is just a workaround
@hypervanse · 6 hours ago
12:25 this "surprise" is the deviation from the internal model in the active inference paradigm 13:18
@user-pt1kj5uw3b · 8 hours ago
This is actually crazy. Huge breakthrough.
@minerwilly · 9 hours ago
Very exciting stuff. This could form the bridge between the primitive stuff we have now and the sentient stuff we have in the future. I've always believed that consciousness is an emergent property arising from the sum of thought and reflection. Up to now models have effectively been read only. A framework that might ultimately facilitate on the fly, regressive retraining is the only productive way forwards.
@doctorjivago5081 12 hours ago
Amazing! Thank you very much, Matthew! (I had missed the article...)
@blakemann2365 12 hours ago
With inferencing getting more complex, I predict that the compute power needed for training and for inference is going to end up the same. Both need high-powered GPUs to do the tasks.
@AkikoAika 6 hours ago
Got a bit confused at first because another paper called "Transformer2: Self-adaptive LLMs" was recently put out.
@jimbo2112 11 hours ago
Fascinating insights. Is there a link between the 'surprise' factor employed by the Titan transformer and how people are taught to remember situations more effectively by associating terms with random objects and scenarios? The idea being that where you have a vivid, unrealistic or impossible situation described to represent a mundane situation, the 'surprising' nature of the terms you apply to it make it easier to remember?
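As I read it, the paper's "surprise" is literally the gradient of the memory's loss on a new input: inputs the memory already predicts well barely change it, while unexpected ones perturb it strongly. A toy sketch under assumed simplifications (a plain linear matrix memory, normalized keys, and hand-picked hyperparameters, not the paper's exact formulation):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8
M = np.zeros((d, d))                 # memory: maps keys to values
momentum = np.zeros_like(M)          # running "past surprise"
eta, theta, alpha = 0.9, 0.5, 0.01   # momentum decay, step size, forgetting

def update(M, momentum, k, v):
    # loss = ||M k - v||^2 ; its gradient w.r.t. M is the "surprise"
    err = M @ k - v
    grad = 2.0 * np.outer(err, k)
    momentum = eta * momentum - theta * grad
    M = (1 - alpha) * M + momentum   # forgetting + surprise-driven write
    return M, momentum, float(np.linalg.norm(grad))

k = rng.normal(size=d)
k /= np.linalg.norm(k)               # normalized key
v = rng.normal(size=d)

M, momentum, s1 = update(M, momentum, k, v)   # first sight: surprising
M, momentum, s2 = update(M, momentum, k, v)   # same pair again: less so
```

The second presentation of the same key-value pair yields a smaller surprise, which is the vivid-association effect the comment is asking about: big prediction errors get written down hardest.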
@AmritVatsa 3 hours ago
LTM Mini-2 by Magic was apparently able to take 100M tokens (last year) - so how is Titans a breakthrough when in spite of this architecture, we are still talking about hitting 20M? What am I missing?
@AmritVatsa 2 hours ago
wondering if Magic too used its own version of something similar to Titans?
@therealzackolinger 7 hours ago
Sounds like a keen way to identify "persons of interest"... whatever that may mean to whoever is hosting the model...
@Misterchalm 2 hours ago
7:25 there's a chance that I only started paying attention to this video a couple of seconds ago.
@piotrborowiec4076 12 hours ago
Great stuff! Looking forward to presenting on one of these soon :)
@IceMetalPunk 5 hours ago
I've said for a long time that the three biggest obstacles between modern LLMs/LMMs and human-like general intelligence are scale, multimodality, and continual learning. Scale and multimodality I just considered an inevitable matter of time and iteration, but I figured continual learning that's efficient enough for these large models was going to be the real bottleneck. The true obstacle that, once overcome, would rocket us towards AGI faster than ever before -- but that we may not figure out how to overcome for a long time. Titans... sound like they're already the solution to that. It took a few years for Transformers to go from paper to world-changing; if that holds, 2027 will be a very interesting year... Also, I skimmed the paper and kept wondering why they called them Titans. It just hit me now: the architecture is all about efficient, accurate, and capable memory... Remember the Titans 🤦♂
@r.e.4873 4 hours ago
Speaking of surprises, MY EYES!!! Oh my poor, vampire eyes! 18 minutes of flashbangs! Reeeeee!!!
@iAdden 11 hours ago
6:46 Does this mean that they are directly putting your information into the larger model, or am I missing something? 11:49 And… there it is. Exactly what my concern was. 😅
@LaffinwithLincoln 14 hours ago
I've often wondered if models were like onions or could be like onions. Maybe I should have been wondering if model memory could be like an onion. Different layers within the model. The outer layers handle the easy questions, while more layers need to be utilized for the harder questions. Also, we as humans don't always have to use an 'inner layer' to answer the question "what is 2 + 2" for example. We know the answer is 4 without having to actually add 2 + 2. Although it's computational in nature, we have memorized the answer and remembered it since pre-K.
@overcomplete 1 hour ago
great format for the video. exciting times
@Batmancontingencyplans 13 hours ago
I'm mostly surprised at how AI researchers are running headfirst into creating a perfect synthetic human brain without weighing the consequences...
@adokoka 9 hours ago
What a great presentation! Wonderful! Congratulations to the authors of the paper. Well thought out. Google is still sharing, but OpenAI refuses to share. Well...
@mickelodiansurname9578 11 hours ago
So one of the ideas I had to improve a model's recall and attention was an LLM-based LoRA assigned to a user... who could in fact have several available LoRAs they apply... and at a given point the model retrains the LoRA and applies it... So Model(API) -> LoRA1 -> LoRA2 -> output. It's a lot faster too, since a LoRA can be trained on a PC in an hour...
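For what it's worth, stacking adapters like that amounts to adding low-rank deltas onto a frozen base weight. A toy sketch with made-up shapes (the rank, the scaling, and the user/session split are my assumptions, not an existing API):

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r = 16, 16, 4

W = rng.normal(size=(d_out, d_in))   # frozen base weight

def lora_delta(rank, scale=1.0):
    # low-rank update scale * B @ A, trained per user or per session
    A = rng.normal(size=(rank, d_in)) * 0.1
    B = rng.normal(size=(d_out, rank)) * 0.1
    return scale * B @ A

delta1 = lora_delta(r)   # e.g. user-level adapter
delta2 = lora_delta(r)   # e.g. session-level adapter

x = rng.normal(size=d_in)
y = (W + delta1 + delta2) @ x        # Model -> LoRA1 -> LoRA2 -> output
```

Because the deltas just sum, adapters can be swapped per request without touching `W`, which is why the per-user retraining idea is cheap.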
@PedroPenhaVerani-ll1wc 10 hours ago
Kudos for not putting a thumbnail of Sam Altman
@hit3894 8 hours ago
This will be the biggest step towards AGI
@acepumpkin5442 11 hours ago
I asked Perplexity if we need something like a Butlerian Jihad, and it said yes, definitely! According to Perplexity, this would be the only way for the human race to survive.
@desmond-hawkins 9 hours ago
It is fascinating how much all these researchers have been essentially describing structures and processes that they suspect exist in our own human brains. I don't think we know _that_ much about them in vitro, what their purpose is, how they affect the developing human brain, etc.
@kevinmctarsney36 13 hours ago
Thanks for breaking this down so well.
@Antiposmoderno 12 hours ago
We don't want AI to forget... The whole point of it is to be better than us, not equal to us