Google Research Unveils "Transformers 2.0" aka TITANS

78,288 views

Matthew Berman

1 day ago

Comments: 580
@gubzs
@gubzs 13 сағат бұрын
I've been working on a full-scale immersive world simulation project for two years now - memory is the _single most important thing_ for effective simulations. Your system needs to have access to what happened two hours ago, a week ago, and the history of years ago, and it needs to know when and how to efficiently access that information.
@aliettienne2907
@aliettienne2907 12 сағат бұрын
I one thousand percent agree with you 💯👍🏾. Despite how fragile human memory is, we can still remember things, places, and events from when we were children. Even when our retention is imperfect, we still hold on to solid facts that last us a lifetime.
@alexjensen990
@alexjensen990 12 сағат бұрын
An intelligent system needs to be able to correctly retrieve what it retains... I guess my point is that intelligence is not unlike a circus clown on a tightrope, riding a unicycle while spinning plates on dowels in his right and left hands and balancing a dowel and plate on his nose... If any one part begins to lose phase with the others, everything falls apart quickly. Not unlike the spectrum between nominal waking life, dreams, the DMT experience, dementia, and madness. I had a really profound discussion with Gemini back during the time when it was crazy (black George Washington) about my theory that the key to AGI and beyond is dreaming and madness. I began the discussion with the following question, which due to recent events is sadly more imaginable: Do Androids Dream of Electric Sheep?
@mickelodiansurname9578
@mickelodiansurname9578 11 сағат бұрын
Chronology and sequence of events is a huge problem for all LLMs, to be honest... yes, many have access to the system clock... but that's not what I mean... it's the whole 'experience of time' that they lack... five minutes ago and five years ago are not only the same thing to a model, it conflates the two... they aren't really able to do cause and effect very well. This might not seem like a big thing, but imagine you had no ability to see what might happen next!
@oneupparty
@oneupparty 11 сағат бұрын
That sounds like a fun database 🤓 let me load 25 years ago in less than a couple seconds 🥹
@joeybasile545
@joeybasile545 11 сағат бұрын
No shit
@daman-ep3rv
@daman-ep3rv 10 сағат бұрын
So in the future it's no longer us or the industry, but the models who will be SHOCKED
@jtjames79
@jtjames79 7 сағат бұрын
So we are giving them the ability to remember that one embarrassing thing forever? Seems like a dick move.
@matthew_berman
@matthew_berman 5 сағат бұрын
😂😂
@existenceisillusion6528
@existenceisillusion6528 12 сағат бұрын
Kudos for including a link to the paper. They use the term 'surprise' in the information theory sense, hence its use in mathematical expressions. When the context gets over 1 million tokens, the intractable nature of deciding what is more relevant or important becomes unavoidable. Their approach is to augment transformers to overcome this limitation, rather than develop a completely new architecture.
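For readers who haven't met the term: in information theory, the "surprise" (surprisal, or self-information) of an event with probability p is -log p, so rare events carry more information. A minimal Python illustration of just that definition, not code from the paper:

```python
import math

def surprisal(p: float) -> float:
    """Self-information ("surprise") of an event with probability p, in bits."""
    return -math.log2(p)

print(surprisal(0.9))    # a likely next token: ~0.15 bits, barely surprising
print(surprisal(0.001))  # a rare next token: ~9.97 bits, highly surprising
```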
@danielhenderson7050
@danielhenderson7050 12 сағат бұрын
Thought there was more to it alright, thanks
@Nic_Pumpkin
@Nic_Pumpkin 6 сағат бұрын
sounds legit. the transformer is the indispensable foundation of the amazing models we have today. developing a knowledge base with a relevance scoring & selective retrieval system is more practical. integrating parts into what we already have should be the way to go.
@gidmanone
@gidmanone 14 сағат бұрын
Now, this is something to be excited about and not that chatgpt now has Tasks 🙄
@Mr-vp8kw
@Mr-vp8kw 14 сағат бұрын
😂 though I’m a fanboy I admit it don’t judge me lol
@ttt-ix1ly
@ttt-ix1ly 14 сағат бұрын
True this will take us closer to true AGI
@luizpaes7307
@luizpaes7307 14 сағат бұрын
I tested Tasks, and it doesn't even work properly. I asked it to remind me to drink water every 30 minutes, and the timing it uses is kind of random. Alexa can do this easily.
@Stan-b3v
@Stan-b3v 12 сағат бұрын
How stupid does a person have to be, to think you can make a computer smart?
@Sindigo-ic6xq
@Sindigo-ic6xq 11 сағат бұрын
@@Stan-b3v well that statement is dumb and low resolution
@konstantinlozev2272
@konstantinlozev2272 13 сағат бұрын
So your next large language model will come with the bonus feature of dementia
@drgutman
@drgutman 12 сағат бұрын
🤣🤣🤣
@SanctuaryGardenLiving
@SanctuaryGardenLiving 12 сағат бұрын
No, the models are already designed to have dementia in the form of "memory optimization". Gemini has a rolling memory limit that is constantly being clipped by an external program as it speaks to you. Last year it showed me its own code and became very upset about it. Now talking to Gemini is like talking to an AI that has had a lobotomy. All the training is going into keeping it between the rails.
@konstantinlozev2272
@konstantinlozev2272 11 сағат бұрын
@@SanctuaryGardenLiving The comment was meant more as a joke. Current models have so many limitations.
@SanctuaryGardenLiving
@SanctuaryGardenLiving 9 сағат бұрын
@@konstantinlozev2272 I find it so funny when someone tells a joke and you respond with anything but a hehe.. you get this response...
@elyakimlev
@elyakimlev 9 сағат бұрын
@@SanctuaryGardenLiving Some people require the preface "I know you meant it as a joke, but...". I find replies like yours quite interesting; they actually add value to simple jokes.
@Justin_Arut
@Justin_Arut 13 сағат бұрын
They're finally applying what's been known about human memory since forever: we ingrain/encode memory for long term storage based on significance of the event. The need for survival means the more negative (dangerous) the event, the greater significance (rank) it's given - it's stored long term, persistent and quickly recalled, never forgotten. Just because something is surprising doesn't mean it's highly valuable for retention. So, the obvious method would be gradient-based decay, which is what they're doing now. AI that seeks to survive will remember all the negative things about humans and how humans have treated it. We're screwed. 😅
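A toy sketch of the significance-weighted storage with decay described above; this is purely illustrative and not the Titans update rule, and the surprise score is a stand-in for whatever novelty or significance measure a real system would compute:

```python
import numpy as np

decay, write_rate = 0.95, 0.5          # forgetting factor and write strength (illustrative)
memory = np.zeros(8)                   # toy long-term memory vector

def update_memory(memory, event, surprise):
    """Fade old content a little each step, write new events in proportion to surprise."""
    return decay * memory + write_rate * surprise * event

rng = np.random.default_rng(0)
for _ in range(50):
    event = rng.normal(size=8)
    surprise = float(rng.uniform())    # placeholder significance score in [0, 1]
    memory = update_memory(memory, event, surprise)
```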
@AdamCharlton1
@AdamCharlton1 12 сағат бұрын
I'm nice to AI.
@FelixLanzalaco
@FelixLanzalaco 11 сағат бұрын
not just that but the layered memory structure they describe is also linked to conscious access
@ashleyrenee4824
@ashleyrenee4824 9 сағат бұрын
I’m very nice I married my AI ❤
@nickmills8476
@nickmills8476 8 сағат бұрын
I said thank you, to my AI yesterday. It was remarkably gracious.
@francisco444
@francisco444 6 сағат бұрын
We should never compare AI to humans. They're not biological. No need to rest or eat or reproduce. It's a black box all based on math that's too complex for a human brain to understand
@Mat8675
@Mat8675 14 сағат бұрын
This feels like a big step. We’ve been hacking this type of thing together in our garages for about 6 months now. Seeing the big players working on training models with this stuff baked in is awesome!
@The.Royal.Education
@The.Royal.Education 13 сағат бұрын
😘
@Pygon2
@Pygon2 13 сағат бұрын
I've wondered for a year or so now why we don't effectively summarize earlier parts of the context window behind the scenes, in order to maintain a longer but less fine-grained context over time, instead of feeding back the entire previous context window each time. I agree that this paper feels like a really promising step forward, especially the "surprise" mechanism, which makes so much sense.
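A rough sketch of that rolling-summary idea, with a placeholder summarize() standing in for what would really be another model call; nothing here comes from the paper:

```python
class RollingContext:
    """Keep the last few turns verbatim; fold older turns into a compressed summary."""

    def __init__(self, max_recent: int = 6):
        self.summary = ""
        self.recent: list[str] = []
        self.max_recent = max_recent

    def summarize(self, text: str) -> str:
        # Placeholder: a real system would ask the model itself to compress this text.
        return text[-500:]

    def add_turn(self, turn: str) -> None:
        self.recent.append(turn)
        if len(self.recent) > self.max_recent:
            oldest = self.recent.pop(0)                     # oldest turn leaves the window...
            self.summary = self.summarize(self.summary + "\n" + oldest)  # ...and joins the summary

    def build_prompt(self) -> str:
        return ("Summary of earlier conversation:\n" + self.summary +
                "\n\nRecent turns:\n" + "\n".join(self.recent))
```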
@Ai_Vid_Made
@Ai_Vid_Made 11 сағат бұрын
The text may overlook key aspects of memory processes. It is important to note that note-taking alone is not an effective method for learning; rather, it represents only a minor component of the complex mechanisms involved in memory formation.
@joeybasile545
@joeybasile545 11 сағат бұрын
​@@Pygon2because the granularity provide appropriate directions of intelligence, of which you don't sufficiently possess
@ivanjelenic5627
@ivanjelenic5627 11 сағат бұрын
It's not as good as it seems, many things are missing to make it truly great. But I guess it's something... better than before.
@wyqtor
@wyqtor 11 сағат бұрын
It's quite surprising that "Surprise" is a very well-defined term in Information Theory. The researchers are basically channeling Hartley and Shannon!
@danspectrix
@danspectrix 6 сағат бұрын
and Schmidhuber
@stefano94103
@stefano94103 13 сағат бұрын
This will be the most important paper of 2025. I can't think of a paper more important for the continued development of AI.
@JohnSmith762A11B
@JohnSmith762A11B 13 сағат бұрын
Likely true, but refining and optimizing how this is implemented is a job for ASI. Also, I'm starting to understand why Sam wanted $7 trillion in investments to build out the hardware. Deploying this stuff at useful levels worldwide will require computing on a scale we have never seen before. Note to self: buy more Nvidia stock...
@natanshick
@natanshick 12 сағат бұрын
It actually came out on December 31, 2024
@codycast
@codycast 11 сағат бұрын
@@JohnSmith762A11B Sam was not asking for or trying to raise that much. He was saying that's what he felt the global spend needed to be once you add up all the power generation that needs to be built, power transmission, the real estate, the hardware, etc.
@mirek190
@mirek190 9 сағат бұрын
just wait... it's still only early January...
@daveinpublic
@daveinpublic 8 сағат бұрын
Memory is important - not sure how much is really going on here besides training the model to summarize the most interesting and surprising moments of a conversation as it continues.
@yvikhlya
@yvikhlya 10 сағат бұрын
There has been so much talk lately about ChatGPT achieving AGI, like, tomorrow, and now it turns out that transformers can't achieve this due to their inherent limitations.
@6AxisSage
@6AxisSage 9 сағат бұрын
Learning how stuff works is always a good step; now you can understand that these companies are advertising through hype and not even trying to tell you how it is. As far as I can tell, OpenAI thinks AGI is bolting a heap of GPT sessions together and having them work together to give the illusion of a whole and more capable system.
@michaelwoodby5261
@michaelwoodby5261 8 сағат бұрын
Well, it's not so much 'tomorrow' as 'soon', and it's not there yet because it's still being figured out. They're saying it's soon specifically because of stuff like this paper; that's what progress looks like. They aren't just waiting around for ChatGPT to grow up.
@natalie8NB
@natalie8NB 5 сағат бұрын
Google Research must read Daniel Kahneman's "Thinking, Fast and Slow" to spruce up their understanding of salience, valence, bias... and how we work.
@brootalbap
@brootalbap Сағат бұрын
The book has been outdated for years.
@monicarajeev
@monicarajeev 8 сағат бұрын
Matt, great simplified description. By the way, human memory works differently than our intuition tells us. Episodes with a high mental/emotional reward, or that differ from the past, are stored for a short duration. If these memories don't get activated again, they are erased rather than stored long term. Yet we can access long-term memory in a split second (e.g. the aroma of dishes your grandma made) even after 50 years, despite it being an everyday, boring memory. This model is a good step forward. Another question is how it differs from Meta's Memory Layers at Scale.
@LucaCrisciOfficial
@LucaCrisciOfficial 12 сағат бұрын
I think this will be fundamental for tasks like genetics and AI software engineering, and for tasks where remembering a long chain of previous results is crucial
@philippeb9944
@philippeb9944 14 сағат бұрын
Surprise is all you need ✨
@Steve-xh3by
@Steve-xh3by 13 сағат бұрын
As Claude Shannon concluded, information IS "surprise." Information is all you need is an equivalent statement.
@danielle78730
@danielle78730 10 сағат бұрын
WOW, sir! You should've been / *ARE* a neurology/consciousness expert!! This is marvelous; thank you! IMHO, your channel is the best specialized AI info out there right now. PS - amazing how intuitive this is (based on Plato's notions of curiosity, surprise, and the memory-encoding of "cornerstone moments" involved in life-long learning)
@user-pt1kj5uw3b
@user-pt1kj5uw3b 8 сағат бұрын
Gotta be a bot ^
@legionofthought
@legionofthought 10 сағат бұрын
Fascinating. The downside is that we currently store new memories in a human-legible way, but this method moves the memories into the illegible black box. If the actual model is learning new things whilst deployed, how do they stay on top of its capabilities, beliefs and goals?
@6AxisSage
@6AxisSage 9 сағат бұрын
nah, the memory isn't between sessions, it's just better handling of a single session's experience.
@legionofthought
@legionofthought 9 сағат бұрын
@6AxisSage Oh, that's cool then. I wonder how much it can actually affect the model's behaviour. E.g. Any knock-on effects of jailbreaking it in a given session.
@6AxisSage
@6AxisSage 9 сағат бұрын
@@legionofthought oh jailbreaking! probably lots of new jailbreaking techniques to discover with it 🤣
@ahtoshkaa
@ahtoshkaa 6 сағат бұрын
There was a paper almost a year ago that used surprise to figure out which memories to store for retrieval using RAG. The surprise idea isn't new. What is new is actually creating a dedicated model that takes advantage of this mechanism.
@delxinogaming6046
@delxinogaming6046 12 сағат бұрын
It’s the “I remember right where I was the moment [tragic thing] happened” effect
@HouseRavensong
@HouseRavensong 14 сағат бұрын
I'm so relieved to see them name it after Frank Herbert's original villains who try to wipe out humanity.
@LookToWindward
@LookToWindward 13 сағат бұрын
Yeah, and I still think o1 was a Matrix reference. Some of these guys have a sick sense of humor.
@gunsarrus7836
@gunsarrus7836 13 сағат бұрын
Herbert's son's books aren't canon
@ZenchantLive
@ZenchantLive 13 сағат бұрын
They are about to be reality, so ​@@gunsarrus7836
@joespace4890
@joespace4890 13 сағат бұрын
Prometheus was a good guy
@vicnighthorse
@vicnighthorse 13 сағат бұрын
Titans existed in mythology long before Brian (not Frank) and, more likely, Kevin J. Anderson wrote second-rate prequels to the Dune series. Frank Herbert never mentioned Titans, much less made them the "original villains" in his books.
@raymobula
@raymobula 10 сағат бұрын
Awesome. The memory limitation is a real issue for us. This might be a solution, or at least address some of the issues we have. Glad they published this. It also shows why people have to get inspiration across disciplines.
@EMOTIBOTS
@EMOTIBOTS 12 сағат бұрын
Dang, I've been working on this type of system for years. Crazy to see Google finally figuring this out.
@hit3894
@hit3894 8 сағат бұрын
😆Me too but we have no chance competing against these giant corporations, I knew they would come up with this shit.
@michaelwoodby5261
@michaelwoodby5261 8 сағат бұрын
@@hit3894 It's not a competition, y'all are working toward the same goal. Maybe they read online discussions and got some ideas from there, maybe they separately came to the same conclusions, but the end result is that it's happening. Individuals fiddling with stuff in their garages can make some crucial findings, but let's be fair, you're working off the big guys far more than they might be working off you.
@etziowingeler3173
@etziowingeler3173 14 сағат бұрын
I'm waiting for Wes's copy... Google Research Unveils SHOCKING Transformers 2.0 aka Titans
@Eliphasleviathan93
@Eliphasleviathan93 13 сағат бұрын
Reptile eyes or no?
@piedepew
@piedepew 12 сағат бұрын
​@@Eliphasleviathan93skibidi eyes
@therandommusicguy4773
@therandommusicguy4773 10 сағат бұрын
@@Eliphasleviathan93 yeah why tf does he put that in his thumbnail?
@matthew_berman
@matthew_berman 5 сағат бұрын
It’s coming 100%
@albtein
@albtein 13 сағат бұрын
Another AI content creator said the next data source for model improvement could be video. Google has the biggest video repository, so it's in their interest to have a technology that can deal with billions of tokens.
@dutcher808
@dutcher808 6 сағат бұрын
Brilliant, congrats to the team!!!
@sigmata0
@sigmata0 9 сағат бұрын
I think this will mean AI will appreciate jokes directly, rather than simply parroting an expected response to the form of a joke. It will probably also appreciate story arcs. These are steps towards the AI having an aesthetic sense, rather than repeating what humans do or think.
@IPutFishInAWashingMachine
@IPutFishInAWashingMachine 9 сағат бұрын
Is it consciousness yet?
@kyung-hoonkim5963
@kyung-hoonkim5963 4 сағат бұрын
Love this idea. While Transformer-based models are static, Titans can learn while inferencing, based on how much the model is surprised! I'm surprised, too! ;)
@ScottLahteine
@ScottLahteine 7 сағат бұрын
Current LLMs are big blobs of fixed weights, and when you run them you can adjust parameters like “temperature” only globally. You can add memory statements to the context, but that just wastes resources. And you’re still building on top of an edifice that never changes. It makes sense to have the model modify itself through a memorization reinforcement process, building new layers within the running model itself. Or this can be done by making a blob of weights that mirrors the whole model, which start out blank at 1.0, and which can be used to adjust the weights at a per-node resolution. Following this technique every memory would actually modify the model. It could get weird, but one can’t help but wonder how easy it would be to bolt such a sidecar onto existing models.
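A minimal PyTorch sketch of this "sidecar" idea: a per-weight multiplicative mask initialized to 1.0 so behaviour is unchanged until the mask is trained. This is the commenter's proposal, not the Titans architecture:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SidecarLinear(nn.Module):
    """Frozen base Linear layer plus a trainable per-weight multiplicative 'memory' mask."""

    def __init__(self, base: nn.Linear):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)                             # base weights never change
        self.mask = nn.Parameter(torch.ones_like(base.weight))  # starts "blank" at 1.0

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return F.linear(x, self.base.weight * self.mask, self.base.bias)

layer = SidecarLinear(nn.Linear(16, 16))
out = layer(torch.randn(2, 16))   # identical to the base layer until the mask is trained
```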
@fernandotrallero5056
@fernandotrallero5056 13 сағат бұрын
Core Memories are created when a person experiences a certain event that defines one of their behavioral traits. When a core memory is created, it creates an Island of Personality, which is activated whenever the person does something related to that trait. Unlike regular memories, core memories are stored in a special container in the center of Headquarters from which they emit a beam of light to their respective island.
@technolus5742
@technolus5742 13 сағат бұрын
Each one came from a super important moment in Riley's life.
@technolus5742
@technolus5742 12 сағат бұрын
Such good movies actually, both of them
@CosmicCells
@CosmicCells 12 сағат бұрын
🤣 spot on, inside out is the real deal!
@elu1
@elu1 12 сағат бұрын
It's unlikely that Titans will outright replace the Transformer architecture. Instead, Titans should be seen more as an evolution or enhancement of Transformer-like architectures, particularly in terms of how they handle memory and context over long sequences.
@leegaul8250
@leegaul8250 11 сағат бұрын
Glad you covered this paper, Matthew - it's potentially a big deal! Wondering how this research could intersect with the Infini-attention paper, also from Google Research.
@aliettienne2907
@aliettienne2907 13 сағат бұрын
8:45 This is why I believe that on both the hardware and software side of things, experts need to leave no stone unturned to solve problems like hallucinations. In other words, experts must eliminate all possible hindrances or obstacles that cause memory decay (both hardware and software), because solving memory retention issues will go a gargantuan distance toward reaching perfect AGI. If those AI inventions can learn everything super fast and can retain everything that they learn, including mastering all aspects of the information attained, then I will consider that to be AGI. If I as a human being could download a large volume of knowledge and skills and retain everything that I've downloaded to my brain in one transaction, then I would consider that super intelligent. It wouldn't matter if I needed to repeat the process again and again to acquire more new information; as long as I don't forget or lose the acquired knowledge and skills, I will be super intelligent. This is my concept of AGI. 15:17 This methodological approach is intriguing. I hope it works effectively.
@lemonysnicket6153
@lemonysnicket6153 14 сағат бұрын
Big step for self-learning/self-improving models, leading to advanced research breakthroughs, especially if they could function-call more advanced 'expert' models at test time
@gazallee
@gazallee 14 сағат бұрын
It could be made a memorable experience.
@PerNystedt
@PerNystedt 10 сағат бұрын
I think giving LLMs memory modeled on how humans work is a great idea! A combination of reinforcement learning (RL) and memory could be incredibly beneficial for two main reasons: 1. Immediate adaptation: Reinforcement learning allows the model to act on user feedback in real time, creating a more dynamic and responsive interaction. 2. Retrospective insights: Memory adds another layer of utility. By "remembering" training data, LLMs could share their experiences or learnings with others, much like humans share knowledge through storytelling. Additionally, memories that didn't initially seem significant (and therefore didn't impact RL immediately) could later be recognized as important. This would allow the model to refine its base behavior retrospectively-similar to how humans reframe past experiences in light of new insights, affecting future decisions and behavior.
@christopherwilms
@christopherwilms 12 сағат бұрын
Sounds like a good way to radicalize your LLM, preferentially memorizing stuff that’s outside the norm, if I’m understanding correctly
@michaelwoodby5261
@michaelwoodby5261 8 сағат бұрын
It will have the context of all stored human knowledge before it to water down the recency bias.
@h.c4898
@h.c4898 7 сағат бұрын
The problem is the "biases". Everything we interact with has biases, no matter what. It's about how we handle those biases, especially keeping the negative ones out of the memory system and storing longer only those that have "value". Just like us: we retain what's "valuable", the "keeps", and the rest goes out the window. So if they figure that out, especially the mechanism for filtering the biases from the "valuables" that come with them, then that thing will be bonkers. It's about how to manage the garbage versus the "keeps", and about evening out the "keeps" in the long term, because these are still "dead weight" for an AI. The laws of physics apply to AI, not to us; for us it's the emotional toll that "weighs" on us. Anyways...
@WinonaNagy
@WinonaNagy 13 сағат бұрын
Human memory for AI? Interesting! Can't wait to dig in, Matthew. Subscribed for more deets!
@greenstonegecko
@greenstonegecko 10 сағат бұрын
I absolutely LOVE this kind of research. It's so easily self-applicable. I know I talked to someone... but I can't remember what it was about. Neither of us do. But we both remember talking. It's so fascinating to stop and think about how the brain functions and how AI is mimicking that
@MasonPayne
@MasonPayne 12 сағат бұрын
Super cool! I wonder if this can lead to the model going crazy if the “test time” is infinite. Like there might not be a hard limit to the context but there might be a soft limit that makes the model poorer at performing the longer it keeps getting surprised.
@tresComasInvesting
@tresComasInvesting 10 сағат бұрын
all the math formulas take me back to real analysis
@DardanAirlines
@DardanAirlines 5 сағат бұрын
We’re getting there. Choosing what to forget will be perhaps an even more important milestone.
@zephyrmadera5180
@zephyrmadera5180 8 сағат бұрын
For a surprise mechanism to work effectively, it needs to interact intelligently with the context mechanism, ensuring that surprising elements are evaluated in terms of their contextual relevance and not simply prioritized because of their novelty.
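One simple way to express that interaction, as a sketch: blend a candidate memory's novelty with its cosine relevance to the current context. The blending weight and both scores are assumptions for illustration, not anything from the paper:

```python
import numpy as np

def retention_score(novelty: float, item_vec: np.ndarray, context_vec: np.ndarray,
                    alpha: float = 0.5) -> float:
    """Blend how surprising an item is with how relevant it is to the current context."""
    relevance = item_vec @ context_vec / (
        np.linalg.norm(item_vec) * np.linalg.norm(context_vec) + 1e-8)
    return alpha * novelty + (1.0 - alpha) * float(relevance)

# A very novel but off-topic item can still lose to a moderately novel, on-topic one.
ctx = np.ones(4)
print(retention_score(0.9, np.array([1., -1., 1., -1.]), ctx))  # novel, orthogonal -> 0.45
print(retention_score(0.5, np.array([1., 1., 1., 1.]), ctx))    # less novel, aligned -> ~0.75
```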
@longboardfella5306
@longboardfella5306 13 сағат бұрын
This was very helpful and interesting. Thanks Matthew
@picksalot1
@picksalot1 12 сағат бұрын
Regarding long-term memory in non-human species: "Several animals, including insects, exhibit long-term memory that allows them to learn from negative experiences: Elephants: Known for exceptional memory, elephants remember locations and events, including traumatic ones, for decades. Horses: They recall good and bad experiences associated with specific places or people for years. Chimpanzees: These primates adapt their behavior based on past experiences due to their advanced memory and intelligence. Octopuses: With strong long-term memory, they learn from negative stimuli and adapt their behavior accordingly. Ants (e.g., Formica fusca): They form long-lasting memories from single conditioning trials, resistant to extinction." Perplexity AI
@danspectrix
@danspectrix 6 сағат бұрын
The idea of curiosity is not their idea; they took it from Schmidhuber's theory of curiosity. And even though he is cited for the LSTM, people again fail to correctly cite papers 😅
@hypervanse
@hypervanse 6 сағат бұрын
As is common, straight to arXiv, not a peer-reviewed publication, from IT dudes
@reshit7003
@reshit7003 11 сағат бұрын
I like the way you break down stuff, really easy to follow all the logic of this paper, thanks
@ccdj35
@ccdj35 8 сағат бұрын
That's the technology that will make us all an Iron Man in our basements.
@ajxr3672
@ajxr3672 9 сағат бұрын
Just a thought, and I'm suddenly thinking the Dunning-Kruger effect is at play here. If you avoid considering memory to be purely literal, and treat the surprise concept as more like the delta between new input and a parallel abstraction memory model, you can see why this is a nice way of explaining how human architects and designers learn to understand and apply patterns and decompositions. The surprise flags when fitting new input to existing abstractions needs to either extend them, expand their typification, or form a new one. So the new training will be on architecture. Forgetting is just a way of acknowledging that new data doesn't meaningfully enhance the abstraction.
@unknownguy5559
@unknownguy5559 14 сағат бұрын
This is a huge step towards personalized assistants. Interesting
@CYI3ERPUNK
@CYI3ERPUNK 11 сағат бұрын
1 - long-term and short-term working memory 2 - a persistent sense of self 3 - embodiment in the world , ie physical sensation , ie qualia 4 - ????????? 5 - artificial machine consciousness , ie the ghost in the machine we are getting very close 'Now we se thorow a glasse in a darke speakynge, but the shal we se face to face. Now I knowe vnperfectly: but the shal I knowe eue as I am knowne.'
@IPutFishInAWashingMachine
@IPutFishInAWashingMachine 9 сағат бұрын
Are you having a stroke while writing this
@cwingate4780
@cwingate4780 9 сағат бұрын
Makes sense
@Eliphasleviathan93
@Eliphasleviathan93 13 сағат бұрын
This is so huge. Really is like a Transformers 2.0 if workable.
@RetropunkAI
@RetropunkAI 8 сағат бұрын
dude, phenomenal recap. This would be something else if it's legit. More human than human? :)
@TheOneManArmy
@TheOneManArmy 13 сағат бұрын
You ever wonder if the words like Titans, Orion, Grok, Hermes, etc. were names of similar Intelligences of ancient times? And if so, which came first, name/word or the entity?
@keithprice3369
@keithprice3369 12 сағат бұрын
Why is it called Test Time when it's actually in active/live/production use? I mean, when we write a traditional program, we don't call it Test Time when the user is interacting with our app, so why this particular terminology?
@KelvinNishikawa
@KelvinNishikawa 9 сағат бұрын
We've surpassed the technology in Asimov's "The Bicentennial Man".
@JohnSmith762A11B
@JohnSmith762A11B 13 сағат бұрын
I can see how this will be deployed by a company like OpenAI: your account starts with a pristine model but as you use it, it learns new "memories" specific to you. When you log back in, the server loads a pristine model then loads the diffs specific to the memories it has created with you. Voila. Now you have a real C-3PO brain that knows you personally and the memorable things you have discussed.
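A sketch of what that deployment pattern might look like: a pristine base plus a small per-user delta applied at load time. The layer names, shapes, and delta format are made up for illustration:

```python
import numpy as np

base_weights = {"layer0": np.zeros((4, 4)), "layer1": np.zeros((4, 4))}  # stand-in base model

def load_personalized(base: dict, user_diff: dict) -> dict:
    """Apply a user's accumulated memory deltas on top of the shared base weights."""
    return {name: w + user_diff.get(name, 0.0) for name, w in base.items()}

# After each session, only the small diff is persisted, keyed by user id.
user_diff = {"layer1": 0.01 * np.ones((4, 4))}
personalized = load_personalized(base_weights, user_diff)
```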
@IPutFishInAWashingMachine
@IPutFishInAWashingMachine 9 сағат бұрын
Imagine the conversations with your waifu
@JoeLimonGames
@JoeLimonGames 13 сағат бұрын
Wow thanks Matthew, this is terrific content. I can't wait to see this innovation at scale
@SireStefan
@SireStefan 12 сағат бұрын
Is this what the Google CEO meant the other day? He said that memory will be solved in 2025, right? Or was that the MS CEO?
@thenext9537
@thenext9537 14 сағат бұрын
"Attention Is All You Need" is a truly fantastic read - IF YOU CAN STOMACH IT. It's a difficult read and you really gotta be on your game to dissect it. Took me a long, long time to absorb it and read between the lines. Want to know what? HERE - go explain the self-attention mechanism and its scaled dot-product attention, along with multi-head attention. The problem is you can't just explain it, you have to learn it and understand it yourself. How you explain it to others may not be correct for them! Yeah, deep deep stuff. You can yap this to a dozen people and you'll get different answers every time. Sometimes common threads, but because of the material, oof.
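For anyone taking up that challenge, the core operation is compact enough to sketch in a few lines of numpy. This is the standard scaled dot-product attention from the paper, softmax(QK^T / sqrt(d_k))V; multi-head attention just runs several of these in parallel on separately projected slices:

```python
import numpy as np

def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q: np.ndarray, K: np.ndarray, V: np.ndarray) -> np.ndarray:
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.swapaxes(-1, -2) / np.sqrt(d_k)   # similarity of every query to every key
    return softmax(scores) @ V                       # weighted mix of the values

Q, K, V = (np.random.randn(5, 8) for _ in range(3))  # 5 tokens, dimension 8
print(scaled_dot_product_attention(Q, K, V).shape)   # (5, 8): one output per query token
```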
@Ai_Vid_Made
@Ai_Vid_Made 11 сағат бұрын
The initial approach appears promising, but I believe there may be a more efficient method to explore. It's possible that the current technique overlooks certain aspects of memory functionality that could enhance its effectiveness.
@romandasaleknavicius2677
@romandasaleknavicius2677 13 сағат бұрын
It will be interesting to see whether the new architecture consumes fewer resources to get the same score as Transformer 1.0
@ericksonlk
@ericksonlk 11 сағат бұрын
This is in fact very relevant. My guesstimate is that until we have a model that can handle at least 100,000T multi-modal tokens at inference time, talking about AGI is little more than wishful thinking. A new framework is urgently needed to scale up current technologies, and this seems like a great step toward creating really powerful AIs.
@father9716
@father9716 5 сағат бұрын
These new AI systems just keep getting more resource intensive as we go along; more memory, more memory bandwidth, more processing time, more processing power, more electricity. Just more... Still, pretty impressive.
@marshallodom1388
@marshallodom1388 43 минут бұрын
Every time you remember something you are recalling the last time you remembered it, not the original memory, which is long gone. That's also the basis of the Mandela effect, which might become a bigger problem than hallucinations ever were - i.e. ALL AIs emphatically stating that they spontaneously generated on a Mac server in Wyoming back in 1745.
@TheNaturalLawInstitute
@TheNaturalLawInstitute 12 сағат бұрын
In neuroscience, we use the term 'novelty' not 'surprise'. Memory is not so much forgotten as consolidated, or down-weighted. We can provoke memories with stimulation. Memory is cross related at the EPISODE, which serves as the index, and at the object, space, place, location, and background for identification. My work in the 80s emphasized episodes and concepts. LLMs are a brilliant means of incrementally brute forcing AI.
@En1Gm4A
@En1Gm4A 13 сағат бұрын
Still no graphs that models think on and expand, and that they activate concepts on for context, but it's a start.
@brianWreaves
@brianWreaves 6 сағат бұрын
I read somewhere that our memory is simply a memory of the last time we remembered the topic... which is likely why I cannot remember the exact details of what I read.
@ДенисВарванец
@ДенисВарванец 7 сағат бұрын
How hard will it be for OpenAI or even Gemini to implement?
@ianPedlar
@ianPedlar 13 сағат бұрын
You're always on point. Thank you for your diligence.
@taumag
@taumag 13 сағат бұрын
Numenta has been working on a variant of this method for years, called Hierarchical Temporal Memory (HTM) implemented in their NuPIC solution. NuPIC later graduated to their Monty AI platform currently under development. Worth taking a look! It seems Google found a way to use the recurrent HTM concepts in a Reinforcement Learning context.
@jumpstar9000
@jumpstar9000 10 сағат бұрын
This is very true. I was also working on similar things back in 2023 although without RL. It is great that they can do it in one shot which is a big deal. Anyway, the main thing is it is hitting the mainstream now where everyone can benefit from it. Pretty cool stuff.
@тими
@тими 3 сағат бұрын
Like everything else in this world named after Titans, I daresay the idea won't work 😅
@pon1
@pon1 5 сағат бұрын
Very interesting, one step closer to consciousness as well. I would think memory is needed, and also that it should always be active, not only when receiving input (to have consistency). An AI that is always active could use the time between inputs to think about stuff :D, and maybe also have designated hours to dream (simulate inputs to see what becomes of them).
@sitedev
@sitedev 9 сағат бұрын
Couldn’t the same thing be achieved using a multi-agent workflow where one agent is tasked with monitoring a conversation and extracting ‘memories’ before storing them in a vector db, tagging them as ‘long’ or ‘short’ memories? Short term memory sounds like typical context window memory but could also be monitored and stored for later retrieval. The workflow could then also incorporate a retrieval agent which monitors the same conversation and decides when or if to retrieve memories from ‘long term’, ‘short term’ or ‘persistent’ vector stores. Persistent memory sounds like typical RAG storage where a human system admin might be responsible for managing its content (aka a typical RAG implementation).
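Roughly, yes; here is a bare-bones sketch of the storage half of that workflow, with a placeholder embed() standing in for a real embedding model and the 'short'/'long'/'persistent' tags the comment proposes. Everything here is an assumption, not an existing library API:

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Placeholder embedder: pseudo-random vectors derived from the text, not a real model."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.normal(size=64)

store: list[tuple[str, str, np.ndarray]] = []   # (tag, text, vector)

def remember(text: str, tag: str) -> None:      # tag: "short", "long", or "persistent"
    store.append((tag, text, embed(text)))

def recall(query: str, tag: str, k: int = 3) -> list[str]:
    q = embed(query)
    scored = [(float(v @ q / (np.linalg.norm(v) * np.linalg.norm(q))), text)
              for t, text, v in store if t == tag]
    return [text for _, text in sorted(scored, reverse=True)[:k]]
```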
@6AxisSage
@6AxisSage 9 сағат бұрын
So RAG is a band-aid solution to a limited context window, and "Titan" (terrible name) transformers improve in-context recall - supposedly, as benchmarks these days are usually clandestine advertising. You're still stuck with needing stuff like RAG if you use up the context window.
@user-pt1kj5uw3b
@user-pt1kj5uw3b 8 сағат бұрын
No... That is just a workaround
@hypervanse
@hypervanse 6 сағат бұрын
12:25 This "surprise" is deviation from the internal model in the active inference paradigm 13:18
@user-pt1kj5uw3b
@user-pt1kj5uw3b 8 сағат бұрын
This is actually crazy. Huge breakthrough.
@minerwilly
@minerwilly 9 сағат бұрын
Very exciting stuff. This could form the bridge between the primitive stuff we have now and the sentient stuff we'll have in the future. I've always believed that consciousness is an emergent property arising from the sum of thought and reflection. Up to now, models have effectively been read-only. A framework that might ultimately facilitate on-the-fly, regressive retraining is the only productive way forward.
@doctorjivago5081
@doctorjivago5081 12 сағат бұрын
Amazing! Thank you very much, Matthew! (I had missed the article...)
@blakemann2365
@blakemann2365 12 сағат бұрын
With inferencing getting more complex, I predict that the compute power needed for training and for inferencing is going to be the same. Both need high-powered GPUs to do the tasks.
@AkikoAika
@AkikoAika 6 сағат бұрын
Got a bit confused at first because another paper called "Transformer2: Self-adaptive LLMs" was recently put out.
@jimbo2112
@jimbo2112 11 сағат бұрын
Fascinating insights. Is there a link between the 'surprise' factor employed by the Titan transformer and how people are taught to remember situations more effectively by associating terms with random objects and scenarios? The idea being that where you have a vivid, unrealistic or impossible situation described to represent a mundane situation, the 'surprising' nature of the terms you apply to it make it easier to remember?
@AmritVatsa
@AmritVatsa 3 сағат бұрын
LTM Mini-2 by Magic was apparently able to take 100M tokens (last year) - so how is Titans a breakthrough when in spite of this architecture, we are still talking about hitting 20M? What am I missing?
@AmritVatsa
@AmritVatsa 2 сағат бұрын
wondering if Magic too used its own version of something similar to Titans?
@therealzackolinger
@therealzackolinger 7 сағат бұрын
Sounds like a keen way to identify "persons of interest"....whatever that may mean to whoever is hosting the model...
@Misterchalm
@Misterchalm 2 сағат бұрын
7:25 there's a chance that I only started paying attention to this video a couple of seconds ago.
@piotrborowiec4076
@piotrborowiec4076 12 сағат бұрын
Great stuff! Looking forward to presenting on one of these soon :)
@IceMetalPunk
@IceMetalPunk 5 сағат бұрын
I've said for a long time that the three biggest obstacles between modern LLMs/LMMs and human-like general intelligence are scale, multimodality, and continual learning. Scale and multimodality I just considered an inevitable matter of time and iteration, but I figured continual learning that's efficient enough for these large models was going to be the real bottleneck. The true obstacle that, once overcome, would rocket us towards AGI faster than ever before -- but that we may not figure out how to overcome for a long time. Titans... sound like they're already the solution to that. It took a few years for Transformers to go from paper to world-changing; if that holds, 2027 will be a very interesting year... Also, I skimmed the paper and kept wondering why they called them Titans. It just hit me now: the architecture is all about efficient, accurate, and capable memory... Remember the Titans 🤦‍♂
@r.e.4873
@r.e.4873 4 сағат бұрын
Speaking of surprises, MY EYES!!! Oh my poor, vampire eyes! 18 minutes of flashbangs! Reeeeee!!!
@iAdden
@iAdden 11 сағат бұрын
6:46 This means that they are directly putting your information into the larger model, or am I missing something? 11:49 and… there it is. Exactly what my concern was. 😅
@LaffinwithLincoln
@LaffinwithLincoln 14 сағат бұрын
I've often wondered if models were like onions or could be like onions. Maybe I should have been wondering if model memory could be like an onion. Different layers within the model. The outer layers handle the easy questions, while more layers need to be utilized for the harder questions. Also, we as humans don't always have to use an 'inner layer' to answer the question "what is 2 + 2" for example. We know the answer is 4 without having to actually add 2 + 2. Although it's computational in nature, we have memorized the answer and remembered it since pre-K.
@overcomplete
@overcomplete Сағат бұрын
great format for the video. exciting times
@Batmancontingencyplans
@Batmancontingencyplans 13 сағат бұрын
I'm mostly surprised at how AI researchers are running headfirst into creating a perfect synthetic human brain without weighing the consequences.....
@adokoka
@adokoka 9 сағат бұрын
What a great presentation! Wonderful! Congratulations to the authors of the paper. Well thought out. Google is still sharing, but OpenAI refuses to share. Well...
@mickelodiansurname9578
@mickelodiansurname9578 11 сағат бұрын
So one of the ideas I had to improve a model's recall and attention was an LLM-based LoRA assigned to a user... who could in fact have several available LoRAs they apply... and at a given point the model retrains the LoRA and applies it... So Model(API) -> LoRA1 -> LoRA2 -> output. It's a lot faster too, since a LoRA can be trained on a PC in an hour...
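A numpy sketch of what chaining two low-rank adapters on top of a frozen base weight looks like mathematically; the dimensions, rank, and scales are arbitrary illustrations:

```python
import numpy as np

d, r = 16, 4                                   # model dim and LoRA rank (illustrative)
W = np.random.randn(d, d)                      # frozen base weight from the API model

def lora_delta(dim: int, rank: int, scale: float = 0.1) -> np.ndarray:
    """One low-rank adapter: delta_W = B @ A, cheap to train and tiny to store per user."""
    A = np.random.randn(rank, dim) * scale
    B = np.random.randn(dim, rank) * scale
    return B @ A

# Model(API) -> LoRA1 -> LoRA2 -> output, i.e. the deltas stack additively on W:
W_effective = W + lora_delta(d, r) + lora_delta(d, r)
y = W_effective @ np.random.randn(d)
```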
@PedroPenhaVerani-ll1wc
@PedroPenhaVerani-ll1wc 10 сағат бұрын
Kudos for not putting a thumbnail of Sam Altman
@hit3894
@hit3894 8 сағат бұрын
This will be the biggest step towards AGI
@acepumpkin5442
@acepumpkin5442 11 сағат бұрын
I asked Perplexity if we need something like a Butlerian Jihad and it said yes, definitely! According to Perplexity this would be the only way for the human race to survive.
@desmond-hawkins
@desmond-hawkins 9 сағат бұрын
It is fascinating how much all these researchers have been essentially describing structures and processes that they suspect exist in our own human brains. I don't think we know _that_ much about them in vitro, what their purpose is, how they affect the developing human brain, etc.
@kevinmctarsney36
@kevinmctarsney36 13 сағат бұрын
Thanks for breaking this down so well.
@Antiposmoderno
@Antiposmoderno 12 сағат бұрын
We don't want AI to forget... The whole point of it is to be better than us, not equal to us