NEW: INFINI Attention w/ 1 Mio Context Length

2,212 views

Discover AI

1 day ago

Comments: 7
@softsensei · 9 months ago
Just wanted to say you are doing the community such a great service. Thank you!
@hjups · 9 months ago
It's not really internal RAG; it's more of internal summarization, similar to RWKV (though the mechanism is different). RAG would require that the model can retrieve from an effectively unbounded DB rather than a finite summarized state. This method would very likely fail at lost-in-the-middle (LiM) tasks, similar to the one tried in a previous video (with instructions placed in the middle of a block of unrelated text). The model would have to know that the instruction is going to be more important than specific details from the text passage (the same concept applies to retrieving specific details). That also means this method may fail at copying outside of the current block, similar to Mamba variants (and for the same reason).
@KitcloudkickerJr · 9 months ago
So, essentially, it's a key-value memory network baked into an LLM?
@hjups · 9 months ago
It's a summarization state constructed from the outer product of the block K-V vectors. Each block of size S has K and V matrices of size S×d, and they form a d×d "summary" of the K-V state for that block. The next block can then "query" that d×d state using a linear attention mechanism, which is added to the local self-attention (within the block). Essentially it's a fancy hybrid model like Jamba, just implemented differently, and it should have similar pitfalls. At least the summarization state here is of size d×d rather than 1×(a*d), where a
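The mechanism described in the comment above can be sketched in a few lines. The following NumPy snippet is a minimal illustration, not the paper's implementation: each block's K-V outer product is folded into a d×d summary state, the next block reads from that state with linear attention, and the read-out is blended with ordinary local softmax attention. The ELU+1 feature map and the additive memory update follow the Infini-attention paper; the fixed blend gate, block size S, and head dimension d are arbitrary illustrative choices (the paper learns a per-head gate).

```python
# Minimal sketch of a compressive-memory attention block (assumed shapes, fixed gate).
import numpy as np

def elu_plus_one(x):
    # Non-negative feature map sigma(x) = ELU(x) + 1 used for the linear-attention read.
    return np.where(x > 0, x + 1.0, np.exp(x))

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def infini_block(Q, K, V, M, z, gate=0.5):
    """Process one block of S tokens with head dim d.
    Q, K, V: (S, d); M: (d, d) running summary state; z: (d,) normalizer."""
    S, d = Q.shape
    # 1) Retrieve from the compressed memory of all previous blocks (linear attention).
    sigma_Q = elu_plus_one(Q)                                  # (S, d)
    A_mem = (sigma_Q @ M) / (sigma_Q @ z + 1e-6)[:, None]      # (S, d)
    # 2) Ordinary causal softmax attention within the current block.
    scores = (Q @ K.T) / np.sqrt(d)
    scores = np.where(np.tril(np.ones((S, S), dtype=bool)), scores, -np.inf)
    A_local = softmax(scores, axis=-1) @ V                     # (S, d)
    # 3) Blend the two read-outs (fixed gate here; learned per head in the paper).
    A = gate * A_mem + (1.0 - gate) * A_local
    # 4) Fold this block's K-V outer product into the d x d summary state.
    sigma_K = elu_plus_one(K)                                  # (S, d)
    M = M + sigma_K.T @ V                                      # (d, d)
    z = z + sigma_K.sum(axis=0)                                # (d,)
    return A, M, z

# Usage: stream blocks through a fixed-size memory instead of growing a KV cache.
d, S = 64, 128
M, z = np.zeros((d, d)), np.zeros(d)
for _ in range(4):                                             # four blocks of S tokens each
    Q, K, V = (np.random.randn(S, d) for _ in range(3))
    out, M, z = infini_block(Q, K, V, M, z)
print(out.shape, M.shape)                                      # (128, 64) (64, 64)
```

Note how the memory stays d×d no matter how many blocks are streamed, which is exactly why specific details from early blocks can be lost, as the comments above point out.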
@Whysicist · 9 months ago
Virtual Multiport Memory?
@EobardUchihaThawne · 9 months ago
I wish I had a good enough level of math to understand how those formulas are derived.
@Charles-Darwin · 9 months ago
With their DeepMind arm, I'm thinking they'll reach organic/organic-analog computing first. Imagine if states and events were global: global tx/rx. A chemical solution. Shame on Google for assisting the war machine with their tech. "Don't be evil"
RING Attention explained: 1 Mio Context Length
24:34
Discover AI
3.8K views
Next-Gen AI: RecurrentGemma (Long Context Length)
30:58
Discover AI
4K views
Visualizing transformers and attention | Talk for TNG Big Tech Day '24
57:45
NEW: Smarter AI Reasoning w Knowledge Graphs & Agents
28:44
Discover AI
541 views
In-Context Learning: EXTREME vs Fine-Tuning, RAG
21:42
Discover AI
4.5K views
Generative Model That Won 2024 Nobel Prize
33:04
Artem Kirsanov
262K views
Google's NEW TITANS: Transformer w/ RNN Memory
25:55
Discover AI
6K views
Transformers (how LLMs work) explained visually | DL5
27:14
3Blue1Brown
4.5M views
Attention in transformers, step-by-step | DL6
26:10
3Blue1Brown
2.1M views
MIT 6.S191 (2023): Recurrent Neural Networks, Transformers, and Attention
1:02:50