NEW: INFINI Attention w/ 1 Mio Context Length

2,193 views

Discover AI

1 day ago

Comments: 7
@softsensei • 8 months ago
Just wanted to say you are doing the community such a great service and contribution. Thank you!
@hjups • 9 months ago
It's not really internal RAG, it's more like internal summarization, similar to RWKV (though the mechanism is different). RAG would require that the model can retrieve from an infinite DB rather than a finite summarized state. This method would very likely fail at lost-in-the-middle (LiM) tasks, similar to the one tried in a previous video (with instructions buried in the middle of a block of unrelated text). The model would have to know that the instruction is going to be more important than specific details from the text passage (the same concept applies to retrieving specific details). That also means this method may fail at copying outside of the current block, similar to Mamba variants (and for the same reason).
@KitcloudkickerJr • 9 months ago
So, essentially, it's a key-value memory network baked into an LLM?
@hjups • 9 months ago
It's a summarization state, constructed from the outer product of the block K-V vectors. So each block of size S has K and V matrices of size S×d, and their outer product forms a d×d "summary" of the K-V state for that block. Then the next block can "query" into that d×d state using a linear attention mechanism, which is added to the local self-attention (within the block). Essentially a fancy hybrid model like Jamba, just implemented differently, but it should have similar pitfalls. At least the summarization state here is of size d×d rather than 1×(a*d), where a
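For readers trying to picture this, here is a minimal NumPy sketch of the compressive-memory idea described in the reply above: each block's K-V outer product is accumulated into a d×d matrix, and later blocks read from it with a linear-attention lookup. The ELU+1 feature map and the running normalization vector follow the Infini-attention paper; the function names, block size, and dimensions are illustrative assumptions, not the authors' code.

```python
# Minimal sketch of the compressive memory described above (illustrative, not the paper's code).
import numpy as np

def elu_plus_one(x):
    # ELU(x) + 1 keeps query/key features positive, so the
    # linear-attention normalization below stays well defined.
    return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0.0)))

def process_blocks(Q, K, V):
    """Q, K, V: arrays of shape (num_blocks, S, d).
    Returns, for each block, the retrieval from the memory built over earlier blocks."""
    num_blocks, S, d = Q.shape
    M = np.zeros((d, d))   # compressive memory: running sum of K-V outer products
    z = np.zeros(d)        # normalization term: running sum of key features
    outputs = []
    for b in range(num_blocks):
        q = elu_plus_one(Q[b])
        k = elu_plus_one(K[b])
        v = V[b]
        # Linear-attention readout from the memory of *previous* blocks:
        # (S, d) @ (d, d) -> (S, d), normalized per query by q . z.
        denom = q @ z + 1e-6
        A_mem = (q @ M) / denom[:, None]
        # Fold this block's K-V summary into the memory.
        M = M + k.T @ v            # (d, d) outer-product accumulation
        z = z + k.sum(axis=0)      # (d,)
        outputs.append(A_mem)
    return np.stack(outputs)
```

In the full model this memory readout is mixed with the block's ordinary softmax self-attention through a learned gate, which is the "added to the local self-attention" step mentioned above.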
@Whysicist • 9 months ago
Virtual Multiport Memory?
@EobardUchihaThawne • 9 months ago
I wish I had a good enough level of math to understand how those formulas are derived.
@Charles-Darwin • 9 months ago
With their DeepMind arm, I'm thinking they'll reach organic/organic-analog computing first. Imagine if states and events were global: global tx/rx. A chemical solution. Shame on Google for assisting the war machine with their tech. "Don't be evil"