xLSTM: Extended Long Short-Term Memory

  2,037 views

Gabriel Mongaras

A day ago

Comments: 5
@gabrielmongaras 6 months ago
Forgot to mention: you just stack sLSTM/mLSTM layers similar to a transformer, as usual 😏 The sLSTM uses a transformer-like block and the mLSTM uses an SSM-like block, as described in section 2.4.
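As a rough illustration of that stacking pattern, here is a minimal PyTorch sketch of the two block layouts from section 2.4: the sLSTM in a post-up-projection (transformer-like) block and the mLSTM in a pre-up-projection (SSM-like) block. The recurrent cells themselves are stubbed out with an identity module, so CellStub, SLSTMBlock, and MLSTMBlock are illustrative names under those assumptions, not code from the paper.

import torch
import torch.nn as nn

# Stand-in for the actual recurrent cell; the real sLSTM/mLSTM
# gating and state updates from the paper would replace this.
class CellStub(nn.Module):
    def forward(self, x):
        return x

class SLSTMBlock(nn.Module):
    # Post-up-projection ("transformer-like") block: run the cell
    # first, then a position-wise MLP, each behind a residual.
    def __init__(self, dim, expand=4):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.cell = CellStub()
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(
            nn.Linear(dim, expand * dim),
            nn.GELU(),
            nn.Linear(expand * dim, dim),
        )

    def forward(self, x):
        x = x + self.cell(self.norm1(x))
        return x + self.mlp(self.norm2(x))

class MLSTMBlock(nn.Module):
    # Pre-up-projection ("SSM-like") block: widen the features, run
    # the cell in the wide space, project back down, one residual.
    def __init__(self, dim, expand=2):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.up = nn.Linear(dim, expand * dim)
        self.cell = CellStub()
        self.down = nn.Linear(expand * dim, dim)

    def forward(self, x):
        return x + self.down(self.cell(self.up(self.norm(x))))

# Stack the blocks like transformer layers.
model = nn.Sequential(MLSTMBlock(64), SLSTMBlock(64),
                      MLSTMBlock(64), SLSTMBlock(64))
x = torch.randn(2, 16, 64)   # (batch, seq_len, model_dim)
print(model(x).shape)        # torch.Size([2, 16, 64])

The residual-plus-normalization wrapping is the same convention transformer stacks use; only the inner module changes between the two block types.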
@acasualviewer5861 6 months ago
Is it slow to train like LSTMs and RNNs are? A major benefit of Transformers is faster, parallelized training. I would assume xLSTMs are constrained by their sequential nature.
@gabrielmongaras 6 months ago
Yep, it should still be slow to train. I don't see any way to make one of the cells into something parallel like a transformer, since the cells are so complicated.
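To make the bottleneck concrete, here is a stripped-down sketch of the sLSTM recurrence; the paper's stabilization of the exponential gate is omitted, and slstm_forward is an illustrative name, not the paper's code. The point is the loop: the hidden state at step t depends on the hidden state at step t-1, so time steps cannot be computed in parallel the way attention handles all positions at once.

import torch

def slstm_forward(x, W, R, b):
    # x: (seq_len, dim); W, R: (4*dim, dim); b: (4*dim,)
    dim = x.shape[1]
    h = torch.zeros(dim)   # hidden state
    c = torch.zeros(dim)   # cell state
    n = torch.ones(dim)    # normalizer state
    outs = []
    for x_t in x:  # sequential: step t must wait for step t-1
        pre = W @ x_t + R @ h + b   # recurrent term R @ h uses h_{t-1}
        z, i, f, o = pre.chunk(4)
        z = torch.tanh(z)           # cell input
        i = torch.exp(i)            # exponential input gate (new in xLSTM)
        f = torch.sigmoid(f)        # forget gate
        o = torch.sigmoid(o)        # output gate
        c = f * c + i * z           # cell state update
        n = f * n + i               # normalizer state update
        h = o * (c / n)             # hidden state fed into the next step
        outs.append(h)
    return torch.stack(outs)

dim = 16
x = torch.randn(8, dim)                    # 8 time steps
W = 0.1 * torch.randn(4 * dim, dim)
R = 0.1 * torch.randn(4 * dim, dim)
b = torch.zeros(4 * dim)
print(slstm_forward(x, W, R, b).shape)     # torch.Size([8, 16])

Worth noting: per the paper, the mLSTM drops this hidden-to-hidden term entirely (no memory mixing), which is what allows a parallel formulation; the strictly sequential constraint applies to the sLSTM.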
@-slt 6 months ago
The constant movement of the screen makes my head (and surely many others') explode. Please move a little less, and zoom in and out less; it helps the viewer focus on the text and your explanation. Thanks. :)
@gabrielmongaras 6 months ago
Thanks for the feedback! Will keep this in mind next time I'm recording.