Mamba, Mamba-2 and Post-Transformer Architectures for Generative AI with Albert Gu - 693

5,316 views

The TWIML AI Podcast with Sam Charrington

1 day ago

Comments: 15
@first-thoughtgiver-of-will2456 • 2 months ago
Thanks Albert and Sam! Surprisingly insightful for someone researching the Mamba architecture right now!
@l.halawani • 1 month ago
Super happy to see you on YT! Been missing you since Alphabet scrapped Google Podcasts! Awesome content.
@JonathanWong-u8g • 1 month ago
As of 2024, the latent diffusion paradigm has been very successful in these 'natural' modality tasks (sound, images, video), and the paradigm is now being applied to 3D spatial awareness. We've actually been in the post-transformer era for a while (1-2 years)! I am wondering where Gu's work fits in here: perhaps these Mamba models will produce better latents for extremely long-context video and spatial point-cloud data? Will stay tuned. Thanks for the talk!
@mephilees7866 • 1 month ago
The problem with latent diffusion (something like DiT) is that it's too slow, especially with high-bandwidth data like images. Mamba will help on the encoder side, but I don't see how to benefit from it in the decoder. I'd suggest you check out VAR (Visual Autoregressive modeling): it works by regressing the next resolution instead of denoising from noise. Around 20x faster with better performance.
@JonathanWong-u8g • 1 month ago
@@mephilees7866 Excellent, thank you!
@wwkk4964 • 4 months ago
Brilliant, the tokenizer ought to be a learned parameter that coevolves in response to task.
@lobovutare • 3 months ago
Interesting to hear that the author of Mamba feels that attention is indispensable. My initial thought was that Mamba is a full replacement for Transformers, but it seems that Gu believes attention layers are still necessary for the model to be able to reason at the level of tokens. Perhaps hybrid models like Jamba are the way to go.
@Noah-jz3gt • 2 months ago
Well, it seems Gu is trying to establish theoretical connections between attention and SSMs in Mamba-2. To me, Mamba doesn't even look like an SSM anymore, to be honest.
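[Editor's note: for readers following the attention-vs-SSM discussion in this thread, the linear state-space recurrence at the heart of layers like Mamba can be sketched in a few lines. This is an illustrative NumPy toy only; the function name `ssm_scan` and all dimensions are invented for this sketch, and real Mamba makes A, B, C input-dependent ("selective") and replaces the Python loop with a hardware-aware parallel scan.]

```python
import numpy as np

# Minimal linear state-space recurrence:
#   h_t = A h_{t-1} + B x_t,   y_t = C h_t
def ssm_scan(A, B, C, xs):
    """Run the recurrence over a 1-D input sequence xs; return outputs ys."""
    h = np.zeros(A.shape[0])
    ys = []
    for x in xs:
        h = A @ h + B * x      # state update
        ys.append(C @ h)       # readout
    return np.array(ys)

rng = np.random.default_rng(0)
A = 0.9 * np.eye(4)            # stable diagonal state matrix (toy choice)
B = rng.standard_normal(4)
C = rng.standard_normal(4)
ys = ssm_scan(A, B, C, rng.standard_normal(16))
print(ys.shape)  # (16,)
```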
@chickenp7038 • 3 months ago
Great interview
@ps3301 • 3 months ago
How about vision?
@minshenlin127 • 15 days ago
Hi, may I know how to add your channel to Apple Podcasts?
@twimlai • 15 days ago
Hi. You can follow our channel here: podcasts.apple.com/us/podcast/the-twiml-ai-podcast-formerly-this-week-in-machine/id1116303051
@minshenlin127 • 13 days ago
@@twimlai Thank you for your reply. But I cannot visit the site; the URL seems invalid.
@twimlai • 13 days ago
Strange. Works on my end. Try twimlai.com/podcast and look for the button on that page.
@minshenlin127 • 2 days ago
@@twimlai Thank you very much. But it's still not working, so I use Spotify now 😃