Representational Strengths and Limitations of Transformers

1,717 views

Google TechTalks

9 months ago

A Google TechTalk, presented by Clayton Sanford, 2023-07-18
Google Algorithms Seminar - ABSTRACT: Attention layers, as commonly used in transformers, form the backbone of modern deep learning, yet there is little mathematical work detailing their benefits and deficiencies compared with other architectures. In this talk, I'll present both positive and negative results on the representational power of attention layers, with a focus on intrinsic complexity parameters such as width, depth, and embedding dimension. On the positive side, I'll present a sparse averaging task, where recurrent and feedforward networks both have complexity scaling polynomially in the input size, whereas transformers scale merely logarithmically. On the negative side, I'll present a triple detection task, where attention layers in turn have complexity scaling linearly in the input size. I'll discuss these results and some of our proof techniques, which emphasize the value of communication complexity in the analysis of transformers. Based on joint work with Daniel Hsu and Matus Telgarsky.
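The abstract names the two tasks only briefly, so here is a minimal Python sketch of one plausible way to formalize them; these definitions are assumptions for illustration, not taken from the talk, and the paper's exact formulations may differ (e.g. in how indices may repeat in a triple).

import numpy as np

def sparse_average(z, neighbor_sets):
    # Assumed sparse-averaging task: output[i] is the mean of the
    # embeddings z[j] for j in the small index set neighbor_sets[i].
    # Transformers are claimed to solve this with size scaling only
    # logarithmically in the number of inputs.
    return np.stack([z[list(s)].mean(axis=0) for s in neighbor_sets])

def triple_detect(x, M):
    # Assumed triple-detection task: is there a triple i < j < k with
    # x[i] + x[j] + x[k] == 0 (mod M)? Attention layers are claimed to
    # need size scaling linearly in the input size for this task.
    n = len(x)
    return any(
        (x[i] + x[j] + x[k]) % M == 0
        for i in range(n)
        for j in range(i + 1, n)
        for k in range(j + 1, n)
    )

# Tiny usage example with hypothetical inputs.
z = np.random.randn(6, 4)                     # 6 inputs, embedding dim 4
sets = [{1, 2}, {0, 3}, {4, 5}, {0, 1}, {2, 3}, {1, 4}]
print(sparse_average(z, sets).shape)          # (6, 4)
print(triple_detect([3, 5, 7, 2], M=10))      # True: 3 + 5 + 2 = 10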
Bio: Clayton Sanford is an incoming 5th (and final) year PhD student at Columbia studying machine learning theory. His work focuses primarily on the representational properties and inductive biases of neural networks. He has additionally worked on learning combinatorial algorithms with transformers (as a Microsoft Research intern this summer) and on climate modeling with ML (as an Allen Institute for AI intern in summer 2022).

Comments: 1

@quorkquork · 9 months ago
The robotic voice quality strains hearing; I'd expect Google to do better.