Gail Weiss: Thinking Like Transformers

12,569 views

Formal Languages and Neural Networks Seminar

2 years ago

Paper presented by Gail Weiss to the Neural Sequence Model Theory Discord on 24 February 2022.
Gail's references:
On Transformers and their components:
- Thinking Like Transformers (Weiss et al., 2021) arxiv.org/abs/2106.06981 (REPL here: github.com/tech-srl/RASP; a short illustrative sketch of its core primitives follows this list)
- Attention is All You Need (Vaswani et al., 2017) arxiv.org/abs/1706.03762
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (Devlin et al., 2018) arxiv.org/abs/1810.04805
- Improving Language Understanding by Generative Pre-Training (Radford et al., 2018) s3-us-west-2.amazonaws.com/op...
- Are Transformers universal approximators of sequence-to-sequence functions? (Yun et al., 2019) arxiv.org/abs/1912.10077
- Theoretical Limitations of Self-Attention in Neural Sequence Models (Hahn, 2019) arxiv.org/abs/1906.06755
- On the Ability and Limitations of Transformers to Recognize Formal Languages (Bhattamishra et al., 2020) arxiv.org/abs/2009.11264
- Attention is Turing-Complete (Perez et al., 2021) jmlr.org/papers/v22/20-302.html
- Statistically Meaningful Approximation: a Case Study on Approximating Turing Machines with Transformers (Wei et al., 2021) arxiv.org/abs/2107.13163
- Multilayer feedforward networks are universal approximators (Hornik et al., 1989) www.cs.cmu.edu/~epxing/Class/...
- Deep Residual Learning for Image Recognition (He et al., 2016) www.cv-foundation.org/openacc...
- Universal Transformers (Dehghani et al., 2018) arxiv.org/abs/1807.03819
- Improving Transformer Models by Reordering their Sublayers (Press et al., 2019) arxiv.org/abs/1911.03864
On RNNs:
- Explaining Black Boxes on Sequential Data using Weighted Automata (Ayache et al., 2018) arxiv.org/abs/1810.05741
- Extraction of rules from discrete-time recurrent neural networks (Omlin and Giles, 1996) www.semanticscholar.org/paper...
- Extracting Automata from Recurrent Neural Networks Using Queries and Counterexamples (Weiss et al., 2017) arxiv.org/abs/1711.09576
- Connecting Weighted Automata and Recurrent Neural Networks through Spectral Learning (Rabusseau et al., 2018) arxiv.org/abs/1807.01406
- On the Practical Computational Power of Finite Precision RNNs for Language Recognition (Weiss et al., 2018) aclanthology.org/P18-2117/
- Sequential Neural Networks as Automata (Merrill, 2019) aclanthology.org/W19-3901.pdf
- A Formal Hierarchy of RNN Architectures (Merrill et al., 2020) aclanthology.org/2020.acl-mai...
- Inferring Algorithmic Patterns with Stack-Augmented Recurrent Nets (Joulin and Mikolov, 2015) proceedings.neurips.cc/paper/...
- Learning to Transduce with Unbounded Memory (Grefenstette et al., 2015) proceedings.neurips.cc/paper/...
Paper mentioned in the discussion at the end:
- Attention is Not All You Need: Pure Attention Loses Rank Doubly Exponentially with Depth (Dong et al., 2021) icml.cc/virtual/2021/oral/9822
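
For readers who want a feel for the RASP REPL linked above, here is a minimal Python sketch of the two primitives the paper builds its programs from, select and aggregate, used to express the paper's reverse example. This is an emulation of ours under simplifying assumptions, not the official REPL: the primitive names and the reverse program follow Weiss et al. (2021), but the implementation details are a simplification.

```python
# A minimal, illustrative emulation of RASP's core primitives
# (Weiss et al., 2021). Not the official REPL at github.com/tech-srl/RASP;
# real RASP averages numeric values under uniform attention, which this
# sketch simplifies for the single-selection case used by reverse.

def select(keys, queries, predicate):
    """Build a boolean 'attention' matrix: row q marks which key
    positions k satisfy predicate(keys[k], queries[q])."""
    return [[predicate(k, q) for k in keys] for q in queries]

def aggregate(selector, values):
    """At each query position, collect the selected values; with exactly
    one selection per row this is the 'copy' behaviour reverse needs."""
    out = []
    for row in selector:
        picked = [v for chosen, v in zip(row, values) if chosen]
        out.append(picked[0] if len(picked) == 1 else picked)
    return out

def reverse(tokens):
    """The paper's reverse program: position i attends to its mirror
    position (length - 1 - i) and copies the token found there."""
    n = len(tokens)
    indices = list(range(n))
    opposite = [n - 1 - i for i in indices]  # length - indices - 1
    flip = select(indices, opposite, lambda k, q: k == q)
    return aggregate(flip, tokens)

print("".join(reverse(list("hello"))))  # -> olleh
```

Read this way, reverse is a single attention pattern: each position attends to its mirror image and copies the token there, which is exactly the style of "transformer-level" program the talk walks through.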

Comments: 8
@swim3936 · 1 year ago
fantastic presentation!
@GodofStories · 1 year ago
This is great
@alexanderkyte4675 · 1 year ago
Could I please have the slides? They’re partially obscured by the listeners here. I’d like to use them for a reading group.
@formallanguagesandneuralne5578 · 1 year ago
Hey, I'm not managing to respond from my own account, so posting from here: the slides are on my website, which is hosted on GitHub - gailweiss dot github dot io
@stevenshaw124 · 1 year ago
this was an excellent presentation! thank you!
@homeboundrecords6955 · 1 year ago
I'll bet this reply will not be read, but... isn't the "subject" = "I" and the "object" = "dog"?
@LGcommaI · 1 year ago
Yes, that's correct. The terminology is confusing, though, if one knows Latin: the 'subject' (sub-iectum) is literally 'that which is thrown UNDER', while the 'object' (ob-iectum) is 'that which is thrown against'. Everyday sensibilities would therefore expect the subject to be the one that has something done TO it, and the object to be the one that does something. The standard grammatical convention is the OPPOSITE, however.
@RaviAnnaswamy · 1 year ago
@LGcommaI In everyday English, 'object' generally refers to inert things, while 'subject' is used for persons ("The king asked his subjects to pay more tax during the drought years..."). This could be why English grammar uses 'subject' for the actor and 'object' for the acted-upon.