Representational Strengths and Limitations of Transformers

1,717 views

Google TechTalks

9 months ago

A Google TechTalk, presented by Clayton Sanford, 2023-07-18
Google Algorithms Seminar - ABSTRACT: Attention layers, as commonly used in transformers, form the backbone of modern deep learning, yet there is little mathematical work detailing their benefits and deficiencies compared with other architectures. In this talk, I'll present both positive and negative results on the representational power of attention layers, with a focus on intrinsic complexity parameters such as width, depth, and embedding dimension. On the positive side, I'll present a sparse averaging task, where recurrent and feedforward networks both have complexity scaling polynomially in the input size, whereas transformers scale merely logarithmically. On the negative side, I'll present a triple detection task, where attention layers in turn have complexity scaling linearly in the input size. I'll discuss these results and some of our proof techniques, which emphasize the value of communication complexity in the analysis of transformers. Based on joint work with Daniel Hsu and Matus Telgarsky.
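The abstract names the two tasks only briefly, so here is a minimal Python sketch of one plausible way to formalize them; these definitions are assumptions for illustration, not taken from the talk, and the paper's exact formulations may differ (e.g. in how indices may repeat in a triple).

import numpy as np

def sparse_average(z, neighbor_sets):
    # Assumed sparse-averaging task: output[i] is the mean of the
    # embeddings z[j] for j in the small index set neighbor_sets[i].
    # Transformers are claimed to solve this with size scaling only
    # logarithmically in the number of inputs.
    return np.stack([z[list(s)].mean(axis=0) for s in neighbor_sets])

def triple_detect(x, M):
    # Assumed triple-detection task: is there a triple i < j < k with
    # x[i] + x[j] + x[k] == 0 (mod M)? Attention layers are claimed to
    # need size scaling linearly in the input size for this task.
    n = len(x)
    return any(
        (x[i] + x[j] + x[k]) % M == 0
        for i in range(n)
        for j in range(i + 1, n)
        for k in range(j + 1, n)
    )

# Tiny usage example with hypothetical inputs.
z = np.random.randn(6, 4)                     # 6 inputs, embedding dim 4
sets = [{1, 2}, {0, 3}, {4, 5}, {0, 1}, {2, 3}, {1, 4}]
print(sparse_average(z, sets).shape)          # (6, 4)
print(triple_detect([3, 5, 7, 2], M=10))      # True: 3 + 5 + 2 = 10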
Bio: Clayton Sanford is an incoming 5th (and final) year PhD student at Columbia studying machine learning theory. His work focuses primarily on the representational properties and inductive biases of neural networks. He has additionally worked on learning combinatorial algorithms with transformers (as a Microsoft Research intern this summer) and on climate modeling with ML (as an Allen Institute for AI intern in summer 2022).

Comments: 1

@quorkquork · 9 months ago
The robotic voice quality strains hearing; I'd expect Google to do better.