Fellowship: Robust Self Supervised Audio Visual Speech Recognition

Views: 186

Launchpad

1 year ago

#artificialintelligence #arxiv #datascience #encoding #machinelearning #deeplearning #speechrecognition
Link to paper: paperswithcode.com/paper/robu...
Paper by: Bowen Shi, Wei-Ning Hsu, Abdelrahman Mohamed
Presentation by Fellowship.ai team: www.fellowship.ai/
Fellowship.ai is brought to you by Launchpad.ai: www.launchpad.ai/
Launchpad brings cutting-edge technologies and AI applications to organizations. To learn more about our products and services, see: www.launchpad.ai/ai-developme...
_______________________________________________________________
Abstract: Audio-based automatic speech recognition (ASR) degrades significantly in noisy environments and is particularly vulnerable to interfering speech, as the model cannot determine which speaker to transcribe. Audio-visual speech recognition (AVSR) systems improve robustness by complementing the audio stream with the visual information that is invariant to noise and helps the model focus on the desired speaker. However, previous AVSR work focused solely on the supervised learning setup; hence the progress was hindered by the amount of labeled data available. In this work, we present a self-supervised AVSR framework built upon Audio-Visual HuBERT (AV-HuBERT), a state-of-the-art audio-visual speech representation learning model. On the largest available AVSR benchmark dataset LRS3, our approach outperforms prior state-of-the-art by ~50% (28.0% vs. 14.1%) using less than 10% of labeled data (433hr vs. 30hr) in the presence of babble noise, while reducing the WER of an audio-based model by over 75% (25.8% vs. 5.8%) on average. Code and demo are available at github.com/facebookresearch/a...
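The percentages in the abstract are relative reductions in word error rate (WER), which are easy to misread as absolute differences. As a minimal sketch of that arithmetic (the helper name `relative_reduction` is hypothetical and not taken from the paper or its code), the figures quoted above can be checked like this:

```python
def relative_reduction(baseline: float, improved: float) -> float:
    """Relative reduction, in percent, going from `baseline` to `improved`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - improved) / baseline * 100.0


# Figures quoted in the abstract (WER in percent, labeled data in hours).
prior_sota_wer, avsr_wer = 28.0, 14.1        # babble noise, LRS3
audio_only_wer, av_model_wer = 25.8, 5.8     # average over noise conditions
full_labels_hr, used_labels_hr = 433.0, 30.0

print(f"vs. prior state of the art: {relative_reduction(prior_sota_wer, avsr_wer):.1f}% lower WER")
print(f"vs. audio-only model:       {relative_reduction(audio_only_wer, av_model_wer):.1f}% lower WER")
print(f"labeled data used:          {used_labels_hr / full_labels_hr * 100:.1f}% of 433 hr")
```

Under these numbers the first reduction comes out just under 50%, the second just over 77%, and 30 hours is roughly 7% of 433 hours, which matches the "~50%", "over 75%", and "less than 10%" wording in the abstract.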
