Developing Machine Learning Based Speech Enhancement Models for Teams and Skype | Ross Cutler

@Scale

Microsoft Teams and Skype are used daily by hundreds of millions of users. Their usage has increased significantly since the start of the COVID-19 pandemic, and they are critical tools for working remotely and communicating with friends and family. In this talk we describe how we are replacing traditional digital signal processing components with machine learning based models.
One recent feature in Microsoft Teams and Skype removes annoying background noise from calls; background noise is the third most common call quality issue users complain about. We used deep learning to create a noise suppressor that performs 7X better than the previous non-machine-learning solution. It's a great feature, but how we developed it is even more interesting. Starting just under two years before shipping the feature, we first created three open source datasets and test sets for Deep Noise Suppression (DNS), as well as a best-in-class open source subjective test framework. We held two international challenges for DNS, at INTERSPEECH 2020 and ICASSP 2021. Using the challenge results and our own models, we created the first objective function for background noise that is highly correlated with human perception (PCC=0.97). This allowed us to iterate quickly on model training and evaluation, and enabled us to create best-in-class DNS models. This type of open development model is new at Microsoft, and we are successfully applying it to other speech enhancement components such as acoustic echo cancellation and packet loss concealment.
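
To make the PCC figure concrete: PCC is the Pearson correlation coefficient, here between an objective metric's scores and human ratings of the same material. The sketch below is only an illustration of that calculation, not the method used in the talk. The function name `pearson_correlation` and both score lists are hypothetical and invented for this example; the description does not say whether the reported 0.97 was computed per model or per clip.

```python
import math


def pearson_correlation(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: one human rating and one objective score per evaluated
# model. These numbers are made up and are not results from the talk.
subjective_mos = [2.1, 2.8, 3.0, 3.4, 3.9, 4.2]
objective_scores = [2.3, 2.7, 3.1, 3.3, 3.8, 4.3]

pcc = pearson_correlation(objective_scores, subjective_mos)
print(f"PCC = {pcc:.3f}")
```

A value close to 1 means the objective scores rank and space the systems almost exactly as listeners do, which is what makes such a metric usable as a fast stand-in for subjective tests during training.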
