Lesson 4: Practical Deep Learning for Coders 2022

86,336 views

Jeremy Howard

00:00:00 - Using Huggingface
00:03:24 - Finetuning pretrained model
00:05:14 - ULMFiT
00:09:15 - Transformer
00:10:52 - Zeiler & Fergus
00:14:47 - US Patent Phrase to Phrase Matching Kaggle competition
00:16:10 - NLP Classification
00:20:56 - Kaggle configs, inserting Python in bash, reading the competition website
00:24:51 - Pandas, numpy, matplotlib, & pytorch
00:29:26 - Tokenization
00:33:20 - Huggingface model hub
00:36:40 - Examples of tokenized sentences
00:38:47 - Numericalization
00:41:13 - Question: rationale behind how input data was formatted
00:43:20 - ULMFiT handles large documents easily
00:45:55 - Overfitting & underfitting
00:50:45 - Splitting the dataset
00:52:31 - Creating a good validation set
00:57:13 - Test set
00:59:00 - Metric vs loss
01:01:27 - The problem with metrics
01:04:10 - Pearson correlation
01:10:27 - Correlation is sensitive to outliers
01:14:00 - Training a model
01:19:20 - Question: when is it ok to remove outliers?
01:22:10 - Predictions
01:25:30 - Opportunities for research and startups
01:26:16 - Misusing NLP
01:33:00 - Question: isn’t the target categorical in this case?
Transcript thanks to wyquek, jmp, bencoman, fmussari, mike.moloch, amr.malik, kurianbenoy, gagan, and Raymond Wu on forums.fast.ai.
Timestamps thanks to RogerS49 and Wyquek on forums.fast.ai.
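
For readers following along in code, here is a minimal sketch of the tokenization (00:29:26) and numericalization (00:38:47) steps. It assumes the microsoft/deberta-v3-small checkpoint discussed in the lesson; any Hugging Face checkpoint works the same way, and the example sentence is made up.

    from transformers import AutoTokenizer

    # Fetch the tokenizer that matches the pretrained model.
    tokz = AutoTokenizer.from_pretrained("microsoft/deberta-v3-small")

    sentence = "A platform for independent inventors."
    tokens = tokz.tokenize(sentence)              # tokenization: split into subword pieces
    ids = tokz.convert_tokens_to_ids(tokens)      # numericalization: token -> vocab index

    print(tokens)   # subword strings, e.g. ['▁A', '▁platform', ...]
    print(ids)      # the integer IDs the model actually consumes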

Comments: 35
@zzznavarrete · 11 months ago
As a 26-year-old data scientist with three years of industry experience, I closely follow your course, Jeremy. I want to express my gratitude for your excellent teachings and the enthusiasm you bring to every class. Thank you very much.
@jharkins · 1 year ago
Going through the course - Jeremy, your teaching style is amazing. I *really* appreciate what you're doing. 41:42 was my mind-blown moment this class. It's arbitrary - you just have to do something consistent for it to learn from. So amazing that we're at this point in the deep learning curve already. Thanks!
@DJcatamount · 8 months ago
This is a legendary video! Within a year of this upload, human society is being transformed by these general-purpose "Transformers" 🚀
@ILikeAI1 · 7 months ago
It’s great to see that the Hugging Face model hub is nearly 10X the size it was when this was recorded
@vikramsandu6054 · 1 year ago
Amazing video. The thing I like most is the small hacks and tricks Jeremy provides in between the topics.
@TheAero · 10 months ago
55k people watched, 5k finished this lesson, 1k will apply what they learned, 100 will excel in their knowledge. Work hard and you will be a legend. Small steps, huge goals!
@mizoru_ · 1 year ago
Great to have an introduction to transformers from Jeremy!
@tumadrep00 · 1 year ago
As always, an excellent video Jeremy
@mukhtarbimurat5106 · 1 year ago
Great lesson, thanks!
@Xxpat · 1 year ago
What an awesome lecture!
@DevashishJose · 1 year ago
Thank you so much for this amazing lecture, Jeremy. As always, this was really insightful and a good learning experience.
@analyticsroot1898 · 1 year ago
great tutorial
@erichlehmann3667 · 1 year ago
Favorite lesson so far 👌
@adamkonopka4942 · 1 year ago
Awesome lesson! Thanks for this series!
@tanmeyrawal644 · 1 year ago
At 1:33:31, num_labels depends on the number of categories. So if we're treating this as a classification problem, it should have been num_labels=5.
@harshathammegowda1615 · 5 months ago
The "labels" in num_labels mean something different here. Think of a label as a predicted output (a column), not as the values that output can take. Here the model predicts just the score, which is one column, hence num_labels=1, even though the score can take five values: 0, 0.25, 0.5, 0.75, 1.0. If the model were to also predict something like patent acceptance/rejection, then num_labels=2 (the score plus that).
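
To make the distinction concrete, here is a minimal sketch of how num_labels changes the model head (the checkpoint name is just an example, not necessarily the one from the lesson):

    from transformers import AutoModelForSequenceClassification

    # num_labels=1: a single regression-style output, i.e. one score column.
    model_reg = AutoModelForSequenceClassification.from_pretrained(
        "microsoft/deberta-v3-small", num_labels=1)

    # num_labels=5: a 5-way classification head, one logit per score bucket.
    model_cls = AutoModelForSequenceClassification.from_pretrained(
        "microsoft/deberta-v3-small", num_labels=5)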
@stevesan · 1 year ago
Great video. A question about validation and testing: at 58:44 he says you have to "go back to square one" if your test set result is bad. What does that mean in practice? Does it mean you have to delete your data, in the rm * sense? Or just re-shuffle train vs. test vs. validation (which may not be possible, as in the time series case... in which case, get new data?) And even if your test result WAS good, there is still a chance THAT was by coincidence, right?
@NegatioNZor · 1 year ago
If you got a decent result on the validation set, and then end up with a bad result on your held-out test set, your solution (probably) has some flaw. "Going back to square one" in this sense just means that you have to re-evaluate your solution. Often the best way of doing that is testing the most basic model, with the most basic data you have, just to see that it gives sensible answers in that case. It has nothing to do with deleting the data or re-shuffling train/test :)
@toromanow · 1 year ago
Where can I find the notebook for this lesson? Chapter 4 of the book is about something different (an image classifier).
@chrgeo8342 · 6 months ago
Check chapter 10 of the book
@ucmaster2210 · 9 months ago
I have a list of keywords, thousands of rows long. Which deep learning model should I use to classify them into topics? The topics are not known in advance. Thanks!
@blenderpanzi · 7 months ago
Does it matter in what order the vocabulary is numbered? Say the vocabulary is just the English alphabet: does it matter for how the neural network works whether A B C is numbered 1 2 3 or e.g. 26 3 15? Given all the continuous mathematical operations in the network (floating-point math), does it matter which tokens are numerically next to each other and which are further apart?
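
One way to see why the numbering is arbitrary: the integer IDs never enter the floating-point math directly. They are row indices into a learned embedding table, so renumbering the vocabulary just permutes the rows of a table whose contents are learned anyway. A minimal PyTorch sketch:

    import torch
    import torch.nn as nn

    vocab_size, emb_dim = 26, 4
    emb = nn.Embedding(vocab_size, emb_dim)   # one learned vector per token ID

    ids = torch.tensor([0, 1, 2])             # e.g. A, B, C under one numbering
    vectors = emb(ids)                        # the network only ever sees these vectors

    # Swapping which integer maps to which token just reorders the table's rows;
    # since the rows are learned from scratch, distances between the raw integer
    # IDs carry no meaning.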
@juancruzalric6605 · 2 months ago
If you want to predict two classes (1 and 0) from a dataset, how can you add the F1_score metric?
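
A sketch of one way to do it, assuming the transformers Trainer API used in the lesson (the function and dictionary key names are illustrative):

    import numpy as np
    from sklearn.metrics import f1_score

    def compute_metrics(eval_pred):
        logits, labels = eval_pred
        preds = np.argmax(logits, axis=-1)        # pick the higher-scoring class
        return {"f1": f1_score(labels, preds)}

    # Then pass it in: Trainer(..., compute_metrics=compute_metrics)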
@SarathSp06 · 1 year ago
Great content. I was wondering why the AutoTokenizer has to be initialized with a pretrained model if all it does is tokenization. How would it differ when different models are used?
@Slava705 · 1 year ago
A pretrained model has a vocabulary, and a tokenizer is based on that vocabulary. Also, I guess each model's tokenizer produces a slightly different data structure, which is why there is no single universal tokenizer.
@schrodingersspoon1705 · 1 year ago
Just my attempt to answer the question; I could be wrong. I believe it is because each pretrained model has its own method of tokenization that it accepts, so each model has its own tokenizer. Given the model you are going to use, AutoTokenizer just fetches the corresponding tokenizer that works with that model.
@tharunnarannagari2148 · 1 year ago
I guess different tokenizers generate different tokens for the same sentence, and a pretrained model expects the incoming input tokens to match its embedding layer weights for best fine-tuning, since the model weights are frozen.
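
To see the difference concretely, a small sketch that tokenizes the same sentence with two different checkpoints (the checkpoint names are common examples, not taken from the lesson):

    from transformers import AutoTokenizer

    sentence = "Ultrasound guidance for arterial access."
    for name in ["bert-base-uncased", "gpt2"]:
        tokz = AutoTokenizer.from_pretrained(name)
        print(name, tokz.tokenize(sentence))

    # The same sentence splits into different subwords (with different IDs) per
    # model, which is why a model must be paired with the tokenizer it was
    # pretrained with.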
@DearGeorge3 · 1 year ago
I ran into a lot of weird warnings from the transformers library while executing the script. It's absolutely unclear which of them can be ignored and which are critical. I could report them, but where?
@eftilija · 1 year ago
You can always post questions or issues on the fastai forums.
@davidchen6087 · 6 months ago
A bit confused about the predicted value being a continuous number between 0 and 1. I thought we were training a classifier that would categorize the inputs as identical, similar, or different.
@florianvonstosch · 3 months ago
From the Kaggle competition page:

Score meanings
The scores are in the 0-1 range with increments of 0.25, with the following meanings:
1.0 - Very close match. This is typically an exact match except possibly for differences in conjugation, quantity (e.g. singular vs. plural), and addition or removal of stopwords (e.g. "the", "and", "or").
0.75 - Close synonym, e.g. "mobile phone" vs. "cellphone". This also includes abbreviations, e.g. "TCP" -> "transmission control protocol".
0.5 - Synonyms which don't have the same meaning (same function, same properties). This includes broad-narrow (hyponym) and narrow-broad (hypernym) matches.
0.25 - Somewhat related, e.g. the two phrases are in the same high-level domain but are not synonyms. This also includes antonyms.
0.0 - Unrelated.
@florianvonstosch · 3 months ago
Just noticed the "with increments of 0.25" part. I guess this makes the problem kind of a hybrid between classification and regression.
@xychenmsn · 1 year ago
Are you teaching in a college?
@howardjeremyp · 1 year ago
Yes, at the University of Queensland