Active Learning. The Secret of Training Models Without Labels.

8,822 views

Underfitted

1 day ago

A large part of the success of supervised machine learning systems is the existence of large quantities of labeled data. Unfortunately, in many cases, creating these labels is difficult, expensive, and time-consuming.
An obvious solution is to use machine learning to aid in the creation of the labels, but this presents a chicken and egg problem: how do we build a model to create labels before labeling our data to train that model?
Active Learning is one solution: a semi-supervised learning technique for building better-performing machine learning models using fewer training labels.
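As a rough illustration of the idea (this is a sketch using scikit-learn, not code from the video, and `label_fn` stands in for whatever human labeling step you use), a typical uncertainty-sampling loop trains a model, scores the unlabeled pool, and asks a human to label only the examples the model is least confident about:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def active_learning_loop(X_labeled, y_labeled, X_pool, label_fn, rounds=5, batch=10):
    """Uncertainty sampling: train on the labeled set, score the unlabeled
    pool, and send the least-confident examples to a (human) labeler."""
    model = LogisticRegression(max_iter=1000)
    for _ in range(rounds):
        model.fit(X_labeled, y_labeled)
        probs = model.predict_proba(X_pool)
        confidence = probs.max(axis=1)            # probability of the top class
        query = np.argsort(confidence)[:batch]    # least-confident examples
        new_labels = np.array([label_fn(x) for x in X_pool[query]])
        X_labeled = np.vstack([X_labeled, X_pool[query]])
        y_labeled = np.concatenate([y_labeled, new_labels])
        X_pool = np.delete(X_pool, query, axis=0)
    return model
```

The point of querying the least-confident examples is that each manual label is spent where the model learns the most, which is how active learning gets away with far fewer labels overall.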
Paper mentioned in the video:
Active Learning Literature Survey. burrsettles.com/pub/settles.a...
🔔 Subscribe for more stories: www.youtube.com/@underfitted?...
📚 My 3 favorite Machine Learning books:
• Deep Learning With Python, Second Edition - amzn.to/3xA3bVI
• Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow - amzn.to/3BOX3LP
• Machine Learning with PyTorch and Scikit-Learn - amzn.to/3f7dAC8
Twitter: twitter.com/svpino
Disclaimer: Some of the links included in this description are affiliate links where I'll earn a small commission if you purchase something. There's no cost to you.

Comments: 52
@tecbrain 26 days ago
Fantastic video. The truth is that now I'm going to work through the code to understand it. Thanks for the work you do to help us.
@hasanx8317 23 days ago
Duplicated records in the data have a significant meaning: a record that appears repeatedly in the past will probably appear repeatedly in the future. It's a VIP record, and knowing how to handle it well means you've succeeded at a large share of what you're supposed to do. So having duplicate data should somehow eventually make the model very accurate at predicting its related label, more accurate than for unique records.
@thecouchman2112 Year ago
Really helpful video, thanks. One small thing though, the sound effects on the title screens were a bit loud imo :)
@underfitted Year ago
Noted! Thanks for the feedback!
@underfitted Year ago
GOOD ONE!
@emeebritto 2 months ago
yaa... >.
@miguelduqueb7065 Year ago
Nice video! You can also use a similar approach to compare models and keep the one that performs best. Here is how: a few years ago I was collecting data in a chemistry lab in order to fit some models. Each experiment took a day to complete, so I started with a simple factorial design, fitted all the models to the initial data set, and then predicted the point of maximum divergence between the models. That point was used as the next experiment, and the models were refitted thereafter. This procedure was repeated several times. Computing uncertainty in your predictions is similar, but with only one model.
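The procedure this commenter describes is essentially query-by-committee: fit several models and query the point where they disagree most. A minimal sketch under that reading (scikit-learn, with an invented function name and an arbitrary choice of committee members):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

def most_divergent_point(models, X_labeled, y_labeled, X_candidates):
    """Query-by-committee: fit each model, then pick the candidate point
    where the committee's predicted probabilities disagree the most."""
    all_probs = []
    for model in models:
        model.fit(X_labeled, y_labeled)
        all_probs.append(model.predict_proba(X_candidates))
    all_probs = np.stack(all_probs)                   # (n_models, n_points, n_classes)
    disagreement = all_probs.std(axis=0).sum(axis=1)  # spread across the committee
    return int(np.argmax(disagreement))               # index of the next "experiment"

committee = [
    LogisticRegression(max_iter=1000),
    DecisionTreeClassifier(random_state=0),
    RandomForestClassifier(n_estimators=25, random_state=0),
]
```

With a single model, as the commenter notes, you would replace the committee's disagreement with the model's own predictive uncertainty.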
@underfitted Year ago
Thanks for sharing!
@sahanakaweraniyagoda9866 Year ago
This is lit 🔥. Love this practical approach to Machine learning. Keep doing the amazing work 👏👏
@underfitted Year ago
Thanks! Much more coming!
@fikriansyahadzaka6647 Year ago
Nice video! Could you also explain semi-supervised learning? There aren't many videos that clearly explain the progress made so far in semi-supervised learning, even though the topic has become more popular nowadays.
@Param3021 Year ago
Another nice video! Learned a new concept - *Active Learning*
@underfitted Year ago
Glad to hear that!
@maheshBasavaraju Year ago
Loved the idea of smart labeling. Very cool.
@knutjagersberg381 Year ago
Love it, world class content! Also agree. A thought: Why not start with few shot or zero shot learning before active learning?
@underfitted Year ago
If you have a model capable of zero-shot, absolutely!
@jubakala Year ago
Thanks! This was exactly what I needed at the moment! (:
@vidyachandran944 Year ago
Great content! Thank you :)
@JoaquinRevello Year ago
Excellent Video. This channel is going to be huge soon
@jayantghadge4027 Year ago
This method to me seems a little bit like boosting. I might be wrong though, but boosting is what came to my mind after watching the video.
@jainamshroff4998 Year ago
A very good video!
@mahendrakumargohil6384 Year ago
Excellent Information 👍👍
@underfitted Year ago
Glad it was helpful!
@brunoras Year ago
Super insightful, I'm using these ideas right now!
@123arskas Year ago
If you've made it public (for smaller-scale projects), please give the link to its repo. Thank you!
@underfitted Year ago
Wonderful!
@kemalariboga Year ago
Great content!
@underfitted Year ago
Thanks!
@roshanaryal7786 Year ago
Hi, Santiago! Love your content! Could you please make a video on how to start machine learning as a beginner with some programming experience? I've been doing web dev but want to transition into ML. I will appreciate your response 😊
@underfitted Year ago
It's coming soon!
@erdi749 Year ago
I love your videos, nice and extremely informative! Just a quick comment: is it possible not to have those "bommmm!" sounds? They make it impossible to listen to your videos in a car or with headphones. Thank you!
@underfitted Year ago
Thanks, Erdi! Yes, if you watch my last few videos, I've improved the audio, including removing that particular sound 😏
@lorenzoleongutierrez7927 Year ago
Great explanation, thanks! Do you have any examples of labeling services that provide this approach? Greetings!
@fobaogunkeye3551 Year ago
Lovely video, Santiago! Quick question: how do we label the low-confidence data that the model initially had a hard time predicting, since we also didn't know what the label was in the first place? How do we know which label/class to use for that low-confidence data when we retrain?
@underfitted Year ago
We will start by labeling some of the data manually. The goal is to seed the process to start generating automatic labels.
@dimasveliz6745 Year ago
dynamic! Liked it more!
@underfitted Year ago
Cool, thanks!
@juan.o.p. Year ago
Very interesting
@underfitted Year ago
Glad you think so!
@123arskas Year ago
I have some queries. There's no proper practical implementation of it, is there? The paper talks about the proposed methods along with practical issues. Since your videos are straight to the point and you try to keep things simple, I just want to know whether you've found a practical implementation of it in Python, etc. If so, do give a link to it in the description. Thank you.
@underfitted Year ago
Yeah, I've personally used Active Learning multiple times. It's a very practical way to decide how to label a dataset.
@Param3021 Year ago
1:03 - We need to Build a Model to Label the data we need, to Build a Model 🤯
@underfitted Year ago
Yup :)
@sodipepaul9370 Year ago
Wow.
@underfitted Year ago
Wow indeed
@CarlosBCU Year ago
Hi, maybe a silly question but how you calculate the confidence after step 2?
@underfitted Year ago
Assuming you are using a classification model, for example, that will be the confidence (probability) returned by the model. More specifically, the softmax value corresponding to the highest predicted class.
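To make that concrete, here is a small illustrative sketch (not from the video) of turning a classifier's logits into softmax probabilities and taking the top-class probability as the confidence score:

```python
import numpy as np

def softmax(logits):
    z = logits - logits.max(axis=1, keepdims=True)  # shift for numerical stability
    exp_z = np.exp(z)
    return exp_z / exp_z.sum(axis=1, keepdims=True)

logits = np.array([[2.0, 0.1, 0.1],    # model is fairly sure about class 0
                   [0.6, 0.5, 0.4]])   # model is unsure
probs = softmax(logits)
confidence = probs.max(axis=1)  # probability of the predicted class
# the second example has the lower confidence, so it is the better labeling candidate
```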
@CarlosBCU Year ago
@@underfitted many thanks for your answer! What if we are running a regression?
@modakad 29 days ago
@@underfitted Answering CarlosBCU's question on confidence: I don't think your answer sufficiently clarifies the approach. Let's take an example. Suppose we have two classes, class 0 and class 1. For observation A the softmax vector is [0.92, 0.08], and for observation B it's [0.60, 0.40] (remember, softmax gives a vector of values that all add up to 1). Which observation should we pick? Not observation A. Observation B is where the model has low confidence: the model separates its predictions by a margin of only 0.2 (abs(0.60 - 0.40)), while for observation A the separation is much higher.
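The margin criterion this comment describes (difference between the two highest class probabilities; query the smallest margin) can be sketched like this, using the comment's own example numbers:

```python
import numpy as np

# Softmax outputs for the two observations in the comment above
probs = np.array([[0.92, 0.08],    # observation A: confident
                  [0.60, 0.40]])   # observation B: uncertain

top_two = np.sort(probs, axis=1)[:, ::-1][:, :2]  # two highest probabilities per row
margin = top_two[:, 0] - top_two[:, 1]            # top-1 minus top-2
query = int(np.argmin(margin))                    # smallest margin = most uncertain
# query == 1, i.e. observation B is the one to send for labeling
```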
@modakad 29 days ago
@@CarlosBCU I think the answer would be to choose the observations with the higher error (RMSE, MSE, etc.).
@modakad 29 days ago
If you are using a sigmoid loss function, it would be trickier.