Cross Validation Explained!

3,962 views

CodeEmporium

1 day ago

Comments: 25
@happyduck70 2 years ago
Keep up the good work! Really enjoying your lessons as a machine learning novice from the Netherlands. You are in a small niche of YouTube, but good explainers aren't common! Thanks so much!
@CodeEmporium 2 years ago
Thank you for the kind words!! I try my best :)
@vastabyss6496 1 year ago
Quick question: how does this prevent overfitting? Since all samples/datapoints are used for both training and testing, why couldn't the model "remember" what it learned when it previously overfitted to folds 1-4 (I'm assuming the data is split into 5 folds) when it is later tested on fold 4? In case I wasn't clear, let me lay out my understanding of k-fold cross validation based on what I learned in this video:

- First, the model is trained on folds 1-4 and tested on fold 5. Let's say the model overfits to folds 1-4 and so tests horribly on fold 5.
- After that, fold 4 becomes the new testing data and fold 5 is freed up for training. However, since the model was previously trained with fold 4 as a training block, it "remembers" the answers to fold 4 (its parameters are still tuned to that data), thus scoring incredibly well on the testing data despite being overfitted.

To reiterate my question, how can this be a valid way to evaluate a model during and after training?
@mohamedamrouane9730 1 year ago
I don't think you use the same model for the next training iteration. In each iteration you train a new model on the k-1 folds and test it on the remaining one. My question is what comes after these steps: say I've trained k models and calculated the average and variance of each model's evaluation scores. Then what? How does that help me generalize to a final model for the whole dataset?
@vastabyss6496 1 year ago
@mohamedamrouane9730 Yeah, I think you're right. After watching a lot of videos and asking ChatGPT, I was able to implement it from scratch, pretty much doing exactly what you said: training a new model on each fold, averaging the test loss across all the models, and then training a new model on the entire dataset using the same architecture as the others. I found that the results were no better than what I would've gotten using traditional holdout validation (train-test split), and it took significantly longer since I had to train k models. Similar disappointments came from implementing things like Dropout, which actually made the model worse. Perhaps it's the dataset I'm testing this on? (I'm using the Titanic dataset as a testing ground.)
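The workflow described in this thread (a fresh model trained on each split, scores averaged across folds, then a final model refit on all the data with the same hyperparameters) can be sketched with scikit-learn. This is a minimal illustration on a synthetic dataset, not the commenter's actual Titanic code:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold

# Synthetic stand-in data, just for illustration
X, y = make_classification(n_samples=500, random_state=0)

kf = KFold(n_splits=5, shuffle=True, random_state=0)
scores = []
for train_idx, test_idx in kf.split(X):
    model = LogisticRegression(max_iter=1000)  # a NEW model each fold:
    model.fit(X[train_idx], y[train_idx])      # nothing is "remembered"
    scores.append(model.score(X[test_idx], y[test_idx]))

print(f"mean accuracy: {np.mean(scores):.3f} +/- {np.std(scores):.3f}")

# Cross validation only *estimates* generalization performance;
# the final model is then refit on the full dataset.
final_model = LogisticRegression(max_iter=1000).fit(X, y)
```

Because each fold trains a model from scratch, the held-out fold is genuinely unseen by that model, which answers the "remembering" concern above.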
@sivakumarprasadchebiyyam9444 1 year ago
Hi, it's a very good video. Could you please let me know if cross validation is done on the training data or the total data?
@mohamedamrouane9730 1 year ago
My question is what comes after these steps: say I've trained k models and calculated the average and variance of each model's evaluation scores. Then what? How does that help me generalize to a final model for the whole dataset?
@onlinenk13 5 months ago
Are there any rules on how to define the amount of data? Like, if I want to fine-tune BERT on a specific task and I have 10,000 sentences, does that qualify as a "large amount of data"?
@rafibasha4145 2 years ago
Please start an MLOps series
@CodeEmporium 2 years ago
Sure thing
@rafibasha4145 2 years ago
@CodeEmporium Thank you
@aloo_explains 2 years ago
Which device do you use for making videos?
@CodeEmporium 2 years ago
I record the screen with Camtasia Studio
@aloo_explains 2 years ago
And what about the pen and tablet?
@CodeEmporium 2 years ago
Wacom One tablet :)
@aloo_explains 2 years ago
Thank you so much for the info. I also want to make content about ML/DL, and I was confused about where to start and what to use.
@CodeEmporium 2 years ago
Happy to help. I also have a Blue Yeti microphone, a ring light for lighting, and I use my MacBook Air's camera for footage (though you may want to get a better camera).
@FoxMonkey-xw5yf 2 years ago
Is holdout the same as walk-forward for time series?
@CodeEmporium 2 years ago
Not quite. Walk-forward is like cross validation, but it ensures the chunk used for testing always comes after the chunks used for training, so there is no data leakage from the future into the past.
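The ordering constraint described in this reply can be seen with scikit-learn's `TimeSeriesSplit`, which implements exactly this expanding walk-forward scheme (a small sketch, with a toy 12-sample series):

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

X = np.arange(12).reshape(-1, 1)  # 12 time-ordered samples

tscv = TimeSeriesSplit(n_splits=3)
for train_idx, test_idx in tscv.split(X):
    # every test index comes strictly after every train index,
    # so the model never "sees the future" during training
    assert train_idx.max() < test_idx.min()
    print("train:", train_idx, "test:", test_idx)
```

Unlike plain k-fold, the test chunk here is never shuffled in front of the training chunks, and each successive split trains on a longer prefix of the series.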
@ajaytaneja111 2 years ago
Thanks Ajay!
@CodeEmporium 2 years ago
Thanks for the consistent support 👍🏽
@balakrishnaprasad8928 2 years ago
Sir, please create a video series on essential math for machine learning
@CodeEmporium 2 years ago
I currently have a playlist for "probability for machine learning". Hope this helps
@ОПривет-ъ2ъ 2 years ago
Great video! Thanks
@CodeEmporium 2 years ago
Thanks so much for the comment!