Keep up the good work! Really enjoying your lessons as a machine learning novice from the Netherlands. You are in a small niche of YouTube, but good explainers aren't common! Thanks so much!
@CodeEmporium 2 years ago
Thank you for the kind words!! I try my best :)
@vastabyss6496 1 year ago
Quick question: how does this prevent overfitting? Since all samples/datapoints are used for both training and testing, why couldn't the model "remember" what it learned when it previously overfitted to folds 1-4 (I'm assuming the data is split into 5 folds) when being tested on fold 4 in the next training cycle? In case I wasn't clear, let me lay out my understanding of K-fold cross-validation based on what I learned in this video:
- First, the model is trained on folds 1-4 and tested on fold 5. Let's say the model overfitted to folds 1-4, so it tested horribly on fold 5.
- After that, fold 4 becomes the new testing data and fold 5 is freed up for training. However, since the model was previously trained with fold 4 as a training block, the model "remembers" the answers to fold 4 because its parameters are still tuned to the data within fold 4, thus scoring incredibly well on the testing data despite being overfitted.
To reiterate my question: how can this be a valid way to evaluate a model during and after training?
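For reference, a minimal sketch of the usual K-fold loop (scikit-learn assumed; the dataset and model are placeholders, not from the video) shows why nothing is "remembered": a brand-new, untrained model is created inside each fold, so parameters never carry over from one fold to the next.

```python
# Minimal K-fold cross-validation sketch (scikit-learn; illustrative only).
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold

X, y = load_iris(return_X_y=True)
kf = KFold(n_splits=5, shuffle=True, random_state=42)

scores = []
for train_idx, test_idx in kf.split(X):
    model = LogisticRegression(max_iter=1000)  # fresh, untrained model each fold
    model.fit(X[train_idx], y[train_idx])      # train on the k-1 folds
    scores.append(model.score(X[test_idx], y[test_idx]))  # test on the held-out fold

print(sum(scores) / len(scores))  # average accuracy across the 5 folds
```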
@mohamedamrouane9730 1 year ago
I don't think you use the same model for the next training iteration. In each iteration you train a new model on the k-1 folds and test it on the remaining one. My question is: what comes after these steps? Let's say I've trained k models and calculated the average and variance of each model's evaluation scores; then what? How does that help me generalize to a final model for the whole dataset?
@vastabyss6496 1 year ago
@mohamedamrouane9730 Yeah, I think you're right. After watching a lot of videos and asking ChatGPT, I was able to implement it from scratch, pretty much doing exactly what you said (training a new model on each fold, averaging the test loss across all the models, and then training a new model on the entire dataset using the same architecture as the other ones). I found that the results were not any better than what I would've gotten using traditional holdout validation (train-test split), and it took significantly longer since I had to train k models. Similar disappointments came from implementing things like Dropout, which actually made the model worse. Perhaps it's the dataset I'm testing this stuff on? (I'm using the Titanic dataset as a testing ground.)
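A sketch of that pattern in scikit-learn (the dataset and classifier here are illustrative assumptions, not from the thread): the k fold models are throwaway, their mean score is the generalization estimate, and the final model is refit with the same configuration on all the data.

```python
# Cross-validate to estimate generalization, then refit on everything.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)

model = RandomForestClassifier(random_state=0)
scores = cross_val_score(model, X, y, cv=5)  # trains k throwaway models
print(scores.mean(), scores.std())           # estimate of out-of-sample performance

final_model = RandomForestClassifier(random_state=0)  # same configuration
final_model.fit(X, y)                                 # trained on the whole dataset
```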
@sivakumarprasadchebiyyam9444 1 year ago
Hi, it's a very good video. Could you please let me know if cross-validation is done on the training data or the total data?
@onlinenk13 5 months ago
Are there any rules on how to determine the amount of data needed? For example, if I want to fine-tune BERT on a specific task and I have 10,000 sentences, does that qualify as "a lot of data"?
@rafibasha4145 2 years ago
Please start an MLOps series!
@CodeEmporium 2 years ago
Sure thing
@rafibasha4145 2 years ago
@CodeEmporium, thank you!
@aloo_explains 2 years ago
Which device do you use for making videos?
@CodeEmporium 2 years ago
I record the screen with Camtasia Studio.
@aloo_explains 2 years ago
And what about the pen and tablet?
@CodeEmporium 2 years ago
Wacom One tablet :)
@aloo_explains 2 years ago
Thank you so much for the info! I also want to make content about ML/DL, but I was confused about where to start and what to use.
@CodeEmporium 2 years ago
Happy to help. I also have a Blue Yeti microphone and a ring light for lighting, and I use my MacBook Air's camera for footage (though you may want to get a better camera).
@FoxMonkey-xw5yf 2 years ago
Is holdout the same as walk-forward for time series?
@CodeEmporium 2 years ago
Not quite. Walk-forward is like cross-validation, but it ensures the chunk used for testing always comes after the chunks used for training, so there is no data leakage.
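As a sketch, scikit-learn's TimeSeriesSplit produces exactly this kind of walk-forward split (the toy array is just for illustration): each test chunk comes strictly after all of its training chunks.

```python
# Walk-forward splits: test indices always follow the train indices.
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

X = np.arange(12).reshape(-1, 1)  # toy time-ordered data
tscv = TimeSeriesSplit(n_splits=3)

for train_idx, test_idx in tscv.split(X):
    print("train:", train_idx, "test:", test_idx)
# Output:
# train: [0 1 2] test: [3 4 5]
# train: [0 1 2 3 4 5] test: [6 7 8]
# train: [0 1 2 3 4 5 6 7 8] test: [9 10 11]
```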
@ajaytaneja111 2 years ago
Thanks Ajay!
@CodeEmporium 2 years ago
Thanks for the consistent support 👍🏽
@balakrishnaprasad8928 2 years ago
Sir, please create a video series on essential math for machine learning.
@CodeEmporium 2 years ago
I currently have a playlist on “Probability for Machine Learning”. Hope this helps!