
The Wrong Batch Size Will Ruin Your Model

  15,721 views

Underfitted · A year ago

How do different batch sizes influence the training process of neural networks using gradient descent?
Colab notebook: colab.research...
🔔 Subscribe for more stories: www.youtube.co...
📚 My 3 favorite Machine Learning books:
• Deep Learning With Python, Second Edition - amzn.to/3xA3bVI
• Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow - amzn.to/3BOX3LP
• Machine Learning with PyTorch and Scikit-Learn - amzn.to/3f7dAC8
Twitter: @svpino
Disclaimer: Some of the links included in this description are affiliate links where I'll earn a small commission if you purchase something. There's no cost to you.

Comments: 31
@ErlendDavidson · A year ago
If you scale the learning rate with the batch size (i.e. lr=(batch_size/32.)*0.01), then stochastic gradient descent looks sort of okay here.
@underfitted · A year ago
Interesting :)
@jasdeepsinghgrover2470 · A year ago
I completely agree... because the number of updates depends on the batch size, and so does the size of each update. So if the learning rate is scaled linearly with the batch size, the model can perform very well even with much smaller batches.
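A minimal sketch of the linear scaling rule these two comments describe, assuming a Keras-style setup; the base values 32 and 0.01 come from the first comment, and the model and data are placeholders:

```python
import tensorflow as tf

BASE_BATCH_SIZE = 32   # reference batch size from the comment above
BASE_LR = 0.01         # learning rate tuned at the reference batch size

def scaled_lr(batch_size: int) -> float:
    """Linear scaling rule: grow the learning rate with the batch size."""
    return (batch_size / BASE_BATCH_SIZE) * BASE_LR

batch_size = 128
model = tf.keras.Sequential([tf.keras.layers.Dense(10)])  # placeholder model
model.compile(
    optimizer=tf.keras.optimizers.SGD(learning_rate=scaled_lr(batch_size)),
    loss="mse",
)
# model.fit(x_train, y_train, batch_size=batch_size)  # training data assumed
```

With this rule, a batch of 128 gets a learning rate of 0.04, so each (less noisy) update takes a proportionally larger step.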
@Metryk · 6 months ago
Hi! Maybe you can help me with this one: if I want to test an already pre-trained image classifier, how do I proceed regarding the number of images used? The test set contains 100k images, and I guess it wouldn't make sense to load them all at once, so how do I proceed? Thanks!
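Not an answer from the author, but the usual approach is to stream the test set in batches instead of loading it whole. A sketch assuming TensorFlow/Keras and a directory of images (the path, image size, and model file are hypothetical):

```python
import tensorflow as tf

# Stream the test images from disk in batches; nothing is loaded all at once.
test_ds = tf.keras.utils.image_dataset_from_directory(
    "path/to/test_images",     # hypothetical folder holding the 100k images
    image_size=(224, 224),     # must match what the model was trained on
    batch_size=32,
    shuffle=False,             # keep evaluation deterministic
)

model = tf.keras.models.load_model("pretrained_classifier.keras")  # hypothetical file
results = model.evaluate(test_ds)  # iterates batch by batch over all 100k images
```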
@ErlendDavidson · A year ago
What do you think of (artificially) adding noise to the learning rate? I feel like it used to be more popular to do that, but I almost never see it these days.
@underfitted · A year ago
Yeah… never seen that honestly. I’ve used schedules to decrease the learning rate over time, but never read about adding noise to it.
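For reference, a decreasing schedule like the one mentioned in this reply might look like this in Keras (the decay numbers are purely illustrative):

```python
import tensorflow as tf

# Exponential decay: start at 0.01 and multiply the learning rate by 0.9
# every 1,000 training steps (all numbers are illustrative).
schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=0.01,
    decay_steps=1_000,
    decay_rate=0.9,
)
optimizer = tf.keras.optimizers.SGD(learning_rate=schedule)
```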
@johnmoustakas8897 · A year ago
Good work, hope your channel gets more attention
@underfitted · A year ago
Thanks, John! It takes time and work but I’ll make it happen.
@Agrover112 · A year ago
Hey, love this video! I was losing touch with the basics!
@underfitted · A year ago
Glad it was helpful!
@OliverHennhoefer · A year ago
Really like the videos. However, I want to warn against the general statement that a batch size of one is not recommended. It really depends on the problem/data. So don't simply dismiss stochastic gradient descent, try it!
@underfitted · A year ago
I think that’s fair. I’ve never used it in any of the problems I’ve worked on, but you are right.
@user-zr5mg2wc2i · A year ago
I didn't see a video as helpful as this one on the entire internet, thank you ♥
@underfitted · A year ago
Glad it was helpful!
@muhammadtalmeez3276 · A year ago
Your videos are amazing. Thank you so much for this great knowledge and beautiful videos.
@underfitted · A year ago
Glad you like them!
@ziquaftynny9285 · A year ago
I love your presentation style! Very energetic :)
@underfitted · A year ago
Thanks
@lakeguy65616 · A year ago
So, what is the optimal batch size?
@underfitted · A year ago
It depends. Start with 32 and experiment from there.
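One way to run that experiment, sketched in Keras; the dummy data, placeholder architecture, and candidate sizes are all assumptions, not anything from the video:

```python
import numpy as np
import tensorflow as tf

# Dummy regression data, just to make the sweep runnable end to end.
x_train, y_train = np.random.rand(1000, 20), np.random.rand(1000, 1)
x_val, y_val = np.random.rand(200, 20), np.random.rand(200, 1)

def build_model():
    # Placeholder architecture; swap in your own model.
    return tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(1),
    ])

results = {}
for batch_size in [16, 32, 64, 128]:
    model = build_model()
    model.compile(optimizer="adam", loss="mse")
    history = model.fit(
        x_train, y_train,
        validation_data=(x_val, y_val),
        batch_size=batch_size,
        epochs=10,
        verbose=0,
    )
    results[batch_size] = min(history.history["val_loss"])

print(results)  # pick the batch size with the lowest validation loss
```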
@lakeguy65616 · A year ago
@underfitted Does the amount of main memory (RAM) or GPU RAM make a difference? (Great videos!)
@underfitted · A year ago
It does! Your batch has to fit in memory, or it won't work. When you are working with images, for example, you'll quickly find that your batch size can't be too large if you want to fit it in the GPU's memory.
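A back-of-the-envelope check of that constraint: the input tensor alone for a batch of images already takes a noticeable chunk of GPU memory, and the activations of each layer add much more on top. Assuming 224x224 RGB images stored as float32:

```python
# Memory for one batch of inputs alone (activations add much more on top).
batch_size = 256
height, width, channels = 224, 224, 3   # typical ImageNet-sized input
bytes_per_float32 = 4

input_bytes = batch_size * height * width * channels * bytes_per_float32
print(f"{input_bytes / 1024**2:.1f} MiB")  # ~147 MiB for batch_size=256
```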
@edmundfreeman7203 · A year ago
This is the kind of thing I hate about deep learning: a single parameter in the optimization method can completely change the results. Batches should be small, but not too small. How small? That's left to heuristics, and it will change across data sets.
@Levy957 · A year ago
Amazing!! Do you know why the batch size is always 32, 64, 128?
@underfitted · A year ago
I read somewhere about the ability to fit batches in a GPU... can't remember where exactly. That being said, I've seen experiments that show that it really doesn't matter much (if at all.)
@MrAleksander59 · A year ago
It's better for memory usage. GPUs, CPUs, hard drives, SSDs, and other hardware built on binary logic use memory blocks with power-of-2 sizes: 2^5 = 32, 2^6 = 64, 2^7 = 128, etc. You always want maximum usage of memory. For example, if you have an array of floats, each float takes 32 bits, so the total size is at least divisible by 32.
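A tiny illustration of that point, assuming NumPy: with 32-bit floats, power-of-two element counts produce power-of-two buffer sizes.

```python
import numpy as np

for n in [32, 64, 128]:
    buf = np.zeros(n, dtype=np.float32)         # each float32 is 32 bits (4 bytes)
    print(n, "floats ->", buf.nbytes, "bytes")  # 128, 256, 512: powers of two
```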
@axelanderson2030 · A year ago
If you generate a dummy dataset and set a static learning rate, then smaller batch sizes work better? wtf?
@akshay0072 · 3 months ago
Good content. Try improving your way of teaching, though. Learning should happen in a relaxed tone.
@underfitted · 3 months ago
Thanks! This was an old video. I’ve tried to improve in the latest few.
@michaelsprinzl9045 · 4 months ago
A new cat video. Cute.
@sarahpeterson2702 · A year ago
The question is: if you use batches and reach the global minimum, is your model functionally equivalent to one that didn't batch? Are the weights identical? No, they aren't. If your model is generative, you don't have equivalence between the batched and non-batched versions.