The Most Important Curve in Data Science

Views: 7,578

ritvikmath

A day ago

Comments: 36

@minchulkim87 a year ago

My favourite curve is the learning curve ;) - which you are helping with. So, thank you.

@ritvikmath a year ago

You're very welcome!

@fintech1378 a year ago

damn good witty comment

@pectenmaximus231 a year ago

I recently tried to explain something of this sort to colleagues as a fishing net trying to catch some kind of medium-sized fish. The size of the gaps in the net needed to guarantee you never miss a single fish you were actually trying to catch happens to be small enough that plenty of other marine life gets caught too. This lowers the quality of the catch and introduces the cost of having to pick through it when processing, potentially even ruining some batches when the wrong thing gets scooped up. On the other hand, the gap size needed to guarantee you never catch anything besides what you want is large enough that you will also occasionally lose out on the fish you set out for. In my real work example, we can afford to have a bit of junk mixed in that gets picked out, but we can’t afford to miss anything, so I went with the smallest net-gap size I could that still never missed the legitimate targets.

@vps071 a year ago

Your videos are 0% clickbait & 100% info! :)

@jsebdev1539 a year ago

I freaking love your videos. I have breakfast, lunch and fall asleep watching them

@Set_Get a year ago

EVERY TIME you choose a useful topic to talk about. thanks.

@ritvikmath a year ago

Thanks 🙏

A year ago

Excellent video, thanks!

@IbrahimSobh a year ago

I love these intuitive explanations! thank you

@ritvikmath a year ago

Glad you like them!

@i.dragons a year ago

Just WOW!

@cheesecake202020 a year ago

Great video and great thumbnail!

@Ucefmjb a year ago

The explanation is very good, thank you

@ritvikmath a year ago

You are welcome!

@youriwatson a year ago

Very nice explanation to see as an econ student!

@ritvikmath a year ago

Awesome!

@barnabyinteractive a year ago

awesome

@ritvikmath a year ago

👍🏼 thanks!

@MrMoore0312 a year ago

Absolutely love the video! Would be interested to see a code with me on the topic. How does one go about exploring the precision recall frontier? Is it just hyperparameter tuning via grid search, or a more deliberate method I'm not aware of? I have a neural net trained with decent accuracy for what I want, but it deals with stocks so I'd much rather have no signal than a false signal. Not necessarily asking for neural net, could be logistic regression or random forest, just more clarity on the question would be wonderful! Thanks for all you do man!!

@ritvikmath a year ago

“How does one go about exploring the precision recall frontier?” is an excellent question. If the question is more about how we construct the frontier, that is usually done in binary classification problems by varying the threshold for assigning an example to the positive class. Low thresholds lead to strong recall but poor precision, and vice versa for high thresholds.
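A minimal Python sketch of the threshold sweep described in this reply. The scores and labels are made-up toy data, and the function names `precision_recall_at` and `pr_frontier` are illustrative assumptions, not anything shown in the video:

```python
def precision_recall_at(scores, labels, threshold):
    """Precision and recall when every score >= threshold is marked positive."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    # Assumed convention: with no positive predictions, precision is 1.0.
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def pr_frontier(scores, labels):
    """One (threshold, precision, recall) point per distinct score, low to high."""
    return [(t, *precision_recall_at(scores, labels, t))
            for t in sorted(set(scores))]


# Hypothetical model outputs: predicted probability and true class (1 = positive).
scores = [0.1, 0.2, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 1, 0, 1, 0, 1, 1]

frontier = pr_frontier(scores, labels)
for t, p, r in frontier:
    print(f"threshold={t:.2f}  precision={p:.2f}  recall={r:.2f}")
```

On this toy data the lowest threshold gives recall 1.0 with precision 0.5, and the highest gives precision 1.0 with recall 0.25, which is the trade-off the reply describes.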

@MrMoore0312 a year ago

@ritvikmath that was my question and your answer makes good sense, thanks! I'll iterate over different classification thresholds to establish the curve and then pick the level that works best for me :)

@abhigyandatta2008 a year ago

Hi Jesse, my understanding is that while searching for the best model, you need to tune hyperparameters for the area under the PRC (or, alternatively, AUC, i.e. the area under the ROC curve), which is independent of the choice of threshold probability. However, given a model with a fixed AUC or area under the PRC, you need to find the threshold that suits your problem description. Hope that makes sense.
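The two-step workflow in this comment can be sketched in plain Python. This is only an illustration under stated assumptions: two hypothetical models scored on the same toy labels, no tied scores, average precision used as one common summary of the area under the precision-recall curve, and an arbitrary precision floor of 0.75. All names here are invented for the example:

```python
def average_precision(scores, labels):
    """Threshold-free summary: mean of the precision values at each true positive,
    scanning examples from highest to lowest score (assumes no tied scores)."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    total_pos = sum(labels)
    tp, ap = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            tp += 1
            ap += tp / rank  # precision at the cut-off that just includes this positive
    return ap / total_pos


def pick_threshold(scores, labels, min_precision):
    """Lowest threshold whose precision meets the floor. Recall never increases
    as the threshold rises, so this keeps as much recall as the floor allows."""
    for t in sorted(set(scores)):
        predicted = [y for s, y in zip(scores, labels) if s >= t]
        if sum(predicted) / len(predicted) >= min_precision:
            return t
    return None  # no threshold reaches the requested precision


# Hypothetical toy data: same labels, two candidate models.
labels = [0, 0, 1, 0, 1, 0, 1, 1]
model_a = [0.1, 0.2, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9]
model_b = [0.3, 0.6, 0.2, 0.5, 0.4, 0.8, 0.7, 0.9]

# Step 1: compare models without committing to a threshold.
ap_a = average_precision(model_a, labels)
ap_b = average_precision(model_b, labels)
best = model_a if ap_a >= ap_b else model_b

# Step 2: only then choose the operating threshold for the chosen model.
threshold = pick_threshold(best, labels, min_precision=0.75)
print(f"AP(a)={ap_a:.3f}  AP(b)={ap_b:.3f}  chosen threshold={threshold}")
```

For a use case like the stock signal above, where a false signal costs more than a missed one, raising `min_precision` moves the chosen threshold up and gives away recall.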

@MrMoore0312 a year ago

@abhigyandatta2008 for sure, thanks! Make sure I've got the best model going in, then and only then test for an appropriate threshold. Makes sense to me!

@venkataramana6975 a year ago

Good work ❤️

@ritvikmath a year ago

Thanks 🙏

@grogu808 a year ago

Could you make a video on useful resources like blogs for inspiration? Also a Kaggle series would be 😎 You are the best teacher of advanced statistics topics on YT ♥️

@ritvikmath a year ago

Great suggestions! And thanks!

@Set_Get a year ago

It was good. Thanks.

@ritvikmath a year ago

No problem!

@jagatchaitanyaprabhala8668 a year ago

I think marketing cost and market response is a better example for the law of diminishing marginal returns. This carrot and apple thing doesn't feel too intuitive, does it? Also, the precision recall curve may not really be a good example of the law of diminishing marginal returns either. The precision recall curve's shape will be a function of class balance as well, I think.
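The class-balance point can be checked with a short Python sketch. It assumes a hypothetical classifier whose true positive rate (recall) and false positive rate at some threshold stay fixed while the share of positives changes; the function name and the rates 0.8 and 0.1 are made up for illustration:

```python
def precision_from_rates(tpr, fpr, prevalence):
    """Precision implied by a true positive rate, a false positive rate,
    and the fraction of examples that are truly positive."""
    true_pos = tpr * prevalence          # share of all examples that are true positives
    false_pos = fpr * (1 - prevalence)   # share of all examples that are false positives
    return true_pos / (true_pos + false_pos)


# Same classifier behaviour (recall 0.8, false positive rate 0.1) at three class balances.
for prevalence in (0.5, 0.1, 0.01):
    p = precision_from_rates(tpr=0.8, fpr=0.1, prevalence=prevalence)
    print(f"prevalence={prevalence:<5} recall=0.80 precision={p:.3f}")
```

At the same recall, precision falls as positives become rarer, so the whole precision-recall curve shifts with class balance even when the model itself is unchanged.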

@OddBarasch a year ago

Intro to Microeconomics

@rahulprasad2318 a year ago

Isn't the bell curve more important?

@ritvikmath a year ago

Totally a valid point. I think “most important curve in data science” depends on lots of things, so this is my take on it. I feel this one’s important for understanding higher-level data science trade-offs, but the bell curve is certainly crucial for understanding things like statistical behavior in the presence of large sample sizes, for example.