This type of video is my favorite. It's unique on YouTube. s/o from South Africa
@DimitriBianco 2 years ago
Thanks! I have a similar one coming out in a few weeks on how not to do cross validation and sampling.
@andresrossi9 2 years ago
I missed this one for some reason. Anyway, as someone who knows decision trees quite "in-depth", I'd say this is a very clear lesson; very good material, as always.
@seanmichael6579 2 years ago
Absolutely love this content and these kinds of videos. This was nuanced and came with real-world examples and experience. Beats anything I ever got from my stats grad school years. All the best.
@DimitriBianco 2 years ago
Thanks! I'll try and make a few more of these types of videos.
@QuantPy 2 years ago
As always, great quality video, Dimitri! If you're looking for video suggestions, I would really like to see one on the possible risks of building models on observed Granger causality between financial time series (perhaps not explainable because of the high number of independent variables) that may have led to good out-of-sample prediction performance. I'd also like to hear a practical example of model monitoring (perhaps some of the more popular metrics you have used previously) that could help detect whether a model is deteriorating. Thanks again for the effort you put into these videos.
@DimitriBianco 2 years ago
I'll look into making some videos around these ideas.
@Yasharghami 2 years ago
Can't wait.
@bhargav7476 2 years ago
Great stuff as always! I was wondering what's the average age group of your viewers?
@DimitriBianco 2 years ago
85% are between 18 and 34 years old.
@didierdupont5784 2 years ago
Great video! How would one avoid such a situation? In a scenario where there are thousands of predictors, I can hardly imagine looking at correlations before building the model could help, as there are just too many to manually go through. The same would apply when pruning a tree.
@DimitriBianco 2 years ago
Cluster analysis. You create clusters based on statistical relationships using something like PCA. There is a point where the value added by additional clusters becomes trivial; we often end up with around 20 clusters for 500 variables. Then you manually review the top few variables in each cluster and build a model with those variables, which would give you around 60 final variables.
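(A minimal sketch of the clustering idea described above, not Dimitri's exact workflow: it groups correlated predictors with hierarchical clustering on the correlation matrix rather than PCA, then keeps one representative per cluster. The function name, the 20-cluster target, and the "most correlated with its own cluster" selection rule are illustrative assumptions.)

```python
# Minimal sketch: reduce a wide predictor set by clustering correlated variables
# and keeping one representative per cluster. Assumes X is a numeric pandas DataFrame.
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def cluster_representatives(X, n_clusters=20):
    # Distance = 1 - |Spearman correlation|, so highly correlated variables sit close together.
    corr = X.corr(method="spearman").abs()
    dist = 1.0 - corr.to_numpy()
    np.fill_diagonal(dist, 0.0)  # guard against floating-point drift on the diagonal

    # Average-linkage hierarchical clustering, cut into the requested number of clusters.
    Z = linkage(squareform(dist, checks=False), method="average")
    labels = fcluster(Z, t=n_clusters, criterion="maxclust")

    reps = []
    for k in np.unique(labels):
        members = corr.columns[labels == k]
        # Pick the member most correlated with the rest of its cluster as a simple
        # stand-in for the "top variable"; in practice you would review these manually.
        avg_corr = corr.loc[members, members].mean().sort_values(ascending=False)
        reps.append(avg_corr.index[0])
    return reps


# Usage (hypothetical): reps = cluster_representatives(df[predictor_cols], n_clusters=20)
```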
@didierdupont5784 2 years ago
@DimitriBianco Makes sense, thank you!
@Rizzickk 2 years ago
Please make more ✅
@Shawro 2 years ago
Hi Dimitri. Great video. I'm currently halfway through my first year of undergrad, doing a dual math/CS degree. I've chosen the stats 'track' for the math part, but I'm not sure what the optimal 'track' for CS would be if I'm looking to best prepare myself for quant work. My options are data science, machine learning, and scientific computing. I'm sure they're all valuable skills to learn, but which do you think is the best foundation for quant work? Thanks in advance.
@DimitriBianco 2 years ago
I would do scientific computing, but all three are decent choices as ML and data science are taking off. Scientific computing should give you some nice math overlap, and numerical analysis is a key part of quant finance.
@FatmaNurAydin-p7r 1 year ago
Do you have any suggestions for scientific articles on the topic you mentioned in the video? Thank you...
@DimitriBianco 1 year ago
No, but you could Google it and see if any come up. The multicollinearity issue can be drawn logically from the math and mechanics of trees; you don't need a paper to come to this conclusion.
@Jay-xb5du 2 years ago
Hi Dimitri, informative and great video as always! Just a quick question: do you personally think a degree in statistics followed by an MFE would give me a better chance of becoming a quant analyst, or a financial mathematics degree followed by an MFE? And which degree do you think would prepare me better for the MFE itself? Thanks
@jasdeepsinghgrover2470 2 years ago
If one of the correlated variables is used in a split, then the others automatically become unlikely to be chosen since they won't reduce impurity much further. Won't this help a decision tree be more robust? (Adding the question from the premiere in case someone has the same doubt.)
@DimitriBianco 2 years ago
The strength of a decision tree is that it will prevent multicollinearity further down a branch. The issue is when variables are blindly selected based on correlation: if the wrong variable is used, it is highly likely the tree will fail quickly, which reduces robustness.
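(A small, self-contained illustration of this point, my own sketch rather than anything from the video: two nearly identical predictors compete for a regression tree's root split, so which one the tree "selects" can flip from one resample to the next.)

```python
# Sketch: with two highly correlated predictors, the variable a tree splits on
# is essentially arbitrary and can change across bootstrap resamples.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
n = 2000
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.02, size=n)   # near-duplicate of x1
x3 = rng.normal(size=n)                    # unrelated noise feature
y = 2.0 * x1 + rng.normal(scale=0.5, size=n)
X = np.column_stack([x1, x2, x3])

root_feature_counts = {0: 0, 1: 0, 2: 0}
for _ in range(200):
    idx = rng.integers(0, n, size=n)                       # bootstrap resample
    tree = DecisionTreeRegressor(max_depth=3).fit(X[idx], y[idx])
    root_feature_counts[int(tree.tree_.feature[0])] += 1   # feature used at the root

# x1 and x2 compete for the root split while x3 essentially never wins,
# so the "chosen" variable is not a reliable signal of the true driver.
print(root_feature_counts)
```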
@nyboret6384 2 years ago
@DimitriBianco It's true, blindly selecting variables into a model is a very dangerous business in ML/data science, and especially in XAI, where a partial dependence plot interpreted blindly can come with a wrong and/or noisy sign. Thanks for the good explanations. From an Asian (Cambodian) applied and theoretical economist and econometrics/mathematical statistician.
@Felix-vg4mv 2 years ago
Imagine you were the FBI and you predicted crime stats based on ice cream sales. Suddenly in November a video is posted on Facebook and mass riots ensue. Ice cream sales wouldn't change, yet crime would rise.