Machine Learning Lecture 28 "Ball Trees / Decision Trees" -Cornell CS4780 SP17

30,251 views

Kilian Weinberger

1 day ago

Comments: 32
@aleksandarcvetkovic7045 2 years ago
I needed to learn about KD trees and ball trees for a specific application, but after just two videos I am hooked! Now I am going to watch the whole course. Thank you so much, Kilian, for making it public :)
@Dragon-Slay3r 1 year ago
🐘? 😂
@abhinav9561 4 years ago
Decision Trees at 35:14
@shrek22 1 year ago
This guy inspires me. Thank you.
@bgtheaicompany214 5 years ago
Thank you for uploading this video.
@marcelo.5271 4 years ago
Really great content. And very much appreciated! Thanks :)
@salmaabdelmonem7482 4 years ago
Thanks a lot :) for the unique explanation!
@allennelson1987 4 years ago
Helpful. Sololearn uses KD trees in the machine learning tutorial.
@adiflorense1477 3 years ago
Great
@scchouhansanjay 4 years ago
1:20 I thought he would say, "if something does not make sense, then contact me" 😂😂😂
@vincentxu1964 5 years ago
Question about ball tree complexity. To construct the ball tree, we need to compute argmax distance(x, xi) over all xi in S, so the algorithm still has to go through all the data points and compute their distance to the randomly chosen point. In this sense, compared to plain (brute-force) kNN, I don't see any advantage to using a ball tree, since the complexity is almost the same. Or is it because we consider constructing the ball tree a precomputed step done before testing, so we don't add that part to the final query time? Great thanks!
@jeeveshjuneja445 5 years ago
Yes. It is precomputed.
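To make the split between build time and query time concrete, here is a minimal sketch using scikit-learn's BallTree (the data, sizes, and leaf_size are made up for illustration): construction is paid once, offline, over all n points; each later query only descends the tree and prunes.

    import numpy as np
    from sklearn.neighbors import BallTree

    rng = np.random.default_rng(0)
    X = rng.standard_normal((100_000, 20))  # training points, known in advance

    tree = BallTree(X, leaf_size=40)        # one-time construction cost, paid offline

    Xq = rng.standard_normal((10, 20))      # test points arriving later
    dist, ind = tree.query(Xq, k=3)         # each query prunes, far cheaper than a full scan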
@rjain3443 1 year ago
A huge thanks to you, Prof. Kilian!!! Big fan of your teaching. You really kill the topic, and it seems nothing more is left to know about it. 😀 I have a little question: why are ball trees not more famous, given that they work so well and are invariant to the dimensionality of the feature space?
@kilianweinberger698 1 year ago
They have gotten a little out of fashion because they are not that well suited for GPUs. On GPUs it is often more effective to perform approximate nearest neighbor search with brute-force parallelism.
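For intuition on the GPU point: brute-force nearest neighbor search is one big batched matrix computation, which is exactly the workload GPUs excel at. A minimal numpy sketch (numpy standing in for a GPU array library; the function name is mine):

    import numpy as np

    def brute_force_knn(X, Q, k):
        # Pairwise squared distances via ||q - x||^2 = ||q||^2 - 2 q.x + ||x||^2,
        # computed as one batched matrix product.
        d2 = (Q**2).sum(1)[:, None] - 2.0 * Q @ X.T + (X**2).sum(1)[None, :]
        return np.argsort(d2, axis=1)[:, :k]  # indices of the k nearest training points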
@bhupensinha3767 5 years ago
Hi, I got a slightly different understanding of KD trees from the python scikit-learn implementation. It says it uses, or switches to, a brute-force method in the final leaf to find the k nearest neighbors. The documentation does not talk about going up the tree and checking for neighbors hiding in other partitions; not sure if I am able to state my confusion clearly. Your ball tree example seems good, but scikit-learn is quiet about ball trees, although it supports them for the kNN algorithm.
@kilianweinberger698 5 years ago
What you describe sounds like approximate nearest neighbor search. Here, one variant is to also build a tree (ball tree, KD tree, or any other variant); however, during test time you descend into the leaf into which the test point falls and return the nearest neighbor only from this leaf. There is no guarantee that you will indeed return the exact nearest neighbor in a global sense, but often this is good enough and can be very fast in practice.
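A minimal sketch of that approximate variant, assuming a hypothetical tree node with is_leaf, split_dim, split_val, left/right children, and a points array at the leaves (these names are made up for illustration, not any library's API):

    import numpy as np

    def approx_nn(root, q):
        # Descend to the single leaf containing q, with no backtracking up
        # the tree, so the answer may miss the true nearest neighbor.
        node = root
        while not node.is_leaf:
            node = node.left if q[node.split_dim] <= node.split_val else node.right
        dists = np.linalg.norm(node.points - q, axis=1)
        return node.points[np.argmin(dists)]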
@angelmcorrea1704 4 years ago
Thanks for sharing these lectures. :)
@ThePatelprateek 2 years ago
If axis-aligned splits are the issue, would random axes work? Why did we create spheres instead of randomly oriented hyperplanes? Are there any particular advantages to spheres vs. hyperplanes? Any pointers would help. Thanks.
@vocabularybytesbypriyankgo1558 2 days ago
Thanks!!!
@saketdwivedi8187 4 years ago
What if we incorporate the label as an additional dimension and split on it first? Won't that always ensure that each leaf has only one label? (Because in a way we are trying to explain the variability in labels through variability in the features, so the variability in labels must be significant.)
@genesisevolution4243 3 years ago
@Saket Dwivedi that wouldn't help, because during testing your feature vector will not have the label dimension you just added; that is exactly what we want to predict.
@vatsan16 4 years ago
There goes the professor recruiting Arthur for a PhD :P
@subhasdh2446 2 years ago
That answer was really psych. Damnnnn!!!
@Dragon-Slay3r 1 month ago
I always tried to help; each time I did, the cult took advantage. Now it's their problem, and I don't know if I can help anybody. Everything has changed since last night.
@prateekpatel6082 2 years ago
Question: why construct spheres instead of hyperplanes like in a KD tree, but not axis-aligned?
@kilianweinberger698 2 years ago
A sphere is typically a tighter bound on the distance. In these data structures you want to bound the distance between the test point and a potential training point, and see if it could possibly be your nearest neighbor. In KD trees you say the distance from the test point to the training point is _at least_ the distance to the hyperplane. In ball trees you say it is at least the distance to the sphere. Typically the latter is larger, i.e. a better approximation, and it allows you to prune away more points. Hope this helps.
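A hedged sketch of the two lower bounds being compared (helper names are mine): during search, if a node's lower bound already exceeds the best distance found so far, the entire node is pruned.

    import numpy as np

    def ball_lower_bound(q, center, radius):
        # No point inside the ball can be closer to q than this.
        return max(0.0, np.linalg.norm(q - center) - radius)

    def box_lower_bound(q, lo, hi):
        # Distance from q to an axis-aligned box [lo, hi] (a KD-tree cell);
        # per coordinate, only the overshoot outside the box contributes.
        return np.linalg.norm(np.maximum(0.0, np.maximum(lo - q, q - hi)))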
@sudhanshuvashisht8960 3 years ago
Could anyone explain to me why ball trees are better than KD trees in a high-dimensional space (with a lower-dimensional manifold)? I find it hard to imagine/understand when the Prof. explains (12:55) that it is the axis-aligned splits that make KD trees bad even in this setting (a low-dimensional manifold in a high ambient dimension).
@forthrightgambitia1032 2 years ago
Because in higher-dimensional space the boxes get 'crushed up' together: points become almost equally far apart, so their bounding boxes become indistinguishable. One of the earlier lectures covers this.
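This concentration effect is easy to see numerically; a small numpy experiment (sample sizes chosen arbitrarily):

    import numpy as np

    rng = np.random.default_rng(0)
    for d in (2, 10, 100, 1000):
        X = rng.random((200, d))                         # 200 uniform points in d dims
        sq = (X**2).sum(1)
        D2 = sq[:, None] - 2.0 * X @ X.T + sq[None, :]   # squared pairwise distances
        D2 = D2[np.triu_indices(200, k=1)]               # distinct pairs only
        print(d, np.sqrt(D2.min() / D2.max()))           # min/max ratio creeps toward 1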
@sandeshhegde9143 5 years ago
Decision Trees start at: kzbin.info/www/bejne/e2LCiHaaiqanr6c
@RGDot422 11 months ago
Hi. Are ball trees and KD trees the same?
@kilianweinberger698 9 months ago
No. KD trees only split along features and essentially place the data into boxes, whereas ball trees fit the data into little spheres. Another way of thinking about it: imagine you have a data set and you rotate it. A KD tree will now create a very different data structure, because the axes have changed; a ball tree will be exactly the same. In general, KD trees tend to be better in very low-dimensional spaces (2D, 3D) and are often used in computer graphics to find nearest neighbors on meshes. Ball trees tend to be better at medium dimensionalities (e.g. 10D-50D).
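A small numpy illustration of the rotation point (a made-up check using a centroid-centered bounding ball): rotating the data leaves the ball's radius unchanged, while the axis-aligned bounding box that a KD tree would carve up changes completely.

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.standard_normal((1000, 3))

    Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))  # random orthogonal (distance-preserving) matrix
    Xr = X @ Q.T                                      # the same data, rotated

    def ball_radius(P):
        c = P.mean(axis=0)                            # centroid-centered bounding ball
        return np.linalg.norm(P - c, axis=1).max()

    print(ball_radius(X), ball_radius(Xr))            # identical up to rounding
    print(X.max(0) - X.min(0))                        # box side lengths...
    print(Xr.max(0) - Xr.min(0))                      # ...differ after rotation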
@adiflorense1477 3 years ago
6:20 It's like the bottom-up method, huh?