Clustering (3): K-Means Clustering
15:02
Clustering (1): Basics
6:00
9 years ago
Neural Networks (2): Backpropagation
14:49
Neural Networks (1): Basics
13:52
9 years ago
Support Vector Machines (3): Kernels
15:09
VC Dimension
17:42
11 years ago
Linear classifiers (1): Basics
14:15
11 years ago
Linear regression (6): Regularization
8:30
Linear regression (2): Gradient descent
14:21
Linear regression (1): Basics
5:47
11 years ago
Introduction (3): Supervised Learning
9:52
Bayes Classifiers (2): Naive Bayes
15:04
Bayes Classifiers (1)
11:51
11 years ago
Nearest Neighbor (1)
7:46
11 years ago
Multivariate Gaussian distributions
14:49
PCA, SVD
17:37
12 years ago
Clustering
32:09
12 years ago
Review: Probability
25:04
12 years ago
Comments
@dr.ganeshbhokare6547 7 days ago
Sir, if every vector is by default a column vector, then the dot product should be written "theta transpose x," which gives a scalar y; otherwise it gives a matrix. I think it is a typo. If your assumption is that all vectors are row vectors, then it is fine.
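A minimal NumPy sketch of the shape issue the comment raises (values and variable names are illustrative, not from the video): under the column-vector convention, theta-transpose times x is a scalar, while theta times x-transpose is a matrix.

```python
import numpy as np

# Column-vector convention: theta and x are d-by-1 arrays (here d = 2).
theta = np.array([[0.5], [2.0]])
x = np.array([[1.0], [3.0]])

y = theta.T @ x       # (1,2) @ (2,1) -> (1,1): a scalar, as expected for y
outer = theta @ x.T   # (2,1) @ (1,2) -> (2,2): a matrix, not a prediction
print(y.shape, outer.shape)   # (1, 1) (2, 2)
```

As Prof. Ihler notes in a reply further down, the slides that year migrated from column to row notation, which explains the stray transposes.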
@yurpipipchz75 1 month ago
Thank you for the knowledge!
@newbie8051 1 month ago
A bit tough to follow without any visualizations, but the relation to k-means was intuitive, since Gaussian mixture models essentially group the inputs as samples from k Gaussians... thanks!
@newbie8051 1 month ago
Oh, the visualization at 12:50 was amazing; I drew a Gaussian on the x-axis to better understand this. I love how I am progressing with this, thanks!!!!
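Since both comments above touch on the GMM/k-means connection, here is a minimal EM sketch for a Gaussian mixture (spherical covariances assumed for brevity; an illustration of the idea, not the lecture's code):

```python
import numpy as np

def gmm_em(X, k, iters=50, seed=0):
    """Minimal EM for a spherical Gaussian mixture model (illustrative)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    mu = X[rng.choice(n, size=k, replace=False)]  # initialize means at data points
    var = np.full(k, X.var())                     # one spherical variance per component
    pi = np.full(k, 1.0 / k)                      # mixing weights
    for _ in range(iters):
        # E-step: responsibilities r[i, c] proportional to pi_c * N(x_i | mu_c, var_c * I)
        d2 = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=-1)
        log_r = np.log(pi) - 0.5 * d * np.log(var) - d2 / (2 * var)
        log_r -= log_r.max(axis=1, keepdims=True)   # shift for numerical stability
        r = np.exp(log_r)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: weighted maximum-likelihood updates
        nk = r.sum(axis=0)
        mu = (r.T @ X) / nk[:, None]
        d2 = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=-1)
        var = (r * d2).sum(axis=0) / (d * nk)
        pi = nk / n
    return pi, mu, var
```

If the E-step is replaced by a hard assignment of each point to its nearest mean (variances shared and shrunk toward zero), the updates reduce to k-means, which is exactly the connection the comment draws.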
@user-nk1kz2fz2v 3 months ago
Why is the decision boundary of f(x, b) = sign(x·x - b) a circle?
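For what it's worth, the boundary is the set where the argument of sign(·) is zero (assuming x in R^2 and b > 0):

```latex
% Decision boundary of f(x, b) = sign(x . x - b):
\[
  x \cdot x - b = 0
  \;\Longleftrightarrow\;
  x_1^2 + x_2^2 = b ,
\]
% i.e., the points at squared distance b from the origin: a circle of
% radius \sqrt{b} centered at the origin.
```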
@ckjdinnj 4 months ago
Thank you! This video is so much better than all the garbage machine-learning clickbait videos.
@rasu84 4 months ago
At 11:00, what if the blue point is in between the two red points? I don't think a straight line can shatter them in that case. Or did I just misinterpret the entire thing?
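The standard resolution of this common question is that shattering only needs to succeed for some placement of the points, not every placement:

```latex
% VC dimension: the size of the LARGEST point set that CAN be shattered.
\[
  \mathrm{VC}(H) \;=\; \max \bigl\{\, n \;:\; \exists\, x_1, \dots, x_n
      \text{ such that } H \text{ shatters } \{x_1, \dots, x_n\} \,\bigr\}
\]
% Three collinear points (blue between two reds) cannot be shattered by a
% line, but a non-collinear triple can, so lines in the plane have VC
% dimension 3.
```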
@JieunKo-v1l 5 months ago
Thanks for the wonderful explanation! Do you share the slides?
@dadmehrdidgar4971 7 months ago
Great video even after 10 years! Thanks! :)
@pedroaragon3435 7 months ago
By far the best explanation of GMMs and especially the EM algorithm.
@carpediemcotidiem 7 months ago
00:01 Gradient boosting is a method of converting a sequence of weak learners into a very complex predictor.
01:31 Gradient boosting in a nutshell
02:56 Ensembling improves predictions by combining weak learners
04:30 Ensembles in gradient boosting add complexity gradually for a better fit
05:52 Gradient boosting involves adjusting predictions to reduce error
07:14 Gradient boosting involves fitting models to error residuals
08:56 The importance of the step size alpha_k in gradient boosting
10:20 Gradient boosting uses weighted sums of regressors for better predictions.
Crafted by Merlin AI.
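A minimal squared-loss boosting sketch matching those timestamps: each round fits a weak learner to the current residuals and adds it with step size alpha. This is an illustration using scikit-learn stumps, not the lecture's code:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def gradient_boost(X, y, n_rounds=100, alpha=0.1):
    """Gradient boosting for squared loss: a weighted sum of weak regressors."""
    pred = np.full(len(y), y.mean())  # round 0: the constant predictor
    learners = []
    for _ in range(n_rounds):
        residual = y - pred           # negative gradient of the squared loss
        stump = DecisionTreeRegressor(max_depth=1).fit(X, residual)
        pred = pred + alpha * stump.predict(X)   # damped additive update
        learners.append(stump)
    return learners, pred
```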
@mirettegeorgy5123 8 months ago
Thank you for this video, it helps a lot!
@Noah-jz3gt 9 months ago
Very clear and straightforward, while containing all the content necessary to understand the concept!
@shivampadmani_iisc 10 months ago
Thank you so much so much sooooo much
@samfriedman5031 11 months ago
4:07 MLE for sigma-hat should be X by X-transpose (outer product) not X-transpose by X (inner product)
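For reference, the corrected estimator the comment points to, written with column vectors (each term is a d-by-d outer product):

```latex
\[
  \hat{\mu} = \frac{1}{m} \sum_{j=1}^{m} x^{(j)},
  \qquad
  \hat{\Sigma} = \frac{1}{m} \sum_{j=1}^{m}
      \bigl(x^{(j)} - \hat{\mu}\bigr) \bigl(x^{(j)} - \hat{\mu}\bigr)^{\top}
\]
% With row vectors (the convention the slides moved to), the transpose
% goes on the first factor instead.
```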
@jokotajokisi 11 months ago
Oh my G. After 5 years of confusion, I finally understood Lp regularization! Thank you so much Alex!
@remandev9074 1 year ago
This is not a very good explanation at all. There's WAY too much theorem dumping with difficult-to-parse variables all over the place, and a big lack of tangible examples. I don't know what other people see in this video.
@rickclark4832 1 year ago
Exceptionally clear explanation of the use of EM with Gaussian Mixture Models
@Why_I_am_a_theist 1 year ago
Nice video. This is what I dig for on YouTube: an actual concise, clear explanation worth any paid course.
@spitalhelles3380 1 year ago
Thanks :)
@GenerativeDiffusionModel_AI_ML 1 year ago
Studying.
@ZLYang 1 year ago
At 4:32, if x and μ are row vectors, [x-μ] should also be a row vector. Then how can we multiply Σ^(-1) by [x-μ], since the dimension of Σ^(-1) is 2x2 and the dimension of [x-μ] is 1x2?
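The two conventions, for reference: with row vectors the transpose moves to the right-hand factor, so the product is (1 x d)(d x d)(d x 1), a scalar either way.

```latex
% Column vectors (x - mu is d x 1):
\[
  (x - \mu)^{\top} \, \Sigma^{-1} \, (x - \mu)
\]
% Row vectors (x - mu is 1 x d), as in these slides:
\[
  (x - \mu) \, \Sigma^{-1} \, (x - \mu)^{\top}
\]
```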
@ZLYang 1 year ago
The best explanation I have ever seen. I hope you can talk a bit about how to derive the equation.
@liubo19831214 1 year ago
Prof. Ihler, could you provide the reference for hard EM (in the last slide)? Thx!
@adityabarge8603 1 year ago
thanks for the explanation
@wenzwenzel2529 1 year ago
Bootstrap-Aggregation. So helpful when you're trying to learn every AWS Machine Learning tool!
@wenzwenzel2529 1 year ago
Very good nuance here.
@theSpicyHam 1 year ago
excellent presentation
@celisun1013 2 years ago
please publish more videos, professor Ihler!
@Phoenix231092 2 years ago
Very few videos online give some key concepts here, like what we're truly trying to minimize with the penalty expression. Most just give the equation but never explain the intuition behind L1 and L2. Kudos man
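The objective in question, for reference (λ trades off data fit against the penalty; p = 2 gives ridge, p = 1 gives the lasso):

```latex
\[
  \hat{\theta} \;=\; \arg\min_{\theta}\;
    \sum_{j=1}^{m} \bigl( y^{(j)} - \theta^{\top} x^{(j)} \bigr)^{2}
    \;+\; \lambda \, \lVert \theta \rVert_{p}^{p}
\]
% p = 2 shrinks all weights smoothly; p = 1 drives some weights exactly
% to zero (sparsity), which is the intuition the comment highlights.
```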
@Aakash-mi8xq 2 years ago
Very well explained. Thank you!
@nikhilchalla7216 2 years ago
Thank you very much for the amazing video! I have a couple of questions.
1. Is shattering the right criterion for defining the VC dimension when classifying the data into the right class is also important? For example, with the circle classifier, you can shatter two points but you cannot classify the two points correctly, as shown in the video.
2. One parameter with lots of power: would it be possible to share a complete example for this case, or any reference literature if you have any?
@Ranger-x5k 2 years ago
Excellent explanation!! (One correction: the outer product was written as the inner product in the slides.)
@erisha78 2 years ago
Beautiful!
@manueljenkin95 2 years ago
This video was so hard to follow and watch (too wordy, with very few pauses), but I'm thankful anyway, since I eventually got to understand it by thinking about it and rewatching a few times.
@KulvinderSingh-pm7cr 2 years ago
Simply the best explanation!!
@KulvinderSingh-pm7cr 2 years ago
Beautiful
@nathanzorndorf8214 2 years ago
Next time, I'd love it if you included the effect lambda has on regularization, including visuals!
@nathanzorndorf8214 2 years ago
Wow, that was such a great explanation. Thank you.
@TheProblembaer2 2 years ago
Hi Alexander Ihler, I watched plenty of videos, and this explanation is the best I found. Thank you!
@spyhunter0066 2 years ago
Could you explain more about the sum over vectors in your notation for the maximum likelihood estimates at minute 1:45? As far as I noticed, there is only one data set, namely one x vector. So what are you actually summing over with the j indices? Cheers.
@spyhunter0066 2 years ago
At minute 1:34, the maximum likelihood estimate formula has a 1/N coefficient; at minute 3:13, it is 1/m. We know N and m are the total number of values in the sums, but why did you use the two different symbols N and m? Is it just to separate the univariate and multivariate cases while they keep the same meaning? Also, the limits on the summation symbols are not so clear in this notation. Should we write j = 1 to j = m (or N), for instance?
@spyhunter0066 2 years ago
One more question about the example at minute 4:24: you said x1 and x2 are independent variables. Independent of what? As far as I can see, you can have two univariate formulas as in this example, but when you combine them to get the combined likelihood, you have to have a mean vector of size 2 and a Sigma matrix of size 2x2. That's always the case, right? The sizes of the mean vector and the Sigma matrix look like they are defined by the number of x values being combined. Is that right? I saw another example somewhere else, where you can have L(μ=28, σ=2 | x1=32 and x2=34), for instance, to find the combined likelihood at x1=32 and x2=34, and he uses only one mean and sigma for both. REF: kzbin.info/www/bejne/ep-Zk2yceK6Ipq8&ab_channel=StatQuestwithJoshStarmer
@spyhunter0066 2 years ago
I'd like to know what you call the x value in the univariate case, or the set of x values in the multivariate case, in your Gaussian distributions. Do you name them a "data set" or a "variable set"? Also, what makes the mean vector the same size as the x data? Thanks in advance. Should we think that we create one mean average for every added x data point in our data set? That's why we average them when we find the best estimated value in the end.
@spyhunter0066 2 years ago
In the formula at minute 2:11, when you find the inverse of the Sigma matrix inside the exp(...), do you use the unit-matrix method, some code, or some other method? Cheers.
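On the practical side, code rarely forms the inverse of Sigma explicitly; a linear solver is cheaper and more numerically stable. A minimal NumPy sketch (values illustrative):

```python
import numpy as np

Sigma = np.array([[2.0, 0.3],
                  [0.3, 1.0]])        # a 2x2 covariance matrix
diff = np.array([0.5, -1.2])          # x - mu, as a length-2 array

# (x - mu)^T Sigma^{-1} (x - mu) without computing the inverse:
quad = diff @ np.linalg.solve(Sigma, diff)
print(quad)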
@spyhunter0066 2 years ago
At 5:23, you should have said (x - mu) transpose.
@AlexanderIhler 2 years ago
These slides have a number of transposition notation errors, due to my having migrated from column to row notation that year. Unfortunately YouTube does not allow updating videos, so the errors remain. It should be clear in context, since I say "outer product" for the few non-inner products.
@spyhunter0066 2 years ago
@AlexanderIhler No worries, we spot them.
@d-rex7043 2 years ago
This should be mandatory viewing, before being assaulted with the symbolic derivations!
@spyhunter0066 2 years ago
Can you tell me the difference between the bivariate and multivariate cases? Can you also mention when the parameters are dependent, where we add an extra dependence-coefficient parameter? Here is a sample video to give you a better idea: kzbin.info/www/bejne/e5nQYaCZob-ma5Y
@AlexanderIhler 2 years ago
Bivariate = 2 variables; multivariate = more than one variable. So bivariate is a special case, in which the mean is two-dimensional and the covariance is 2x2. Above 2 dimensions it is hard to visualize, so I usually just draw 2D distributions; but the mathematics is exactly the same.
@spyhunter0066 2 years ago
@AlexanderIhler Your initial case of a 1D Gaussian with only one x value is indeed a bivariate case, with one x value and two parameters, the mean and the sigma value, right? Also, the bivariate case can be called the simplest case of the multivariate situation, right? If we have a data set x and multiple mean and sigma values, we have to use your MULTIVARIATE CASE with a vector of x values and mean values and a covariance matrix for the sigma values, shouldn't we? Thanks for the help in advance.
@AlexanderIhler 2 years ago
No, those are the parameters; if “x” (the random variable) is scalar, it is univariate, although the distribution may have any number of parameters. So, if x is bivariate, x=[x1,x2], the mean will have 2 entries and the covariance 4 (3 free parameters, since it is symmetric), so the distribution has 5 parameters total.
@spyhunter0066 2 years ago
@AlexanderIhler x is your data point, right? If it is only one scalar value, the case is called univariate, but if it is a vector of two scalar values, it is called bivariate by definition. That's it. For the bivariate and multivariate cases, where the data variable x is a vector of size d, the mean is also a vector of the same size as the x vector. Thus, by definition, the covariance matrix has to be a d-by-d square matrix if x and the mean have dimension d, as you said. I assume you said 5 parameters in total because the symmetric terms of the covariance matrix are equal, so 4 - 1 = 3 parameters come from that d-by-d Sigma matrix.
@spyhunter0066 2 years ago
Should we also take the x vector as a row vector of length d, just like the mu (mean) vector at minute 1:44?
@chyldstudios 2 years ago
Solid explanation.
@itarabichi 2 years ago
Great explanation! Every bit of it can be comprehended. Well Done!