this is legitimately such a great explanation. thanks!
@MachineLearningSimulation 9 months ago
You're very welcome! 😊
@MachineLearningSimulation 3 years ago
There was an error in the hand-written M-Step at the beginning of the video. I was able to overlay a correction for the first 3 minutes. Please refer to that overlay as the correct expression for the M-Step.
@vslaykovsky 2 years ago
11:30 isn't it a lower bound of marginal log-likelihood instead?
@MachineLearningSimulation 2 years ago
Hey, are you referring to the Q function?
@patrickg.3602 1 year ago
How are you sure that the points where the derivative of Q is zero are maxima? Couldn't they be saddle points or minima as well? Or did you just skip the part where you have to check the second derivatives?
@MachineLearningSimulation 1 year ago
That's a great question! :) There was no specific check for this in the video. For theoretical investigations, you can consult the following paper: kzbin.info/www/bejne/jJuXk2eupM-Dg9k Pragmatically though, EM is often observed to behave robustly if well initialized. Since an EM fit usually runs quite fast (compared to other ML methods), it is reasonable to start from multiple initial conditions and select the model with the best score (or otherwise best properties). For instance, check out this follow-up video on sieving: kzbin.info/www/bejne/jJuXk2eupM-Dg9k I can also recommend the documentation of scikit-learn: scikit-learn.org/stable/modules/mixture.html
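The multi-start idea can be sketched in plain Python. This is a minimal illustration under assumptions, not the code from the video: the function names (`fit_em`, `fit_em_multistart`), the 1D two-component toy data, the fixed iteration count, and the standard-deviation floor are all hypothetical choices made for the example.

```python
import math
import random


def normal_pdf(x, mean, std):
    """Density of a univariate Normal distribution."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def log_likelihood(data, weights, means, stds):
    """Marginal log-likelihood of the data under the mixture."""
    return sum(
        math.log(sum(w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, stds)))
        for x in data
    )


def fit_em(data, n_components, rng, n_iterations=50, min_std=1e-3):
    """One EM run for a 1D Gaussian mixture from a random initialization."""
    weights = [1.0 / n_components] * n_components
    means = rng.sample(data, n_components)  # random data points as initial means
    stds = [1.0] * n_components
    for _ in range(n_iterations):
        # E-step: responsibilities (posterior over the latent class per point)
        resp = []
        for x in data:
            joint = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, stds)]
            total = sum(joint)
            resp.append([j / total for j in joint])
        # M-step: closed-form updates of weights, means and standard deviations
        for k in range(n_components):
            n_k = sum(r[k] for r in resp)
            weights[k] = n_k / len(data)
            means[k] = sum(r[k] * x for r, x in zip(resp, data)) / n_k
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, data)) / n_k
            stds[k] = max(math.sqrt(var), min_std)  # floor guards against collapse
    return weights, means, stds


def fit_em_multistart(data, n_components, n_restarts=5, seed=0):
    """Run EM from several initializations and keep the best-scoring fit."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_restarts):
        params = fit_em(data, n_components, rng)
        score = log_likelihood(data, *params)
        if best is None or score > best[0]:
            best = (score, params)
    return best


# Toy data (assumed for illustration): two well-separated clusters.
data_rng = random.Random(42)
data = [data_rng.gauss(-2.0, 0.5) for _ in range(100)]
data += [data_rng.gauss(3.0, 1.0) for _ in range(100)]

best_score, (weights, means, stds) = fit_em_multistart(data, n_components=2)
print(best_score, sorted(means))
```

Selecting by log-likelihood is only one possible score. In scikit-learn, `GaussianMixture` exposes the same restart idea through its `n_init` parameter.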
@nickelandcopper5636 3 years ago
Is it possible to have the Gaussian distributions be latent and have the class be non-latent? Basically the continuous variable is latent now? What would this look like?
@MachineLearningSimulation 3 years ago
That's a valid question, but it is rather uncommon to do this in practice. At least, I haven't seen it. What would be your application? In my understanding, the EM algorithm works best for Mixture Distributions, which have a latent discrete part (the Categorical distribution) and a conditioned, observed continuous part (which could also be a distribution other than the Normal/Gaussian, but it is most commonly used for the Gaussian Mixture Model). However, generally speaking, you can build any DGM you like. It is just that many DGMs come with huge difficulties in training them. A more general way of training DGMs is Variational Inference (kzbin.info/www/bejne/mqnah4CbgJ5rbrs) or MCMC (no video yet), which can also handle scenarios that EM cannot. In fact, the EM algorithm is identical to Variational Inference if we can analytically express the posterior, which we can for GMMs. But again, regarding your proposal, I do not think it would make a lot of sense for the latent variable to be the leaf node in the DGM. As I understand latent variables, you use them to model an unobserved cause of something, not an unobserved effect.
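As a sketch of the link between EM and Variational Inference (the notation here is assumed for illustration and not taken from the video: $x$ observed, $z$ latent, $\theta$ the parameters, $q$ any distribution over $z$), the marginal log-likelihood splits into a lower bound plus a KL divergence:

```latex
\log p(x;\theta)
  = \underbrace{\mathbb{E}_{q(z)}\!\left[\log p(x,z;\theta) - \log q(z)\right]}_{\mathcal{L}(q,\theta)}
  + \mathrm{KL}\!\left(q(z)\,\|\,p(z \mid x;\theta)\right),
\qquad \mathrm{KL} \ge 0 .

% E-step: choosing q(z) = p(z | x; \theta^{old}) makes the KL term vanish.
% For a GMM this posterior is available in closed form (the responsibilities):
r_{nk} = p(z_n = k \mid x_n;\theta)
  = \frac{\pi_k \,\mathcal{N}(x_n;\mu_k,\sigma_k^2)}
         {\sum_{j} \pi_j \,\mathcal{N}(x_n;\mu_j,\sigma_j^2)} .
```

When the posterior cannot be written down like this, Variational Inference instead restricts $q$ to a tractable family and maximizes $\mathcal{L}(q,\theta)$, so the bound is generally no longer tight.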
@bartosz5592 3 years ago
Hi, what about the EM algorithm for one bivariate Gaussian with missing values?
@MachineLearningSimulation 3 years ago
Hey, I answered your similar comment on the other video. Was it about the same question?
@bartosz5592 3 years ago
@MachineLearningSimulation Yes, thank you.
@sulasrisuddin294 1 year ago
What would the syntax in R look like if we want to apply this to a survival mixture model?
@MachineLearningSimulation 1 year ago
Thanks for the comment. :) Unfortunately, I am not familiar with survival mixture models.
@user-or7ji5hv8y 3 years ago
Just wondering: could such an EM approach work well in cases where X is high-dimensional?
@MachineLearningSimulation 3 years ago
Yes, although that of course depends on how "high-dimensional" the data is. For a moderate number of dimensions (2 to 100-ish) you can use EM for the Gaussian Mixture Model with multivariate Gaussians. This introduces additional degrees of freedom, e.g. choosing a full covariance matrix or just a diagonal one, etc. I will cover this in the future once I have also introduced the Multivariate Normal in my other playlist. Stay tuned for that ;)
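To make the "additional degrees of freedom" concrete, here is a small sketch that counts the free parameters of a multivariate Gaussian Mixture Model for full versus diagonal covariances. The function name `gmm_parameter_count` and the choice of 3 components are assumptions made for this example.

```python
def gmm_parameter_count(n_components, n_dims, covariance="full"):
    """Number of free parameters of a multivariate Gaussian mixture model."""
    weight_params = n_components - 1        # mixture weights sum to one
    mean_params = n_components * n_dims     # one mean vector per component
    if covariance == "full":
        # symmetric n_dims x n_dims matrix per component
        cov_params = n_components * n_dims * (n_dims + 1) // 2
    elif covariance == "diag":
        # one variance per dimension and component
        cov_params = n_components * n_dims
    else:
        raise ValueError("covariance must be 'full' or 'diag'")
    return weight_params + mean_params + cov_params


for n_dims in (2, 10, 100):
    full = gmm_parameter_count(3, n_dims, "full")
    diag = gmm_parameter_count(3, n_dims, "diag")
    print(n_dims, full, diag)
```

The full-covariance count grows quadratically with the dimension while the diagonal one grows linearly, which is one reason the choice matters more as X gets larger. scikit-learn's `GaussianMixture` offers this choice through its `covariance_type` parameter.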