Expectation Maximization for the Gaussian Mixture Model | Full Derivation

5,372 views

Machine Learning & Simulation

Comments: 16
@agrawal.akash9702 10 months ago
this is legitimately such a great explanation. thanks!
@MachineLearningSimulation 9 months ago
You're very welcome! 😊
@MachineLearningSimulation 3 years ago
There was an error in the hand-written M-Step at the beginning of the video. For the first 3 minutes I was able to overlay it. Please refer to this as the correct expression for the M-Step.
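For reference, the standard M-Step update equations for a univariate Gaussian Mixture Model, written with the responsibilities r_{nk} from the E-Step, are the following (this is the common textbook form and may differ slightly in notation from the hand-written derivation in the video):

% M-Step updates for a univariate GMM, using the E-Step responsibilities r_{nk};
% standard textbook notation, not necessarily the video's.
\begin{align}
  N_k &= \sum_{n=1}^{N} r_{nk}, \\
  \pi_k^{\text{new}} &= \frac{N_k}{N}, \\
  \mu_k^{\text{new}} &= \frac{1}{N_k} \sum_{n=1}^{N} r_{nk} \, x_n, \\
  \left(\sigma_k^{\text{new}}\right)^2 &= \frac{1}{N_k} \sum_{n=1}^{N} r_{nk} \left(x_n - \mu_k^{\text{new}}\right)^2 .
\end{align}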
@vslaykovsky 2 years ago
11:30 Isn't it a lower bound of the marginal log-likelihood instead?
@MachineLearningSimulation 2 years ago
Hey, are you referring to the Q function?
@patrickg.3602 1 year ago
How are you sure that the stationary points of Q are maxima? Couldn't they be saddle points or minima as well? Or did you just skip the part where you have to check the second derivatives?
@MachineLearningSimulation 1 year ago
That's a great question! :) There was no specific check for this in the video. For theoretical investigations, you can consult the following paper: kzbin.info/www/bejne/jJuXk2eupM-Dg9k Pragmatically though, EM is often observed to behave robustly if well initialized. Since an EM fit is usually quite fast (compared to other ML methods), it is reasonable to start from multiple initial conditions and select the model with the best score (or otherwise best properties). For instance, check out this follow-up video on sieving: kzbin.info/www/bejne/jJuXk2eupM-Dg9k I can also recommend the documentation of scikit-learn: scikit-learn.org/stable/modules/mixture.html
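A minimal sketch of that multi-restart idea, assuming scikit-learn is available (the toy data and the manual loop are illustrative only, not the exact procedure from the sieving video; GaussianMixture's n_init parameter does the same internally):

import numpy as np
from sklearn.mixture import GaussianMixture

# Toy 1D data from two Gaussians (placeholder, not the video's data).
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2.0, 0.5, 300), rng.normal(3.0, 1.0, 200)]).reshape(-1, 1)

# Fit from several random initializations and keep the model with the best
# average log-likelihood on the training data.
best_model, best_score = None, -np.inf
for seed in range(10):
    gmm = GaussianMixture(n_components=2, covariance_type="full", random_state=seed).fit(x)
    score = gmm.score(x)  # mean log-likelihood per sample
    if score > best_score:
        best_model, best_score = gmm, score

print(best_model.weights_, best_model.means_.ravel())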
@nickelandcopper5636 3 years ago
Is it possible to have the Gaussian distributions be latent and the class be non-latent? Basically, the continuous variable would be latent now. What would that look like?
@MachineLearningSimulation 3 years ago
That's a valid question, but it is rather uncommon to do this in practice. At least, I haven't seen it. What would be your application? In my understanding, the EM algorithm works best for mixture distributions, which have a latent discrete part (the Categorical distribution) and a conditioned, observed continuous part (which could also be a distribution other than the Normal/Gaussian, but commonly the Gaussian is used, giving the Gaussian Mixture Model). However, generally speaking, you can build any DGM you like. It is just that many DGMs come with huge difficulties in training them. A more general way of training DGMs is Variational Inference (kzbin.info/www/bejne/mqnah4CbgJ5rbrs ) or MCMC (no video yet), which can also handle scenarios the EM cannot. In fact, the EM algorithm is identical to Variational Inference if we can analytically express the posterior, which we can for GMMs. But again, regarding your proposal, I do not think it would make a lot of sense to have the latent variable be a leaf node in the DGM. The way I understand latent variables is that you use them to model an unobserved cause of something, not an unobserved effect.
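For the GMM, that analytically available posterior is just the responsibility of component k for data point x_n (standard notation, possibly differing slightly from the video):

% Posterior over the latent class, i.e. the E-Step responsibilities.
r_{nk} = p(z_n = k \mid x_n)
       = \frac{\pi_k \, \mathcal{N}(x_n \mid \mu_k, \sigma_k^2)}{\sum_{j=1}^{K} \pi_j \, \mathcal{N}(x_n \mid \mu_j, \sigma_j^2)}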
@bartosz5592 3 years ago
Hi, what about the EM algorithm for a single bivariate Gaussian with missing values?
@MachineLearningSimulation 3 years ago
Hey, I answered your similar comment on the other video. Was it referring to the same thing?
@bartosz5592 3 years ago
@MachineLearningSimulation Yes, thank you.
@sulasrisuddin294 1 year ago
What about the syntax in R if we want to apply this to a survival mixture model?
@MachineLearningSimulation 1 year ago
Thanks for the comment. :) Unfortunately, I am not familiar with survival mixture models.
@user-or7ji5hv8y 3 years ago
Just wondering: could such an EM approach work well in cases where X is high-dimensional?
@MachineLearningSimulation 3 years ago
Yes, although that of course depends on how "high-dimensional". For a reasonable number of dimensions (2 to 100-ish) you can use the EM for the Gaussian Mixture Model where the Gaussians are multivariate. This introduces additional degrees of freedom, e.g. choosing a full covariance or just a diagonal one. I will cover this in the future once I have also introduced the Multivariate Normal in my other playlist. Stay tuned for that ;)
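As an illustration of those degrees of freedom, here is a minimal sketch using scikit-learn's GaussianMixture on placeholder multivariate data (an assumption of convenience; the channel's own follow-up video uses NumPy and TensorFlow Probability instead):

import numpy as np
from sklearn.mixture import GaussianMixture

# Placeholder 5-dimensional data drawn from two multivariate Gaussians.
rng = np.random.default_rng(1)
X = np.vstack([
    rng.multivariate_normal(np.zeros(5), np.eye(5), 400),
    rng.multivariate_normal(np.full(5, 3.0), 0.5 * np.eye(5), 400),
])

# "full" fits a dense covariance per component (more parameters),
# "diag" restricts each component to a diagonal covariance (cheaper in higher dimensions).
for cov_type in ("full", "diag"):
    gmm = GaussianMixture(n_components=2, covariance_type=cov_type, random_state=0).fit(X)
    print(cov_type, gmm.bic(X))  # BIC allows comparing the two parameterizations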
Implementing the EM for the Gaussian Mixture in Python | NumPy & TensorFlow Probability
20:25
Deriving the EM Algorithm for the Multivariate Gaussian Mixture Model
1:13:08
Machine Learning & Simulation
8K views
Expectation Maximization Algorithm | Intuition & General Derivation
29:47
Machine Learning & Simulation
8K views
Introduction to Machine Learning - 09 - Clustering and expectation-maximization
53:27
Tübingen Machine Learning
7K views
Mean Field Approach for Variational Inference | Intuition & General Derivation
25:40
Machine Learning & Simulation
10K views
Gaussian Mixture Model | Intuition & Introduction | TensorFlow Probability
17:43
Machine Learning & Simulation
5K views
EM Algorithm : Data Science Concepts
24:08
ritvikmath
78K views
Clustering (4): Gaussian Mixture Models and EM
17:11
Alexander Ihler
291K views
27. EM Algorithm for Latent Variable Models
51:17
Inside Bloomberg
20K views