Not sure if this is complete or correct, but here is a list of every concept used:
- derivatives and partial derivatives
- integration; an integral that is zero by symmetry (odd/even functions)
- Lagrange multipliers
- completing the square: taking the polynomial form with a, b, c and rewriting it in vertex form
- substitution
- a system of linear equations (I think 4 equations, 3 unknowns)
- other, more common tools: powers, the natural exponential e, factoring, basically algebra
- stats: mean, standard deviation, variance, z-score
@MachineLearningSimulation what do you think of this list? Wanna add anything?
@watching4410 19 days ago
Would like to see how this relates to entropy, i.e. the idea that the Gaussian maximizes entropy. Is there a distribution that minimizes it?
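For what it's worth, here is a minimal numerical sketch of that claim (my own illustration, assuming NumPy and SciPy are available; the comparison distributions are arbitrary choices): among a few distributions sharing the same variance, the Gaussian has the largest differential entropy. There is no minimizer, by the way: splitting the mass into ever narrower spikes keeps the variance fixed while the differential entropy goes to minus infinity.

import numpy as np
from scipy import stats

sigma = 1.0  # common standard deviation for all candidates

candidates = {
    # Laplace with variance sigma^2 needs scale b = sigma / sqrt(2);
    # uniform with variance sigma^2 needs width w = sigma * sqrt(12).
    "gaussian": stats.norm(loc=0.0, scale=sigma),
    "laplace": stats.laplace(loc=0.0, scale=sigma / np.sqrt(2)),
    "uniform": stats.uniform(loc=-sigma * np.sqrt(3), scale=sigma * np.sqrt(12)),
}

for name, dist in candidates.items():
    # .var() confirms the matched variance; .entropy() is the differential entropy in nats
    print(f"{name:8s}  var={dist.var():.3f}  entropy={dist.entropy():.4f}")

# For sigma = 1 this prints roughly 1.419 (Gaussian), 1.347 (Laplace), 1.242 (uniform).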
@Dhadheechi 4 months ago
Thank you so much for this explanation! I had no problem with taking the derivatives of the functional and getting ln(p) as a function of the lambdas. The real novel ideas (at least for me) are writing ln(p) as a completed square, carefully choosing the order in which you substitute it into the three constraints (first the mean constraint, then the normalisation constraint, then the variance constraint), and also making simple substitutions like y = x - mu. They seem trivial, but before watching this video I was banging my head over how to do these substitutions. Your video was really useful in introducing these small but useful tricks for deriving the Gaussian from the maximum entropy principle.
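For reference, the key steps look roughly like this (my own notation; the numbering of the Lagrange multipliers may differ from the video):

% Stationarity of the Lagrangian gives the log-density as a quadratic in x:
\ln p(x) = -1 + \lambda_1 + \lambda_2 x + \lambda_3 (x - \mu)^2
% Substituting y = x - \mu and completing the square in y:
\lambda_2 y + \lambda_3 y^2
  = \lambda_3 \left( y + \frac{\lambda_2}{2\lambda_3} \right)^2 - \frac{\lambda_2^2}{4\lambda_3}
\quad\Rightarrow\quad
p(y) \propto \exp\!\left[ \lambda_3 \left( y + \frac{\lambda_2}{2\lambda_3} \right)^2 \right]
% The mean constraint then forces \lambda_2 = 0, normalisation requires \lambda_3 < 0,
% and the variance constraint finally fixes \lambda_3 = -1/(2\sigma^2).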
@MachineLearningSimulation 4 months ago
You're very welcome 😊 Thanks for this kind comment. I also remember when I struggled with the derivation as I worked through Bishop's PRML book. He skipped over quite a few details, so I created this video with all of them :D. I'm glad you appreciated it. :)
@Dhadheechi 4 months ago
@@MachineLearningSimulation Oh I see... I am working through Bishop's new Deep Learning book, and it turns out that the first 2-3 chapters are the same in both books. Every time I have difficulty with a topic in chapter 3 of the new book (Standard Distributions), like deriving the conditional and marginal of the multivariate distribution or deriving the MLE for the same, you have a video covering it! Really appreciate your work, man... reading Bishop is proving to be quite a challenge for me as an undergrad who has just completed his first year. Gonna watch your Calculus of Variations videos and your scientific Python workshop next :)
@MachineLearningSimulation 4 months ago
@@Dhadheechi You're welcome! :) Good luck on your learning journey and thanks again for the nice comments also under the other videos.
@shreyasjaiswal4739 3 years ago
21:12 We can also explain it as the definite integral of an odd function being 0: since p(y) is even and y is odd, their product is odd.
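Spelled out (my notation), that symmetry argument reads:

\int_{-\infty}^{\infty} y\, p(y)\, dy
  = \int_{0}^{\infty} \big( y\, p(y) + (-y)\, p(-y) \big)\, dy
  = \int_{0}^{\infty} y \big( p(y) - p(-y) \big)\, dy
  = 0
\quad \text{whenever } p(-y) = p(y)
% i.e. the integrand y p(y) is odd, so the contributions from y and -y cancel.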
@MachineLearningSimulation 3 years ago
Yes, that's of course correct :)
@Camptonweat 1 year ago
Note that this integral is just the mean of p(y). It turns out to be zero because y is centred (noting that lambda_1 ends up being zero), so the reasoning in the video is a little circular. I think it is sufficient to simply say that the limits must converge to zero to preserve normalisation; that way you don't need to make any leading assumptions about p(y) being symmetric.
@user-or7ji5hv8y 3 years ago
Didn’t realize functionals could be so useful. Thanks
@MachineLearningSimulation 3 years ago
Thanks too for your feedback. There are a lot of nice applications for them, not only in Probabilistic Machine Learning but also in the Finite Element Method for Structural Mechanics Problems.
@ttommy516 1 year ago
Thanks so much for your detailed derivation! I suppose it might be better if a short explanation of the maximum entropy principle were added to the video, which would better highlight the motivation.
@MachineLearningSimulation 1 year ago
You're very welcome :) Indeed, that could have been a great motivation. Maybe content for a future video ;)
@bmw009a 1 year ago
A wonderful video and math journey, thank you very much
@MachineLearningSimulation 1 year ago
Thank you 😊 I'm happy you enjoyed it. It was a topic that bugged me for a long time, and I was happy once I could conclude it in a video.
@user-or7ji5hv8y 3 years ago
The concept here seems really profound. I'm just trying to see the intuition for why maximizing entropy gets us to the right pdf. Is there a way to see why that should work? I can see it with a coin: you maximize the entropy by setting the probability to 0.5, which corresponds to having the least amount of information about the coin. Does maximizing entropy work for deriving other distributions as well, as long as you set appropriate constraints?
@MachineLearningSimulation 3 years ago
Regarding the last question: yes, you can derive a lot of pdfs/pmfs from this maximum entropy principle given the right constraints. Take a look at this table on Wikipedia: en.wikipedia.org/wiki/Maximum_entropy_probability_distribution#Other_examples
Regarding the first question, I must admit that I don't have a good intuition beyond some mathematical considerations. Since the differential entropy is the integral of -p log(p), I think it makes sense that the distribution will somehow contain an exponential, but that is not specific to the Gaussian/Normal. I will think about it again and might come back to the thread if I can come up with a better intuition. Maybe someone else has something nice?
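To make the coin example from the question concrete (a standard one-line calculation, not taken from the video):

H(p) = -p \ln p - (1 - p) \ln(1 - p), \qquad
\frac{dH}{dp} = \ln\frac{1 - p}{p} = 0 \;\Rightarrow\; p = \tfrac{1}{2},
\qquad
\frac{d^2 H}{dp^2} = -\frac{1}{p(1 - p)} < 0
% so p = 1/2 is the entropy-maximizing bias, matching the "least information" intuition.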
@mpost909 1 year ago
Love the videos! Keep it up! Do you by any chance get your video ideas partly from Bishop's Pattern Recognition and Machine Learning?
@MachineLearningSimulation 1 year ago
Thanks a lot 😊 Yes, the videos on probabilistic machine learning are inspired by Bishop's book. This very video is essentially the solution to one of the exercises, actually one that plagued me for over two years until I finally figured it out. Hence, I thought it'd be cool to have a detailed video about it 😉 Glad you enjoyed it.
@kennettallgren640 2 years ago
Thank you for your nice derivation
@MachineLearningSimulation 2 years ago
You're welcome 😊
@simonesabbadini6648 10 months ago
Thank you Very Very Very Much.
@MachineLearningSimulation 10 months ago
You're very welcome! :)
@noot_2 2 years ago
Where did the minus sign from the entropy expression go? I think it flips the signs on a couple of expressions. Thank you nonetheless, I was struggling to understand this.
@MachineLearningSimulation 2 years ago
You're welcome :) Do you have a timestamp when it got lost? It's been some time since I created the video.
@noot_2 2 years ago
@@MachineLearningSimulation Never mind, as I was writing this I noticed that I had completely missed the part about taking the argmin of -H(p). Thank you so much for your time.
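For anyone else who trips over the same sign, the point spelled out (my notation):

\operatorname*{arg\,max}_{p} \; H[p]
  = \operatorname*{arg\,min}_{p} \; \big( -H[p] \big)
  = \operatorname*{arg\,min}_{p} \; \int p(x) \ln p(x)\, dx
% A derivation that minimizes -H therefore carries the opposite sign in front of the
% entropy term compared with one that maximizes H; the stationary point is the same.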
@MachineLearningSimulation 2 years ago
@@noot_2 You're welcome :)
@mirandasuryaprakash3269 2 years ago
Isn't that λ2·µ² at 11:49? Do we get the same result solving this?
@MachineLearningSimulation 2 years ago
Hi, it's been some time since I uploaded the video. I think at 11:42 I display a note that it should be different, and I correct it later in the video. It also seems to be correct in the PDF on GitHub: raw.githubusercontent.com/Ceyron/machine-learning-and-simulation/main/english/essential_pmf_pdf/univariate_normal_maximum_entropy_derivation.pdf Or do you mean something different? :) Please let me know, I'd be happy to help (and fix the mistake).
@djbuffonage9500 11 months ago
Can't the Gaussian be a minimum?
@MachineLearningSimulation 10 months ago
Valid question, indeed. We only showed that it is an extremum under the prescribed constraints. You can find an argument for why it is a maximum in chapter 1.6 of Chris Bishop's "Pattern Recognition and Machine Learning": www.microsoft.com/en-us/research/uploads/prod/2006/01/Bishop-Pattern-Recognition-and-Machine-Learning-2006.pdf
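A short sketch of one standard argument (my summary, not a quote from the book): the entropy functional is concave, so the stationary point found under the linear moment constraints is a global maximum rather than a minimum.

H[p] = \int -p(x) \ln p(x)\, dx, \qquad
\frac{d^2}{dp^2} \big( -p \ln p \big) = -\frac{1}{p} < 0 \quad \text{for } p > 0
% so the integrand is strictly concave in p, H[p] is a concave functional, and because the
% normalisation, mean, and variance constraints are all linear in p, any stationary point
% of the constrained problem is a global maximum.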