Maximum Likelihood Estimation (MLE) | Score equation | Information | Invariance

110,108 views

zedstatistics

1 day ago

Comments: 70
@Manny123-y3j 2 years ago
I am stunned. This video is about 1,000X clearer than the explanation my professor gave on all this. You are SO clear. It's a life-saver! Thank you!
@davidbanahene307 4 years ago
You don't know the number of people you are helping every now and then. Kudos! I do appreciate your great effort to help and, in a way, contribute to our success. #GODBLESSYOU
@markwilson9490 3 years ago
The explanation of the MLE, score function, information, etc. here is unbelievably simple and effective! This alternative perspective really helped my understanding. Thank you.
@Elizabeth_Lynch 6 years ago
Thank you, so helpful. I appreciate that you touched on MLE with multiple parameters.
@michaelbaudin 4 years ago
Thank you very much for sharing this. There is a possible confusion at 33:03. The equation shows the likelihood depending on (mu, sigma^2) but the plot shows it depending on (mu, sigma) i.e. without square. This is not an error, because the maximum likelihood estimator is for the (mu, sigma^2) vector as well as for (mu, sigma). It does not change much of the graphical meaning of the figure, but introduces a confusion on the intent of this figure. I guess that a clarification might be helpful on this topic. Anyway, your video was very helpful: thanks again for it.
@BabakFiFoo 5 years ago
Thank you for this amazing video! It is very informative, and it could be even better if, whenever you use a vector of parameters such as "X", you wrote it as bold "X". Then the notation would be less confusing.
@davidradulovic9034 5 years ago
The progression graph at the beginning of each video might seem to some people as a minor aspect of the whole video, but it's very significant for me. Lets me know what to expect and that feels good. :)
@fightwithbiomechanix 4 years ago
I'm an engineer in the manufacturing sector. Your videos have been essential in understanding the statistics I use to justify designed experiments for process improvement.
@cdr.dr.shishirsahay9184 1 year ago
Very nicely explained. A BIGGG GOD BLESS to you!
@harikrishnareddygali6244 1 year ago
You have put a great deal of work into explaining that. Thank you very much.
@craighennessy3183 1 year ago
Why can't my textbooks explain it like this? Zed, you are a legend!
@アメリカンドクター 3 years ago
Awesome video. Much better than the disorganized lecture by my prof lol.
@srishtigupta9534 3 years ago
Thank you, it was very helpful.
@mikelmendibeabarrategi1102 1 year ago
You are crazy good at this
@ciaranmahon7415 4 years ago
I would be so fucked in my Math Stats class rn without these videos. Thank u
@erich_l4644 4 years ago
42 minutes? yuck, no thanks. Oh wait, he said Saddle Up. I'm IN! LETS GO
@lucaslopesf 6 years ago
You saved my life! Thank you SO much!
@lucarampoldi7743 2 years ago
Really well done - the examples following the theoretical discussion are especially useful. Thank you so much for uploading this!
@arpitanand4693 1 year ago
Hi, could anyone help me with reading the notation L(theta; y) in the context of the pregnancy example he gave in the video?
@abcpsc 5 years ago
Thanks for the video. How about the confidence interval in your multivariable example?
@jlz5907 5 years ago
Thank you SO much! This really helped me a lot
@xuyang2776 3 months ago
Thank you very much. But could you tell me why the standard errors of ML estimators come from the inverse of the Fisher information matrix?
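Re the question above: loosely, the information measures how sharply peaked the log-likelihood is at its maximum, and asymptotically the MLE is normal with variance 1/I(θ) (the matrix inverse in the multi-parameter case), so SE ≈ sqrt(1/I(θ̂)). A simulation sketch using the video's geometric-style pregnancy setup (n = 20 conceptions, per-cycle probability θ = 0.2), for which I(θ) = n / (θ²(1 − θ)):

```python
import math
import random

random.seed(0)
n, theta = 20, 0.2  # couples and true per-cycle probability, as in the example

def simulate_total_attempts():
    # Total cycles Y until each of the n couples conceives (geometric trials)
    total = 0
    for _ in range(n):
        while True:
            total += 1
            if random.random() < theta:
                break
    return total

# Fisher information for this model: I(theta) = n / (theta^2 * (1 - theta))
se_theory = math.sqrt(1.0 / (n / (theta**2 * (1 - theta))))

# Empirical spread of the MLE theta_hat = n / Y over many replications
mles = [n / simulate_total_attempts() for _ in range(5000)]
mean = sum(mles) / len(mles)
se_empirical = math.sqrt(sum((m - mean) ** 2 for m in mles) / len(mles))

print(se_theory)     # 0.04
print(se_empirical)  # close to 0.04
```

The empirical standard deviation of the simulated MLEs lands near sqrt(1/I(θ)) = 0.04, which is the asymptotic result being asked about.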
@kjyfhjjj 4 years ago
Thank you so much! This is so helpful! Can you please make more videos with more proofs and algebra? For example, the proof that the MLE is asymptotically normal, the calculation of the variance estimate, etc.?
@adamkolany1668 1 year ago
@18:26 So you postulate that θ is normally distributed with mean obtained from MLE and variance being 1/I(θ) ?
@wtsg1982 2 years ago
This helps me understand how the likelihood is used to estimate a model, where the maximum is obtained from the score equation. But I might need your help to understand, at ~15:40, how setting the derivative to 0 is transformed.
@mohdirfan-pu8fc 2 years ago
Nice lecture, sir. Kindly make a video on MLE for multiple parameters in implicit form, with R code.
@Maymona93 3 years ago
Thank you, could you please share the sources that you mentioned could help with calculus & differentiation?
@ouafaeouaali4676 1 year ago
Thanks for the course, it's clearly explained. May I know what software or application you use for the course (Beamer? PowerPoint?)
@backerlifan 3 years ago
I once heard OLS and MLE yield the same result under a normal distribution. If that's the case, the pros and cons (especially the pros) just seem negligible, don't they?
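On the equivalence mentioned above: with normal errors the log-likelihood is, up to constants, −Σ(yᵢ − μ)² / (2σ²), so maximizing it is the same as minimizing the sum of squares. A minimal numerical sketch for a constant-only model (the data values are made up for illustration):

```python
import math

# Toy sample (hypothetical values)
y = [2.1, 3.4, 1.9, 4.0, 2.6, 3.3]
n = len(y)
ybar = sum(y) / n  # the OLS estimate for a constant-only model

def log_lik_mu(mu, sigma=1.0):
    # Normal log-likelihood in mu (sigma held fixed); constants kept for completeness
    return sum(-0.5 * math.log(2 * math.pi * sigma**2)
               - (yi - mu) ** 2 / (2 * sigma**2) for yi in y)

# Grid search for the maximiser of the log-likelihood
grid = [i / 10000 for i in range(0, 60001)]
mle = max(grid, key=log_lik_mu)

print(ybar)  # sample mean
print(mle)   # matches ybar to grid resolution
```

The two estimates agree (up to the grid step), which is why the pros of MLE only really show up outside the normal-error setting: non-normal models, constrained parameters, and the likelihood-based inference machinery (information, score tests, likelihood ratios).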
@ruchikalalit1304 7 months ago
Which book is being referred to in this series, or any other book for this topic? Anyone who knows, please tell.
@jimjohnson357 2 years ago
18:48 you say that the square root of the variance is the standard error (which is then used to find upper and lower limits of confidence interval). I thought the square root of variance is the standard deviation? And therefore, you would need an extra 1/sqrt(n) factor to take the standard deviation to the standard error which can then be used to find the limits? Why in this case is the square root of the variance = standard error and not standard deviation?
@BilalTaskin-om6il 1 year ago
Life saver...❤
@yelshadaygebreselassie3163 3 years ago
I love your videos. You explain the concepts so clearly. I have one question. In the first example, why would the probability of getting pregnant on the second attempt depend on the first event? Aren't the different attempts independent? Shouldn't the probability of getting pregnant be 0.15 for all individual attempts?
@coolblue5929 2 years ago
This part I think I can answer. The probability of getting pregnant on the second attempt must exclude the probability of success on the first attempt so, success on the 2nd attempt means failure on 1st AND success on 2nd. Prob of success on 1 = 0.15 so prob of failure on 1 = 1 - 0.15 = 0.85. Therefore prob of failure on 1st AND success on 2nd = 0.85 * 0.15.
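That reasoning can be checked in a couple of lines. A sketch, assuming the video's per-cycle probability of 0.15:

```python
p = 0.15  # per-cycle probability of conceiving, as in the video

def prob_first_success_on(k, p=p):
    # Must fail on the first k-1 attempts, then succeed on attempt k
    return (1 - p) ** (k - 1) * p

print(prob_first_success_on(1))  # 0.15
print(prob_first_success_on(2))  # 0.85 * 0.15 = 0.1275
# The probabilities over all possible attempt numbers sum to (almost) 1:
print(sum(prob_first_success_on(k) for k in range(1, 500)))
```

That the probabilities sum to 1 is a good sanity check that "success on attempt k" events are mutually exclusive and exhaustive, which is exactly why the 0.85 factor is needed.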
@youssefdirani 2 years ago
17:06 where did this expectation formula come from?
@Kogsworth 2 years ago
If I graph the likelihood function at 10:28, it doesn't look anything like the graph in the video. I get really small values for 0.2 rather than really large ones.
@DJMoSheckles 1 year ago
Hi, this video is incredible, as are all of yours, but I'm very confused why the second derivative at 16:39 has both values negative. I've taken it multiple ways and plugged it into Wolfram Alpha, and I receive (y-n)/(1-theta)^2 - n/theta^2.
@aschiffer 5 months ago
This seems right to me too, the derivative of (y-n)/(1-theta) swapped signs on the first derivative and there's no reason it wouldn't swap back on the second. You still have to chain rule d(theta) which is -1, right?
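The video's version (both terms negative) checks out numerically. For l(θ) = n log θ + (Y − n) log(1 − θ), the first derivative is n/θ − (Y − n)/(1 − θ); differentiating −(Y − n)/(1 − θ) again picks up another chain-rule factor of −1 from (1 − θ), so the sign flips back and l''(θ) = −n/θ² − (Y − n)/(1 − θ)². A finite-difference sketch using the example's numbers (n = 20 conceptions, Y = 100 total cycles):

```python
import math

n, Y = 20, 100  # conceptions and total cycles, as in the example

def log_lik(theta):
    # l(theta) = n*log(theta) + (Y - n)*log(1 - theta)
    return n * math.log(theta) + (Y - n) * math.log(1 - theta)

def second_derivative_analytic(theta):
    # Both terms negative, as in the video
    return -n / theta**2 - (Y - n) / (1 - theta)**2

def second_derivative_numeric(theta, h=1e-4):
    # Central finite difference for l''(theta)
    return (log_lik(theta + h) - 2 * log_lik(theta) + log_lik(theta - h)) / h**2

theta = 0.2
print(second_derivative_analytic(theta))  # -625.0
print(second_derivative_numeric(theta))   # close to -625.0
```

The numerical second derivative agrees with the video's formula, so the Wolfram Alpha transcription above most likely dropped the second chain-rule sign flip on the (1 − θ) term.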
@kprao9949 4 years ago
superb lecture
@sdsa007 1 year ago
I'm going over my notes... and this tutorial is very clear and I enjoy verifying the math... but I got stuck at around 15:24 trying to understand the estimator mathematically... intuitively it totally makes sense that the estimate should be 20/100, but I am not understanding how it comes from the derivative of l(theta)... when I isolate theta I get theta/(1-theta) on one side... but that is not the same as reducing to a single theta variable...
@sdsa007 1 year ago
Finally got the math right... even though I couldn't isolate theta as a single variable! I got down to n/(y-n) = theta/(1-theta)... substituting, I get 20/(100-20) = theta/(1-theta)... dividing the left side by 100 (top and bottom), I get 0.20/(1-0.20) = theta/(1-theta)... therefore by visual analogy, theta is 0.20 (the estimate). You can reduce to a single variable by cross-multiplying the denominators, expanding and reducing, but it's a lot of tedious work... 0.20(1-theta) = theta(1-0.20)...
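The same answer also drops out of solving the score equation numerically instead of by hand. A bisection sketch on l'(θ) = n/θ − (Y − n)/(1 − θ), using n = 20 and Y = 100 from the example:

```python
n, Y = 20, 100  # 20 pregnancies in 100 total cycles, as in the example

def score(theta):
    # Derivative of l(theta) = n*log(theta) + (Y - n)*log(1 - theta)
    return n / theta - (Y - n) / (1 - theta)

# Bisection: the score is positive below the MLE and negative above it
lo, hi = 1e-6, 1 - 1e-6
for _ in range(100):
    mid = (lo + hi) / 2
    if score(mid) > 0:
        lo = mid
    else:
        hi = mid

theta_hat = (lo + hi) / 2
print(theta_hat)  # converges to 0.2 = n / Y
```

The root lands at n/Y = 0.2, matching the cross-multiplication route above.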
@sherlocksilver9392 3 years ago
Does anyone know why in a score test we divide by the information at the null parameter values? I know that the information at the MLE represents the "sharpness" of the likelihood function, but what does information represent at a different parameter value that is not the maximum of the likelihood function?
@whetstoneguy6717 4 years ago
Mr. Justin Z--It would have been helpful if you had gone over the intermediary math steps. Thank you. WhetstoneGuy
@mightbin 2 years ago
convinced again
@adamkolany1668 1 year ago
@13:45 In order to speak about the "expected" value you MUST have a random variable. Where are they?? @13:59 WHY??
@minma02262 4 years ago
If there is a god, I want it to be you.
@angelzash4u2 4 years ago
Hi, can I get your assistance in solving a problem using the maximum likelihood method?
@anindadatta164 2 years ago
Var(RV) = E(RV^2) - (mean of RV)^2 is an easy method. So where is the need to do partial differentiation of two simultaneous equations and set them to zero, when effectively the same result for the variance is thrown up?
@enjoying-the-ride1295 2 years ago
I'm learning tons from your content, Zed, thank you. Can anyone tell me, at 36:06, why is mu not negative? The log-likelihood function (after removing the constant and the component with log sigma squared) starts with a negative, so shouldn't it be negative?
@steffenmuhle6517 2 years ago
If x = 0 then -x = 0 as well. That mu at 36:06 comes from setting the numerator to zero.
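In other words, the score for mu is Σ(yᵢ − μ)/σ², and a sum equal to zero stays zero regardless of any overall minus sign, which gives μ̂ = ȳ. A tiny sketch with made-up data:

```python
# Hypothetical sample for illustration
y = [5.2, 4.8, 6.1, 5.5, 4.9]
n = len(y)

# Setting d/d(mu) of the log-likelihood to zero gives sum(y_i - mu) = 0,
# i.e. mu_hat = sample mean; the leading minus cancels because -x = 0 iff x = 0.
mu_hat = sum(y) / n
sigma2_hat = sum((yi - mu_hat) ** 2 for yi in y) / n  # MLE uses 1/n, not 1/(n-1)

# The numerator of the score in mu vanishes at mu_hat:
print(sum(yi - mu_hat for yi in y))  # ~ 0
print(mu_hat, sigma2_hat)
```

Note the σ² MLE divides by n rather than n − 1, which is why the MLE of the variance is biased in small samples.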
@Catwomen4512 6 years ago
I don't understand why the E(Y) is equal to n/theta
@k.sladkina872 6 years ago
I have the same problem
@Catwomen4512 6 years ago
@@k.sladkina872 I found out it is simply related to the distribution you use. Google different distributions (normal, binomial, etc.) and if you look at the wikipedia page, on the right, it states what the mean E(X) and variance V(X) are equal to
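For the geometric case in the video specifically: each couple's attempt count has mean 1/θ, and the expectation of a sum is the sum of the expectations, so E(Y) = n/θ. A quick numerical check, assuming θ = 0.2 and n = 20 from the example:

```python
theta, n = 0.2, 20  # per-cycle probability and number of couples, as in the example

# Each couple's attempt count is geometric: P(first success on attempt k) =
# (1 - theta)^(k-1) * theta, with mean 1/theta. Sum the series to verify:
mean_one = sum(k * (1 - theta) ** (k - 1) * theta for k in range(1, 2000))

print(mean_one)      # ~ 5.0 = 1/theta
print(n * mean_one)  # ~ 100 = n/theta, the E(Y) used in the video
```

So E(Y) = n/θ is just "n couples, each expected to take 1/θ attempts", not a property that needs to be derived fresh for the likelihood.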
@ProfessionalTycoons 5 years ago
great video mate.
@joeekstein9174 2 years ago
Thanks!
@joeyquiet4020 1 year ago
best best best
@Pier_Py 3 years ago
You are so f good
@snackbob100 4 years ago
Why can't uni lectures be like this? I pay so much money for an inferior education.
@jhanvitanna4541 2 years ago
Your content is amazing, but the sound quality is really bad.
@coolblue5929 2 years ago
Where is the sample data though?? Aren't we supposed to be fitting the distribution to a sample? Isn't that the whole point? Why do you just say, oh, 15%??
@johnmook135 4 years ago
Why does this stuff matter? I'm taking math stats for the second time and I understand zero. I can do the basic stuff described in videos, but the problems are never just "multiply all the pdfs together, take the log, differentiate, and set to zero"... There are always wrinkles. Like one problem where I have to deal with an absolute value, and they start talking about the median in the solution... Iye-yi-yi. I dislike math stats and really want to know how this will help me predict stocks or in any future job.
@zedstatistics 4 years ago
pretty sure it's what god created on the 3rd day. He created the heaven and earth, the land and the waters, and then differential calculus.
@johnmook135 4 years ago
@@zedstatistics The calculus isn't that bad. I love it. Although I question it. It's a language to explain something, something very complex. Seems like there could be flaws. But these things work time and time again? Crazy. More particularly, I just don't know how all this MLE and Bayes' theorem, sufficient statistics, data reduction, and improving an estimator relates to real-life problems. I'm a data science major. I like sentdex's videos on YouTube. All these advanced stats classes I am taking just don't make sense. Or at least, reading from the book, my teachers just don't relate it to the real world, and it doesn't make sense. Any suggestions/tips or playlists you could point me to that would help my statistical data science career and understanding? I like math, I like stocks. Not sure how to combine them outside sentdex's videos.
@johnmook135 4 years ago
Any playlist that would help me solve problems like this? -- Suppose that 21 observations are taken at random from an exponential distribution for which the mean μ is unknown (μ > 0), the average of 20 of these observations is 6, and although the exact value of the other observation could not be determined, it was known to be greater than 15. Determine the M.L.E. of μ. -- My book is Probability and Statistics, 4th edition, by DeGroot; there is a free PDF available online.
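For what it's worth, here is a sketch of one standard way to set up that censored-data problem (my reading of it, not necessarily the book's solution): with density (1/μ)e^(−y/μ), the 20 fully observed points contribute their densities (their sum is 20 × 6 = 120) and the censored point contributes its survival probability e^(−15/μ), so l(μ) = −20 log μ − 135/μ:

```python
import math

# Log-likelihood: 20 observations summing to 120, plus one right-censored
# observation known only to exceed 15, which contributes exp(-15/mu):
def log_lik(mu):
    return -20 * math.log(mu) - 120 / mu - 15 / mu

# Setting l'(mu) = -20/mu + 135/mu^2 = 0 gives mu_hat = 135/20:
mu_hat_closed = 135 / 20
print(mu_hat_closed)  # 6.75

# Numeric check via a coarse grid search
grid = [i / 1000 for i in range(1000, 20001)]
mu_hat_grid = max(grid, key=log_lik)
print(mu_hat_grid)  # agrees with 6.75 to grid resolution
```

The point of the "wrinkle" is exactly the one in the comment: the censored observation enters the likelihood through its survival probability, not its density, and the rest is the usual take-the-log-and-differentiate routine.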
@lzl4226
@lzl4226 4 жыл бұрын
On the subject of predicting stocks, I guess you want to build a robot that takes today's stock market data and spits out a distribution of actions you can take that would make you the most money. Let's call this robot π(θ), because it's just a function parameterises by θ. And you want the maximum likelihood of θ that will make you the most money (let's call that Q*, where Q(a|s) is the reward of taking action a at step s). Since you're a data major you probably can see where this is going. You want a neuro-net that models π(θ) and you want to train it to solve for -Δlog(π(θ))Q (notice the score function here), where Q is the Reward of your trading actions (and in practice simulated by another neuro-net). Notice you want to find the set of θ for π(θ) that maximises Q(Q*) (using maximum likelihood and past stock data possibly flattened by some RNN). Furthermore you want to incrementally improve π within a confidence interval, so you don't make too big of a step that will collapse your convergence.... and you'll see the fisher information matrix come up in this calculation if you dig further. So yeah it prob helps in your future job in stock market prediction, if that's where you're headed.
@coolblue5929 2 years ago
@@lzl4226 except, stock prices are not produced by a stationary process.
@CaptZdq1 5 years ago
Fellow 'nerds'?! That's very abusive. You should be imprisoned for that.
@zedstatistics 4 years ago
perhaps fellow 'sailors' aye captain?
Hypothesis testing (ALL YOU NEED TO KNOW!)
1:08:17
zedstatistics
300K views

Likelihood | Log likelihood | Sufficiency | Multiple parameters
28:24
zedstatistics
172K views
Maximum Likelihood For the Normal Distribution, step-by-step!!!
19:50
StatQuest with Josh Starmer
555K views

The Fisher Information
17:28
Mutual Information
68K views

Likelihood Estimation - THE MATH YOU SHOULD KNOW!
27:49
CodeEmporium
54K views

Maximum Likelihood - Cramer Rao Lower Bound Intuition
8:00
Ben Lambert
131K views

What is Fisher Information?
19:24
Iain Explains Signals, Systems, and Digital Comms
21K views

Maximum Likelihood Estimation ... MADE EASY!!!
9:12
Learn Statistics with Brian
24K views

The Key Equation Behind Probability
26:24
Artem Kirsanov
148K views

Wald test | Likelihood ratio test | Score test
13:30
zedstatistics
111K views