Heteroskedasticity

Heteroskedasticity - example 2

Views: 64,546

Ben Lambert


Comments: 9

@diegojaimes7430 4 years ago

I'm not religious, but I think this man might be the messiah

@SpartacanUsuals 11 years ago

Hi, thanks for your message. You are correct that OLS minimises the sum of the squared residuals; however, its ultimate purpose is to estimate the parameters in the population (I normally call them beta). The problem with ordinary least squares in the presence of heteroscedasticity is that it gives too much weight to points in regions where the variance is high. This means that the variance of the OLS estimators is higher than that of estimators which give lower weight to high-variance regions. Thanks, Ben
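The point about weighting can be checked with a small simulation. The sketch below is only an illustration under assumed settings (a true line y = 1 + 2x, error standard deviation equal to x, and weights 1/x^2 for the weighted estimator); none of these numbers come from the video. It compares the sampling variance of the OLS slope with that of a weighted least squares slope that down-weights the high-variance region.

```python
import random
import statistics


def weighted_slope(xs, ys, ws):
    """Slope of a weighted least-squares line; equal weights give plain OLS."""
    sw = sum(ws)
    x_bar = sum(w * x for w, x in zip(ws, xs)) / sw
    y_bar = sum(w * y for w, y in zip(ws, ys)) / sw
    num = sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(ws, xs, ys))
    den = sum(w * (x - x_bar) ** 2 for w, x in zip(ws, xs))
    return num / den


def simulate(reps=1000, n=50, seed=0):
    """Return (variance of OLS slope, variance of WLS slope) across samples."""
    rng = random.Random(seed)
    xs = [1 + 9 * i / (n - 1) for i in range(n)]  # x evenly spaced on [1, 10]
    ols_weights = [1.0] * n
    wls_weights = [1 / x ** 2 for x in xs]  # inverse of the assumed error variance
    ols_slopes, wls_slopes = [], []
    for _ in range(reps):
        # Assumed data-generating process: error sd grows with x.
        ys = [1 + 2 * x + rng.gauss(0, x) for x in xs]
        ols_slopes.append(weighted_slope(xs, ys, ols_weights))
        wls_slopes.append(weighted_slope(xs, ys, wls_weights))
    return statistics.variance(ols_slopes), statistics.variance(wls_slopes)


var_ols, var_wls = simulate()
print(f"variance of OLS slope: {var_ols:.4f}")
print(f"variance of WLS slope: {var_wls:.4f}")
```

Both estimators are centred on the true slope in this setup; the difference is in how much they scatter from sample to sample.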

@aartinho 2 years ago

You're saving my legacy mate

@koenvandemoortel2206 1 year ago

Thank you for your explanation. However, as a simple physicist, I don't understand all the fuss about this "heteroskedasticity".

1) The least squares regression method is designed to make the variance of the residuals minimal, so it will be minimal (on average). If it is more or less for some x ranges, so be it, what's the problem? If there is a pattern, it simply means there is another variable playing a role. In your example, it might be something we could call "foodiness". If a person spends much more on food than the model predicts, it's probably because he's a "foodie". We could define a "foodiness index" F = food budget divided by the expected food budget. And then it would be interesting to search for a correlation of F with another independent variable, e.g. the number of hours a day a person spends cooking/dining.

2) You will never get a good model here with a linear regression, no matter how you try to "cure" that heteroskedasticity. That is pretty obvious. Your model should be a curve starting a bit below 1 (=100%) for x=0, and then going down slower and slower towards a horizontal asymptote y=0, period. A plausible model for the absolute food expenditure (y) vs total expenditure (x) might be, for example: y=f(x)=a*ln(1+bx). That starts with the steepest slope f'(0)=ab, and then the slope descends slowly, depending on b. For the relative food expenditure, use f(x)/x of course; that will go slowly to 0.

I tried it for data from different countries (ourworldindata.org/grapher/food-expenditure-share-gdp?xScale=linear) and it looks pretty good. I gave the numbers all the same relative errors in order to give the data points plausible weights. My regression can be found here: www.lerenisplezant.be/teksten/FittingKVdm-Manual.pdf then Ctrl-F "food vs total expenditure". Looking forward to seeing your comments! Koen Van de moortel, info@lerenisplezant.be
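For readers who want to try the curve y = a*ln(1+bx) with equal relative errors, here is a minimal sketch. It is not the commenter's own software or procedure: the data are synthetic and noise-free (generated from assumed values a = 2, b = 0.5), the function name `fit_log_curve` is hypothetical, and the fit uses a plain grid search over b with weights 1/y^2 to mimic equal relative errors.

```python
import math


def fit_log_curve(xs, ys, b_grid):
    """Weighted least squares for y = a*ln(1 + b*x) with weights 1/y^2.

    Minimises sum((1 - a*ln(1 + b*x)/y)^2), i.e. squared relative errors.
    Returns (a, b, loss) for the best b on the grid.
    """
    best = None
    for b in b_grid:
        # Ratio g/y for each point, where g = ln(1 + b*x).
        ratios = [math.log1p(b * x) / y for x, y in zip(xs, ys)]
        # For a fixed b the model is linear in a, so a has a closed form.
        a = sum(ratios) / sum(r * r for r in ratios)
        loss = sum((1 - a * r) ** 2 for r in ratios)
        if best is None or loss < best[2]:
            best = (a, b, loss)
    return best


# Synthetic, noise-free data from assumed parameters (illustration only).
TRUE_A, TRUE_B = 2.0, 0.5
xs = [1, 2, 5, 10, 20, 50, 100]
ys = [TRUE_A * math.log1p(TRUE_B * x) for x in xs]

b_grid = [k / 100 for k in range(1, 1001)]  # candidate b values 0.01 .. 10.00
a_hat, b_hat, loss = fit_log_curve(xs, ys, b_grid)
print(f"a = {a_hat:.3f}, b = {b_hat:.2f}")
```

With real, noisy data a finer grid or a proper nonlinear optimiser would be the more usual choice; the grid search is used here only to keep the sketch dependency-free.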

@sofiyavyshnya6723 3 years ago

Terrific video!!! Thank you so much for this and the many other videos you have made. PS: I'll be the foodie who spends more money on food. 😉

@FB-tr2kf 7 years ago

Hi Ben, terrific job, thank you. However, I have one very quick question. I've always struggled to understand auxiliary regression. What exactly do we do (in a sample) once we have obtained the squared error terms? Do we regress them against the independent variables or the constants, and why?
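As a rough illustration of the question above: in the standard Breusch-Pagan style auxiliary regression, the squared residuals from the original fit are regressed on the independent variables together with a constant, because under homoskedasticity those variables should explain none of the variation in the squared residuals. The sketch below assumes a single regressor and synthetic data (error standard deviation equal to x), and uses the n*R^2 form of the statistic; it is not necessarily the exact procedure shown in the video.

```python
import random


def ols(xs, ys):
    """Intercept and slope of a simple OLS regression of ys on xs."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    return y_bar - slope * x_bar, slope


def r_squared(xs, ys):
    """R^2 of a simple OLS regression of ys on xs (with a constant)."""
    b0, b1 = ols(xs, ys)
    y_bar = sum(ys) / len(ys)
    ss_res = sum((y - b0 - b1 * x) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Synthetic heteroskedastic sample (assumed for illustration only).
rng = random.Random(1)
n = 200
xs = [rng.uniform(1, 10) for _ in range(n)]
ys = [1 + 2 * x + rng.gauss(0, x) for x in xs]

# Step 1: original regression and its squared residuals.
b0, b1 = ols(xs, ys)
sq_resid = [(y - b0 - b1 * x) ** 2 for x, y in zip(xs, ys)]

# Step 2: auxiliary regression of squared residuals on x plus a constant.
aux_r2 = r_squared(xs, sq_resid)

# Step 3: LM statistic n*R^2, compared with a chi-square with 1 degree of freedom.
lm_stat = n * aux_r2
CHI2_5PCT_1DF = 3.841  # 5% critical value, one regressor in the auxiliary model
print(f"LM = {lm_stat:.2f}, reject homoskedasticity: {lm_stat > CHI2_5PCT_1DF}")
```

With several regressors, the auxiliary regression would include all of them and the degrees of freedom would equal their number.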