Pooled-Variance t Tests and Confidence Intervals: An Example

71,859 views

A day ago

Comments: 30

@guillaumegiroux9425 6 years ago

Dear M. Jbstatistics, I've been struggling with a problem I had no hope of solving, and your resources helped me solve it. I wanted to tell you: A BIG THANK YOU! You're the boss! Big thanks.

@jbstatistics 6 years ago

You are very welcome!

@ege9069 6 years ago

I just ended up here before my Statistics 1 final exam. I just wanted to thank you with all my heart. Thanks and greetings from Istanbul!

@jbstatistics 6 years ago

You are very welcome! I hope your exam went well!

@ege9069 6 years ago

It went really well, sir. Thanks a lot.

@timcrowley4191 5 years ago

Thank you for the video - a fantastic explanation of pooled variance t-tests - this has really helped me!

@jbstatistics 11 years ago

There is no real advantage to doing a pooled-variance t test as a regression. You get exactly the same information using both methods. I made that video to illustrate 2 things: 1) As a brief intro to including categorical variables in regression, which can be very useful in a multiple regression setting. 2) To illustrate the relationship between the two methods, which may help students understand the model, the assumptions, and the proper interpretation of results.

@stefanfarier7384 1 year ago

This is so useful. Thank you!

@jbstatistics 11 years ago

There is no need to do it as a regression. My "pooled-variance t test as a regression" video is part of my regression playlist, and it's a topic best discussed in regression. In regression, it's part of the bigger picture of including categorical explanatory variables in regression analysis. We may wish to include both quantitative and categorical variables as explanatory variables in a regression, and we include categorical variables by declaring appropriate indicator variables.

@jbstatistics 11 years ago

Although one can do a pooled-variance t test as a regression (and I have a video outlining that), here I've simply expressed this as an ordinary test of a difference in means between two groups. The number of groups has nothing to do with the sample size. If we did this as a regression, and coded our groups as X=0 and X=1, then values of X between 0 and 1 would be completely meaningless. The explanatory variable is categorical, and it does not make sense to discuss values between 0 and 1.
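The equivalence mentioned in this reply can be checked numerically. The following is a minimal sketch using only the Python standard library; the two data lists are hypothetical and are not the data from the video, and `pooled_t` and `regression_slope_t` are illustrative helper names. It computes the pooled-variance t statistic directly, then again as the t statistic of the slope when the response is regressed on a 0/1 indicator.

```python
import math

# Hypothetical measurements for two groups (NOT the data from the video).
group0 = [4.1, 5.0, 5.6, 4.7, 5.2]
group1 = [6.3, 5.9, 7.1, 6.6, 6.0, 6.8]


def mean(values):
    return sum(values) / len(values)


def pooled_t(a, b):
    """Pooled-variance t statistic for H0: mu_b - mu_a = 0."""
    na, nb = len(a), len(b)
    ss_a = sum((x - mean(a)) ** 2 for x in a)
    ss_b = sum((x - mean(b)) ** 2 for x in b)
    sp2 = (ss_a + ss_b) / (na + nb - 2)      # pooled sample variance
    se = math.sqrt(sp2 * (1 / na + 1 / nb))  # SE of the difference in means
    return (mean(b) - mean(a)) / se


def regression_slope_t(a, b):
    """Least-squares slope and its t statistic, with X = 0 for a and X = 1 for b."""
    x = [0.0] * len(a) + [1.0] * len(b)
    y = list(a) + list(b)
    n = len(y)
    xbar, ybar = mean(x), mean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    s2 = sse / (n - 2)                       # residual mean square
    return slope, slope / math.sqrt(s2 / sxx)


t_pooled = pooled_t(group0, group1)
slope, t_slope = regression_slope_t(group0, group1)
print(t_pooled, slope, t_slope)
```

The slope equals the difference in sample means and the two t statistics agree, which is why both approaches give the same information. The fitted line is only meaningful at X = 0 and X = 1.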

@FLEGA 4 years ago

The t value you got at 6:56 was 1.980. From the table, t(DF = inf) = 1.960 and t(DF = 99) = 1.984. How do I estimate t(DF = 117) from the values given in the table? I know the TI-84 has invT(p, v); p = significance level, v = degrees of freedom.

@thomaswoodall4505 1 year ago

How did you find this p-value in R? Is it not 1-pt(4.267,117), then doubled?

@jbstatistics 1 year ago

Yes, that'll do it. (It'll also be part of the t.test output, if that's being used.)
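For readers who are not using R, the same two-sided p-value, 2 * (1 - pt(4.267, 117)), can be approximated with the Python standard library. This is a rough sketch rather than the method used in the video: `t_pdf` and `two_sided_p` are hypothetical helper names, and the tail area is obtained by numerically integrating the t density with Simpson's rule instead of calling a statistics package.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t, df, n=2000):
    """Two-sided p-value 2 * P(T > |t|); n must be even for Simpson's rule."""
    a = abs(t)
    if a == 0:
        return 1.0
    h = a / n
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, n):
        # Simpson weights: 4 at odd interior nodes, 2 at even ones.
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3   # P(0 < T < |t|)
    return 1 - 2 * area    # equals 2 * (0.5 - area)


p = two_sided_p(4.267, 117)
print(p)
```

With a statistics package available, a library t distribution function is the simpler choice. The integration here only makes the "one tail, then double" logic explicit.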

@justsomegirlwithoutamustac5837 1 year ago

Since the sample size is large here, why don't we use a z statistic instead of a t statistic?

@jbstatistics 1 year ago

When sampling from normally distributed populations, the difference between z tests and t tests depends on whether the population standard deviations are known or not. Since the population standard deviations are pretty much never known, in real-world situations tests like these are done as t tests. Sure, the t distribution gets very close to the standard normal distribution as the degrees of freedom increase, but that doesn't mean we should just jump to the approximation when it works reasonably well. If it's a t test, it's a t test, regardless of sample size. Your instructor (and other sources, including some texts) may say something different, so if you are in a stats course, listen to your instructor to find out what you should do in your particular course.

@justsomegirlwithoutamustac5837 1 year ago

@jbstatistics Thank you so much for your reply!

@balajikalva188 4 years ago

One of the assumptions here is "normally distributed populations". I don't get what exactly that means. If we have an unknown population distribution (which may not be normal), and we take a large enough sample and plot the sampling distribution of the sample mean, then by the CLT it will come close to a normal distribution as the sample size increases. Now the sampling distribution is normal but the population distribution is not, so my doubt is: can I actually apply everything said in this video to find the confidence interval? Someone please help.

@panagiotisgoulas8539 11 years ago

Thanks for the feedback. One more question regarding this: practically, what does the regression line accomplish in this situation? I mean, you found the same results as in your hypothesis test, but why would I want to make a categorical regression line in the first place? Wasn't the whole point of making one so that I can estimate the values on the line given some hypothetical x values?

@fin-pundit9631 4 years ago

Hi, thanks for this video. A quick question: 1) Suppose we are getting the critical t value manually from a t table instead of using Excel or other software. DF = 117, but the highest DF in the table is 1000, the second highest is 120, and the third highest is 60. Which one should I choose, or how should I choose, to get the critical t value? 2) I understood how to get the confidence interval. Can I say, as an analyst, "95% of the sample means of a particular population lie between the lower and upper limits, and for the remaining 5% the analyst is taking the risk of saying that 5% of sample means are not from the same population"? Please clarify.

@jbstatistics 4 years ago

1) It's best to use software. If you're using a table, then either take the conservative approach (go down to 60 DF), linearly interpolate, or say "meh, 120 is close enough to 117 for me". There are pros and cons to each one of those, and an instructor might recommend any of them. Relying on software is best. 2) No, we can't say that. Keep in mind that interpretations of confidence intervals always relate to the value of *parameters* and never sample statistics.
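The three table-based options in point 1 can be compared side by side. This is a small sketch assuming a table that lists the two-sided 95% critical values 2.000 at 60 DF and 1.980 at 120 DF, the usual three-decimal table entries; the function names are illustrative only.

```python
# Two-sided 95% critical values (upper-tail area 0.025) as printed in a
# typical t table, rounded to three decimals. Assumed table rows.
table = {60: 2.000, 120: 1.980}


def conservative(df, table):
    """Drop down to the largest tabulated DF that does not exceed df."""
    return table[max(d for d in table if d <= df)]


def interpolate(df, table):
    """Linearly interpolate between the two tabulated DF values around df."""
    lo = max(d for d in table if d <= df)
    hi = min(d for d in table if d >= df)
    if lo == hi:
        return table[lo]
    w = (df - lo) / (hi - lo)
    return table[lo] + w * (table[hi] - table[lo])


def nearest(df, table):
    """Use the tabulated DF closest to df ("close enough")."""
    return table[min(table, key=lambda d: abs(d - df))]


df = 117
print(conservative(df, table), interpolate(df, table), nearest(df, table))
```

For 117 DF the three choices differ only in the second or third decimal place. Dropping down to 60 DF gives the largest critical value and therefore the widest interval, which is the sense in which that option is conservative.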

@panagiotisgoulas8539 11 years ago

OK, the whole comment was meant for this video: The Pooled Variance t Test as a Regression. Sorry for the trouble. I've downloaded the videos since it's easier to pause and go back, hence the mess. Now, could you give me a better understanding of why I would use the regression line, since, as you mentioned yourself, the points between 0 and 1 don't express anything? Besides what you mentioned in that video, is there any other practical meaning to plotting the regression? Sorry for the queries.

@mamou8759 5 years ago

Instead of the t value, I think you are looking at the t table. Why is the t value 1.980, when the t table for 95% gives 1.984 for DF = 100 and 1.96 for infinite DF? I was going to use 1.96, but can someone tell me why it is 1.980?

@jbstatistics 5 years ago

I used software to get the actual value that we need, the one corresponding to 117 degrees of freedom. If you drop down to 100 DF in a table, you'll end up with a slightly larger value.
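One way to reproduce a software-style critical value for 117 degrees of freedom without a statistics package is sketched here with the Python standard library. It is an illustration, not the software used in the video: `t_pdf`, `upper_tail`, and `t_critical` are hypothetical helper names, the tail area comes from Simpson's rule, and the search bracket [0, 20] is an assumption that suits moderate degrees of freedom and common significance levels.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def upper_tail(t, df, n=2000):
    """P(T > t) for t >= 0, via Simpson's rule on [0, t] (n must be even)."""
    if t == 0:
        return 0.5
    h = t / n
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3


def t_critical(alpha, df, lo=0.0, hi=20.0):
    """Value c with P(T > c) = alpha / 2, located by bisection on [lo, hi]."""
    for _ in range(60):
        mid = (lo + hi) / 2
        if upper_tail(mid, df) > alpha / 2:
            lo = mid   # tail area still too large: move right
        else:
            hi = mid   # tail area small enough: move left
    return (lo + hi) / 2


print(t_critical(0.05, 117))   # 95% two-sided critical value at 117 DF
print(t_critical(0.05, 100))   # the 100 DF table row, for comparison
```

Running it shows the 117 DF value sitting between the 100 DF table entry and the infinite-DF value of 1.96.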

@mamou8759 5 years ago

@jbstatistics Ohhh, OK, thank you very much for the fast response.

@panagiotisgoulas8539 11 years ago

OK, I think overall I understand what you did. I have the following questions. Basically, you quantified X and then made your regression line. But is this regression line trustworthy? It's made from only 2 x values, so my sample is really small. Or do we not mind, because these x values come from a bigger initial sample? Second question: what do the other x values between 0 and 1 express in this example when you made your regression line?

@bornhere13 5 years ago

As mentioned, this is equivalent to regressing the outcome on a dummy variable. But in a regression context, we also assume equal variances. Correct? Also, is there an advantage to conducting a t-test in this fashion? I would think it is because you can invoke the Welch procedure. What least-squares regression methods are available when group variances are unequal? Thank you for any insight. I hope my questions make sense!

@oguzhanserce4480 6 years ago

Is this a paired t test?

@jbstatistics 6 years ago

No. I work through an example of a paired t test here: kzbin.info/www/bejne/q6HGZa2ElJKMfK8

@oguzhanserce4480 6 years ago

I am confused by these terms. There is the t test. The t test has subtopics, paired and unpaired, for two samples. Does the paired t test have subtopics, pooled and non-pooled?