Sample Size Estimation in A/B Tests Explained!

48,112 views

Emma Ding

3 years ago

This video is a step-by-step guide on how to estimate sample sizes for A/B tests. This is a very important concept to master in order to ace your Data Science interviews.
Everything You Need to Know About A/B Testing in Data Science Interviews
• Crack A/B Testing Prob...
🟢Get all my free data science interview resources
www.emmading.com/resources
🟡 Product Case Interview Cheatsheet www.emmading.com/product-case...
🟠 Statistics Interview Cheatsheet www.emmading.com/statistics-i...
🟣 Behavioral Interview Cheatsheet www.emmading.com/behavioral-i...
🔵 Data Science Resume Checklist www.emmading.com/data-science...
✅ We work with Experienced Data Scientists to help them land their next dream jobs. Apply now: www.emmading.com/coaching
// Comment
Got any questions? Something to add?
Write a comment below to chat.
// Let's connect on LinkedIn:
/ emmading001

Comments: 43
@dwardster 2 years ago
Hi two questions: 1. I see this formula is used for two-sample T-tests, but what about the more common case of a two-sample Z-test for proportions? What would we use to calculate sample size in an interview in that case? 2. How do we estimate σ before running the experiment? Just take a sample before running the experiment and use its standard deviation to estimate σ?
@vitheltone 2 years ago
Super insightful, but please note: at 1:30 it says "acceptance of H0". We never accept H0; it's reject or fail to reject :)
@waynewang2071 2 years ago
The math trick was wrong: x = \Phi(z_x) for 0 < x < 1. But you can still use the fact that z_{\beta} = - z_{1-\beta} in this case, which will give you the correct formula in the end.
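The identity in this comment can be checked numerically. The sketch below is only an illustration and assumes that z_x denotes the lower-tail quantile of the standard normal, so that Phi(z_x) = x; the helper name `z` is made up for this example.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1


def z(x: float) -> float:
    """Lower-tail quantile z_x, defined by Phi(z_x) = x (convention assumed here)."""
    return std_normal.inv_cdf(x)


beta = 0.2
phi_of_z_beta = std_normal.cdf(z(beta))  # should recover beta, not -beta
symmetry_gap = z(beta) + z(1 - beta)     # should be ~0 if z_beta = -z_{1-beta}

print(f"Phi(z_beta) = {phi_of_z_beta:.6f}")
print(f"z_beta = {z(beta):.4f}, z_(1-beta) = {z(1 - beta):.4f}, sum = {symmetry_gap:.2e}")
```

Under that convention, Phi(z_x) returns x itself, and the sign flip enters the derivation only through the symmetry z_beta = -z_{1-beta}.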
@sitongchen6688 2 years ago
Hi Emma, thanks for your great video! One question: I know you are assuming the control and treatment groups have equal variances. But in reality they may not have the same variance, so when applying this formula, which variance should we pick? Should we use the pooled variance, or should we ensure that the two groups have the same variance? Thanks a lot!
@user-mq9si8eh5k 1 year ago
Wow, so much information, thanks a lot!
@hiteshgupta7099 2 years ago
Hi Emma, thanks for this video. Does the sample size formula apply only when there are 2 variant groups (control + treatment)? If so, how does the formula change if we need more than 2 variant groups?
@shreyaschaturvedi1933 3 years ago
Hi Emma, can you explain how to calculate the standard deviation in this formula? Since the control and treatment groups will have different variances, should we use some kind of pooled/unpooled standard error for this number?
@songxiyou2347 3 years ago
Hi Emma, question: beta = P(accept H0 | H0 is false) is a conditional probability, so why do we say beta = P(accept H0) directly?
@jingcaomath 2 years ago
Same question here. Can anyone explain this?
@weichiyao5712 2 years ago
I think this is a typo or something. Formally, beta = P(fail to reject H0 | H1) = P( |xbar / (s√(2/n))| < z_{alpha/2} | H1), where xbar is the difference in sample means. Conditioning on H1 means xbar is normal with mean µc-µt and standard error s√(2/n). That's why in the next step she subtracts µc-µt from xbar.
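For readers following this thread, the step under discussion can be sketched as below. This is a reconstruction under assumptions the comment does not state: a known, equal variance σ² in both groups, n users per group, δ = µc − µt > 0, z_q as the lower-tail quantile with Φ(z_q) = q, and the negligible far-tail probability dropped.

```latex
\begin{align*}
\bar{X}_c - \bar{X}_t \mid H_1 &\sim N\!\left(\delta,\ \tfrac{2\sigma^2}{n}\right), \qquad \delta = \mu_c - \mu_t \\
\beta &= P\!\left( \left| \frac{\bar{X}_c - \bar{X}_t}{\sigma\sqrt{2/n}} \right| < z_{1-\alpha/2} \,\middle|\, H_1 \right) \\
&\approx \Phi\!\left( z_{1-\alpha/2} - \frac{\delta}{\sigma\sqrt{2/n}} \right) \\
\Rightarrow\quad z_{\beta} = -z_{1-\beta} &= z_{1-\alpha/2} - \frac{\delta}{\sigma\sqrt{2/n}} \\
\Rightarrow\quad n &= \frac{2\sigma^2\,(z_{1-\alpha/2} + z_{1-\beta})^2}{\delta^2}
\end{align*}
```

The conditioning on H1 is what shifts the mean of the difference to δ, which is why δ is subtracted before the standard normal CDF is applied.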
@chihiroa1045 2 years ago
Thank you.
@chelseazhang1458 3 years ago
Hey Emma! Your videos are very helpful and efficient for preparing for interviews! Can you explain a little bit about when we should use a t-test vs. a z-test, or what the difference between the two is? I'm a little bit confused about the transition between the two. Thanks!!
@emma_ding 3 years ago
More videos about statistics are coming. Stay tuned!
@xinyuntang8124 3 years ago
A t-test is for comparing two samples and a z-test is for comparing a sample and a population. Based on your use case, it's usually pretty obvious whether you should use a t-test or a z-test.
@AdiJ8 3 years ago
The z-test assumes a known population standard deviation; the t-test assumes you don't know the population standard deviation, so you estimate it with the sample standard deviation.
@owenho561 3 years ago
Your video is very helpful. One question for clarification: when you mention delta squared as the difference between control and treatment, what variable is it the difference of? Can you give us an example? The variance is clear. Thanks in advance.
@lucasdeoliveirasilva8466 1 year ago
Is n the sum of the control and variant group sizes? I mean, should I consider n = 16*(sigma^2)/(lift^2) = n(control) + n(variant) (assuming the A/B test is 50/50)? Or is n = n(variant) = n(control)?
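One way to see what n means is to compute it. The sketch below assumes the equal-variance setup quoted elsewhere in this thread, where the difference in means has variance 2*sigma^2/n; in that setup n is the size of each group, so the total is 2n. The function names and the example values of `sigma` and `delta` are made up for illustration.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(sigma: float, delta: float,
                          alpha: float = 0.05, power: float = 0.8) -> int:
    """Per-group n for a two-sided test with equal variance in both groups."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)           # about 0.84 for power = 0.8
    return ceil(2 * (z_alpha + z_beta) ** 2 * sigma ** 2 / delta ** 2)


def rule_of_thumb_per_group(sigma: float, delta: float) -> int:
    """The rounded rule of thumb: n = 16 * sigma^2 / delta^2 per group."""
    return ceil(16 * sigma ** 2 / delta ** 2)


# Hypothetical inputs: metric standard deviation 10, minimum detectable effect 1.
n_exact = sample_size_per_group(sigma=10.0, delta=1.0)
n_rule = rule_of_thumb_per_group(sigma=10.0, delta=1.0)
print(f"per group: exact={n_exact}, rule of thumb={n_rule}; total users={2 * n_exact}")
```

The rule of thumb slightly overshoots the exact value because 2 * (1.96 + 0.84)^2 is about 15.7, which is rounded up to 16.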
@yuchingyang3089 2 years ago
Hi Emma, thanks for the clear explanation of sample size. In practice, how do we determine the minimum detectable effect (delta)? Is it the same as the practical significance boundary?
@hiteshgupta9286 2 years ago
Usually yes.
@owenho561 3 years ago
This is super helpful! Thanks for sharing. One quick clarification on delta squared: is the difference between mu_c and mu_t the population delta between the control and treatment groups, or is it something else? Can you give us an example?
@emma_ding 3 years ago
mu_c and mu_t are population means of the control and the treatment groups. Those are unknown and estimated from the data.
@joe162840 2 years ago
There is something wrong here... at 1:49, shouldn't it be accepting H(1) or rejecting H(0)?
@zdaman011 3 years ago
Great explanation! Out of curiosity, when would we need to apply a formula like this? Wouldn't we already know what the sample size is if we have a sample variance?
@zdaman011 3 years ago
@@Doorknob985 That makes sense, but what's confusing me is that the sample size formula requires us to have the variance of the sample means already, which implies we've already run an experiment.
@emma_ding 3 years ago
@@zdaman011 No, the sample variance can be obtained from the data before the experiment (it's the same as the variance in the control group after you run the test), exactly as Ravi mentioned above, i.e., from an existing metric. Sample size is the amount of data that's needed to reach a certain power for a test. This is an over-simplified formula, as it assumes the sample variance is the same in the control and treatment groups, but it is helpful for interviews. In reality, it's more common to have a "ramping" process to control risks rather than splitting all users into either control or treatment.
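The point about getting the variance before the experiment can be illustrated with a small sketch. It assumes the equal-variance formula discussed in this thread; the list `historical_metric` and the value of `delta` are invented placeholder numbers, not real data.

```python
from math import ceil
from statistics import NormalDist, variance

# Hypothetical per-user values of the metric, collected BEFORE the experiment.
historical_metric = [12.0, 15.5, 9.0, 20.0, 14.5, 11.0, 18.0, 13.0]

# Sample variance (n - 1 denominator), used as the estimate of sigma^2.
sigma_sq = variance(historical_metric)

delta = 2.0               # assumed minimum detectable effect, same units as the metric
alpha, power = 0.05, 0.8  # conventional choices, not prescribed by the thread

z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
z_beta = NormalDist().inv_cdf(power)
n_per_group = ceil(2 * (z_alpha + z_beta) ** 2 * sigma_sq / delta ** 2)

print(f"estimated variance = {sigma_sq:.3f}, required n per group = {n_per_group}")
```

No treatment data is needed: the variance estimate comes entirely from the existing metric, and the same value is assumed for both groups.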
@learnrepeatlearn 2 years ago
@@emma_ding Hi Emma! Thank you for the great video. I have a quick question: what happens if the variance is different between control and treatment? Is there another equation we can use to calculate the sample size for each? I am assuming that the control and treatment groups may not necessarily have the same size here.
@popo-je8ze 2 years ago
I found that some practical issues are quite hard:
1. How do we calculate the required sample size for a ratio or quantile metric?
2. If we already know that there are some correlations between our samples, so that the variance is underestimated, how do we estimate the required sample size correctly for either an A/A test or an A/B test?
@rickyg3390 3 years ago
Hi Emma, isn't phi(Z(x)) = x? instead of -x?
@martingai7333 2 years ago
I agree, this is also where I got stuck. Have you figured it out?
@jianzezhou1301 2 years ago
This part is right. Actually, it is the last part that is wrong: Z0.025 = 1.96 and Z0.2 = 0.84.
@kagelsmith997 2 years ago
At 1:04, why is the variance 2*sigma^2 / n? Why do you need to multiply by 2?
@davidlee5048 1 year ago
Because the variance of the difference in means is the control variance plus the treatment variance, and we assume the control and treatment variances are equal.
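Written out, the factor of 2 comes from adding two equal variances. The sketch assumes independent control and treatment groups, n users in each, and a common per-user variance σ²:

```latex
\begin{align*}
\operatorname{Var}(\bar{X}_c - \bar{X}_t)
  &= \operatorname{Var}(\bar{X}_c) + \operatorname{Var}(\bar{X}_t) \\
  &= \frac{\sigma^2}{n} + \frac{\sigma^2}{n} = \frac{2\sigma^2}{n}
\end{align*}
```

Variances add even though the means are subtracted, because the two group means are independent.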
@jianzezhou1301 2 years ago
The last part is wrong, Z0.025 = 1.96 and Z0.2 = 0.84. Actually if Z0.025 is negative everything after the absolute value part is wrong.
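The values quoted in this comment can be reproduced with the standard library. The sketch assumes Z0.025 and Z0.2 are meant as upper-tail critical values (the point with that much probability to its right); the variable names are made up for this example.

```python
from statistics import NormalDist

std_normal = NormalDist()

# Upper-tail critical values: probability 0.025 (or 0.2) lies to the right.
z_upper_0025 = std_normal.inv_cdf(1 - 0.025)
z_upper_02 = std_normal.inv_cdf(1 - 0.2)

# Lower-tail quantile for comparison: same magnitude, opposite sign.
z_lower_0025 = std_normal.inv_cdf(0.025)

# Constant in front of sigma^2 / delta^2 in the per-group sample size formula.
constant = 2 * (z_upper_0025 + z_upper_02) ** 2

print(f"z_upper(0.025) = {z_upper_0025:.2f}, z_upper(0.2) = {z_upper_02:.2f}")
print(f"z_lower(0.025) = {z_lower_0025:.2f}")
print(f"2 * (z_alpha/2 + z_beta)^2 = {constant:.2f}")
```

Whichever tail convention is used, both critical values have to enter the final sum with a positive sign to arrive at the constant of roughly 16.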
@EvanZamir 3 years ago
Is there a book that contains this derivation?
@emma_ding 3 years ago
Hmm, not that I'm aware of, but there are online resources on it if you search on Google.
@mvijayvenkatesh 3 years ago
I am using "Applied statistics & probability for engineers" by Douglas Montgomery and George Runger, 2nd Edition. Chapters 7,8,9 deal with these topics and the book is an excellent resource.
@overseasafrican9899 2 years ago
@evan zamir, You can find another derivation of this formula in Gerald van Belle's book Statistical Rules of Thumb. Chapter 2, page 30. van Belle has a free pdf of this particular chapter posted on his web page.
@adrianusqueiroz 2 years ago
How can we have a sample variance if we have not performed the experiment yet? This is confusing me!
@lucasdeoliveirasilva8466 1 year ago
That would be an estimate... I think.
@empiricalformulas9693 1 year ago
Right answer, incorrect alpha and beta.