General approximation
For a 95% confidence level and 80% power, the sample size (for each group) can be approximated by the formula:
$n ≈ \frac{16 \sigma^2}{d^2}$ where:
- 16 is a rounded value of $2 (z_{\alpha/2} + z_\beta)^2 = 2 (1.96 + 0.84)^2 ≈ 15.7$, with z the z-scores associated with α = 0.05 and a power of 0.80
- σ is the standard deviation
  - which is $\sqrt{p(1-p)}$ for a binomial distribution
- d is the effect size
  - which is $\mu_1 - \mu_0$ for a continuous metric
  - and $p_1 - p_0$ for a binomial distribution
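To see how close the rule of thumb is to the full formula, here is a minimal sketch comparing the two. The function names and the input values (a baseline rate of 0.20 and an absolute effect of 0.01) are assumptions chosen for illustration only.

```python
from scipy import stats


def rule_of_thumb_n(sigma, d):
    """Per-group sample size from the 16 * sigma^2 / d^2 approximation."""
    return 16 * sigma**2 / d**2


def exact_n(sigma, d, alpha=0.05, power=0.8):
    """Per-group sample size from 2 * ((z_{alpha/2} + z_beta) * sigma / d)^2."""
    z_alpha = stats.norm.ppf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_beta = stats.norm.ppf(power)           # about 0.84 for power = 0.80
    return 2 * ((z_alpha + z_beta) * sigma / d) ** 2


# Hypothetical inputs: binomial metric with p = 0.20, absolute effect d = 0.01
p, d = 0.20, 0.01
sigma = (p * (1 - p)) ** 0.5

print("Rule of thumb: {:.0f}".format(rule_of_thumb_n(sigma, d)))
print("Full formula:  {:.0f}".format(exact_n(sigma, d)))
```

Because 16 rounds 15.7 upward, the approximation always returns a slightly larger sample size than the full formula at α = 0.05 and 80% power.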
For continuous metrics
Formula
Applied to continuous metrics, the formula is:
$n = 2 \left ( \frac{(z_{\alpha/2} + z_\beta) \sigma}{d} \right )^2$ where:
- σ is the standard deviation
- z are the z-scores associated with α and 1 − β
- d is the minimum detectable effect in absolute value, e.g. $\mu_1 - \mu_0$
Python implementation
# Function to get minimum sample size (two-sided test)
import numpy as np
import scipy.stats as st

def sample_size_cont(mde_abs, variance, power=.8, alpha=.05):
    # z-scores associated with alpha (two-sided) and with the power (1 - beta)
    z_alpha = st.norm.ppf(1-alpha/2)
    z_beta = st.norm.ppf(power)
    # Per-group sample size
    result = 2*((z_alpha + z_beta) * np.sqrt(variance) / mde_abs)**2
    print("Sample size: {:.0f}".format(result))
Let’s apply our function to an MDE of 0.05 and a (pooled) variance of 5:
mde_abs = .05 # Minimum detectable effect in absolute value
variance = 5 # Pooled variance
sample_size_cont(mde_abs, variance)
Sample size: 31396
This can also be implemented with the statsmodels TTestIndPower().solve_power() method:
# Statsmodels equivalent function
import numpy as np
import statsmodels.stats.power as smpr
effect_size = mde_abs / np.sqrt(variance)
smresult = smpr.TTestIndPower().solve_power(
    effect_size=effect_size,
    power=.8,
    alpha=.05,
    ratio=1,
    alternative='two-sided'
)
print("Sample size: {:.0f}".format(smresult))
Sample size: 31396
For proportions
Formula
Applied to proportions, the minimum sample size (for each group) is calculated as:
$n = 2 \left ( \frac{(z_{\alpha/2} + z_\beta) \sqrt{p(1-p)}}{d} \right )^2$ where:
- p is the pooled proportion, i.e. total number of successes over total number of observations
- z are the z-scores associated with α and 1 − β
- d is the minimum detectable effect in absolute value, e.g. $p_1 - p_0$
Python implementation
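# Function to get minimum sample size (two-sided test)
import scipy.stats as st

def sample_size_ratio(p, mde_abs, alpha=.05, power=.8):
    # z-scores associated with alpha (two-sided) and with the power (1 - beta)
    z_alpha = st.norm.ppf(1-alpha/2)
    z_beta = st.norm.ppf(power)
    # Per-group sample size, with p*(1-p) as the variance
    result = 2 * (((z_alpha + z_beta)**2 * p*(1-p))/mde_abs**2)
    print("Sample size: {:.0f}".format(result))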
Let’s try this with a proportion p of 0.20 and an absolute MDE of 0.01:
p = .20 # Baseline rate
mde = .01 # Absolute minimum detectable effect
sample_size_ratio(p=p, mde_abs=mde)
Sample size: 25116
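Again, it can be implemented with statsmodels. Note that proportion_effectsize() returns Cohen's h, an arcsine-transformed difference between the two proportions, so the result differs slightly from the formula above: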
# Statsmodels equivalent function
import statsmodels.stats.power as smpr
import statsmodels.stats.proportion as smp
effect_size = smp.proportion_effectsize(p, p+mde)
smresult = smpr.NormalIndPower().solve_power(
    effect_size=effect_size,
    power=.8,
    alpha=.05,
    ratio=1,
    alternative='two-sided'
)
print("Sample size: {:.0f}".format(smresult))
Sample size: 25580
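The gap between the two results (25116 vs. 25580) comes from the effect size definition: statsmodels' proportion_effectsize() uses Cohen's h, based on the arcsine transformation, instead of dividing the absolute difference by $\sqrt{p(1-p)}$. The sketch below reproduces this by hand. The helper names are assumptions made for illustration, and the closed-form sample size ignores the negligible opposite-tail term of the two-sided test.

```python
import numpy as np
from scipy import stats


def cohens_h(p1, p2):
    """Cohen's h: difference of arcsine-transformed proportions."""
    return 2 * (np.arcsin(np.sqrt(p1)) - np.arcsin(np.sqrt(p2)))


def sample_size_from_h(h, alpha=0.05, power=0.8):
    """Per-group sample size for a two-sided z-test on a standardized effect h."""
    z_alpha = stats.norm.ppf(1 - alpha / 2)
    z_beta = stats.norm.ppf(power)
    return 2 * ((z_alpha + z_beta) / h) ** 2


# Same inputs as above: baseline rate 0.20, absolute MDE 0.01
h = cohens_h(0.21, 0.20)
n = sample_size_from_h(h)

print("Cohen's h: {:.4f}".format(h))
print("Sample size: {:.0f}".format(n))
```

The resulting sample size should land very close to the statsmodels figure, which suggests the difference is a matter of effect size convention rather than an error in either calculation.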
When randomization unit ≠ analysis unit
When the randomization unit is different from the analysis unit, observations within the same randomization unit are correlated. The minimum required sample size can then be estimated by multiplying the usual formula by the design effect, which depends on the ratio of the number of analysis units to the number of randomization units:
$n = 2 \cdot \text{DE} \cdot \left ( \frac{(z_{\alpha/2} + z_\beta) \cdot \sigma}{d} \right )^2$ where:
- $z_{\alpha/2}$ is the 1 − α/2 quantile of the standard normal distribution. For example, if α = 0.05, then $z_{\alpha/2}$ = 1.96.
- $z_\beta$ is the 1 − β quantile of the standard normal distribution. For example, if β = 0.20 (80% power), then $z_\beta$ = 0.84.
- σ is the standard deviation of the outcome variable ($\sigma^2$ is its variance)
- d is the effect size you want to detect. This is the difference in means between the two groups that you want to be able to detect with your test.
- DE is the Design Effect
The Design Effect DE is calculated as:
$\text{DE} = 1 + (m - 1) \times \text{ICC}$ where:
- m is the average cluster size (number of analysis units per randomization unit)
- ICC is the Intraclass Correlation Coefficient
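As a rough illustration, the design effect can be computed and applied as below. This is a minimal sketch assuming the common convention that the design effect inflates the sample size obtained for independent observations (n × DE). The cluster size, the ICC and the starting sample size are hypothetical values, not taken from a real experiment.

```python
def design_effect(m, icc):
    """Design effect DE = 1 + (m - 1) * ICC for an average cluster size m."""
    return 1 + (m - 1) * icc


# Hypothetical inputs, for illustration only
n_independent = 25116  # per-group size from the simple formula (independent units)
m = 5                  # average number of analysis units per randomization unit
icc = 0.1              # assumed intraclass correlation coefficient

de = design_effect(m, icc)
n_analysis_units = n_independent * de         # inflated number of analysis units
n_randomization_units = n_analysis_units / m  # corresponding number of clusters

print("Design effect: {:.2f}".format(de))
print("Analysis units per group: {:.0f}".format(n_analysis_units))
print("Randomization units per group: {:.0f}".format(n_randomization_units))
```

Since DE cannot exceed m when the ICC is at most 1, the number of randomization units obtained this way never exceeds the size given by the simple formula.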
👉 Since it’s not trivial to compute the sample size for this use case, you can fall back on the usual simple formula and treat the result as the minimum required number of randomization units. It will be conservative, but it guarantees that you have at least the minimum required size.
Resources
- Statistical Rules of Thumb by Gerald Van Belle
- Evan Miller sample size calculator