Baseline risk and sample size
Publish Time: 2012-08-24 08:46
Author: Xuanqian Xie

Baseline risk is strongly associated with the required sample size. I illustrate this with the example of probiotics (Sinclair et al. 2011). We estimated the sample size required to detect a statistically significant difference between Lactobacillus and placebo at different baseline incidence rates of Clostridium difficile-associated diarrhea (CDAD), assuming that the “true” risk ratio (RR) is the pooled estimate, 0.3. As the baseline risk of CDAD decreases, the required sample size increases dramatically, reaching 4136 patients in total at an incidence rate of 1%. The total sample sizes required at baseline risks of 0.05, 0.1 and 0.2 are 804, 388 and 180, respectively. Not surprisingly, because a low incidence rate of CDAD demands such a large sample, most RCTs cannot detect a statistically significant effect of Lactobacillus.
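The inverse relationship between baseline risk and sample size can be made explicit with the textbook normal-approximation formula for comparing two proportions with equal group sizes. This is only a sketch: the symbols (p_1 for the placebo risk, p_2 = RR x p_1 for the Lactobacillus risk, n for the number of patients per group) are notation introduced here rather than taken from the report, and the second expression is a small-risk approximation.

```latex
\[
n \approx
\frac{\left[ z_{1-\alpha/2}\sqrt{2\bar{p}(1-\bar{p})}
      + z_{1-\beta}\sqrt{p_1(1-p_1) + p_2(1-p_2)} \right]^2}
     {(p_1 - p_2)^2},
\qquad
\bar{p} = \frac{p_1 + p_2}{2}, \quad p_2 = \mathrm{RR}\, p_1
\]

% When p_1 is small, 1 - p \approx 1 for every proportion involved, so
\[
n \approx
\frac{(z_{1-\alpha/2} + z_{1-\beta})^2 \,(1 + \mathrm{RR})}
     {(1 - \mathrm{RR})^2 \; p_1}
\]
```

With a two-sided type I error of 0.05 (z = 1.96), power of 0.8 (z = 0.84) and RR = 0.3, the approximation gives roughly n = 20.8 / p_1 per group. At p_1 = 0.01 that is about 2080 per group, or 4160 in total, close to the 4136 quoted above. Halving the baseline risk roughly doubles the required sample.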


Sample size estimates at different underlying risks of CDAD

Details of sample size calculation:

Test: Pearson chi-square test, two-sided; type I error: 0.05; power: 0.8; equal group sizes; risk ratio of Lactobacillus versus placebo: 0.3; no loss to follow-up; no potential confounders. Based on these assumptions, we estimated the sample sizes required to detect a statistically significant difference at various incidence rates of CDAD.
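As a cross-check that does not depend on PROC POWER, the same assumptions can be plugged into the textbook normal-approximation formula for two proportions in a short DATA step. This is a minimal sketch: the data set name ss_check and its variable names are hypothetical, and it assumes the closed-form formula is an adequate stand-in for what PROC POWER computes for the Pearson chi-square test.

```sas
/* Hypothetical cross-check: closed-form sample size for two proportions */
data ss_check;
   alpha = 0.05;                  /* two-sided type I error */
   power = 0.8;
   rr    = 0.3;                   /* assumed risk ratio, Lactobacillus vs placebo */
   z_a = probit(1 - alpha/2);     /* standard normal quantile for alpha */
   z_b = probit(power);           /* standard normal quantile for power */
   do p1 = 0.01, 0.05, 0.1, 0.2;  /* baseline (placebo) risk of CDAD */
      p2   = rr * p1;             /* risk in the Lactobacillus group */
      pbar = (p1 + p2) / 2;       /* pooled risk under equal allocation */
      n_group = ceil( (z_a*sqrt(2*pbar*(1-pbar))
                     + z_b*sqrt(p1*(1-p1) + p2*(1-p2)))**2 / (p1 - p2)**2 );
      n_total = 2 * n_group;      /* equal group sizes */
      output;
   end;
   keep p1 p2 n_group n_total;
run;

proc print data=ss_check noobs;
run;
```

Worked by hand, this formula gives totals of 4136, 804, 388 and 180 at baseline risks of 0.01, 0.05, 0.1 and 0.2, which agree with the figures quoted in the text.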


Sinclair A, Xie X, Dendukuri N. The Use of Lactobacillus probiotics in the Prevention of Antibiotic Associated Clostridium Difficile Diarrhea. Montreal (Canada): Technology Assessment Unit (TAU) of the McGill University Health Centre (MUHC); 2011 Dec 19. Report no. 54. 45 p. Available from: 54_probiotic.pdf



/* SAS code for the sample size calculation */
ods noproctitle;
/* Save the PROC POWER results table as a data set */
ods output output=work._psscalc;

proc power;
   TwoSampleFreq Test = PChi Dist = Normal Method=Normal
      Alpha = 0.05
      Sides = 2
      /* Baseline (placebo) incidence rates of CDAD */
      RefProportion = 0.0050 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 0.1 0.11 0.12 0.13 0.14 0.15 0.16 0.17 0.18 0.19 0.2
      RelativeRisk = 0.3 0.4 0.5 0.6 0.7
      Power = 0.8
      NTotal = .;   /* solve for the total sample size */
   plot x = power vary (linestyle, symbol) / description='Plot of power by sample size';
run;

/* Keep the scenarios with a risk ratio of 0.3 */
data rr3;
   set _psscalc;
   format RefProportion 6.2;
   if RelativeRisk=0.3;
   keep RefProportion NTotal;
run;

/* Plot total sample size against baseline incidence of CDAD */
goptions reset=all;
symbol i=spline ci=red w=2 l=1;
axis1 label=(h=1.4 f=Arial 'Incidence rate of CDAD at baseline')
      order=(0 to 0.20 by 0.02) value=(f='Arial' h=12pt) minor=none;
axis2 label=(h=1.4 f=Arial a=90 'Total number of patients')
      order=(0 to 9000 by 1000) value=(f='Arial' h=12pt) minor=none;

proc gplot data=rr3;
   plot NTotal * RefProportion / haxis=axis1 vaxis=axis2;
run;
quit;

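/* Scenarios with a risk ratio of 0.5, total sample size inflated by 20% */
data rr5;
   set _psscalc;
   if RelativeRisk=0.5;
   N_5=round(NTotal * 1.2);
   keep RefProportion N_5;
run;

/* Combine the RR = 0.3 and RR = 0.5 scenarios by baseline incidence */
data rr35;
   merge rr3 rr5;
   by RefProportion;
run;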