Effects of More Replicates on Statistical Significance
How many replicates will you have to do to obtain statistical significance?
The following data came from a set of Affymetrix experiments done by Daniel Amador-Noguez in Dr. Gretchen Darlington's lab. Twenty-four arrays were run to compare 2 genotypes, wildtype mice and Ames Dwarf mice, giving 12 replicates for each treatment. The arrays were Affymetrix MOE 430A GeneChips®. The results were normalized in dChip and a one-way ANOVA was applied to each gene (with only two groups, this amounts to a two-sample t-test).
Using p-value cutoffs of 0.001 and 0.0001 (without multiple testing correction), the t-test identified a certain number of significant genes. The analysis started with 2 biological replicates for each treatment (4 arrays in total) and was repeated as replicates were added. Each data point represents the number of statistically significant, differentially expressed genes found for a given number of replicates used in the analysis.
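The replicate-by-replicate counting described above can be sketched in Python as follows. This is an illustration only, not the original analysis: the expression values are simulated (500 hypothetical genes, 50 of them shifted between groups), the per-gene test is Welch's unequal-variance t-test, and every function name here is made up for the example. The p-value is computed with the standard library alone, using the usual continued-fraction form of the regularized incomplete beta function.

```python
import math
import random
from statistics import mean, variance


def _beta_cf(a, b, x, max_iter=300, eps=1e-12):
    """Continued fraction for the incomplete beta function (modified Lentz method)."""
    tiny = 1e-300
    c = 1.0
    d = 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        # Each iteration applies the even and then the odd term of the fraction.
        for num in (
            m * (b - m) * x / ((a + 2 * m - 1) * (a + 2 * m)),
            -(a + m) * (a + b + m) * x / ((a + 2 * m) * (a + 2 * m + 1)),
        ):
            d = 1.0 + num * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + num / c
            c = c if abs(c) > tiny else tiny
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_front = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                 + a * math.log(x) + b * math.log1p(-x))
    front = math.exp(log_front)
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b


def welch_p_value(x, y):
    """Two-sided p-value of Welch's t-test for two lists of replicate values."""
    nx, ny = len(x), len(y)
    vx, vy = variance(x) / nx, variance(y) / ny
    t = (mean(x) - mean(y)) / math.sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (nx - 1) + vy ** 2 / (ny - 1))
    return reg_inc_beta(df / 2.0, 0.5, df / (df + t * t))


def simulate(n_genes=500, n_reps=12, n_changed=50, shift=2.0, seed=0):
    """Simulated data: the first n_changed genes differ between the groups."""
    rng = random.Random(seed)
    wild, dwarf = [], []
    for g in range(n_genes):
        offset = shift if g < n_changed else 0.0
        wild.append([rng.gauss(0.0, 1.0) for _ in range(n_reps)])
        dwarf.append([rng.gauss(offset, 1.0) for _ in range(n_reps)])
    return wild, dwarf


wild, dwarf = simulate()
results = {}
for n in range(2, 13):
    # Use only the first n replicates of each group, as if n had been run.
    p_values = [welch_p_value(w[:n], d[:n]) for w, d in zip(wild, dwarf)]
    results[n] = {alpha: sum(p < alpha for p in p_values)
                  for alpha in (0.001, 0.0001)}
    print(n, results[n])
```

Each printed row gives the number of replicates per group followed by the count of genes passing each cutoff. With real data, the simulated lists would be replaced by the normalized expression values for each gene.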
In addition, two different assumptions were made about the variance across the samples for each gene. If you assume that the variance is equal between the two groups for every gene, you will get a larger number of significant genes than if you assume that the variance is not equal. Even so, the statistically safer choice is most likely to assume unequal variance.
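One way to see where the difference comes from is to compute both versions of the test statistic for a single gene. The sketch below uses made-up expression values and hypothetical function names; it is not the code used for the analysis. When both groups have the same number of replicates, as in this experiment, the two t statistics are identical, but the unequal-variance (Welch) test has fewer degrees of freedom. The same t value therefore gives a larger p-value, and fewer genes pass the cutoff.

```python
import math
from statistics import mean, variance


def student_t(x, y):
    """Equal-variance (pooled) two-sample t statistic and degrees of freedom."""
    nx, ny = len(x), len(y)
    df = nx + ny - 2
    pooled = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / df
    t = (mean(x) - mean(y)) / math.sqrt(pooled * (1 / nx + 1 / ny))
    return t, df


def welch_t(x, y):
    """Unequal-variance (Welch) t statistic and Welch-Satterthwaite df."""
    nx, ny = len(x), len(y)
    vx, vy = variance(x) / nx, variance(y) / ny
    t = (mean(x) - mean(y)) / math.sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (nx - 1) + vy ** 2 / (ny - 1))
    return t, df


# Made-up log expression values for one hypothetical gene, 4 replicates each.
wildtype = [7.1, 7.4, 6.9, 7.2]
dwarf = [8.0, 9.5, 7.6, 10.1]

print("equal variance:  t = %.3f, df = %.2f" % student_t(wildtype, dwarf))
print("unequal variance: t = %.3f, df = %.2f" % welch_t(wildtype, dwarf))
```

The gap between the two degrees-of-freedom values is largest when the two groups have very different spreads, as in this example, and vanishes when the sample variances are equal.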
[Table: Number of differentially expressed genes vs. number of replicates (ANOVA). Columns: # of Replicates | Variance Not Equal | Variance Equal]
NOTE: Although this is only one experiment and other results might differ, the table and graphs clearly show that as you increase the number of replicates, the number of significant differentially expressed genes also increases. Therefore, when planning microarray experiments, it is important to seriously consider the number of replicates for each treatment.