How to define a table of DE genes in a microarray experiment

From BITS wiki

Let's go back to the simple example of the single comparison of 3 wild-type and 3 mutant samples.

Many scientific papers quote unadjusted p-values. This is not a good idea given the massive number of comparisons you make when identifying DE genes: with thousands of genes tested, many will have a small p-value by chance alone. Reporting adjusted p-values, together with the FDR you used as a cutoff, is much more informative. As yet, no conventions have been established for the false discovery rate in published work, but an FDR of 5% or less should be acceptable for journal publication of gene lists.
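To illustrate why unadjusted p-values are misleading at this scale, here is a minimal sketch in base R. The data are simulated (10,000 hypothetical genes with no real differential expression) and the Benjamini-Hochberg method is assumed as the adjustment; neither comes from the experiment discussed on this page.

```r
# Simulated example: 10,000 genes with no real differential expression,
# so every "significant" unadjusted p-value is a false positive.
set.seed(1)
p_raw <- runif(10000)

# Benjamini-Hochberg adjustment, which controls the false discovery rate (FDR).
p_adj <- p.adjust(p_raw, method = "BH")

n_raw <- sum(p_raw < 0.05)  # roughly 5% of the genes pass by chance alone
n_adj <- sum(p_adj < 0.05)  # never more than n_raw, usually far fewer

cat("unadjusted p < 0.05:", n_raw, "\n")
cat("adjusted p < 0.05:  ", n_adj, "\n")
```

An adjusted p-value is never smaller than the corresponding unadjusted one, so the adjusted list is always a subset of the unadjusted list.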

As you can see, topTable() as called above neither sorts on the adjusted p-value nor selects the genes with an adjusted p-value below a certain cutoff, yet in most cases this is exactly what you want to do.
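One way to get such a list is to ask topTable() for all genes and filter the resulting data frame yourself. The sketch below assumes simulated data (the matrix `expr`, the sample names and the 50 spiked-in genes are all hypothetical) and a cutoff of 0.05; substitute your own fitted object and threshold.

```r
library(limma)

# Simulated log2 expression matrix: 1000 genes x 6 samples (3 WT, 3 mutant).
set.seed(42)
expr <- matrix(rnorm(1000 * 6, mean = 8, sd = 0.3), nrow = 1000,
               dimnames = list(paste0("gene", 1:1000),
                               c("wt1", "wt2", "wt3", "mut1", "mut2", "mut3")))
# Make the first 50 genes truly DE (shift of 2 on the log2 scale in mutants).
expr[1:50, 4:6] <- expr[1:50, 4:6] + 2

group  <- factor(rep(c("wt", "mut"), each = 3), levels = c("wt", "mut"))
design <- model.matrix(~ group)          # column 2 = mutant vs wild-type
fit    <- eBayes(lmFit(expr, design))

# Ask topTable() for all genes, then filter and sort on the adjusted p-value.
tab <- topTable(fit, coef = 2, number = Inf, adjust.method = "BH")
de  <- tab[tab$adj.P.Val < 0.05, ]
de  <- de[order(de$adj.P.Val), ]
head(de)
```

Recent limma releases also accept `p.value` and `lfc` arguments in topTable(); check `?topTable` to see whether the version you have installed supports them.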

Many of the genes with an adjusted p-value below the threshold (adj. p-value < 0.001) have fold changes smaller than two-fold. Although these changes are statistically significant (the adjusted p-value is very low), most people do not select genes whose expression changes less than two-fold.
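Combining the two criteria is a matter of subsetting on both columns. The table below is a small hypothetical example that only borrows the column names topTable() returns; the values are made up for illustration.

```r
# Hypothetical results table with the same column names topTable() returns.
tab <- data.frame(
  logFC     = c(2.3, -0.4, 0.8, -1.6, 0.2),
  adj.P.Val = c(0.0002, 0.0005, 0.0008, 0.0004, 0.4),
  row.names = paste0("gene", 1:5)
)

# limma reports log2 fold changes, so two-fold corresponds to |logFC| >= 1.
significant <- tab$adj.P.Val < 0.001
two_fold    <- abs(tab$logFC) >= log2(2)

tab[significant, ]              # four genes pass the p-value cutoff alone
tab[significant & two_fold, ]   # only gene1 and gene4 also change two-fold or more
```

Taking the absolute value keeps both up-regulated and down-regulated genes.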

There is an alternative method for finding DE genes that lets you specify a false discovery rate and a threshold for the log fold change in a single command.
It is especially useful when you do multiple comparisons, as in the example of the 3 groups of mouse samples. If you use topTable() for this, you have to perform the adjustment three times, once for each comparison, each time changing the value of the coef parameter.
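This section does not name the command, so the sketch below assumes it is limma's decideTests(), which takes an adjustment method, a p-value cutoff and a log fold change threshold and classifies every gene in every contrast at once. The data are simulated, and the group names A, B and C and the contrast names are hypothetical.

```r
library(limma)

# Simulated log2 expression: 1000 genes x 9 samples, three groups of three.
set.seed(7)
expr <- matrix(rnorm(1000 * 9, mean = 8, sd = 0.3), nrow = 1000,
               dimnames = list(paste0("gene", 1:1000), NULL))
expr[1:50, 4:6]   <- expr[1:50, 4:6] + 2     # genes 1-50 up in group B
expr[51:100, 7:9] <- expr[51:100, 7:9] - 2   # genes 51-100 down in group C

group  <- factor(rep(c("A", "B", "C"), each = 3))
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)

contrasts <- makeContrasts(BvsA = B - A, CvsA = C - A, CvsB = C - B,
                           levels = design)
fit <- eBayes(contrasts.fit(lmFit(expr, design), contrasts))

# One call: FDR cutoff of 5% and a minimum two-fold change (|log2 FC| >= 1).
results <- decideTests(fit, adjust.method = "BH", p.value = 0.05, lfc = 1)
summary(results)   # counts of down (-1), not significant (0) and up (1) genes
```

By default decideTests() uses `method = "separate"`, which adjusts each contrast on its own, as repeated topTable() calls would. Setting `method = "global"` adjusts across all contrasts together.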