Microarray analysis (data generated on Affymetrix platform)



Nice video tutorial on the principle of microarrays (thanks Zeinab!)

This page will redirect you to all our Affymetrix microarray tutorials.

To make it easier for you to choose the appropriate software to analyze your data, we highlighted the characteristics and the potential of each microarray analysis tool.

Checking probes of your favourite gene

There are many ways to check the presence and specificity of the probes that represent your favourite gene, or the differentially expressed (DE) gene that you selected for further study, on a specific microarray. You should do this before you invest a lot of time and money in the follow-up of that gene, since many probe sets are wrongly annotated or known to bind to multiple genes. See how to check probes on the basic bioinformatics exercises page.

Analysis of your own microarray data

R/Bioconductor: Analyze_your_own_microarray_data_in_R/Bioconductor
Pro: full freedom to perform every step in the analysis the way you want it
Con: hard to use

Affymetrix Expression console and TAC: tutorial
Pro: easy user interface
Con: you lose freedom compared to R, e.g. only 3 normalization algorithms to choose from

CLC Main Workbench tutorial
Pro: easy user interface

Con:

Overview of possibilities of software tools for analysis of own microarray data
| software | supported analyses | statistics | 2 groups | 3 groups | paired data | 2 factors | OS | free? |
|---|---|---|---|---|---|---|---|---|
| R/Bioconductor | any | sound: limma | yes | yes | yes | yes | any | yes |
| Affymetrix EC+TAC | search DE genes | regular t-test and ANOVA with FDR correction | yes | yes | no | no | Windows | yes |
| CLC Main Workbench(*) | search DE genes, MA plots, box plots, PCA, clustering | simple (regular t-test and ANOVA); ANOVA follow-up not ok (t tests without correction) | yes | yes | yes | no | any | no |

(*) data has to be first normalized in other software

Analysis of public microarray data

Obtaining public microarray data

It is mandatory for publication to make microarray data publicly available.
There are two main public microarray data repositories:

  • GEO (Gene Expression Omnibus) from NCBI
  • ArrayExpress from EBI

You can search these repositories for microarray data sets that are related to your research topic.
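GEO can also be searched programmatically through the NCBI E-utilities `esearch` service, using the GEO DataSets database (`gds`). The following is a minimal sketch, not part of the BITS tutorials: the function names and the example query are hypothetical, and only the URL is built when the script runs, so no request is sent unless you call `search_geo` yourself.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def geo_search_url(term, max_results=20):
    """Build an E-utilities URL that searches the GEO DataSets database ("gds")."""
    params = {"db": "gds", "term": term, "retmax": max_results, "retmode": "json"}
    return EUTILS_ESEARCH + "?" + urlencode(params)

def search_geo(term, max_results=20):
    """Run the search (needs internet access) and return the matching record IDs."""
    with urlopen(geo_search_url(term, max_results)) as response:
        result = json.load(response)
    return result["esearchresult"]["idlist"]

# Hypothetical query; only the URL is built here, no request is sent.
url = geo_search_url("heart AND Rattus norvegicus[Organism]")
print(url)
```

Opening the printed URL in a browser shows the raw search result; the web interface of GEO remains the easier route for exploring data sets by hand.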
We provide tutorials on searching GEO:


Analysis of public microarray data

R/Bioconductor: see tutorial on loading the data in R and tutorial on analyzing the data in R (ignore step 4)
Pro: Full freedom to perform every step in the analysis the way you want
Con: Hard to use

GEO2R (see tutorial on space-flown mice data set and tutorial on rat heart versus diaphragm data set)
Pro: Very easy to use, works in a browser
Con: Assumes the data is normalized (whether or not this is truly the case depends on the submitter of the data). You can check if the data is normalized by creating a box plot. If the boxes are very different, the data is not normalized and you cannot proceed with the analysis.
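The box plot check can also be approximated numerically by comparing the quartiles of each sample. A minimal sketch in Python, assuming the expression values are already available as one list of log2 intensities per sample; the sample names, the toy values, and the 0.5 tolerance are hypothetical illustrations, not GEO2R settings or an accepted cutoff.

```python
from statistics import median, quantiles

def box_summary(values):
    """Return (Q1, median, Q3), the numbers a box plot draws as the box."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return q1, q2, q3

def looks_normalized(samples, tolerance=0.5):
    """Rough check: are all sample medians within `tolerance` of each other?

    `samples` maps a sample name to its list of log2 expression values.
    The tolerance is an arbitrary illustration value, not a standard cutoff.
    """
    medians = [median(values) for values in samples.values()]
    return max(medians) - min(medians) <= tolerance

# Hypothetical toy data: three samples with similar distributions,
# plus one sample shifted upwards by two log2 units.
similar = {
    "sample_A": [6.1, 7.0, 7.9, 9.2, 10.5],
    "sample_B": [6.0, 7.1, 8.0, 9.0, 10.6],
    "sample_C": [6.2, 6.9, 8.1, 9.1, 10.4],
}
shifted = dict(similar, sample_D=[8.1, 9.0, 9.9, 11.2, 12.5])

for name, values in shifted.items():
    print(name, box_summary(values))

print(looks_normalized(similar))  # medians are close together
print(looks_normalized(shifted))  # sample_D sits well above the others
```

Comparing medians only is a simplification; a real box plot also lets you judge the spread of each sample, so treat this as a quick first look rather than a replacement for the plot.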

GEO Dataset browser (see tutorial)
Pro: Very easy to use, works in a browser

Con:
  • Statistics used for searching DE genes are too simple:
    (1) ordinary t-test without shrinkage and multiple testing correction or (2) setting a threshold on the fold changes
  • Assumes the data is normalized. You can check if this is true by creating a box plot. If the boxes are very different, the data is not normalized and you cannot proceed with the analysis.
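To illustrate why the missing multiple testing correction matters, here is a minimal sketch of the Benjamini-Hochberg procedure, one common way to control the false discovery rate (FDR). It is an illustration only: the p-values are hypothetical, and it is an assumption that this is the kind of correction you would want to apply, not a description of what any of the tools listed here do internally.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw p-values for five genes.
raw = [0.001, 0.04, 0.03, 0.2, 0.8]
adjusted = benjamini_hochberg(raw)

significant_raw = [p < 0.05 for p in raw]
significant_fdr = [q < 0.05 for q in adjusted]
print(sum(significant_raw), "genes pass without correction")
print(sum(significant_fdr), "genes pass after correction")
```

In this toy example three genes fall below 0.05 on the raw p-values, but only one remains below 0.05 after correction. On a real array with tens of thousands of probe sets the difference is far larger, which is why uncorrected t-tests produce many false positives.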

Affymetrix Expression console and TAC (see tutorial)
Pros: easy user interface and free
Cons: you lose freedom compared to R, e.g. only 3 normalization algorithms to choose from

CLC Main Workbench: tutorial
Pro: easy user interface
Con: you have to pay for the software (VIB scientists can use the software for free via BITS).
The software does not perform normalization, so you have to download normalized data from GEO (see tutorial).

Very powerful and reliable but expensive: GeneVestigator (see tutorial)
Pros: you can combine multiple data sets; heavily curated data
Cons: you have to pay for the software (VIB scientists can use the software for free via BITS)

Overview of possibilities of software tools for analysis of public microarray data
| software | supported analyses | statistics | 2 groups | 3 groups | paired data | 2 factors | OS | free? |
|---|---|---|---|---|---|---|---|---|
| R/Bioconductor | any | sound: limma | yes | yes | yes | yes | any | yes |
| GEO2R | search DE genes, box plots | sound: limma | yes | yes | no | no | any | yes |
| GEO Dataset browser | search DE genes, box plots, clustering, heatmaps | simple (regular t-test or fold change threshold); no FDR correction | yes | no | no | no | any | yes |
| Affymetrix EC+TAC | search DE genes | regular t-test and ANOVA with FDR correction | yes | yes | no | no | Windows | yes |
| CLC Main Workbench(*) | search DE genes, MA plots, box plots, PCA, clustering | simple (regular t-test and ANOVA); ANOVA follow-up not ok (t tests without correction) | yes | yes | yes | no | any | no |

(*) data has to be first normalized in other software

Additional info

In many exercises we refer to the other material of the various microarray analysis trainings BITS offers: