Hands-on Analysis of public microarray datasets


Date: October 17, 2014, from 9h30 to 17h00





This basic training gives an overview of what GEO[1] has to offer. Several experiments will be analyzed with simple tools to obtain lists of differentially expressed (DE) genes. An introduction to downstream tools dedicated to functional enrichment closes the session.

Required skills

This training is meant for biologists who have little or no data of their own and need to identify genes of interest associated with a given biological problem. Participants do not need any prior knowledge of programming.

Morning Session

  • Find relevant data on GEO
  • Analyze using the NCBI GEO2R utility
  • Continue the GEO2R analysis in RStudio (intro)
  • Find clusters using the NCBI GEO DataSets browser
  • Analyze the same data using RobiNA
  • Perform functional enrichment on the DE gene-lists using public tools
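
For readers curious about what a "DE gene-list" contains, here is a minimal sketch of the idea in Python. It is not the method used in the session: GEO2R relies on the R/Bioconductor package limma, whereas this sketch swaps in a plain Welch t-statistic and a log2 fold-change cutoff. The gene names, expression values, and the `min_abs_logfc` threshold are all invented for illustration.

```python
import math
from statistics import mean, variance

# Toy log2 expression values, invented for illustration only
# (not taken from any GEO series). gene -> (control, treated).
toy_expression = {
    "geneA": ([7.1, 7.3, 6.9], [9.0, 9.2, 8.8]),
    "geneB": ([5.0, 5.2, 4.9], [5.1, 5.0, 5.1]),
    "geneC": ([8.4, 8.6, 8.5], [6.9, 7.1, 7.0]),
}


def welch_t(control, treated):
    """Welch t-statistic (treated vs control), unequal variances allowed."""
    se = math.sqrt(variance(control) / len(control)
                   + variance(treated) / len(treated))
    return (mean(treated) - mean(control)) / se


def rank_genes(expression, min_abs_logfc=1.0):
    """Keep genes whose |log2 fold change| passes the cutoff, largest first."""
    rows = []
    for gene, (control, treated) in expression.items():
        # Values are already on the log2 scale, so the difference of
        # group means is the log2 fold change.
        logfc = mean(treated) - mean(control)
        if abs(logfc) >= min_abs_logfc:
            rows.append((gene, round(logfc, 2),
                         round(welch_t(control, treated), 2)))
    return sorted(rows, key=lambda row: abs(row[1]), reverse=True)


de_genes = rank_genes(toy_expression)
for gene, logfc, t_stat in de_genes:
    print(gene, logfc, t_stat)
```

A real analysis would also correct the P-values for multiple testing across all probes, which this sketch leaves out.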

Afternoon Session

  • Users with access to CLC Main can follow the CLC tutorial (VIB-only)
  • Users search GEO datasets and analyze them with help from the trainer

More Info

  • More on the VIB website [2]
  • Related VIB training sessions [3]
  • Related BITS Website pages [4]



This section lists the exercises performed during the hands-on session.

  • PubMA_Exercise.1 Search GEO to find public datasets related to one's project
  • PubMA_Exercise.2 Compute differential analysis using GEO2R within the NCBI web-portal (and follow-up in RStudio)
  • PubMA_Exercise.3 Clustering using the GEO DataSets browser (only for data with an attached GDS ID)
  • PubMA_Exercise.4 Full RobiNA analysis as a standalone desktop alternative to GEO2R and Bioconductor
  • PubMA_Exercise.5 Web-tools for functional enrichment of the obtained lists to identify key biological functions
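
As background for the functional enrichment exercise, the sketch below shows one common way to score over-representation of an annotation in a gene list: a one-sided hypergeometric test. It is an assumption that any particular web tool uses exactly this test, and the function name `hypergeom_enrichment_p` and all counts in the example are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_annotated, n_selected, n_overlap):
    """Probability of seeing at least n_overlap annotated genes when
    n_selected genes are drawn without replacement from n_background
    genes, of which n_annotated carry the annotation."""
    upper = min(n_annotated, n_selected)
    total = comb(n_background, n_selected)
    tail = sum(
        comb(n_annotated, k) * comb(n_background - n_annotated, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20000 genes on the array, 200 annotated to a
# pathway, 100 genes in the DE list, 8 of them in the pathway.
p_value = hypergeom_enrichment_p(
    n_background=20000, n_annotated=200, n_selected=100, n_overlap=8
)
print(p_value)
```

With these counts only about one pathway gene would be expected in the list by chance, so an overlap of 8 gives a small P-value. Enrichment tools typically test many annotations at once and adjust the P-values accordingly.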


Additional resources

Additional tutorials


Web-services and resources

Only a few of the following resources will be used during this training.


Please feel free to explore these additional resources, which include more plant-dedicated tools than those listed above.

Meta Analysis Resources

  • InsilicoDB [9] offers similar services by linking the data to the Broad data-mining tools GenePattern & Gene-E. Please refer to the InsilicoDB tutorial pages for more info.

Commercial resources licensed by VIB

  • Genevestigator is not covered during this training but is strongly advised for all users who do not have their own microarray (MA) data but need to find biomarkers.
  • CLC Main workbench (http://data.bits.vib.be/pub/trainingen/CLCMain/TutorialMicroarrays.pdf) is used in the optional PubMA_Exercise.6.
  • Ingenuity Pathway Analysis (IPA) is strongly advised for more advanced users and usage. You can use IPA on any computer with Java installed after requesting a personal account by e-mail (mailto:bits@vib.be) and logging in. Please keep in mind that IPA is only meant for human, mouse, and rat data.

Do you still need MORE?

Find more tools with OMICtools[10]

  1. Tanya Barrett, Ron Edgar
    Mining microarray data at NCBI's Gene Expression Omnibus (GEO)*.
    Methods Mol. Biol.: 2006, 338;175-90
    [PubMed:16888359]

    Ron Edgar, Michael Domrachev, Alex E Lash
    Gene Expression Omnibus: NCBI gene expression and hybridization array data repository.
    Nucleic Acids Res.: 2002, 30(1);207-10
    [PubMed:11752295]

  2. http://www.vib.be/en/training/research-training/courses/Pages/Analysis-of-public-microarray-data-sets.aspx
  3. http://www.vib.be/en/training/research-training/courses/Pages/Introduction-to-Affymetrix-microarray-analysis.aspx http://www.vib.be/en/training/research-training/courses/Pages/Analysis-of-public-microarray-data-using-Genevestigator.aspx
  4. https://www.bits.vib.be/index.php/training/177-microarray-bioconductor https://www.bits.vib.be/index.php/training/125-genevestigator
  5. http://genepattern.org/
  6. http://www.broadinstitute.org/cancer/software/GENE-E/
  7. http://www.broadinstitute.org/gsea/
  8. http://tagc.univ-mrs.fr/tbrowser/

    Cyrille Lepoivre, Aurélie Bergon, Fabrice Lopez, Narayanan B Perumal, Catherine Nguyen, Jean Imbert, Denis Puthier
    TranscriptomeBrowser 3.0: introducing a new compendium of molecular interactions and a new visualization tool for the study of gene regulatory networks.
    BMC Bioinformatics: 2012, 13;19
    [PubMed:22292669]

    Fabrice Lopez, Julien Textoris, Aurélie Bergon, Gilles Didier, Elisabeth Remy, Samuel Granjeaud, Jean Imbert, Catherine Nguyen, Denis Puthier
    TranscriptomeBrowser: a powerful and flexible toolbox to explore productively the transcriptional landscape of the Gene Expression Omnibus database.
    PLoS ONE: 2008, 3(12);e4001
    [PubMed:19104654]

  9. https://insilicodb.com InsilicoDB

    Jonatan Taminau, Stijn Meganck, Cosmin Lazar, David Steenhoff, Alain Coletta, Colin Molter, Robin Duque, Virginie de Schaetzen, David Y Weiss Solís, Hugues Bersini, Ann Nowé
    Unlocking the potential of publicly available microarray data using inSilicoDb and inSilicoMerging R/Bioconductor packages.
    BMC Bioinformatics: 2012, 13;335
    [PubMed:23259851]

    Alain Coletta, Colin Molter, Robin Duqué, David Steenhoff, Jonatan Taminau, Virginie de Schaetzen, Stijn Meganck, Cosmin Lazar, David Venet, Vincent Detours, Ann Nowé, Hugues Bersini, David Y Weiss Solís
    InSilico DB genomic datasets hub: an efficient starting point for analyzing genome-wide studies in GenePattern, Integrative Genomics Viewer, and R/Bioconductor.
    Genome Biol.: 2012, 13(11);R104
    [PubMed:23158523]

    Cosmin Lazar, Stijn Meganck, Jonatan Taminau, David Steenhoff, Alain Coletta, Colin Molter, David Y Weiss-Solís, Robin Duque, Hugues Bersini, Ann Nowé
    Batch effect removal methods for microarray gene expression data integration: a survey.
    Brief. Bioinformatics: 2013, 14(4);469-90
    [PubMed:22851511]

    Jonatan Taminau, David Steenhoff, Alain Coletta, Stijn Meganck, Cosmin Lazar, Virginie de Schaetzen, Robin Duque, Colin Molter, Hugues Bersini, Ann Nowé, David Y Weiss Solís
    inSilicoDb: an R/Bioconductor package for accessing human Affymetrix expert-curated datasets from GEO.
    Bioinformatics: 2011, 27(22);3204-5
    [PubMed:21937664]

  10. http://omictools.com

    Vincent J Henry, Anita E Bandrowski, Anne-Sophie Pepin, Bruno J Gonzalez, Arnaud Desfeux
    OMICtools: an informative directory for multi-omic data analysis.
    Database (Oxford): 2014, 2014;
    [PubMed:25024350]
