Exercises: Reading and storing expression data

Go to parent Introduction to R/Bioconductor for analysis of microarray data#Training Units

Atxn1 CEL files

The data for a small experiment comparing gene expression in Ataxin 1 (Atxn1) knockout mice and wild-type animals are available under the ArrayExpress accession E-MEXP-886. You can find them through the ArrayExpress webpage or simply by searching the web for the accession.

  1. Download the raw expression data to your hard disk and extract the CEL files into a new working directory.
  2. Download the accompanying phenotypic data and store it in the same directory.
  3. Read the probe-level expression data into R.
  4. Read in the phenotypic information and store it as part of the raw expression data. (Tip: look at the SDRF file.)
  5. Bonus: download the description of the experiment, and fill in the relevant parts of the MIAME standard as part of the raw expression data.
  6. Store the annotated raw expression data object as a binary RData file in the same directory as above.
  7. Create a density plot of the probe intensities. What do you notice?
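
One possible way to approach steps 3, 4, 6 and 7 is sketched below. It is a sketch under assumptions, not a reference solution: the SDRF file name, the "Array Data File" column used to match rows to CEL files, and the output file name atxn1_raw.RData are guesses that should be checked against what you actually downloaded. The helper names read_atxn1 and save_and_plot are hypothetical.

```r
# Sketch only. Assumes the CEL files and the SDRF file sit together in
# cel_dir, and that the Bioconductor package affy is installed.
read_atxn1 <- function(cel_dir, sdrf_file = "E-MEXP-886.sdrf.txt") {
  library(affy)  # also attaches Biobase

  # Step 3: probe-level data from every CEL file found in cel_dir
  raw <- ReadAffy(celfile.path = cel_dir)

  # Step 4: the SDRF is a tab-delimited table, one row per hybridisation
  sdrf <- read.delim(file.path(cel_dir, sdrf_file),
                     check.names = FALSE, stringsAsFactors = FALSE)

  # Assumed column: "Array Data File" holds the CEL file names
  rownames(sdrf) <- sdrf[["Array Data File"]]
  stopifnot(all(sampleNames(raw) %in% rownames(sdrf)))

  # Reorder the rows to match the arrays and attach them as phenoData
  phenoData(raw) <- AnnotatedDataFrame(sdrf[sampleNames(raw), , drop = FALSE])
  raw
}

save_and_plot <- function(raw, cel_dir) {
  library(affy)

  # Step 6: binary RData file next to the CEL files (file name is arbitrary)
  save(raw, file = file.path(cel_dir, "atxn1_raw.RData"))

  # Step 7: the hist() method for AffyBatch objects draws one density
  # curve of log intensities per array
  hist(raw)
}
```

A call such as raw <- read_atxn1("path/to/cel_dir") followed by save_and_plot(raw, "path/to/cel_dir") would then run the whole sequence. The MIAME bonus step is left out on purpose, since its content has to come from the experiment description.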

Resistant CLM

We want to use RMA instead of MAS for the analysis of the resistant CLM data used in the tutorial. The GEO series (GSExxxx) associated with the reference GDS2729 contains the necessary CEL files as supplementary files.

Alternative 1: go directly to the GEO repository and look up the reference GDS2729. From there, you can follow the link to the associated reference series and download the supplementary files into your working directory.

Alternative 2: the ID of the associated reference series is part of the meta information contained in the GDS object that was already downloaded in the tutorial. With this ID, the function getGEOSuppFiles can download the supplementary files to the local directory without ever leaving the R command line.

In both cases, the downloaded files need to be uncompressed first. You can then either read in the probe-level data and apply the function rma, or jump directly to the probeset-level data using justRMA.
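
Alternative 2 could look roughly like the following sketch. It assumes an internet connection, that the packages GEOquery and affy are installed, that the series ID is stored in the reference_series field of the GDS meta information, and that the supplementary files arrive as a single tar archive of gzipped CEL files. The helper name rma_from_gds is hypothetical.

```r
# Sketch only: downloads data from GEO when called.
rma_from_gds <- function(gds_id = "GDS2729", base_dir = getwd()) {
  library(GEOquery)
  library(affy)

  # The GDS object carries the ID of the associated series in its metadata
  gds <- getGEO(gds_id)
  gse_id <- Meta(gds)$reference_series   # assumed field name

  # Supplementary files end up in <base_dir>/<gse_id>/
  getGEOSuppFiles(gse_id, baseDir = base_dir)
  gse_dir <- file.path(base_dir, gse_id)

  # Unpack the tar archive, then gunzip the individual CEL files
  tar_file <- list.files(gse_dir, pattern = "\\.tar$", full.names = TRUE)
  untar(tar_file[1], exdir = gse_dir)
  gz_files <- list.files(gse_dir, pattern = "\\.cel\\.gz$",
                         ignore.case = TRUE, full.names = TRUE)
  for (f in gz_files) gunzip(f, overwrite = TRUE)

  # Go straight to probeset-level data; the two-step alternative is
  # rma(ReadAffy(celfile.path = gse_dir))
  justRMA(celfile.path = gse_dir)
}
```

Calling eset <- rma_from_gds() would return an ExpressionSet of RMA values. justRMA avoids holding the full probe-level object in memory, which matters mainly for larger series.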

Direct download from ArrayExpress

The Bioconductor package ArrayExpress allows direct download from ArrayExpress/EBI. Load the package, read the vignette, and:

  1. download the data for the Atxn1 experiment directly.
  2. identify another experiment involving Atxn1 and a mouse model at EBI; if it is not too big, download it too.
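
As a starting point, the two steps might be sketched as follows. This assumes the ArrayExpress package is installed and an internet connection is available. The queryAE call follows the style of the package vignette, but its availability and arguments may differ between package versions, so treat the keyword and species strings as assumptions. The helper names are hypothetical.

```r
# Sketch only: both functions contact ArrayExpress/EBI when called.
fetch_atxn1_ae <- function(accession = "E-MEXP-886", path = tempdir()) {
  library(ArrayExpress)
  # Downloads raw data and annotation, and builds a Bioconductor object
  ArrayExpress(accession, path = path)
}

find_atxn1_experiments <- function(keywords = "Atxn1",
                                   species = "mus+musculus") {
  library(ArrayExpress)
  # Returns a table of matching experiments; check its size columns
  # before deciding to download anything
  queryAE(keywords = keywords, species = species)
}
```

The path argument defaults to a temporary directory here only to keep the working directory clean; pass your own directory to keep the downloaded files.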