Loading public microarray data in R/Bioconductor
From BITS wiki
To load raw CEL files from GEO directly into R, install the GEOquery Bioconductor package and use the following script:
#Installing and loading the GEOquery package
#For the data import you also need the Bioconductor base package (Biobase) and the affy package
#If a package is missing, install it first, e.g. BiocManager::install(c("GEOquery", "affy"))
library(Biobase)
library(GEOquery)
library(affy)

#Downloading and unpacking the CEL files from GEO
#The data set is selected by its GEO Series accession number
getGEOSuppFiles("GSE6943")
untar("GSE6943/GSE6943_RAW.tar", exdir="data")
#Select only the gzip-compressed files (names ending in .gz) and decompress them
cels <- list.files("data/", pattern = "\\.gz$")
sapply(paste("data", cels, sep="/"), gunzip)
cels

#Define celpath as the path to the folder where R saved the CEL files (adjust it to your own system)
#In this example one of the files, GSM160097, is corrupted, so remove it from the folder first
celpath <- "C:/Users/Janick/Documents/data/"
fns <- list.celfiles(path=celpath,full.names=TRUE)
fns
cat("Reading files:\n",paste(fns,collapse="\n"),"\n")

#Loading the CEL files into an AffyBatch object
celfiles <- ReadAffy(celfile.path=celpath)
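Instead of deleting the corrupted file by hand, the list of file names can be filtered before the data are read. The following is a minimal sketch in base R, not part of the original script: the helper `drop_samples` and the file names `sample_1.CEL` and `sample_2.CEL` are hypothetical and only serve as an illustration.

```r
# Hypothetical helper: remove every path whose file name contains one of
# the given sample identifiers (matched literally, not as a regular expression)
drop_samples <- function(paths, bad_ids) {
  if (length(bad_ids) == 0) return(paths)
  hits <- lapply(bad_ids, function(id) grepl(id, basename(paths), fixed = TRUE))
  is_bad <- Reduce(`|`, hits)
  paths[!is_bad]
}

# Illustrative file names; in practice this would be the vector fns
# returned by list.celfiles()
fns_example <- c("data/sample_1.CEL", "data/GSM160097.CEL", "data/sample_2.CEL")
good <- drop_samples(fns_example, "GSM160097")
print(good)

# With the affy package loaded, the filtered vector could then be passed
# to ReadAffy through its filenames argument:
# celfiles <- ReadAffy(filenames = good)
```

This leaves the downloaded folder untouched, so the excluded sample stays documented in the script itself.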