GEO
GEO, the Gene Expression Omnibus, is the public repository for microarray experiment data at NCBI. GEO contains:
- Microarray and other transcriptome data in MIAME compliant formats
- ChIP-chip data
Microarray experiments at GEO come as GDS (datasets) or GSE (series). GSE experiments have not (yet) been processed by GEO: the information about the experiment, the chips used, and the signal intensity values is available as raw data. GDS datasets have been processed, and normalization has been performed.
You can download GEO data in SOFT format, which contains gene expression data together with experiment data. In GSE experiments, the SOFT files contain all raw intensity values for each sample (chip) used, while in GDS experiments all intensity values are normalized across the experiment, offering ready-to-use data.
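As a rough illustration of what a SOFT file looks like to a program, the following Python sketch reads SOFT-style text into plain dictionaries. It assumes the usual line conventions of the format: "^" starts a new entity, "!" carries an attribute, "#" describes a table column, and tab-delimited rows sit between "!..._table_begin" and "!..._table_end". The function name parse_soft, the accessions, and the values in the example text are hypothetical and not taken from a real GEO record.

```python
from collections import defaultdict


def parse_soft(lines):
    """Parse SOFT-style lines into a list of entity dictionaries."""
    entities = []
    current = None
    in_table = False
    for raw in lines:
        line = raw.rstrip("\n")
        if not line:
            continue
        if line.startswith("^"):
            # Entity line, e.g. "^SAMPLE = GSM0000001"
            kind, _, name = line[1:].partition("=")
            current = {
                "type": kind.strip(),
                "name": name.strip(),
                "attributes": defaultdict(list),
                "columns": {},
                "table": [],
            }
            entities.append(current)
            in_table = False
        elif current is None:
            # Ignore anything that appears before the first entity line
            continue
        elif line.startswith("!"):
            key, _, value = line[1:].partition("=")
            key = key.strip()
            if key.endswith("_table_begin"):
                in_table = True
            elif key.endswith("_table_end"):
                in_table = False
            else:
                # Attributes may repeat, so collect values in a list
                current["attributes"][key].append(value.strip())
        elif line.startswith("#"):
            # Column description, e.g. "#VALUE = signal intensity"
            key, _, value = line[1:].partition("=")
            current["columns"][key.strip()] = value.strip()
        elif in_table:
            # Tab-delimited table row; the first row is the header
            current["table"].append(line.split("\t"))
    return entities


# Hypothetical SOFT fragment, for illustration only
example = """^SAMPLE = GSM0000001
!Sample_title = hypothetical control replicate
!Sample_platform_id = GPL0000001
#ID_REF = probe identifier
#VALUE = signal intensity
!sample_table_begin
ID_REF\tVALUE
probe_1\t12.5
probe_2\t830.1
!sample_table_end
"""

entities = parse_soft(example.splitlines())
sample = entities[0]
print(sample["type"], sample["name"], len(sample["table"]) - 1, "data rows")
```

Values are kept as strings here; converting the intensity column to numbers is left to the caller, since raw and normalized files may use different value conventions.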
Content
Content in GEO is described by the following record types:
- Platforms
- Series
- Samples
- Datasets
- Profiles: GEO profiles are expression patterns for specific genes over a dataset.
Platforms
A platform describes the physical setup of the assay. For example, a platform might describe a specific product, such as the Affymetrix GeneChip E.coli Genome 2.0 Array. GEO platform accessions start with GPL.
Samples
Samples are the individual array measurements. Sample accessions begin with GSM.
Series
Series are sets of samples. GEO Series accessions begin with GSE. Series are submitted by users.
Datasets
Datasets are curated by GEO curators at NCBI.
A DataSet represents a curated collection of biologically and statistically comparable GEO Samples and forms the basis of GEO's suite of data display and analysis tools. Samples within a DataSet refer to the same Platform, that is, they share a common set of array elements. Value measurements for each Sample within a DataSet are assumed to be calculated in an equivalent manner, that is, considerations such as background processing and normalization are consistent across the DataSet. Information reflecting experimental factors is provided through DataSet subsets.
Note that not all series are in datasets due to curation backlogs.
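The accession prefixes described above (GPL, GSM, GSE, GDS) make it easy to tell record types apart in a script. The following Python sketch maps an accession to its record type. It assumes that an accession is simply one of these prefixes followed by digits; the function name classify_accession and the example accessions are hypothetical.

```python
import re

# Prefix-to-record-type mapping, following the descriptions in this section
ACCESSION_KINDS = {
    "GPL": "platform",
    "GSM": "sample",
    "GSE": "series",
    "GDS": "dataset",
}

# Assumption: an accession is a known prefix followed only by digits
_ACCESSION_RE = re.compile(r"^(GPL|GSM|GSE|GDS)(\d+)$")


def classify_accession(accession):
    """Return (record type, numeric id) for a GEO-style accession string."""
    match = _ACCESSION_RE.match(accession.strip().upper())
    if match is None:
        raise ValueError(f"not a recognized GEO accession: {accession!r}")
    prefix, number = match.groups()
    return ACCESSION_KINDS[prefix], int(number)


print(classify_accession("GSE0001"))   # a series accession
print(classify_accession("gpl0002"))   # lower case is accepted as well
```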
Using GEO
An overview of how to use GEO is available on the GEO website.
Browsing and Searching
GEO can be searched from either the GEO search page or from the Entrez home page (use the pulldown menu to select GEO datasets or GEO profiles).
Datasets
GEO datasets include a variety of built-in analysis tools, such as views of hierarchical clustering within a specific dataset.
Profiles
Profiles show the expression of individual genes in a dataset. When viewing a gene profile, you can click on "Profile neighbors" above the graphic representation of the profile. This will return genes with similar profiles within the dataset.
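To make the idea of "similar profiles" concrete, the following Python sketch ranks genes by the Pearson correlation of their expression values with a query gene. This is only an illustration of one common similarity measure; it is an assumption, not a description of how GEO itself computes profile neighbors. The function names, gene names, and values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length value lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A flat profile has no defined correlation; treat it as unrelated
        return 0.0
    return cov / sqrt(var_x * var_y)


def profile_neighbors(profiles, query, top=5):
    """Return up to `top` (gene, correlation) pairs most similar to `query`."""
    target = profiles[query]
    scored = [
        (gene, pearson(target, values))
        for gene, values in profiles.items()
        if gene != query
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top]


# Hypothetical expression values over four samples of one dataset
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [1.0, 3.0, 2.0, 4.0],
}

neighbors = profile_neighbors(profiles, "geneA", top=3)
for gene, r in neighbors:
    print(gene, round(r, 2))
```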
Usage examples
Add links to additional pages describing success stories here.
Technology
Web Services/API
GEO is queryable through the NCBI EUtils system. Brief documentation is provided at the GEO programmatic access page. Additional query documentation needed.
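As a starting point, the following Python sketch builds an ESearch request against the E-utilities and, optionally, fetches the matching record IDs. It assumes the standard E-utilities base URL, the database name gds for GEO DataSets, and the JSON return mode. The function names build_esearch_url and search_geo are hypothetical helpers, and the search term is only an example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: the standard NCBI E-utilities base URL
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(term, db="gds", retmax=20):
    """Build an ESearch URL; db="gds" is assumed to select GEO DataSets."""
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def search_geo(term, db="gds", retmax=20):
    """Run the search over the network and return the list of matching IDs."""
    with urlopen(build_esearch_url(term, db, retmax)) as response:
        payload = json.load(response)
    return payload["esearchresult"]["idlist"]


# Example term: series records mentioning yeast (illustrative only)
url = build_esearch_url("yeast AND gse[ETYP]", retmax=5)
print(url)
```

Calling search_geo requires network access and should respect NCBI's usage guidelines for request frequency; the IDs it returns can then be passed to other E-utilities calls such as ESummary.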