GEO

From BITS wiki

GEO, the Gene Expression Omnibus, is the public repository for microarray experiment data at NCBI. GEO contains:

  • Microarray and other transcriptome data in MIAME compliant formats
  • ChIP-chip data


Microarray experiments at GEO come as GDS (datasets) or GSE (series). GSE series have not (yet) been processed by GEO: all information about the experiment, the chips used, and the signal intensity values is available as raw data. GDS datasets have been processed by GEO, and normalization has been performed.

You can download GEO data in SOFT format, which contains gene expression data together with experiment data. For GSE series, the SOFT files contain all raw intensity values for each sample (chip) used; for GDS datasets, all intensity values are normalized across the experiment, offering ready-to-use data.
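To make the SOFT layout more concrete, here is a minimal parsing sketch. It assumes the usual SOFT line conventions: lines starting with "^" open a new entity, "!" lines carry attributes, "#" lines describe table columns, and the data table sits between "..._table_begin" and "..._table_end" markers. The function name parse_soft, the record layout, and the accessions GSM_EXAMPLE and GPL_EXAMPLE are hypothetical and only used for illustration; real files may need more careful handling.

```python
def parse_soft(text):
    """Parse SOFT-formatted text into a list of entity records (sketch)."""
    records = []
    current = None
    in_table = False
    for line in text.splitlines():
        if not line.strip():
            continue
        if line.startswith("^"):
            # '^' opens a new entity, e.g. "^SAMPLE = <accession>"
            kind, _, name = line[1:].partition("=")
            current = {
                "type": kind.strip(),
                "name": name.strip(),
                "attributes": {},   # attribute name -> list of values
                "columns": {},      # column name -> description
                "header": [],       # column names of the data table
                "rows": [],         # data table rows as lists of strings
            }
            records.append(current)
            in_table = False
        elif current is None:
            continue  # ignore anything before the first entity
        elif line.startswith("!"):
            key, _, value = line[1:].partition("=")
            key = key.strip()
            if key.lower().endswith("_table_begin"):
                in_table = True
            elif key.lower().endswith("_table_end"):
                in_table = False
            else:
                current["attributes"].setdefault(key, []).append(value.strip())
        elif line.startswith("#"):
            column, _, description = line[1:].partition("=")
            current["columns"][column.strip()] = description.strip()
        elif in_table:
            fields = line.split("\t")
            if not current["header"]:
                current["header"] = fields  # first table line is the header
            else:
                current["rows"].append(fields)
    return records


# Hypothetical, hand-written SOFT fragment (not real GEO data).
EXAMPLE_SOFT = "\n".join([
    "^SAMPLE = GSM_EXAMPLE",
    "!Sample_title = demo chip",
    "!Sample_platform_id = GPL_EXAMPLE",
    "#ID_REF = probe identifier",
    "#VALUE = signal intensity",
    "!sample_table_begin",
    "ID_REF\tVALUE",
    "probe_1\t12.5",
    "probe_2\t7.25",
    "!sample_table_end",
])

if __name__ == "__main__":
    for record in parse_soft(EXAMPLE_SOFT):
        print(record["type"], record["name"], len(record["rows"]), "rows")
```

All values are kept as strings; converting the intensity column to numbers is left to the caller, since the meaning of each column depends on the platform and the submitter.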

Content

Content in GEO is described by the following record types:

  • Platforms
  • Series
  • Samples
  • Datasets
  • Profiles: GEO profiles are expression patterns for specific genes over a dataset.


Platforms

A platform describes the physical setup of the assay. For example, a platform might describe a specific product, such as the Affymetrix GeneChip E. coli Genome 2.0 Array. GEO platform accessions start with GPL.

Samples

Samples are the individual array measurements. Sample accessions begin with GSM.

Series

Series are sets of samples. GEO Series accessions begin with GSE. Series are submitted by users.

Datasets

Datasets are curated by GEO curators at NCBI.

A DataSet represents a curated collection of biologically and statistically comparable GEO Samples and forms the basis of GEO's suite of data display and analysis tools. Samples within a DataSet refer to the same Platform, that is, they share a common set of array elements. Value measurements for each Sample within a DataSet are assumed to be calculated in an equivalent manner, that is, considerations such as background processing and normalization are consistent across the DataSet. Information reflecting experimental factors is provided through DataSet subsets.

Note that, because of curation backlogs, not all series have been turned into datasets.
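The accession prefixes mentioned above (GPL, GSM, GSE, GDS) are enough to tell the record types apart. The following is a small sketch of such a lookup; the function name classify_accession and the assumption that an accession is a prefix followed only by digits are illustrative choices, not something defined by GEO in this text.

```python
import re

# Prefix -> record type, as described in the sections above.
PREFIXES = {
    "GPL": "platform",
    "GSM": "sample",
    "GSE": "series",
    "GDS": "dataset",
}

# Assumption: an accession is one of the prefixes followed by digits only.
_ACCESSION = re.compile(r"^(GPL|GSM|GSE|GDS)(\d+)$")


def classify_accession(accession):
    """Return the GEO record type for an accession, or None if unrecognized."""
    match = _ACCESSION.match(accession.strip().upper())
    if match is None:
        return None
    return PREFIXES[match.group(1)]


if __name__ == "__main__":
    for example in ["GPL1", "GSM2", "GSE3", "GDS4", "XYZ5"]:
        print(example, "->", classify_accession(example))
```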

Using GEO

An overview of how to use GEO is available on the GEO website.

Browsing and Searching

GEO can be searched from either the GEO search page or from the Entrez home page (use the pulldown menu to select GEO datasets or GEO profiles).

Datasets

GEO datasets include a variety of built-in analysis tools, such as views of hierarchical clustering within a specific dataset.

Profiles

Profiles show the expression of individual genes in a dataset. When viewing a gene profile, you can click on "Profile neighbors" above the graphic representation of the profile. This will return genes with similar profiles within the dataset.
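The idea of "genes with similar profiles" can be illustrated with a short sketch. The text above does not say how GEO measures similarity, so the use of Pearson correlation here is an assumption, and the function names and the toy expression values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # a flat profile has no defined correlation; treat as 0
    return covariance / (spread_x * spread_y)


def profile_neighbors(profiles, gene, top=3):
    """Rank the other genes by correlation with the profile of `gene`."""
    target = profiles[gene]
    scored = [
        (other, pearson(target, values))
        for other, values in profiles.items()
        if other != gene
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top]


# Toy data: one expression value per sample, four samples per gene.
TOY_PROFILES = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [1.0, 1.0, 2.0, 1.0],
}

if __name__ == "__main__":
    for name, score in profile_neighbors(TOY_PROFILES, "geneA"):
        print(name, round(score, 3))
```

Correlation compares the shape of two profiles rather than their absolute levels, which is why geneB (twice the values of geneA) ranks first in the toy data.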

Usage examples

Add links to additional pages describing success stories here.

Other sites with related content

Technology

Web Services/API

GEO is queryable through the NCBI EUtils system. Brief documentation is provided at the GEO programmatic access page. Additional query documentation needed.
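As a starting point, here is a minimal sketch of a GEO query through EUtils using only the Python standard library. It assumes the ESearch endpoint at eutils.ncbi.nlm.nih.gov, the Entrez database names "gds" (GEO DataSets) and "geoprofiles" (GEO Profiles), and the JSON layout returned with retmode=json. The function names are hypothetical, and details such as API keys and rate limits are left out; check the GEO programmatic access page before relying on it.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_geo_search_url(term, database="gds", retmax=20):
    """Build an ESearch URL for a GEO database ("gds" or "geoprofiles")."""
    params = {
        "db": database,
        "term": term,
        "retmax": retmax,
        "retmode": "json",
    }
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def search_geo(term, database="gds", retmax=20):
    """Run the search and return the list of matching Entrez UIDs.

    Requires network access; assumes the JSON response contains
    an "esearchresult" object with an "idlist" entry.
    """
    url = build_geo_search_url(term, database=database, retmax=retmax)
    with urlopen(url) as response:
        payload = json.load(response)
    return payload["esearchresult"]["idlist"]


if __name__ == "__main__":
    # Only prints the URL; call search_geo(...) to actually query NCBI.
    print(build_geo_search_url("escherichia coli"))
```

The returned identifiers are Entrez UIDs, not GEO accessions; a follow-up ESummary call on the same database would be needed to map them to records.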