.gff

From BITS wiki

sources

http://genome.ucsc.edu/goldenPath/help/customTrack.html#GFF
http://www.sequenceontology.org/gff3.shtml
http://www.sanger.ac.uk/Software/formats/GFF

format

GFF stands for General Feature Format and aims to be a standard way of representing annotations on genomic sequences. Each GFF line has exactly nine required fields, which must be tab-separated.

Here is a brief description of the GFF fields:

  1. seqname - The name of the sequence. Must be a chromosome or scaffold.
  2. source - The program that generated this feature.
  3. feature - The name of this type of feature. Some examples of standard feature types are "CDS", "start_codon", "stop_codon", and "exon".
  4. start - The starting position of the feature in the sequence. The first base is numbered 1.
  5. end - The ending position of the feature (inclusive).
  6. score - A score between 0 and 1000. If the track line useScore attribute is set to 1 for this annotation data set, the score value will determine the level of gray in which this feature is displayed (higher numbers = darker gray). If there is no score value, enter ".".
  7. strand - Valid entries are '+', '-', or '.' (for don't know/don't care).
  8. frame - If the feature is a coding exon, frame should be a number from 0 to 2 that represents the reading frame of the first base. If the feature is not a coding exon, the value should be '.'.
  9. group - All lines with the same group are linked together into a single item.
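The field list above can be turned into a small parser. The following is a minimal sketch, not a reference implementation: the names `GffRecord` and `parse_gff_line` are hypothetical, and it assumes the nine-column layout described on this page, with "." standing for a missing score or frame.

```python
from typing import NamedTuple, Optional


class GffRecord(NamedTuple):
    """One GFF data line (hypothetical container, named after the fields above)."""
    seqname: str
    source: str
    feature: str
    start: int               # 1-based position of the first base
    end: int                 # inclusive end position
    score: Optional[float]   # None when the column holds "."
    strand: str              # "+", "-" or "."
    frame: Optional[int]     # None when the column holds "."
    group: str


def parse_gff_line(line: str) -> GffRecord:
    """Split one GFF data line into its nine typed fields."""
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 tab-separated fields, got {len(fields)}")
    seqname, source, feature, start, end, score, strand, frame, group = fields
    if strand not in ("+", "-", "."):
        raise ValueError(f"invalid strand: {strand!r}")
    return GffRecord(
        seqname=seqname,
        source=source,
        feature=feature,
        start=int(start),
        end=int(end),
        score=None if score == "." else float(score),
        strand=strand,
        frame=None if frame == "." else int(frame),
        group=group,
    )


record = parse_gff_line("chr22\tTeleGene\tenhancer\t10000000\t10001000\t500\t+\t.\tTG1")
# Because start is 1-based and end is inclusive, the length is end - start + 1.
length = record.end - record.start + 1
```

Note that the sketch only checks the field count and the strand value; it does not check that the score lies between 0 and 1000 or that the frame lies between 0 and 2.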

If the fields are separated by spaces instead of tabs, the track will not display correctly in genome browsers. Version three of the format (GFF3) is currently in use. If you have a GFF file that cannot be displayed, you may want to validate it with the following tool: http://modencode.oicr.on.ca/cgi-bin/validate_gff3_online
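Before turning to an online validator, a quick local check can catch the most common problem, spaces used in place of tabs. The sketch below is only an illustration: `find_bad_lines` is a hypothetical helper, and it assumes that lines starting with "#", "browser " or "track " are not data lines and can be skipped.

```python
def find_bad_lines(lines):
    """Return (line_number, reason) pairs for data lines without nine tab-separated fields."""
    problems = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\r\n")
        # Skip blank lines, comments, and browser/track configuration lines.
        if not text.strip() or text.startswith(("#", "browser ", "track ")):
            continue
        n_tab_fields = len(text.split("\t"))
        if n_tab_fields == 9:
            continue
        if len(text.split()) == 9:
            # Nine whitespace-separated tokens but not nine tab-separated ones.
            reason = "fields appear to be separated by spaces, not tabs"
        else:
            reason = f"found {n_tab_fields} tab-separated fields, expected 9"
        problems.append((number, reason))
    return problems


sample = [
    "track name=regulatory",
    "chr22\tTeleGene\tenhancer\t10000000\t10001000\t500\t+\t.\tTG1",
    "chr22 TeleGene promoter 10010000 10010100 900 + . TG1",
    "chr22\tTeleGene",
]
problems = find_bad_lines(sample)
```

To check a file, pass an open file handle, for example `find_bad_lines(open("my.gff"))`, where `my.gff` is a placeholder name. This only tests the column separators; it does not replace a full format validator.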

Example

browser position chr22:10000000-10020000
browser hide all
track name=regulatory description="TeleGene(tm) Regulatory Regions" visibility=2
chr22 TeleGene enhancer 10000000 10001000 500 + . TG1
chr22 TeleGene promoter 10010000 10010100 900 + . TG1
chr22 TeleGene promoter 10020000 10025000 800 - . TG2
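
The example above has two groups, TG1 and TG2, and lines sharing a group are linked into one item. As a rough sketch of that linking step, the code below collects the features by their ninth field. The helper `group_features` is hypothetical, and the data lines are rewritten here with explicit tab characters, which is an assumption about how the example is meant to be stored.

```python
from collections import defaultdict

# The three data lines from the example, written with explicit tabs.
EXAMPLE = (
    "browser hide all\n"
    "chr22\tTeleGene\tenhancer\t10000000\t10001000\t500\t+\t.\tTG1\n"
    "chr22\tTeleGene\tpromoter\t10010000\t10010100\t900\t+\t.\tTG1\n"
    "chr22\tTeleGene\tpromoter\t10020000\t10025000\t800\t-\t.\tTG2\n"
)


def group_features(text):
    """Map each group name to its list of (feature, start, end) tuples."""
    groups = defaultdict(list)
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) != 9:
            continue  # skip browser/track lines and anything malformed
        feature, start, end, group = fields[2], int(fields[3]), int(fields[4]), fields[8]
        groups[group].append((feature, start, end))
    return dict(groups)


groups = group_features(EXAMPLE)
```

Here `groups["TG1"]` holds the enhancer and the first promoter, and `groups["TG2"]` holds the remaining promoter.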