Bowtie2


Bowtie2 is an ultrafast and memory-efficient tool for aligning sequencing reads to long reference sequences. It is particularly good at aligning reads of about 50 up to 100s or 1,000s of characters, and particularly good at aligning to relatively long (e.g. mammalian) genomes. Bowtie 2 indexes the genome with an FM Index to keep its memory footprint small: for the human genome, its memory footprint is typically around 3.2 GB. Bowtie 2 supports gapped, local, and paired-end alignment modes.
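
To illustrate why an FM Index allows compact exact matching, here is a minimal sketch of a Burrows-Wheeler transform and backward search in Python. It is a teaching example under simplifying assumptions (naive rotation sort, full occurrence table), not Bowtie 2's actual implementation; the function names are hypothetical.

```python
def bwt_from_text(text):
    """Return the Burrows-Wheeler transform of text, using '$' as sentinel."""
    text = text + "$"
    # Naive construction: sort all rotations (fine for toy inputs only).
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return "".join(rotation[-1] for rotation in rotations)


def build_fm_index(bwt):
    """Build the two tables needed for backward search.

    first[c]  = number of characters in bwt that sort before c
    occ[c][i] = number of occurrences of c in bwt[:i]
    """
    counts = {}
    for ch in bwt:
        counts[ch] = counts.get(ch, 0) + 1

    first = {}
    total = 0
    for ch in sorted(counts):
        first[ch] = total
        total += counts[ch]

    # Assumption: a full table; real indexes store sampled checkpoints instead.
    occ = {ch: [0] * (len(bwt) + 1) for ch in counts}
    for i, ch in enumerate(bwt):
        for c in occ:
            occ[c][i + 1] = occ[c][i] + (1 if c == ch else 0)
    return first, occ


def count_occurrences(pattern, first, occ, n):
    """Count exact matches of pattern by narrowing a half-open row range."""
    lo, hi = 0, n
    for ch in reversed(pattern):
        if ch not in first:
            return 0
        lo = first[ch] + occ[ch][lo]
        hi = first[ch] + occ[ch][hi]
        if lo >= hi:
            return 0
    return hi - lo


reference = "GATTACAGATTACA"
bwt = bwt_from_text(reference)
first, occ = build_fm_index(bwt)
hits = count_occurrences("ATTA", first, occ, len(bwt))
```

The search touches the pattern once per character and never scans the reference itself, which is the property that makes this family of indexes attractive for large genomes.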

Similar to: Bwa

Suggests: bowtie


[ BioWare | Main_Page ]


Please refer to the [bowtie2 manual] for more detailed information.
The code was formerly distributed via SourceForge (https://sourceforge.net/projects/bowtie-bio/files/bowtie2/2.2.1 [1]) and has since moved to GitHub (https://github.com/BenLangmead/bowtie2 [2]).

Bowtie1 vs Bowtie2 (from the manual)

How is Bowtie 2 different from Bowtie 1?

Bowtie 1 was released in 2009 and was geared toward aligning the relatively short sequencing reads (up to 25-50 nucleotides) prevalent at the time. Since then, technology has improved both sequencing throughput (more nucleotides produced per sequencer per day) and read length (more nucleotides per read).

The chief differences between Bowtie 1 and Bowtie 2 are:

  1. For reads longer than about 50 bp Bowtie 2 is generally faster, more sensitive, and uses less memory than Bowtie 1. For relatively short reads (e.g. less than 50 bp) Bowtie 1 is sometimes faster and/or more sensitive.
  2. Bowtie 2 supports gapped alignment with affine gap penalties. Number of gaps and gap lengths are not restricted, except by way of the configurable scoring scheme. Bowtie 1 finds just ungapped alignments.
  3. Bowtie 2 supports local alignment, which doesn’t require reads to align end-to-end. Local alignments might be “trimmed” (“soft clipped”) at one or both extremes in a way that optimizes alignment score. Bowtie 2 also supports end-to-end alignment which, like Bowtie 1, requires that the read align entirely.
  4. There is no upper limit on read length in Bowtie 2. Bowtie 1 had an upper limit of around 1000 bp.
  5. Bowtie 2 allows alignments to overlap ambiguous characters (e.g. Ns) in the reference. Bowtie 1 does not.
  6. Bowtie 2 does away with Bowtie 1’s notion of alignment “stratum”, and its distinction between “Maq-like” and “end-to-end” modes. In Bowtie 2 all alignments lie along a continuous spectrum of alignment scores, where the scoring scheme is similar to those of Needleman-Wunsch and Smith-Waterman.
  7. Bowtie 2’s paired-end alignment is more flexible. E.g. for pairs that do not align in a paired fashion, Bowtie 2 attempts to find unpaired alignments for each mate.
  8. Bowtie 2 reports a spectrum of mapping qualities, in contrast to Bowtie 1, which reports either 0 or high.
  9. Bowtie 2 does not align colorspace reads.
  10. Bowtie 2 is not a “drop-in” replacement for Bowtie 1. Bowtie 2’s command-line arguments and genome index format are both different from Bowtie 1’s.
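
Points 2 and 3 above (affine gap penalties and soft clipping in local mode) can be made concrete with a small Python sketch. The penalty formula (opening cost plus a per-position cost) is one common affine convention, and the numeric values (match bonus 2, mismatch penalty 6, gap open 5, gap extend 3) are illustrative assumptions rather than values stated on this page. The clipping function only handles a fixed, ungapped placement of the read; it is not Bowtie 2's aligner.

```python
def affine_gap_penalty(gap_length, gap_open, gap_extend):
    """Penalty for a single gap: a fixed opening cost plus a cost per position."""
    if gap_length <= 0:
        return 0
    return gap_open + gap_length * gap_extend


def column_scores(read, ref_window, match_bonus, mismatch_penalty):
    """Score each column of an ungapped read/reference placement."""
    return [
        match_bonus if r == g else -mismatch_penalty
        for r, g in zip(read, ref_window)
    ]


def best_local_segment(scores):
    """Return (score, start, end) of the best-scoring contiguous run.

    end is exclusive; read positions outside [start, end) would be
    soft clipped in a local alignment.
    """
    best_score, best_start, best_end = 0, 0, 0
    run_score, run_start = 0, 0
    for i, s in enumerate(scores):
        if run_score <= 0:
            # A non-positive prefix never helps, so restart the run here.
            run_score, run_start = 0, i
        run_score += s
        if run_score > best_score:
            best_score, best_start, best_end = run_score, run_start, i + 1
    return best_score, best_start, best_end


# Hypothetical read whose first and last two bases disagree with the reference.
read = "CCACGTACGTGG"
ref_window = "TTACGTACGTAA"
scores = column_scores(read, ref_window, match_bonus=2, mismatch_penalty=6)

end_to_end_score = sum(scores)              # every base must align
local_score, clip_start, clip_end = best_local_segment(scores)
three_base_gap = affine_gap_penalty(3, gap_open=5, gap_extend=3)
```

With these assumed values the end-to-end score is dragged down by the four mismatching end bases, while the local score keeps only the eight matching middle bases and clips two bases from each end.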

What isn’t Bowtie 2?

  1. Bowtie 1 and Bowtie 2 are not general-purpose alignment tools like MUMmer, BLAST or Vmatch.
  2. Bowtie 2 works best when aligning to large genomes, though it supports arbitrarily small reference sequences (e.g. amplicons). It handles very long reads (i.e. upwards of 10s or 100s of kilobases), but it is optimized for the read lengths and error modes yielded by recent sequencers, such as the Illumina HiSeq 2000, Roche 454, and Ion Torrent instruments.
  3. If your goal is to align two very large sequences (e.g. two genomes), consider using MUMmer. If your goal is very sensitive alignment to a relatively short reference sequence (e.g. a bacterial genome), this can be done with Bowtie 2 but you may want to consider using tools like NUCmer, BLAT, or BLAST. These tools can be extremely slow when the reference genome is long, but are often adequate when the reference is short.
  4. Bowtie 2 does not support alignment of colorspace reads. This might be supported in future versions.
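
For the sensitive-alignment case in point 3, a command line might be assembled as in the following Python sketch. The options used (-x, -U, -1, -2, -S, -p, --local, --very-sensitive, --very-sensitive-local) are standard bowtie2 options; the helper name and all file names are hypothetical, and the index is assumed to have been built beforehand with bowtie2-build. The sketch only builds the argument list and does not run anything.

```python
def build_bowtie2_command(index_prefix, sam_out, unpaired=None,
                          mate1=None, mate2=None,
                          local=False, very_sensitive=False, threads=1):
    """Return a bowtie2 argument list suitable for subprocess.run()."""
    cmd = ["bowtie2", "-x", index_prefix]

    if mate1 and mate2:
        # Paired-end input: one file per mate.
        cmd += ["-1", mate1, "-2", mate2]
    elif unpaired:
        cmd += ["-U", unpaired]
    else:
        raise ValueError("provide either unpaired reads or both mate files")

    if local:
        cmd.append("--local")
    if very_sensitive:
        # The preset name differs between end-to-end and local mode.
        cmd.append("--very-sensitive-local" if local else "--very-sensitive")
    if threads > 1:
        cmd += ["-p", str(threads)]

    cmd += ["-S", sam_out]
    return cmd


# Hypothetical bacterial reference index and read file.
command = build_bowtie2_command(
    "ecoli_index", "reads.sam", unpaired="reads.fq", very_sensitive=True
)
```

Passing the resulting list to subprocess.run(command, check=True) would launch the alignment, assuming bowtie2 is installed and on the PATH.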


The current version of Illumina's iGenomes Cufflinks annotation files was obtained here [3] and is documented here [4]. Importantly, these annotation files are augmented with the tss_id and p_id GTF attributes that Cufflinks needs to perform differential splicing, CDS output, and promoter use analysis.
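
To check whether a given annotation record carries these attributes, something like the following Python sketch could be used. It assumes the usual GTF layout (nine tab-separated columns, with the last column holding key "value"; pairs); the function names and the example record are hypothetical.

```python
import re

# Matches one attribute of the form: key "value"
ATTRIBUTE_PATTERN = re.compile(r'(\w+)\s+"([^"]*)"')


def parse_gtf_attributes(attribute_field):
    """Turn the ninth GTF column into a dict of attribute names to values."""
    return dict(ATTRIBUTE_PATTERN.findall(attribute_field))


def missing_cufflinks_attributes(gtf_line, required=("tss_id", "p_id")):
    """Return the required attribute names that a GTF record lacks."""
    fields = gtf_line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError("expected 9 tab-separated GTF columns")
    attributes = parse_gtf_attributes(fields[8])
    return [name for name in required if name not in attributes]


# Hypothetical record for illustration only.
example = "\t".join([
    "chr1", "example_source", "exon", "1000", "1200", ".", "+", ".",
    'gene_id "GENE1"; transcript_id "TX1"; tss_id "TSS1"; p_id "P1";',
])
missing = missing_cufflinks_attributes(example)
```

An empty list means the record has both attributes; any names returned are the ones an annotation file would still need.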


References:
  1. https://sourceforge.net/projects/bowtie-bio/files/bowtie2/2.2.1
  2. https://github.com/BenLangmead/bowtie2
  3. http://cufflinks.cbcb.umd.edu/igenomes.html
  4. ftp://igenome:G3nom3s4u@ussd-ftp.illumina.com/README.txt


