From BITS wiki

Structural variant and breakpoint detection


BreakDancer (publication: [1], download: [2]). Documentation is available at [3].

BreakDancerMax predicts five types of structural variants (insertions, deletions, inversions, and inter- and intra-chromosomal translocations) from next-generation short paired-end sequencing reads, using read pairs that map with unexpected separation distances or orientations.

BreakDancer Pipeline

The following example is reproduced from the tutorial page

Create a configuration file using arguments

Usage: bam2cfg.pl <bam files>
         -q INT    Minimum mapping quality [35]
         -m        Using mapping quality instead of alternative mapping quality
         -s        Minimal mean insert size [50]
         -C        Change default system from Illumina to SOLiD
         -c FLOAT  Cutoff in unit of standard deviation [4]
         -n INT    Number of observation required to estimate mean and s.d. insert size [10000]
         -v FLOAT  Cutoff on coefficients of variation [1]
         -f STRING A two column tab-delimited text file (RG, LIB) specify the RG=>LIB mapping, useful when BAM header is incomplete
         -b INT    Number of bins in the histogram [50]
         -g        Output mapping flag distribution
         -h        Plot insert size histogram for each BAM library

# bam2cfg now only has the perl version.
bam2cfg.pl -g -h tumor.bam normal.bam > BRC6.cfg

Manually inspect the insert size and flag distribution results in BRC6.cfg for data quality issues. The ratio std/mean should usually be below 0.2, or 0.3 at most. The flag entry 32(x%) gives the percentage of chimeric inserts; x should usually be smaller than 3%.
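
The manual check above can also be scripted. The following is a minimal sketch, not part of BreakDancer itself. It assumes that each line of BRC6.cfg is a tab-delimited list of key:value fields including mean, std and (when -g was used) a flag field of the form 18(99.10%)32(0.62%). The function names, the example line and the thresholds passed as defaults are illustrative only; compare the field names with your own cfg file before relying on it.

```python
import re


def parse_cfg_line(line):
    """Split one tab-delimited cfg line into a dict of key -> value strings."""
    fields = {}
    for item in line.rstrip("\n").split("\t"):
        key, sep, value = item.partition(":")
        if sep:  # skip items that are not key:value
            fields[key] = value
    return fields


def check_library(line, max_cv=0.3, max_chimeric_pct=3.0):
    """Return (std/mean, percent of flag 32, passes_both_checks) for one library."""
    fields = parse_cfg_line(line)
    cv = float(fields["std"]) / float(fields["mean"])
    # Assumed flag layout: "<flag>(<percent>%)" repeated without separators.
    flags = {
        int(flag): float(pct)
        for flag, pct in re.findall(r"(\d+)\(([\d.]+)%\)", fields.get("flag", ""))
    }
    chimeric = flags.get(32, 0.0)
    ok = cv < max_cv and chimeric < max_chimeric_pct
    return cv, chimeric, ok


# Hypothetical cfg line, for illustration only.
example = "\t".join([
    "readgroup:1", "platform:illumina", "map:tumor.bam", "readlen:36.00",
    "lib:tumorlib", "num:10001", "lower:86.83", "upper:443.91",
    "mean:315.09", "std:43.92",
    "flag:0(0.02%)18(97.37%)32(0.62%)",
    "exe:samtools view",
])

if __name__ == "__main__":
    cv, chimeric, ok = check_library(example)
    print(f"std/mean={cv:.3f} chimeric={chimeric:.2f}% ok={ok}")
```

To check a real file, loop over its lines and call check_library on each one; a library that fails either check deserves a closer look before moving on.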

View the png files showing the insert size distribution. You should usually see a normal distribution. A bimodal distribution is undesirable, and continuing to the BreakDancerMax step in that case is not recommended.

Detect inter-chromosomal translocations

breakdancer_max (cpp) arguments

Program: breakdancer_max <analysis.config>
Version: BreakDancerMax-1.1.2
       -o STRING       operate on a single chromosome [all chromosome]
       -s INT          minimum length of a region [7]
       -c INT          cutoff in unit of standard deviation [3]
       -m INT          maximum SV size [1000000000]
       -q INT          minimum mapping quality [25]
       -r INT          minimum number of read pairs required for a SV [2]
       -x INT          maximum threshold of haploid sequence coverage for regions to be ignored [1000]
       -b INT          buffer size for building connection [100]
       -t              only detect inter-chromosomal rearrangement, by default off
       -d STRING       prefix of fastq files that SV supporting reads will be saved by library
       -g STRING       dump SVs and supporting reads in BED format for GBrowse
       -l              analyze Illumina long insert (mate-pair) library
       -a              print out copy number and support reads per library rather than per bam, by default off
       -h              print out Allele Frequency column, by default off
       -y INT          output SVs with scores > [30]
# cpp version
breakdancer_max -t -q 10 -d BRC6.ctx BRC6.cfg > BRC6.ctx

Note: the -d option dumps CTX-supporting read pairs into fastq files, by library, using the given prefix (in this case BRC6.ctx).

Note: this step normally takes about 12 hours for three bam files, or about 8 hours for two bam files, with the cpp version, and around three days with the perl version.
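
Once the run has finished, the calls in BRC6.ctx can be filtered with a short script. The sketch below is illustrative, not part of BreakDancer. It assumes the tab-delimited output layout in which header lines start with # and the first ten columns are Chr1, Pos1, Orientation1, Chr2, Pos2, Orientation2, Type, Size, Score and num_Reads. The function name, the column keys and the sample records are hypothetical.

```python
import io

# Assumed order of the first ten output columns.
COLUMNS = [
    "chr1", "pos1", "orient1", "chr2", "pos2", "orient2",
    "type", "size", "score", "num_reads",
]


def read_sv_calls(handle, sv_type="CTX", min_score=30.0):
    """Return calls of the given type whose score is above min_score."""
    calls = []
    for line in handle:
        if line.startswith("#") or not line.strip():
            continue  # skip header and blank lines
        parts = line.rstrip("\n").split("\t")
        if len(parts) < len(COLUMNS):
            continue  # skip malformed lines
        rec = dict(zip(COLUMNS, parts))
        rec["pos1"] = int(rec["pos1"])
        rec["pos2"] = int(rec["pos2"])
        rec["size"] = int(rec["size"])
        rec["score"] = float(rec["score"])
        rec["num_reads"] = int(rec["num_reads"])
        if rec["type"] == sv_type and rec["score"] > min_score:
            calls.append(rec)
    return calls


# Hypothetical records, for illustration only.
sample = io.StringIO(
    "#Chr1\tPos1\tOrientation1\tChr2\tPos2\tOrientation2\tType\tSize\tScore\tnum_Reads\n"
    "1\t10000\t10+0-\t2\t20000\t0+10-\tCTX\t-296\t99\t10\n"
    "3\t5000\t2+0-\t7\t8000\t0+2-\tCTX\t-296\t20\t2\n"
    "4\t1000\t5+0-\t4\t3000\t0+5-\tDEL\t1900\t99\t5\n"
)

if __name__ == "__main__":
    for call in read_sv_calls(sample):
        print(call["chr1"], call["pos1"], call["chr2"], call["pos2"], call["score"])
```

For a real run, open BRC6.ctx with open("BRC6.ctx") and pass the file handle instead of the in-memory sample.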

  1. Ken Chen, John W Wallis, Michael D McLellan, David E Larson, Joelle M Kalicki, Craig S Pohl, Sean D McGrath, Michael C Wendl, Qunyuan Zhang, Devin P Locke, Xiaoqi Shi, Robert S Fulton, Timothy J Ley, Richard K Wilson, Li Ding, Elaine R Mardis
    BreakDancer: an algorithm for high-resolution mapping of genomic structural variation.
    Nat Methods: 2009, 6(9);677-81
    [PubMed:19668202]

