Breakdancer
Structural variant and breakpoint detection
Description
BreakDancer (publication: [1], download: [2]). Documentation can be found at http://breakdancer.sourceforge.net/pipeline.html [3].
BreakDancerMax predicts five types of structural variants (insertions, deletions, inversions, and inter- and intra-chromosomal translocations) from next-generation short paired-end sequencing reads, using read pairs that map with unexpected separation distances or orientations.
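The idea of flagging read pairs by unexpected separation or orientation can be illustrated with a small sketch. This is a simplified illustration and not BreakDancer's actual implementation: the `ReadPair` class, the `classify_pair` function, and the symmetric `mean ± cutoff * std` window are all assumptions made for the example, and intra-chromosomal translocations are left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class ReadPair:
    """Hypothetical, minimal view of one mapped read pair."""
    chrom1: str
    chrom2: str
    insert_size: int          # mapped distance between the two mates
    proper_orientation: bool  # True if the mates are oriented as the library expects


def classify_pair(pair, mean, std, cutoff=3.0):
    """Label a read pair as normal or as a candidate SV signature (simplified)."""
    # Mates on different chromosomes suggest an inter-chromosomal event.
    if pair.chrom1 != pair.chrom2:
        return "inter-chromosomal translocation candidate"
    # Unexpected orientation on the same chromosome suggests an inversion.
    if not pair.proper_orientation:
        return "inversion candidate"
    # Mates mapping too far apart suggest sequence missing from the sample.
    if pair.insert_size > mean + cutoff * std:
        return "deletion candidate"
    # Mates mapping too close together suggest extra sequence in the sample.
    if pair.insert_size < mean - cutoff * std:
        return "insertion candidate"
    return "normal"
```

For a library with mean insert size 300 and standard deviation 30, a pair mapped 600 bp apart falls outside the window (300 + 3 * 30 = 390) and would be labelled a deletion candidate in this sketch.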
BreakDancer Pipeline
The following example is reproduced from the tutorial page.
Create a configuration file using bam2cfg.pl
bam2cfg.pl arguments
Usage: bam2cfg.pl <bam files>
Options:
 -q INT Minimum mapping quality [35]
 -m Using mapping quality instead of alternative mapping quality
 -s Minimal mean insert size [50]
 -C Change default system from Illumina to SOLiD
 -c FLOAT Cutoff in unit of standard deviation [4]
 -n INT Number of observation required to estimate mean and s.d. insert size [10000]
 -v FLOAT Cutoff on coefficients of variation [1]
 -f STRING A two column tab-delimited text file (RG, LIB) specify the RG=>LIB mapping, useful when BAM header is incomplete
 -b INT Number of bins in the histogram [50]
 -g Output mapping flag distribution
 -h Plot insert size histogram for each BAM library
bam2cfg.pl -g -h tumor.bam normal.bam > BRC6.cfg # bam2cfg now only has the perl version.
Manually inspect the insert size and flag distribution results in BRC6.cfg for data quality issues. The ratio std/mean should usually be below 0.2, or 0.3 at most. The flag entry 32(x%) gives the percentage of chimeric inserts; x should usually be smaller than 3%.
View the png files for the insert size distribution. You should usually see a normal distribution. A bimodal distribution is undesirable, and continuing to the BreakDancerMax step is not recommended in that case.
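The manual checks on std/mean and on the 32(x%) flag entry could be automated along the following lines. This is a minimal sketch under stated assumptions: it assumes each library line in the config file consists of tab-separated `key:value` fields that include `mean` and `std`, and that the flag distribution contains a token of the form `32(x%)`. The function names and the default thresholds (0.3 and 3%) are chosen for illustration from the rules of thumb above; check them against your own BRC6.cfg before relying on the result.

```python
import re


def parse_cfg_line(line):
    """Split one config line into a dict, assuming tab-separated key:value fields."""
    fields = {}
    for item in line.rstrip("\n").split("\t"):
        key, sep, value = item.partition(":")
        if sep:  # skip items that contain no colon
            fields[key] = value
    return fields


def check_library(line, max_cv=0.3, max_chimeric_pct=3.0):
    """Return a list of warnings for one library line (empty if nothing was flagged)."""
    fields = parse_cfg_line(line)
    warnings = []

    # Coefficient of variation of the insert size: std divided by mean.
    cv = float(fields["std"]) / float(fields["mean"])
    if cv > max_cv:
        warnings.append(f"std/mean = {cv:.2f} exceeds {max_cv}")

    # Percentage attached to flag 32, if the line carries a flag distribution.
    match = re.search(r"(?<!\d)32\(([\d.]+)%\)", line)
    if match and float(match.group(1)) >= max_chimeric_pct:
        warnings.append(f"flag 32 = {match.group(1)}% is not below {max_chimeric_pct}%")

    return warnings
```

A possible use is to loop over the lines of BRC6.cfg, call `check_library` on each, and print any warnings before deciding whether to proceed to the BreakDancerMax step. The check does not replace looking at the histogram, since a bimodal distribution is not detected here.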
Detect inter-chromosomal translocations
breakdancer_max (cpp) arguments
Program: breakdancer_max <analysis.config>
Version: BreakDancerMax-1.1.2
Options:
 -o STRING operate on a single chromosome [all chromosome]
 -s INT minimum length of a region [7]
 -c INT cutoff in unit of standard deviation [3]
 -m INT maximum SV size [1000000000]
 -q INT minimum mapping quality [25]
 -r INT minimum number of read pairs required for a SV [2]
 -x INT maximum threshold of haploid sequence coverage for regions to be ignored [1000]
 -b INT buffer size for building connection [100]
 -t only detect inter-chromosomal rearrangement, by default off
 -d STRING prefix of fastq files that SV supporting reads will be saved by library
 -g STRING dump SVs and supporting reads in BED format for GBrowse
 -l analyze Illumina long insert (mate-pair) library
 -a print out copy number and support reads per library rather than per bam, by default off
 -h print out Allele Frequency column, by default off
 -y INT output SVs with scores > [30]
# cpp version
breakdancer_max -t -q 10 -d BRC6.ctx BRC6.cfg > BRC6.ctx
With the cpp version, this step normally takes about 12 hours for three bam files and about 8 hours for two bam files; the perl version takes around three days.
References:
- [1] Ken Chen, John W Wallis, Michael D McLellan, David E Larson, Joelle M Kalicki, Craig S Pohl, Sean D McGrath, Michael C Wendl, Qunyuan Zhang, Devin P Locke, Xiaoqi Shi, Robert S Fulton, Timothy J Ley, Richard K Wilson, Li Ding, Elaine R Mardis. BreakDancer: an algorithm for high-resolution mapping of genomic structural variation. Nat Methods: 2009, 6(9);677-81. [PubMed:19668202]
- [2] http://breakdancer.sourceforge.net
- [3] http://breakdancer.sourceforge.net/pipeline.html