VCFtools


The evolving standard framework for variant analysis

Similar to: GATK

Suggests: Bcftools




VCFtools[1] is a program package designed for working with VCF files, such as those generated by the 1000 Genomes Project. The aim of VCFtools is to provide easily accessible methods for working with complex genetic variation data in the form of VCF files.

VCFtools code and documentation are hosted at http://vcftools.sourceforge.net/[2]

This toolset can be used to perform the following operations on VCF files:

  • Filter out specific variants
  • Compare files
  • Summarize variants
  • Convert to different file types
  • Validate and merge files
  • Create intersections and subsets of variants

A mailing list is available where you can post questions and error reports; replies are usually quick.
To subscribe or unsubscribe, visit https://lists.sourceforge.net/lists/listinfo/vcftools-help [3]


VCFtools consists of two parts: a Perl module and a binary executable. The Perl module is a general Perl API for manipulating VCF files, whereas the binary executable provides general analysis routines.
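As a rough illustration of the analysis routines provided by the binary executable, the sketch below writes a tiny two-sample VCF and, only if a `vcftools` binary is found on the PATH, computes per-site allele frequencies and extracts chromosome 1. The toy data, the working directory and all file names are assumptions made up for this example; `--vcf`, `--freq`, `--chr`, `--recode` and `--out` are options of the vcftools binary.

```shell
#!/bin/sh
# Sketch only: the toy data and all file names are hypothetical.
workdir=$(mktemp -d)
toy="$workdir/toy.vcf"

# Write a minimal VCF 4.1 file; data columns must be tab-separated.
{
    printf '##fileformat=VCFv4.1\n'
    printf '##contig=<ID=1>\n##contig=<ID=2>\n'
    printf '##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">\n'
    printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\n'
    printf '1\t100\t.\tA\tG\t50\tPASS\t.\tGT\t0/1\t1/1\n'
    printf '1\t200\t.\tC\tT\t50\tPASS\t.\tGT\t0/0\t0/1\n'
    printf '2\t150\t.\tG\tA\t50\tPASS\t.\tGT\t0/1\t0/0\n'
} > "$toy"

if command -v vcftools >/dev/null 2>&1; then
    # Per-site allele frequencies, written to $workdir/toy.frq
    vcftools --vcf "$toy" --freq --out "$workdir/toy"
    # Keep only chromosome 1, written to $workdir/chr1.recode.vcf
    vcftools --vcf "$toy" --chr 1 --recode --out "$workdir/chr1"
else
    echo "vcftools binary not found; skipping the analysis steps" >&2
fi
```

The `command -v` guard lets the script run on a machine without VCFtools installed; in that case only the toy file is created.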

vcftools

VCFtools contains a Perl API (Vcf.pm) and a number of Perl scripts that can be used to perform common tasks with VCF files, such as file validation, merging, intersecting and taking complements. The Perl tools support all versions of the VCF specification (3.2, 3.3, 4.0, 4.1 and 4.2); nevertheless, users are encouraged to use the latest versions, VCFv4.1 or VCFv4.2. VCFtools in general has been used mainly with diploid data, but the Perl tools aim to support polyploid data as well.

Run any of the Perl scripts with the --help switch to obtain more help.

Tool page: http://vcftools.sourceforge.net/index.html [4]

Many of the Perl scripts require that the VCF files are compressed by bgzip and indexed by tabix (both tools are part of the tabix package, which is a separate download). The VCF files can be compressed and indexed using the following commands:

bgzip my_file.vcf
tabix -p vcf my_file.vcf.gz
 
# the tools
fill-aa
fill-an-ac
fill-fs
fill-ref-md5
fill-rsIDs
vcf-annotate
vcf-compare
vcf-concat
vcf-consensus
vcf-contrast
vcf-convert
vcf-filter
vcf-fix-newlines
vcf-fix-ploidy
vcf-indel-stats
vcf-isec
vcf-merge
vcf-phased-join
vcf-query
vcf-shuffle-cols
vcf-sort
vcf-stats
vcf-subset
vcf-to-tab
vcf-tstv
vcf-validator
 
# all are based on the perl module
Vcf.pm
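The compression and indexing commands above can be combined with the Perl scripts into a small workflow. The following is a minimal sketch under stated assumptions: the helper `make_toy_vcf`, the two one-sample toy files and all file names are hypothetical, and each step runs only if the required tool is installed. It compresses and indexes both files, validates one with `vcf-validator`, and merges them with `vcf-merge`.

```shell
#!/bin/sh
# Sketch only: the toy data, helper name and file names are hypothetical.
workdir=$(mktemp -d)

# Write a one-sample toy VCF.
# $1 = output path, $2 = sample name, $3 = genotype at the single site.
make_toy_vcf() {
    {
        printf '##fileformat=VCFv4.1\n'
        printf '##contig=<ID=1>\n'
        printf '##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">\n'
        printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\t%s\n' "$2"
        printf '1\t100\t.\tA\tG\t50\tPASS\t.\tGT\t%s\n' "$3"
    } > "$1"
}

make_toy_vcf "$workdir/A.vcf" SampleA 0/1
make_toy_vcf "$workdir/B.vcf" SampleB 1/1

if command -v bgzip >/dev/null 2>&1 && command -v tabix >/dev/null 2>&1; then
    for f in "$workdir/A.vcf" "$workdir/B.vcf"; do
        bgzip -c "$f" > "$f.gz"    # -c keeps the uncompressed original
        tabix -p vcf "$f.gz"       # creates the .tbi index next to the file
    done
    if command -v vcf-validator >/dev/null 2>&1; then
        vcf-validator "$workdir/A.vcf.gz"
    fi
    if command -v vcf-merge >/dev/null 2>&1; then
        # Merge the samples of both files and compress the result.
        vcf-merge "$workdir/A.vcf.gz" "$workdir/B.vcf.gz" \
            | bgzip -c > "$workdir/merged.vcf.gz"
    fi
else
    echo "bgzip/tabix not found; skipping compression and the Perl scripts" >&2
fi
```

Using `bgzip -c` instead of plain `bgzip` is a choice made for this sketch so that the uncompressed input stays in place; the plain form shown earlier replaces the file with its `.gz` version.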



References:
  1. Petr Danecek, Adam Auton, Goncalo Abecasis, Cornelis A Albers, Eric Banks, Mark A DePristo, Robert E Handsaker, Gerton Lunter, Gabor T Marth, Stephen T Sherry, Gilean McVean, Richard Durbin, 1000 Genomes Project Analysis Group
    The variant call format and VCFtools.
    Bioinformatics: 2011, 27(15);2156-8
    [PubMed:21653522]

  2. http://vcftools.sourceforge.net/
  3. https://lists.sourceforge.net/lists/listinfo/vcftools-help
  4. http://vcftools.sourceforge.net/index.html


