Licensing and Download
The release of VAAST 2.1.0 includes pVAAST and is now available for download.
In: Nature Biotechnology, 2014.
In: Cancer Discovery, pp. CD–14, 2014.
In: Current Protocols in Human Genetics, 2014.
In: Transl Psychiatry, 3, 2013.
pVAAST 5-Minute Guide for the Impatient
- Call the variants in case and control genomes to create VCF files. Ideally, case and control samples should be matched in a) ethnicity; b) sequencing platform; and c) variant calling pipeline. For best results, we also recommend jointly calling all case and control genomes with GATK UnifiedGenotyper. If no control genomes are available, publicly available exomes can be downloaded at: http://www.yandell-lab.org/software/VAAST/data/hg19/Background_CDR/
- Run the <VAAST>/bin/vaast_tools/vcf2cdr.pl script to convert multi-sample VCF file(s) to the CONDENSER (CDR) file format. (See the command line docs for more information.) This script creates one CDR file for each cohort, which can be unrelated cases/controls or families. An example of this step can be found at: <VAAST>/examples/vcf2cdr_example/vcf2cdr.sh
- Create the pedigree file (".ped" file; see http://pngu.mgh.harvard.edu/~purcell/plink/data.shtml#ped). Every family should have a separate pedigree file. For sequenced individuals, the IDs in the ".ped" file should match the IDs in the original VCF file (or the ## FILE-INDEX entries at the bottom of CDR files).
- Prepare the pVAAST parameter file. You can find several template parameter files in the <VAAST>/data/pvaast/ folder, each designed for a different type of family and disease model. At a minimum, the options in the "Basic Options" section should be changed. Other sections are non-essential but can improve performance.
- Run pVAAST. The basic command line is:
    VAAST -m pvaast -pv_control <parameter file> <GFF3 annotation file> <Control CDR file> --gw <max permutations>
  For genome-wide significance, a --gw value of at least 1e6 is recommended. An example bash script for this step can be found at: <VAAST>/examples/pvaast_example/pvaast.sh
- Any required external data files in this pipeline can be downloaded at: http://www.yandell-lab.org/software/VAAST/data/hg19/
- The ".simple" file provides a quick ranked list of protein coding genes. The ".vaast" file is the complete VAAST report.
- By default, pVAAST scores only nonsynonymous and null mutations. To enable support for indels and splice sites, add the --indel and --splice_site options to the pVAAST command line. CAUTION: enabling indels and splice sites may significantly inflate the false-positive rate when cases and controls are not matched.
- For more information or for advanced options, please see the command line documentation, download the VAAST documentation at http://www.yandell-lab.org/software/vaast.html, or read a preprint of our recent paper entitled “Identification of damaged genes and disease-causing alleles with VAAST.”
- IF YOU GET STUCK, WE WOULD LOVE TO HEAR FROM YOU AND HELP! Our mailing list is vaast-user@yandell-lab.org.
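
The pedigree step above asks that the IDs of sequenced individuals in the ".ped" file match the IDs in the original VCF file. A quick way to catch mismatches before running pVAAST is to compare the two lists. The following is a minimal sketch, not part of VAAST: the function names and the example trio are made up for illustration, and it assumes the standard PLINK layout for the first six columns (family ID, individual ID, father ID, mother ID, sex, phenotype). ID matching is case-sensitive here, which is an assumption about how pVAAST compares IDs.

```python
# Hypothetical helpers (not part of VAAST) for checking a pedigree file
# against the sample IDs in a VCF header.

def vcf_sample_ids(vcf_lines):
    """Return the sample IDs listed on the #CHROM header line of a VCF."""
    for line in vcf_lines:
        if line.startswith("#CHROM"):
            # The first nine columns are fixed VCF fields; samples follow.
            return line.rstrip("\n").split("\t")[9:]
    raise ValueError("no #CHROM header line found")


def ped_individual_ids(ped_lines):
    """Return the individual IDs (column 2) from PLINK-style pedigree lines."""
    ids = []
    for line in ped_lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) < 6:
            raise ValueError(f"expected at least 6 columns, got: {line!r}")
        ids.append(fields[1])
    return ids


def sequenced_ids_missing_from_ped(vcf_lines, ped_lines):
    """Return VCF sample IDs that do not appear in the pedigree, sorted."""
    missing = set(vcf_sample_ids(vcf_lines)) - set(ped_individual_ids(ped_lines))
    return sorted(missing)


# Made-up trio. Columns: family, individual, father, mother,
# sex (1 = male, 2 = female), phenotype (1 = unaffected, 2 = affected).
# A "0" in a parent column means that parent is not in the pedigree.
example_ped = [
    "FAM1 father 0 0 1 1",
    "FAM1 mother 0 0 2 1",
    "FAM1 child father mother 1 2",
]
# Made-up VCF header in which one sample ID does not match the pedigree.
example_vcf = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tmother\tchild\tChild2",
]

print(sequenced_ids_missing_from_ped(example_vcf, example_ped))
```

Only VCF samples missing from the pedigree are reported, because a pedigree may legitimately contain individuals who were not sequenced (here, the father). To check real files, pass open file handles in place of the example lists.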
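
When pVAAST is run for several families, it can help to assemble the command line programmatically. The sketch below only builds and prints the argument list from the basic command shown in this guide; it does not run VAAST. The function name and file names are hypothetical, and two details are assumptions rather than documented behavior: that --indel and --splice_site may be appended after --gw, and that the permutation count should be passed as a plain integer.

```python
import shlex


def build_pvaast_command(parameter_file, gff3_file, control_cdr,
                         max_permutations=1_000_000,
                         indel=False, splice_site=False):
    """Build the argument list for the basic pVAAST command in this guide.

    Hypothetical helper, not part of VAAST. Placing the optional flags
    after --gw is an assumption.
    """
    permutations = int(max_permutations)
    if permutations <= 0:
        raise ValueError("max_permutations must be positive")
    cmd = [
        "VAAST", "-m", "pvaast",
        "-pv_control", parameter_file,
        gff3_file,
        control_cdr,
        "--gw", str(permutations),
    ]
    if indel:
        cmd.append("--indel")
    if splice_site:
        cmd.append("--splice_site")
    return cmd


# Made-up file names, for illustration only.
cmd = build_pvaast_command("family_params.txt", "refGene.gff3", "controls.cdr")
print(shlex.join(cmd))
```

The default of 1,000,000 permutations follows the guide's recommendation for genome-wide significance. The resulting list can be passed to subprocess.run once the paths point at real files and VAAST is on the PATH.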