7.1 convertVCF
Converting a VCF file to other formats
convertVCF converts a VCF file to a variety of file formats. The available output formats are:
- beagle
- geno
- LFMM
- posfile
- genfile
Detailed descriptions of the file formats are provided here.
7.1.1 Input
Required inputs :
--vcf Input_VCF_file.vcf.gz
Input VCF file.
Optional inputs :
None
Specific Parameters :
--format format_to_be_converted_to
Specifies the output format. Options = beagle, geno, LFMM, posfile, genfile.
Optional Parameters :
Various filters can also be used during conversion. Filter parameters for VCF conversion are as follows:
--limitLines integer_value
Limits the number of lines read from the VCF file. Default = the entire VCF is parsed.
--regions *.bed
Limits the analysis to the regions defined in a BED file. Default = the entire VCF is parsed.
--filterDepth integer_value,integer_value
Keeps only the samples whose sample depth lies within the indicated range (inclusive). Default = all sites are kept regardless of depth.
--maxMissing numeric_value
Filters out sites that have more than the indicated fraction of data missing. numeric_value must be between 0 and 1 (inclusive). Default = sites are kept regardless of missingness.
--minMAF numeric_value
Keeps only sites whose minor allele frequency is at least the indicated value. Default = all sites are kept regardless of minor allele frequency.
--minVarQual numeric_value
Keeps only sites whose variant quality is at least the indicated value. Default = sites are kept regardless of their variant quality.
--chr or --limitChr
Keeps only the specified chromosomes. Default = all chromosomes are kept.
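To make the meaning of the missingness and minor allele frequency thresholds concrete, the sketch below computes both quantities for a single site from hard genotype calls. It is only an illustration under stated assumptions: diploid, biallelic calls written as in a VCF GT field, and a hypothetical helper named site_stats. How ATLAS itself computes these values is not described in this section and may differ.

```shell
#!/bin/bash
# Hypothetical helper (not part of ATLAS): per-site missing fraction and
# minor allele frequency from diploid, biallelic genotype calls.
site_stats() {
  # Arguments: one genotype call per sample, e.g. 0/0 0/1 ./. 1/1
  printf '%s\n' "$@" | awk -F'[/|]' '
    { n++ }                                          # count every sample
    $1 == "." || $2 == "." { missing++; next }       # missing call
    { alt += ($1 != "0") + ($2 != "0"); called += 2 } # count non-reference alleles
    END {
      p = called ? alt / called : 0                  # alternate allele frequency
      maf = (p < 0.5) ? p : 1 - p                    # minor allele frequency
      printf "missing=%.2f maf=%.2f\n", missing / n, maf
    }'
}

# Four samples: one missing call, one alternate allele among six called alleles.
site_stats 0/0 0/0 0/1 ./.
```

Under these assumptions, this site would be removed by --maxMissing 0.2 (a quarter of the samples are missing) and kept by --minMAF 0.05.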
Engine parameters that are common to all tasks can be found here.
7.1.3 Usage Example
#! /bin/bash
# Locate atlas and simulate a small test VCF (simulate.vcf.gz).
. "$(dirname "$0")"/find_atlas
. "$(dirname "$0")"/simulate_vcf --sampleSize 11 --chrLength 1111 \
--ploidy 2 --fixedSeed 45
# Convert to beagle format.
out="beagle"
$atlas --task convertVCF --vcf simulate.vcf.gz --format beagle \
--fixedSeed 46 --out "$out" --logFile "$out.out" 2> "$out.eout"
# Convert to geno format.
out="geno"
$atlas --task convertVCF --vcf simulate.vcf.gz --format geno \
--fixedSeed 47 --out "$out" --logFile "$out.out" 2> "$out.eout"
# Convert to LFMM format using called genotypes.
out="LFMM_call"
$atlas --task convertVCF --vcf simulate.vcf.gz --format LFMM \
--genotypes call \
--fixedSeed 48 --out "$out" --logFile "$out.out" 2> "$out.eout"
# Convert to LFMM format using posterior genotypes,
# dropping sites with more than 1% missing data.
out="LFMM_post"
$atlas --task convertVCF --vcf simulate.vcf.gz --format LFMM \
--genotypes posterior --maxMissing 0.01 \
--fixedSeed 49 --out "$out" --logFile "$out.out" 2> "$out.eout"
# Convert to posfile format.
out="posfile"
$atlas --task convertVCF --vcf simulate.vcf.gz --format posfile \
--fixedSeed 50 --out "$out" --logFile "$out.out" 2> "$out.eout"
# Convert to genfile format.
out="genofile"
$atlas --task convertVCF --vcf simulate.vcf.gz --format genfile \
--fixedSeed 51 --out "$out" --logFile "$out.out" 2> "$out.eout"
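The filter parameters listed in the Input section can be combined in a single conversion. The following is a minimal sketch of how such a call might be assembled. The helper build_convert_cmd is hypothetical, the executable is assumed to be reachable as atlas (the script above uses $atlas instead), and the threshold values are arbitrary illustrations rather than recommendations. The helper only prints the command, so it can be checked before anything is run.

```shell
#!/bin/bash
# Hypothetical helper (not part of ATLAS): assemble a convertVCF call
# from an output format, an output prefix, and any number of filter options.
build_convert_cmd() {
  local format="$1" out="$2"
  shift 2
  # Remaining arguments are passed through unchanged as filter parameters.
  local cmd=(atlas --task convertVCF --vcf simulate.vcf.gz
             --format "$format" --out "$out" --logFile "$out.out" "$@")
  printf '%s\n' "${cmd[*]}"
}

# Dry run: print a beagle conversion with three filters (illustrative values).
build_convert_cmd beagle beagle_filtered \
  --maxMissing 0.1 --minMAF 0.05 --minVarQual 30
```

Printing the command first keeps the sketch independent of a local ATLAS installation; once the printed line looks right, it can be executed directly.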