7.1 convertVCF

Converting a VCF file to other formats

convertVCF converts a VCF file to a variety of file formats. Available output formats are:

  • beagle
  • geno
  • LFMM
  • posfile
  • genfile

Detailed descriptions of the file formats are provided here.
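
A format name can be checked against the list above before a conversion is started. The following is a minimal sketch, not part of ATLAS: the helper name is_supported_format is hypothetical, and the assumption that format names are matched case-sensitively exactly as listed is ours.

```shell
#!/bin/sh
# Hypothetical helper (not an ATLAS feature): returns 0 if $1 is one of
# the output formats listed above, 1 otherwise.
# Assumption: names are matched case-sensitively, exactly as listed.
is_supported_format() {
    case "$1" in
        beagle|geno|LFMM|posfile|genfile) return 0 ;;
        *) return 1 ;;
    esac
}

format="LFMM"
if is_supported_format "$format"; then
    echo "format '$format' is in the list"
else
    echo "format '$format' is not in the list" >&2
fi
```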

7.1.1 Input

Required inputs :

--vcf Input_VCF_file.vcf.gz Input VCF file.

Optional inputs :

  • None

Specific Parameters :

--format format_to_be_converted_to To specify the output format. Options = beagle, geno, LFMM, posfile, genfile.

Optional Parameters :

Various filters can also be used during conversion. Filter parameters for VCF conversion are as follows:

--limitLines integer_value To limit the number of lines read from the VCF file. Default = Will parse entire VCF.
--regions *.bed To limit the analysis to the regions defined in a BED file. Default = Will parse entire VCF.
--filterDepth integer_value,integer_value To keep only the samples whose sample depth lies within the indicated range (inclusive). Default = Will keep all sites regardless of depth.
--maxMissing numeric_value To filter out sites that have more than the indicated fraction of data missing. numeric_value must be between 0 and 1 (inclusive). Default = Will keep sites regardless of missingness.
--minMAF numeric_value To keep only sites whose minor allele frequency is at least the indicated value. Default = Will keep all sites regardless of minor allele frequency.
--minVarQual numeric_value To keep only sites whose variant quality is at least the indicated value. Default = Will keep sites regardless of their variant quality.
--chr or --limitChr To keep only the specified chromosomes. Default = Will keep all chromosomes.
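
As an illustration, several of the filters above can be combined in a single conversion. The sketch below only assembles and prints the command, so it can be inspected before anything is run. The file names input.vcf.gz and regions.bed, the output prefix filtered_beagle, and all thresholds are hypothetical. Reading the two --filterDepth values as a lower and an upper bound is an assumption. The --out parameter is taken from the usage example in this section.

```shell
#!/bin/sh
# Sketch only: file names and thresholds are hypothetical placeholders.
atlas="${atlas:-atlas}"          # ATLAS binary; assumed to be on PATH if unset
vcf="input.vcf.gz"
bed="regions.bed"

cmd="$atlas --task convertVCF --vcf $vcf --format beagle"
cmd="$cmd --regions $bed"        # only sites inside the BED regions
cmd="$cmd --filterDepth 2,50"    # two depth bounds; min,max order is an assumption
cmd="$cmd --maxMissing 0.2"      # drop sites with more than 20% of data missing
cmd="$cmd --minMAF 0.05"         # keep sites with minor allele frequency >= 0.05
cmd="$cmd --minVarQual 30"       # keep sites with variant quality >= 30
cmd="$cmd --out filtered_beagle"

# Print the assembled command for inspection.
echo "$cmd"
```

To execute the conversion instead of printing it, the last line can be replaced by an unquoted $cmd, so that the shell splits the string into separate arguments.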

Engine parameters that are common to all tasks can be found here.

7.1.2 Output

The output is written in the format selected with --format; see the format options above.

7.1.3 Usage Example

#! /bin/bash

. $(dirname $0)/find_atlas
. $(dirname $0)/simulate_vcf --sampleSize 11 --chrLength 1111 \
  --ploidy 2 --fixedSeed 45

out="beagle"
$atlas --task convertVCF --vcf simulate.vcf.gz --format beagle \
       --fixedSeed 46 --out $out --logFile $out.out 2> $out.eout

out="geno"
$atlas --task convertVCF --vcf simulate.vcf.gz --format geno \
       --fixedSeed 47 --out $out --logFile $out.out 2> $out.eout

out="LFMM_call"
$atlas --task convertVCF --vcf simulate.vcf.gz --format LFMM \
       --genotypes call \
       --fixedSeed 48 --out $out --logFile $out.out 2> $out.eout

out="LFMM_post"
$atlas --task convertVCF --vcf simulate.vcf.gz --format LFMM \
       --genotypes posterior --maxMissing 0.01 \
       --fixedSeed 49 --out $out --logFile $out.out 2> $out.eout

out="posfile"
$atlas --task convertVCF --vcf simulate.vcf.gz --format posfile \
       --fixedSeed 50 --out $out --logFile $out.out 2> $out.eout

out="genofile"
$atlas --task convertVCF --vcf simulate.vcf.gz --format genfile \
       --fixedSeed 51 --out $out --logFile $out.out 2> $out.eout
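
The conversions above that need no format-specific parameters follow the same pattern, so they can also be written as a loop. This is a sketch under assumptions: it prints each command instead of running it, falls back to a binary named atlas when $atlas is not set, uses the format name as output prefix, and omits --fixedSeed. The LFMM calls are left out because they take an additional --genotypes parameter.

```shell
#!/bin/sh
# Sketch only: prints one convertVCF command per format instead of running it.
atlas="${atlas:-atlas}"          # ATLAS binary; assumed to be on PATH if unset
vcf="simulate.vcf.gz"            # input file name taken from the example above

count=0
for format in beagle geno posfile genfile; do
    out="$format"                # output prefix: here simply the format name
    cmd="$atlas --task convertVCF --vcf $vcf --format $format --out $out --logFile $out.out"
    echo "$cmd"
    count=$((count + 1))
done
echo "prepared $count commands"
```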