6.8 polymorphicWindows

Identifying windows for which samples are polymorphic

polymorphicWindows is used to identify polymorphic sites in the samples.

6.8.1 Input

Required inputs :

--vcf Input_VCF_file.vcf.gz Input VCF file.

Optional inputs :

  • None

Specific Parameters :

--limitLines integer_value To limit the number of lines read from the VCF file. Default = Will parse entire VCF.
--regions *.bed To limit the analysis to the regions defined in a BED file. Default = Will parse entire VCF.
--filterDepth integer_value,integer_value To keep only the samples with the indicated sample depth (inclusive). Default = Will keep all sites regardless of depth.
--maxMissing numeric_value To filter out sites that have more than the indicated fraction of data missing. numeric_value must be between 0 and 1 (inclusive). Default = Will keep sites regardless of missingness.
--minMAF numeric_value To keep only sites whose minor allele frequency is at least the indicated value. Default = Will keep all sites regardless of minor allele frequency.
--minVarQual numeric_value To store only sites whose variant quality is at least the indicated value. Default = Will keep sites regardless of their variant quality.
--chr or --limitChr To keep only the specified chromosomes. Default = Will keep all chromosomes.
  • See Filter parameters to apply specific filters for bases and reads and to adjust the parsing window settings.
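
Several of the site filters listed above can be combined in one call. The following is a minimal dry-run sketch: it only assembles and prints the command line and does not run it. The binary name atlas, the input file name and the threshold values (0.2, 0.05, 30) are assumptions chosen for illustration, not recommended settings.

```shell
#!/bin/bash

# Placeholder input and illustrative thresholds (assumptions, not defaults).
vcf="Input_VCF_file.vcf.gz"
max_missing=0.2   # drop sites with more than 20% missing data
min_maf=0.05      # keep sites with minor allele frequency >= 0.05
min_qual=30       # keep sites with variant quality >= 30

# Assemble the command line; "atlas" stands in for the actual binary path.
cmd="atlas --task polymorphicWindows --vcf $vcf --maxMissing $max_missing --minMAF $min_maf --minVarQual $min_qual"

# Print the command instead of executing it (dry run).
echo "$cmd"
```

Printing the command first makes it easy to check the thresholds before starting a long run on a large VCF.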

Engine parameters that are common to all tasks can be found here.

6.8.2 Output

*_polymorphicWindows.txt.gz Text file

6.8.3 Usage Example

#! /bin/bash

. $(dirname $0)/find_atlas
. $(dirname $0)/simulate_vcf --sampleSize 11 --fixedSeed 181

echo "chr1 1000 10000" > bed.bed
echo "chr2 2 3" >> bed.bed
echo "chr3 0 10" >> bed.bed
echo "chr3 100 110" >> bed.bed
echo "chr3 200 210" >> bed.bed
echo "chr3 300 310" >> bed.bed
echo "chr3 400 410" >> bed.bed
echo "chr3 500 510" >> bed.bed

out="polymorphicWindows"
$atlas --task polymorphicWindows --vcf simulate.vcf.gz --regions bed.bed \
       --fixedSeed 183 --out $out --logFile $out.out 2> $out.eout
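
The BED file in the example above is written one line at a time. As a sketch, the same eight regions can be produced with a loop for the evenly spaced chr3 intervals. The file name bed.bed and the space-separated layout are taken from the example; the variable names are illustrative.

```shell
#!/bin/bash

bed="bed.bed"

# Write all regions in one redirected group so the file is created once.
{
    echo "chr1 1000 10000"
    echo "chr2 2 3"
    # chr3: six 10-bp regions starting at 0, 100, ..., 500
    start=0
    while [ "$start" -le 500 ]; do
        echo "chr3 $start $((start + 10))"
        start=$((start + 100))
    done
} > "$bed"

# Report how many regions were written.
wc -l < "$bed"
```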