5.8 pileup

Printing a pileup from a BAM file

The pileup task walks through the input BAM file and, for every position, reports the bases covering that position together with the genotype likelihoods, applying all standard input filters. If a post-mortem damage (PMD) pattern or a base quality score recalibration table is provided, it is taken into account when estimating the genotype likelihoods. See PMD for how to produce a post-mortem damage pattern, and BQSR for how to produce a base quality score recalibration table.

5.8.1 Input

Required inputs:

--bam Input_bam_file.bam Input BAM file

Optional inputs:

--pmd Input_PMD.txt post-mortem damage parameters (see PMD for generating such a file)
--recal recal.txt quality score recalibration file (see recal)
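
As a minimal sketch of how the optional inputs combine with the required one, the hypothetical helper below prints a pileup command line and appends --pmd and --recal only when a file name is given. The function name pileup_cmd, the bare executable name atlas and the prefix sample1 are assumptions for illustration; --task, --bam and --out are taken from the usage example in this section.

```shell
# Hypothetical helper (not part of ATLAS): print the pileup command line
# for a BAM file, adding the optional inputs only when they are supplied.
pileup_cmd() {
    bam=$1      # required: input BAM file
    out=$2      # output prefix passed to --out
    pmd=$3      # optional: post-mortem damage parameter file ("" to skip)
    recal=$4    # optional: recalibration file ("" to skip)

    # "atlas" is an assumed executable name; adjust to your installation.
    cmd="atlas --task pileup --bam $bam --out $out"
    if [ -n "$pmd" ]; then
        cmd="$cmd --pmd $pmd"
    fi
    if [ -n "$recal" ]; then
        cmd="$cmd --recal $recal"
    fi
    printf '%s\n' "$cmd"
}

# Print the command for a run with a PMD file but no recalibration file.
pileup_cmd Input_bam_file.bam sample1 Input_PMD.txt ""
```

Printing the command instead of running it makes it easy to check which optional inputs will actually be passed before starting a long run.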

Specific parameters:

--printAll print all sites that pass filters, including those without data. By default, only sites with data will be printed.
--histograms print the counts for the following histogram types, given as a comma-separated list (as in the usage example): sequencing depth (depth), base qualities (qualities), base contexts (contexts), allelic depth (allelicDepth)
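
Because --histograms takes a list of names, a typo is easy to make. The sketch below checks a comma-separated list against the four names documented above before it is passed on. The function validate_histograms is a hypothetical helper and not part of ATLAS, and it does not reflect how ATLAS itself handles unknown names.

```shell
# Hypothetical helper (not part of ATLAS): check that every entry of a
# comma-separated list is one of the documented histogram names.
validate_histograms() {
    old_ifs=$IFS
    IFS=,                       # split the argument on commas
    for name in $1; do
        case $name in
            depth|qualities|contexts|allelicDepth)
                ;;              # documented name, nothing to do
            *)
                IFS=$old_ifs
                echo "unknown histogram: $name" >&2
                return 1
                ;;
        esac
    done
    IFS=$old_ifs
    return 0
}

# Check the list used in the usage example of this section.
if validate_histograms "depth,allelicDepth,contexts,qualities"; then
    echo "histogram list ok"
fi
```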

5.8.2 Output

*_pileup.txt.gz Text file containing, for each genomic position, the sequencing depth, the identity of the bases covering it and the genotype likelihoods (taking into account PMD and recalibration parameters when provided).
*_depthPerChromosome.txt.gz
*_depthPerWindow.txt.gz
*_allelicDepth.txt.gz (optional) Only produced when the --histograms argument is used.
*_contextInformation.txt.gz (optional) Only produced when the --histograms argument is used.
*_depthPerSiteHistogram.txt.gz (optional) Only produced when the --histograms argument is used.
*_qualHistogram.txt.gz (optional) Only produced when the --histograms argument is used.
*_filterSummary.txt Filter summary for all read groups combined and individual read groups.
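
After a run, it can be useful to confirm that the files listed above were written. The sketch below reports which of the non-optional outputs exist for a given prefix. It assumes that the * in the file names stands for the prefix passed to --out; the function name check_pileup_outputs is hypothetical.

```shell
# Hypothetical check (not part of ATLAS): report which of the documented,
# non-optional pileup outputs exist for a given output prefix.
check_pileup_outputs() {
    prefix=$1
    for suffix in pileup.txt.gz depthPerChromosome.txt.gz \
                  depthPerWindow.txt.gz filterSummary.txt; do
        if [ -e "${prefix}_${suffix}" ]; then
            echo "found   ${prefix}_${suffix}"
        else
            echo "missing ${prefix}_${suffix}"
        fi
    done
}

# Report on the outputs of a run started with --out default.
check_pileup_outputs default
```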

5.8.3 Usage Example

#! /bin/bash

. "$(dirname "$0")/find_atlas"
. "$(dirname "$0")/simulate" --type HW --sampleSize 19 --fixedSeed 177

echo "chr1  0   4567" > window.txt
echo "chr1  4567    9134" >> window.txt
echo "chr1  9134    11111" >> window.txt
echo "chr2  0   4567" >> window.txt
echo "chr2  4567    5432" >> window.txt
echo "chr3  0   4567" >> window.txt
echo "chr3  4567    9134" >> window.txt
echo "chr3  9134    12345" >> window.txt

echo "chr1 0 1" > bed.bed
echo "chr1 2 3" >> bed.bed
echo "chr1 4 5" >> bed.bed
echo "chr1 6 7" >> bed.bed
echo "chr1 8 9" >> bed.bed
echo "chr1 10 111" >> bed.bed
echo "chr1 200 333" >> bed.bed
echo "chr1 400 555" >> bed.bed
echo "chr1 600 777" >> bed.bed
echo "chr1 800 999" >> bed.bed
echo "chr1 1000 11111" >> bed.bed

out="default"
$atlas --task pileup \
       --bam simulate_ind1.bam --fasta simulate.fasta \
       --fixedSeed 171 --out $out --logFile $out.out 2> $out.eout

out="printAll"
$atlas --task pileup --printAll \
       --bam simulate_ind2.bam --fasta simulate.fasta \
       --window window.txt  --readUpToDepth 97 \
       --histograms depth,allelicDepth,contexts,qualities \
       --fixedSeed 173 --out $out --logFile $out.out 2> $out.eout

out="regions"
$atlas --task pileup --printAll \
       --bam simulate_ind3.bam --fasta simulate.fasta \
       --regions bed.bed --histograms depth \
       --fixedSeed 175 --out $out --logFile $out.out 2> $out.eout

out="multiBam"
bams=$(ls *.bam)
$atlas --task pileup --fields "depth,bases,sampleBases" --bam "$bams"  \
       --fixedSeed 179 --out $out --logFile $out.out 2> $out.eout
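
The window file in the script above is built with repeated echo calls. As an alternative sketch, a here-document writes the same eight lines in one step and overwrites any existing window.txt, so running the script again does not accumulate duplicate windows. The column separator is copied as shown above and may need adjusting to the delimiter ATLAS expects.

```shell
# Write all windows in one step; '>' truncates window.txt first, so a
# second run of the script leaves exactly these eight lines in the file.
cat > window.txt <<'EOF'
chr1  0   4567
chr1  4567    9134
chr1  9134    11111
chr2  0   4567
chr2  4567    5432
chr3  0   4567
chr3  4567    9134
chr3  9134    12345
EOF

# Show how many windows were written.
wc -l < window.txt
```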