4.7 PMDS
Filtering for ancient reads using PMDS (Post mortem damage score)
Contamination from present-day DNA is a fundamental issue when studying ancient DNA from historical or archaeological material, and quantifying the amount of contamination is essential for downstream analyses. One way of measuring contamination or filtering out reads that most likely originate from modern DNA is to use the post-mortem damage score (PMDS) introduced by Skoglund et al. (2014). With this task, you can use ATLAS to calculate the PMDS for all reads. The output is a new BAM file, where each alignment is described by a field DS:f:<PMDS>, which specifies the alignment’s PMDS score. If the DS field already exists, it will be overwritten. It is possible to filter out the reads with a low PMDS score while writing the BAM file by using the option minPMDS or maxPMDS to specify the threshold value.
A PMDS score > 0 means that the read is more likely ancient than modern. The higher the score, the higher its probability of being ancient.
4.7.1 Input
Required inputs :
--bam Input_bam_file.bam |
Input bam file. |
--fasta Input_refrence_genome_file.fasta |
Reference genome. |
Optional inputs :
--recal recal.txt |
Quality score recalibration file (see recal. |
--pmd Input_PMD.txt |
Post-mortem damage parameters (see PMD for generating such a file). |
Specific Parameters :
--minPMDS integer_value |
Filter out reads that have a PMDS score below the specified value. Default = -10000. |
--maxPMDS integer_value |
Filter out reads that have a PMDS score above the specified value. Default = 10000. |
Engine Parameters :
Engine parameters that are common to all tasks can be found here.
4.7.3 Usage Example
#! /bin/bash
# `--fixedSeed = N` is needed to have reproducable results in regression test
pmd5="0.1*exp(-0.1*p)+0.05"
pmd3="0.2*exp(-0.3*p)+0.07"
. $(dirname $0)/find_atlas
for strand in single double; do
pmd="CT5:$pmd5;CT3:$pmd3"
[ $strand == "double" ] && pmd="CT5:$pmd5;GA3:$pmd3"
. $(dirname $0)/simulate --pmd $pmd --fixedSeed 7 \
--out $strand --logFile simulate_$strand.out
$atlas --task PMDS --bam $strand.bam --fasta $strand.fasta \
--fixedSeed 5 --out $strand --logFile $strand.out 2> $strand.eout
done