4.3 downsample
Downsampling a BAM file by removing reads
downsample creates downsampled BAM files that contain a specified fraction of the original number of reads. More than one fraction (probability) can be specified, which allows several downsampled BAM files to be created in a single run. In this task, all reads are considered, even those that do not pass the usual SAM flag filters.
4.3.1 Input
Required inputs :
--bam Input_bam_file.bam | Input BAM file |
Optional inputs :
None
Specific parameters :
--prob numeric_value | One value or a vector of fractions of reads to be kept in the downsampled BAM files. numeric_value must be between 0 and 1 (inclusive). Replicates can be obtained by adding the desired number of replicates in curly brackets {} after the corresponding value. |
--separateReads or --writeN | --separateReads downsamples by removing reads; --writeN downsamples by setting bases to N. Default = --writeN |
Optional parameters :
--outQual integer_1,integer_2 | Constrains the quality scores to the indicated range (inclusive) when writing alignments. Default = uses the full range of quality scores when writing alignments. |
--writeBinnedQualities | Writes Illumina-binned quality scores. Default = writes raw quality scores. |
Engine parameters that are common to all tasks can be found here.
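To illustrate the general idea of keeping each read with a given probability, here is a minimal conceptual sketch. It is not ATLAS code and does not read BAM files: the function name subsample_lines and the numbered toy lines standing in for reads are assumptions made for illustration only.

```bash
#!/bin/bash
# Conceptual sketch only, NOT the ATLAS implementation.
# Keeps each input line (a stand-in for one read) with probability p.
# subsample_lines is a hypothetical helper name.
subsample_lines() {
    local p="$1"          # probability of keeping a line, between 0 and 1
    local seed="${2:-1}"  # seed for awk's random number generator
    # rand() returns a value in [0, 1), so p=1 keeps everything and p=0 keeps nothing
    awk -v p="$p" -v seed="$seed" 'BEGIN { srand(seed) } rand() < p'
}

# Toy input: 1000 numbered lines; roughly 20% of them should be kept
seq 1 1000 | subsample_lines 0.2 42 | wc -l
```

Because each line is kept independently, the number of retained lines varies around p times the input size rather than matching it exactly.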
4.3.2 Output
*_separated_*Prob*.bam or *_downsampled_*Prob*.bam | Downsampled BAM files (downsampled by removing reads) or downsampled BAM files (downsampled by setting bases to N). |
*_separated_*Prob*.bam.bai or *_downsampled_*Prob*.bam.bai | Index files for downsampled BAM files (downsampled by removing reads) or Index files for downsampled BAM files (downsampled by setting bases to N). |
*_filterSummary.txt | Text file with general filter counts per read group and across all read groups. |
4.3.3 Usage Example
#! /bin/bash
# Set atlas path
atlas=$(dirname "$0")/../build/atlas
# Simulate a BAM File with depth 15
$atlas simulate --logFile simulate.out --depth 15
# downsample to depth 3.14
$atlas downsample --bam ATLAS_simulations.bam --depth 3.14 --logFile downsample.out
# simplify name of downsampled bam-file
mv ATLAS_simulations_downsampled_*.bam downsampled.bam
mv ATLAS_simulations_downsampled_*.bam.bai downsampled.bam.bai
# Calculate depth of downsampled BAM File
$atlas BAMDiagnostics --bam downsampled.bam --logFile BAMDiagnostics.out
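The example above reduces a simulated depth of 15 to a target depth of 3.14. Assuming that average depth scales linearly with the number of reads kept, the corresponding fraction of reads can be estimated as follows. The helper name keep_fraction is hypothetical, and whether ATLAS derives the fraction this way internally when --depth is used is not stated in this section.

```bash
#!/bin/bash
# Estimate the fraction of reads to keep to go from one average depth to another.
# Assumption: depth scales linearly with the number of retained reads.
# keep_fraction is a hypothetical helper, not part of ATLAS.
keep_fraction() {
    local original_depth="$1"
    local target_depth="$2"
    awk -v o="$original_depth" -v t="$target_depth" 'BEGIN { printf "%.4f\n", t / o }'
}

keep_fraction 15 3.14   # prints 0.2093
```

Under that assumption, a value of about 0.21 would be the fraction to compare against the values accepted by --prob.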