4.3 downsample

Downsampling a BAM file by removing reads

downsample creates downsampled BAM files that each contain a specified proportion of the original reads. More than one proportion (probability) can be specified, which allows several downsampled BAM files to be created in a single run. In this task, all reads are considered, even those that do not pass the usual SAM flag filters.
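The idea of keeping each read with a given probability can be illustrated with a minimal sketch. It operates on a plain list of read names rather than on a BAM file, and the helper keep_with_prob is a hypothetical name introduced here for illustration; it is not part of ATLAS and does not claim to reproduce how ATLAS draws reads.

```shell
#!/bin/bash
# Illustrative sketch only; keep_with_prob is a hypothetical helper, not part of ATLAS.
# Reads lines from stdin and prints each one with probability $1.
# An optional seed ($2, default 42) makes the draw reproducible.
keep_with_prob() {
    local prob="$1" seed="${2:-42}"
    awk -v p="$prob" -v s="$seed" 'BEGIN { srand(s) } rand() < p'
}

# Toy stand-in for the reads of a BAM file: 1000 read names, one per line.
# Keep roughly 20% of them and report how many survived.
seq 1 1000 | sed 's/^/read_/' | keep_with_prob 0.2 | wc -l
```

Because each read is drawn independently, the number of retained reads is only approximately the requested proportion of the input, and it varies with the seed.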

4.3.1 Input

Required inputs :

--bam Input_bam_file.bam Input bam file

Optional inputs :

  • None

Specific parameters :

--prob numeric_value One value or a vector of values giving the proportion of reads to be kept in each downsampled BAM file. numeric_value must be between 0 and 1 (inclusive). Replicates can be obtained by adding the desired number of replicates in curly brackets {} after the corresponding value.
--separateReads or --writeN --separateReads downsamples by removing reads; --writeN downsamples by setting bases to N. Default = --writeN

Optional parameters :

--outQual integer_1,integer_2 to constrain the quality scores to the indicated range (inclusive) when writing alignments. Default = uses the full range of quality scores when writing alignments.
--writeBinnedQualities to write Illumina-binned quality scores. Default = writes raw quality scores.

Engine parameters that are common to all tasks can be found here.

4.3.2 Output

*_separated_*Prob*.bam or *_downsampled_*Prob*.bam Downsampled BAM files (downsampled by removing reads) or downsampled BAM files (downsampled by setting bases to N).
*_separated_*Prob*.bam.bai or *_downsampled_*Prob*.bam.bai Index files for downsampled BAM files (downsampled by removing reads) or index files for downsampled BAM files (downsampled by setting bases to N).
*_filterSummary.txt Text file with general filter counts per read group and for all read groups.

4.3.3 Usage Example

#! /bin/bash

# Set atlas path
atlas=$(dirname "$0")/../build/atlas

# Simulate a BAM File with depth 15
$atlas simulate --logFile simulate.out --depth 15

# Downsample to depth 3.14
$atlas downsample --bam ATLAS_simulations.bam --depth 3.14 --logFile downsample.out

# Simplify name of downsampled BAM file
mv ATLAS_simulations_downsampled_*.bam downsampled.bam
mv ATLAS_simulations_downsampled_*.bam.bai downsampled.bam.bai

# Calculate depth of downsampled BAM File
$atlas BAMDiagnostics --bam downsampled.bam --logFile BAMDiagnostics.out
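
The usage example targets a depth rather than a proportion. As a rough cross-check, and assuming that depth scales linearly with the fraction of retained reads, the proportion corresponding to a reduction from depth 15 to depth 3.14 can be computed as sketched here. The helper keep_fraction is a hypothetical name used only for this illustration; it is not an ATLAS command.

```shell
#!/bin/bash
# Illustrative sketch only; keep_fraction is a hypothetical helper, not part of ATLAS.
# Prints target_depth / original_depth, rounded to four decimal places.
keep_fraction() {
    local target="$1" original="$2"
    awk -v t="$target" -v o="$original" 'BEGIN { printf "%.4f\n", t / o }'
}

# Depth 15 reduced to depth 3.14, as in the usage example.
keep_fraction 3.14 15   # prints 0.2093
```

The depth reported by BAMDiagnostics for the downsampled file should therefore be close to, but not necessarily exactly, 3.14, since downsampling is a random draw.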