4.2 BAMDiagnostics
Estimating approximate depth, read length frequencies and mapping quality frequencies
BAMDiagnostics provides a set of read statistics for the input BAM file, taking into account all standard input filters. The output is written to .txt files that summarize the following information:
- Total number of reads
- Number of reads that passed filters
- Number of duplicate reads
- Average read length
- Maximum read length
- Number of proper pairs
- Average fragment length (only known for paired-end data)
- Total number of soft-clipped positions
- Average soft-clipped length
- Average aligned length
- Mean sequencing depth across the whole genome
- Average mapping quality
It also provides histograms which display the distributions of fragment lengths, mapping qualities, read lengths, soft-clipped lengths and aligned lengths. All of this data is written for all read groups combined, as well as for each read group separately.
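The relationship between a histogram and the corresponding summary values (for example, the read length histogram versus the average and maximum read length) can be illustrated with a small sketch. This is not the BAMDiagnostics implementation; the read groups `RG1` and `RG2`, their read lengths, and the helper `summarize_lengths` are hypothetical and only show how per-read-group and combined statistics could be derived from the same counts.

```python
from collections import Counter

def summarize_lengths(read_lengths):
    """Return (histogram, average, maximum) for a list of read lengths."""
    histogram = Counter(read_lengths)          # length -> number of reads
    total_reads = sum(histogram.values())
    weighted_sum = sum(length * count for length, count in histogram.items())
    return histogram, weighted_sum / total_reads, max(histogram)

# Hypothetical read lengths, grouped by read group (not real data).
reads_by_group = {"RG1": [100, 100, 75], "RG2": [50, 150]}

# Statistics for each read group separately ...
per_group = {rg: summarize_lengths(lengths) for rg, lengths in reads_by_group.items()}

# ... and for all read groups combined.
all_lengths = [length for lengths in reads_by_group.values() for length in lengths]
combined = summarize_lengths(all_lengths)
```

The same pattern applies to the other histogram and average pairs listed above, such as soft-clipped length and aligned length.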
4.2.1 Input
Required inputs:
--bam Input_bam_file.bam | Input BAM file. |
Optional inputs:
None
Specific parameters:
--diagnosticsPerChromosome | Also write per-chromosome summary statistics to the *_diagnostics.txt file. Default = only per-read-group summary statistics are provided (no per-chromosome summary statistics). |
--splitMergeInput | Create an input file for splitMerge. Default = no input file for splitMerge is created. |
--printReferenceLength | Print the reference lengths of the chromosomes to file. Default = reference lengths are not printed. |
- See Filter parameters for filters on bases and reads and for parsing window settings.
Engine parameters that are common to all tasks can be found here.
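As a rough illustration of how the parameters above combine, the following sketch assembles a command line as an argument list. Only `--bam`, `--diagnosticsPerChromosome`, `--splitMergeInput` and `--printReferenceLength` come from this section; the executable name `atlas` and the `--task` selector are assumptions about how the program is invoked, and `build_command` is a hypothetical helper.

```python
import shlex

def build_command(bam, per_chromosome=False, split_merge_input=False,
                  print_reference_length=False):
    """Assemble a BAMDiagnostics command as a list of arguments."""
    # ASSUMPTION: the executable is called "atlas" and the task is chosen
    # with "--task"; neither is stated in this section.
    args = ["atlas", "--task", "BAMDiagnostics", "--bam", bam]
    # The optional switches below are the ones documented in this section.
    if per_chromosome:
        args.append("--diagnosticsPerChromosome")
    if split_merge_input:
        args.append("--splitMergeInput")
    if print_reference_length:
        args.append("--printReferenceLength")
    return args

command = build_command("Input_bam_file.bam", per_chromosome=True)
print(shlex.join(command))  # shell-quoted form, suitable for logging
```

Keeping the command as a list rather than a single string avoids quoting problems when the BAM path contains spaces.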
4.2.2 Output
*_filterSummary.txt | Filter summary for all read groups combined and individual read groups. |
*_fragmentLengthHistogram.txt | Fragment length counts for all read groups combined and individual read groups. |
*_mappingQualityHistogram.txt | Mapping quality counts for all read groups combined and individual read groups. |
*_readLengthHistogram.txt | Read length counts for all read groups combined and individual read groups. |
*_softClippedLengthHistogram.txt | Length of soft-clipped bases as counts for all read groups combined and individual read groups. |
*_alignedLengthHistogram.txt | Aligned length counts for all read groups combined and individual read groups. |
*_diagnostics.txt | File containing per-read-group summary statistics. It also contains per-chromosome summary statistics when the --diagnosticsPerChromosome parameter is used. This file can be used as the input file for the splitMerge task. |
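A quick way to check that a run produced every file listed above is to match the file name patterns against the output directory. The sketch below is one possible approach: the suffixes are taken from the table, while `find_outputs`, the temporary directory and the file name `sample_filterSummary.txt` are made up for the demonstration.

```python
import tempfile
from pathlib import Path

# File name suffixes taken from the output table above.
OUTPUT_SUFFIXES = [
    "_filterSummary.txt",
    "_fragmentLengthHistogram.txt",
    "_mappingQualityHistogram.txt",
    "_readLengthHistogram.txt",
    "_softClippedLengthHistogram.txt",
    "_alignedLengthHistogram.txt",
    "_diagnostics.txt",
]

def find_outputs(directory):
    """Map each suffix to the sorted names of matching files in `directory`."""
    directory = Path(directory)
    return {
        suffix: sorted(path.name for path in directory.glob("*" + suffix))
        for suffix in OUTPUT_SUFFIXES
    }

# Demonstration on a temporary directory holding one hypothetical output file.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "sample_filterSummary.txt").touch()
    found = find_outputs(tmp)

# Suffixes for which no file was found.
missing = [suffix for suffix, names in found.items() if not names]
```

In this demonstration `missing` lists every suffix except `_filterSummary.txt`, since only that one file was created.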