9.1 Beagle

The Beagle format was originally used for the program Beagle, and is required as an input file for various tasks in ANGSD [1].

The first three columns specify the position (marker), the reference allele and the alternative allele. They are followed by three columns per individual, which contain the genotype likelihoods of the three possible genotypes. Here is an example for four loci and three individuals:


marker allele1 allele2 Ind0 Ind0 Ind0 Ind1 Ind1 Ind1 Ind2 Ind2 Ind2
chr1_1 A C 0.941177 0.058822 0.000001 0.799685 0.199918 0.000397 0.666316 0.333155
chr1_2 G T 0.709983 0.177493 0.112525 0.941178 0.058822 0.000000 0.665554 0.332774 0.001672
chr1_3 C A 0.855993 0.106996 0.037010 0.333333 0.333333 0.333333 0.799971 0.333155
chr1_5 T A 0.835380 0.104420 0.060201 0.799685 0.199918 0.000397 0.333333 0.333333
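
To illustrate the column layout described above, here is a minimal Python sketch that splits one data line into the marker, the two alleles and one triplet of genotype likelihoods per individual. The function name `parse_beagle_line` is hypothetical and not part of ATLAS or ANGSD, and the sketch assumes whitespace-separated fields and an uncompressed line.

```python
from typing import List, Tuple

Triplet = Tuple[float, float, float]


def parse_beagle_line(line: str) -> Tuple[str, str, str, List[Triplet]]:
    """Split one Beagle data line into marker, alleles and per-individual triplets.

    Hypothetical helper for illustration; assumes whitespace-separated fields.
    """
    fields = line.split()
    if len(fields) < 3:
        raise ValueError("expected at least marker, allele1 and allele2")
    marker, allele1, allele2 = fields[:3]

    values = [float(x) for x in fields[3:]]
    # Each individual contributes exactly three genotype likelihoods.
    if len(values) % 3 != 0:
        raise ValueError(
            f"{marker}: {len(values)} likelihood values is not a multiple of three"
        )

    triplets = [
        (values[i], values[i + 1], values[i + 2])
        for i in range(0, len(values), 3)
    ]
    return marker, allele1, allele2, triplets


if __name__ == "__main__":
    example = (
        "chr1_2 G T 0.709983 0.177493 0.112525 "
        "0.941178 0.058822 0.000000 0.665554 0.332774 0.001672"
    )
    marker, a1, a2, triplets = parse_beagle_line(example)
    for index, triplet in enumerate(triplets):
        # Printed values sum to one up to rounding of the six decimals.
        print(marker, f"Ind{index}", triplet, round(sum(triplet), 5))
```

Raising an error when the number of values is not a multiple of three is a design choice of this sketch: it makes truncated or malformed lines visible instead of silently assigning values to the wrong individual.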

Because ANGSD requires the genotype likelihoods per individual to sum to one, ATLAS normalizes them accordingly. For haploid genotypes, ATLAS still writes three columns per individual, but sets the third genotype likelihood (for the homozygous alternative genotype) to zero. This is required for certain downstream tools, e.g. FastNGSAdmix. Other applications may not interpret this encoding sensibly, so Beagle files for haploid chromosomes should be handled with care.
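
The normalization and the haploid convention can be sketched as follows. This is an illustration under stated assumptions, not ATLAS's actual implementation: the function name `to_beagle_triplet` is hypothetical, and the sketch assumes that the input holds non-negative raw likelihoods on a linear (not log) scale, with two values for a haploid genotype and three for a diploid one.

```python
from typing import List, Sequence


def to_beagle_triplet(likelihoods: Sequence[float]) -> List[float]:
    """Return three genotype likelihoods that sum to one.

    Hypothetical helper for illustration. Two input values are treated as a
    haploid genotype: the third column (homozygous alternative) is set to zero.
    """
    if len(likelihoods) == 2:
        # Haploid case: keep three columns, third likelihood fixed at zero.
        padded = [likelihoods[0], likelihoods[1], 0.0]
    elif len(likelihoods) == 3:
        padded = list(likelihoods)
    else:
        raise ValueError("expected two (haploid) or three (diploid) likelihoods")

    if any(x < 0 for x in padded):
        raise ValueError("likelihoods must be non-negative")
    total = sum(padded)
    if total <= 0:
        raise ValueError("at least one likelihood must be positive")

    # Divide by the sum so the values for this individual add up to one.
    return [x / total for x in padded]


if __name__ == "__main__":
    print(to_beagle_triplet([2.0, 1.0, 1.0]))  # diploid: [0.5, 0.25, 0.25]
    print(to_beagle_triplet([3.0, 1.0]))       # haploid: [0.75, 0.25, 0.0]
```

In the haploid output the zero in the third column does not carry information about a genotype; it only pads the record to the three-column layout, which is why such files need care in tools that expect diploid data.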