Calculate the probability of genotypes based on the product of allele frequencies over all loci.
Arguments
- gid
a genind or genclone object.
- pop
either a formula to set the population factor from the
strata
slot or a vector specifying the population factor for each sample. Defaults toNULL
.- by_pop
When this is
TRUE
(default), the calculation will be done by population.- log
a
logical
iflog =TRUE
(default), the values returned will be log(Pgen). Iflog = FALSE
, the values returned will be Pgen.- freq
a vector or matrix of allele frequencies. This defaults to
NULL
, indicating that the frequencies will be determined via round-robin approach inrraf
. If this matrix or vector is not provided, zero-value allele frequencies will automatically be corrected. For details, please see the documentation on correcting rare alleles.- ...
options from correcting rare alleles. The default is to correct allele frequencies to 1/n
Details
Pgen is the probability of a given genotype occuring in a population assuming HWE. Thus, the value for diploids is
$$P_{gen} = \left(\prod_{i=1}^m p_i\right)2^h$$
where \(p_i\) are the allele frequencies and h is the count of the number of heterozygous sites in the sample (Arnaud-Haond et al. 2007; Parks and Werth, 1993). The allele frequencies, by default, are calculated using a round-robin approach where allele frequencies at a particular locus are calculated on the clone-censored genotypes without that locus.
To avoid issues with numerical precision of small numbers, this function calculates pgen per locus by adding up log-transformed values of allele frequencies. These can easily be transformed to return the true value (see examples).
Note
For haploids, Pgen at a particular locus is the allele frequency. This
function cannot handle polyploids. Additionally, when the argument
pop
is not NULL
, by_pop
is automatically TRUE
.
References
Arnaud-Haond, S., Duarte, C. M., Alberto, F., & Serrão, E. A. 2007. Standardizing methods to address clonality in population studies. Molecular Ecology, 16(24), 5115-5139.
Parks, J. C., & Werth, C. R. 1993. A study of spatial features of clones in a population of bracken fern, Pteridium aquilinum (Dennstaedtiaceae). American Journal of Botany, 537-544.
Examples
data(Pram)
head(pgen(Pram, log = FALSE))
#> PrMS6A1 Pr9C3A1 PrMS39A1 PrMS45A1 PrMS43A1
#> 1411152-10B 0.5 0.5 0.15 0.5 0.111111111
#> 82 0.5 0.5 0.35 0.5 0.111111111
#> 83 0.5 0.5 0.15 0.5 0.002298851
#> 84 0.5 0.5 0.35 0.5 0.004597701
#> 85 0.5 0.5 0.35 0.5 0.222222222
#> 81-NM-1 0.5 0.5 0.35 0.5 0.111111111
if (FALSE) { # \dontrun{
# You can also supply the observed allele frequencies
pramfreq <- Pram %>% genind2genpop() %>% tab(freq = TRUE)
head(pgen(Pram, log = FALSE, freq = pramfreq))
# You can get the Pgen values over all loci by summing over the logged results:
pgen(Pram, log = TRUE) %>% # calculate pgen matrix
rowSums(na.rm = TRUE) %>% # take the sum of each row
exp() # take the exponent of the results
# You can also take the product of the non-logged results:
apply(pgen(Pram, log = FALSE), 1, prod, na.rm = TRUE)
## Rare Allele Correction ---------------------------------------------------
##
# If you don't supply a table of frequencies, they are calculated with rraf
# with correction = TRUE. This is normally benign when analyzing large
# populations, but it can have a great effect on small populations. To help
# control this, you can supply arguments described in
# help("rare_allele_correction").
# Default is to correct by 1/n per population. Since the calculation is
# performed on a smaller sample size due to round robin clone correction, it
# would be more appropriate to correct by 1/rrmlg at each locus. This is
# acheived by setting d = "rrmlg". Since this is a diploid, we would want to
# account for the number of chromosomes, and so we set mul = 1/2
head(pgen(Pram, log = FALSE, d = "rrmlg", mul = 1/2)) # compare with the output above
# If you wanted to treat all alleles as equally rare, then you would set a
# specific value (let's say the rare alleles are 1/100):
head(pgen(Pram, log = FALSE, e = 1/100))
} # }