Poppr provides tools for population genetic analysis that include genotypic diversity measures, genetic distances with bootstrap support, native organization and handling of population hierarchies, and clone correction.
To cite poppr, please use citation("poppr"). When referring to poppr in your manuscript, please use lower case unless it occurs at the beginning of a sentence.
Details
This package relies on the adegenet package. It is built around the genind and genlight objects. Genind objects store genetic information in a table of allele frequencies, while genlight objects store SNP data efficiently by packing binary allele calls into single bits. Poppr has extended these objects into new objects called genclone and snpclone, respectively. These objects are designed for analysis of clonal organisms: they add the @mlg slot for keeping track of multilocus genotypes and multilocus lineages.
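The relationship between genind and genclone objects can be sketched as follows. This is a minimal illustration, assuming that the bundled partial_clone data set is a genind object and that poppr is installed; the variable name pc is only for this example.

```r
library(poppr)

# partial_clone is a simulated SSR data set shipped with poppr.
data(partial_clone)

# Promote the genind object to a genclone object; this adds the @mlg slot.
pc <- as.genclone(partial_clone)

is(pc, "genclone")     # the new class
is(pc, "genind")       # genclone extends genind, so adegenet functions still apply
head(mlg.vector(pc))   # multilocus genotype assignments, one per individual
```

Because genclone inherits from genind, functions from adegenet should accept the converted object unchanged.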
Documentation
Documentation is available for any function by typing ?function_name in the R console. Detailed topic explanations live in the package vignettes:
Vignette | Command |
Data import and manipulation | vignette("poppr_manual", "poppr") |
Algorithms and Equations | vignette("algo", "poppr") |
Multilocus Genotype Analysis | vignette("mlg", "poppr") |
Essential functions for importing and manipulating data are detailed in the Data import and manipulation vignette, the algorithms used in poppr are described in the Algorithms and Equations vignette, and working with multilocus genotypes is covered in the Multilocus Genotype Analysis vignette.
Examples of analyses are available in a primer written by Niklaus J. Grünwald, Zhian N. Kamvar, and Sydney E. Everhart at https://grunwaldlab.github.io/Population_Genetics_in_R/.
Getting help
If you have a specific question or issue with poppr, feel free to contribute to the Google group at https://groups.google.com/d/forum/poppr. If you find a bug and are a GitHub user, you can submit bug reports at https://github.com/grunwaldlab/poppr/issues. Otherwise, leave a message on the group. Personal emails are highly discouraged as they do not allow others to learn.
Functions in poppr
Below are descriptions and links to functions found in poppr. Be aware that all functions in adegenet are also available. The functions are documented as:
function_name() (data type) - Description
Where ‘data type’ refers to the type of data that can be used:
m | a genclone or genind object |
s | a snpclone or genlight object |
x | a different data type (e.g. a matrix from mlg.table()) |
Data import/export
getfile() (x) - Provides a quick GUI to grab files for import
read.genalex() (x) - Reads GenAlEx formatted csv files to a genind object
genind2genalex() (m) - Converts genind objects to GenAlEx formatted csv files
genclone2genind() (m) - Removes the @mlg slot from genclone objects
as.genambig() (m) - Converts genind data to polysat's genambig data structure.
bootgen2genind() (x) - see aboot() for details
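As an illustration of data import, the sketch below reads a GenAlEx formatted csv file with read.genalex(). It assumes the example file files/rootrot.csv is installed with the package; the variable names are only for this example.

```r
library(poppr)

# Path to a GenAlEx formatted csv file bundled with poppr (assumed location).
rootrot_file <- system.file("files/rootrot.csv", package = "poppr")

# Import the file; by default the result is a genclone object.
rootrot <- read.genalex(rootrot_file)

# Print a summary of individuals, loci, and populations.
rootrot
```

For your own data, getfile() can be used to pick the file interactively before passing its path to read.genalex().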
Data Structures
Poppr defines two main data structures: "genclone" (based on adegenet's genind) and "snpclone" (based on adegenet's genlight, for large SNP data sets). Both are defined by the presence of an extra MLG slot representing multilocus genotype assignments, which can be a numeric vector or an MLG class object.
genclone - Handles microsatellite, presence/absence, and small SNP data sets
snpclone - Designed to handle larger binary SNP data sets.
MLG - An internal class holding a data frame of multilocus genotype assignments that acts like a vector, allowing the user to easily switch between different MLG definitions.
bootgen - An internal class used explicitly for aboot() that inherits the gen-class virtual object. It is designed to allow for sampling loci with replacement.
bruvomat - An internal class designed to handle bootstrapping for Bruvo's distance where blocks of integer loci can be shuffled.
Data manipulation
as.genclone() (m) - Converts genind objects to genclone objects
missingno() (m) - Handles missing data
clonecorrect() (m | s) - Clone-censors at a specified population hierarchy
informloci() (m) - Detects and removes phylogenetically uninformative loci
popsub() (m | s) - Subsets genind objects by population
shufflepop() (m) - Shuffles genotypes at each locus using four different shuffling algorithms
recode_polyploids() (m | x) - Recodes polyploid data sets with missing alleles imported as "0"
make_haplotypes() (m | s) - Splits data into pseudo-haplotypes. This is mainly used in AMOVA.
test_replen() (m) - Tests for inconsistent repeat lengths in microsatellite data. For use in bruvo.dist() functions.
fix_replen() (m) - Fixes inconsistent repeat lengths. For use in bruvo.dist() functions.
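Two of the manipulation functions listed above, clonecorrect() and popsub(), can be sketched on the Pinf data set. This example assumes that Pinf carries a stratum named Continent; the variable names are only for illustration.

```r
library(poppr)
data(Pinf)  # SSR data for Phytophthora infestans

# Clone-censor: keep one representative of each multilocus genotype
# within each level of the Continent stratum (assumed stratum name).
Pinf_cc <- clonecorrect(Pinf, strata = ~Continent)

nInd(Pinf)     # number of individuals before clone correction
nInd(Pinf_cc)  # number of individuals after clone correction

# Subset the data to a single population, selected by name.
first_pop <- popNames(Pinf)[1]
Pinf_sub  <- popsub(Pinf, sublist = first_pop)
popNames(Pinf_sub)
```

Clone correction is typically done before analyses that assume each genotype is sampled once per population.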
Genetic distances
bruvo.dist() (m) - Bruvo's distance (see also: fix_replen())
diss.dist() (m) - Absolute genetic distance (see prevosti.dist())
nei.dist() (m | x) - Nei's 1978 genetic distance
rogers.dist() (m | x) - Rogers' Euclidean distance
reynolds.dist() (m | x) - Reynolds' coancestry distance
edwards.dist() (m | x) - Edwards' angular distance
prevosti.dist() (m | x) - Prevosti's absolute genetic distance
bitwise.dist() (s) - Calculates fast pairwise distances for genlight objects.
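As a brief sketch of how the distance functions are called, the example below computes two pairwise distances between individuals in the simulated partial_clone data set. The variable names are only for illustration.

```r
library(poppr)
data(partial_clone)

# Absolute genetic distance between all pairs of individuals.
d_abs <- diss.dist(partial_clone)

# Prevosti's absolute genetic distance on the same data.
d_prev <- prevosti.dist(partial_clone)

class(d_abs)         # both results are "dist" objects
attr(d_abs, "Size")  # one entry per individual
```

Because the results are standard "dist" objects, they can be passed to functions such as hclust() or to poppr.msn().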
Bootstrapping
aboot() (m | s | x) - Creates a bootstrapped dendrogram for any distance measure
bruvo.boot() (m) - Produces dendrograms with bootstrap support based on Bruvo's distance
diversity_boot() (x) - Generates bootstrap distributions of diversity statistics for multilocus genotypes
diversity_ci() (m | s | x) - Generates confidence intervals for multilocus genotype diversity.
resample.ia() (m) - Calculates the index of association over subsets of data.
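A bootstrapped dendrogram from aboot() might be produced as in the sketch below. The choice of Prevosti's distance, UPGMA, and only 20 bootstrap replicates is an assumption made to keep the example quick; real analyses would normally use many more replicates.

```r
library(poppr)
data(partial_clone)

set.seed(2024)  # bootstrapping is random; fix the seed for reproducibility

# UPGMA tree of individuals with bootstrap support from resampling loci.
# showtree = FALSE suppresses the plot; quiet = TRUE suppresses progress output.
boot_tree <- aboot(partial_clone,
                   tree     = "upgma",
                   distance = prevosti.dist,
                   sample   = 20,
                   showtree = FALSE,
                   quiet    = TRUE)

class(boot_tree)            # a "phylo" tree object
head(boot_tree$node.label)  # bootstrap support values on internal nodes
```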
Multilocus Genotypes
mlg() (m | s) - Calculates the number of multilocus genotypes
mll() (m | s) - Displays the current multilocus lineages (genotypes) defined.
mlg.crosspop() (m | s) - Finds all multilocus genotypes that cross populations
mlg.table() (m | s) - Returns a table of populations by multilocus genotypes
mlg.vector() (m | s) - Returns a vector of a numeric multilocus genotype assignment for each individual
mlg.id() (m | s) - Finds all individuals associated with a single multilocus genotype
mlg.filter() (m | s) - Collapses MLGs by genetic distance
filter_stats() (m | s) - Calculates mlg.filter for all algorithms and plots
cutoff_predictor() (x) - Predicts cutoff threshold from mlg.filter.
mll.custom() (m | s) - Allows for the custom definition of multilocus lineages
mll.levels() (m | s) - Allows the user to change levels of custom MLLs.
mll.reset() (m | s) - Reset multilocus lineages.
diversity_stats() (x) - Creates a table of diversity indices for multilocus genotypes.
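The basic multilocus genotype functions above fit together as in the following sketch on the Pinf data set. The variable names are only for illustration.

```r
library(poppr)
data(Pinf)

# Number of distinct multilocus genotypes in the whole data set.
n_mlg <- mlg(Pinf, quiet = TRUE)

# Numeric MLG assignment for each individual.
mlg_vec <- mlg.vector(Pinf)

# Table of counts with populations in rows and MLGs in columns.
mlg_tab <- mlg.table(Pinf, plot = FALSE)

n_mlg
head(mlg_vec)
dim(mlg_tab)
```

The matrix returned by mlg.table() is an example of the "x" data type accepted by functions such as diversity_stats().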
Population Genetic Analysis
poppr.amova() (m | s) - Analysis of Molecular Variance (as implemented in ade4)
poppr() (m | x) - Returns a diversity table by population
poppr.all() (m | x) - Returns a diversity table by population for all compatible files specified
private_alleles() (m) - Tabulates the occurrences of alleles that only occur in one population.
locus_table() (m) - Creates a table of summary statistics per locus.
rrmlg() (m | x) - Round-robin multilocus genotype estimates.
rraf() (m) - Round-robin allele frequency estimates.
pgen() (m) - Probability of genotypes.
psex() (m) - Probability of observing a genotype more than once.
rare_allele_correction (m) - rules for correcting rare alleles for round-robin estimates.
incomp() (m) - Check data for incomparable samples.
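The diversity table returned by poppr() can be sketched on the Aeut data set as follows. The column selection is only for illustration, and no permutations are run here because sample is left at its default.

```r
library(poppr)
data(Aeut)  # AFLP data for Aphanomyces euteiches

# One row per population plus a pooled row, since total = TRUE by default.
div <- poppr(Aeut, quiet = TRUE)

# Sample size, number of MLGs, and two diversity indices per population.
div[, c("Pop", "N", "MLG", "H", "lambda")]
```

To test the index of association, the sample argument can be set to the number of permutations desired.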
Visualization
imsn() (m | s) - Interactive construction and visualization of minimum spanning networks
plot_poppr_msn() (m | s | x) - Plots minimum spanning networks produced in poppr with scale bar and legend
greycurve() (x) - Helper to determine the appropriate parameters for adjusting the grey level for msn functions
bruvo.msn() (m) - Produces minimum spanning networks based off Bruvo's distance colored by population
poppr.msn() (m | s | x) - Produces a minimum spanning network for any pairwise distance matrix related to the data
info_table() (m) - Creates a heatmap representing missing data or observed ploidy
genotype_curve() (m | x) - Creates a series of boxplots to demonstrate how many markers are needed to represent the diversity of your data.
Datasets
Aeut() - (AFLP) Oomycete root rot pathogen Aphanomyces euteiches (Grünwald and Hoheisel, 2006)
monpop() - (SSR) Peach brown rot pathogen Monilinia fructicola (Everhart and Scherm, 2015)
partial_clone() - (SSR) partially-clonal data simulated via simuPOP (Peng and Amos, 2008)
Pinf() - (SSR) Potato late blight pathogen Phytophthora infestans (Goss et al., 2014)
Pram() - (SSR) Sudden Oak Death pathogen Phytophthora ramorum (Kamvar et al., 2015; Goss et al., 2009)
References
--------- Papers announcing poppr ---------
Kamvar ZN, Tabima JF, Grünwald NJ. (2014) Poppr: an R package for genetic analysis of populations with clonal, partially clonal, and/or sexual reproduction. PeerJ 2:e281 doi:10.7717/peerj.281
Kamvar ZN, Brooks JC and Grünwald NJ (2015) Novel R tools for analysis of genome-wide population genetic data with emphasis on clonality. Front. Genet. 6:208. doi:10.3389/fgene.2015.00208
--------- Papers referencing data sets ---------
Grünwald, NJ and Hoheisel, G.A. 2006. Hierarchical Analysis of Diversity, Selfing, and Genetic Differentiation in Populations of the Oomycete Aphanomyces euteiches. Phytopathology 96:1134-1141 doi:10.1094/PHYTO-96-1134
SE Everhart, H Scherm, (2015) Fine-scale genetic structure of Monilinia fructicola during brown rot epidemics within individual peach tree canopies. Phytopathology 105:542-549 doi:10.1094/PHYTO-03-14-0088-R
Bo Peng and Christopher Amos (2008) Forward-time simulations of nonrandom mating populations using simuPOP. Bioinformatics, 24 (11): 1408-1409.
Goss, Erica M., Javier F. Tabima, David EL Cooke, Silvia Restrepo, William E. Fry, Gregory A. Forbes, Valerie J. Fieland, Martha Cardenas, and Niklaus J. Grünwald. (2014) "The Irish potato famine pathogen Phytophthora infestans originated in central Mexico rather than the Andes." Proceedings of the National Academy of Sciences 111:8791-8796. doi:10.1073/pnas.1401884111
Kamvar, Z. N., Larsen, M. M., Kanaskie, A. M., Hansen, E. M., & Grünwald, N. J. (2015). Spatial and temporal analysis of populations of the sudden oak death pathogen in Oregon forests. Phytopathology 105:982-989. doi:10.1094/PHYTO-12-14-0350-FI
Goss, E. M., Larsen, M., Chastagner, G. A., Givens, D. R., and Grünwald, N. J. 2009. Population genetic analysis infers migration pathways of Phytophthora ramorum in US nurseries. PLoS Pathog. 5:e1000583. doi:10.1371/journal.ppat.1000583