
Genotype accumulation curves are useful for determining the minimum number of loci necessary to discriminate between individuals in a population. This function will randomly sample loci without replacement and count the number of multilocus genotypes observed.


genotype_curve(
  gen,
  sample = 100,
  maxloci = 0L,
  quiet = FALSE,
  thresh = 1,
  plot = TRUE,
  drop = TRUE,
  dropna = TRUE
)



gen

a genclone, genind, or loci object.

sample

an integer defining the number of times loci will be resampled without replacement.

maxloci

the maximum number of loci to sample. By default, maxloci = 0, which indicates that n - 1 loci are to be used, where n is the number of loci in the data. Note that this will always take min(n - 1, maxloci).

quiet

if FALSE (default), progress of the iterations will be displayed. If TRUE, nothing is printed to screen as the function runs.

thresh

a number from 0 to 1. This will draw a line at that fraction of multilocus genotypes, rounded. Defaults to 1, which will draw a line at the maximum number of observable genotypes.

plot

if TRUE (default), the genotype curve will be plotted via ggplot2. If FALSE, the resulting matrix will be visibly returned.

drop

if TRUE (default), monomorphic loci will be removed before analysis as these loci affect the shape of the curve.

dropna

if TRUE (default) and drop = TRUE, NAs will be ignored when determining if a locus is monomorphic. When FALSE, presence of NAs will result in the locus being retained. This argument has no effect when drop = FALSE.
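The interaction between the monomorphic-locus filter and missing data can be illustrated with a small base R sketch. This is not the package's implementation: the helper `is_monomorphic()` and the toy data frame are hypothetical, and the sketch only mirrors the behaviour described for the two arguments above.

```r
# Hypothetical helper: TRUE if a locus would count as monomorphic.
# With dropna = TRUE, NAs are ignored; with dropna = FALSE, any NA
# means the locus is retained (not treated as monomorphic).
is_monomorphic <- function(x, dropna = TRUE) {
  n_states <- length(unique(x[!is.na(x)]))
  if (dropna) {
    n_states <= 1L
  } else {
    n_states <= 1L && !anyNA(x)
  }
}

# Toy loci (illustrative values only)
toy <- data.frame(
  L1 = factor(c("a/a", "a/a", "a/a")),  # monomorphic, no missing data
  L2 = factor(c("b/b", NA, "b/b")),     # monomorphic apart from one NA
  L3 = factor(c("c/c", "c/d", "c/c"))   # polymorphic
)

mono_ignore_na <- vapply(toy, is_monomorphic, logical(1), dropna = TRUE)
mono_keep_na   <- vapply(toy, is_monomorphic, logical(1), dropna = FALSE)

# Loci that would remain after dropping monomorphic ones in each case
kept_ignore_na <- names(toy)[!mono_ignore_na]
kept_keep_na   <- names(toy)[!mono_keep_na]
```

In this toy case, L2 is removed when NAs are ignored and retained when they are not.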


(invisibly by default) a matrix of integers showing the results of each randomization. Columns represent the number of loci sampled and rows represent an independent sample.
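One way to use a matrix of this shape is to find the smallest number of loci at which the median number of observed genotypes reaches a chosen fraction of the total. The sketch below uses a hand-written matrix rather than real output; the values, the column names, `total_mlg`, and the 0.9 cutoff are all assumptions made for illustration.

```r
# Hypothetical result: 5 randomizations (rows) by 1 to 4 loci sampled (columns)
res <- matrix(
  c(3L, 6L,  9L, 10L,
    2L, 7L,  9L, 10L,
    4L, 6L,  8L, 10L,
    3L, 5L,  9L,  9L,
    2L, 7L, 10L, 10L),
  nrow = 5, byrow = TRUE,
  dimnames = list(NULL, as.character(1:4))
)

total_mlg <- 10L   # assumed number of genotypes observed with all loci
thresh    <- 0.9   # fraction of genotypes we want to recover
target    <- round(total_mlg * thresh)

# Median genotype count for each number of loci sampled
med <- apply(res, 2, median)

# Smallest number of loci whose median count meets the target
min_loci <- as.integer(names(med)[which(med >= target)[1]])
```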


Internally, this function works by converting the data into a loci object, which represents genotypes as a data frame of factors. Random samples of 1 to n - 1 columns are taken from this data frame, and the number of unique rows is counted to determine the number of multilocus genotypes in each random sample. This function does not take into account any definitions of MLGs via mlg.filter or mll.custom.
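The resampling idea described here can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's code: `curve_sketch()`, its `n_samples` argument, and the toy data frame are hypothetical names and values.

```r
# Toy data: 6 individuals by 4 loci, genotypes stored as factors
# (illustrative values only)
loci_df <- data.frame(
  L1 = factor(c("a/a", "a/b", "a/a", "a/b", "b/b", "a/a")),
  L2 = factor(c("c/c", "c/c", "c/d", "c/c", "c/d", "c/c")),
  L3 = factor(c("e/f", "e/f", "e/e", "e/f", "e/e", "f/f")),
  L4 = factor(c("g/g", "g/h", "g/g", "g/h", "g/g", "g/g"))
)

# For each randomization and each k in 1..(n - 1), draw k loci without
# replacement and count the unique rows (multilocus genotypes).
curve_sketch <- function(df, n_samples = 100) {
  n_loci   <- ncol(df)
  max_loci <- n_loci - 1L
  out <- matrix(NA_integer_, nrow = n_samples, ncol = max_loci,
                dimnames = list(NULL, as.character(seq_len(max_loci))))
  for (i in seq_len(n_samples)) {
    for (k in seq_len(max_loci)) {
      cols <- sample.int(n_loci, k)  # k distinct column indices
      out[i, k] <- nrow(unique(df[, cols, drop = FALSE]))
    }
  }
  out
}

set.seed(1)
res <- curve_sketch(loci_df, n_samples = 20)
colMeans(res)  # average genotype count for each number of loci
```

Rows of `res` correspond to randomizations and columns to the number of loci sampled, matching the layout of the returned matrix described above.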


Zhian N. Kamvar


data(nancycats)
nan_geno <- genotype_curve(nancycats)

# \dontrun{

# Marker Type Comparison --------------------------------------------------
# With AFLP data, it is often necessary to include more markers for resolution
data(Aeut)
Ageno <- genotype_curve(Aeut)

# Many microsatellite data sets have hypervariable markers
data(microbov)
mgeno <- genotype_curve(microbov)

# Adding a trendline ------------------------------------------------------

# Trendlines: you can add a smoothed trendline with geom_smooth()
library("ggplot2")
p <- last_plot()
p + geom_smooth()
#> `geom_smooth()` using method = 'gam' and formula = 'y ~ s(x, bs = "cs")'

# Producing Figures for Publication ---------------------------------------

# This data set has been pre-filtered
data(monpop)
mongeno <- genotype_curve(monpop)

# Here, we add a curve and a title for publication
p <- last_plot()
mytitle <- expression(paste("Genotype Accumulation Curve for ", 
                            italic("M. fructicola")))
p + geom_smooth() + 
  theme_bw() + 
  theme(text = element_text(size = 12, family = "serif")) +
  theme(title = element_text(size = 14)) +
  ggtitle(mytitle)
#> `geom_smooth()` using method = 'gam' and formula = 'y ~ s(x, bs = "cs")'

# }