Skip to contents

Given a series of thresholds for a data set that collapse it into one giant cluster, this will search the top fraction of threshold differences to find the largest difference. The average between the thresholds spanning that difference is the cutoff threshold defining the clonal lineage threshold.


cutoff_predictor(thresholds, fraction = 0.5)



a vector of numerics coming from mlg.filter where the threshold has been set to the maximum threshold theoretically possible.


the fraction of the data to seek the threshold.


a numeric value representing the threshold at which multilocus lineages should be defined.


This function originally appeared in doi:10.5281/zenodo.17424 . This is a bit of a blunt instrument.


ZN Kamvar, JC Brooks, and NJ Grünwald. 2015. Supplementary Material for Frontiers Plant Genetics and Genomics 'Novel R tools for analysis of genome-wide population genetic data with emphasis on clonality'. DOI: doi:10.5281/zenodo.17424

Kamvar ZN, Brooks JC and Grünwald NJ (2015) Novel R tools for analysis of genome-wide population genetic data with emphasis on clonality. Front. Genet. 6:208. doi: doi:10.3389/fgene.2015.00208


Zhian N. Kamvar


pinfreps <- fix_replen(Pinf, c(2, 2, 6, 2, 2, 2, 2, 2, 3, 3, 2))
pthresh  <- filter_stats(Pinf, distance = bruvo.dist, replen = pinfreps, 
                         plot = TRUE, stats = "THRESHOLD", threads = 1L)

# prediction for farthest neighbor
#> [1] 0.1132221

# prediction for all algorithms
sapply(pthresh, cutoff_predictor)
#>  farthest   average   nearest 
#> 0.1132221 0.1084407 0.1092773