Given the series of thresholds at which a data set is progressively collapsed into one giant cluster, this function searches the top fraction of threshold differences for the largest difference. The average of the two thresholds spanning that difference is returned as the cutoff threshold defining clonal lineages.
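The search described above can be sketched in a few lines of base R. This is an illustrative re-implementation, not the package source: the name `cutoff_sketch` and the `toy` vector are hypothetical, and it assumes that "top fraction" means the first `fraction` of the thresholds in the order supplied.

```r
# Hypothetical sketch for illustration only; not the poppr implementation.
cutoff_sketch <- function(thresholds, fraction = 0.5) {
  stopifnot(is.numeric(thresholds), length(thresholds) >= 2)
  # Assumption: keep the first `fraction` of the thresholds as given,
  # but never fewer than two values or more than the whole vector.
  n_keep <- min(length(thresholds), max(2, round(length(thresholds) * fraction)))
  kept   <- thresholds[seq_len(n_keep)]
  # Differences between consecutive thresholds in the retained portion
  gaps <- diff(kept)
  # Position of the largest gap (first one if there are ties)
  i <- which.max(gaps)
  # The cutoff is the midpoint of the two thresholds spanning that gap
  mean(kept[c(i, i + 1)])
}

# Made-up thresholds, not real data
toy <- c(0.01, 0.02, 0.03, 0.10, 0.11, 0.12, 0.30, 0.60)
cutoff_sketch(toy)                # searches 0.01 .. 0.10 only
cutoff_sketch(toy, fraction = 1)  # searches the whole vector
```

With the default `fraction`, the large gaps near the end of `toy` are ignored, so the cutoff falls between 0.03 and 0.10; with `fraction = 1` the final jump dominates instead. This is one way to see why the result is sensitive to `fraction`.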


cutoff_predictor(thresholds, fraction = 0.5)



thresholds: a numeric vector from mlg.filter where the threshold has been set to the maximum threshold theoretically possible.


fraction: the fraction of the data in which to search for the threshold (default 0.5).


a numeric value representing the threshold at which multilocus lineages should be defined.


This function originally appeared in doi:10.5281/zenodo.17424. This is a bit of a blunt instrument.


Zhian N. Kamvar


library(poppr)
data(Pinf)
pinfreps <- fix_replen(Pinf, c(2, 2, 6, 2, 2, 2, 2, 2, 3, 3, 2))
pthresh  <- filter_stats(Pinf, distance = bruvo.dist, replen = pinfreps, 
                         plot = TRUE, stats = "THRESHOLD", threads = 1L)

# prediction for farthest neighbor
cutoff_predictor(pthresh$farthest)
#> [1] 0.1132221

# prediction for all algorithms
sapply(pthresh, cutoff_predictor)
#>  farthest   average   nearest 
#> 0.1132221 0.1084407 0.1092773