
This function removes phylogenetically uninformative loci from a genclone or genind object. The user defines what counts as uninformative by setting a cutoff for the percentage of differentiating genotypes, the minor allele frequency, or both.

Usage

informloci(pop, cutoff = 2/nInd(pop), MAF = 0.01, quiet = FALSE)

Arguments

pop

a genclone or genind object.

cutoff

numeric. A number from 0 to 1 defining the minimum proportion of differentiating samples. The default, 2/nInd(pop), corresponds to two samples.

MAF

numeric. A number from 0 to 1 defining the minimum minor allele frequency. This is passed as the thresh parameter of isPoly.

quiet

logical. When quiet = FALSE (default), messages indicating the loci removed will be printed to screen. When quiet = TRUE, nothing will be printed to screen.
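
The relationship between the cutoff proportion and the sample count reported in the messages can be sketched in base R. This is only an illustration using the two sample sizes that appear in the Examples (100 and 1,903); the rounding step is an assumption inferred from the printed output, not a statement about the package internals.

```r
# Sample sizes taken from the Examples section (dummy data and H3N2).
n_samples <- c(dummy = 100, H3N2 = 1903)

# Default cutoff: two samples expressed as a proportion of all samples.
cutoff <- 2 / n_samples

# The same cutoff as a percentage, as shown in the printed messages.
percent <- cutoff * 100

# Assumed conversion back to a whole number of samples.
min_samples <- round(cutoff * n_samples)
min_samples
```

Under this reading, the default always corresponds to two samples, however many individuals the data set contains.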

Value

A genind object with user-defined informative loci.

Details

This function removes uninformative loci using a traditional MAF cutoff (using isPoly from adegenet) as well as by examining the number of observed genotypes at each locus. The second check is important for clonal organisms, which can have fixed heterozygous sites that MAF methods do not detect.
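
To make the two criteria concrete, here is a minimal base-R sketch applied to the dummy loci w, x, and y from the Examples. The helpers `fails_cutoff` and `fails_maf` are hypothetical names introduced for illustration. The exact rules they encode (a locus fails the cutoff when more than n - 2 samples share one genotype, and fails the MAF check when fewer than two alleles reach the threshold) are assumptions chosen to agree with the example output, not the package's own code.

```r
# Dummy loci mirroring w, x, and y from the Examples (100 diploid samples).
loci <- list(
  w = c(rep("A/B", 99), "A/C"),
  x = c(rep("A/A", 98), "A/C", "A/B"),
  y = c(rep("A/A", 99), "A/B")
)

# Hypothetical helper: TRUE when a single genotype is shared by more than
# (n - min_ind) samples, i.e. fewer than min_ind samples differ from it.
fails_cutoff <- function(genotypes, min_ind) {
  any(table(genotypes) > length(genotypes) - min_ind)
}

# Hypothetical helper: TRUE when fewer than two alleles reach frequency maf.
fails_maf <- function(genotypes, maf) {
  alleles <- unlist(strsplit(genotypes, "/", fixed = TRUE))
  freqs <- table(alleles) / length(alleles)
  sum(freqs >= maf) < 2
}

by_cutoff <- vapply(loci, fails_cutoff, logical(1), min_ind = 2)
by_maf <- vapply(loci, fails_maf, logical(1), maf = 0.01)

names(which(by_cutoff)) # "w" "y"
names(which(by_maf))    # "x" "y"
```

Locus w is a fixed heterozygote (A/B in 99 of 100 samples), so its allele frequencies look balanced and only the genotype check flags it. Locus x has rare alleles spread over two samples, so only the MAF check flags it.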

Note

Removing loci has a few side effects that affect certain analyses. First, the number of multilocus genotypes might be reduced due to the reduced number of markers (if you are only using a genind object). Second, if you plan on using these data for analysis of the index of association, be sure to use the standardized version (rbarD), which corrects for the number of observed loci.
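
The first side effect can be illustrated with a toy example in base R. The data frame `geno` and the helper `n_mlg` are hypothetical and simply count unique rows as a stand-in for multilocus genotypes; this is a sketch of the idea, not how the package computes them.

```r
# Toy genotype table: four samples scored at two loci.
geno <- data.frame(
  v = c("A/A", "A/A", "A/B", "A/B"),
  w = c("A/B", "A/C", "A/B", "A/B"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: number of distinct multilocus genotypes (unique rows).
n_mlg <- function(df) nrow(unique(df))

n_mlg(geno)                       # 3 with both loci
n_mlg(geno[, "v", drop = FALSE])  # 2 once locus w is dropped
```

Samples 1 and 2 differ only at locus w, so removing that locus merges them into a single multilocus genotype.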

Author

Zhian N. Kamvar

Examples

# We will use a dummy data set to demonstrate how this detects uninformative
# loci using both MAF and a cutoff.

genos <- c("A/A", "A/B", "A/C", "B/B", "B/C", "C/C")

v <- sample(genos, 100, replace = TRUE)
w <- c(rep(genos[2], 99), genos[3])           # found by cutoff
x <- c(rep(genos[1], 98), genos[3], genos[2]) # found by MAF
y <- c(rep(genos[1], 99), genos[2])           # found by both
z <- sample(genos, 100, replace = TRUE)
dat <- df2genind(data.frame(v = v, w = w, x = x, y = y, z = z), sep = "/")

informloci(dat)
#> cutoff value: 2 % ( 2 samples ).
#> MAF         : 0.01
#> 
#>  Found 3 uninformative loci 
#>  ============================ 
#>  2 loci found with a cutoff of 2 samples :
#>  w, y 
#>  2 loci found with MAF < 0.01 :
#>  x, y
#> /// GENIND OBJECT /////////
#> 
#>  // 100 individuals; 2 loci; 6 alleles; size: 14.6 Kb
#> 
#>  // Basic content
#>    @tab:  100 x 6 matrix of allele counts
#>    @loc.n.all: number of alleles per locus (range: 3-3)
#>    @loc.fac: locus factor for the 6 columns of @tab
#>    @all.names: list of allele names for each locus
#>    @ploidy: ploidy of each individual  (range: 2-2)
#>    @type:  codom
#>    @call: .local(x = x, i = i, j = j, loc = ..1, drop = drop)
#> 
#>  // Optional content
#>    - empty -

# \dontrun{
# Ignore MAF
informloci(dat, MAF = 0)
#> cutoff value: 2 % ( 2 samples ).
#> MAF         : 0
#> 
#>  Found 2 uninformative loci 
#>  ============================ 
#>  2 loci found with a cutoff of 2 samples :
#>  w, y 
#>  0 loci found with MAF < 0  
#> /// GENIND OBJECT /////////
#> 
#>  // 100 individuals; 3 loci; 9 alleles; size: 16.5 Kb
#> 
#>  // Basic content
#>    @tab:  100 x 9 matrix of allele counts
#>    @loc.n.all: number of alleles per locus (range: 3-3)
#>    @loc.fac: locus factor for the 9 columns of @tab
#>    @all.names: list of allele names for each locus
#>    @ploidy: ploidy of each individual  (range: 2-2)
#>    @type:  codom
#>    @call: .local(x = x, i = i, j = j, loc = ..1, drop = drop)
#> 
#>  // Optional content
#>    - empty -

# Ignore cutoff
informloci(dat, cutoff = 0)
#> cutoff value: 0 % ( 0 samples ).
#> MAF         : 0.01
#> 
#>  Found 2 uninformative loci 
#>  ============================ 
#>  0 loci found with a cutoff of 0 samples   
#>  2 loci found with MAF < 0.01 :
#>  x, y
#> /// GENIND OBJECT /////////
#> 
#>  // 100 individuals; 3 loci; 9 alleles; size: 16.5 Kb
#> 
#>  // Basic content
#>    @tab:  100 x 9 matrix of allele counts
#>    @loc.n.all: number of alleles per locus (range: 3-3)
#>    @loc.fac: locus factor for the 9 columns of @tab
#>    @all.names: list of allele names for each locus
#>    @ploidy: ploidy of each individual  (range: 2-2)
#>    @type:  codom
#>    @call: .local(x = x, i = i, j = j, loc = ..1, drop = drop)
#> 
#>  // Optional content
#>    - empty -

# Real data
data(H3N2)
informloci(H3N2)
#> cutoff value: 0.105097214923805 % ( 2 samples ).
#> MAF         : 0.01
#> 
#>  Found 5 uninformative loci 
#>  ============================ 
#>  1 locus found with a cutoff of 2 samples :
#>  597 
#>  5 loci found with MAF < 0.01 :
#>  42, 313, 433, 597, 915
#> /// GENIND OBJECT /////////
#> 
#>  // 1,903 individuals; 120 loci; 322 alleles; size: 3.3 Mb
#> 
#>  // Basic content
#>    @tab:  1903 x 322 matrix of allele counts
#>    @loc.n.all: number of alleles per locus (range: 2-4)
#>    @loc.fac: locus factor for the 322 columns of @tab
#>    @all.names: list of allele names for each locus
#>    @ploidy: ploidy of each individual  (range: 1-1)
#>    @type:  codom
#>    @call: .local(x = x, i = i, j = j, loc = ..1, drop = drop)
#> 
#>  // Optional content
#>    @other: a list containing: x  xy  epid 
#> 

# }