missingno gives the user four options to deal with missing data: remove loci, remove samples, replace with zeroes, or replace with average allele counts.
Arguments
- pop
- type
  a character string: can be "ignore", "zero", "mean", "loci", or "geno" (see Details for definitions).
- cutoff
  numeric. A number from 0 to 1 indicating the allowable rate of missing data in either genotypes or loci. This will be ignored for type values of "mean" or "zero".
- quiet
  if TRUE, it will print to the screen the action performed.
- freq
  defaults to FALSE. This option is passed on to the tab function. If TRUE, the matrix in the genind object will be replaced by a numeric matrix (as opposed to integer). THIS IS NOT RECOMMENDED. USE THE FUNCTION tab instead.
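To make the role of cutoff concrete, here is a minimal base-R sketch of threshold-based removal on a toy genotype table. It is not the package's implementation: the object names (geno, kept_loci, kept_samples) and the toy data are hypothetical, and the "strictly greater than the cutoff is removed" rule is an assumption taken from the wording of the messages in the Examples.

```r
# Toy genotype table: 5 individuals (rows) x 3 loci (columns); NA = missing call.
# matrix() fills column by column, so each group of five values is one locus.
geno <- matrix(
  c("1/1", "1/2", NA,    "2/2", "1/2",   # locA: 1 of 5 missing (20%)
    "3/3", "3/4", "4/4", "3/4", "3/3",   # locB: none missing
    NA,    NA,    "5/6", NA,    "5/5"),  # locC: 3 of 5 missing (60%)
  nrow = 5,
  dimnames = list(paste0("ind", 1:5), c("locA", "locB", "locC"))
)

cutoff <- 0.25  # allowable rate of missing data (hypothetical value)

# Per-locus missing rate; keep loci at or below the cutoff.
locus_missing <- colMeans(is.na(geno))
kept_loci <- geno[, locus_missing <= cutoff, drop = FALSE]

# Per-sample missing rate; keep samples at or below the cutoff.
sample_missing <- rowMeans(is.na(geno))
kept_samples <- geno[sample_missing <= cutoff, , drop = FALSE]

colnames(kept_loci)     # loci that survive the filter
rownames(kept_samples)  # samples that survive the filter
```

Filtering loci and filtering samples are applied to the same original table here, so the two results are independent of each other, which matches choosing either "loci" or "geno" in a single call.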
Details
These methods provide a way to deal with systematic missing data and to give a wrapper for adegenet's tab function.
ALL OF THESE ARE TO BE USED WITH CAUTION.
Using this function with polyploid data (where missing data is coded as "0") may give spurious results.
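Since the replacement options wrap adegenet's tab function, calling tab directly may be the simpler route when only a filled matrix is needed. The following is a sketch under that assumption, using the nancycats data set shipped with adegenet; the object names X_asis, X_zero, and X_mean are hypothetical.

```r
library(adegenet)
data(nancycats)

# Allele-count matrix with missing values left as NA.
X_asis <- tab(nancycats, NA.method = "asis")

# Same matrix with missing values replaced by 0 or by the column mean.
X_zero <- tab(nancycats, NA.method = "zero")
X_mean <- tab(nancycats, NA.method = "mean")

anyNA(X_asis)  # missing values are still present
anyNA(X_mean)  # missing values have been filled in
```

This returns a plain matrix and leaves the genind object itself untouched, which is the usual reason to prefer it over modifying the object in place.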
Treatment types
- "ignore" - does not remove or replace missing data.
- "loci" - removes all loci containing missing data in the entire data set.
- "genotype" - removes any genotypes/isolates/individuals with missing data.
- "mean" - replaces all NA's with the mean of the alleles for the entire data set.
- "zero" or "0" - replaces all NA's with "0". Introduces more diversity.
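The two replacement treatments can be illustrated on a small allele-count matrix in base R. This is only a sketch of the idea, not the package's code: the matrix counts and its values are made up, and "mean" is assumed to mean the per-column (per-allele) mean of the observed values.

```r
# Toy allele-count matrix: 4 individuals (rows) x 3 allele columns; NA = missing.
counts <- matrix(
  c(2, 1, NA, 0,    # locA.1
    0, 1, NA, 2,    # locA.2
    1, 1, 2,  NA),  # locB.1
  nrow = 4,
  dimnames = list(paste0("ind", 1:4), c("locA.1", "locA.2", "locB.1"))
)

# "zero": every NA becomes 0.
zero_filled <- counts
zero_filled[is.na(zero_filled)] <- 0

# "mean": every NA becomes the mean of the observed values in its column.
col_means <- colMeans(counts, na.rm = TRUE)
missing_idx <- which(is.na(counts), arr.ind = TRUE)  # (row, col) of each NA
mean_filled <- counts
mean_filled[missing_idx] <- col_means[missing_idx[, "col"]]

zero_filled
mean_filled
```

Note that mean replacement yields non-integer values (for example 4/3 in locB.1), so the result is no longer a matrix of whole allele counts.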
See also
tab, poppr, poppr.amova, nei.dist, aboot
Examples
data(nancycats)
nancy.locina <- missingno(nancycats, type = "loci")
#>
#> Found 617 missing values.
#>
#> 2 loci contained missing values greater than 5%
#>
#> Removing 2 loci: fca8, fca45
## Found 617 missing values.
## 2 loci contained missing values greater than 5%.
## Removing 2 loci : fca8 fca45
nancy.genona <- missingno(nancycats, type = "geno")
#>
#> Found 617 missing values.
#>
#> 38 genotypes contained missing values greater than 5%
#>
#> Removing 38 genotypes: N215, N216, N188, N189, N190, N191, N192, N298,
#> N299, N300, N301, N302, N303, N304, N310, N195, N197, N198, N199, N200,
#> N201, N206, N182, N184, N186, N282, N283, N288, N291, N292, N293, N294,
#> N295, N296, N297, N281, N289, N290
## Found 617 missing values.
## 38 genotypes contained missing values greater than 5%.
## Removing 38 genotypes : N215 N216 N188 N189 N190 N191 N192 N302 N304 N310
## N195 N197 N198 N199 N200 N201 N206 N182 N184 N186 N298 N299 N300 N301 N303
## N282 N283 N288 N291 N292 N293 N294 N295 N296 N297 N281 N289 N290
# Replacing all NA with "0" (see tab in the adegenet package).
nancy.0 <- missingno(nancycats, type = "0")
#>
#> Replaced 617 missing values.
## Replaced 617 missing values
# Replacing all NA with the mean of each column (see tab in the
# adegenet package).
nancy.mean <- missingno(nancycats, type = "mean")
#>
#> Replaced 617 missing values.
## Replaced 617 missing values