Importing data from genalex-formatted \*.csv files.

read.genalex reads a genalex-formatted file that has been exported in a comma-separated format and parses most types of genalex data. The output is a genclone-class or genind-class object.

Usage

read.genalex(
  genalex,
  ploidy = 2,
  geo = FALSE,
  region = FALSE,
  genclone = TRUE,
  sep = ",",
  recode = FALSE
)

Arguments

genalex: a \*.csv file exported from genalex
ploidy: an integer to indicate the ploidy of the dataset
geo: indicates the presence of geographic data in the file. This data will be included in a data frame labeled xy in the other() slot.
region: indicates the presence of regional data in the file.
genclone: when TRUE (default), the output will be a genclone object. When FALSE, the output will be a genind object
sep: a character specifying the column separator of the data. Defaults to ",".
recode: for polyploid data, whether to recode the data to have varying ploidy. Default is FALSE, and the data will be returned with even ploidy where missing alleles are coded as "0". When TRUE, the data is run through the function recode_polyploids() before being returned. Note that this will prevent conversion to genpop objects in the future. See Details.
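
As a minimal sketch of how the genclone argument changes the return type, the calls below read the rootrot.csv file that ships with poppr (the same file used in the Examples section). The variable names are illustrative only, and sep = "," is written out just to show where a different separator would go.

```r
library(poppr)

# Path to the example genalex file bundled with poppr
path <- system.file("files/rootrot.csv", package = "poppr")

# Default: genclone = TRUE returns a genclone object
as_genclone <- read.genalex(path)

# genclone = FALSE returns a plain genind object instead
as_genind <- read.genalex(path, genclone = FALSE, sep = ",")

class(as_genclone)
class(as_genind)
```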

Value

A genclone or genind object.

Details

The resulting genclone-class or genind-class object will have a single stratum defined in the strata slot. This will be called "Pop" and will reflect the population factor defined in the genalex input. If region = TRUE, a second column will be inserted and labeled "Region". If you have more than two strata within your data set, you should run the command adegenet::splitStrata() on your data set to define the unique stratifications.
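
The splitting step can be sketched on a toy object. The example below does not read a genalex file; it builds a small genind object by hand with adegenet::df2genind() and gives it a single "Pop" column whose values combine two levels with an underscore, which is assumed here to mimic what an import with combined population names would look like. The genotypes, the level names Site and Year, and the population labels are all made up for illustration.

```r
library(adegenet)

# Toy diploid genotypes for four individuals (made-up data)
genotypes <- data.frame(
  loc1 = c("1/2", "1/1", "2/2", "1/2"),
  loc2 = c("3/3", "3/4", "4/4", "3/4")
)

# A single "Pop" column holding two strata joined by "_"
combined <- data.frame(
  Pop = c("North_2019", "North_2020", "South_2019", "South_2020")
)

x <- df2genind(genotypes, sep = "/", ploidy = 2, strata = combined)
nameStrata(x)   # one stratum before splitting

# Split the combined column into two named strata
splitStrata(x) <- ~Site/Year
nameStrata(x)
strata(x)
```

splitStrata() splits on "_" by default; a different separator can be passed through its sep argument.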

FOR POLYPLOID (> 2n) DATA SETS

The genind object has an all-or-none approach to missing data. If a sample has missing data at a particular locus, then the entire locus is considered missing for that sample. This works for diploids and haploids where allelic dosage is unambiguous. For polyploids this poses a problem as much of the data set would be transformed into missing data. With this function, I have created a workaround.

When importing polyploid data sets, missing data is scored as "0" and kept within the genind object as an extra allele. This will break most analyses relying on allele frequencies*. All of the functions in poppr will work properly with these data sets as multilocus genotype analysis is agnostic of ploidy and we have written both Bruvo's distance and the index of association in such a way as to be able to handle polyploids presented in this manner.

\* To restore functionality of analyses relying on allele frequencies, use the recode_polyploids() function.
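
A short sketch of that recoding step, assuming the Pinf data set bundled with poppr is a polyploid data set stored with "0" as the placeholder for missing alleles. Comparing the number of alleles per locus before and after recoding shows the extra "0" allele being dropped; no specific counts are claimed here.

```r
library(poppr)

# Bundled polyploid data set, assumed to carry "0" placeholder alleles
data(Pinf)

# Remove the placeholder alleles so allele-frequency analyses can be used
Pinf_recoded <- recode_polyploids(Pinf)

# Number of alleles per locus before and after recoding
nAll(Pinf)
nAll(Pinf_recoded)
```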

Note

This function cannot handle raw allele frequency data.

In the case that there are duplicated names within the file, this function will assume separate individuals and rename each one to a sequence of integers from 1 to the number of individuals. A vector of the original names will be saved in the other slot under original_names.

Author

Zhian N. Kamvar

Examples


# \dontrun{
Aeut <- read.genalex(system.file("files/rootrot.csv", package="poppr"))

# genalex2.csv, genalex3.csv, and genalex4.csv are placeholder file names;
# substitute the paths to your own files.

# A genalex file with geographic coordinate data.
genalex2 <- read.genalex("genalex2.csv", geo=TRUE)

# A genalex file with regional information.
genalex3 <- read.genalex("genalex3.csv", region=TRUE)

# A genalex file with both regional and geographic information.
genalex4 <- read.genalex("genalex4.csv", region=TRUE, geo=TRUE)
# }