Format tool for genetic data

eco.format(data, ncod = NULL, nout = 3, ploidy = 2, sep.in, sep.out,
  fill.mode = c("last", "first", "none"), recode = c("none", "all",
  "column"), show.codes = FALSE)

Arguments

data

Genetic data frame.

ncod

Number of digits coding each allele in the input file.

nout

Number of digits in the output.

ploidy

Ploidy of the data.

sep.in

Character separating alleles in the input data if present.

sep.out

Character separating alleles in the output data. Default

fill.mode

Add zeros at the beginning ("first") or the end ("last") of each allele. Default = "last".

recode

Recode mode: "none" for no recoding (default), "all" for recoding the data considering the values of all individuals at once (e.g., protein data), or "column" for recoding the values column by column (e.g., microsatellite data).

show.codes

Should tables with the equivalence between the old and new codes be returned when recode = "all" or recode = "column"? Default = FALSE.
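
The difference between recode = "all" and recode = "column" can be illustrated with a minimal base R sketch. It does not call eco.format and is not the package's implementation; the toy matrix `m` and the rule of numbering codes in alphabetical order are assumptions made only for illustration, so the actual codes returned by eco.format may differ.

```r
# Toy haploid data: 3 individuals (rows) x 2 loci (columns). Hypothetical values.
m <- matrix(c("A", "T", "G", "T", "C", "C"), nrow = 3,
            dimnames = list(NULL, c("loc1", "loc2")))

# "all"-style recoding: one code table shared by the whole matrix.
# Assumption: codes are assigned in alphabetical order of the observed values.
codes_all <- sort(unique(as.vector(m)))
rec_all <- matrix(match(m, codes_all), nrow = nrow(m), dimnames = dimnames(m))

# "column"-style recoding: a separate code table for each column,
# so the same value can receive different codes in different columns.
rec_col <- apply(m, 2, function(col) match(col, sort(unique(col))))

rec_all
rec_col
```

In this sketch "T" is coded 4 in both columns under the shared table, but 3 in loc1 and 2 in loc2 when each column has its own table.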

Details

The function can format data with different ploidy levels. It can:
- add or remove zeros at the beginning or end of each allele
- separate alleles with a character
- divide alleles into columns
- bind alleles from separate columns
- transform character data into numeric data

"NA" is treated as a special value denoting missing data.

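The operations listed in Details (splitting a genotype into alleles, padding with zeros, joining with a separator) can be sketched in base R. This is only an illustration of the idea under stated assumptions: `split_alleles` and `pad_allele` are hypothetical helpers, not functions of the package, and the exact output of eco.format may differ.

```r
# Hypothetical helper: cut a genotype string into alleles of `ncod` characters.
split_alleles <- function(genotype, ncod) {
  starts <- seq(1, nchar(genotype), by = ncod)
  substring(genotype, starts, starts + ncod - 1)
}

# Hypothetical helper: pad each allele with zeros up to `nout` characters,
# at the beginning ("first") or the end ("last"); "none" leaves it unchanged.
pad_allele <- function(allele, nout, fill.mode = c("last", "first", "none")) {
  fill.mode <- match.arg(fill.mode)
  if (fill.mode == "none") return(allele)
  zeros <- strrep("0", pmax(nout - nchar(allele), 0))
  if (fill.mode == "first") paste0(zeros, allele) else paste0(allele, zeros)
}

# A tetraploid genotype coded with one digit per allele.
alleles <- split_alleles("1234", ncod = 1)
padded_first <- paste(pad_allele(alleles, nout = 3, fill.mode = "first"),
                      collapse = "/")
padded_last <- paste(pad_allele(alleles, nout = 3, fill.mode = "last"),
                     collapse = "/")
padded_first
padded_last
```

Missing values are left out of the sketch on purpose; the function itself treats "NA" as missing data.
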
Examples

# NOT RUN {
data(eco.test)

# Adding zeros

example <- as.matrix(genotype[1:10,])
mode(example) <- "character"
# example data
example
recoded <- eco.format(example, ncod = 1, ploidy = 2, nout = 3)
# recoded data
recoded


# Tetraploid data, separating alleles with a "/"
# simulated tetraploid example data
tetrap <- matrix(paste(example, example, sep = ""), ncol = ncol(example))
recoded <- eco.format(tetrap, ncod = 1, ploidy = 4, sep.out = "/")
# recoded data
recoded

# Example with a single character
ex <- c("A","T","G","C")
ex <- sample(ex, 100, replace = TRUE)
ex <- matrix(ex, 10, 10)
colnames(ex) <- letters[1:10]
rownames(ex) <- LETTERS[1:10]
# example data
ex
recoded <- eco.format(ex, ploidy = 1, nout = 1, recode = "all", show.codes = TRUE)
# recoded data
recoded


# Example with two strings per cell and missing values:
ex <- c("Ala", "Asx", "Cys", "Asp", "Glu", "Phe", "Gly", "His", "Ile",
"Lys", "Leu", "Met", "Asn", "Pro", "Gln", "Arg", "Ser", "Thr",
"Val", "Trp")
ex1 <- sample(ex, 100, replace = TRUE)
ex2 <- sample(ex, 100, replace = TRUE)
ex3 <- paste(ex1, ex2, sep = "")
missing.ex3 <- sample(1:100, 20)
ex3[missing.ex3] <- NA
ex4 <- matrix(ex3, 10, 10)
colnames(ex4) <- letters[1:10]
rownames(ex4) <- LETTERS[1:10]
# example data
ex4
recoded <- eco.format(ex4, ncod = 3, ploidy = 2,
                      nout = 2, recode = "column")
# recoded data
recoded

# Example with a vector (as a one-column data frame), continuing the previous example:
ex1 <- as.data.frame(ex1)
# example data
ex1
recoded <- eco.format(ex1, ploidy = 1, recode = "all")
# recoded data
recoded

# }