Creating a new ecogen object

ecogen(XY = data.frame(), P = data.frame(), G = data.frame(),
  E = data.frame(), S = data.frame(), C = data.frame(),
  G.processed = TRUE, order.G = FALSE, type = c("codominant", "dominant"),
  ploidy = 2, sep = "", ncod = NULL, missing = c("0", "NA", "MEAN"),
  NA.char = "NA", poly.level = 5, rm.empty.ind = FALSE, order.df = TRUE,
  set.names = NULL, valid.names = FALSE)

Arguments

XY

Data frame with m columns (coordinates) and n rows (individuals).

P

Data frame with n rows (individuals), and phenotypic data in columns.

G

Data of class "data.frame", with individuals in rows and genotypic data in columns (loci). The ploidy and the type (codominant, dominant) of the data must be passed with the arguments "ploidy" and "type". Missing data is coded as NA. Dominant data must be coded with binary values (0 for absence, 1 for presence).

E

Data frame with n rows (individuals), and environmental data in columns.

S

Data frame with n rows (individuals), and groups (factors) in columns. The program converts non-factor data into factors.

C

Data frame with n rows (individuals), and custom variables in columns.

G.processed

If TRUE, the slot G will include a processed data frame: non-informative loci (loci with data missing for all the individuals) are removed, non-polymorphic loci are removed (for dominant data), and alleles are sorted in ascending order.

order.G

Should genotypes be ordered in the G slot? (codominant data). Default FALSE.

type

Marker type: "codominant" or "dominant".

ploidy

Ploidy of the G data frame. Default ploidy = 2.

sep

Character separating alleles (codominant data). Default option is no character separating alleles.

ncod

Number of characters coding each allele (codominant data).

missing

Missing data treatment ("0", "NA", or "MEAN") for the A slot. In the default option, missing elements are set to 0. With the "NA" and "MEAN" options, missing elements are recoded as NA or as the mean allelic frequency across individuals, respectively.

NA.char

Character symbolizing missing data in the input. Default is "NA".

poly.level

Polymorphism threshold in percentage (0 - 100), for removal of non-polymorphic loci (for dominant data). Default is 5 (5%).

rm.empty.ind

Removal of non-informative individuals (rows of NAs). Default is FALSE.

order.df

Order the individuals of the data frames by row? (so that all data frames share the same order of row names). This option works only when the row names of the data frames are used (i.e., set.names is NULL and valid.names is FALSE); otherwise the TRUE/FALSE value of this parameter has no effect. Default TRUE. If FALSE, the row names of all the data frames must already be in the same order; data frames with row names in a different order will return an error. In both cases, the program sets an internal names attribute of the object using the row names of the first non-empty data frame found in the following order: XY, P, G, E, S, C. This attribute is used as the reference to order rows when order.df = TRUE.

set.names

Character vector with names for the rows of the non-empty data frames. This argument is incompatible with valid.names.

valid.names

Logical. Create valid row names? This argument is incompatible with set.names. The program will name individuals with valid tags I.1, I.2, etc.
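
The row-name handling described for order.df and valid.names can be pictured with a minimal base R sketch. It is an illustration of the idea under simple assumptions, not the package's implementation; the toy data frames and the variable names ref_names, P_ordered, same_order and valid_tags are hypothetical.

```r
# Two toy data frames describing the same individuals in a different row order.
XY <- data.frame(x = c(1, 2, 3), y = c(4, 5, 6),
                 row.names = c("ind1", "ind2", "ind3"))
P <- data.frame(height = c(30, 10, 20),
                row.names = c("ind3", "ind1", "ind2"))

# Reference names: row names of the first non-empty data frame (here XY).
ref_names <- rownames(XY)

# Analogue of order.df = TRUE: reorder the other data frame to match the reference.
P_ordered <- P[ref_names, , drop = FALSE]

# Analogue of order.df = FALSE: the row names must already be in the same order.
# Here they are not, so a strict check fails.
same_order <- identical(rownames(P), ref_names)

# Analogue of valid.names = TRUE: generic tags I.1, I.2, ... replace the row names.
valid_tags <- paste0("I.", seq_len(nrow(XY)))
```

After reordering, P_ordered and XY list the individuals in the same order, which is the precondition for combining them row by row.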

Details

This is a generic function for creation of ecogen objects. In the default option, missing data should be coded as "NA", but any missing data character can be passed with the option NA.char. In all cases, the new object will have a slot G coding the missing data as NA. For dominant markers (0/1 coding), the slot A is unnecessary and it is treated by ecogen methods as a symbolic link to G.
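
The missing-data conventions above (a custom input symbol recoded to NA in G, and the "0", "NA" and "MEAN" treatments for A) can be sketched in base R. This is a conceptual illustration only, not the code used by ecogen; the toy genotypes, the "-" symbol and the toy allele matrix are assumptions made for the example.

```r
# Hypothetical genotype table in which "-" marks missing data in the input.
G_raw <- data.frame(loc1 = c("1212", "-", "1111"),
                    loc2 = c("0101", "0102", "-"),
                    stringsAsFactors = FALSE)

# Analogue of NA.char = "-": the custom symbol is recoded to NA.
G <- G_raw
G[G == "-"] <- NA

# Toy allele matrix (individuals x alleles); individual I.2 is missing at loc1.
A <- matrix(c(0.5, NA, 1,
              0.5, NA, 0),
            nrow = 3,
            dimnames = list(paste0("I.", 1:3), c("loc1.11", "loc1.12")))

# Analogue of missing = "0": missing elements are set to 0.
A_zero <- A
A_zero[is.na(A_zero)] <- 0

# Analogue of missing = "NA": missing elements stay as NA.
A_na <- A

# Analogue of missing = "MEAN": missing elements take the column mean,
# i.e. the mean allelic frequency across the individuals with data.
A_mean <- A
col_means <- colMeans(A, na.rm = TRUE)
idx <- which(is.na(A), arr.ind = TRUE)
A_mean[idx] <- col_means[idx[, "col"]]
```

The "MEAN" variant keeps each column mean unchanged, whereas setting missing elements to 0 lowers it; which behavior is preferable depends on the downstream analysis.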

ACCESS TO THE SLOTS. MODIFICATION OF ECOGEN OBJECTS

The content of the slots can be extracted with the corresponding accessors ecoslot.XY, ecoslot.P, ecoslot.G, ecoslot.A, ecoslot.E, ecoslot.C and ecoslot.OUT. Accessors can also be used to assign data to the slots. The correct use of ecogen objects requires the use of accessors, as they ensure the checking and pre-processing of the data. Accessors make it possible to modify or fill the slots of ecogen objects without creating a new object each time. See help("EcoGenetics accessors") for a detailed description and examples about ecogen accessors.

OTHER SLOT ACCESS METHODS FOR ECOGEN OBJECTS

The use of brackets is defined for ecogen objects:

- Single bracket: the single bracket ("[") is used to subset all the ecogen data frames (P, G, E, S, A and C) by row, at once. The notation for an object is eco[from:to], where eco is any ecogen object, and from:to is the row range. For example, eco[1:10] subsets the object eco from row 1 to row 10, for all the data frames at once.

- Double square brackets: the double square brackets are symbolic abbreviations of the accessors (i.e., each use is a call to the corresponding accessor). The usage is eco[["X"]], where X is a slot: eco[["P"]], eco[["G"]], eco[["A"]], eco[["E"]], eco[["S"]], eco[["C"]] and eco[["OUT"]]. Double square brackets can be used in get/set mode. See Examples below and in help("EcoGenetics accessors").
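
The behavior of the two bracket forms can be mimicked with a plain list of data frames in base R. This is only an analogy to make the semantics concrete: a real ecogen object is not a list, and its "[[" assignment also runs the accessor checks, which this sketch does not reproduce. The names slots and subset_rows are hypothetical.

```r
# Stand-in for an ecogen object: a list of data frames sharing the same row order.
n <- 20
slots <- list(
  XY = data.frame(x = seq_len(n), y = rev(seq_len(n))),
  P  = data.frame(trait = seq_len(n) * 2),
  E  = data.frame()                      # an empty slot
)

# Analogue of eco[1:10]: subset every non-empty data frame by the same rows.
subset_rows <- function(slots, rows) {
  lapply(slots, function(d) {
    if (nrow(d) > 0) d[rows, , drop = FALSE] else d
  })
}
sub <- subset_rows(slots, 1:10)

# Analogue of eco[["P"]] in get mode.
p <- sub[["P"]]

# Analogue of eco[["E"]] <- value in set mode: fill a slot of the existing object.
sub[["E"]] <- data.frame(temp = seq(15, 24))
```

With an actual ecogen object, the equivalent calls are eco[1:10], eco[["P"]] and eco[["E"]] <- environment, as described in the two items above.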

ABOUT THE CONSTRUCTION OF NEW ECOGEN OBJECTS

A new ecogen object can be constructed in two different ways. First, a new object can be created, incorporating all the information at once. Second, the data can be added to each slot, using the corresponding accessor or "[[". The accessor and double square bracket methods allow any ecogen object to be modified after its creation and ensure the modularity of this kind of object. These methods are not only functions used to get/assign values to the slots; they also provide a basic pre-processing of the data during assignment, generating a coherent and valid set of information.

Examples

# NOT RUN {
# Example with G data of class "data.frame", corresponding to
# microsatellites of a diploid organism:
data(eco.test)
eco <- ecogen(XY = coordinates, P = phenotype, G = genotype,
E = environment, S = structure)

# Example with G data of class "data.frame", corresponding to a
# presence - absence molecular marker:
dat <- sample(c(0, 1), 100, replace = TRUE)
dat <- data.frame(matrix(dat, 10, 10))
eco <- ecogen(G = dat, type = "dominant")


# DYNAMIC ASSIGNMENT WITH ACCESSORS AND "[["

eco <- ecogen(XY = coordinates, P = phenotype)
eco

ecoslot.G(eco, order.G = TRUE) <- genotype

# this is identical to
eco[["G", order.G=TRUE]] <- genotype

ecoslot.E(eco) <- environment

# this is identical to
eco[["E"]] <- environment

#----------------------------------------------------------
# See additional examples in help("EcoGenetics accessors")
#----------------------------------------------------------

# Storing data in the slot OUT

singers <- c("carlos_gardel", "billie_holiday")

ecoslot.OUT(eco) <- singers

# Storing several datasets

golden.number <- (sqrt(5) + 1) / 2
ecoslot.OUT(eco) <- list(singers, golden.number)    # several objects must be passed as a list

# this is identical to:

eco[["OUT"]] <- list(singers, golden.number)

# }