Creating a new ecogen object
ecogen(XY = data.frame(), P = data.frame(), G = data.frame(), E = data.frame(), S = data.frame(), C = data.frame(), G.processed = TRUE, order.G = FALSE, type = c("codominant", "dominant"), ploidy = 2, sep = "", ncod = NULL, missing = c("0", "NA", "MEAN"), NA.char = "NA", poly.level = 5, rm.empty.ind = FALSE, order.df = TRUE, set.names = NULL, valid.names = FALSE)
Argument | Description |
---|---|
XY | Data frame with m columns (coordinates) and n rows (individuals). |
P | Data frame with n rows (individuals), and phenotypic data in columns. |
G | Data of class "data.frame", with individuals in rows and genotypic data in columns (loci). The ploidy and the type (codominant, dominant) of the data must be passed with the arguments "ploidy" and "type". Missing data are coded as NA. Dominant data must be coded with binary values (0 for absence, 1 for presence). |
E | Data frame with n rows (individuals), and environmental data in columns. |
S | Data frame with n rows (individuals), and groups (factors) in columns. The program converts non-factor data into factors. |
C | Data frame with n rows (individuals), and custom variables in columns. |
G.processed | If TRUE, the slot G will include a processed data frame: non-informative loci (those with no data available for any individual) are removed, non-polymorphic loci are removed (for dominant data), and alleles are ordered in ascending order. |
order.G | Should genotypes be ordered in the G slot? (codominant data). Default FALSE. |
type | Marker type: "codominant" or "dominant". |
ploidy | Ploidy of the G data frame. Default ploidy = 2. |
sep | Character separating alleles (codominant data). Default option is no character separating alleles. |
ncod | Number of characters coding each allele (codominant data). |
missing | Missing data treatment ("0", "NA", or "MEAN") for the A slot. By default, missing elements are set to 0. With the "NA" and "MEAN" options, missing elements are recoded as NA or as the mean allelic frequency across individuals, respectively. |
NA.char | Character symbolizing missing data in the input. Default is "NA". |
poly.level | Polymorphism threshold in percentage (0 - 100), used for the removal of non-polymorphic loci (for dominant data). Default is 5 (5%). |
rm.empty.ind | Remove non-informative individuals (rows consisting only of NAs)? Default is FALSE. |
order.df | Order the individuals of the data frames by row, so that all data frames share the same order of row names? This option works only when the row names of the data frames are used (i.e., neither set.names nor valid.names is in use); otherwise the TRUE/FALSE value of this parameter has no effect. Default TRUE. If FALSE, the row names of all the data frames must already be in the same order; data frames with row names in different order will return an error. In both cases, the program sets an internal names attribute of the object using the row names of the first non-empty data frame found in the following order: XY, P, G, E, S, C. This attribute is used as the reference to order rows when order.df = TRUE. |
set.names | Character vector with names for the rows of the non-empty data frames. This argument is incompatible with valid.names. |
valid.names | Logical. Create valid row names? If TRUE, the program names individuals with the valid tags I.1, I.2, etc. This argument is incompatible with set.names. |
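The row-ordering behaviour described for order.df can be pictured with a small base R sketch. It does not call ecogen and is not the package's internal code; the toy data frames and the names ref_names and aligned are assumptions made for illustration only.

```r
# Three toy data frames holding the same individuals in different row order
XY <- data.frame(x = c(1, 2, 3), y = c(4, 5, 6),
                 row.names = c("I.1", "I.2", "I.3"))
P  <- data.frame(height = c(30, 10, 20),
                 row.names = c("I.3", "I.1", "I.2"))
E  <- data.frame(temp = c(15.2, 14.8, 16.1),
                 row.names = c("I.2", "I.3", "I.1"))

# Reference names: row names of the first non-empty data frame (here XY)
ref_names <- rownames(XY)

# Reorder every data frame so that its rows follow the reference names
aligned <- lapply(list(XY = XY, P = P, E = E),
                  function(df) df[ref_names, , drop = FALSE])

# Check that all data frames now share the same row order
sapply(aligned, function(df) identical(rownames(df), ref_names))
```

With order.df = FALSE, input such as P and E above would be expected to fail, since their row names do not follow the order of XY.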
This is a generic function for creating ecogen objects. By default, missing data should be coded as "NA", but any other missing-data character can be passed with the argument NA.char. In all cases, the new object will have a slot G with missing data coded as NA. For dominant markers (0/1 coding), the slot A is unnecessary and is treated by ecogen methods as a symbolic link to G.
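As a rough illustration of what recoding a custom missing-data character means, the base R sketch below replaces a hypothetical code "-9" with NA in a toy genotype table. The data, the code "-9" and the object names are assumptions; the snippet mimics the idea only and is not how ecogen is implemented.

```r
# Toy codominant data: two loci, three individuals, alleles coded with
# three characters each; "-9" is a hypothetical missing-data code
G_raw <- data.frame(
  loc1 = c("121125", "-9", "125125"),
  loc2 = c("200204", "204204", "-9"),
  row.names = c("I.1", "I.2", "I.3"),
  stringsAsFactors = FALSE
)

na_char <- "-9"

# Recode every occurrence of the missing-data character as NA
G_na <- G_raw
G_na[G_na == na_char] <- NA

# Number of missing genotypes after recoding
sum(is.na(G_na))

# With EcoGenetics installed, the same intent would be expressed through the
# documented argument (not run here):
# eco <- ecogen(G = G_raw, NA.char = "-9")
```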
ACCESS TO THE SLOTS. MODIFICATION OF ECOGEN OBJECTS
The content of the slots can be extracted with the corresponding accessors: ecoslot.XY, ecoslot.P, ecoslot.G, ecoslot.A, ecoslot.E, ecoslot.S, ecoslot.C and ecoslot.OUT. Accessors can also be used to assign data to the slots. Correct use of ecogen objects requires the accessors, since they ensure that the data are checked and pre-processed. Accessors make it possible to modify or fill the slots of an ecogen object without creating a new object each time. See help("EcoGenetics accessors") for a detailed description and examples of ecogen accessors.
OTHER SLOT ACCESS METHODS FOR ECOGEN OBJECTS
The use of brackets is defined for ecogen objects:
- Single bracket: the single bracket ("[") is used to subset all the ecogen data frames (P, G, E, S, A and C) by row, at once. The notation is eco[from:to], where eco is any ecogen object and from:to is the row range. For example, eco[1:10] subsets the object eco from row 1 to row 10, for all the data frames at once.
- Double square brackets: the double square brackets are symbolic abbreviations of the accessors (i.e., each one is a call to the corresponding accessor). The usage is eco[["X"]], where X is a slot: eco[["P"]], eco[["G"]], eco[["A"]], eco[["E"]], eco[["S"]], eco[["C"]] and eco[["OUT"]]. Double square brackets can be used in get/set mode. See Examples below and in help("EcoGenetics accessors").
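The two bracket forms can be tried with the eco.test data used in the Examples. The sketch below is guarded so that it runs only when EcoGenetics is installed; the object names eco10 and pheno are assumptions chosen for illustration.

```r
# Runs only if the EcoGenetics package is available
if (requireNamespace("EcoGenetics", quietly = TRUE)) {
  library(EcoGenetics)
  data(eco.test)  # loads coordinates, phenotype, genotype, environment, structure

  eco <- ecogen(XY = coordinates, P = phenotype, G = genotype,
                E = environment, S = structure)

  # Single bracket: keep rows 1 to 10 in all data frames at once
  eco10 <- eco[1:10]

  # Double square brackets, get mode: same as ecoslot.P(eco)
  pheno <- eco[["P"]]

  # Double square brackets, set mode: same as ecoslot.E(eco) <- environment
  eco[["E"]] <- environment
}
```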
ABOUT THE CONSTRUCTION OF NEW ECOGEN OBJECTS
A new ecogen object can be constructed in two different ways. First, a new object can be created by incorporating all the information at once. Second, the data can be added to each slot, using the corresponding accessor or "[[". The accessor and double-square-bracket methods allow any ecogen object to be modified after its creation and ensure the modularity of this kind of object. These methods are not only functions used to get or assign values to the slots; they also provide basic pre-processing of the data during assignment, generating a coherent and valid set of information.
# NOT RUN {

# Example with G data of class "data.frame", corresponding to
# microsatellites of a diploid organism:
data(eco.test)
eco <- ecogen(XY = coordinates, P = phenotype, G = genotype,
              E = environment, S = structure)

# Example with G data of class "data.frame", corresponding to a
# presence - absence molecular marker:
dat <- sample(c(0, 1), 100, replace = TRUE)
dat <- data.frame(matrix(dat, 10, 10))
eco <- ecogen(G = dat, type = "dominant")

# DYNAMIC ASSIGNMENT WITH ACCESSORS AND "[["

eco <- ecogen(XY = coordinates, P = phenotype)
eco

ecoslot.G(eco, order.G = TRUE) <- genotype
# this is identical to
eco[["G", order.G = TRUE]] <- genotype

ecoslot.E(eco) <- environment
# this is identical to
eco[["E"]] <- environment

#----------------------------------------------------------
# See additional examples in help("EcoGenetics accessors")
#----------------------------------------------------------

# Storing data in the slot OUT

singers <- c("carlos_gardel", "billie_holiday")
ecoslot.OUT(eco) <- singers

# Storing several datasets

golden.number <- (sqrt(5) + 1) / 2
ecoslot.OUT(eco) <- list(singers, golden.number) # several objects must be passed as a list

# this is identical to:
eco[["OUT"]] <- list(singers, golden.number)

# }