General multiple-table data management in R

Our objective is to develop classes of objects that (1) make handling multiple-table data sets easier and (2) seamlessly integrate with existing R plotting and model-fitting functions; our philosophy is to keep data management and data analysis separate.

Many ecologists and other scientists face a particular kind of data management issue, arising from data sets that cannot be expressed as a single table, matrix, or array. The prototypical example of this issue, called the fourth-corner problem (Legendre et al. 1997), comes from trait-based studies of ecological communities in which the ecologist must manage three tables: a sites-by-species table of community data, a sites-by-environmental-variables table, and a species-by-traits table. Such a data set cannot be expressed in a single table (e.g. a data frame or an array) without either (1) summarizing or (2) repeating chunks of data; hence, the problem is inherently multi-tabular.
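As a rough illustration of why the three tables resist a single-table form, the following base R sketch builds a toy fourth-corner data set and flattens it into one data frame. All object names and values are invented for illustration, and the sketch does not use the multitable package itself; it only shows the repetition that a single table forces.

```r
# Hypothetical toy data: 3 sites and 2 species (values invented for illustration)
sites   <- c("s1", "s2", "s3")
species <- c("spA", "spB")

# Sites-by-species table of community data (abundances); filled column by column
abundance <- matrix(c(5, 0, 2,
                      1, 3, 0),
                    nrow = 3,
                    dimnames = list(site = sites, species = species))

# Sites-by-environmental-variables table
environment <- data.frame(site = sites, temperature = c(10.5, 14.2, 18.9))

# Species-by-traits table
traits <- data.frame(species = species, body.size = c(1.2, 3.4))

# Flatten to one row per site-species pair
long <- as.data.frame(as.table(abundance), responseName = "abundance",
                      stringsAsFactors = FALSE)
long <- merge(long, environment, by = "site")
long <- merge(long, traits, by = "species")

print(long)
```

In the flattened result, each site's temperature is repeated once per species and each species' body size is repeated once per site, which is option (2) above.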

The multitable project develops R tools that support the analysis of such multiple-table data. Its specific objective is to develop new classes of objects that make handling multiple-table data sets easier and that integrate seamlessly with existing R model-fitting functions. The framework is based on a separation of data management and data analysis. Once multiple-table data are organized into a special multiple-table R object (i.e. the data management step), they can be passed to plotting and model-fitting functions (i.e. the data analysis step), just as we would pass data frames to such functions in R. With analysis separated from data management, analyzing multiple tables becomes conceptually equivalent to analyzing single tables.

Major features: (1) subscripting of multiple tables simultaneously, (2) coercion of multiple tables into a single data frame for use in standard R model-fitting and plotting functions.
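To make feature (1) concrete, here is a minimal base R sketch of what subscripting several tables at once involves. The helper `subset_tables()` and the toy data are hypothetical and are not part of the multitable API; the package's own objects are meant to handle this bookkeeping for you.

```r
# Hypothetical toy tables sharing site and species names (values invented)
tables <- list(
  abundance   = matrix(c(5, 0, 2, 1, 3, 0), nrow = 3,
                       dimnames = list(c("s1", "s2", "s3"), c("spA", "spB"))),
  environment = data.frame(temperature = c(10.5, 14.2, 18.9),
                           row.names = c("s1", "s2", "s3")),
  traits      = data.frame(body.size = c(1.2, 3.4),
                           row.names = c("spA", "spB"))
)

# Hypothetical helper: apply one site/species selection to every table,
# so the three tables stay consistent with each other
subset_tables <- function(tables, sites, species) {
  list(
    abundance   = tables$abundance[sites, species, drop = FALSE],
    environment = tables$environment[sites, , drop = FALSE],
    traits      = tables$traits[species, , drop = FALSE]
  )
}

sub <- subset_tables(tables, sites = c("s1", "s3"), species = "spB")
str(sub)
```

Without such a helper, each table has to be subscripted separately, and it is easy for the site or species sets to drift out of sync.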

The multitable package is now on CRAN here.

The development version is here.

The project summary page can be found here.

A presentation on the multitable package at the Ecological Society of America can be found here.

Check out the new package vignette here.
