Supervised Multiblock Analysis in R with the ade4 Package

Bibliographic Details
Main Authors: Stéphanie Bougeard, Stéphane Dray
Format: Article
Language: English
Published: Foundation for Open Access Statistics 2018-09-01
Series: Journal of Statistical Software
Subjects: R
Online Access: https://www.jstatsoft.org/index.php/jss/article/view/2400
Description
Summary: This paper presents two novel statistical methods for the analysis of multiblock data in the R language. They are designed for data organized in (K + 1) blocks (i.e., tables): one block of response variables to be explained by a large number of explanatory variables, which are divided into K meaningful blocks. All the variables, explanatory and dependent, are measured on the same individuals. The two methods, both useful in practice, are multiblock partial least squares regression and multiblock principal component analysis with instrumental variables. They are included in the ade4 package, which is widely used thanks to its great variety of multivariate methods. The methods serve both statisticians and users from various fields, in the sense that all the values derived from the multiblock processing are available. Some relevant interpretation tools are also developed. Finally, the main results are summarized using overall graphical displays. The paper is organized following the steps of a standard multiblock process, each corresponding to specific R functions. All these steps are illustrated by the analysis of real epidemiological datasets.
ISSN: 1548-7660