Novel variable influence on projection (VIP) methods in OPLS, O2PLS, and OnPLS models for single- and multi-block variable selection : VIPOPLS, VIPO2PLS, and MB-VIOP methods

Multivariate and multiblock data analysis involves useful methodologies for analyzing large data sets in chemistry, biology, psychology, economics, sensory science, and industrial processes; among these methodologies, partial least squares (PLS) and orthogonal projections to latent structures (OPLS®...

Full description

Bibliographic Details
Main Author: Galindo-Prieto, Beatriz
Format: Doctoral Thesis
Language:English
Published: Umeå universitet, Kemiska institutionen 2017
Subjects:
VIP
Online Access:http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-130579
http://nbn-resolving.de/urn:isbn:978-91-7601-620-6
id ndltd-UPSALLA1-oai-DiVA.org-umu-130579
record_format oai_dc
spelling ndltd-UPSALLA1-oai-DiVA.org-umu-1305792017-01-26T05:22:51ZNovel variable influence on projection (VIP) methods in OPLS, O2PLS, and OnPLS models for single- and multi-block variable selection : VIPOPLS, VIPO2PLS, and MB-VIOP methodsengGalindo-Prieto, BeatrizUmeå universitet, Kemiska institutionenUmeå : Umeå University2017Variable influence on projectionVIPMB-VIOPorthogonal projections to latent structuresOPLSO2PLSOnPLSvariable selectionvariable importance in multiblock regressionMultivariate and multiblock data analysis involves useful methodologies for analyzing large data sets in chemistry, biology, psychology, economics, sensory science, and industrial processes; among these methodologies, partial least squares (PLS) and orthogonal projections to latent structures (OPLS®) have become popular. Due to the increasingly computerized instrumentation, a data set can consist of thousands of input variables which contain latent information valuable for research and industrial purposes. When analyzing a large number of data sets (blocks) simultaneously, the number of variables and underlying connections between them grow very much indeed; at this point, reducing the number of variables keeping high interpretability becomes a much needed strategy. The main direction of research in this thesis is the development of a variable selection method, based on variable influence on projection (VIP), in order to improve the model interpretability of OnPLS models in multiblock data analysis. This new method is called multiblock variable influence on orthogonal projections (MB-VIOP), and its novelty lies in the fact that it is the first multiblock variable selection method for OnPLS models. Several milestones needed to be reached in order to successfully create MB-VIOP. The first milestone was the development of a single-block variable selection method able to handle orthogonal latent variables in OPLS models, i.e. VIP for OPLS (denoted as VIPOPLS or OPLS-VIP in Paper I), which proved to increase the interpretability of PLS and OPLS models, and afterwards, was successfully extended to multivariate time series analysis (MTSA) aiming at process control (Paper II). The second milestone was to develop the first multiblock VIP approach for enhancement of O2PLS® models, i.e. VIPO2PLS for two-block multivariate data analysis (Paper III). And finally, the third milestone and main goal of this thesis, the development of the MB-VIOP algorithm for the improvement of OnPLS model interpretability when analyzing a large number of data sets simultaneously (Paper IV). The results of this thesis, and their enclosed papers, showed that VIPOPLS, VIPO2PLS, and MB-VIOP methods successfully assess the most relevant variables for model interpretation in PLS, OPLS, O2PLS, and OnPLS models. In addition, predictability, robustness, dimensionality reduction, and other variable selection purposes, can be potentially improved/achieved by using these methods. Doctoral thesis, comprehensive summaryinfo:eu-repo/semantics/doctoralThesistexthttp://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-130579urn:isbn:978-91-7601-620-6application/pdfinfo:eu-repo/semantics/openAccess
collection NDLTD
language English
format Doctoral Thesis
sources NDLTD
topic Variable influence on projection
VIP
MB-VIOP
orthogonal projections to latent structures
OPLS
O2PLS
OnPLS
variable selection
variable importance in multiblock regression
spellingShingle Variable influence on projection
VIP
MB-VIOP
orthogonal projections to latent structures
OPLS
O2PLS
OnPLS
variable selection
variable importance in multiblock regression
Galindo-Prieto, Beatriz
Novel variable influence on projection (VIP) methods in OPLS, O2PLS, and OnPLS models for single- and multi-block variable selection : VIPOPLS, VIPO2PLS, and MB-VIOP methods
description Multivariate and multiblock data analysis involves useful methodologies for analyzing large data sets in chemistry, biology, psychology, economics, sensory science, and industrial processes; among these methodologies, partial least squares (PLS) and orthogonal projections to latent structures (OPLS®) have become popular. Due to the increasingly computerized instrumentation, a data set can consist of thousands of input variables which contain latent information valuable for research and industrial purposes. When analyzing a large number of data sets (blocks) simultaneously, the number of variables and underlying connections between them grow very much indeed; at this point, reducing the number of variables keeping high interpretability becomes a much needed strategy. The main direction of research in this thesis is the development of a variable selection method, based on variable influence on projection (VIP), in order to improve the model interpretability of OnPLS models in multiblock data analysis. This new method is called multiblock variable influence on orthogonal projections (MB-VIOP), and its novelty lies in the fact that it is the first multiblock variable selection method for OnPLS models. Several milestones needed to be reached in order to successfully create MB-VIOP. The first milestone was the development of a single-block variable selection method able to handle orthogonal latent variables in OPLS models, i.e. VIP for OPLS (denoted as VIPOPLS or OPLS-VIP in Paper I), which proved to increase the interpretability of PLS and OPLS models, and afterwards, was successfully extended to multivariate time series analysis (MTSA) aiming at process control (Paper II). The second milestone was to develop the first multiblock VIP approach for enhancement of O2PLS® models, i.e. VIPO2PLS for two-block multivariate data analysis (Paper III). And finally, the third milestone and main goal of this thesis, the development of the MB-VIOP algorithm for the improvement of OnPLS model interpretability when analyzing a large number of data sets simultaneously (Paper IV). The results of this thesis, and their enclosed papers, showed that VIPOPLS, VIPO2PLS, and MB-VIOP methods successfully assess the most relevant variables for model interpretation in PLS, OPLS, O2PLS, and OnPLS models. In addition, predictability, robustness, dimensionality reduction, and other variable selection purposes, can be potentially improved/achieved by using these methods.
author Galindo-Prieto, Beatriz
author_facet Galindo-Prieto, Beatriz
author_sort Galindo-Prieto, Beatriz
title Novel variable influence on projection (VIP) methods in OPLS, O2PLS, and OnPLS models for single- and multi-block variable selection : VIPOPLS, VIPO2PLS, and MB-VIOP methods
title_short Novel variable influence on projection (VIP) methods in OPLS, O2PLS, and OnPLS models for single- and multi-block variable selection : VIPOPLS, VIPO2PLS, and MB-VIOP methods
title_full Novel variable influence on projection (VIP) methods in OPLS, O2PLS, and OnPLS models for single- and multi-block variable selection : VIPOPLS, VIPO2PLS, and MB-VIOP methods
title_fullStr Novel variable influence on projection (VIP) methods in OPLS, O2PLS, and OnPLS models for single- and multi-block variable selection : VIPOPLS, VIPO2PLS, and MB-VIOP methods
title_full_unstemmed Novel variable influence on projection (VIP) methods in OPLS, O2PLS, and OnPLS models for single- and multi-block variable selection : VIPOPLS, VIPO2PLS, and MB-VIOP methods
title_sort novel variable influence on projection (vip) methods in opls, o2pls, and onpls models for single- and multi-block variable selection : vipopls, vipo2pls, and mb-viop methods
publisher Umeå universitet, Kemiska institutionen
publishDate 2017
url http://urn.kb.se/resolve?urn=urn:nbn:se:umu:diva-130579
http://nbn-resolving.de/urn:isbn:978-91-7601-620-6
work_keys_str_mv AT galindoprietobeatriz novelvariableinfluenceonprojectionvipmethodsinoplso2plsandonplsmodelsforsingleandmultiblockvariableselectionvipoplsvipo2plsandmbviopmethods
_version_ 1718410623617859584