Algorithms for Large Scale Problems in Eigenvalue and SVD Computations and in Big Data Applications

As "big data" has increasing influence on our daily life and research activities, it poses significant challenges for various research areas. Some applications demand a fast solution of large, sparse eigenvalue and singular value problems; in other applications, extracting knowledge from large-...

Bibliographic Details
Main Author: Wu, Lingfei
Format: Others
Language: English
Published: W&M ScholarWorks 2016
Subjects: Computer Sciences
Online Access: https://scholarworks.wm.edu/etd/1477068451
https://scholarworks.wm.edu/cgi/viewcontent.cgi?article=1093&context=etd
id ndltd-wm.edu-oai-scholarworks.wm.edu-etd-1093
record_format oai_dc
spelling ndltd-wm.edu-oai-scholarworks.wm.edu-etd-1093 2021-09-18T05:29:05Z 2016-10-01T07:00:00Z text application/pdf © The Author http://creativecommons.org/licenses/by/4.0/ Dissertations, Theses, and Masters Projects English W&M ScholarWorks Computer Sciences
collection NDLTD
language English
format Others
sources NDLTD
topic Computer Sciences
description As "big data" has increasing influence on our daily life and research activities, it poses significant challenges for various research areas. Some applications demand a fast solution of large, sparse eigenvalue and singular value problems; in other applications, extracting knowledge from large-scale data requires many techniques such as statistical calculations, data mining, and high performance computing. In this dissertation, we develop efficient and robust iterative methods and software for the computation of eigenvalues and singular values. We also develop practical numerical and data mining techniques to estimate the trace of a function of a large, sparse matrix and to detect blob-filaments in fusion plasma in real time on extremely large parallel computers. In the first work, we propose a hybrid two-stage SVD method for efficiently and accurately computing a few extreme singular triplets, especially the ones corresponding to the smallest singular values. The first stage achieves fast convergence while the second achieves the final accuracy. Furthermore, we develop high-performance preconditioned SVD software based on the proposed method on top of the state-of-the-art eigensolver PRIMME. The method can be used with or without preconditioning, runs on parallel computers, and is superior to other state-of-the-art SVD methods in both efficiency and robustness. In the second study, we provide insights and develop practical algorithms to accomplish efficient and accurate computation of interior eigenpairs using refined projection techniques in non-Krylov iterative methods. By analyzing different implementations of the refined projection, we propose a new hybrid method to efficiently find interior eigenpairs without compromising accuracy. Our numerical experiments illustrate the efficiency and robustness of the proposed method. In the third work, we present a novel method to estimate the trace of a matrix inverse that exploits the pattern correlation between the diagonal of the inverse of the matrix and that of some approximate inverse. We leverage various sampling and fitting techniques to fit the diagonal of the approximation to that of the inverse. Our method may serve as a standalone kernel for providing a fast trace estimate or as a variance reduction method for Monte Carlo in some cases. An extensive set of experiments demonstrates the potential of our method. In the fourth study, we provide first results on applying outlier detection techniques to effectively tackle the fusion blob detection problem on extremely large parallel machines. We present a real-time region outlier detection algorithm to efficiently find and track blobs in fusion experiments and simulations. Our experiments demonstrate that we can achieve linear time speedup on up to 1024 MPI processes and complete blob detection in two or three milliseconds.
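The abstract mentions Monte Carlo estimation of the trace of a matrix inverse. As a rough illustration of that baseline only, the sketch below uses the standard Hutchinson estimator, which averages z^T A^{-1} z over random sign vectors z. It is not the dissertation's method and does not include its diagonal-fitting step. The function names, the tridiagonal test matrix, and the probe count are all assumptions chosen to keep the example self-contained.

```python
import random


def solve_tridiagonal(lower, diag, upper, rhs):
    """Solve T x = rhs by the Thomas algorithm.

    T is tridiagonal with sub-diagonal `lower` (length n-1), main diagonal
    `diag` (length n) and super-diagonal `upper` (length n-1). The matrix is
    assumed to be diagonally dominant so that no pivoting is needed.
    """
    n = len(diag)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = upper[0] / diag[0] if n > 1 else 0.0
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i - 1] * c[i - 1]
        if i < n - 1:
            c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i - 1] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def hutchinson_trace_inverse(lower, diag, upper, num_probes, seed=0):
    """Estimate trace(T^{-1}) as the mean of z^T T^{-1} z over Rademacher z."""
    rng = random.Random(seed)
    n = len(diag)
    total = 0.0
    for _ in range(num_probes):
        z = [rng.choice((-1.0, 1.0)) for _ in range(n)]
        x = solve_tridiagonal(lower, diag, upper, z)  # x = T^{-1} z
        total += sum(zi * xi for zi, xi in zip(z, x))
    return total / num_probes


def exact_trace_inverse(lower, diag, upper):
    """Reference value: sum the diagonal of T^{-1} column by column."""
    n = len(diag)
    trace = 0.0
    for j in range(n):
        e = [0.0] * n
        e[j] = 1.0
        trace += solve_tridiagonal(lower, diag, upper, e)[j]
    return trace


if __name__ == "__main__":
    n = 50  # hypothetical small test problem
    lower, diag, upper = [-1.0] * (n - 1), [4.0] * n, [-1.0] * (n - 1)
    print("exact   :", exact_trace_inverse(lower, diag, upper))
    print("estimate:", hutchinson_trace_inverse(lower, diag, upper, 500))
```

The exact reference needs n solves, while the estimator needs only as many solves as probes, which is why such estimators are attractive for large sparse matrices. A variance reduction method of the kind the abstract describes would aim to reach a given accuracy with fewer probes.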
author Wu, Lingfei
title Algorithms for Large Scale Problems in Eigenvalue and SVD Computations and in Big Data Applications
publisher W&M ScholarWorks
publishDate 2016