Indexing and Search of Order-Preserving Submatrix for Gene Expression Data

Bicluster pattern discovery plays a key role in the analysis of gene expression data. One important bicluster mining model is the Order-Preserving SubMatrix (OPSM), which captures a subset of genes that show a similar tendency across a subset of conditions. Most OPSM discovery methods are batch mining techniques and not suitable f...

Bibliographic Details
Main Authors: Tao Jiang, Bolin Chen, Juntao Li, Guoyu Xu
Format: Article
Language: English
Published: IEEE 2019-01-01
Series: IEEE Access
Subjects: Gene expression data; online sharing queries; OPSM; pfTree; pIndex
Online Access: https://ieeexplore.ieee.org/document/8936975/
id doaj-ab42a661d9144e20a127f04739b192d7
record_format Article
spelling doaj-ab42a661d9144e20a127f04739b192d7; 2021-03-29T23:15:09Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2019-01-01; vol. 7, pp. 184769-184785; DOI 10.1109/ACCESS.2019.2960856; article 8936975
Tao Jiang (https://orcid.org/0000-0002-5145-6935), School of Computer and Information Engineering, Henan University of Economics and Law, Zhengzhou, China
Bolin Chen (https://orcid.org/0000-0001-5507-2288), School of Computer Science, Northwestern Polytechnical University, Xi’an, China
Juntao Li (https://orcid.org/0000-0002-3288-4395), School of Mathematics and Information Science, Henan Normal University, Xinxiang, China
Guoyu Xu (https://orcid.org/0000-0001-9960-1786), School of Computer and Information Engineering, Henan University of Economics and Law, Zhengzhou, China
collection DOAJ
language English
format Article
sources DOAJ
author Tao Jiang
Bolin Chen
Juntao Li
Guoyu Xu
title Indexing and Search of Order-Preserving Submatrix for Gene Expression Data
publisher IEEE
series IEEE Access
issn 2169-3536
publishDate 2019-01-01
description Bicluster pattern discovery plays a key role in the analysis of gene expression data. One important bicluster mining model is the Order-Preserving SubMatrix (OPSM), which captures a subset of genes that show a similar tendency across a subset of conditions. Most OPSM discovery methods are batch mining techniques and are not suitable for low-latency data queries. To make data analysis efficient and effective, in this paper we first propose a prefix-tree-based indexing method, pfTree, and then give an optimization technique, pIndex, that employs row and column header tables to search for positive, negative, and time-delayed OPSMs. We also present an online sharing query technique to accelerate frequent searches. Finally, we conduct extensive experiments and compare our methods with existing approaches. The experimental results demonstrate the efficiency and effectiveness of the proposed methods.
topic Gene expression data
online sharing queries
OPSM
pfTree
pIndex
url https://ieeexplore.ieee.org/document/8936975/