An efficient integration and indexing method based on feature patterns and semantic analysis for big data

Big data has received much attention across many industry domains. In the digital and computing world, information is generated and collected at a rate that quickly outgrows existing limits. The traditional data integration system interconnects a limited number of resources and is built with relativ...

Bibliographic Details
Main Authors: Madhu Mahesh Nashipudimath, Subhash K. Shinde, Jayshree Jain
Format: Article
Language: English
Published: Elsevier 2020-09-01
Series: Array
Subjects: Big data; Integration; Feature patterns; Indexing; Semantic analysis
Online Access: http://www.sciencedirect.com/science/article/pii/S2590005620300187
id doaj-33bc1212344346938d92a4339fefa7e9
record_format Article
spelling doaj-33bc1212344346938d92a4339fefa7e9 2020-11-25T03:19:57Z eng Elsevier Array 2590-0056 2020-09-01 7 100033
Madhu Mahesh Nashipudimath (0), Subhash K. Shinde (1), Jayshree Jain (2)
(0) Department of Computer Engineering, Pacific Academic Higher Education and Research University, Udaipur, India; Department of Computer Engineering, Pillai College of Engineering, New Panvel, Navi Mumbai, India; Corresponding author. Department of Computer Engineering, Pacific Academic Higher Education and Research University, Udaipur, India.
(1) Department of Computer Engineering, Lokmanya Tilak College of Engineering, Navi Mumbai, India
(2) Department of Computer Engineering, Pacific Academic Higher Education and Research University, Udaipur, India
collection DOAJ
language English
format Article
sources DOAJ
author Madhu Mahesh Nashipudimath
Subhash K. Shinde
Jayshree Jain
title An efficient integration and indexing method based on feature patterns and semantic analysis for big data
publisher Elsevier
series Array
issn 2590-0056
publishDate 2020-09-01
description Big data has received much attention across many industry domains. In the digital and computing world, information is generated and collected at a rate that quickly outgrows existing limits. The traditional data integration system interconnects a limited number of resources and is built through relatively stable but generally complex and time-consuming design activities. However, the rapid growth of these large data sets makes it difficult to learn heterogeneous data structures for integration and indexing. It also makes information retrieval difficult for the various data analysis requirements. This paper proposes a Probabilistic Feature Patterns (PFP) approach, which uses a feature transformation and selection method for efficient data integration, together with a Features Latent Semantic Analysis (F-LSA) method for indexing multiple heterogeneous, unsupervised, integrated cluster data sources. The PFP approach uses the feature transformation and selection mechanism to map and cluster the data for integration, while LSA analyzes the contextual relations among the data features to provide an appropriate index for fast and accurate data extraction. A large volume of BibText data from different publication sources is processed and evaluated to assess the effectiveness of the proposal. The analytical study and the results show an improvement in integration and indexing.
topic Big data
Integration
Feature patterns
Indexing
Semantic analysis
url http://www.sciencedirect.com/science/article/pii/S2590005620300187