Limitations of information extraction methods and techniques for heterogeneous unstructured big data

In the current era of big data, a huge volume of unstructured data is being produced in various forms, including audio, video, images, text, and animation. Making effective use of this unstructured big data is laborious and tedious. Information extraction (IE) systems help to extract useful information...

Bibliographic Details
Main Authors: Kiran Adnan, Rehan Akbar
Format: Article
Language: English
Published: SAGE Publishing 2019-12-01
Series: International Journal of Engineering Business Management
Online Access: https://doi.org/10.1177/1847979019890771
id doaj-1023b8eb32f04eedaa98c87a4580c1ed
record_format Article
spelling doaj-1023b8eb32f04eedaa98c87a4580c1ed 2021-04-02T12:44:54Z eng SAGE Publishing International Journal of Engineering Business Management 1847-9790 2019-12-01 11 10.1177/1847979019890771 Limitations of information extraction methods and techniques for heterogeneous unstructured big data Kiran Adnan Rehan Akbar https://doi.org/10.1177/1847979019890771
collection DOAJ
language English
format Article
sources DOAJ
author Kiran Adnan
Rehan Akbar
title Limitations of information extraction methods and techniques for heterogeneous unstructured big data
publisher SAGE Publishing
series International Journal of Engineering Business Management
issn 1847-9790
publishDate 2019-12-01
description In the current era of big data, a huge volume of unstructured data is being produced in various forms, including audio, video, images, text, and animation. Making effective use of this unstructured big data is laborious and tedious. Information extraction (IE) systems help to extract useful information from this large variety of unstructured data. Several techniques and methods have been presented for IE from unstructured data. However, numerous studies conducted on IE from unstructured data are limited to a single data type such as text, image, audio, or video. This article reviews the existing IE techniques, along with their subtasks, limitations, and challenges, across the variety of unstructured data, highlighting the impact of unstructured big data on IE techniques. To the best of our knowledge, no comprehensive study has investigated the limitations of existing IE techniques for the variety of unstructured big data. The objective of the structured review presented in this article is twofold. First, it presents an overview of IE techniques for a variety of unstructured data such as text, image, audio, and video in one place. Second, it investigates the limitations of these existing IE techniques due to the heterogeneity, dimensionality, and volume of unstructured big data. The review finds that advanced IE techniques, particularly for multifaceted unstructured big data sets, are a pressing requirement for organizations that need to manage big data and derive strategic information. Further, potential solutions are presented to improve unstructured big data IE systems in future research. These solutions will help to increase the efficiency and effectiveness of the data analytics process in terms of context-aware analytics systems, data-driven decision-making, and knowledge management.
url https://doi.org/10.1177/1847979019890771