Design of Novel ETL Model to Analyse Corona Virus Data

INTRODUCTION: The coronavirus disease was first recognized in 2019 in Wuhan, the capital of China’s Hubei province. It has continued to spread since then and was subsequently declared a pandemic. The COVID-19 virus affects people in different ways. It is a kind of respira...


Bibliographic Details
Main Authors: Amit Dewangan, S.M. Ghosh, Akhilesh Shrivas
Format: Article
Language: English
Published: European Alliance for Innovation (EAI) 2020-09-01
Series: EAI Endorsed Transactions on Pervasive Health and Technology
Subjects:
etl
Online Access: https://eudl.eu/pdf/10.4108/eai.13-7-2018.165671
id doaj-e60defdda1244543b48cda0fb32b35eb
record_format Article
spelling doaj-e60defdda1244543b48cda0fb32b35eb
2020-11-25T02:50:03Z
eng
European Alliance for Innovation (EAI)
EAI Endorsed Transactions on Pervasive Health and Technology
2411-7145
2020-09-01
6
23
10.4108/eai.13-7-2018.165671
Design of Novel ETL Model to Analyse Corona Virus Data
Amit Dewangan (0)
S.M. Ghosh (1)
Akhilesh Shrivas (2)
Department of Information Technology, Guru Ghasidas Vishwavidyalaya, Bilaspur, India
Department of Computer Science and Engineering, Dr. C.V. Raman University, Kota, Bilaspur, India
Department of Computer Science and Information Technology, Guru Ghasidas Vishwavidyalaya, Bilaspur, India
https://eudl.eu/pdf/10.4108/eai.13-7-2018.165671
corona virus
text mining
data analytics
etl
covid-19
pandemic
collection DOAJ
language English
format Article
sources DOAJ
author Amit Dewangan
S.M. Ghosh
Akhilesh Shrivas
title Design of Novel ETL Model to Analyse Corona Virus Data
publisher European Alliance for Innovation (EAI)
series EAI Endorsed Transactions on Pervasive Health and Technology
issn 2411-7145
publishDate 2020-09-01
description INTRODUCTION: The coronavirus disease was first recognized in 2019 in Wuhan, the capital of China’s Hubei province. It has continued to spread since then and was subsequently declared a pandemic. The COVID-19 virus affects people in different ways. It is a kind of respiratory disease. Confirmed cases are increasing day by day in India, which has led to a complete lockdown throughout the nation.
OBJECTIVE: The objective of this research is to design a novel Extract-Transform-Load (NETL) model to analyse COVID-19 data in India.
METHODS: The extraction of useful information from a large database is a research field closely connected to text mining. This paper proposes a novel extract-transform-load (ETL) model that processes the COVID-19 data of India to obtain exact recovery data from multiple data sources across different states of India. It discusses a knowledge-based model that generates knowledge using three modules: split, validation, and join.
RESULTS: The outcome of the proposed NETL process is an output file that describes total positive cases, active cases, recovery cases, and death rate by region. NETL is analysed in terms of accuracy, failure count, and execution time. The proposed NETL process is more accurate and takes less compilation time, with a minimum failure count, compared with existing models.
CONCLUSION: To analyze the coronavirus data in India, a novel ETL (NETL) model is proposed. In this model, a total of 9 CSV files are processed as input files to get different results in different categories. The model has three modules, namely splitting, verification, and join. The dataset is split based on its coupling attributes and then joined with a single value to produce updated results for the current dataset. The last stage of the process joins the data generated through splitting. The proposed NETL model is more accurate than existing ETL models.
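The split, validation, and join stages named in the description can be illustrated with a minimal Python sketch. This is not the authors' NETL implementation, which the record does not reproduce. The column names (state, confirmed, recovered, deaths), the choice of state as the coupling attribute, the validation rules, and the assumption that each row carries counts that can be summed are all hypothetical. The function names split_by_state, validate, and run_etl are likewise invented for illustration.

```python
import csv
import io
from collections import defaultdict


def split_by_state(rows):
    """Split step: group raw rows by the assumed coupling attribute 'state'."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row.get("state") or "").strip()].append(row)
    return groups


def validate(row):
    """Validation step: return integer counts, or None if the row is unusable."""
    try:
        confirmed = int(row["confirmed"])
        recovered = int(row["recovered"])
        deaths = int(row["deaths"])
    except (KeyError, ValueError, TypeError):
        return None
    # Assumed sanity rules: no negative counts, outcomes cannot exceed cases.
    if min(confirmed, recovered, deaths) < 0 or recovered + deaths > confirmed:
        return None
    return confirmed, recovered, deaths


def run_etl(sources):
    """Extract rows from CSV texts, then split, validate, and join them.

    Returns (summary, failures): a per-state summary dict and the number of
    rows rejected during validation.
    """
    rows = []
    for text in sources:  # extract: each source is the text of one CSV file
        rows.extend(csv.DictReader(io.StringIO(text)))

    summary, failures = {}, 0
    for state, group in split_by_state(rows).items():
        totals = [0, 0, 0]
        for row in group:
            cleaned = validate(row)
            if cleaned is None or not state:
                failures += 1
                continue
            totals = [t + c for t, c in zip(totals, cleaned)]
        confirmed, recovered, deaths = totals
        if confirmed == 0:
            continue  # nothing valid to report for this group
        # Join step: merge the validated pieces into one record per state.
        summary[state] = {
            "confirmed": confirmed,
            "active": confirmed - recovered - deaths,
            "recovered": recovered,
            "death_rate": deaths / confirmed,
        }
    return summary, failures
```

Calling run_etl with a list of CSV texts, one per input file, returns the per-state summary together with a failure count, which loosely corresponds to the failure-count metric mentioned in the results.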
topic corona virus
text mining
data analytics
etl
covid-19
pandemic
url https://eudl.eu/pdf/10.4108/eai.13-7-2018.165671