Time Series Analysis and Forecasting with Automated Machine Learning on a National ICD-10 Database

The application of machine learning (ML) for use in generating insights and making predictions on new records continues to expand within the medical community. Despite this progress to date, the application of time series analysis has remained underexplored due to complexity of the underlying techni...


Bibliographic Details
Main Authors: Victor Olsavszky, Mihnea Dosius, Cristian Vladescu, Johannes Benecke
Format: Article
Language: English
Published: MDPI AG 2020-07-01
Series: International Journal of Environmental Research and Public Health
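Subjects: automated machine learning, deep learning, artificial intelligence, deadliest diseases, time series, disease prediction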
Online Access: https://www.mdpi.com/1660-4601/17/14/4979
id doaj-c7e21b75182748a5a80fdcf90f098de7
record_format Article
spelling doaj-c7e21b75182748a5a80fdcf90f098de7
2020-11-25T03:23:38Z
eng
MDPI AG
International Journal of Environmental Research and Public Health
1661-7827
1660-4601
2020-07-01
17
4979
4979
10.3390/ijerph17144979
Time Series Analysis and Forecasting with Automated Machine Learning on a National ICD-10 Database
Victor Olsavszky (0)
Mihnea Dosius (1)
Cristian Vladescu (2)
Johannes Benecke (3)
Department of Dermatology, Venereology and Allergy, University Medical Center and Medical Faculty Mannheim, University of Heidelberg, and Center of Excellence in Dermatology, Theodor-Kutzer-Ufer 1–3, 68167 Mannheim, Germany
National School of Public Health Management and Professional Development, Str. Vaselor, nr. 31, Bucharest 030167, Romania
National School of Public Health Management and Professional Development, Str. Vaselor, nr. 31, Bucharest 030167, Romania
Department of Dermatology, Venereology and Allergy, University Medical Center and Medical Faculty Mannheim, University of Heidelberg, and Center of Excellence in Dermatology, Theodor-Kutzer-Ufer 1–3, 68167 Mannheim, Germany
The application of machine learning (ML) for use in generating insights and making predictions on new records continues to expand within the medical community. Despite this progress to date, the application of time series analysis has remained underexplored due to complexity of the underlying techniques. In this study, we have deployed a novel ML, called automated time series (AutoTS) machine learning, to automate data processing and the application of a multitude of models to assess which best forecasts future values. This rapid experimentation allows for and enables the selection of the most accurate model in order to perform time series predictions. By using the nation-wide ICD-10 (International Classification of Diseases, Tenth Revision) dataset of hospitalized patients of Romania, we have generated time series datasets over the period of 2008–2018 and performed highly accurate AutoTS predictions for the ten deadliest diseases. Forecast results for the years 2019 and 2020 were generated on a NUTS 2 (Nomenclature of Territorial Units for Statistics) regional level. This is the first study to our knowledge to perform time series forecasting of multiple diseases at a regional level using automated time series machine learning on a national ICD-10 dataset. The deployment of AutoTS technology can help decision makers in implementing targeted national health policies more efficiently.
https://www.mdpi.com/1660-4601/17/14/4979
automated machine learning
deep learning
artificial intelligence
deadliest diseases
time series
disease prediction
collection DOAJ
language English
format Article
sources DOAJ
author Victor Olsavszky
Mihnea Dosius
Cristian Vladescu
Johannes Benecke
author_sort Victor Olsavszky
title Time Series Analysis and Forecasting with Automated Machine Learning on a National ICD-10 Database
publisher MDPI AG
series International Journal of Environmental Research and Public Health
issn 1661-7827
1660-4601
publishDate 2020-07-01
description The application of machine learning (ML) for use in generating insights and making predictions on new records continues to expand within the medical community. Despite this progress to date, the application of time series analysis has remained underexplored due to complexity of the underlying techniques. In this study, we have deployed a novel ML, called automated time series (AutoTS) machine learning, to automate data processing and the application of a multitude of models to assess which best forecasts future values. This rapid experimentation allows for and enables the selection of the most accurate model in order to perform time series predictions. By using the nation-wide ICD-10 (International Classification of Diseases, Tenth Revision) dataset of hospitalized patients of Romania, we have generated time series datasets over the period of 2008–2018 and performed highly accurate AutoTS predictions for the ten deadliest diseases. Forecast results for the years 2019 and 2020 were generated on a NUTS 2 (Nomenclature of Territorial Units for Statistics) regional level. This is the first study to our knowledge to perform time series forecasting of multiple diseases at a regional level using automated time series machine learning on a national ICD-10 dataset. The deployment of AutoTS technology can help decision makers in implementing targeted national health policies more efficiently.
topic automated machine learning
deep learning
artificial intelligence
deadliest diseases
time series
disease prediction
url https://www.mdpi.com/1660-4601/17/14/4979
_version_ 1724605290999971840