Machine Learning-Based Code Auto-Completion Implementation for Firmware Developers

With the advent of artificial intelligence, the research paradigm in natural language processing has shifted from statistical methods to machine learning-based approaches. One application is a deep learning-based language model that helps software engineers write code faster. Al...

Bibliographic Details
Main Authors: Junghyun Kim, Kyuman Lee, Sanghyun Choi
Format: Article
Language: English
Published: MDPI AG 2020-11-01
Series: Applied Sciences
Subjects: machine learning; code auto-completion; GPT-2 model; advanced design methods
Online Access: https://www.mdpi.com/2076-3417/10/23/8520
id doaj-ae8c5424380246609ce32fc2f998b4de
record_format Article
spelling doaj-ae8c5424380246609ce32fc2f998b4de
2020-11-29T00:03:09Z
eng
MDPI AG
Applied Sciences
2076-3417
2020-11-01
10
8520
8520
10.3390/app10238520
Machine Learning-Based Code Auto-Completion Implementation for Firmware Developers
Junghyun Kim (0)
Kyuman Lee (1)
Sanghyun Choi (2)
School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, GA 30332, USA
Department of Robot and Smart System Engineering, Kyungpook National University, Daegu 41566, Korea
Memory S/W Development Team, Samsung Electronics, Hwasung 18448, Korea
With the advent of artificial intelligence, the research paradigm in natural language processing has shifted from statistical methods to machine learning-based approaches. One application is a deep learning-based language model that helps software engineers write code faster. Although several research groups have already attempted to develop code auto-completion functionality, a need for an in-house implementation has been identified for two reasons: (1) a security-sensitive company (e.g., Samsung Electronics) may not want to use commercial tools, given the risk of leaking source code, and (2) commercial tools may not suit a specific domain (e.g., SSD firmware development), especially when one needs to predict unique code patterns and style. This research proposes a hybrid approach that combines machine learning techniques with advanced design methods to develop a code auto-completion framework that helps firmware developers write code more efficiently. The sensitivity analysis results show that the deterministic design reduces prediction accuracy because it generates output in unexpected ways, while the probabilistic design provides a list of reasonable next code elements from which one can be selected manually to increase prediction accuracy.
https://www.mdpi.com/2076-3417/10/23/8520
machine learning
code auto-completion
GPT-2 model
advanced design methods
collection DOAJ
language English
format Article
sources DOAJ
author Junghyun Kim
Kyuman Lee
Sanghyun Choi
title Machine Learning-Based Code Auto-Completion Implementation for Firmware Developers
publisher MDPI AG
series Applied Sciences
issn 2076-3417
publishDate 2020-11-01
description With the advent of artificial intelligence, the research paradigm in natural language processing has shifted from statistical methods to machine learning-based approaches. One application is a deep learning-based language model that helps software engineers write code faster. Although several research groups have already attempted to develop code auto-completion functionality, a need for an in-house implementation has been identified for two reasons: (1) a security-sensitive company (e.g., Samsung Electronics) may not want to use commercial tools, given the risk of leaking source code, and (2) commercial tools may not suit a specific domain (e.g., SSD firmware development), especially when one needs to predict unique code patterns and style. This research proposes a hybrid approach that combines machine learning techniques with advanced design methods to develop a code auto-completion framework that helps firmware developers write code more efficiently. The sensitivity analysis results show that the deterministic design reduces prediction accuracy because it generates output in unexpected ways, while the probabilistic design provides a list of reasonable next code elements from which one can be selected manually to increase prediction accuracy.
topic machine learning
code auto-completion
GPT-2 model
advanced design methods
url https://www.mdpi.com/2076-3417/10/23/8520