Real-time data analysis for medical diagnosis using FPGA-accelerated neural networks

Abstract. Background: Real-time analysis of patient data during medical procedures can provide vital diagnostic feedback that significantly improves chances of success. With sensors becoming increasingly fast, frameworks such as Deep Neural Networks are required to perform calculations within the stri...


Bibliographic Details
Main Authors: Ahmed Sanaullah, Chen Yang, Yuri Alexeev, Kazutomo Yoshii, Martin C. Herbordt
Format: Article
Language: English
Published: BMC 2018-12-01
Series: BMC Bioinformatics
Subjects: FPGA, Machine learning, Multi-layer perceptrons, Real-time, Inference, Cancer
Online Access: http://link.springer.com/article/10.1186/s12859-018-2505-7
id doaj-dfd9db05ddff4cf89cd9978eef71f662
record_format Article
spelling doaj-dfd9db05ddff4cf89cd9978eef71f662; 2020-11-25T01:31:14Z; eng; BMC; BMC Bioinformatics; ISSN 1471-2105; 2018-12-01; 19; S18; 19; 31; 10.1186/s12859-018-2505-7
Real-time data analysis for medical diagnosis using FPGA-accelerated neural networks
Ahmed Sanaullah (Computer Architecture and Automated Design Lab)
Chen Yang (Computer Architecture and Automated Design Lab)
Yuri Alexeev (Argonne Leadership Computing Facility)
Kazutomo Yoshii (Mathematics and Computer Science Division)
Martin C. Herbordt (Computer Architecture and Automated Design Lab)
http://link.springer.com/article/10.1186/s12859-018-2505-7
FPGA; Machine learning; Multi-layer perceptrons; Real-time; Inference; Cancer
collection DOAJ
language English
format Article
sources DOAJ
author Ahmed Sanaullah
Chen Yang
Yuri Alexeev
Kazutomo Yoshii
Martin C. Herbordt
title Real-time data analysis for medical diagnosis using FPGA-accelerated neural networks
publisher BMC
series BMC Bioinformatics
issn 1471-2105
publishDate 2018-12-01
description Abstract
Background: Real-time analysis of patient data during medical procedures can provide vital diagnostic feedback that significantly improves chances of success. As sensors become increasingly fast, frameworks such as Deep Neural Networks must complete their calculations within the strict timing constraints of real-time operation. However, the traditional computing platforms that run these algorithms incur large overheads from communication protocols, memory accesses, and static (often generic) architectures. In this work, we implement a low-latency Multi-Layer Perceptron (MLP) processor using Field Programmable Gate Arrays (FPGAs). Unlike CPUs and Graphics Processing Units (GPUs), our FPGA-based design can interface directly with sensors, storage devices, display devices and even actuators, thus reducing the delays of data movement between ports and compute pipelines. Moreover, the compute pipelines themselves are tailored specifically to the application, improving resource utilization and reducing idle cycles. We demonstrate the effectiveness of our approach using mass-spectrometry data sets for real-time cancer detection.
Results: We demonstrate that correct parameter sizing, based on the application, can reduce latency by 20% on average. Furthermore, we show that in an application with tightly coupled data-path and latency constraints, having a large amount of computing resources can actually reduce performance. Using mass-spectrometry benchmarks, we show that our proposed FPGA design outperforms both CPU and GPU implementations, with average speedups of 144x and 21x, respectively.
Conclusion: In our work, we demonstrate the importance of application-specific optimizations in order to minimize latency and maximize resource utilization for MLP inference. By directly interfacing with sensors and processing their data with ultra-low latency, FPGAs can perform real-time analysis during procedures and provide diagnostic feedback that can be critical to achieving higher percentages of successful patient outcomes.
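The abstract names MLP inference as the computation being accelerated. As a rough software illustration only, the sketch below shows what a single MLP forward pass computes. It is not the authors' FPGA design: the layer sizes, the weights in `TOY_LAYERS`, the ReLU activation, and the function names `dense`, `relu`, `mlp_infer` and `predict` are all assumptions chosen for the example.

```python
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]
Layer = Tuple[Matrix, Vector]  # (weights, bias); one weight row per output neuron


def dense(x: Vector, weights: Matrix, bias: Vector) -> Vector:
    """Fully connected layer: one multiply-accumulate chain per output neuron."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(v: Vector) -> Vector:
    """Element-wise ReLU (assumed activation; the record does not name one)."""
    return [max(0.0, a) for a in v]


def mlp_infer(x: Vector, layers: List[Layer]) -> Vector:
    """Run the layers in order, applying ReLU after every layer except the last."""
    h = x
    for i, (weights, bias) in enumerate(layers):
        h = dense(h, weights, bias)
        if i < len(layers) - 1:
            h = relu(h)
    return h


def predict(x: Vector, layers: List[Layer]) -> int:
    """Return the index of the highest-scoring output neuron."""
    scores = mlp_infer(x, layers)
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical toy network: 3 inputs -> 2 hidden neurons -> 2 output classes.
TOY_LAYERS: List[Layer] = [
    ([[1.0, 0.0, -1.0], [0.5, 0.5, 0.5]], [0.0, -1.0]),
    ([[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0]),
]

if __name__ == "__main__":
    print(mlp_infer([2.0, 1.0, 0.0], TOY_LAYERS))
    print(predict([0.0, 2.0, 2.0], TOY_LAYERS))
```

Each layer reduces to a set of independent multiply-accumulate chains, which is the kind of regular arithmetic that a pipelined hardware implementation can be tailored to; how the authors actually map it to the FPGA is not described in this record.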
topic FPGA
Machine learning
Multi-layer perceptrons
Real-time
Inference
Cancer
url http://link.springer.com/article/10.1186/s12859-018-2505-7