Software Techniques For Dependable Execution
abstract: Advances in semiconductor technology have brought computer-based systems into virtually all aspects of human life. This unprecedented integration of semiconductor-based systems in our lives has significantly increased the domain and the number of safety-critical applications – application w...
Other Authors: | |
---|---|
Format: | Doctoral Thesis |
Language: | English |
Published: | 2018 |
Subjects: | |
Online Access: | http://hdl.handle.net/2286/R.I.51604 |
id |
ndltd-asu.edu-item-51604 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-asu.edu-item-51604 2019-02-02T03:01:06Z Software Techniques For Dependable Execution abstract: Advances in semiconductor technology have brought computer-based systems into virtually all aspects of human life. This unprecedented integration of semiconductor-based systems in our lives has significantly increased the domain and the number of safety-critical applications – applications with unacceptable consequences of failure. Software-level error resilience schemes are attractive because they can provide commercial-off-the-shelf microprocessors with adaptive and scalable reliability. Among all software-level error resilience solutions, in-application instruction-replication-based approaches have been widely used and are deemed to be the most effective. However, existing instruction-based replication schemes protect only part of the computation, i.e., arithmetic and logical instructions, and leave the rest unprotected. To improve the efficacy of instruction-level redundancy-based approaches, we developed several error detection and error correction schemes. nZDC (near Zero silent Data Corruption) is an instruction duplication scheme which protects the execution of the whole application. Rather than detecting errors on register operands of memory and control flow operations, nZDC checks the results of such operations. nZDC ensures the correct execution of memory write instructions by reloading the stored value and checking it against the redundantly computed value. nZDC also introduces a novel control flow checking mechanism which replicates compare and branch instructions and detects both wrong-direction branches and unwanted jumps. Fault injection experiments show that nZDC can improve the error coverage of the state-of-the-art schemes by more than 10x, without incurring any additional performance penalty. Furthermore, we introduced two error recovery solutions. InCheck is our backward recovery solution which makes lightweight error-free checkpoints at the basic block granularity. In the case of an error, InCheck reverts the program execution to the beginning of the last executed basic block and resumes the execution with the aid of preserved information. NEMESIS is our forward recovery scheme which runs three versions of the computation and detects errors by checking the results of all memory write and branch operations. In the case of a mismatch, the NEMESIS diagnosis routine decides if the error is recoverable. If yes, the NEMESIS recovery routine reverts the effect of the error from the program state and resumes normal program execution from the error detection point. Dissertation/Thesis Didehban, Moslem (Author); Shrivastava, Aviral (Advisor); Wu, Carole-Jean (Committee member); Clark, Lawrence (Committee member); Mahlke, Scott (Committee member); Arizona State University (Publisher) Computer engineering; Computer science; Compiler transformation; Instruction Duplication; Redundancy; Reliability; Silent Data Corruption; Soft Error eng 129 pages Doctoral Dissertation Computer Engineering 2018 Doctoral Dissertation http://hdl.handle.net/2286/R.I.51604 http://rightsstatements.org/vocab/InC/1.0/ 2018 |
collection |
NDLTD |
language |
English |
format |
Doctoral Thesis |
sources |
NDLTD |
topic |
Computer engineering; Computer science; Compiler transformation; Instruction Duplication; Redundancy; Reliability; Silent Data Corruption; Soft Error |
spellingShingle |
Computer engineering Computer science Compiler transformation Instruction Duplication Redundancy Reliability Silent Data Corruption Soft Error Software Techniques For Dependable Execution |
description |
abstract: Advances in semiconductor technology have brought computer-based systems into virtually all aspects of human life. This unprecedented integration of semiconductor-based systems in our lives has significantly increased the domain and the number of safety-critical applications – applications with unacceptable consequences of failure. Software-level error resilience schemes are attractive because they can provide commercial-off-the-shelf microprocessors with adaptive and scalable reliability. Among all software-level error resilience solutions, in-application instruction-replication-based approaches have been widely used and are deemed to be the most effective. However, existing instruction-based replication schemes protect only part of the computation, i.e., arithmetic and logical instructions, and leave the rest unprotected. To improve the efficacy of instruction-level redundancy-based approaches, we developed several error detection and error correction schemes. nZDC (near Zero silent Data Corruption) is an instruction duplication scheme which protects the execution of the whole application. Rather than detecting errors on register operands of memory and control flow operations, nZDC checks the results of such operations. nZDC ensures the correct execution of memory write instructions by reloading the stored value and checking it against the redundantly computed value. nZDC also introduces a novel control flow checking mechanism which replicates compare and branch instructions and detects both wrong-direction branches and unwanted jumps. Fault injection experiments show that nZDC can improve the error coverage of the state-of-the-art schemes by more than 10x, without incurring any additional performance penalty. Furthermore, we introduced two error recovery solutions. InCheck is our backward recovery solution which makes lightweight error-free checkpoints at the basic block granularity. In the case of an error, InCheck reverts the program execution to the beginning of the last executed basic block and resumes the execution with the aid of preserved information. NEMESIS is our forward recovery scheme which runs three versions of the computation and detects errors by checking the results of all memory write and branch operations. In the case of a mismatch, the NEMESIS diagnosis routine decides if the error is recoverable. If yes, the NEMESIS recovery routine reverts the effect of the error from the program state and resumes normal program execution from the error detection point. === Dissertation/Thesis === Doctoral Dissertation Computer Engineering 2018 |
author2 |
Didehban, Moslem (Author) |
author_facet |
Didehban, Moslem (Author) |
title |
Software Techniques For Dependable Execution |
title_short |
Software Techniques For Dependable Execution |
title_full |
Software Techniques For Dependable Execution |
title_fullStr |
Software Techniques For Dependable Execution |
title_full_unstemmed |
Software Techniques For Dependable Execution |
title_sort |
software techniques for dependable execution |
publishDate |
2018 |
url |
http://hdl.handle.net/2286/R.I.51604 |
_version_ |
1718970016902152192 |
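
The store check that the abstract attributes to nZDC (reload the stored value and compare it against a redundantly computed value) can be illustrated with a toy model. This is only a sketch under stated assumptions, not the thesis implementation: the real scheme operates on machine instructions and registers, whereas here memory is a Python dict, and the names `checked_store`, `DetectedError`, and the `fault` hook are hypothetical. Reloading through a redundantly computed address is likewise an assumption made for this illustration.

```python
class DetectedError(Exception):
    """Raised when the reloaded value disagrees with the redundant copy."""


def checked_store(memory, addr, value, shadow_addr, shadow_value, fault=None):
    """Store `value` at `addr`, then reload and compare with the shadow copy.

    `memory` is a dict standing in for data memory (toy model).
    `fault` is a hypothetical hook modelling a soft error that strikes the
    store: it takes (addr, value) and returns the pair actually written.
    """
    written_addr, written_value = fault(addr, value) if fault else (addr, value)
    memory[written_addr] = written_value

    # Assumption for this sketch: the reload uses the redundantly computed
    # address, so a corrupted store address or store value both show up as
    # a mismatch against the redundantly computed value.
    reloaded = memory.get(shadow_addr)
    if reloaded != shadow_value:
        raise DetectedError(f"store check failed at address {shadow_addr:#x}")


def compute(x):
    """Stand-in for the computation whose result is stored."""
    return 3 * x + 1


if __name__ == "__main__":
    memory = {}
    master = compute(7)   # original instruction stream
    shadow = compute(7)   # duplicated instruction stream
    checked_store(memory, 0x10, master, 0x10, shadow)  # fault-free store passes

    try:
        # Model a single bit flip in the stored data.
        checked_store(memory, 0x10, master, 0x10, shadow,
                      fault=lambda a, v: (a, v ^ 4))
    except DetectedError as err:
        print("detected:", err)
```

Checking the result of the store, rather than its register operands beforehand, means an error that strikes during the store itself is still observed by the reload. The toy model cannot catch a misdirected store if the intended location already happens to hold the expected value.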