InCheck - An Integrated Recovery Methodology for Fine-grained Soft-Error Detection Schemes

abstract: Soft errors are considered a key reliability challenge for sub-nano scale transistors. An ideal solution for such a challenge should ultimately eliminate the effect of soft errors from the microprocessor. While forward recovery techniques achieve fast recovery from errors by simply voti...


Bibliographic Details
Other Authors: Lokam, Sai Ram Dheeraj (Author)
Format: Dissertation
Language: English
Published: 2016
Subjects:
Online Access: http://hdl.handle.net/2286/R.I.40720
id ndltd-asu.edu-item-40720
record_format oai_dc
spelling ndltd-asu.edu-item-40720 2018-06-22T03:07:52Z InCheck - An Integrated Recovery Methodology for Fine-grained Soft-Error Detection Schemes abstract: Soft errors are considered a key reliability challenge for sub-nano scale transistors. An ideal solution for such a challenge should ultimately eliminate the effect of soft errors from the microprocessor. While forward recovery techniques achieve fast recovery from errors by simply voting out the wrong values, they incur the overhead of three copies of execution. Backward recovery techniques need only two copies of execution, but suffer from checkpointing overhead. In this work, I explore the efficiency of integrating checkpointing into the application and the effectiveness of the recovery that can be performed upon it. After evaluating the available fine-grained approaches to recovery, I introduce InCheck, an in-application recovery scheme that can be integrated into instruction-duplication-based techniques, thus providing fast error recovery. The proposed technique makes lightweight checkpoints at basic-block granularity and uses them for recovery. To evaluate the effectiveness of the proposed technique, 10,000 fault-injection experiments were performed on different hardware components of a simulated modern in-order ARM processor. InCheck was able to recover from all detected errors by replaying about 20 instructions, whereas the state-of-the-art recovery scheme failed more than 200 times. Dissertation/Thesis Lokam, Sai Ram Dheeraj (Author) Shrivastava, Aviral (Advisor) Clark, Lawrence T (Committee member) Mubayi, Anuj (Committee member) Arizona State University (Publisher) Computer engineering Computer science Electrical engineering Algorithms Checkpointing Recovery Reliability Silent Data Corruption Soft-Errors eng 36 pages Masters Thesis Computer Science 2016 Masters Thesis http://hdl.handle.net/2286/R.I.40720 http://rightsstatements.org/vocab/InC/1.0/ All Rights Reserved 2016
collection NDLTD
language English
format Dissertation
sources NDLTD
topic Computer engineering
Computer science
Electrical engineering
Algorithms
Checkpointing
Recovery
Reliability
Silent Data Corruption
Soft-Errors
description abstract: Soft errors are considered a key reliability challenge for sub-nano scale transistors. An ideal solution for such a challenge should ultimately eliminate the effect of soft errors from the microprocessor. While forward recovery techniques achieve fast recovery from errors by simply voting out the wrong values, they incur the overhead of three copies of execution. Backward recovery techniques need only two copies of execution, but suffer from checkpointing overhead. In this work, I explore the efficiency of integrating checkpointing into the application and the effectiveness of the recovery that can be performed upon it. After evaluating the available fine-grained approaches to recovery, I introduce InCheck, an in-application recovery scheme that can be integrated into instruction-duplication-based techniques, thus providing fast error recovery. The proposed technique makes lightweight checkpoints at basic-block granularity and uses them for recovery. To evaluate the effectiveness of the proposed technique, 10,000 fault-injection experiments were performed on different hardware components of a simulated modern in-order ARM processor. InCheck was able to recover from all detected errors by replaying about 20 instructions, whereas the state-of-the-art recovery scheme failed more than 200 times. === Dissertation/Thesis === Masters Thesis Computer Science 2016
author2 Lokam, Sai Ram Dheeraj (Author)
author_facet Lokam, Sai Ram Dheeraj (Author)
title InCheck - An Integrated Recovery Methodology for Fine-grained Soft-Error Detection Schemes
publishDate 2016
url http://hdl.handle.net/2286/R.I.40720
_version_ 1718701284315365376