InCheck - An Integrated Recovery Methodology for Fine-grained Soft-Error Detection Schemes
abstract: Soft errors are considered a key reliability challenge for sub-nano scale transistors. An ideal solution for such a challenge should ultimately eliminate the effect of soft errors from the microprocessor. While forward recovery techniques achieve fast recovery from errors by simply voti...
Other Authors: | Lokam, Sai Ram Dheeraj (Author) |
---|---|
Format: | Dissertation |
Language: | English |
Published: | 2016 |
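Subjects: | Computer engineering; Computer science; Electrical engineering; Algorithms; Checkpointing; Recovery; Reliability; Silent Data Corruption; Soft-Errors |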
Online Access: | http://hdl.handle.net/2286/R.I.40720 |
id |
ndltd-asu.edu-item-40720 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-asu.edu-item-40720 2018-06-22T03:07:52Z InCheck - An Integrated Recovery Methodology for Fine-grained Soft-Error Detection Schemes abstract: Soft errors are considered a key reliability challenge for sub-nano scale transistors. An ideal solution for such a challenge should ultimately eliminate the effect of soft errors from the microprocessor. While forward recovery techniques achieve fast recovery from errors by simply voting out the wrong values, they incur the overhead of executing three copies. Backward recovery techniques need only two copies of execution, but suffer from checkpointing overhead. In this work I explored the efficiency of integrating checkpointing into the application and the effectiveness of the recovery that can be built upon it. After evaluating the available fine-grained approaches to recovery, I introduce InCheck, an in-application recovery scheme that can be integrated into instruction-duplication-based techniques, thus providing fast error recovery. The proposed technique takes lightweight checkpoints at basic-block granularity and uses them for recovery. To evaluate the effectiveness of the proposed technique, 10,000 fault-injection experiments were performed on different hardware components of a simulated modern in-order ARM processor. InCheck was able to recover from all detected errors by replaying about 20 instructions, whereas the state-of-the-art recovery scheme failed more than 200 times. Dissertation/Thesis Lokam, Sai Ram Dheeraj (Author) Shrivastava, Aviral (Advisor) Clark, Lawrence T (Committee member) Mubayi, Anuj (Committee member) Arizona State University (Publisher) Computer engineering Computer science Electrical engineering Algorithms Checkpointing Recovery Reliability Silent Data Corruption Soft-Errors eng 36 pages Masters Thesis Computer Science 2016 Masters Thesis http://hdl.handle.net/2286/R.I.40720 http://rightsstatements.org/vocab/InC/1.0/ All Rights Reserved 2016 |
collection |
NDLTD |
language |
English |
format |
Dissertation |
sources |
NDLTD |
topic |
Computer engineering; Computer science; Electrical engineering; Algorithms; Checkpointing; Recovery; Reliability; Silent Data Corruption; Soft-Errors |
spellingShingle |
Computer engineering Computer science Electrical engineering Algorithms Checkpointing Recovery Reliability Silent Data Corruption Soft-Errors InCheck - An Integrated Recovery Methodology for Fine-grained Soft-Error Detection Schemes |
description |
abstract: Soft errors are considered a key reliability challenge for sub-nano scale transistors. An ideal solution for such a challenge should ultimately eliminate the effect of soft errors from the microprocessor. While forward recovery techniques achieve fast recovery from errors by simply voting out the wrong values, they incur the overhead of executing three copies. Backward recovery techniques need only two copies of execution, but suffer from checkpointing overhead.
In this work I explored the efficiency of integrating checkpointing into the application and the effectiveness of the recovery that can be built upon it. After evaluating the available fine-grained approaches to recovery, I introduce InCheck, an in-application recovery scheme that can be integrated into instruction-duplication-based techniques, thus providing fast error recovery. The proposed technique takes lightweight checkpoints at basic-block granularity and uses them for recovery.
To evaluate the effectiveness of the proposed technique, 10,000 fault-injection experiments were performed on different hardware components of a simulated modern in-order ARM processor. InCheck was able to recover from all detected errors by replaying about 20 instructions, whereas the state-of-the-art recovery scheme failed more than 200 times. === Dissertation/Thesis === Masters Thesis Computer Science 2016 |
author2 |
Lokam, Sai Ram Dheeraj (Author) |
author_facet |
Lokam, Sai Ram Dheeraj (Author) |
title |
InCheck - An Integrated Recovery Methodology for Fine-grained Soft-Error Detection Schemes |
title_short |
InCheck - An Integrated Recovery Methodology for Fine-grained Soft-Error Detection Schemes |
title_full |
InCheck - An Integrated Recovery Methodology for Fine-grained Soft-Error Detection Schemes |
title_fullStr |
InCheck - An Integrated Recovery Methodology for Fine-grained Soft-Error Detection Schemes |
title_full_unstemmed |
InCheck - An Integrated Recovery Methodology for Fine-grained Soft-Error Detection Schemes |
title_sort |
incheck - an integrated recovery methodology for fine-grained soft-error detection schemes |
publishDate |
2016 |
url |
http://hdl.handle.net/2286/R.I.40720 |
_version_ |
1718701284315365376 |