Summary: | Master's === Fu Jen Catholic University === Department of Computer Science and Information Engineering === 99 === Software failures may lead to the loss of important data. Therefore, how to handle software failures
is a very important issue, and many studies have attempted to address it.
Checkpointing is a technique used to improve fault tolerance in software. While
a program is running, certain program states and information are stored in a checkpoint at
an appropriate time. If a software failure occurs, the program rolls back to the checkpoint
and re-executes from there. Execution resumes from the saved point by restoring the earlier
state, so checkpointing can save time compared with restarting the program from the beginning.
Many studies use checkpoint recovery mechanisms. However, most of these mechanisms
restore the program state to the most recent checkpoint and re-execute in a new
process when the program fails. This type of recovery mechanism handles
unexpected runtime errors such as transient errors. If the crash
is caused by wrong data entered by the user, the wrong data still exists even after rolling
back to the most recent checkpoint, so the program crashes, rolls back, crashes, and rolls
back again; the program and the recovery mechanism fall into an infinite loop.
This thesis proposes a multi-checkpoint recovery system to handle wrong data entered by the
user; the recovery system does not require the program to be recompiled. The recovery
system creates a checkpoint for the program each time the user inputs data. A program may
take more than one input, so many checkpoints may belong to one program. If user
input causes the program to crash, the recovery system shows the data that the user has
entered so far. This information lets the user consider which input
caused the crash; the user then chooses the checkpoint for that input, rolls back to it,
and enters new data. In this way the program avoids repeating the same error.
|
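The multi-checkpoint idea described above can be illustrated with a small sketch. This is only a toy model under stated assumptions, not the thesis's implementation: the thesis checkpoints a running program without recompiling it, whereas this sketch merely snapshots an in-memory state with `copy.deepcopy`. The names `MultiCheckpointSession`, `feed`, `past_inputs`, `rollback_and_replace`, and `divide_step` are hypothetical, and a Python exception stands in for a program crash.

```python
import copy


class MultiCheckpointSession:
    """Toy model: one checkpoint per user input, taken before the input is applied."""

    def __init__(self, initial_state, step):
        self.state = initial_state
        self.step = step          # step(state, value) -> new state; may raise
        self.checkpoints = []     # list of (state_before_input, input_value)

    def feed(self, value):
        # Snapshot the state before consuming the input, then apply the input.
        self.checkpoints.append((copy.deepcopy(self.state), value))
        self.state = self.step(self.state, value)

    def past_inputs(self):
        # The list a recovery system could show the user after a crash.
        return [value for _, value in self.checkpoints]

    def rollback_and_replace(self, index, new_value):
        # Restore the state saved before input `index`, discard that checkpoint
        # and all later ones, then continue with the corrected input.
        saved_state, _ = self.checkpoints[index]
        self.state = copy.deepcopy(saved_state)
        del self.checkpoints[index:]
        self.feed(new_value)


def divide_step(state, value):
    # Hypothetical program: the first input is a numerator, the second a divisor.
    if "numerator" not in state:
        return {**state, "numerator": value}
    return {**state, "result": state["numerator"] / value}


session = MultiCheckpointSession({}, divide_step)
session.feed(10)
try:
    session.feed(0)  # wrong input: ZeroDivisionError stands in for a crash
except ZeroDivisionError:
    print("crashed; past inputs:", session.past_inputs())
    session.rollback_and_replace(1, 5)  # user picks the second input and corrects it
print(session.state)
```

Rolling back to only the most recent checkpoint and replaying the same input would raise the same error again, which is the infinite loop the abstract describes. Keeping one checkpoint per input lets the user pick the input to correct. In this sketch, any inputs entered after the chosen one are discarded and would have to be entered again.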