Detecting outliers using the least-squares regression line that is forced through an observation

Bibliographic Details
Main Authors: Ming-tan Hsieh, 謝明潭
Other Authors: Jung-Pin Wu
Format: Others
Language: zh-TW
Published: 2007
Online Access: http://ndltd.ncl.edu.tw/handle/44467562856032893098
Description
Summary: Master's thesis === Feng Chia University === Graduate Institute of Statistics and Actuarial Science === 95 === Cook's Distance is commonly used to detect outliers. When there is only one outlier its diagnostic performance is very good, but when there is more than one, the masking effect can easily lead to misjudgment. This thesis proposes a new method for detecting outliers. We use the least-squares method and the method of Lagrange multipliers to derive the least-squares regression line that is forced through a given observation, and also the least-squares regression line fitted after deleting that observation. We then calculate the angle between the two lines and use it to judge whether the observation is an outlier. Since the sampling distribution of the angle is not easy to derive, we use the bootstrap to simulate that distribution and thereby estimate the p-value of the angle. If the p-value is smaller than the probability of Type I error, α, the observation is declared an outlier. The proposed method is compared with several traditional diagnostic measures (Cook's Distance, leverage H, DFFITS, DFBETAS, and COVRATIO) through Monte Carlo simulation, using the Positive and False Positive rates to assess the quality of each diagnostic method: Positive is the proportion of true outliers that are not detected, and False Positive is the proportion of good observations mistaken for outliers.
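
The summary describes the procedure only in outline, so the following minimal Python sketch shows one way it could look. The constrained fit uses the closed form that the Lagrange-multiplier condition yields for a simple regression line; the function names, the angle statistic taken as the absolute difference of the arctangents of the two slopes, and the residual-resampling bootstrap scheme are all illustrative assumptions, not the author's exact implementation.

import numpy as np

def ols_line(x, y):
    """Ordinary least-squares line y = a + b*x."""
    b = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
    return y.mean() - b * x.mean(), b

def ols_through_point(x, y, xk, yk):
    """Least-squares line constrained to pass through (xk, yk).

    Minimizing sum((y_i - a - b*x_i)^2) subject to a + b*xk = yk
    (the Lagrange-multiplier condition) reduces to a no-intercept
    regression of (y - yk) on (x - xk)."""
    u, v = x - xk, y - yk
    b = np.sum(u * v) / np.sum(u * u)
    return yk - b * xk, b

def angle_statistic(x, y, k):
    """Angle between the line forced through observation k and the
    line refitted with observation k deleted."""
    _, b_forced = ols_through_point(x, y, x[k], y[k])
    _, b_deleted = ols_line(np.delete(x, k), np.delete(y, k))
    return abs(np.arctan(b_forced) - np.arctan(b_deleted))

def bootstrap_p_value(x, y, k, B=2000, seed=0):
    """Bootstrap p-value of the angle for observation k.

    The residual-resampling scheme under the delete-k fit is an
    assumption; the thesis only states that the bootstrap is used
    to simulate the angle's sampling distribution."""
    rng = np.random.default_rng(seed)
    observed = angle_statistic(x, y, k)
    xd, yd = np.delete(x, k), np.delete(y, k)
    a, b = ols_line(xd, yd)
    resid = yd - (a + b * xd)
    hits = 0
    for _ in range(B):
        # Rebuild a full sample, including position k, under the
        # null model (no outlier), then recompute the angle.
        y_star = a + b * x + rng.choice(resid, size=len(x), replace=True)
        hits += angle_statistic(x, y_star, k) >= observed
    return hits / B

# Small demonstration with one planted outlier.
rng = np.random.default_rng(1)
x = rng.uniform(0, 10, 30)
y = 2.0 + 0.5 * x + rng.normal(0.0, 1.0, 30)
y[5] += 8.0                         # contaminate observation 5
print(bootstrap_p_value(x, y, 5))   # small p-value: flagged as outlier
print(bootstrap_p_value(x, y, 10))  # clean point: larger p-value

In use, each observation would be tested in turn and flagged when its p-value falls below α.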