Power analysis of knockoff filters for correlated designs

Bibliographic Details
Main Authors: Liu, Jingbo (Author), Rigollet, Philippe (Author)
Other Authors: Massachusetts Institute of Technology. Institute for Data, Systems, and Society (Contributor), Massachusetts Institute of Technology. Department of Mathematics (Contributor)
Format: Article
Language: English
Published: 2021-12-14T15:11:54Z.
Description
Summary: © 2019 Neural Information Processing Systems Foundation. All rights reserved. The knockoff filter introduced by Barber and Candès (2016) is an elegant framework for controlling the false discovery rate (FDR) in variable selection. While empirical results indicate that this methodology is not too conservative, there is no conclusive theoretical result on its power. When the predictors are i.i.d. Gaussian, it is known that as the signal-to-noise ratio tends to infinity, the knockoff filter is consistent in the sense that one can make the FDR go to 0 and the power go to 1 simultaneously. In this work we study the case where the predictors have a general covariance matrix S. We introduce a simple functional of the covariance matrix of the predictors, called effective signal deficiency (ESD), that predicts consistency of various variable selection methods. In particular, ESD reveals that the structure of the precision matrix plays a central role in consistency, and therefore so does the conditional independence structure of the predictors. To leverage this connection, we introduce the Conditional Independence knockoff, a simple procedure that is able to compete with the more sophisticated knockoff filters and that is defined when the predictors obey a Gaussian tree graphical model (or when the graph is sufficiently sparse). Our theoretical results are supported by numerical evidence on synthetic data.
NSF (Awards IIS-BIGDATA-1838071, DMS-1712596 and CCF-TRIPODS-1740751)
ONR (Grant N00014-17-1-2147)
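
Illustrative note: the summary speaks of driving the FDR to 0 and the power to 1 at the same time. As a rough sketch of what those two quantities measure, the Python snippet below applies the standard knockoff+ selection threshold of Barber and Candès to a vector of feature statistics W and then computes the false discovery proportion and the power of the resulting selection. The statistics, the true support, the target level q and the function names are all hypothetical values chosen for illustration; nothing here is taken from the article's experiments, and the snippet does not construct knockoffs or compute the ESD.

```python
import numpy as np

def knockoff_plus_threshold(W, q):
    """Smallest t in {|W_j| : W_j != 0} such that
    (1 + #{j : W_j <= -t}) / max(1, #{j : W_j >= t}) <= q.
    Returns np.inf when no such t exists (nothing is selected)."""
    W = np.asarray(W, dtype=float)
    candidates = np.unique(np.abs(W[W != 0]))  # np.unique returns sorted values
    for t in candidates:
        ratio = (1 + np.sum(W <= -t)) / max(1, np.sum(W >= t))
        if ratio <= q:
            return float(t)
    return np.inf

def fdp_and_power(selected, true_support):
    """False discovery proportion and power of a selected index set."""
    selected, true_support = set(selected), set(true_support)
    fdp = len(selected - true_support) / max(1, len(selected))
    power = len(selected & true_support) / max(1, len(true_support))
    return fdp, power

# Hypothetical feature statistics: large positive values suggest true signals.
W = np.array([5.0, 4.0, 3.5, 3.0, 2.5, -0.5, 0.4, 2.0, -0.3, 0.2])
true_support = [0, 1, 2, 3, 4, 7]  # assumed ground truth for this toy example

t = knockoff_plus_threshold(W, q=0.2)
selected = np.flatnonzero(W >= t).tolist()
fdp, power = fdp_and_power(selected, true_support)
print(t, selected, fdp, power)
```

In this toy input every true signal clears the threshold and no null does, which corresponds to the consistent regime described in the summary. The FDR is the expectation of the proportion computed here over repeated draws of the data.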