Scaling law for recovering the sparsest element in a subspace

We address the problem of recovering a sparse n-vector within a given subspace. This problem is a subtask of some approaches to dictionary learning and sparse principal component analysis. Hence, if we can prove scaling laws for recovery of sparse vectors, it will be easier to derive and prove recovery results in these applications. In this paper, we present a scaling law for recovering the sparse vector from a subspace that is spanned by the sparse vector and k random vectors. We prove that the sparse vector will be the output of one of n linear programs with high probability if its support size s satisfies s ≲ n/(√k log n). The scaling law still holds when the desired vector is approximately sparse. To get a single estimate for the sparse vector from the n linear programs, we must select which output is the sparsest. This selection process can be based on any proxy for sparsity, and the specific proxy has the potential to improve or worsen the scaling law. If sparsity is interpreted in an ℓ1/ℓ∞ sense, then the scaling law cannot be better than s ≲ n/√k. Computer simulations show that selecting the sparsest output in the ℓ1/ℓ2 or thresholded-ℓ0 senses can lead to a larger parameter range for successful recovery than that given by the ℓ1/ℓ∞ sense.

Bibliographic Details
Main Authors: Demanet, Laurent (Contributor), Hand, Paul (Contributor)
Other Authors: Massachusetts Institute of Technology. Department of Mathematics (Contributor)
Format: Article
Language: English
Published: Oxford University Press (OUP), 2018-05-17T19:30:44Z.
Subjects:
Online Access: Get fulltext
LEADER 01903 am a22001813u 4500
001 115483
042 |a dc 
100 1 0 |a Demanet, Laurent  |e author 
100 1 0 |a Massachusetts Institute of Technology. Department of Mathematics  |e contributor 
100 1 0 |a Demanet, Laurent  |e contributor 
100 1 0 |a Hand, Paul  |e contributor 
700 1 0 |a Hand, Paul  |e author 
245 0 0 |a Scaling law for recovering the sparsest element in a subspace 
260 |b Oxford University Press (OUP),   |c 2018-05-17T19:30:44Z. 
856 |z Get fulltext  |u http://hdl.handle.net/1721.1/115483 
520 |a We address the problem of recovering a sparse n-vector within a given subspace. This problem is a subtask of some approaches to dictionary learning and sparse principal component analysis. Hence, if we can prove scaling laws for recovery of sparse vectors, it will be easier to derive and prove recovery results in these applications. In this paper, we present a scaling law for recovering the sparse vector from a subspace that is spanned by the sparse vector and k random vectors. We prove that the sparse vector will be the output of one of n linear programs with high probability if its support size s satisfies s ≲ n/(√k log n). The scaling law still holds when the desired vector is approximately sparse. To get a single estimate for the sparse vector from the n linear programs, we must select which output is the sparsest. This selection process can be based on any proxy for sparsity, and the specific proxy has the potential to improve or worsen the scaling law. If sparsity is interpreted in an ℓ1/ℓ∞ sense, then the scaling law cannot be better than s ≲ n/√k. Computer simulations show that selecting the sparsest output in the ℓ1/ℓ2 or thresholded-ℓ0 senses can lead to a larger parameter range for successful recovery than that given by the ℓ1/ℓ∞ sense. 
655 7 |a Article 
773 |t Information and Inference
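
For readers who want to experiment with the procedure described in the abstract, the following is a minimal sketch (not the authors' code) of the recovery pipeline: it plants a sparse vector in a subspace spanned by that vector and k Gaussian vectors, solves the n ℓ1-minimization linear programs (one per coordinate constraint x_i = 1) with scipy.optimize.linprog, and keeps the sparsest output under the ℓ1/ℓ2 proxy; the ℓ1/ℓ∞ and thresholded-ℓ0 proxies mentioned in the abstract are included for comparison. The dimensions n, k, s, the threshold tau, and all helper names are illustrative assumptions, not taken from the paper.

```python
# Sketch of the recovery procedure described in the abstract (illustrative only).
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
n, k, s = 60, 3, 5        # ambient dimension, number of random spanning vectors, support size

# Plant a sparse vector x0 and form a basis B for the subspace span{x0, g_1, ..., g_k}.
x0 = np.zeros(n)
x0[rng.choice(n, size=s, replace=False)] = rng.standard_normal(s)
B = np.column_stack([x0, rng.standard_normal((n, k))])        # n x (k+1)

def l1_program(i):
    """Solve  min ||B c||_1  subject to  (B c)_i = 1  as an LP with slacks t >= |B c|."""
    m = k + 1
    cost = np.concatenate([np.zeros(m), np.ones(n)])          # minimize the sum of slacks
    A_ub = np.block([[B, -np.eye(n)], [-B, -np.eye(n)]])      #  B c <= t  and  -B c <= t
    b_ub = np.zeros(2 * n)
    A_eq = np.concatenate([B[i], np.zeros(n)])[None, :]       # (B c)_i = 1
    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=[(None, None)] * m + [(0, None)] * n)
    return B @ res.x[:m] if res.success else None

candidates = [x for x in (l1_program(i) for i in range(n)) if x is not None]

# Sparsity proxies named in the abstract (smaller value = "sparser" candidate).
def l1_l2(x):   return np.abs(x).sum() / np.linalg.norm(x)
def l1_linf(x): return np.abs(x).sum() / np.abs(x).max()
def thr_l0(x, tau=1e-3):                                      # tau is an arbitrary threshold
    return np.count_nonzero(np.abs(x) > tau * np.abs(x).max())

x_hat = min(candidates, key=l1_l2)        # swap in l1_linf or thr_l0 to compare proxies

# The planted vector is only identifiable up to scale, so check cosine similarity.
cos = abs(x0 @ x_hat) / (np.linalg.norm(x0) * np.linalg.norm(x_hat))
print(f"recovered (up to scale): {cos > 0.999}   cosine similarity = {cos:.4f}")
```

With these illustrative parameter values the support size lies well inside the s ≲ n/√k regime discussed in the abstract, so the ℓ1/ℓ2 selection typically returns the planted vector up to scale; increasing s or k pushes the experiment toward the failure region of the scaling law.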