Second Order Accurate Distributed Eigenvector Computation for Extremely Large Matrices

  • TITLE: Second Order Accurate Distributed Eigenvector Computation for Extremely Large Matrices.

  • AUTHORS: Noureddine El Karoui, Alexandre d'Aspremont

  • ABSTRACT: We propose a second-order accurate method to estimate the eigenvectors of extremely large matrices, thereby addressing a problem of relevance to statisticians working in the analysis of very large datasets. More specifically, we show that averaging eigenvectors of randomly subsampled matrices efficiently approximates the true eigenvectors of the original matrix under certain conditions on the incoherence of the spectral decomposition. This incoherence assumption is typically milder than those made in matrix completion and allows eigenvectors to be sparse. We discuss applications to spectral methods in dimensionality reduction and information retrieval.

  • STATUS: Electronic Journal of Statistics, 4, pp. 1345-1385, 2010.

  • ArXiv PREPRINT: 0908.0137

  • PAPER: Second Order Accurate Distributed Eigenvector Computation for Extremely Large Matrices (PDF)
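
The averaging idea described in the abstract can be illustrated with a small NumPy sketch. This is only an illustration under stated assumptions, not the authors' implementation: it assumes a symmetric input matrix, independent entrywise subsampling with keep probability `p` (rescaled by `1/p`), and a simple sign alignment before averaging. The abstract does not specify the sampling scheme, and the function names `subsample_symmetric`, `leading_eigvec`, and `averaged_eigvec` are hypothetical.

```python
import numpy as np


def subsample_symmetric(m, p, rng):
    """Keep each entry of symmetric m with probability p (assumed scheme).

    The keep/drop decision is mirrored across the diagonal so the result
    stays symmetric, and kept entries are rescaled by 1/p so that the
    subsampled matrix equals m in expectation.
    """
    n = m.shape[0]
    u = np.triu(rng.random((n, n)))   # uniform draws on the upper triangle
    u = u + np.triu(u, 1).T           # mirror them to the lower triangle
    keep = u < p
    return np.where(keep, m, 0.0) / p


def leading_eigvec(a):
    """Unit eigenvector for the largest eigenvalue of symmetric a."""
    # eigh returns eigenvalues in ascending order, so take the last column.
    _, vecs = np.linalg.eigh(a)
    return vecs[:, -1]


def averaged_eigvec(m, p=0.5, n_samples=20, seed=0):
    """Average leading eigenvectors of independently subsampled copies of m."""
    rng = np.random.default_rng(seed)
    total = np.zeros(m.shape[0])
    ref = None
    for _ in range(n_samples):
        v = leading_eigvec(subsample_symmetric(m, p, rng))
        if ref is None:
            ref = v                   # first sample fixes the sign convention
        elif v @ ref < 0:
            v = -v                    # eigenvectors are defined only up to sign
        total += v
    return total / np.linalg.norm(total)


if __name__ == "__main__":
    # Toy example: rank-one matrix with a known leading eigenvector.
    n = 60
    u_true = np.ones(n) / np.sqrt(n)
    m = 10.0 * np.outer(u_true, u_true)
    v_hat = averaged_eigvec(m, p=0.5, n_samples=20, seed=0)
    print("alignment with true eigenvector:", abs(v_hat @ u_true))
```

In a distributed setting each subsampled eigenvector could be computed on a separate machine, since the samples are independent; only the final averaging step needs the results gathered in one place.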