Clustering and Feature Selection using Sparse Principal Component Analysis

  • TITLE: Clustering and Feature Selection using Sparse Principal Component Analysis.

  • AUTHORS: Ronny Luss, Alexandre d'Aspremont

  • ABSTRACT: In this paper, we use sparse principal component analysis (PCA) to solve clustering and feature selection problems. Sparse PCA seeks sparse factors, or linear combinations of the data variables, explaining a maximum amount of variance in the data while having only a limited number of nonzero coefficients. PCA is often used as a simple clustering technique, and sparse factors allow us here to interpret the clusters in terms of a reduced set of variables. We begin with a brief introduction to and motivation for sparse PCA, and detail our implementation of the algorithm in d'Aspremont et al. (2005). We finish by describing the application of sparse PCA to clustering and by briefly describing DSPCA, the numerical package used in these experiments.

  • STATUS: Optimization & Engineering, 11(1), pp. 145-157, February 2010.

  • arXiv PREPRINT: 0707.0701

  • PAPER: Clustering and Feature Selection using Sparse Principal Component Analysis
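
The idea summarized in the abstract, a factor with few nonzero coefficients that still separates clusters, can be illustrated with a minimal sketch. This is not the DSPCA semidefinite relaxation described in the paper: it swaps in a simple truncated power iteration, chosen only because it fits in a few lines. The function name sparse_factor, the cardinality parameter k, and the synthetic two-cluster data are all assumptions made for illustration.

```python
import numpy as np

def sparse_factor(cov, k, iters=200):
    """Hypothetical helper: truncated power iteration (not DSPCA).

    Returns a unit-norm vector with at most k nonzero loadings that
    heuristically captures a large share of the variance in cov.
    """
    n = cov.shape[0]
    # Start from the coordinate with the largest variance.
    x = np.zeros(n)
    x[np.argmax(np.diag(cov))] = 1.0
    for _ in range(iters):
        y = cov @ x
        # Keep only the k entries of largest magnitude, zero the rest.
        keep = np.argsort(np.abs(y))[-k:]
        z = np.zeros(n)
        z[keep] = y[keep]
        norm = np.linalg.norm(z)
        if norm == 0:
            break
        z /= norm
        if np.allclose(z, x):
            break
        x = z
    return x

# Synthetic data (an assumption): two clusters differing only in features 0 and 1.
rng = np.random.default_rng(0)
n_per, n_feat = 50, 10
a = rng.normal(size=(n_per, n_feat))
b = rng.normal(size=(n_per, n_feat))
a[:, :2] += 4.0
b[:, :2] -= 4.0
data = np.vstack([a, b])
data = data - data.mean(axis=0)           # center before computing the covariance
cov = data.T @ data / data.shape[0]

factor = sparse_factor(cov, k=2)
support = np.flatnonzero(factor)          # variables selected by the sparse factor
scores = data @ factor                    # projection of each sample on the factor
labels = scores > 0                       # sign of the score as a crude cluster label
```

Here the nonzero entries of factor play the role of selected features, and the sign of each sample's score gives a crude two-way clustering that can be read in terms of those few variables.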