Fast estimation of approximate matrix ranks using spectral densities
Abstract
Many machine learning and data-related applications require knowledge of the approximate ranks of large data matrices. In this paper, we present two computationally inexpensive techniques to estimate the approximate ranks of such matrices. These techniques exploit approximate spectral densities, popular in physics, which are probability density distributions that measure the likelihood of finding eigenvalues of the matrix at a given point on the real line. Integrating the spectral density over an interval gives the eigenvalue count of the matrix in that interval. The rank can therefore be approximated by integrating the spectral density over a carefully selected interval. Two approaches to estimating the approximate rank are discussed, one based on Chebyshev polynomials and the other on the Lanczos algorithm. To obtain the appropriate interval, it is necessary to locate a gap between the eigenvalues that correspond to noise and the relevant eigenvalues that contribute to the matrix rank. A method for locating this gap and selecting the interval of integration is proposed, based on the plot of the spectral density. Numerical experiments illustrate the performance of these techniques on matrices from typical applications.
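The eigenvalue-count idea can be illustrated with a minimal sketch in the spirit of the Chebyshev-based approach. This is not the paper's implementation. The Jackson damping, the Rademacher probe vectors (a Hutchinson-type trace estimator), the function names `jackson_step_coeffs` and `estimate_rank`, and the assumption that the threshold `eps` and the spectrum bounds `lmin`, `lmax` are already known are all choices made here for illustration only.

```python
import numpy as np

def jackson_step_coeffs(a, m):
    """Jackson-damped Chebyshev coefficients of the indicator of [a, 1] on [-1, 1]."""
    theta = np.arccos(a)
    k = np.arange(1, m + 1)
    gamma = np.empty(m + 1)
    gamma[0] = theta / np.pi
    gamma[1:] = 2.0 * np.sin(k * theta) / (k * np.pi)
    # Jackson damping factors reduce Gibbs oscillations near the jump at a.
    alpha = np.pi / (m + 2)
    j = np.arange(m + 1)
    damp = ((1.0 - j / (m + 2)) * np.sin(alpha) * np.cos(j * alpha)
            + np.cos(alpha) * np.sin(j * alpha) / (m + 2)) / np.sin(alpha)
    return gamma * damp

def estimate_rank(A, eps, lmin, lmax, degree=100, n_vec=100, seed=0):
    """Estimate the number of eigenvalues of symmetric A in [eps, lmax].

    Assumes (hypothetically) that eps and bounds lmin <= spectrum <= lmax are given.
    """
    n = A.shape[0]
    rng = np.random.default_rng(seed)
    c, h = (lmax + lmin) / 2.0, (lmax - lmin) / 2.0
    mu = jackson_step_coeffs((eps - c) / h, degree)

    # Rademacher probes: E[v^T f(A) v] = trace(f(A)).
    V = rng.choice([-1.0, 1.0], size=(n, n_vec))

    def apply_B(X):
        # B = (A - c I) / h has its spectrum inside [-1, 1].
        return (A @ X - c * X) / h

    # Three-term recurrence: T_{k+1}(B) V = 2 B T_k(B) V - T_{k-1}(B) V.
    T_prev, T_curr = V, apply_B(V)
    acc = mu[0] * np.sum(V * T_prev) + mu[1] * np.sum(V * T_curr)
    for k in range(2, degree + 1):
        T_prev, T_curr = T_curr, 2.0 * apply_B(T_curr) - T_prev
        acc += mu[k] * np.sum(V * T_curr)
    return acc / n_vec

# Synthetic test matrix: 10 "relevant" eigenvalues in [1, 2], 190 "noise" ones in [0, 0.01].
rng = np.random.default_rng(1)
n, r = 200, 10
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
eigs = np.concatenate([rng.uniform(1.0, 2.0, r), rng.uniform(0.0, 0.01, n - r)])
A = (Q * eigs) @ Q.T

est = estimate_rank(A, eps=0.5, lmin=-0.1, lmax=2.1)
print(f"estimated approximate rank: {est:.2f}")
```

The test matrix has exactly 10 eigenvalues above the threshold by construction, so the estimate should land near 10, within the statistical error of the random probes. Only matrix-vector products with `A` are needed, which is what makes this kind of estimator inexpensive for large matrices. In practice the threshold would be chosen from the gap in the estimated spectral density, as the abstract describes, rather than supplied by hand.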
- Publication: arXiv e-prints
- Pub Date: August 2016
- arXiv: arXiv:1608.05754
- Bibcode: 2016arXiv160805754U
- Keywords: Computer Science - Numerical Analysis; Computer Science - Machine Learning; Mathematics - Numerical Analysis
- E-Print: Neural Computation, Vol. 29, No. 5, pp. 1317-1351 (May 2017)