Academic and Research Information Guide: Computer Science (02): Matrix algebra; Algorithms; Random projections

Top-cited papers

Sketching as a tool for numerical linear algebra.
Woodruff, D.P.
(2014) Foundations and Trends in Theoretical Computer Science, 10 (1-2), pp. 1-157.


This survey highlights the recent advances in algorithms for numerical linear algebra that have come from the technique of linear sketching, whereby given a matrix, one first compresses it to a much smaller matrix by multiplying it by a (usually) random matrix with certain properties. Much of the expensive computation can then be performed on the smaller matrix, thereby accelerating the solution for the original problem. In this survey we consider least squares as well as robust regression problems, low rank approximation, and graph sparsification. We also discuss a number of variants of these problems. Finally, we discuss the limitations of sketching methods. © 2014 D. P. Woodruff.
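The compress-then-solve idea in the abstract above can be illustrated for least squares. This is a minimal sketch, not the survey's own algorithm: it assumes a dense Gaussian sketching matrix, and the function name `sketched_least_squares`, the problem sizes, and the sketch size of 200 rows are hypothetical choices made only for this example.

```python
import numpy as np

def sketched_least_squares(A, b, sketch_rows, seed=0):
    """Approximately solve min_x ||Ax - b||_2 on a compressed problem.

    Assumption: a dense Gaussian sketch S; faster sketches exist but are
    not shown here.
    """
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    # S has shape (sketch_rows, n) and is scaled so that E[S^T S] = I.
    S = rng.standard_normal((sketch_rows, n)) / np.sqrt(sketch_rows)
    # Solve the much smaller problem min_x ||S A x - S b||_2.
    x, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)
    return x

# Hypothetical test problem: 2000 equations, 10 unknowns, small noise.
rng = np.random.default_rng(1)
A = rng.standard_normal((2000, 10))
b = A @ rng.standard_normal(10) + 0.1 * rng.standard_normal(2000)

x_exact, *_ = np.linalg.lstsq(A, b, rcond=None)
x_sketch = sketched_least_squares(A, b, sketch_rows=200)

# Compare residuals on the ORIGINAL problem.
res_exact = np.linalg.norm(A @ x_exact - b)
res_sketch = np.linalg.norm(A @ x_sketch - b)
```

The sketched solve works on a 200 x 10 matrix instead of 2000 x 10, and its residual on the original problem should be only slightly larger than the exact one.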
An introduction to matrix concentration inequalities.
Tropp, J.A.
(2015) Foundations and Trends in Machine Learning, 8 (1-2), pp. 1-230.


Random matrices now play a role in many areas of theoretical, applied, and computational mathematics. Therefore, it is desirable to have tools for studying random matrices that are flexible, easy to use, and powerful. Over the last fifteen years, researchers have developed a remarkable family of results, called matrix concentration inequalities, that achieve all of these goals. This monograph offers an invitation to the field of matrix concentration inequalities. It begins with some history of random matrix theory; it describes a flexible model for random matrices that is suitable for many problems; and it discusses the most important matrix concentration results. To demonstrate the value of these techniques, the presentation includes examples drawn from statistics, machine learning, optimization, combinatorics, algorithms, scientific computing, and beyond. © 2015 J. A. Tropp.
Dimensionality reduction for k-means clustering and low rank approximation.
Cohen, M.B., Elder, S., Musco, C. and 2 more
(2015) Proceedings of the Annual ACM Symposium on Theory of Computing, 14-17-, pp. 163-172.


We show how to approximate a data matrix A with a much smaller sketch Ã that can be used to solve a general class of constrained k-rank approximation problems to within (1 + ε) error. Importantly, this class includes k-means clustering and unconstrained low rank approximation (i.e., principal component analysis). By reducing data points to just O(k) dimensions, we generically accelerate any exact, approximate, or heuristic algorithm for these ubiquitous problems. For k-means dimensionality reduction, we provide (1 + ε) relative error results for many common sketching techniques, including random row projection, column selection, and approximate SVD. For approximate principal component analysis, we give a simple alternative to known algorithms that has applications in the streaming setting. Additionally, we extend recent work on column-based matrix reconstruction, giving column subsets that not only 'cover' a good subspace for A, but can be used directly to compute this subspace. Finally, for k-means clustering, we show how to achieve a (9 + ε) approximation by Johnson-Lindenstrauss projecting data to just O(log k/ε²) dimensions. This is the first result that leverages the specific structure of k-means to achieve dimension independent of input size and sublinear in k.
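The effect of a Johnson-Lindenstrauss projection on the k-means objective can be checked numerically. This is a minimal sketch under stated assumptions, not the paper's construction or its guarantee: it uses a dense Gaussian projection, synthetic planted clusters, and one fixed partition, and the names `kmeans_cost`, `R`, and all sizes are hypothetical.

```python
import numpy as np

def kmeans_cost(X, labels):
    """Sum of squared distances from each point to its cluster centroid."""
    cost = 0.0
    for c in np.unique(labels):
        pts = X[labels == c]
        cost += np.sum((pts - pts.mean(axis=0)) ** 2)
    return cost

rng = np.random.default_rng(0)
n, d, k, m = 300, 500, 3, 50  # points, original dim, clusters, reduced dim

# Synthetic data: k planted centers plus unit Gaussian noise.
labels = rng.integers(0, k, size=n)
centers = 5.0 * rng.standard_normal((k, d))
X = centers[labels] + rng.standard_normal((n, d))

# Dense Gaussian projection, scaled so squared norms are preserved in expectation.
R = rng.standard_normal((d, m)) / np.sqrt(m)
Y = X @ R

# Cost of the SAME partition, before and after projection.
ratio = kmeans_cost(Y, labels) / kmeans_cost(X, labels)
```

Because the projection is linear, each centroid of the projected points is the projection of the original centroid, so the two costs are directly comparable. A `ratio` close to 1 means the partition's cost is roughly preserved in 50 dimensions.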
Randomized dimensionality reduction for k-means clustering.
Boutsidis, C., Zouzias, A., Mahoney, M.W. and 1 more (2015) IEEE Transactions on Information Theory, 61 (2), pp. 1045-1062.


We study the topic of dimensionality reduction for k-means clustering. Dimensionality reduction encompasses the union of two approaches: 1) feature selection and 2) feature extraction. A feature selection-based algorithm for k-means clustering selects a small subset of the input features and then applies k-means clustering on the selected features. A feature extraction-based algorithm for k-means clustering constructs a small set of new artificial features and then applies k-means clustering on the constructed features. Despite the significance of k-means clustering as well as the wealth of heuristic methods addressing it, provably accurate feature selection methods for k-means clustering are not known. On the other hand, two provably accurate feature extraction methods for k-means clustering are known in the literature; one is based on random projections and the other is based on the singular value decomposition (SVD). This paper makes further progress toward a better understanding of dimensionality reduction for k-means clustering. Namely, we present the first provably accurate feature selection method for k-means clustering and, in addition, we present two feature extraction methods. The first feature extraction method is based on random projections and it improves upon the existing results in terms of time complexity and number of features needed to be extracted. The second feature extraction method is based on fast approximate SVD factorizations and it also improves upon the existing results in terms of time complexity. The proposed algorithms are randomized and provide constant-factor approximation guarantees with respect to the optimal k-means objective value. © 1963-2012 IEEE.
Sparser Johnson-Lindenstrauss transforms.
Kane, D.M., Nelson, J.
(2014) Journal of the ACM, 61 (1).


We give two different and simple constructions for dimensionality reduction in ℓ2 via linear mappings that are sparse: only an O(ε)-fraction of entries in each column of our embedding matrices are non-zero to achieve distortion 1+ε with high probability, while still achieving the asymptotically optimal number of rows. These are the first constructions to provide subconstant sparsity for all values of parameters, improving upon previous works of Achlioptas [2003] and Dasgupta et al. [2010]. Such distributions can be used to speed up applications where ℓ2 dimensionality reduction is used. © 2014 ACM.
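A sparse embedding matrix of the kind described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' exact construction: it places one random ±1/√s entry in each of s equal row blocks per column and uses fully random choices rather than limited-independence hash functions. The function name `sparse_jl_matrix` and the sizes are hypothetical.

```python
import numpy as np

def sparse_jl_matrix(m, d, s, seed=0):
    """Build an m x d matrix with exactly s non-zeros per column.

    Assumption: rows are split into s blocks of size m // s, and each
    column gets one entry of value +-1/sqrt(s) at a random row per block.
    """
    if m % s != 0:
        raise ValueError("m must be divisible by s")
    rng = np.random.default_rng(seed)
    block = m // s
    S = np.zeros((m, d))
    for j in range(d):
        # One uniformly random row inside each of the s blocks.
        rows = np.arange(s) * block + rng.integers(0, block, size=s)
        signs = rng.choice([-1.0, 1.0], size=s)
        S[rows, j] = signs / np.sqrt(s)
    return S

# Hypothetical sizes: embed 1000 dimensions into 256 with 8 non-zeros per column.
S = sparse_jl_matrix(m=256, d=1000, s=8)
x = np.random.default_rng(1).standard_normal(1000)

# Ratio of squared norms after and before embedding; close to 1 is good.
norm_ratio = np.linalg.norm(S @ x) ** 2 / np.linalg.norm(x) ** 2
```

Each column has unit Euclidean norm by construction. The matrix is stored densely here for clarity; a sparse format would be needed to realize the speed-up in matrix-vector products.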