This article addresses the problem of multichannel audio source separation. We propose a framework where deep neural networks (DNNs) are used to model the source spectra and combined with the classical multichannel Gaussian model to exploit the spatial information. The parameters are estimated in an iterative expectation-maximization (EM) fashion and used to derive a multichannel Wiener filter. We present an extensive experimental study to show the impact of different design choices on the performance of the proposed technique. We consider different cost functions for the training of DNNs: the probabilistically motivated Itakura-Saito divergence, as well as Kullback-Leibler, Cauchy, mean squared error, and phase-sensitive cost functions. We also study the number of EM iterations and the use of multiple DNNs, where each DNN aims to improve the spectra estimated by the preceding EM iteration. Finally, we present an application of the framework to a speech enhancement problem. The experimental results show the benefit of the proposed multichannel approach over a single-channel DNN-based approach and the conventional multichannel nonnegative matrix factorization-based iterative EM algorithm.
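As an illustration of the multichannel Wiener filter mentioned above, the following is a minimal sketch for a single time-frequency bin. It assumes the usual form of the multichannel Gaussian model, in which source j has a scalar variance v_j and a spatial covariance matrix R_j, so that the filter is W_j = v_j R_j (sum_k v_k R_k)^{-1}. The function name, argument layout, and array shapes are illustrative assumptions, not taken from the article.

```python
import numpy as np

def multichannel_wiener_filter(v, R, x):
    """Estimate the spatial image of each source in one time-frequency bin.

    v : (J,) nonnegative source variances (e.g. spectra estimated by a DNN)
    R : (J, I, I) Hermitian positive definite spatial covariance matrices
    x : (I,) complex mixture STFT coefficients over I channels
    Returns an array of shape (J, I) with one estimated image per source.
    """
    # Covariance of each source image under the assumed model: v_j * R_j
    source_cov = v[:, None, None] * R
    # Mixture covariance is the sum over sources
    mix_cov = source_cov.sum(axis=0)
    # W_j = v_j R_j (sum_k v_k R_k)^{-1}, computed for all sources at once
    W = source_cov @ np.linalg.inv(mix_cov)
    # Apply each filter to the mixture vector
    return W @ x
```

Because the filters sum to the identity matrix, the estimated source images add up to the mixture, which gives a quick sanity check on any implementation.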
Non-negative matrix factorization (NMF) has found numerous applications, due to its ability to provide interpretable decompositions. Perhaps surprisingly, existing results regarding its uniqueness properties are rather limited, and there is much room for improvement in terms of algorithms as well. Uniqueness aspects of NMF are revisited here from a geometrical point of view. Both symmetric and asymmetric NMF are considered, the former being tantamount to element-wise non-negative square-root factorization of positive semidefinite matrices. New uniqueness results are derived, e.g., it is shown that a sufficient condition for uniqueness is that the conic hull of the latent factors is a superset of a particular second-order cone. Checking this condition is shown to be NP-complete; yet this and other results offer insights on the role of latent sparsity in this context. On the computational side, a new algorithm for symmetric NMF is proposed, which is very different from existing ones. It alternates between Procrustes rotation and projection onto the non-negative orthant to find a non-negative matrix close to the span of the dominant subspace. Simulation results show promising performance with respect to the state of the art. Finally, the new algorithm is applied to a clustering problem for co-authorship data, yielding meaningful and interpretable results.
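The alternation between Procrustes rotation and projection described above can be sketched as follows. This is a reading of the abstract under stated assumptions, not necessarily the authors' exact algorithm: the square-root factor B is taken from the top-k eigenpairs of X, the rotation Q is the orthogonal Procrustes solution obtained from an SVD, and the function name, the identity initialization, and the iteration count are illustrative choices.

```python
import numpy as np

def symmetric_nmf_procrustes(X, k, n_iter=200):
    """Sketch of symmetric NMF, X ~ W W^T with W >= 0.

    Alternates between projecting B Q onto the nonnegative orthant
    and solving an orthogonal Procrustes problem for the rotation Q.
    Returns W and the history of the fitting error ||B Q - W||_F.
    """
    # Square-root factor from the dominant eigenpairs: X ~ B B^T
    eigvals, eigvecs = np.linalg.eigh(X)
    top = np.argsort(eigvals)[::-1][:k]
    B = eigvecs[:, top] * np.sqrt(np.clip(eigvals[top], 0.0, None))

    Q = np.eye(k)  # assumed initialization
    history = []
    for _ in range(n_iter):
        # Projection step: nearest nonnegative matrix to B Q
        W = np.maximum(B @ Q, 0.0)
        history.append(np.linalg.norm(B @ Q - W))
        # Procrustes step: orthogonal Q minimizing ||B Q - W||_F
        U, _, Vt = np.linalg.svd(B.T @ W)
        Q = U @ Vt
    W = np.maximum(B @ Q, 0.0)
    return W, history
```

Each step exactly minimizes the same objective over one variable, so the recorded error cannot increase between iterations. Convergence to an exact factorization is not guaranteed by this sketch.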
Nonnegative Matrix Factorization (NMF) has been one of the most widely used clustering techniques for exploratory data analysis. However, since each data point enters the objective function with squared residual error, a few outliers with large errors easily dominate the objective function. In this article, we propose a Robust Manifold Nonnegative Matrix Factorization (RMNMF) method using ℓ2,1-norm and integrating NMF and spectral clustering under the same clustering framework. We also point out the solution uniqueness issue for the existing NMF methods and propose an additional orthonormal constraint to address this problem. With the new constraint, the conventional auxiliary function approach no longer works. We tackle this difficult optimization problem via a novel Augmented Lagrangian Method (ALM)-based algorithm and convert the original constrained optimization problem on one variable into a multivariate constrained problem. The new objective function can then be decomposed into several subproblems, each of which has a closed-form solution. More importantly, we reveal the connection of our method with robust K-means and spectral clustering, and we demonstrate its theoretical significance. Extensive experiments have been conducted on nine benchmark datasets, and all empirical results show the effectiveness of our method.
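The effect of outliers on the two loss functions can be illustrated with a small sketch. It assumes that data points are stored as columns of the residual matrix E, so that the ℓ2,1-norm is the sum of the column-wise ℓ2 norms, while the conventional loss is the squared Frobenius norm. The function names and the toy residual matrix are illustrative assumptions.

```python
import numpy as np

def l21_norm(E):
    """Sum of the l2 norms of the columns of E (one column per data point)."""
    return np.sqrt((E ** 2).sum(axis=0)).sum()

def squared_frobenius(E):
    """Sum of squared entries of E, the conventional NMF loss."""
    return (E ** 2).sum()

# Toy residual: five data points, the first one is an outlier
E = np.ones((4, 5))
E[:, 0] *= 10.0

l21 = l21_norm(E)            # 20 + 4 * 2 = 28
fro = squared_frobenius(E)   # 400 + 4 * 4 = 416

# Share of each loss contributed by the outlier column
outlier_share_l21 = np.linalg.norm(E[:, 0]) / l21   # 20 / 28
outlier_share_fro = (E[:, 0] ** 2).sum() / fro      # 400 / 416
```

In this toy case the outlier accounts for about 71% of the ℓ2,1 loss but about 96% of the squared loss, which is the sensitivity that the ℓ2,1-norm is meant to reduce.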
In this paper, we study the nonnegative matrix factorization problem under the separability assumption (that is, there exists a cone spanned by a small subset of the columns of the input nonnegative data matrix containing all columns), which is equivalent to the hyperspectral unmixing problem under the linear mixing model and the pure-pixel assumption. We present a family of fast recursive algorithms and prove they are robust under any small perturbations of the input data matrix. This family generalizes several existing hyperspectral unmixing algorithms and hence provides for the first time a theoretical justification of their better practical performance.
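A recursive scheme of this kind can be sketched with the successive projection algorithm, which repeatedly picks the column of largest ℓ2 norm and projects the data onto the orthogonal complement of that column. Whether this exact variant belongs to the family studied in the paper is an assumption here; the function name is illustrative, and the sketch further assumes that the columns have been scaled so that each one is a convex combination of the pure columns.

```python
import numpy as np

def successive_projection(M, r):
    """Select r column indices of M that act as the 'pure' columns.

    At each step the column with the largest squared l2 norm is chosen,
    then all columns are projected onto the orthogonal complement of it.
    """
    R = np.array(M, dtype=float)  # residual matrix, copied from the input
    selected = []
    for _ in range(r):
        norms = np.sum(R ** 2, axis=0)
        j = int(np.argmax(norms))
        selected.append(j)
        # Unit vector along the chosen residual column
        u = R[:, j] / np.sqrt(norms[j])
        # Remove the component along u from every column
        R = R - np.outer(u, u @ R)
    return selected
```

On noiseless separable data with a full-rank basis, the selected indices coincide with the pure columns, because a strictly convex function over a convex hull is maximized at a vertex.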