[MEANS, SIGMAS, WEIGHTS] = VL_GMM(X, NUMCLUSTERS) fits a GMM with NUMCLUSTERS components to the data X. Each column of X represents a sample point. X may be either SINGLE or DOUBLE. MEANS, SIGMAS, and WEIGHTS are respectively the means, the diagonal covariances, and the prior probabilities of the Gaussian modes. MEANS and SIGMAS have the same number of rows as X and NUMCLUSTERS columns, with one column per mode. WEIGHTS is a row vector with NUMCLUSTERS entries summing to one.
[MEANS, SIGMAS, WEIGHTS, LL] = VL_GMM(...) returns the loglikelihood (LL) of the model as well.
[MEANS, SIGMAS, WEIGHTS, LL, POSTERIORS] = VL_GMM(...) also returns the posterior probabilities of the Gaussian modes given each data point. The POSTERIORS matrix has NUMCLUSTERS rows and NUMDATA columns, where NUMDATA is the number of columns of X.
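To make the meaning of these outputs concrete, the following sketch evaluates a diagonal-covariance GMM by hand in plain MATLAB (no VLFeat call). The parameter values and the data are made up for illustration, and SIGMAS is treated as holding variances, as "diagonal covariances" suggests. The sketch sums the per-point log-likelihoods; whether VL_GMM reports LL with exactly this normalization is not stated here, so treat `ll` as illustrative only.

```matlab
% Hypothetical model parameters, laid out as VL_GMM returns them.
means   = [0 3 ; 0 3] ;       % D-by-K, one column per mode
sigmas  = [1 0.5 ; 1 0.5] ;   % D-by-K, diagonal variances per mode
weights = [0.4 0.6] ;         % 1-by-K, sums to one
X       = [0 3 1.5 ; 0 3 1.5] ; % D-by-N, one sample per column

[D, N] = size(X) ;
K = numel(weights) ;

% logp(k,i) = log( weights(k) * N(X(:,i) ; means(:,k), diag(sigmas(:,k))) )
logp = zeros(K, N) ;
for k = 1:K
  delta = bsxfun(@minus, X, means(:,k)) ;
  maha  = sum(bsxfun(@rdivide, delta.^2, sigmas(:,k)), 1) ;
  logp(k,:) = log(weights(k)) ...
            - 0.5 * (D * log(2*pi) + sum(log(sigmas(:,k))) + maha) ;
end

% Log-sum-exp over the modes gives log p(x_i) without underflow.
m     = max(logp, [], 1) ;
logpx = m + log(sum(exp(bsxfun(@minus, logp, m)), 1)) ;

posteriors = exp(bsxfun(@minus, logp, logpx)) ; % K-by-N, columns sum to one
ll         = sum(logpx) ;                       % total log-likelihood of X
```

Each column of `posteriors` is a probability distribution over the modes, which matches the NUMCLUSTERS-by-NUMDATA layout described above.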
VL_GMM() supports different initialization and optimization methods. Specifically, the following options are supported:
Increase the verbosity level (may be specified multiple times).
RAND initializes the means as random data points and the covariance matrices as the covariance of X. CUSTOM allows specifying the initial means, covariances, and prior probabilities.
Specify the initial means (size(X,1)-by-NUMCLUSTERS matrix).
Specify the initial weights (a vector of dimension NUMCLUSTERS).
Specify the initial diagonal covariance matrices.
Choose whether to use SERIAL or PARALLEL (multi-core) computations.
Number of times to restart EM. The solution with maximum loglikelihood is returned.
Set the lower bound on the covariance diagonal entries. This is particularly important if the data contains singleton dimensions, and generally useful for stability.
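As a rough illustration of the CUSTOM initialization, the sketch below builds initial means, covariances, and weights by hand and passes them to VL_GMM. The option names 'Initialization', 'InitMeans', 'InitCovariances', and 'InitPriors' do not appear in this text; they are taken from the VLFeat documentation as recalled and should be checked against the installed version. The uniform initial weights are an assumption made for this example, not something the RAND description specifies. VLFeat must be on the MATLAB path.

```matlab
% Assumes VLFeat is installed and vl_setup has been run.
numClusters = 3 ;
X = [randn(2,200), randn(2,200) + 4, randn(2,200) - 4] ; % 2-by-600 samples

% Hand-made initialization in the spirit of RAND:
% random data points as means, variance of X as diagonal covariance.
perm            = randperm(size(X,2)) ;
initMeans       = X(:, perm(1:numClusters)) ;               % size(X,1)-by-NUMCLUSTERS
initCovariances = repmat(var(X, 0, 2), 1, numClusters) ;    % size(X,1)-by-NUMCLUSTERS
initPriors      = ones(1, numClusters) / numClusters ;      % assumed uniform weights

% Option names below are assumptions; verify them with "help vl_gmm".
[means, sigmas, weights] = vl_gmm(X, numClusters, ...
    'Initialization',  'Custom', ...
    'InitMeans',       initMeans, ...
    'InitCovariances', initCovariances, ...
    'InitPriors',      initPriors) ;
```

A common alternative is to seed the means from a clustering of X instead of random points; the call to VL_GMM stays the same.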
VL_GMM(X, 10, 'verbose', 'multithreading', 'parallel', 'MaxNumIterations', 20) estimates a mixture of 10 Gaussians using at most 20 iterations.
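The call above can be exercised end to end as in the sketch below, which uses synthetic data chosen for illustration and assumes VLFeat is on the MATLAB path. It checks the documented output sizes and turns the posteriors into hard assignments.

```matlab
% Assumes VLFeat is installed and vl_setup has been run.
numClusters = 10 ;
X = rand(2, 1000) ;   % synthetic data: 1000 two-dimensional samples

[means, sigmas, weights, ll, posteriors] = vl_gmm(X, numClusters, ...
    'verbose', 'multithreading', 'parallel', 'MaxNumIterations', 20) ;

% Output sizes as documented above.
assert(isequal(size(means),      [size(X,1) numClusters])) ;
assert(isequal(size(sigmas),     [size(X,1) numClusters])) ;
assert(numel(weights) == numClusters) ;
assert(isequal(size(posteriors), [numClusters size(X,2)])) ;

% Hard assignment: the most probable mode for each data point.
[~, assignments] = max(posteriors, [], 1) ;

% Number of points assigned to each mode.
counts = accumarray(assignments(:), 1, [numClusters 1]) ;
```

Because the initialization is random by default, `means` and `counts` will differ from run to run unless the random seed is fixed.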