Tutorials - Encodings

The following sections describe how to use the Fisher and VLAD encodings.

These encoding methods quantize a set of vectors with respect to a vocabulary model, obtained for example by Gaussian mixture estimation or KMeans clustering.

Fisher encoding

The Fisher encoding is used together with an estimated Gaussian mixture model (GMM). From the obtained means, covariances (sigmas) and prior weights one can compute a Fisher vector. First we initialize random data and train the GMM model (please see the GMM tutorial page for details on GMM usage).

N         = 5000 ;
dimension = 2 ;
dataLearn = rand(dimension,N) ;

numClusters = 30 ;
[means, sigmas, weights] = vl_gmm(dataLearn, numClusters);
Next we initialize another random set of vectors, which will be encoded with respect to the model we have just estimated.
Nencode = 1000;
dataEncode = rand(dimension,Nencode);
The Fisher encoding enc of this new set is easily obtained by plugging the vl_gmm outputs into the vl_fisher function:
enc = vl_fisher(dataEncode, means, sigmas, weights);
The vector enc is the final Fisher vector.
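As a quick sanity check, the Fisher vector stacks the mean and covariance gradients of every mixture component, so its length is 2 * dimension * numClusters. The snippet below verifies this; it also shows the 'Improved' option (signed square root followed by L2 normalization), assuming your VLFeat version provides it:

```matlab
% Fisher vector length: two gradient blocks (means and covariances)
% of size "dimension" for each of the numClusters components.
assert(numel(enc) == 2 * dimension * numClusters);

% The "improved" Fisher vector applies the signed square root and
% L2 normalization, which usually helps in classification tasks.
encImproved = vl_fisher(dataEncode, means, sigmas, weights, 'Improved');
```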

VLAD encoding

The Vector of Locally Aggregated Descriptors (VLAD) encodes the features in a slightly different way. Similar to the Fisher encoding, the VLAD encoding relies on a clustering technique; in the case of VLAD, this is KMeans clustering.

KMeans + VLAD

Let's first, as we did in the Fisher section, make a random dataset dataLearn and cluster it using KMeans (see the KMeans tutorial). We also create a second dataset to encode.

N         = 5000 ;
dimension = 2 ;
dataLearn = rand(dimension,N) ;

numClusters = 30 ;
centers = vl_kmeans(dataLearn, numClusters);

Nencode = 1000;
dataEncode = rand(dimension,Nencode);

The vl_vlad function accepts the cluster centers, the data we want to encode, and the assignments (which are "hard" in the case of KMeans) of each vector to a cluster. The assignments can be obtained with the vl_kdtreequery function, which quickly finds the nearest cluster center (stored in centers) for each dataEncode vector. Note that before querying the KD-tree, it must be built using the vl_kdtreebuild function.

kd_tree = vl_kdtreebuild(centers) ;
assign = vl_kdtreequery(kd_tree, centers, dataEncode) ;

The assign variable now holds, for each dataEncode vector, the index of its nearest center. The next step is converting assign into the assignment matrix format accepted by vl_vlad.

assignments = zeros(numClusters,Nencode);
assignments(sub2ind(size(assignments),double(assign),1:length(assign))) = 1;

We can now compute the final VLAD vector enc.

enc = vl_vlad(dataEncode,centers,assignments);
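The VLAD vector stacks one residual vector per cluster, so enc has dimension * numClusters elements. The snippet below checks this and shows two common post-processing options of vl_vlad, component-wise L2 normalization and the signed square root, assuming your VLFeat version supports these option names:

```matlab
% VLAD length: one "dimension"-sized residual block per cluster.
assert(numel(enc) == dimension * numClusters);

% VLAD is often post-processed before use; vl_vlad can normalize
% each per-cluster block and apply the signed square root:
encNorm = vl_vlad(dataEncode, centers, assignments, ...
                  'NormalizeComponents', 'SquareRoot');
```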