The following sections describe how to use the Fisher and VLAD encodings.
These encoding methods are generally used to quantize a set of vectors with respect to a vocabulary model, obtained for example by Gaussian mixture estimation or KMeans clustering.
The Fisher encoding is used with an estimated Gaussian mixture model. Using the obtained means, covariances, and prior probabilities, one can compute a Fisher vector. First we initialize random data and train the GMM (please see the GMM tutorial page for details on its usage).
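A minimal sketch (the data here is random and two dimensional; numFeatures and numClusters are illustrative choices, not prescribed values):

    numFeatures = 5000 ;
    dimension = 2 ;
    dataLearn = rand(dimension, numFeatures) ;

    numClusters = 30 ;
    % estimate the means, covariances and prior probabilities of the modes
    [means, covariances, priors] = vl_gmm(dataLearn, numClusters) ;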
Next we create another random set of vectors dataEncode. The Fisher vector enc of this new set can be obtained by plugging the vl_gmm outputs into the vl_fisher function:
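A minimal sketch, reusing the GMM parameters estimated above (the name dataEncode and its size are illustrative):

    numDataToBeEncoded = 1000 ;
    dataEncode = rand(dimension, numDataToBeEncoded) ;

    % compute the Fisher vector of dataEncode with respect to the GMM
    enc = vl_fisher(dataEncode, means, covariances, priors) ;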
enc is our final Fisher vector.
The Vector of Locally Aggregated Descriptors (VLAD) encodes the features in a slightly different way. Similar to the Fisher encoding, the VLAD encoding is paired with a clustering technique; in the case of VLAD, this technique is KMeans clustering.
As we did in the Fisher section, let us first make a random dataset dataLearn and cluster it using KMeans (see the KMeans tutorial). We also need a dataset dataEncode that we want to encode, so we create one as well.
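A minimal sketch, with the sizes chosen for illustration (dimension is reused from the Fisher example above):

    numDataLearn = 5000 ;
    dataLearn = rand(dimension, numDataLearn) ;

    numClusters = 30 ;
    % cluster dataLearn to obtain the vocabulary of centers
    centers = vl_kmeans(dataLearn, numClusters) ;

    numDataToBeEncoded = 1000 ;
    dataEncode = rand(dimension, numDataToBeEncoded) ;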
The vl_vlad function accepts the cluster centers, the data we want to encode, and the assignments (which are "hard" in the case of KMeans) of each vector to a cluster. The assignments can be obtained with the vl_kdtreequery function, which quickly finds the nearest cluster center (stored in centers) for each dataEncode vector.
Note that before running queries against the KD-tree, it must be built using the vl_kdtreebuild function.
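For example (the variable names kdtree and assign are illustrative):

    % build a KD-tree of the cluster centers ...
    kdtree = vl_kdtreebuild(centers) ;
    % ... and query it for the index of the nearest center of each vector
    assign = vl_kdtreequery(kdtree, centers, dataEncode) ;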
Now the assign variable holds the index of the nearest center for each dataEncode vector. The next step is to convert assign into the assignment-matrix format accepted by vl_vlad.
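vl_vlad expects the hard assignments as a numClusters by numDataToBeEncoded matrix with a single one in each column; a minimal sketch of this conversion (the double cast guards against the integer indices returned by the query):

    % assignments(i,j) is 1 if center i is the nearest center of vector j
    assignments = zeros(numClusters, numDataToBeEncoded) ;
    assignments(sub2ind(size(assignments), double(assign), 1:numDataToBeEncoded)) = 1 ;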
After this, we are ready to compute the final VLAD vector enc.
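Continuing the sketch above:

    % aggregate dataEncode with respect to the centers and assignments
    enc = vl_vlad(dataEncode, centers, assignments) ;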