pegasos.h File Reference
PEGASOS SVM.
#include "generic.h"
Functions
void vl_pegasos_train_binary_svm_d (double *model, double const *data, vl_size dimension, vl_size numSamples, vl_int8 const *labels, double regularizer, double biasMultiplier, vl_uindex startingIteration, vl_size numIterations, VlRand *randomGenerator, vl_uint32 const *permutation, vl_size permutationSize, double const *preconditioner)
void vl_pegasos_train_binary_svm_f (float *model, float const *data, vl_size dimension, vl_size numSamples, vl_int8 const *labels, double regularizer, double biasMultiplier, vl_uindex startingIteration, vl_size numIterations, VlRand *randomGenerator, vl_uint32 const *permutation, vl_size permutationSize, float const *preconditioner)
Detailed Description
pegasos.h provides a basic implementation of the PEGASOS [1] linear SVM solver.
Overview
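PEGASOS solves the linear SVM learning problem
\[ \min_{w} \frac{\lambda}{2}\|w\|^2 + \frac{1}{m}\sum_{i=1}^{m} \ell(w;(x_i,y_i)) \]
where $x_i$ are data vectors in $\mathbb{R}^d$, $y_i \in \{-1,1\}$ are binary labels, $\lambda > 0$ is the regularization parameter, and
\[ \ell(w;(x_i,y_i)) = \max\{0, 1 - y_i\langle w, x_i\rangle\} \]
is the hinge loss. The result of the optimization is a model $w \in \mathbb{R}^d$ that yields the decision function
\[ F(x) = \operatorname{sign}\langle w, x\rangle. \]
It is well known that the hinge loss is a convex upper bound of the 0-1 loss of the decision function:
\[ \ell(w;(x,y)) \geq \frac{1}{2}(1 - yF(x)). \]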
PEGASOS is accessed by calling vl_pegasos_train_binary_svm_d or vl_pegasos_train_binary_svm_f, operating respectively on double or float data.
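As a rough illustration of the learning problem, the following C sketch evaluates the hinge loss and the regularized SVM objective for a given model. The helper names (hinge_loss_sketch, svm_objective_sketch) are hypothetical and not part of pegasos.h, and the layout in which sample i starts at data + i * dimension is an assumption made for this example.

```c
#include <stddef.h>

/* Hinge loss max{0, 1 - y <w,x>} for one training pair.
   Hypothetical helper, not part of the VLFeat API. */
double hinge_loss_sketch(double const *w, double const *x, int y,
                         size_t dimension)
{
  double score = 0.0;
  double margin;
  size_t i;
  for (i = 0; i < dimension; ++i) score += w[i] * x[i];
  margin = 1.0 - y * score;
  return (margin > 0.0) ? margin : 0.0;
}

/* Objective lambda/2 ||w||^2 + (1/m) sum_i hinge loss.
   Assumes sample i is stored at data + i * dimension and that
   numSamples is larger than zero. */
double svm_objective_sketch(double const *w, double const *data,
                            signed char const *labels,
                            size_t dimension, size_t numSamples,
                            double lambda)
{
  double norm2 = 0.0, loss = 0.0;
  size_t i;
  for (i = 0; i < dimension; ++i) norm2 += w[i] * w[i];
  for (i = 0; i < numSamples; ++i)
    loss += hinge_loss_sketch(w, data + i * dimension, labels[i], dimension);
  return 0.5 * lambda * norm2 + loss / (double)numSamples;
}
```

Evaluating such an objective every few thousand iterations is one simple way to monitor convergence.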
Algorithm
PEGASOS is a stochastic subgradient optimizer. At the t-th iteration the algorithm:
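- Samples uniformly at random a subset $A_t$ of $k$ training pairs $(x,y)$ from the $m$ pairs provided for training (this subset is called a mini-batch).
- Computes a subgradient $\nabla_t$ of the function $E_t(w) = \frac{\lambda}{2}\|w\|^2 + \frac{1}{k}\sum_{(x,y)\in A_t} \ell(w;(x,y))$ (this is the SVM objective function restricted to the mini-batch).
- Computes an intermediate weight vector $w_{t+1/2}$ by taking a step $w_{t+1/2} = w_t - \alpha_t \nabla_t$ with learning rate $\alpha_t = 1/(\lambda t)$ along the subgradient. Note that the learning rate is inversely proportional to the iteration number.
- Back-projects the weight vector $w_{t+1/2}$ onto the ball of radius $1/\sqrt{\lambda}$ to obtain the next model estimate $w_{t+1}$:
  \[ w_{t+1} = \min\left\{1, \frac{1/\sqrt{\lambda}}{\|w_{t+1/2}\|}\right\} w_{t+1/2}. \]
  This ball is guaranteed to contain the optimal weight vector [1].
The VLFeat implementation fixes the size of the mini-batches to $k = 1$.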
Bias
The PEGASOS SVM formulation does not incorporate a bias. To learn an SVM with bias, each data vector $x$ can be extended by a constant component $B$ (called biasMultiplier in the code). In this case, the model $w$ has dimension $d + 1$ and the SVM discriminant function is given by $F(x) = \operatorname{sign}(\langle w_{1:d}, x\rangle + w_{d+1} B)$. If the bias multiplier $B$ is large enough, the weight $w_{d+1}$ remains small and contributes little to the SVM regularization term $\|w\|^2$, better approximating the case of an SVM with bias. Unfortunately, setting the bias multiplier $B$ to a large value makes the optimization harder.
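The effect of the bias multiplier on the decision function can be illustrated with a short C sketch. The helpers decision_value_sketch and classify_sketch are hypothetical names introduced for this example. The sketch assumes the bias weight is stored as the last element of the model, and mapping a zero score to +1 is an arbitrary choice.

```c
#include <stddef.h>

/* Decision value <w_{1:d}, x> + w_{d+1} B. When biasMultiplier is larger
   than zero the model is assumed to hold dimension + 1 entries, with the
   bias weight stored last. Hypothetical helper, not part of VLFeat. */
double decision_value_sketch(double const *model, double const *x,
                             size_t dimension, double biasMultiplier)
{
  double score = 0.0;
  size_t i;
  for (i = 0; i < dimension; ++i) score += model[i] * x[i];
  if (biasMultiplier > 0.0) score += model[dimension] * biasMultiplier;
  return score;
}

/* Predicted label in {-1, +1}; a zero score is mapped to +1 here. */
int classify_sketch(double const *model, double const *x,
                    size_t dimension, double biasMultiplier)
{
  return (decision_value_sketch(model, x, dimension, biasMultiplier) >= 0.0)
           ? 1 : -1;
}
```

The effective bias is the product of the last model entry and B, which is why a larger B lets that entry stay small.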
Restarting
The VLFeat PEGASOS implementation can be restarted after any given number of iterations. This is useful to compute intermediate statistics or to load new data from disk for large datasets. The state of the algorithm, which is required for restarting, is limited to the current estimate of the SVM weight vector $w$ and the iteration number $t$.
Permutation
VLFeat PEGASOS can use a user-defined permutation to decide the order in which data points are visited (instead of using random sampling). With a bijective permutation, the algorithm is guaranteed to visit each data point exactly once in each loop. The permutation need not be bijective, however: repeating or omitting indexes makes the algorithm visit certain data samples more or less often than others, implicitly reweighting their relative importance in the SVM objective function. This can be used to balance the data.
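As an illustration of a non-bijective visiting order, the C sketch below builds an index list in which every positive sample is repeated several times. The function build_balancing_permutation_sketch and its positiveRepeats parameter are hypothetical, and uint32_t and signed char are used as stand-ins for vl_uint32 and vl_int8.

```c
#include <stddef.h>
#include <stdint.h>

/* Writes an index list in which every sample with a positive label
   appears positiveRepeats times and every other sample appears once.
   The caller must provide room for numSamples * positiveRepeats entries
   (positiveRepeats >= 1). Returns the number of entries written.
   Hypothetical helper, not part of the VLFeat API. */
size_t build_balancing_permutation_sketch(uint32_t *permutation,
                                          signed char const *labels,
                                          size_t numSamples,
                                          size_t positiveRepeats)
{
  size_t i, r, n = 0;
  for (i = 0; i < numSamples; ++i) {
    size_t count = (labels[i] > 0) ? positiveRepeats : 1;
    for (r = 0; r < count; ++r) permutation[n++] = (uint32_t)i;
  }
  return n;
}
```

The returned length would be passed as permutationSize. The list is left in index order here; shuffling it before use is probably preferable so that repeated samples are not visited back to back.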
Non-linear kernels
PEGASOS can be extended to non-linear kernels, but the algorithm is not particularly efficient in this setting [1]. When possible, it may be preferable to work with explicit feature maps.
Let $k(x,y)$ be a positive definite kernel. A feature map is a function $\Psi(x)$ such that $k(x,y) = \langle \Psi(x), \Psi(y) \rangle$. Using this representation, the non-linear SVM learning objective function reads:
\[ \min_{w} \frac{\lambda}{2}\|w\|^2 + \frac{1}{m}\sum_{i=1}^{m} \ell(w;(\Psi(x_i),y_i)). \]
Thus the only difference with the linear case is that the feature $\Psi(x)$ is used in place of the data $x$.
$\Psi(x)$ can be learned off-line, for instance by using the incomplete Cholesky decomposition $V^\top V$ of the Gram matrix $K = [k(x_i,x_j)]$ (in this case $\Psi(x_i)$ is the i-th column of V). Alternatively, for additive kernels (e.g. intersection, Chi2) the explicit feature map computed by homkermap.h can be used.
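To make the notion of an explicit feature map concrete, the C sketch below uses the homogeneous polynomial kernel of degree two on two-dimensional inputs, $k(x,y) = \langle x, y\rangle^2$, whose exact feature map is $\Psi(x) = (x_1^2, \sqrt{2}\,x_1 x_2, x_2^2)$. This kernel is chosen only for illustration and is not one of those handled by homkermap.h; both function names are hypothetical.

```c
#include <math.h>

/* Kernel k(x, y) = <x, y>^2 for two-dimensional inputs.
   Illustrative helper, not part of the VLFeat API. */
double poly2_kernel_sketch(double const *x, double const *y)
{
  double s = x[0] * y[0] + x[1] * y[1];
  return s * s;
}

/* Explicit feature map Psi(x) = (x1^2, sqrt(2) x1 x2, x2^2), written to
   the three-element array psi, so that <Psi(x), Psi(y)> = k(x, y). */
void poly2_feature_map_sketch(double const *x, double *psi)
{
  psi[0] = x[0] * x[0];
  psi[1] = sqrt(2.0) * x[0] * x[1];
  psi[2] = x[1] * x[1];
}
```

Training a linear SVM on the three-dimensional vectors $\Psi(x_i)$ is then equivalent to training a kernel SVM with this kernel on the original data.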
References
[1] S. Shalev-Shwartz, Y. Singer, and N. Srebro. Pegasos: Primal estimated sub-GrAdient SOlver for SVM. In Proc. ICML, 2007.
Function Documentation
void vl_pegasos_train_binary_svm_d (
    double * model,
    double const * data,
    vl_size dimension,
    vl_size numSamples,
    vl_int8 const * labels,
    double regularizer,
    double biasMultiplier,
    vl_uindex startingIteration,
    vl_size numIterations,
    VlRand * randomGenerator,
    vl_uint32 const * permutation,
    vl_size permutationSize,
    double const * preconditioner
)
Parameters:
    model               (out) the learned model.
    data                training vectors.
    dimension           data dimension.
    numSamples          number of training data vectors.
    labels              labels of the training vectors.
    regularizer         value of the regularizer coefficient $\lambda$.
    biasMultiplier      value of the bias multiplier $B$.
    startingIteration   number of the first iteration.
    numIterations       number of iterations to perform.
    randomGenerator     random number generator.
    permutation         order in which the data is accessed.
    permutationSize     length of permutation.
    preconditioner      diagonal preconditioner.
The function runs PEGASOS on the specified data. The vector model must have either dimension equal to dimension if biasMultiplier is zero, or dimension + 1 if biasMultiplier is larger than zero.
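The sizing rule for model can be written as a small C helper. This is a sketch based only on the sentence above; model_length_sketch and alloc_model_sketch are hypothetical names, and starting from an all-zero model is an assumption of this example rather than a requirement.

```c
#include <stddef.h>
#include <stdlib.h>

/* Number of entries the model vector needs: dimension, plus one when a
   bias multiplier larger than zero is used.
   Hypothetical helper, not part of the VLFeat API. */
size_t model_length_sketch(size_t dimension, double biasMultiplier)
{
  return dimension + ((biasMultiplier > 0.0) ? 1 : 0);
}

/* Allocates a zero-initialized model of the right length. Returns NULL if
   the allocation fails; the caller releases the memory with free(). */
double *alloc_model_sketch(size_t dimension, double biasMultiplier)
{
  return calloc(model_length_sketch(dimension, biasMultiplier),
                sizeof(double));
}
```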
The function runs PEGASOS for iterations t in the interval [firstIteration, lastIteration], where firstIteration is given by startingIteration and the interval spans numIterations iterations. Together with the fact that the initial model can be set arbitrarily, this enables restarting PEGASOS from any point.
PEGASOS selects the next point for computing the gradient at random. If randomGenerator is NULL, the default random generator (as returned by vl_get_rand()) is used.
Alternatively, if permutation is not NULL, then points are sampled in the order specified by this vector of indexes (this is cycled through). It is an error to set both randomGenerator and permutation to non-null values.
preconditioner specifies a diagonal preconditioner for the minimization problem (it is often useful to slow down the steps for the bias term, if the latter is used). Set preconditioner to NULL to avoid using a preconditioner. The preconditioner should have the same dimension as the model (that is, dimension, plus one if an SVM with bias is learned).
See the Overview for details.
void vl_pegasos_train_binary_svm_f (
    float * model,
    float const * data,
    vl_size dimension,
    vl_size numSamples,
    vl_int8 const * labels,
    double regularizer,
    double biasMultiplier,
    vl_uindex startingIteration,
    vl_size numIterations,
    VlRand * randomGenerator,
    vl_uint32 const * permutation,
    vl_size permutationSize,
    float const * preconditioner
)
- See also:
- vl_pegasos_train_binary_svm_d