# Quickstart Image

Here is a quick tutorial on scattering transform scattering transform with ScatNet for image classification. It contains :

## Setup

• Type edit startup.m into MATLAB prompt
• Put the following two lines to your startup.m file.
addpath '/path/to/scatnet/';

• Save startup.m and launch it from MATLAB prompt

ScatNet is now set up and will automatically be loaded when starting MATLAB.

## Fundamentals

This section describes how to compute the scattering transform of an image with as few lines of MATLAB as possible.

Load an image of reasonable size (e.g. 640x480).

        x = uiuc_sample;
imagesc(x);
colormap gray;


Before computing the scattering coefficients of x, we need to precompute the wavelet transform operators that will be applied to the image. Those operators are built by specific built-in factories adapted to different types of signal. For translation-invariant representation of images, we use wavelet_factory_2d.

        Wop = wavelet_factory_2d(size(x));


Now we call the scat function to compute the scattering of x using those Wop operators.

        S = scat(x, Wop);


The variable S is a cell of structure. It is not suited for direct numeric manipulation. You can reformat it in a 3D array using format_scat :

        S_mat = format_scat(Sx);


S_mat is a 417x60x80 matrix. The first dimension in S_mat is the path index, while the second and third dimensions correspond to subsampled spatial coordinates of the original image.

% scattering transform - code example
%
x = uiuc_sample;
% compute wavelet operators with factory
Wop = wavelet_factory_2d(size(x));
% compute scattering
S = scat(x, Wop);
% reformat in 3d matrix
S_mat = format_scat(S);


Note: wavelet_factory_2d_pyramid() may be used instead of wavelet_factory_2d(size(x)). It will use a similar yet faster algorithm to compute the scattering coefficients at different scales.

## Data Structures

ScatNet relies on three data structures, embedded into each other: wavelet factories, layers and filters. We review their respective structures

### Layers

The first three layers of the 2D scattering of an image $$x$$ are defined by $S_0 x = \{x \star \phi\}$ $S_1 x = \{|x \star \psi_{j_1, \theta_1} | \star \phi\}$ $S_2 x = \{||x \star \psi_{j_1, \theta_1} |\star \psi_{j_2, \theta_2} | \star \phi\}$ where the symbol $$\star$$ denotes the spatial convolution, $$\phi$$ is an averaging window and $$\psi_{j,\theta}$$ is a wavelet dilated by $$2^j$$ and rotated by $$\pi (\theta-1)/L$$.

The output variable S is a cell array, whose elements are the different layers of the scattering transform : S{m+1} corresponds to the $$m$$th layer $$S_m x$$.

Scattering layer is a MATLAB structure with two fields:

• signal is a cell array of 2d matrices, which combines the scattering coefficients, which is indexed by the integer variable $$p$$ as follow
% the p-th at the m-th layer:
Sx{m+1}.signal{p}

• meta is a struct with the sequences of $$j$$ and $$\theta$$ that corresponds to each $$p$$.
% the sequence of j for p-th coefficient at the m-th order :
Sx{m+1}.meta.j(:,p)
% the sequence of theta for p-th coefficient at the m-th order :
Sx{m+1}.meta.theta(:,p)


For example, let us display the coefficients of second order with $j_1 = 0, j_2 = 2, \theta_1 = 1, \theta_2 = 5$ First, we find the index $$p$$ using meta information
j1 = 0;
j2 = 2;
theta1 = 1;
theta2 = 5;
p = find( Sx{3}.meta.j(1,:) == j1 & ...
Sx{3}.meta.j(2,:) == j2 & ...
Sx{3}.meta.theta(1,:) == theta1 & ...
Sx{3}.meta.theta(2,:) == theta2 );

Then we retrieve the $$p$$-th signal of the second layer and display it :
imagesc(Sx{3}.signal{p});

This should result in the figure: The resolution corresponds to the oversampling we work with the image.

### Factories

A factory is the cell array of all linear operators involved in a scattering network represented in MATLAB as function handless. The standard built-in factory has the following prototype: the Wop operators are obtained with

        Wop = wavelet_factory_2d(size(x));


is a cell array of function handles. scat.m applies layer by layer

[S{m+1}, V] = Wop{m+1}(U{m+1});
U{m+2} = modulus_layer(V);

The $$m$$-th operator Wop{m+1} (which is a function handle) is applied to the $$m$$-th internal layer U{m+1}. The operator outputs the $$m$$-th external layer S{m+1} and a residual layer V. The second line applies a complex modulus to layer V and stores the result as the $$m+1$$-th internal layer U{m+2} for further decomposition.

### Filters

The filters $$\phi$$ and $$\psi_{j,\theta}$$ used in the computation of scattering can be obtained as second argument of the wavelet factory.

[Wop, filters] = wavelet_factory_2d(size(x));

filters is a struct containing three fields
• filters.phi the low pass filter $$\phi$$
• filters.psi all dilated and oriented wavelets $$\psi_{j,\theta}$$
• meta  global information about the filter bank

The field filters.psi is a structure with two fields :

• filters.psi.filter that is a cell array of filter
• filters.psi.meta that contains the $$j$$ and $$\theta$$ corresponding to each filters

The structure of filters.phi and filters.psi are similar with that of a scattering layer, except that the filed name is replaced by filter filter instead of signal.

Finally, fields filters.psi.filter{p} is a structure with two fields :
• filters.psi.filter{p}.type is a string that specifies the type of filter used. For 2D scattering, 'fourier_multires' stands for multiresolution of Fourier transform.
• filters.psi.filter{p}.coefft that contains the coefficient of the $$p$$-th filter, in a format that depends upon the previous type field. For 2D scattering, it is a cell whose elements are the Fourier transform of the filter at different resolution.

## Configuration

From filter design to network architecture, the built-in wavelet factories in ScatNet are fully customizable. We review the available options with the example of wavelet_factory_2d. The same would hold for wave_fact_3d.

### Compute the scattering with options

Options must be embedded in filt_opt and scat_opt structures.
filt_opt.J = 5;

Non specified fields will be set to a default value. The filt_opt and scat_opt structures must be passed to the factory :
% compute scattering with non-default options
x = uiuc_sample;
filt_opt.J = 5;
filt_opt.L = 6;
scat_opt.oversampling = 2;
scat_opt.M = 2;
Wop = wavelet_factory_2d(size(x), filt_opt, scat_opt);
Sx = scat(x, Wop);

• filt_opt contains filters-related options
• filt_opt.J (default 4) is the number of scale $$j$$ in the filter bank $$\psi_{j,\theta}$$. Note that increasing J will also increase the range of translation invariance.
• filt_opt.L (default 8) is the number of orientations. Note that increasing L will also increase the angular selectivity of filters.
• filt_opt.Q (default 1) is the number of scale per octave. Not that increasing Q will decrease the range of translation invariance and increase the scale selectivity of filters.
• scat_opt contains scattering-related options
• scat_opt.M (default 2) is the maximum scattering order. The scat function will compute scattering coefficient of order 0 to M, that is $S_0 x, \dotsc, S_M x$
• scat_opt.oversampling (default 1) : scattering will be oversampled by up to a power of 2.

## Display

### Scattering display

All the scattering coefficients in a layer can be displayed at once, as a big image of concatenated nodes. The function image_scat iterates over the scattering order, so as to produce one such image per layer. In the following example, we display a second order scattering representation with 5 scales and 6 orientations per layer. You can display all scattering coefficients with image_scat
% compute scattering with 5 scales, 6 orientations
% and an oversampling factor of 2^2
x = uiuc_sample;
filt_opt.J = 5;
filt_opt.L = 6;
scat_opt.oversampling = 2;
Wop = wavelet_factory_2d(size(x), filt_opt, scat_opt);
Sx = scat(x, Wop);
% display scattering coefficients
image_scat(Sx)

which should result in the following three figures. The meta information that identifies a scattering path are overlaid on each node. Also, each node is renormalized by the norm of its parent. Both these features can be turned off
        image_scat(Sx, false, false);

which should result in the following three figures. The coefficient can be organized with a more structured layout using display_with_layer_order. For instance, they can be arranged by lexicographic order on $$j$$ along rows and lexicographic order on $$\theta$$ along columns.
display_with_layer_order(Sx,1);

which should result in the figure :

### Filter display

One may display the real and imaginary part of those filters by means of display_filter_bank_2d
% compute the filters with 5 scales and 6 orientations
filt_opt.J = 5;
filt_opt.L = 6;
[Wop, filters] = wavelet_factory_2d([512, 512], filt_opt);
% display all the filters
display_filter_bank_2d(filters);

The above code should result in the following figure :

The top left image corresponds to $$\phi$$. The first left half corresponds to the real parts of $$\psi_{j,\theta}$$, arranged according scales (rows) and orientations (columns). The right half image corresponds to the imaginary part.

## Classification

### Scattering transform on a database of images

We need to define a src structure containing the path and label of all the images. We have a specific function for each database. For the uiuc texture database we use
src = uiuc_src;

Then, we need to define the features as a cell of function handle containing all the feature that we want to compute per image. In this example, the only feature is the scattering representation.
filt_opt.J = 5;
scat_opt.oversampling = 0;
Wop = wavelet_factory_2d([480, 640], filt_opt, scat_opt);
features{1} = @(x)(sum(sum(format_scat(scat(x,Wop)),2),3));

We use prepare_database to perform the computation of the feature
options.parallel = 0;
db = prepare_database(src, features, options);


### Classifier

The code below performs an affine space training and testing of the database of features computed in the previous subsection.
% proportion of training example
prop = 0.5;
% split between training and testing
[train_set, test_set] = create_partition(src, prop);
% dimension of the affine pca classifier
train_opt.dim = 20;
% training
model = affine_train(db, train_set, train_opt);
% testing
labels = affine_test(db, model, test_set);
% compute the error
error = classif_err(labels, test_set, src);


## Roto-translation scattering

### Computation

This section explains how to compute the roto-translation scattering introduced in paper [SifreEtMallat,2013].
Roto-translation scattering computes local descriptors that are invariant to small translation and arbitrary local rotations. First, load an image of reasonnable size
        x = uiuc_sample;

As for translation scattering, we need to precompute the wavelet operators. This time we use wavelet_factory_3d instead of wavelet_factory_2d
        Wop = wavelet_factory_3d(size(x));

As explained in [9], the roto-translation wavelet operators are identical to those obtained with wavelet_factory_2d ; yet the third operators is the roto translation wavelet transform :
>> Wop{3}

ans =

@(x)(wavelet_layer_3d(x,filters,filters_rot,scat_opt))

Here, we compute roto-translation scattering by calling scat function on x using the Wop operators.
        Sx = scat(x, Wop);

The above instruction lasts on a 2.4 GHz Core i7 about 15 seconds. If lesser resolution is needed, the antialiasing may be set to 0 (default is 1, which mean that scattering will be oversampled by a factor 2^1).
        filt_opt = struct;
filt_opt_rot = struct;
scat_opt.oversampling = 0;
wavelet = wavelet_factory_3d(size(x), ...
filt_opt, filt_opt_rot, scat_opt);
Sx = scat(x, Wop);

Without antialiasing, it takes about 6 seconds.
As for translation scattering, one can format the output is a 3d matrix:
        sx = format_scat(Sx);


### Display

The same image_scat function works with roto-translation scattering. Calling it
     image_scat(Sx);

results in the three following figures : One can also arrange the coefficient according to lexicographic order in $$(j_1,j_2)$$ along rows and lexicographic order in $$(k_2, \theta_2)$$ along columns (see [9] for definition of $$(k_2, \theta_2)$$ )
% arrange scattering layer in lexicographic order
var_y{1}.name = 'j';
var_y{1}.index = 1;
var_y{2}.name = 'j';
var_y{2}.index = 2;
var_x{1}.name = 'k2';
var_x{1}.index = 1;
var_x{2}.name = 'theta2';
var_x{2}.index = 1;

big_img = image_scat_layer_order(Sx{3}, var_x, var_y, 1);
imagesc(big_img)
ylabel('j_1, j_2');
xlabel('k_2, \theta_2');

which results in the figure Note that the spatial filters and the rotation filters may be obtained as second and third argument of wavelet_factory_3d :
    [wavelet, filters, filters_rot] = wavelet_factory_3d(size(x));


### Example of invariance

Here, we check that the roto-translation scattering is indeed rotation invariant, by computing the scattering transformation of a rotated image and rotating back its outputs.
% this script demonstrates the rotation invariance
% of the roto-translation scattering

x = uiuc_sample;
x = x(128:128+255, 128:128+255);

filt_opt = struct;
filt_rot_opt = struct;
% oversampling must be set to infty
% so that scattering of original and rotated
% image will be sampled at exactly the same points
scat_opt.oversampling = 10;

Wop = wavelet_factory_3d(size(x), filt_opt, filt_rot_opt, scat_opt);

% compute scattering of x
Sx = scat(x, Wop);
sx_raw = format_scat(Sx);

% compute scattering of x rotated by pi/2
x_rot = rot90(x,1);
Sx_rot = scat(x_rot, Wop);
sx_rot_raw = format_scat(Sx_rot);

% rotate back every layer of output
for p = 1:size(sx_rot_raw,1)
tmp = rot90(squeeze(sx_rot_raw(p,:,:)), -1);
sx_rot_raw_back(p,:,:) = permute(tmp, [3, 1, 2]);
end

% compute norm ratio
norm_diff = sqrt(sum((sx_rot_raw_back(:)-sx_raw(:)).^2));
norm_s = sqrt(sum((sx_raw(:)).^2));
norm_ratio = norm_diff/norm_s

This script is in demo/core/demo_scat_3d_pyramid_rotation_invariance.m. Executing it result in a relative norm of
norm_ratio =

1.2552e-08

This means that every local descriptors sx(u1, u2, :) that describes the patch located in u1,u2 are invariant to rotation.