Documentation of Patch-Based Multi-View Stereo Software (PMVS)
Yasutaka Furukawa (yfurukaw -at- uiuc.edu)
09/25/2007
The package includes the core multi-view stereo reconstruction programs (affine
and match), a visualization program (patchviewer), and a file format conversion
program (patch2pset).
First thing to do
The package includes one dataset together with the output of our software.
After uncompressing the file, go into the following directory:
>cd pmvs/program/main
Then, try the following command
>patchviewer.exe 24 ../../data/nskullb/
You can use a command prompt or a Cygwin-type application. After 20 or 30
seconds (depending on the specs of your machine and graphics card), a window
should pop up showing a set of texture-mapped patches of our skull model. The
output of our software is stored under "pmvs/data/nskullb/models".
You should be able to recreate the same results by running the following two
commands from pmvs/program/main.
>./affine.exe 24 ../../data/nskullb/ 16
>./match.exe 24 ../../data/nskullb/ 2 1 0.5 7 3
A few more output files are written under the data directories; please see the
following sections for details.
Directory Structures and File Formats
Now, I will describe the input file formats and the directory structure that
your data files have to follow. Each dataset has a root directory, and all the
relevant files must be stored under root. There are 3 types of input data:
images, camera parameters, and segmentation masks. Segmentation masks are
optional.
Images must be in either jpeg or a restricted ppm file format. A ppm file must
be in P6 format (binary color image) and must not contain any comments. All the
image files must be under root/visualize/. File names must follow a special
format. Suppose a dataset consists of 5 images; then the image files (for
example in jpeg format) must be named 0000.jpg, 0001.jpg, 0002.jpg, 0003.jpg,
and 0004.jpg. In general, the i-th image file (counting from 0) must be named
as produced by printf("%04d.jpg", i) or printf("%04d.ppm", i). The program
first tries to read the ppm file. If the ppm file is not found, the program
then tries to find the jpeg file.
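The naming rule above can be sketched in a few lines of Python. This is only an illustration and not part of the package; the helper names image_name and find_image are hypothetical, and the lookup simply mirrors the documented order (ppm first, then jpeg).

```python
from pathlib import Path
from typing import Optional


def image_name(i: int, ext: str = "jpg") -> str:
    """Return the required file name for the i-th image (0-based)."""
    return "%04d.%s" % (i, ext)


def find_image(root: Path, i: int) -> Optional[Path]:
    """Look for the i-th image under root/visualize/, ppm first, then jpeg."""
    for ext in ("ppm", "jpg"):
        candidate = root / "visualize" / image_name(i, ext)
        if candidate.is_file():
            return candidate
    return None


# File names for a dataset with 5 images.
names = [image_name(i) for i in range(5)]
print(names)
```

A script like this can be used to rename your own images before running the programs.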
Camera parameter files must be stored under root/txt/. Suppose a dataset
consists of 5 images; then the camera parameter files must be named 0000.txt,
0001.txt, 0002.txt, 0003.txt, and 0004.txt. Each camera parameter file must
have the following format:
-------------------------------------------
CONTOUR
P[0][0] P[0][1] P[0][2] P[0][3]
P[1][0] P[1][1] P[1][2] P[1][3]
P[2][0] P[2][1] P[2][2] P[2][3]
-------------------------------------------
"CONTOUR" is just a header. P[3][4] denotes a 3x4 projection matrix, which is
defined as follows. Let (x y z 1) denote the homogeneous 3D coordinate of a
point, and (u v 1) denote the homogeneous 2D coordinate of its image
projection; then (x y z 1) and (u v 1) are related by the following equation:
    d (u v 1)^T = P (x y z 1)^T,
where d is the depth of the point with respect to the camera. Note that the
origin of the image coordinate system is at the top-left corner of an image
(strictly speaking, the origin lies at the center of the pixel at the top-left
corner of an image). The x-axis points to the right and the y-axis points
downward. So, the 2D image coordinate of the top-left pixel is (0, 0), and the
2D image coordinate of the bottom-right pixel is (w-1, h-1), where w and h are
the image width and height in pixels, respectively.
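As a rough illustration of the camera file format and the projection equation, the following Python sketch reads a camera parameter file and projects a 3D point. It is not part of the package. The function names parse_camera and project are hypothetical, the numeric values in SAMPLE are made up for the example, and the parser assumes the 12 matrix entries are separated by whitespace.

```python
from typing import List, Tuple

Matrix = List[List[float]]

# Hypothetical camera file contents (values invented for this example).
SAMPLE = """CONTOUR
100 0 320 0
0 100 240 0
0 0 1 0
"""


def parse_camera(text: str) -> Matrix:
    """Parse the CONTOUR header followed by the 3x4 matrix P (row by row)."""
    tokens = text.split()
    if not tokens or tokens[0] != "CONTOUR":
        raise ValueError("camera file must start with the CONTOUR header")
    values = [float(t) for t in tokens[1:13]]
    if len(values) != 12:
        raise ValueError("expected 12 projection matrix entries")
    return [values[r * 4:(r + 1) * 4] for r in range(3)]


def project(P: Matrix, x: float, y: float, z: float) -> Tuple[float, float, float]:
    """Apply d (u v 1)^T = P (x y z 1)^T and return (u, v, d)."""
    point = (x, y, z, 1.0)
    du, dv, d = (sum(P[r][c] * point[c] for c in range(4)) for r in range(3))
    if d == 0.0:
        raise ValueError("third coordinate is zero; projection is undefined")
    return du / d, dv / d, d


P = parse_camera(SAMPLE)
print(project(P, 1.0, 2.0, 10.0))  # (330.0, 260.0, 10.0)
```

Projecting a few known 3D points this way and checking that they land on the expected pixels is a quick sanity test for your own camera files.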
Segmentation masks can optionally be given to the program. A segmentation mask
must be given as a pgm file and stored under the root/masks/ directory. The
files must again be named 0000.pgm, 0001.pgm, ... A pgm file must be in P5
(binary grey-scale) format and must not contain any comments. If the intensity
of a pixel is more than 127, the pixel is treated as background, and if the
intensity is less than 127, the pixel is treated as foreground. Intuitively,
background should appear white and foreground should appear black. If mask
files do not exist in the directory, the program just ignores them. You do not
have to provide pgm files for all the input images. You can, for example, put
only 0002.pgm and 0004.pgm in the directory to specify masks for these 2
images.
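The mask convention can be sketched as follows. This Python fragment is only an illustration with hypothetical helper names (classify, mask_from_foreground, encode_pgm). It writes a comment-free P5 header with a maximum value of 255, and it reports an intensity of exactly 127 as "unspecified" because the text above does not say how that value is treated.

```python
from typing import List


def classify(intensity: int) -> str:
    """Apply the documented rule: above 127 is background, below is foreground."""
    if intensity > 127:
        return "background"
    if intensity < 127:
        return "foreground"
    return "unspecified"  # exactly 127 is not covered by the documentation


def mask_from_foreground(foreground: List[List[bool]]) -> List[List[int]]:
    """Turn a boolean foreground grid into intensities (black = foreground)."""
    return [[0 if fg else 255 for fg in row] for row in foreground]


def encode_pgm(rows: List[List[int]]) -> bytes:
    """Encode rows of 0..255 intensities as a binary (P5) pgm without comments."""
    height = len(rows)
    width = len(rows[0])
    header = b"P5\n%d %d\n255\n" % (width, height)
    return header + bytes(value for row in rows for value in row)


mask = mask_from_foreground([[True, False], [False, True]])
data = encode_pgm(mask)
# To save it for the image 0002: open("root/masks/0002.pgm", "wb").write(data)
```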
All the results will be stored under root/models/. Do not forget to create this
directory before running the programs; otherwise, results will not be saved.
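The directory layout described above can be prepared and checked with a short script. The following Python sketch is not part of the package, and the function names make_dataset_dirs and missing_inputs are hypothetical. It creates the four sub-directories and lists the camera and image files that are still missing for a dataset of num images.

```python
from pathlib import Path
from typing import List


def make_dataset_dirs(root: Path) -> None:
    """Create root/visualize, root/txt, root/masks and root/models."""
    for name in ("visualize", "txt", "masks", "models"):
        (root / name).mkdir(parents=True, exist_ok=True)


def missing_inputs(root: Path, num: int) -> List[str]:
    """List required input files that are absent (masks are optional)."""
    missing = []
    for i in range(num):
        stem = "%04d" % i
        if not (root / "txt" / f"{stem}.txt").is_file():
            missing.append(f"txt/{stem}.txt")
        images = [root / "visualize" / f"{stem}.{ext}" for ext in ("ppm", "jpg")]
        if not any(p.is_file() for p in images):
            missing.append(f"visualize/{stem}.ppm or .jpg")
    return missing
```

Calling make_dataset_dirs(Path("mydata")) and then missing_inputs(Path("mydata"), 5) before running affine shows which of the 5 camera files and 5 images still have to be copied in.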
Output File Format
If you just need the reconstructed oriented points, use patch2pset.exe to get a
.pset file. Each line of a .pset file contains 6 numbers. The first 3 numbers
represent the 3D coordinates of a point, and the last 3 numbers represent the
estimated (outward) surface normal at that point.
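A .pset file can be read with a few lines of Python. The sketch below is an illustration only; parse_pset is a hypothetical name, and the parser assumes the 6 numbers on each line are separated by whitespace.

```python
from typing import List, Tuple

Vec3 = Tuple[float, float, float]


def parse_pset(text: str) -> List[Tuple[Vec3, Vec3]]:
    """Return a list of (position, normal) pairs, one per non-empty line."""
    points = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        fields = line.split()
        if not fields:
            continue  # tolerate blank lines
        if len(fields) != 6:
            raise ValueError(f"line {line_number}: expected 6 numbers")
        x, y, z, nx, ny, nz = (float(f) for f in fields)
        points.append(((x, y, z), (nx, ny, nz)))
    return points


# Two made-up oriented points.
sample = "0.5 1.0 -2.0 0.0 0.0 1.0\n1 2 3 0 1 0\n"
for position, normal in parse_pset(sample):
    print(position, normal)
```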
After running affine.exe and match.exe, you will get a number of .patc0 files,
which are similar to .pset files but contain more information. Typically, you
will see as many .patc0 files as there are input cameras. 0000.patc0 contains
reconstructed points whose reference image is the first camera, 0001.patc0
contains points whose reference image is the second camera, and so on. For the
meaning of a reference image, please refer to our paper. The first line of each
patc0 file is an integer, which is the value of the parameter A used in
match.exe. The second line contains an integer N representing the number of
points stored in the file. After that, you will see N blocks of data in the
following format. Each block contains information for a single reconstructed
point and starts with a header "PATCH0", followed by the homogeneous coordinate
of its center, the homogeneous coordinate of the estimated (outward) surface
normal at that point, and the photometric consistency score associated with the
point (ranging from -1.0 to 1.0; the higher, the better). Lastly, you will see
4 lines of integers. The first line contains the number of images in which
textures are consistent with each other, and the following line contains the
indexes of those images. The next line contains another integer, the number of
images in which the point should be visible but textures are not really
consistent and hence are not used for reconstruction. The last line contains
the indexes of those images.
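The block layout of a .patc0 file can be sketched as a small token-based reader in Python. This is an illustration written from the description above, not code from the package. The name parse_patc0 and the dictionary keys are hypothetical, the sample data is made up, and the reader assumes that each homogeneous coordinate consists of 4 whitespace-separated numbers.

```python
from typing import Dict, List


def parse_patc0(text: str) -> Dict:
    """Read the A value, the point count N, and N PATCH0 blocks."""
    tokens = text.split()
    pos = 0

    def take(n: int) -> List[str]:
        nonlocal pos
        chunk = tokens[pos:pos + n]
        if len(chunk) != n:
            raise ValueError("unexpected end of file")
        pos += n
        return chunk

    a_value = int(take(1)[0])
    count = int(take(1)[0])
    patches = []
    for _ in range(count):
        if take(1)[0] != "PATCH0":
            raise ValueError("expected PATCH0 header")
        center = [float(t) for t in take(4)]      # assumed: 4 numbers
        normal = [float(t) for t in take(4)]      # assumed: 4 numbers
        score = float(take(1)[0])
        consistent = [int(t) for t in take(int(take(1)[0]))]
        inconsistent = [int(t) for t in take(int(take(1)[0]))]
        patches.append({"center": center, "normal": normal, "score": score,
                        "consistent": consistent, "inconsistent": inconsistent})
    return {"A": a_value, "patches": patches}


SAMPLE = """2
1
PATCH0
0.1 0.2 0.3 1
0 0 1 0
0.85
3
0 1 5
1
7
"""
result = parse_patc0(SAMPLE)
print(result["patches"][0]["consistent"])  # [0, 1, 5]
```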
Programs
Note that for most of the programs, you must have the 3 dll files included in
the package in the same directory as the programs.
affine
Usage: ./affine num root [A=16]
One feature is detected in every A by A pixel cell.
Example
./affine.exe 24 ../../data/nskullb/ 16
affine detects features in each image. We use 2 types of feature detection
filters: a Harris corner detector and a Difference of Gaussian (DoG) blob
detector. The first argument (num) specifies the number of images in the
dataset. The second argument (root) specifies the root directory of the dataset
containing input camera parameters, input images, and optionally segmentation
masks. The third argument (A) controls the density of detected features in each
image. In particular, a single feature is detected in every A by A pixel cell
in each image. Results (detected features) are saved in the directory
"root/models/". For example, if you specify "root=../../data/nskullb/", results
are saved under "../../data/nskullb/models/".
match
Usage: ./match num root [A=16] [fullin=0] [threshold=0.7] [size=5] [minImage=3]
Example
./match.exe 24 ../../data/nskullb/ 2 1 0.5 7 3
match uses the image features detected by affine and reconstructs a set of
small rectangular patches (oriented points) representing the surface of an
object or a scene of interest. The first 2 arguments are the same as for
affine. The third argument A controls the density of the reconstruction. In
particular, the program tries to reconstruct one patch in every A by A pixel
cell. The smaller the value of A, the denser the result. The fourth argument
fullin should be set to 1 if the object or scene is fully contained in all the
input images. On the other hand, if the object or scene extends outside the
frame in some images, fullin should be set to 0. The fifth argument threshold
is a threshold on the photometric consistency score of each patch. The
photometric consistency score is an average normalized cross correlation score
of back-projected image textures, and hence ranges from -1.0 to 1.0. The value
is typically set to 0.5 or 0.7. The sixth argument size controls the image
window size used for the computation of photometric consistency scores. For
example, if size = 5, 5x5 = 25 pixels are used to compute the photometric
consistency score. The last argument minImage specifies the minimum number of
images in which a patch must be visible to be reconstructed. minImage must be
at least 2. For stability, 3 is much better than 2.
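To make the range of the photometric consistency score concrete, here is a minimal Python sketch of normalized cross correlation between two grey-scale windows. It covers a single pair of windows only. How match samples and back-projects textures, and how it averages over image pairs, is not reproduced here, and the function name ncc is hypothetical.

```python
import math
from typing import Sequence


def ncc(a: Sequence[float], b: Sequence[float]) -> float:
    """Normalized cross correlation of two equally sized pixel windows."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("windows must be non-empty and of equal size")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    denom = math.sqrt(sum(v * v for v in da) * sum(v * v for v in db))
    if denom == 0.0:
        raise ValueError("a constant window has no defined correlation")
    return sum(x * y for x, y in zip(da, db)) / denom


# With size = 5, each window holds 5x5 = 25 pixels.
window = [float(v) for v in range(25)]
brighter = [2.0 * v + 10.0 for v in window]   # same texture, different exposure
inverted = [-v for v in window]               # opposite texture

print(ncc(window, brighter))  # close to 1.0
print(ncc(window, inverted))  # close to -1.0
```

Because the score is invariant to brightness and contrast changes of this kind, a threshold such as 0.5 or 0.7 compares texture similarity rather than raw intensity.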
patchviewer
Usage: ./patchviewer num root
Example
./patchviewer.exe 24 ../../data/nskullb/
patchviewer is a visualization program that renders a set of reconstructed
patches with and without texture mapping. The first 2 arguments are the same as
for affine and match. Drag with the left mouse button to rotate patches, drag
up and down with the right mouse button to scale patches, and drag with both
buttons held down to translate patches. There are 4 modes in the viewer: 1)
texture mapping with smooth shading; 2) no texture mapping with smooth shading;
3) texture mapping with wireframe; and 4) no texture mapping with wireframe.
Type 'c' to cycle through the modes.
One important note is that patchviewer reads jpeg files for texture mapping, so
you must have images in jpeg format under the "visualize" directory.
patch2pset
Usage: ./patch2pset num root output
Example
./patch2pset.exe 24 ../../data/nskullb/ ../../data/nskullb/nskullb.pset
patch2pset converts a set of reconstructed patches into a file format that can
be loaded by PoissonRecon.32, which is one of
Michael Kazhdan's programs that compute a mesh model
from oriented points (patches). The first 2 arguments are the same as before,
while the last argument specifies the name of the output file.
PoissonRecon
Example
./PoissonRecon.32.exe --in ../../data/nskullb/nskullb.pset --out ../../data/nskullb/nskullb.ply --depth 12
PoissonRecon turns a set of oriented
points into a 2D manifold with boundaries represented by a triangulated mesh.
This software is publicly available at the following URL together with its
detailed usage and explanations:
http://www.cs.jhu.edu/~misha/Code/PoissonRecon/
Datasets
I included one dataset in the package. In addition to the input data files,
output files are also included. Data files are stored in the directory
structure described above. The following is a list of commands I used to get
the results in the package.
./affine.exe 24 ../../data/nskullb/ 16
./match.exe 24 ../../data/nskullb/ 2 1 0.5 7 3
./patch2pset.exe 24 ../../data/nskullb/ ../../data/nskullb/nskullb.pset
./PoissonRecon.32.exe --in nskullb.pset --out nskullb.ply --depth 12
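The same sequence of commands can be assembled programmatically. The Python sketch below only builds the argument lists and does not run anything. The function name pipeline_commands and its parameter names are hypothetical, the default values are the ones used for the nskullb dataset above, and the .pset and .ply files are placed under root as in the PoissonRecon example.

```python
from typing import List


def pipeline_commands(num: int, root: str, name: str,
                      affine_a: int = 16, match_a: int = 2, fullin: int = 1,
                      threshold: float = 0.5, size: int = 7,
                      min_image: int = 3, depth: int = 12) -> List[List[str]]:
    """Build the affine, match, patch2pset and PoissonRecon command lines."""
    if min_image < 2:
        raise ValueError("minImage must be at least 2")
    if not root.endswith("/"):
        raise ValueError("root is expected to end with '/'")
    pset = f"{root}{name}.pset"
    ply = f"{root}{name}.ply"
    return [
        ["./affine.exe", str(num), root, str(affine_a)],
        ["./match.exe", str(num), root, str(match_a), str(fullin),
         str(threshold), str(size), str(min_image)],
        ["./patch2pset.exe", str(num), root, pset],
        ["./PoissonRecon.32.exe", "--in", pset, "--out", ply,
         "--depth", str(depth)],
    ]


for command in pipeline_commands(24, "../../data/nskullb/", "nskullb"):
    print(" ".join(command))
```

Each list can be passed to subprocess.run(command, check=True) from pmvs/program/main to execute the steps in order.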
Threads Support
Our software can utilize threads. affine2.exe and affine4.exe are the 2-thread
and 4-thread versions of affine.exe, respectively. Similarly, match2.exe and
match4.exe are the 2-thread and 4-thread versions of match.exe, respectively.
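When scripting the programs, the executable name can be chosen from the desired thread count. The following Python helper is a small sketch with a hypothetical name (threaded_executable); it only knows the variants listed above.

```python
def threaded_executable(program: str, threads: int = 1) -> str:
    """Map (program, thread count) to the executable name described above."""
    if program not in ("affine", "match"):
        raise ValueError("only affine and match have threaded versions")
    suffixes = {1: "", 2: "2", 4: "4"}
    if threads not in suffixes:
        raise ValueError("supported thread counts are 1, 2 and 4")
    return f"{program}{suffixes[threads]}.exe"


print(threaded_executable("match", 4))  # match4.exe
```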
Last updated on 2009/02/07.