Documentation of Patch-Based Multi-View Stereo Software (PMVS)
Yasutaka Furukawa (yfurukaw -at- uiuc.edu)

09/25/2007

The package includes the core multi-view stereo reconstruction programs (affine and match), a visualization program (patchviewer), and a file format conversion program (patch2pset).

First thing to do

The package includes one data set with the output of our software. After uncompressing the file, go into the directory
>cd pmvs/program/main
Then, try the following command
>patchviewer.exe 24 ../../data/nskullb/
You can use a command prompt or a Cygwin-like shell. After 20 or 30 seconds (depending on the specifications of your machine and graphics card), a window should pop up showing a set of texture-mapped patches of our skull model. The output of our software is stored under "pmvs/data/nskullb/models". You should be able to recreate the same results by running the following two commands from pmvs/program/main.
>./affine.exe 24 ../../data/nskullb/ 16
>./match.exe 24 ../../data/nskullb/ 2 1 0.5 7 3
A few more output files are stored under the data directories; see the sections below for details.

Directory Structures and File Formats

Now, I will describe the input file formats and the directory structure that your data files have to follow. Each dataset has a root directory, and all the relevant files must be stored under root. There are 3 types of input data: images, camera parameters, and segmentation masks. Segmentation masks are optional.

• Images must be in either jpeg or restricted ppm file format. A ppm file must be in P6 format (binary color image) and must not contain any comments. All the image files must be under root/visualize/. File names must follow a special format. Suppose a dataset consists of 5 images; then the image files (for example in jpeg format) must be named 0000.jpg, 0001.jpg, 0002.jpg, 0003.jpg, and 0004.jpg. In general, the i-th image file (counting from 0) must be named printf("%04d.jpg", i) or printf("%04d.ppm", i). The program first tries to read a ppm file. If the ppm file is not found, the program then looks for a jpeg file.

• Camera parameter files must be stored under root/txt/. Suppose a dataset consists of 5 images, then camera parameter files must be named 0000.txt, 0001.txt, 0002.txt, 0003.txt, and 0004.txt. Each camera parameter file must have the following format:
-------------------------------------------
CONTOUR
P[0][0] P[0][1] P[0][2] P[0][3]
P[1][0] P[1][1] P[1][2] P[1][3]
P[2][0] P[2][1] P[2][2] P[2][3]
-------------------------------------------
"CONTOUR" is just a header. P[3][4] denotes a 3x4 projection matrix, which is defined as follows. Let (x y z 1) denote a homogeneous 3D coordinate of a point, and (u v 1) denote a homogeneous 2D coordinate of its image projection, then (x y z 1) and (u v 1) are related by the following equation:
d (u v 1)^T = P (x y z 1)^T,
where d is the depth of the point with respect to the camera. Note that the origin of the image coordinate system is at the top left corner of an image (strictly speaking, the origin lies in the center of the pixel at the top-left corner of an image). The x-axis points to the right and the y-axis points to the bottom. So, the 2D image coordinate of the top left pixel is (0, 0), and the 2D image coordinate of the bottom right pixel is (w - 1, h - 1), where w and h are the image width and height in pixels, respectively.

• Segmentation masks can optionally be given to the program. A segmentation mask must be a pgm file stored under the root/masks/ directory. File names must again be 0000.pgm, 0001.pgm, ... A pgm file must be in P5 (binary grey-scale) format and must not contain any comments. If the intensity of a pixel is more than 127, the pixel is treated as background, and if the intensity is less than 127, the pixel is treated as foreground. Intuitively, background should appear white and foreground should appear black. If mask files do not exist in the directory, the program simply runs without them. You do not have to provide pgm files for all the input images. You can, for example, put only 0002.pgm and 0004.pgm in the directory to specify masks for these 2 images.

• All the results will be stored under root/models/. Do not forget to create this directory before running the programs; otherwise, results will not be saved.
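
The layout rules above can be sketched in a short Python helper. This is only an illustration and not part of the package: the function names (image_name, prepare_dataset, write_camera, read_camera, project) are hypothetical, camera values are assumed to be separated by whitespace, and the depth d is simply taken as the third component of P (x y z 1)^T.

```python
from pathlib import Path


def image_name(i, ext="jpg"):
    """File name of the i-th file, same pattern as printf("%04d.jpg", i)."""
    return "%04d.%s" % (i, ext)


def prepare_dataset(root, num):
    """Create the sub-directories of root and return the expected file paths."""
    root = Path(root)
    for sub in ("visualize", "txt", "masks", "models"):
        (root / sub).mkdir(parents=True, exist_ok=True)
    return {
        "images": [root / "visualize" / image_name(i) for i in range(num)],
        "cameras": [root / "txt" / image_name(i, "txt") for i in range(num)],
        # Masks are optional; any subset of these files may exist.
        "masks": [root / "masks" / image_name(i, "pgm") for i in range(num)],
    }


def write_camera(path, P):
    """Write a 3x4 projection matrix (3 rows of 4 numbers) with the CONTOUR header."""
    rows = [" ".join(repr(float(v)) for v in row) for row in P]
    Path(path).write_text("CONTOUR\n" + "\n".join(rows) + "\n")


def read_camera(path):
    """Read a camera file back into 3 rows of 4 floats."""
    tokens = Path(path).read_text().split()
    if len(tokens) != 13 or tokens[0] != "CONTOUR":
        raise ValueError("not a CONTOUR camera file: %s" % path)
    values = [float(t) for t in tokens[1:]]
    return [values[4 * r:4 * r + 4] for r in range(3)]


def project(P, point):
    """Return (u, v, d) such that d * (u, v, 1) = P * (x, y, z, 1)."""
    X = list(point) + [1.0]
    pu, pv, d = (sum(P[r][c] * X[c] for c in range(4)) for r in range(3))
    return pu / d, pv / d, d
```

For example, prepare_dataset("mydata/", 5) creates mydata/visualize/, mydata/txt/, mydata/masks/ and mydata/models/, and lists 0000.jpg to 0004.jpg as the expected image names ("mydata/" is a made-up root).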

Output File Format

If you just need the reconstructed oriented points, use patch2pset.exe to get a .pset file. Each line of a .pset file contains 6 numbers. The first 3 numbers represent the 3D coordinate of a point and the last 3 numbers represent the estimated (outward) surface normal at that point.
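
As a rough illustration of the .pset layout, the following Python sketch reads such a file into a list of points. It assumes that the 6 numbers on each line are separated by whitespace and that the file has no header line; read_pset is a hypothetical name, not a program in the package.

```python
def read_pset(text):
    """Parse .pset text into a list of (position, normal) pairs of 3-tuples."""
    points = []
    for line in text.splitlines():
        values = [float(t) for t in line.split()]
        if not values:
            continue  # skip blank lines
        if len(values) != 6:
            raise ValueError("expected 6 numbers per line, got %d" % len(values))
        # First 3 numbers: 3D coordinate. Last 3 numbers: outward surface normal.
        points.append((tuple(values[:3]), tuple(values[3:])))
    return points
```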

After running affine.exe and match.exe, you will get a lot of .patc0 files, which are similar to .pset files but contain more information. Typically, you will see the same number of .patc0 files as input cameras. 0000.patc0 contains reconstructed points whose reference image is the first camera, 0001.patc0 contains points whose reference image is the second camera, and so on. For the meaning of a reference image, please refer to our paper.

The first line of each .patc0 file is an integer, which is the value of the parameter A used in match.exe. The second line contains an integer N representing the number of points stored in the file. After that, you will see N blocks of data in the following format. Each block contains information for a single reconstructed point and starts with a header "PATCH0", followed by the homogeneous coordinate of its center, the homogeneous coordinate of the estimated (outward) surface normal at that point, and the photometric consistency score associated with the point (ranging from -1.0 to 1.0; the higher, the better).

Lastly, each block ends with 4 lines of integers. The first line contains the number of images in which textures are consistent with each other, and the following line contains the indexes of those images. The next line contains another integer, representing the number of images in which the point should be visible but whose textures are not consistent enough and hence are not used for the reconstruction. The last line contains the indexes of those images.
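
The block structure described above can be read with a short Python sketch such as the following. It is only a guess at the exact layout: it assumes that the header, the center, the normal, and the score each sit on their own line, that only the first number on the score line matters, and that an index line may be empty or missing when its count is 0. parse_patc0 and the dictionary keys are hypothetical names.

```python
def parse_patc0(text):
    """Parse the text of one .patc0 file into (A, list of patch dictionaries)."""
    lines = text.splitlines()
    pos = 0

    def next_line():
        # Return the next non-blank line with surrounding whitespace removed.
        nonlocal pos
        while pos < len(lines):
            line = lines[pos].strip()
            pos += 1
            if line:
                return line
        raise ValueError("unexpected end of file")

    def read_indexes(count):
        # Assumption: with a count of 0 the index line is empty or absent.
        nonlocal pos
        if count == 0:
            if pos < len(lines) and not lines[pos].strip():
                pos += 1
            return []
        indexes = [int(t) for t in next_line().split()]
        if len(indexes) != count:
            raise ValueError("expected %d image indexes" % count)
        return indexes

    cell_size = int(next_line())    # value of A used in match.exe
    num_patches = int(next_line())  # N
    patches = []
    for _ in range(num_patches):
        if next_line() != "PATCH0":
            raise ValueError("missing PATCH0 header")
        center = [float(t) for t in next_line().split()]  # homogeneous coordinate
        normal = [float(t) for t in next_line().split()]  # homogeneous coordinate
        score = float(next_line().split()[0])             # in [-1.0, 1.0]
        consistent = read_indexes(int(next_line()))
        inconsistent = read_indexes(int(next_line()))
        patches.append({"center": center, "normal": normal, "score": score,
                        "consistent": consistent, "inconsistent": inconsistent})
    return cell_size, patches
```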

Programs

Note that most of the programs require the 3 dll files included in the package to be in the same directory as the program.

• affine

Usage: ./affine num root [A=16]
One feature is detected in every A by A pixel cell.

Example
./affine.exe 24 ../../data/nskullb/ 16

affine detects features in each image. We use 2 types of feature detection filters: the Harris corner detector and the Difference of Gaussian (DoG) blob detector. The first argument (num) specifies the number of images in the dataset. The second argument (root) specifies the root directory of the dataset containing the input camera parameters, the input images, and optionally the segmentation masks. The third argument (A) controls the density of detected features in each image. In particular, a single feature is detected in every A by A pixel cell in each image. Results (detected features) are saved in the directory "root/models/". For example, if you specify "root=../../data/nskullb/", results are saved under "../../data/nskullb/models/".
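
To get a feeling for the parameter A, the following Python sketch counts the A by A cells that cover an image. It is a back-of-the-envelope estimate only: cell_grid is a hypothetical helper, the 640x480 image size in the usage note is made up, and counting partially covered cells at the image border as full cells is an assumption.

```python
def cell_grid(width, height, A=16):
    """Return (columns, rows) of the A-by-A pixel cells covering the image."""
    if A <= 0:
        raise ValueError("A must be positive")
    # Integer ceiling division, so partial cells at the border are counted.
    return (width + A - 1) // A, (height + A - 1) // A


def cell_count(width, height, A=16):
    """Total number of cells, i.e. the number of places where a feature may be detected."""
    cols, rows = cell_grid(width, height, A)
    return cols * rows
```

For a 640x480 image, A=16 gives 40 x 30 = 1200 cells, while A=2 gives 320 x 240 = 76800 cells.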

• match

Usage: ./match num root [A=16] [fullin=0] [threshold=0.7] [size=5] [minImage=3]

Example
./match.exe 24 ../../data/nskullb/ 2 1 0.5 7 3

match uses the image features detected by affine and reconstructs a set of small rectangular patches (oriented points) representing the surface of an object or a scene of interest. The first 2 arguments are the same as for affine. The third argument A controls the density of the reconstruction. In particular, the program tries to reconstruct one patch in every A by A pixel cell. The smaller the value of A, the denser the result. The fourth argument fullin should be set to 1 if the object or scene is fully contained in all the input images. On the other hand, if the object or scene is not fully contained in some of the images, fullin should be set to 0. The fifth argument threshold is a threshold on the photometric consistency score of each patch. The photometric consistency score is an average normalized cross correlation score of back-projected image textures, and hence ranges from -1.0 to 1.0. The value is typically set to 0.5 or 0.7. The sixth argument size controls the image window size used for the computation of photometric consistency scores. For example, if size = 5, 5x5 = 25 pixels are used to compute the photometric consistency score. The last argument minImage specifies the minimum number of images in which a patch must be visible in order to be reconstructed. minImage must be at least 2. For stability, 3 is much better than 2.
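
The following Python sketch illustrates why the photometric consistency score ranges from -1.0 to 1.0. It is not the code used in match: it computes a plain normalized cross correlation on flattened grey-scale windows, and averaging the scores against a single reference window is an assumption made for illustration. The names ncc and photometric_score are hypothetical.

```python
import math


def ncc(a, b):
    """Normalized cross correlation of two equal-length lists of intensities."""
    if len(a) != len(b) or not a:
        raise ValueError("windows must be non-empty and of equal length")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0.0:
        return 0.0  # flat window: correlation is undefined, treated as 0 here
    return sum(x * y for x, y in zip(da, db)) / denom


def photometric_score(reference, others):
    """Average NCC between a reference window and the windows of the other images."""
    return sum(ncc(reference, w) for w in others) / len(others)
```

With size = 5, each window holds 5x5 = 25 values. Identical textures score 1.0, inverted textures score -1.0, and a patch would be kept only if its averaged score reaches the threshold argument.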

• patchviewer

Usage: ./patchviewer num root

Example
./patchviewer.exe 24 ../../data/nskullb/

patchviewer is a visualization program that renders a set of reconstructed patches with and without texture mapping. The first 2 arguments are the same as for affine and match. Drag with the left mouse button to rotate the patches, drag up and down with the right mouse button to scale them, and drag with both buttons held down to translate them.
There are 4 modes in the viewer: 1) with texture mapping and smooth shading; 2) without texture mapping and with smooth shading; 3) with texture mapping and wireframe; and 4) without texture mapping and with wireframe. Type 'c' to cycle through the modes.
One important note is that patchviewer reads jpeg files for texture mapping, so you must have the images in jpeg format under the "visualize" directory.

• patch2pset

Usage: ./patch2pset num root output

Example
./patch2pset.exe 24 ../../data/nskullb/ ../../data/nskullb/nskullb.pset

patch2pset converts a set of reconstructed patches into a file format that can be loaded by PoissonRecon.32, one of Michael Kazhdan's programs, which computes a mesh model from oriented points (patches). The first 2 arguments are the same as before, while the last argument specifies the name of the output file.

• PoissonRecon

Example
./PoissonRecon.32.exe --in ../../data/nskullb/nskullb.pset --out ../../data/nskullb/nskullb.ply --depth 12

PoissonRecon turns a set of oriented points into a 2D manifold with boundaries represented by a triangulated mesh. This software is publicly available at the following URL together with its detailed usage and explanations: http://www.cs.jhu.edu/~misha/Code/PoissonRecon/

Datasets

I included one dataset in the package. In addition to the input data files, output files are also included. Data files are stored in the directory structure described above. The following is a list of commands I used to get the results in the package.

./affine.exe 24 ../../data/nskullb/ 16
./match.exe 24 ../../data/nskullb/ 2 1 0.5 7 3
./patch2pset.exe 24 ../../data/nskullb/ ../../data/nskullb/nskullb.pset
./PoissonRecon.32.exe --in ../../data/nskullb/nskullb.pset --out ../../data/nskullb/nskullb.ply --depth 12
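
The same sequence of commands can be assembled for another dataset with a small Python sketch like the one below. pipeline_commands and run_pipeline are hypothetical helpers, not part of the package. The sketch assumes that root ends with a slash (as in the examples), that the .pset and .ply files are written directly under root, and that it is run from pmvs/program/main. The default parameter values are the ones used for the skull dataset.

```python
import subprocess


def pipeline_commands(num, root, name, feature_cell=16, patch_cell=2,
                      fullin=1, threshold=0.5, size=7, min_image=3, depth=12):
    """Build the four command lines of the pipeline as argument lists."""
    pset = "%s%s.pset" % (root, name)  # assumes root ends with "/"
    ply = "%s%s.ply" % (root, name)
    return [
        ["./affine.exe", str(num), root, str(feature_cell)],
        ["./match.exe", str(num), root, str(patch_cell), str(fullin),
         str(threshold), str(size), str(min_image)],
        ["./patch2pset.exe", str(num), root, pset],
        ["./PoissonRecon.32.exe", "--in", pset, "--out", ply,
         "--depth", str(depth)],
    ]


def run_pipeline(num, root, name, **params):
    """Run the commands one after another and stop at the first failure."""
    for command in pipeline_commands(num, root, name, **params):
        subprocess.run(command, check=True)
```

Calling run_pipeline(24, "../../data/nskullb/", "nskullb") would run the four commands listed above in order.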


Threads Support

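Our software can utilize multiple threads. affine2.exe and affine4.exe are the 2-thread and 4-thread versions of affine.exe, respectively. Similarly, match2.exe and match4.exe are the 2-thread and 4-thread versions of match.exe, respectively. These versions are used in the same way as the single-thread programs; only the executable name changes.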
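
If you call the programs from a script, the naming scheme above can be captured in a tiny Python helper. executable_name is a hypothetical function, and it only knows the executables named in this section.

```python
def executable_name(program, threads=1):
    """Return the executable for affine or match with 1, 2, or 4 threads."""
    if program not in ("affine", "match"):
        raise ValueError("threaded versions exist only for affine and match")
    if threads not in (1, 2, 4):
        raise ValueError("only 1, 2, or 4 threads are available")
    suffix = "" if threads == 1 else str(threads)
    return "%s%s.exe" % (program, suffix)
```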


Last updated on 2009/02/07.