Painting-to-3D Model Alignment Via Discriminative Visual Elements

Mathieu Aubry, Bryan Russell, Josef Sivic



This paper describes a technique that can reliably align arbitrary 2D depictions of an architectural site, including drawings, paintings and historical photographs, with a 3D model of the site. This is a tremendously difficult task as the appearance and scene structure in the 2D depictions can be very different from the appearance and geometry of the 3D model, e.g., due to the specific rendering style, drawing error, age, lighting or change of seasons. In addition, we face a hard search problem: the number of possible alignments of the painting to a large 3D model, such as a partial reconstruction of a city, is huge. To address these issues, we develop a new compact representation of complex 3D scenes. The 3D model of the scene is represented by a small set of discriminative visual elements that are automatically learnt from rendered views. Similar to object detection, the set of visual elements, as well as the weights of individual features for each element, are learnt in a discriminative fashion. We show that the learnt visual elements are reliably matched in 2D depictions of the scene despite large variations in rendering style (e.g. watercolor, sketch, historical photograph) and structural changes (e.g. missing scene parts, large occluders) of the scene. We demonstrate an application of the proposed approach to automatic re-photography to find an approximate viewpoint of historical paintings and photographs with respect to a 3D model of the site. The proposed alignment procedure is validated via a human user study on a new database of paintings and sketches spanning several sites. The results demonstrate that our algorithm produces significantly better alignments than several baseline methods.



Painting-to-3D Model Alignment Via Discriminative Visual Elements

M. Aubry, B. Russell and J. Sivic

accepted to ACM Transactions on Graphics (TOG), 2013

Download PDF | BibTeX


You can download this video in MP4 format here (19MB).


Our code is available on GitHub.


You can download a zip file with the paintings we used for our evaluation here (40.5 MB). More data, including the 3D models and the discriminative visual elements, is linked from our GitHub repository.


You can download our SIGGRAPH 2014 presentation in PPTX (104 MB, including the videos) or in PDF (6 MB, without the videos and animations). You can also download an older, slightly more technical presentation of our project here (PDF, 243 MB).

Copyright Notice

The documents contained in these directories are included by the contributing authors as a means to ensure timely dissemination of scholarly and technical work on a non-commercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders, notwithstanding that they have offered their works here electronically. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author's copyright.