Publications

  • International peer reviewed conferences:

    • A convex relaxation for weakly supervised classifiers
      Armand Joulin and Francis Bach.
      Proceedings of the International Conference on Machine Learning (ICML), 2012.
      Abstract | BibTeX | PDF

      Abstract

      This paper introduces a general multi-class approach to weakly supervised classification. Inferring the labels and learning the parameters of the model is usually done jointly through a block-coordinate descent algorithm such as expectation-maximization (EM), which may lead to local minima. To avoid this problem, we propose a cost function based on a convex relaxation of the soft-max loss. We then propose an algorithm specifically designed to efficiently solve the corresponding semidefinite program (SDP). Empirically, our method compares favorably to standard ones on different datasets for multiple instance learning and semi-supervised learning, as well as on clustering tasks.

      BibTeX

      @InProceedings{JouBacICML12,
         title = "A convex relaxation for weakly supervised classifiers",
         booktitle = "Proceedings of the International Conference on Machine Learning (ICML)",
         author = "A. Joulin and F. Bach",
         year = "2012"
      }
         
    • Multi-Class Cosegmentation
      Armand Joulin, Francis Bach and Jean Ponce.
      Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR), 2012.
      Abstract | BibTeX | PDF | Supp | Code

      Abstract

      Bottom-up, fully unsupervised segmentation remains a daunting challenge for computer vision. In the cosegmentation context, on the other hand, the availability of multiple images assumed to contain instances of the same object classes provides a weak form of supervision that can be exploited by discriminative approaches. Unfortunately, most existing algorithms are limited to a very small number of images and/or object classes (typically two of each). This paper proposes a novel energy-minimization approach to cosegmentation that can handle multiple classes and a significantly larger number of images. The proposed cost function combines spectral- and discriminative-clustering terms, and it admits a probabilistic interpretation. It is optimized using an efficient EM method, initialized using a convex quadratic approximation of the energy. Comparative experiments show that the proposed approach matches or improves the state of the art on several standard datasets.

      BibTeX

      @InProceedings{JouBacPon12,
         title = "Multi-Class Cosegmentation",
         booktitle = "Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR)",
         author = "A. Joulin and F. Bach and J. Ponce",
         year = "2012"
      }
         
    • A Graph-Matching Kernel for Object Categorization
      Olivier Duchenne, Armand Joulin and Jean Ponce.
      Proceedings of the International Conference on Computer Vision (ICCV), 2011.
      Abstract | BibTeX | PDF

      Abstract

      This paper addresses the problem of category-level image classification. The underlying image model is a graph whose nodes correspond to a dense set of regions, and edges reflect the underlying grid structure of the image and act as springs to guarantee the geometric consistency of nearby regions during matching. A fast approximate algorithm for matching the graphs associated with two images is presented. This algorithm is used to construct a kernel appropriate for SVM-based image classification, and experiments with the Caltech 101, Caltech 256, and Scenes datasets demonstrate performance that matches or exceeds the state of the art for methods using a single type of features.

      BibTeX

      @InProceedings{DucJouPon11,
         title = "A Graph-Matching Kernel for Object Categorization",
         booktitle = "Proceedings of the International Conference on Computer Vision (ICCV)",
         author = "O. Duchenne and A. Joulin and J. Ponce",
         year = "2011"
      }
         
    • Clusterpath: an Algorithm for Clustering using Convex Fusion Penalties
      Toby Dylan Hocking, Armand Joulin, Francis Bach and Jean-Philippe Vert.
      Proceedings of the International Conference on Machine Learning (ICML), 2011.
      Abstract | BibTeX | PDF | Project page | Code

      Abstract

      We present a new clustering algorithm by proposing a convex relaxation of hierarchical clustering, which results in a family of objective functions with a natural geometric interpretation. We give efficient algorithms for calculating the continuous regularization path of solutions, and discuss relative advantages of the parameters. Our method experimentally gives state-of-the-art results similar to spectral clustering for non-convex clusters, and has the added benefit of learning a tree structure from the data.

      BibTeX

      @InProceedings{hocking2011clusterpath,
         title = "Clusterpath: An Algorithm for Clustering using Convex Fusion Penalties",
         booktitle = "Proceedings of the International Conference on Machine Learning (ICML)",
         author = "T.D. Hocking and A. Joulin and F. Bach and J.P. Vert",
         year = "2011"
      }
       
    • Efficient Optimization for Discriminative Latent Class Models
      Armand Joulin, Francis Bach and Jean Ponce.
      Advances in Neural Information Processing Systems (NIPS), 2010.
      Abstract | BibTeX | PDF | Code

      Abstract

      Dimensionality reduction is commonly used in the setting of multi-label supervised classification to control the learning capacity and to provide a meaningful representation of the data. We introduce a simple forward probabilistic model which is a multinomial extension of reduced rank regression, and show that this model provides a probabilistic interpretation of discriminative clustering methods with added benefits in terms of number of hyperparameters and optimization. While the expectation-maximization (EM) algorithm is commonly used to learn these probabilistic models, it usually leads to local maxima because it relies on a non-convex cost function. To avoid this problem, we introduce a local approximation of this cost function, which in turn leads to a quadratic non-convex optimization problem over a product of simplices. In order to maximize quadratic functions, we propose an efficient algorithm based on convex relaxations and low-rank representations of the data, capable of handling large-scale problems. Experiments on text document classification show that the new model outperforms other supervised dimensionality reduction methods, while simulations on unsupervised clustering show that our probabilistic formulation has better properties than existing discriminative clustering methods.

      BibTeX

      @InProceedings{JouBacPonc10_nips,
         title = "Efficient Optimization for Discriminative Latent Class Models",
         booktitle = "Advances in Neural Information Processing Systems (NIPS)",
         author = "A. Joulin and F. Bach and J. Ponce",
         year = "2010"
      } 
      
    • Discriminative Clustering for Image Co-segmentation
      Armand Joulin, Francis Bach and Jean Ponce.
      Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR), 2010.
      Abstract | BibTeX | PDF | Code

      Abstract

      Purely bottom-up, unsupervised segmentation of a single image into foreground and background regions remains a challenging task for computer vision. Co-segmentation is the problem of simultaneously dividing multiple images into regions (segments) corresponding to different object classes. In this paper, we combine existing tools for bottom-up image segmentation such as normalized cuts, with kernel methods commonly used in object recognition. These two sets of techniques are used within a discriminative clustering framework: the goal is to assign foreground/background labels jointly to all images, so that a supervised classifier trained with these labels leads to maximal separation of the two classes. In practice, we obtain a combinatorial optimization problem which is relaxed to a continuous convex optimization problem, that can itself be solved efficiently for up to dozens of images. We illustrate the proposed method on images with very similar foreground objects, as well as on more challenging problems with objects with higher intra-class variations.

      BibTeX

      @InProceedings{JouBacPonc10_cvpr,
         title = "Discriminative Clustering for Image Co-segmentation",
         booktitle = "Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR)",
         author = "A. Joulin and F. Bach and J. Ponce",
         year = "2010"
      }
      
  • Technical reports:

    • Hybrid Deterministic-Stochastic Methods for Data Fitting Addendum: Application to the Hinge Loss
      Mark Schmidt and Armand Joulin.
      Note (2012).
      PDF | Main article by M. Friedlander and M. Schmidt
    • Stock price jumps: news and volume play a minor role
      Armand Joulin, Augustin Lefevre, Daniel Grunberg and Jean-Philippe Bouchaud.
      Technical report (2008).
      PDF