Max-Margin Hough Transform

Subhransu Maji and Jitendra Malik


This page contains details of the paper:
Object Detection Using a Max-Margin Hough Transform.
Subhransu Maji and Jitendra Malik
In Proceedings, CVPR 2009, Miami, USA.
pdf

The Hough transform is a general technique for finding parametric objects such as lines, circles, and bounding boxes, in which local parts vote for the parameters that agree with them. The implicit shape models proposed by Leibe et al. extend the simple Hough transform to deal with probabilistic votes and with parts based on local patch features for object detection.
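As a rough illustration of the voting step (a toy sketch, not the code used in the paper), the snippet below lets each detected part cast weighted votes for possible object centers on a small grid. The function name `hough_vote`, the `offsets` codebook layout, and all numeric values are assumptions made up for this example.

```python
import numpy as np

def hough_vote(part_positions, part_types, offsets, weights, shape):
    """Accumulate weighted votes for object centers on a grid of size `shape`.

    offsets[t] is a list of ((dy, dx), prob) pairs: the displacements from a
    part of type t to the object center, each with a vote probability.
    weights[t] scales every vote cast by parts of type t.
    """
    acc = np.zeros(shape)
    for (y, x), t in zip(part_positions, part_types):
        for (dy, dx), prob in offsets[t]:
            cy, cx = y + dy, x + dx
            # Ignore votes that fall outside the accumulator grid.
            if 0 <= cy < shape[0] and 0 <= cx < shape[1]:
                acc[cy, cx] += weights[t] * prob
    return acc

# Hypothetical codebook: type 0 sits above the center, type 1 to either side.
offsets = {
    0: [((2, 0), 1.0)],
    1: [((0, -3), 0.5), ((0, 3), 0.5)],
}
weights = {0: 1.0, 1: 1.0}  # uniform weights, i.e. no learning yet

positions = [(3, 5), (5, 8), (5, 2)]
types = [0, 1, 1]

acc = hough_vote(positions, types, offsets, weights, (10, 10))
peak = tuple(int(i) for i in np.unravel_index(np.argmax(acc), acc.shape))
print(peak, acc[peak])
```

All three parts agree on the same center, so the accumulator has a single peak there; votes that disagree or fall off the grid contribute nothing to it.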

In this project we build on the idea of implicit shape models and propose a learning scheme to determine the importance of local parts in the voting scheme. In particular, we learn a weight for the votes of each part type so that the detection score at true object locations is maximized relative to the score at incorrect ones. The formulation leads to a convex optimization problem similar to that of a linear SVM and can be solved using a solver like CVX.
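The learning step can be sketched as a hinge-loss problem over per-part-type vote totals. The version below is only an illustration under stated assumptions: it swaps the CVX solver for plain projected subgradient descent in NumPy, assumes the weights are constrained to be non-negative, and uses a made-up activation matrix `A` in which each row holds the votes that each part type contributes at one candidate location. The function name `learn_part_weights` and all hyperparameters are hypothetical.

```python
import numpy as np

def learn_part_weights(A, y, C=1.0, lr=0.01, iters=2000):
    """Projected subgradient descent on a linear-SVM-style objective.

    Minimizes 0.5 * ||w||^2 + C * sum_i max(0, 1 - y_i * (w . A_i + b))
    subject to w >= 0 (assumed constraint for this sketch).

    A: (n_locations, n_part_types) vote totals per part type.
    y: (n_locations,) labels, +1 for true object locations, -1 otherwise.
    """
    n, k = A.shape
    w = np.ones(k)  # start from uniform weights
    b = 0.0
    for _ in range(iters):
        margins = y * (A @ w + b)
        active = margins < 1  # locations that violate the margin
        grad_w = w - C * (y[active, None] * A[active]).sum(axis=0)
        grad_b = -C * y[active].sum()
        # Gradient step, then project the weights onto w >= 0.
        w = np.maximum(w - lr * grad_w, 0.0)
        b -= lr * grad_b
    return w, b

# Toy data: part type 0 votes strongly only at true locations,
# part type 1 votes about equally everywhere.
A = np.array([[2.0, 1.0],
              [1.8, 1.1],
              [0.2, 1.0],
              [0.3, 0.9]])
y = np.array([1.0, 1.0, -1.0, -1.0])

w, b = learn_part_weights(A, y)
scores = A @ w + b
print(w, b, scores)
```

On this toy data the discriminative part type ends up with the larger weight, which is the behavior the learning scheme is meant to produce; a dedicated convex solver would reach the optimum more precisely than this fixed-step loop.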

The framework is quite general and provides a way of performing "spatial feature selection". Compared to feature selection on a bag-of-words model, which models only the counts of a feature type within the object, we take into account both the frequency and the spatial distribution of the parts. Parts that appear often and in a consistent location on the objects are given high weights. In the paper we perform experiments using point descriptors sampled on edges and show that the learning improves the detection rates of the Hough transform. More recently we have combined various parts learned from annotated human skeletons to detect human torsos. For more details, check out the PASCAL VOC 2009 workshop page and the POSELETS page of my colleague Lubomir Bourdev.
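To make the contrast with bag-of-words counting concrete, here is a small toy comparison (an illustrative sketch with invented offsets, not data from the paper). Two hypothetical part types each occur once on four training objects, so their counts are identical, but only one of them points to the object center consistently.

```python
from collections import Counter

# Offsets (dy, dx) from each part occurrence to the object center,
# one occurrence per training object. Values are made up for illustration.
consistent_part = [(0, 5), (0, 5), (0, 5), (0, 5)]
scattered_part = [(0, 5), (3, -2), (-4, 1), (2, 2)]

def bag_of_words_count(offsets):
    """A bag-of-words model only sees how often the part occurs."""
    return len(offsets)

def hough_peak(offsets):
    """Largest number of occurrences that agree on one center offset."""
    return max(Counter(offsets).values())

for name, part in [("consistent", consistent_part), ("scattered", scattered_part)]:
    print(name, bag_of_words_count(part), hough_peak(part))
```

Both parts look equally useful by count alone, while the vote peak separates them, which is the kind of spatial evidence the learned weights can exploit.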