Abstract

We propose a mid-level statistical model for image segmentation that composes multiple figure-ground hypotheses (FG) obtained by applying constraints at different locations and scales, into larger interpretations (tilings) of the entire image. Inference is cast as optimization over sets of maximal cliques sampled from a graph connecting all non-overlapping figure-ground segment hypotheses. Potential functions over cliques combine unary, Gestalt-based figure qualities, and pairwise compatibilities among spatially neighboring segments, constrained by T-junctions and the boundary interface statistics of real scenes. Learning the model parameters is based on maximum likelihood, alternating between sampling image tilings and optimizing their potential function parameters. State of the art results are reported on the Berkeley and Stanford segmentation datasets, as well as VOC2009, where a 28% improvement was achieved.
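
The clique formulation above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes segments are given as plain sets of pixel indices, enumerates maximal cliques exhaustively with a basic Bron-Kerbosch recursion instead of sampling them, and uses hand-picked scores where the paper learns potential parameters. The names `build_compatibility_graph`, `maximal_cliques`, `tiling_score`, and `best_tiling` are hypothetical.

```python
from itertools import combinations


def build_compatibility_graph(segments):
    """Connect every pair of segment hypotheses that share no pixels."""
    adj = {i: set() for i in range(len(segments))}
    for i, j in combinations(range(len(segments)), 2):
        if not (segments[i] & segments[j]):
            adj[i].add(j)
            adj[j].add(i)
    return adj


def maximal_cliques(adj):
    """Enumerate all maximal cliques (basic Bron-Kerbosch, no pivoting).

    Assumption: exhaustive enumeration is only feasible for toy inputs;
    the abstract describes sampling cliques instead.
    """
    found = []

    def expand(r, p, x):
        if not p and not x:
            found.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return found


def tiling_score(clique, unary, pairwise):
    """Sum unary scores plus pairwise scores over segment pairs in the clique.

    Assumption: `unary` and `pairwise` are illustrative stand-ins for the
    Gestalt-based figure qualities and neighbor compatibilities.
    """
    total = sum(unary[i] for i in clique)
    for i, j in combinations(sorted(clique), 2):
        total += pairwise.get((i, j), 0.0)
    return total


def best_tiling(segments, unary, pairwise):
    """Return the highest-scoring maximal clique of compatible segments."""
    cliques = maximal_cliques(build_compatibility_graph(segments))
    return max(cliques, key=lambda c: tiling_score(c, unary, pairwise))


# Toy example: five hypotheses over a four-pixel "image" (pixels 0..3).
segments = [{0, 1}, {2, 3}, {1, 2}, {0}, {3}]
unary = [2.0, 2.0, 1.0, 0.5, 0.5]   # hypothetical figure qualities
pairwise = {(0, 1): 1.0}            # hypothetical compatibility bonus
tiling = best_tiling(segments, unary, pairwise)
```

In this toy input, segments 0 and 1 are disjoint and together cover all four pixels, so the clique containing both is selected as the tiling. A maximal clique need not cover every pixel; here it only guarantees that no further non-overlapping segment can be added.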

Reference

Ion, A., Carreira, J., & Sminchisescu, C. (2011). Image Segmentation by Figure-Ground Composition into Maximal Cliques. In 13th IEEE International Conference on Computer Vision (pp. 2110–2117). IEEE. http://hdl.handle.net/20.500.12708/53259