In this paper we propose a generic framework for incorporating unobserved auxiliary information into the classification of objects and actions. The framework automatically selects a bounding box, together with its quadrants, from which to extract features. These spatial subdivisions are learnt as latent variables. The paper is an extended version of our earlier work, complemented with additional ideas, experiments and analysis.
We approach the classification problem in a discriminative setting, learning a max-margin classifier that infers the class label along with the latent variables. In this paper we make the following contributions: a) we provide a method for incorporating latent variables into object and action classification; b) these variables determine the relative focus on foreground vs. background information that is taken into account; c) we design an objective function that learns more effectively from unbalanced data sets; d) we learn a better classifier by iteratively expanding the latent parameter space. We demonstrate the performance of our approach through experimental evaluation on a number of standard object and action recognition data sets.
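The joint inference described above, choosing the class label together with the latent window, can be illustrated with a minimal sketch. This is not the implementation from the paper: the coarse grid of candidate windows, the sum pooling over quadrants, the per-class linear weights and the names `candidate_windows`, `window_features` and `infer` are all assumptions made for illustration, and training of the max-margin classifier is not shown.

```python
import itertools

import numpy as np


def candidate_windows(height, width, steps=3):
    """Enumerate boxes (top, left, bottom, right) on a coarse grid.

    Assumption: the latent space is a small discrete set of windows.
    """
    ys = np.linspace(0, height, steps + 1, dtype=int)
    xs = np.linspace(0, width, steps + 1, dtype=int)
    return [(t, l, b, r)
            for t, b in itertools.combinations(ys, 2)
            for l, r in itertools.combinations(xs, 2)]


def window_features(feature_map, window):
    """Sum-pool a (H, W, D) feature map over the four quadrants of a window."""
    t, l, b, r = window
    mid_y, mid_x = (t + b) // 2, (l + r) // 2
    quadrants = [feature_map[t:mid_y, l:mid_x], feature_map[t:mid_y, mid_x:r],
                 feature_map[mid_y:b, l:mid_x], feature_map[mid_y:b, mid_x:r]]
    # Concatenate the four pooled D-dimensional vectors into one 4*D vector.
    return np.concatenate([q.sum(axis=(0, 1)) for q in quadrants])


def infer(feature_map, weights):
    """Jointly pick the label y and latent window h maximising w_y . phi(x, h).

    weights has shape (num_classes, 4 * D); returns (label, window, score).
    """
    windows = candidate_windows(*feature_map.shape[:2])
    feats = np.stack([window_features(feature_map, h) for h in windows])
    scores = feats @ weights.T  # shape: (num_windows, num_classes)
    h_idx, y = np.unravel_index(np.argmax(scores), scores.shape)
    window = tuple(int(v) for v in windows[h_idx])
    return int(y), window, float(scores[h_idx, y])
```

Calling `infer(feature_map, weights)` on a `(H, W, D)` array of local descriptors returns the highest-scoring label together with the window that produced that score. In a latent max-margin setting, the same maximisation over windows would also be used during training to fill in the unobserved window for each example.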
Recommended citation: H. Bilen, V.P. Namboodiri and L. Van Gool (2014), “Object and Action Classification with Latent Window Parameters”, International Journal of Computer Vision (IJCV), Vol. 106, pp. 237-251, February 2014.