Bag of words. Adapt the feature extraction and matching pipeline developed in
Exercise 14.8 to category (class) recognition, using some of the techniques
described in Section 14.4.1.
1\. Download the training and test images from one or more of the databases
listed in Tables 14.1 and 14.2, e.g., Caltech 101, Caltech 256, or PASCAL VOC.
2\. Extract features from each of the training images, quantize them, and
compute the tf-idf vectors (bag of words histograms).
3\. As an option, consider not quantizing the features and using pyramid
matching (14.40–14.41) (Grauman and Darrell 2007b) or using a
spatial pyramid for greater selectivity (Lazebnik, Schmid, and Ponce 2006).
4\. Choose a classification algorithm (e.g., nearest neighbor classification
or support vector machine) and "train" your recognizer, i.e., build up the
appropriate data structures (e.g., k-d trees) or set the
appropriate classifier parameters.
5\. Test your algorithm on the test data set using the same pipeline you
developed in steps 2–4 and compare your results to the best reported
results.
6\. Explain why your results differ from the previously reported ones and give
some ideas for how you could improve your system.
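Step 2 can be sketched as follows. This is a minimal illustration, not a reference implementation: it assumes the visual vocabulary (cluster centers) has already been computed, for example by k-means, and it uses the common definition in which the term frequency is the fraction of an image's features assigned to a word and the inverse document frequency is log(N / N_i), where N_i is the number of images containing word i. The function names `quantize` and `tf_idf_vectors` and the toy two-dimensional descriptors are hypothetical.

```python
import math
from collections import Counter


def quantize(descriptor, vocabulary):
    """Return the index of the visual word closest to the descriptor."""
    def sq_dist(word):
        return sum((d - w) ** 2 for d, w in zip(descriptor, word))
    return min(range(len(vocabulary)), key=lambda k: sq_dist(vocabulary[k]))


def tf_idf_vectors(images, vocabulary):
    """Compute one tf-idf vector per image.

    images: list of images, each given as a list of descriptors.
    vocabulary: list of visual words (assumed to be precomputed centers).
    Returns (vectors, idf).
    """
    n_words = len(vocabulary)
    # Word counts per image (the raw bag-of-words histograms).
    counts = [Counter(quantize(d, vocabulary) for d in descs)
              for descs in images]
    # Number of images in which each word occurs at least once.
    doc_freq = [sum(1 for c in counts if k in c) for k in range(n_words)]
    # Words that never occur get zero weight instead of a division by zero.
    idf = [math.log(len(images) / df) if df else 0.0 for df in doc_freq]
    vectors = []
    for c, descs in zip(counts, images):
        total = len(descs)
        vectors.append([(c[k] / total) * idf[k] if total else 0.0
                        for k in range(n_words)])
    return vectors, idf


# Toy data (hypothetical): three visual words, two images.
vocabulary = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
images = [
    [[0.1, 0.0], [0.9, 1.0], [0.0, 0.2]],  # words 0, 1, 0
    [[5.0, 5.1], [1.0, 0.9]],              # words 2, 1
]
vectors, idf = tf_idf_vectors(images, vocabulary)
```

In this toy example, word 1 occurs in both images, so its idf weight is zero and it contributes nothing to either vector. The brute-force search in `quantize` is only practical for small vocabularies; a real system would likely need a faster nearest-neighbor structure.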
You can find a good synopsis of the best-performing classification algorithms
and their approaches in the report of the PASCAL Visual Object Classes
Challenge, available on its Web site
(http://pascallin.ecs.soton.ac.uk/challenges/VOC/).
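As a starting point for steps 4 and 5, the sketch below classifies each test vector by its nearest training vector and reports the fraction classified correctly. It is only one possible baseline: the choice of cosine similarity, the brute-force search (instead of a k-d tree), the function names, the category labels, and the toy vectors are all assumptions made for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def classify_nearest_neighbor(query, train_vectors, train_labels):
    """Return the label of the training vector most similar to the query."""
    best = max(range(len(train_vectors)),
               key=lambda i: cosine_similarity(query, train_vectors[i]))
    return train_labels[best]


def accuracy(test_vectors, test_labels, train_vectors, train_labels):
    """Fraction of test vectors whose predicted label matches the true one."""
    correct = sum(
        classify_nearest_neighbor(v, train_vectors, train_labels) == label
        for v, label in zip(test_vectors, test_labels)
    )
    return correct / len(test_labels)


# Toy data (hypothetical): one training vector per category.
train_vectors = [[0.9, 0.1, 0.0], [0.0, 0.2, 0.8]]
train_labels = ["car", "face"]
test_vectors = [[0.7, 0.3, 0.0], [0.1, 0.0, 0.9], [0.6, 0.1, 0.3]]
test_labels = ["car", "face", "face"]  # the third is closer to "car"

score = accuracy(test_vectors, test_labels, train_vectors, train_labels)
```

The third test vector is deliberately labeled so that the classifier gets it wrong, which makes the toy accuracy two out of three. Published comparisons often report per-class measures rather than a single overall accuracy, so check which metric a benchmark uses before comparing numbers.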