This is an Open Access publication published under CSC-OpenAccess Policy.
Speeded-up and Compact Visual Codebook for Object Recognition
Barathy Mayurathan, Amirthalingam Ramanan, Sinnathamby Mahesan, U.A.J. Pinidiyaarachchi
Pages - 31 - 50     |    Revised - 15-01-2013     |    Published - 28-02-2013
Volume - 7   Issue - 1    |    Publication Date - February 2013
Object recognition, Codebook, K-means, RAC, fast-RNN, SIFT, SURF
The well-known framework in the object recognition literature uses local information extracted at several patches in images, which is then clustered by a suitable clustering technique. A visual codebook maps the patch-based descriptors into a fixed-length vector in histogram space to which standard classifiers can be directly applied. Thus, the construction of a codebook is an important step, which is usually done by cluster analysis. However, it is still difficult to construct a compact codebook with reduced computational cost. This paper evaluates the effectiveness and generalisation performance of the Resource-Allocating Codebook (RAC) approach, which overcomes the problem of constructing fixed-size codebooks: it can be used at any time in the learning process, and the learning patterns do not have to be repeated. It either allocates a new codeword based on the novelty of a newly seen pattern, or adapts the codebook to fit that observation. Furthermore, we improve RAC to yield codebooks that are more compact. We compare and contrast the recognition performance of RAC evaluated with two distinctive feature descriptors, SIFT and SURF, and two clustering techniques, K-means and Fast Reciprocal Nearest Neighbours (fast-RNN). An SVM is used to classify the image signatures. The entire visual object recognition pipeline has been tested on three benchmark datasets: the PASCAL Visual Object Classes Challenge 2007, UIUC texture, and MPEG-7 Part-B silhouette image datasets. Experimental results show that RAC is suitable for constructing codebooks due to its wider span of the feature space. Moreover, RAC takes only one pass through the entire data and slightly outperforms traditional approaches at drastically reduced computing times. The modified RAC performs slightly better than RAC and gives a more compact codebook.
Future research should focus on designing more discriminative and compact codebooks such as RAC rather than focusing on methods tuned to achieve high performance in classification.
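The allocate-or-adapt behaviour described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the novelty test (Euclidean distance to the nearest codeword exceeding a hypothetical `radius`), the adaptation rule (moving the nearest codeword towards the pattern by a hypothetical `rate`), and all function names are assumptions made for illustration. The `encode` function shows how a codebook turns a set of patch descriptors into a fixed-length histogram that a standard classifier such as an SVM could consume.

```python
import math
from typing import List, Sequence, Tuple

Vector = List[float]


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two descriptors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(codebook: List[Vector], x: Sequence[float]) -> Tuple[int, float]:
    """Return (index, distance) of the codeword closest to x."""
    best_i, best_d = 0, euclidean(codebook[0], x)
    for i in range(1, len(codebook)):
        d = euclidean(codebook[i], x)
        if d < best_d:
            best_i, best_d = i, d
    return best_i, best_d


def build_codebook(descriptors: List[Vector], radius: float,
                   rate: float = 0.1) -> List[Vector]:
    """Single pass over the descriptors (assumed novelty and adaptation rules).

    A descriptor farther than `radius` from every codeword is treated as novel
    and becomes a new codeword; otherwise the nearest codeword is nudged
    towards it by `rate` (rate=0.0 disables adaptation).
    """
    codebook: List[Vector] = []
    for x in descriptors:
        if not codebook:
            codebook.append(list(x))  # the first pattern seeds the codebook
            continue
        i, d = nearest(codebook, x)
        if d > radius:
            codebook.append(list(x))  # novel pattern: allocate a codeword
        elif rate > 0.0:
            # familiar pattern: adapt the nearest codeword towards it
            codebook[i] = [c + rate * (v - c) for c, v in zip(codebook[i], x)]
    return codebook


def encode(descriptors: List[Vector], codebook: List[Vector]) -> List[int]:
    """Map descriptors to a fixed-length histogram of nearest-codeword counts."""
    hist = [0] * len(codebook)
    for x in descriptors:
        i, _ = nearest(codebook, x)
        hist[i] += 1
    return hist


# Toy 2-D "descriptors" forming two well-separated groups.
descs = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 4.9], [0.0, 0.2]]
cb = build_codebook(descs, radius=1.0)
signature = encode(descs, cb)
print(len(cb), signature)
```

In this sketch the codebook size is not fixed in advance; it grows with the data and depends on `radius`, so a larger radius yields a more compact codebook at the cost of coarser quantisation.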
CITED BY (1)  
1 Mayurathan, B., Pinidiyaarachchi, U. A. J., & Niranjan, M. (2013, September). Compact codebook design for visual scene recognition by Sequential Input Space Carving. In Machine Learning for Signal Processing (MLSP), 2013 IEEE International Workshop on (pp. 1-6). IEEE.
Mr. Barathy Mayurathan
University of Peradeniya - Sri Lanka
Dr. Amirthalingam Ramanan
University of Jaffna - Sri Lanka
Dr. Sinnathamby Mahesan
University of Jaffna - Sri Lanka
Dr. U.A.J. Pinidiyaarachchi
University of Peradeniya - Sri Lanka