
Graduate student: TRUONG TAN LOC
Thesis title: A COMPARATIVE STUDY ON MULTI-VIEW OBJECT RECOGNITION WITH APPEARANCE FEATURES AND SILHOUETTE ALIGNMENT
Advisor: 徐繼聖 (Gee-Sern Hsu)
Oral defense committee: 洪一平 (Yi-Ping Hung), 李明穗 (Ming-Sui Lee), 郭景明 (Jing-Ming Guo)
Degree: Master
Department: College of Engineering, Department of Mechanical Engineering
Year of publication: 2012
Academic year of graduation: 100
Language: English
Number of pages: 72
Keywords: Object recognition, feature extraction, appearance features
  • The features extracted for object recognition fall broadly into two
    categories: local invariant and appearance-based. The former is
    commonly selected for the recognition of generic objects, while the
    latter is a popular choice for the recognition of specific objects
    such as faces. Because most specific objects can be aligned to
    appearance features, the deviations from the aligned features
    provide appearance characteristics that are useful for recognition.
    Such an alignment can be difficult to define for generic objects.
    As a result, works in the literature that recognize generic objects
    using appearance features are significantly outnumbered by those
    using local invariant features. The performance of many appearance
    features and associated classifiers on face recognition has been
    widely studied and reported; their performance on generic object
    recognition, however, has been studied only in a limited scope. To
    extend our understanding in this regard and to determine which
    appearance features and classifiers are suitable for generic object
    recognition, this thesis reports a comprehensive comparison study
    in which different combinations of features and classifiers are
    evaluated on a benchmark database. To detect and segment the object
    of interest from a scene, which is often the first step in object
    recognition, we propose a scheme called Silhouette Alignment that
    aligns the features extracted from a test image to those in the
    database. Although the appearance features considered in this study
    are holistic, the comparison also includes SIFT (Scale Invariant
    Feature Transform), one of the most popular local invariant
    features for object recognition, as a reference against which to
    assess the appearance-based features and associated classifiers.
    Experiments on the COIL-100 database show that DCT features with a
    Naive Bayesian classifier give the best performance among the
    combinations tested on object recognition across viewpoints. SIFT
    outperforms most appearance features when the image quality is
    degraded by noise. However, a few classifiers with appearance
    features outperform SIFT both in noise-free conditions and with
    cluttered backgrounds.
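To make the best-performing combination named in the abstract more concrete, the following is a minimal sketch of DCT appearance features fed to a Gaussian Naive Bayes classifier. It is not the thesis implementation: the abstract does not say which DCT coefficients are kept or how the classifier is parameterized, so the low-frequency `keep` block, the variance floor `1e-6`, the toy 4x4 images, and all function and class names here are assumptions made for illustration only.

```python
import math


def dct2(block):
    """Orthonormal 2-D DCT-II of a square image given as a list of rows."""
    n = len(block)
    # basis[k][i] = alpha(k) * cos(pi * (2i + 1) * k / (2n))
    basis = [
        [
            (math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n))
            * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i in range(n)
        ]
        for k in range(n)
    ]
    # Transform along the vertical axis first, then the horizontal axis.
    tmp = [[sum(basis[u][i] * block[i][j] for i in range(n)) for j in range(n)]
           for u in range(n)]
    return [[sum(tmp[u][j] * basis[v][j] for j in range(n)) for v in range(n)]
            for u in range(n)]


def dct_features(image, keep=2):
    """Assumed feature choice: the keep x keep lowest-frequency coefficients."""
    coeffs = dct2(image)
    return [coeffs[u][v] for u in range(keep) for v in range(keep)]


class GaussianNaiveBayes:
    """Naive Bayes with one independent Gaussian per feature and class."""

    def fit(self, X, y):
        self.stats = {}
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            columns = list(zip(*rows))
            means = [sum(col) / len(rows) for col in columns]
            # The small constant is an assumed floor that avoids zero variance.
            variances = [
                sum((v - m) ** 2 for v in col) / len(rows) + 1e-6
                for col, m in zip(columns, means)
            ]
            self.stats[label] = (math.log(len(rows) / len(X)), means, variances)
        return self

    def predict(self, x):
        def log_posterior(label):
            log_prior, means, variances = self.stats[label]
            return log_prior + sum(
                -0.5 * math.log(2 * math.pi * s) - (v - m) ** 2 / (2 * s)
                for v, m, s in zip(x, means, variances)
            )
        return max(self.stats, key=log_posterior)


def make_image(kind, shift):
    """Hypothetical 4x4 toy object; `shift` stands in for a change of view."""
    if kind == "flat":
        return [[10.0 + shift] * 4 for _ in range(4)]
    return [[(10.0 if j % 2 == 0 else 0.0) + shift for j in range(4)]
            for _ in range(4)]


# Train on three "views" per toy object.
train_X, train_y = [], []
for kind in ("flat", "stripes"):
    for shift in (0.0, 1.0, 2.0):
        train_X.append(dct_features(make_image(kind, shift)))
        train_y.append(kind)

model = GaussianNaiveBayes().fit(train_X, train_y)
```

Calling `model.predict(dct_features(make_image("stripes", 1.5)))` classifies an unseen view of the striped toy object. The sketch only shows how a holistic feature vector and a per-class probabilistic model fit together; real images such as those in COIL-100 would need a larger coefficient block and far more training views.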



    Abstract
    Acknowledgements
    Table of Contents
    List of Figures
    List of Tables
    1 Introduction
      1.1 Issues and Motivations
      1.2 Related Works
      1.3 Outline of Approach
        1.3.1 Learning phase
        1.3.2 Recognition phase
      1.4 Contributions of this thesis
    2 Object Detection and Segmentation
      2.1 Silhouette and Principal Alignment Axis
      2.2 Multi-view Feature Extraction from a Training Set
      2.3 Object Detection and Segmentation using Silhouette Alignment
    3 Appearance Features and Classifiers
      3.1 Involved Appearance Features of this Study
        3.1.1 Principal Component Analysis
        3.1.2 Discrete Cosine Transform
        3.1.3 Gabor Transform
      3.2 Classifiers of Interest
        3.2.1 k-Nearest Neighbors
        3.2.2 Linear Discriminant Analysis
        3.2.3 Naive Bayesian
        3.2.4 Artificial Neural Network
        3.2.5 Support Vector Machine
    4 Experiment
      4.1 COIL-100
        4.1.1 Introduction
        4.1.2 Standard Scenario
        4.1.3 Scenarios with noisy data, cluttered background and imposters
        4.1.4 Discussion on COIL-100
      4.2 Object Recognition Database
        4.2.1 Introduction
        4.2.2 Result
      4.3 Experiment on Real Time System
        4.3.1 Introduction
        4.3.2 Result
    5 Conclusion
      5.1 Discussion
      5.2 Future works
    Bibliography

