
Graduate student: 張凱翔 (Kai-Hsiang Chang)
Thesis title: 基於雙層式元件模型之人臉地標點定位與角度估測 (Bilayer Part-based Model for Facial Landmark Detection and Pose Estimation)
Advisor: 徐繼聖 (Gee-Sern Hsu)
Oral examination committee: 莊仁輝 (Jen-Hui Chuang), 賴尚宏 (Shang-Hong Lai), 郭景明 (Jing-Ming Guo)
Degree: Master
Department: 工程學院 - 機械工程系 (College of Engineering - Department of Mechanical Engineering)
Year of publication: 2015
Academic year of graduation: 103
Language: Chinese
Number of pages: 76
Keywords (Chinese): 人臉地標點定位, 人臉角度估測, 雙層式樹狀結構模型, 元件模型, 樹狀結構模型
Keywords (English): Facial Landmark Location, pose estimate, bilayer tree structured model, part based model, tree structured model
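  • Unlike earlier practical systems that treat face detection, pose estimation,
    and landmark localization separately, the Tree Structured Model (TSM) solves
    all three problems with a single unified model. However, it requires a large
    amount of computation (about 10 seconds per image on average at 640*480) and
    cannot detect small faces (smaller than 80*80), so it does not meet the
    requirements of practical systems.
    This study is the first to propose a two-layer TSM, called BTSM, to address
    the detection speed and small-face problems. The two-layer TSM consists of a
    small-scale model and a fine model. The former is built from a small number
    of parts and trained on smaller face images, and it can detect faces as small
    as 50*50. When the small-scale model detects a sufficiently large face, the
    second layer performs more accurate localization with more landmarks, without
    convolution through the image pyramid. In tests on several benchmark
    databases, BTSM is more than 30 times faster than the original TSM without
    losing the advantages of TSM. This thesis also provides a complete comparison
    of accuracy and runtime for TSM models under different parameter settings on
    different benchmark databases, confirming that the proposed method is highly
    competitive and can meet the needs of practical systems.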


    The Tree Structured Model (TSM) has proven effective for face detection, landmark localization, and pose estimation. It is one of the few approaches that handle all three tasks with a single unified model. However, its heavy computation makes it too slow for real-time applications, and it cannot detect faces smaller than 80x80. This study proposes a bilayer structure, the Bilayer Tree Structured Model (BTSM), to address both issues. The BTSM includes a down-scaled model that has fewer parts and is trained on down-scaled samples, so it can detect faces as small as 50x50. When the down-scaled model finds a face of sufficient size, it activates a full-scale model that locates more landmarks without performing convolution through the image pyramid. In comparisons on various databases, the BTSM can run 30x faster than the original TSM while retaining almost all of its advantages.
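
    The two-layer control flow described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis implementation: the names `Detection`, `bilayer_detect`, `fake_coarse`, and `fake_fine` are hypothetical, the two detectors are stand-in stubs, and the 80-pixel threshold for "sufficient size" is borrowed from the 80x80 limit quoted for the original TSM, which the abstract does not confirm as the actual switching rule.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height) in pixels
Point = Tuple[float, float]


@dataclass
class Detection:
    box: Box
    landmarks: List[Point] = field(default_factory=list)
    refined: bool = False  # True if the second (full-scale) layer ran


# Assumed threshold: the abstract only says "sufficient size"; 80 is borrowed
# from the 80x80 limit it quotes for the original TSM.
MIN_FINE_SIZE = 80


def bilayer_detect(
    image,
    coarse_detect: Callable[[object], List[Detection]],
    fine_localize: Callable[[object, Box], List[Point]],
    min_fine_size: int = MIN_FINE_SIZE,
) -> List[Detection]:
    """Run the coarse layer on the image, the fine layer only on large faces."""
    results = []
    for det in coarse_detect(image):
        _, _, w, h = det.box
        if min(w, h) >= min_fine_size:
            # The second layer is applied to the box found by the first layer,
            # so it does not scan an image pyramid again.
            det = Detection(det.box, fine_localize(image, det.box), refined=True)
        results.append(det)
    return results


def fake_coarse(image) -> List[Detection]:
    """Stub first layer: one small face and one large face, few landmarks."""
    return [
        Detection((10, 10, 50, 50), [(35.0, 35.0)]),
        Detection((100, 40, 120, 120), [(160.0, 100.0)]),
    ]


def fake_fine(image, box: Box) -> List[Point]:
    """Stub second layer: returns four dummy landmarks inside the box."""
    x, y, w, h = box
    offsets = [(0.3, 0.4), (0.7, 0.4), (0.5, 0.6), (0.5, 0.8)]
    return [(x + w * fx, y + h * fy) for fx, fy in offsets]


if __name__ == "__main__":
    for face in bilayer_detect(None, fake_coarse, fake_fine):
        print(face.box, "refined" if face.refined else "coarse only",
              len(face.landmarks), "landmarks")
```

    In this sketch the 50x50 face keeps only the coarse landmarks, while the 120x120 face is passed to the second layer. Passing the detectors in as callables keeps the gating logic separate from whatever models are actually used.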

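    Abstract (Chinese)
    Abstract (English)
    Acknowledgments
    Table of Contents
    List of Figures
    List of Tables
    List of Algorithms
    Chapter 1 Introduction
    1.1 Research Background and Motivation
    1.1.1 Landmark Localization
    1.1.2 Face Detection
    1.1.3 Pose Estimation
    1.2 Method Overview
    1.3 Contributions
    1.4 Thesis Organization
    Chapter 2 Related Work
    2.1 Theory of Landmark Localization
    2.1.1 Active Shape Models (ASMs)
    2.1.2 Active Appearance Models (AAMs)
    2.1.3 Constrained Local Models (CLMs)
    2.2 Face Detection
    2.2.1 Adaboost
    2.2.2 Quasi-random weighted sampling + trimming
    2.3 Pose Estimation
    2.3.1 Tree Structured Part Model
    Chapter 3 Main Method and Procedure
    3.1 Deformable Part Model (DPM)
    3.1.1 Introduction to the Method
    3.2 Tree Structured Part Model (TSM)
    3.2.1 Training Module
    3.2.2 Detection Module
    3.3 Bilayer TSM
    3.4 Discussion of TSM Parameters
    3.5 Principal Component Analysis for Acceleration
    Chapter 4 Experimental Setup and Analysis
    4.1 Benchmark Databases
    4.1.1 Multi-PIE
    4.1.2 LFPW
    4.1.3 AFW
    4.2 Experimental Design
    4.3 Experimental Results and Analysis
    4.3.1 HOG Feature Parameter Settings
    4.3.2 Varying the Number of Parts
    4.3.3 Root Position in the Tree Structure
    4.3.4 Relationship Between Training Image Size and Part Size
    4.3.5 Coarse TSM Performance Comparison
    4.3.6 Small-Face Detection Performance Comparison
    4.3.7 PCA Dimensionality Reduction: Performance and Time Comparison
    4.3.8 Performance Comparison with Related Work
    Chapter 5 Real-Time System Implementation and Performance Evaluation
    5.1 System Architecture
    Chapter 6 Conclusions and Future Work
    References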

