Graduate student: | 張凱翔 Kai-Hsiang Chang |
---|---|
Thesis title: | 基於雙層式元件模型之人臉地標點定位與角度估測 Bilayer Part-based Model for Facial Landmark Detection and Pose Estimation |
Advisor: | 徐繼聖 Gee-Sern Hsu |
Committee members: | 莊仁輝 Jen-Hui Chuang, 賴尚宏 Shang-Hong Lai, 郭景明 Jing-Ming Guo |
Degree: | Master |
Department: | College of Engineering - Department of Mechanical Engineering |
Publication year: | 2015 |
Graduation academic year: | 103 (ROC calendar) |
Language: | Chinese |
Pages: | 76 |
Chinese keywords: | 人臉地標點定位, 人臉角度估測, 雙層式樹狀結構模型, 元件模型, 樹狀結構模型 |
English keywords: | Facial Landmark Location, pose estimate, bilayer tree structured model, part based model, tree structured model |
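Unlike earlier approaches that handle face detection, pose estimation, and landmark localization separately in practical systems, the Tree Structured Model (TSM) addresses all three tasks with a single unified model. However, it is computationally expensive (about 10 seconds per image on average at 640*480 resolution) and cannot detect small faces (below 80*80), so it does not meet the requirements of practical systems.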
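This study proposes, for the first time, a two-layer TSM, called BTSM, to address both the detection speed and the small-face problem. The BTSM consists of a small-scale model and a fine model. The former is built from a small number of parts and trained on smaller face images, and can detect faces as small as 50*50. When the small-scale model detects a sufficiently large face, the second layer is activated to localize a larger number of landmarks more precisely, without convolving through an image pyramid. In tests on several standard databases, the BTSM runs more than 30 times faster than the original TSM while retaining the TSM's advantages. This thesis also provides a comprehensive comparison of the accuracy and runtime of the TSM under different parameter settings on different standard databases, confirming that the proposed method is highly competitive and suitable for practical systems.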
The Tree Structured Model (TSM) has proven effective for face detection, landmark localization, and pose estimation. It is one of the few approaches that solve all three problems with a single unified model. However, its heavy computation makes it too slow for real-time applications, and it cannot detect faces smaller than 80x80. This study proposes a bilayer structure, the Bilayer Tree Structured Model (BTSM), to address both issues. The BTSM includes a down-scaled model that has fewer parts and is trained on down-scaled samples, and can therefore detect faces as small as 50x50. When the down-scaled model finds a sufficiently large face, it activates a full-scale model that locates more landmarks without convolving through the image pyramid. In comparisons on several databases, the BTSM can be 30x faster than the original TSM while retaining almost all of its advantages.
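The two-layer control flow described in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: `coarse_detect` and `fine_localize` are hypothetical callables standing in for the down-scaled and full-scale models, and the `min_fine_size=80` threshold is an assumption, since the abstract only says the second layer runs on faces of "sufficient" size.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]        # (x, y, width, height) in image pixels
Point = Tuple[float, float]            # (x, y) landmark position
Image = Sequence[Sequence[float]]      # row-major grayscale pixels


@dataclass
class Detection:
    box: Box
    landmarks: List[Point]
    refined: bool  # True if the second (full-scale) layer was run


def crop(image: Image, box: Box) -> List[Sequence[float]]:
    """Cut the face region out of the image."""
    x, y, w, h = box
    return [row[x:x + w] for row in image[y:y + h]]


def bilayer_detect(
    image: Image,
    coarse_detect: Callable[[Image], List[Tuple[Box, List[Point]]]],
    fine_localize: Callable[[Image], List[Point]],
    min_fine_size: int = 80,  # assumed threshold, not taken from the thesis
) -> List[Detection]:
    """Run the coarse layer on the whole image, then refine large faces only."""
    results = []
    # Layer 1 (hypothetical): few parts, scans the image, finds small faces.
    for box, coarse_landmarks in coarse_detect(image):
        x, y, w, h = box
        if min(w, h) >= min_fine_size:
            # Layer 2 (hypothetical): runs once on the cropped face at a
            # single scale, so no image pyramid is built here.
            local = fine_localize(crop(image, box))
            # Map patch coordinates back to full-image coordinates.
            landmarks = [(x + px, y + py) for px, py in local]
            results.append(Detection(box, landmarks, True))
        else:
            # Face too small for the full model: keep the coarse landmarks.
            results.append(Detection(box, list(coarse_landmarks), False))
    return results
```

In this sketch the expensive model only ever sees cropped face regions, which reflects the source of the speed-up the abstract describes; the actual gain depends on the models and data used.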