Simple Search / Detailed Record Display

Graduate Student: 吳明輝 (Ming-hui Wu)
Thesis Title: A Facial Expression Recognition System Based on the Facial Landmarks Extracted from an Image Sequence (一個從影像序列中擷取臉部特徵點的人臉表情辨識系統)
Advisor: Chin-Shyurng Fahn (范欽雄)
Committee Members: 邱舉明, 吳育德, 賈叢林, 李建德
Degree: Master
Department: Department of Computer Science and Information Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2008
Graduation Academic Year: 96 (2007-2008)
Language: English
Number of Pages: 100
Keywords: face detection, facial feature extraction, facial expression recognition, neural network, support vector machine, AdaBoost algorithm
    In recent years, the growing demand for more efficient and friendlier human-computer interfaces has driven rapid growth in research related to face processing. Beyond providing services to people, the most important characteristic of a practical system is its ability to interact with people autonomously, which has made facial expression recognition a hot topic that many researchers are actively developing. In this thesis, we propose a system with automatic facial expression recognition capability, consisting of three main procedures. First, candidate face regions are extracted from the skin-color blocks in an image, and faces are detected using geometric information such as region area and the aspect ratio of a face. Next, the precise regions of the eyes, mouth, and eyebrows are located within the detected face region, and their binarized and edge images are produced to extract 16 landmarks; from these landmarks, 16 characteristic distances are generated as the facial feature information representing a particular expression. Finally, the 16 characteristic distances of the neutral face are subtracted from those of an expressive face to obtain 16 displacement values of expression change, which are fed into a classifier with machine learning capability to recognize six facial expressions: joy, anger, surprise, fear, sadness, and neutral. In developing the facial expression recognition procedure, we evaluated the performance of three classifiers: a neural network, a support vector machine, and the AdaBoost algorithm. The experimental results show that both the neural network and the AdaBoost algorithm achieve good recognition rates, but the neural network requires a long training time. Because the AdaBoost algorithm holds a great advantage in convergence speed, it can be used to update the training samples frequently to cope with different expression features without incurring much cost; we therefore adopt the strong classifier produced by the AdaBoost algorithm as the classifier of our automatic facial expression recognition system. On this basis, we develop a bottom-up hierarchical classification architecture for multi-class facial expression recognition. Statistics from the experimental results show that the overall recognition accuracy exceeds 90%, whether the image sequence contains a single expression or a mixture of several expressions.
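    To make the detection step concrete, the following is a minimal Python sketch of skin-color face detection of the kind described above: thresholding in the HSI color space, connected component labeling, and filtering the blocks by area and face aspect ratio. The function name detect_face_candidates and all numeric thresholds are illustrative assumptions, not the values tuned in the thesis (see Chapter 3).

```python
import numpy as np
from scipy import ndimage

def detect_face_candidates(rgb, hue_max=50.0, sat_range=(0.10, 0.60),
                           min_area=400, aspect_range=(0.8, 2.0)):
    """Threshold skin color in HSI, label the connected skin blocks, and
    keep blocks whose area and height/width ratio look face-like.  All
    numeric thresholds here are illustrative placeholders."""
    img = rgb.astype(np.float64) / 255.0
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    i = (r + g + b) / 3.0                                       # intensity
    s = 1.0 - np.minimum(np.minimum(r, g), b) / np.maximum(i, 1e-8)  # saturation
    num = 0.5 * ((r - g) + (r - b))
    den = np.sqrt((r - g) ** 2 + (r - b) * (g - b)) + 1e-8
    h = np.degrees(np.arccos(np.clip(num / den, -1.0, 1.0)))
    h = np.where(b > g, 360.0 - h, h)                           # hue in [0, 360)
    mask = (h <= hue_max) & (s >= sat_range[0]) & (s <= sat_range[1])
    labels, _ = ndimage.label(mask)                             # connected components
    candidates = []
    for k, sl in enumerate(ndimage.find_objects(labels), start=1):
        height = sl[0].stop - sl[0].start
        width = sl[1].stop - sl[1].start
        area = int((labels[sl] == k).sum())
        if area >= min_area and aspect_range[0] <= height / width <= aspect_range[1]:
            candidates.append(sl)                               # face-like bounding box
    return candidates
```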


    Owing to the demand for more efficient and friendly human-computer interfaces, research on face processing has grown rapidly in recent years. In addition to offering services to human beings, one of the most important characteristics of a favorable system is the ability to interact with people autonomously; accordingly, many researchers are devoting great effort to the hot topic of facial expression recognition. A completely automatic facial expression recognition system is presented in this thesis, consisting of three main procedures. The first uses skin color blocks and geometrical properties in the HSI color space to eliminate the skin-color regions that do not belong to a face. We then locate the proper ranges of the eyes, mouth, and eyebrows according to the positions of the pupils and the center of the mouth. Subsequently, we perform both edge detection and binarization on these ranged images to obtain 16 landmarks, from which 16 characteristic distances are produced to represent a kind of expression. Finally, we subtract the 16 characteristic distances of a neutral face from those of a certain expression to acquire 16 displacement values, which are fed to a classifier with an incremental learning scheme that identifies six kinds of expressions: joy, anger, surprise, fear, sadness, and neutral. During the development of facial expression classification, we evaluate the performance of a neural network, a support vector machine, and AdaBoost techniques. The experimental outcomes show that the average recognition rates of both the AdaBoost algorithm and the neural network are better than that of the support vector machine, but training the neural network takes quite a long time. Comparatively, the AdaBoost algorithm has the advantage of fast convergence, so we can update the training samples to deal with varied expression features without spending much computational cost. We therefore choose the AdaBoost algorithm as the core technique of our strong facial expression classifier. Moreover, we develop a bottom-up hierarchical classification structure for multi-class expression recognition. Extensive experiments show that the accuracy of our facial expression recognition system exceeds 90% for a single kind or multiple kinds of expressions appearing in an image sequence.
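    To illustrate the feature and classification stages, here is a minimal Python sketch, not the thesis implementation: the characteristic distances come from hypothetical landmark index pairs (the pairs argument), the input feature is the displacement from the neutral face, and scikit-learn's AdaBoostClassifier with its default decision-stump weak learners stands in for the thesis's own AdaBoost strong classifiers, arranged as one possible hierarchy of binary stages.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

EXPRESSIONS = ["joy", "anger", "surprise", "fear", "sadness", "neutral"]

def displacement_features(landmarks_expr, landmarks_neutral, pairs):
    """16 characteristic distances from the 16 landmarks, minus the same
    distances on the neutral face; `pairs` lists hypothetical landmark
    index pairs defining each distance."""
    def distances(pts):
        pts = np.asarray(pts, dtype=np.float64)
        return np.array([np.linalg.norm(pts[i] - pts[j]) for i, j in pairs])
    return distances(landmarks_expr) - distances(landmarks_neutral)

class HierarchicalExpressionClassifier:
    """One possible hierarchy of binary AdaBoost stages: each stage
    separates a single expression from the remaining classes."""
    def __init__(self, order=tuple(EXPRESSIONS)):
        self.order = list(order)
        self.stages = []

    def fit(self, X, y):
        X, y = np.asarray(X), np.asarray(y)
        for label in self.order[:-1]:
            clf = AdaBoostClassifier(n_estimators=100)
            clf.fit(X, (y == label).astype(int))   # this stage: label vs. rest
            self.stages.append((label, clf))
            keep = y != label                      # remaining classes go deeper
            X, y = X[keep], y[keep]
        return self

    def predict_one(self, x):
        for label, clf in self.stages:
            if clf.predict(np.asarray(x).reshape(1, -1))[0] == 1:
                return label
        return self.order[-1]                      # last class by elimination
```

    Each stage peels one expression off the candidate set, so every decision is a cheap binary test; the actual bottom-up hierarchy used in the thesis is detailed in Chapter 5.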

    Chinese Abstract (中文摘要)
    ABSTRACT
    CONTENTS
    LIST OF FIGURES
    LIST OF TABLES
    CHAPTER 1 INTRODUCTION
      1.1 Overview
      1.2 Background and Motivation
      1.3 Thesis Organization and System Architecture
    CHAPTER 2 RELATED WORK
      2.1 Reviews of Face Detection and Feature Extraction
      2.2 Reviews of Facial Expression Recognition
    CHAPTER 3 FACE DETECTION
      3.1 Color Space Transformation
      3.2 Connected Component Labeling
      3.3 Face Detection Strategies
      3.4 Pupils Detection
      3.5 Center of Mouth Detection
    CHAPTER 4 FACIAL FEATURE EXTRACTION
      4.1 Landmarks of Eyes Extraction
      4.2 Landmarks of Eyebrows Extraction
      4.3 Landmarks of Mouth Extraction
    CHAPTER 5 FACIAL EXPRESSION RECOGNITION
      5.1 Feature Distances
      5.2 Multi-Layer Perceptrons
        5.2.1 The back-propagation algorithm
        5.2.2 The MLP-based classifier
      5.3 Support Vector Machines
        5.3.1 Linear support vector machines
        5.3.2 Non-linear support vector machines
        5.3.3 The SVM-based multi-classifier
      5.4 AdaBoosting Schemes
        5.4.1 The AdaBoost algorithm
        5.4.2 The weak classifier
        5.4.3 The AdaBoost-based multi-classifier
    CHAPTER 6 EXPERIMENTAL RESULTS AND DISCUSSIONS
      6.1 The Facial Expression Database
      6.2 The Results of Face Detection
      6.3 Comparison of Three Different Classifiers
      6.4 Experiment of Multi-Type Expressions
    CHAPTER 7 CONCLUSIONS AND FUTURE WORKS
    REFERENCES


    Full-text release date: 2013/01/03 (campus network)
    Full text not authorized for public release (off-campus network)
    Full text not authorized for public release (National Central Library: Taiwan NDLTD system)