
Graduate Student: Sheng-lung Chiang (江聖龍)
Thesis Title: A Real-Time Upper Limbs Posture Recognition System Utilizing Face and Hands Tracking Based on Particle Filters
Advisor: Chin-shyurng Fahn (范欽雄)
Committee Members: Chi-fang Lin (林啟芳), Jiung-yao Huang (黃俊堯), Jung-tang Huang (黃榮堂), Kuo-liang Chung (鍾國亮)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Publication Year: 2008
Graduation Academic Year: 96
Language: English
Pages: 114
Keywords: Human computer interface, face detection, upper-limbs detection, face tracking, upper-limbs tracking, particle filter, neural network, support vector machine, AdaBoost algorithm, upper-limbs posture recognition


    With the rapid development of robot technologies, and in order to realize a humane, easy-to-operate human computer interface that lets users communicate and interact with robots more naturally, research on various computer vision techniques is growing steadily, including the recognition of human body postures. In this thesis, we present a real-time upper-limbs posture recognition system that detects and tracks both the face and hands and classifies eight self-defined postures for commanding and interacting with robots. In the tracking procedure, we employ particle filters to dynamically locate the face and upper limbs. To prevent disturbance from skin-colored regions, such as other naked parts of the body or similarly colored objects in the background, we further adopt motion information as a tracking feature, reducing the influence of the background as much as possible. Additionally, we define eight upper-limbs postures according to the principle of flag semaphore; their advantage is that the relative positions of the face and both hands suffice to distinguish them easily. To accomplish this, we use the centers of the detected face and hand regions to compute four feature values that describe the distance and orientation of each hand relative to the face; together with the aspect ratios of the two hand regions, this yields six feature values in total for posture classification. We evaluate three machine learning methods as classifiers: neural networks, support vector machines, and the AdaBoost algorithm. In our experiments, the AdaBoost algorithm achieves the best recognition rate of the three and also requires much less training time, so we adopt the strong classifier produced by the AdaBoost algorithm for our automatic upper-limbs posture recognition system.
The experimental results show that the proposed methods attain detection rates for the face and both hands, as well as a recognition rate for upper-limbs postures, of more than 95%.
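As a concrete illustration of the six-value posture feature described in the abstract, the sketch below computes the distance and orientation of each hand relative to the face, plus the two hand aspect ratios. This is a minimal sketch, assuming each tracked region is reported as a centre point with a bounding-box size; the function and argument names are illustrative and not taken from the thesis.

```python
import math

def posture_features(face, left_hand, right_hand):
    """Compute six feature values for posture classification:
    the distance and orientation of each hand relative to the face
    (four values), plus the aspect ratio of each hand's bounding box
    (two values).  Each region is given as (cx, cy, width, height)."""
    fx, fy = face[0], face[1]
    features = []
    for cx, cy, _, _ in (left_hand, right_hand):
        dist = math.hypot(cx - fx, cy - fy)    # distance to the face centre
        angle = math.atan2(fy - cy, cx - fx)   # orientation (image y grows downward)
        features.extend([dist, angle])
    for _, _, w, h in (left_hand, right_hand):
        features.append(w / h)                 # aspect ratio of the hand region
    return features
```

The resulting six-dimensional vector would then be fed to the chosen classifier (e.g. the AdaBoost strong classifier the thesis adopts).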

    CHAPTER 1 INTRODUCTION ..................... 1
        1.1 Overview ..................... 1
        1.2 Background and motivation ..................... 2
        1.3 Thesis organization and system architecture ..................... 4
    CHAPTER 2 RELATED WORK ..................... 6
        2.1 Reviews of face detection and tracking ..................... 6
        2.2 Reviews of hand detection and tracking ..................... 9
        2.3 Reviews of posture recognition ..................... 13
    CHAPTER 3 FACE AND HANDS DETECTION ..................... 16
        3.1 Moving object detection ..................... 17
        3.2 Color segmentation ..................... 21
        3.3 Connected components ..................... 26
        3.4 Geometric constrained operation ..................... 29
    CHAPTER 4 TRACKING PROCEDURE ..................... 32
        4.1 The Kalman filter ..................... 33
        4.2 The particle filter ..................... 35
        4.3 Our proposed method ..................... 40
    CHAPTER 5 UPPER-LIMBS POSTURE RECOGNITION ..................... 49
        5.1 Posture definition ..................... 50
        5.2 Feature extraction ..................... 53
        5.3 Multi-layer perceptrons ..................... 57
            5.3.1 The back-propagation algorithm ..................... 58
            5.3.2 The MLP-based classifier ..................... 61
        5.4 Support vector machines ..................... 63
            5.4.1 Linear support vector machines ..................... 64
            5.4.2 Non-linear support vector machines ..................... 69
            5.4.3 The SVM-based multi-classifier ..................... 71
        5.5 AdaBoosting schemes ..................... 73
            5.5.1 The AdaBoost algorithm ..................... 74
            5.5.2 The weak classifier ..................... 79
            5.5.3 The AdaBoost-based multi-classifier ..................... 81
    CHAPTER 6 EXPERIMENTAL RESULTS AND DISCUSSIONS ..................... 84
        6.1 PTZ camera tracking ..................... 85
        6.2 The results of detection ..................... 89
        6.3 Comparison of three different classifiers ..................... 92
        6.4 Experiment of recognizing sequential composite postures ..................... 103
    CHAPTER 7 CONCLUSIONS AND FUTURE WORK ..................... 106
    REFERENCES ..................... 109


    Full-text release date: 2013/01/25 (campus network)
    Full text not authorized for public release (off-campus network)
    Full text not authorized for public release (National Central Library: Taiwan NDLTD system)