
Author: 吳文華
Wen-hua Wu
Thesis Title: 基於隱藏式馬可夫模型學習機制的人體動作辨識技術
Human Actions Recognition Techniques Based on the Learning Mechanism of HMMs
Advisor: 范欽雄
Chin-Shyurng Fahn
Committee: 王榮華
none
王聖智
none
洪一平
none
鮑興國
Hsing-kuo Pao
Degree: 碩士
Master
Department: 電資學院 - 資訊工程系
Department of Computer Science and Information Engineering
Thesis Publication Year: 2010
Graduation Academic Year: 98
Language: English
Pages: 54
Keywords (in Chinese): 人體動作辨識, 人體姿勢表示, 人體動作分段, 隱藏式馬可夫模型, 星型骨架
Keywords (in other languages): human action recognition, human posture representation, human action segmentation, HMM, star skeletonization

近幾年,人體動作辨識在電腦視覺研究中已受到注目,其主要目的為辨識人體連續動作所構成的行為,此研究的困難點可分為兩方面,第一,表達人類身體移動與姿勢,需要相當複雜的高維度資訊,第二,為了辨識人體動作,必須處理人類行為當中空間與時間相關聯的資訊。
在這個研究中,我們提出了基於隱藏式馬可夫模型學習機制的人體動作辨識技術,為了有效地表達每一個人體姿勢的特徵,我們採用星型骨架將影像序列轉為特徵序列,並進一步透過編碼書將特徵序列轉換成符號序列,其中編碼書上記錄具代表性的編碼向量,它們係由分群演算法所產生,且每個編碼向量賦予唯一的符號。為了辨識連續動作,我們還結合了滑動窗口與固定點偵測的兩個技術來對動作分段;固定點係指維持不動的星型骨架頂點,可藉由這些固定點的個數變化來區分不同的動作段落,當固定點維持不變時,則視為一段基本動作單位,當基本動作單位太長時,再採用滑動窗口來進行分段。針對每一種人體動作,我們建構出一個隱藏式馬可夫模型,在動作辨識時期,將分段後的動作符號序列,找出與其最匹配的動作模型,作為其辨識的結果。此外,我們也利用了星型骨架中心點的移動速度來辨識人類跌倒的動作。
目前,我們的系統包含了9種不同的動作辨識。根據實驗結果顯示,我們所提出的動作分段方法在動作的分段精確度上,比傳統的滑動窗口來得好。在單一動作辨識方面,實驗結果顯示對於9種動作的平均辨識度達到95.93%,另外,在連續動作辨識方面,我們也確實能有效地從一連串的動作影片中辨識出9種不同的動作,動作分段辨識處理最多延遲為28張畫面。在我們的系統,影像的解析度為320x240,偵測和辨識的平均處理速度約為每秒25到30張畫面。


In recent years, human action recognition has attracted great attention in computer vision. Its objective is to recognize each of the continuous actions that constitute a human behavior. The difficulties are twofold. First, human body movements and postures are articulated motions, so they involve a high degree of dimensionality and complexity. Second, characterizing human behavior requires dealing with a sequence of video frames that carry both spatial and temporal information.
In this thesis, we propose human action recognition techniques based on the learning mechanism of hidden Markov models (HMMs). The star skeleton is used to represent the posture features of each human action effectively and efficiently. The feature vector sequence extracted from the captured video sequence is converted to a series of symbols, each corresponding to a codeword in a codebook created by vector quantization. We employ a clustering algorithm to generate the codewords, and each codeword is assigned a unique symbol. To handle recognition in continuous actions, action segmentation is conducted by combining a sliding window scheme with stable contact detection. The stable contacts are the extreme points of a star skeleton that remain in the same place for a long enough period. A primitive motion unit (PMU), during which the number of stable contacts stays constant, is regarded as a segmented action. When a PMU lasts too long, we employ a sliding window to segment it further. We build an HMM for each action type except “fall-down,” and the recognition result is the action category whose model best matches the observed symbol sequence. In addition, we exploit the moving speed of the center of the star skeleton in each segmented action sequence to recognize the “fall-down” action type.
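The symbolization and recognition steps described above can be illustrated with a minimal sketch. It is not the thesis implementation: the function names (`quantize`, `log_likelihood`, `classify`), the use of Euclidean distance for the nearest-codeword lookup, and the parameter layout `(start, trans, emit)` are all assumptions made for illustration. The codebook and HMM parameters are taken as given; training them (clustering, Baum-Welch) is outside the sketch.

```python
import numpy as np

def quantize(features, codebook):
    """Map each feature vector (row) to the index of its nearest codeword.

    Euclidean distance is an assumption; the abstract does not name the metric.
    """
    dists = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=2)
    return dists.argmin(axis=1)

def log_likelihood(symbols, start, trans, emit):
    """Scaled forward algorithm: log P(symbols | discrete HMM).

    start: (N,) initial state probabilities
    trans: (N, N) state transition matrix, rows sum to 1
    emit:  (N, K) per-state symbol emission probabilities
    """
    log_p = 0.0
    alpha = start * emit[:, symbols[0]]
    for t in range(len(symbols)):
        if t > 0:
            alpha = (alpha @ trans) * emit[:, symbols[t]]
        scale = alpha.sum()
        if scale == 0.0:
            return -np.inf  # the model cannot produce this sequence
        log_p += np.log(scale)
        alpha = alpha / scale  # rescale to avoid numerical underflow
    return log_p

def classify(symbols, models):
    """Return the name of the model that scores the symbol sequence highest."""
    return max(models, key=lambda name: log_likelihood(symbols, *models[name]))

# Hypothetical two-codeword codebook and two toy action models.
codebook = np.array([[0.0, 0.0], [1.0, 1.0]])
features = np.array([[0.1, 0.0], [0.9, 1.0], [0.2, 0.1]])
symbols = quantize(features, codebook)

models = {
    "action_a": (np.array([1.0]), np.array([[1.0]]), np.array([[0.9, 0.1]])),
    "action_b": (np.array([1.0]), np.array([[1.0]]), np.array([[0.1, 0.9]])),
}
print(symbols, classify(symbols, models))
```

Scoring every model and taking the maximum mirrors the "best matches an observed sequence" rule in the abstract; the log-domain scaling is a standard numerical precaution rather than something the abstract specifies.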
Our system currently recognizes nine different actions. The action segmentation experiments show that the proposed method segments actions more accurately than the conventional sliding window scheme. For single action recognition, the average recognition rate reaches 95.93%. For continuous action recognition, the system effectively recognizes the nine actions in videos containing a sequence of actions, with a delay of at most 28 frames. The video sequences are captured with a fixed camera at 25 to 30 frames per second and a resolution of 320 × 240.
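The segmentation scheme summarized in the abstract (split where the number of stable contacts changes, then cut overly long units with a window) can be sketched as follows. This is only an illustration under stated assumptions: the per-frame stable-contact counts are assumed to be already available, the function name `segment` and the parameter `max_len` are hypothetical, and the windows here do not overlap, which may differ from the sliding window actually used in the thesis.

```python
def segment(contact_counts, max_len):
    """Split a frame sequence into half-open segments (start, end).

    contact_counts: number of stable contacts detected in each frame
                    (assumed input; contact detection is not shown).
    max_len:        hypothetical upper bound on segment length, in frames.
    """
    segments = []
    n = len(contact_counts)
    start = 0
    for i in range(1, n + 1):
        # A primitive motion unit ends when the count changes or the video ends.
        if i == n or contact_counts[i] != contact_counts[start]:
            # Cut a long unit into consecutive windows of at most max_len frames.
            for s in range(start, i, max_len):
                segments.append((s, min(s + max_len, i)))
            start = i
    return segments

counts = [2, 2, 2, 1, 1, 2, 2, 2, 2, 2]
print(segment(counts, max_len=3))
```

Each resulting segment would then be symbolized and scored against the action models.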

Chinese Abstract
Abstract
Acknowledgments
Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Overview
1.2 Motivation
1.3 System Description
1.4 Thesis Organization
Chapter 2 Background and Related Works
2.1 Hidden Markov Model (HMM)
2.1.1 Outline
2.1.2 Recognition
2.1.3 Learning
2.2 Reviews of Human Posture Representation
2.3 Reviews of Human Action Segmentation
2.4 Reviews of Human Action Recognition
Chapter 3 Proposed Human Action Recognition Method
3.1 Preliminary Processing
3.1.1 Background Subtraction
3.1.2 Shadow Removal
3.1.3 Morphological Processing
3.1.4 Connected Component Labeling
3.2 Star Skeletonization
3.3 Star Symbolization
3.4 Action Segmentation
3.5 Action Recognition
Chapter 4 Experimental Results and Discussions
Chapter 5 Conclusions and Future Works
References

