
Author: Wen-hua Wu (吳文華)
Thesis Title: Human Actions Recognition Techniques Based on the Learning Mechanism of HMMs (基於隱藏式馬可夫模型學習機制的人體動作辨識技術)
Advisor: Chin-Shyurng Fahn (范欽雄)
Committee: 王榮華, Hsing-kuo Pao
Degree: Master's
Department: Department of Computer Science and Information Engineering, College of Electrical Engineering and Computer Science
Thesis Publication Year: 2010
Graduation Academic Year: 98
Language: English
Pages: 54
Keywords (in Chinese): 人體動作辨識, 人體姿勢表示, 人體動作分段, 隱藏式馬可夫模型, 星型骨架
Keywords (in other languages): human action recognition, human posture presentation, human action segmentation, HMM, star skeletonization


In recent years, human action recognition has attracted great attention in computer vision. Its objective is to recognize each of the continuous actions that constitute a human behavior. The difficulties are twofold. First, human body movements and postures are articulated motions, so they involve a high degree of dimensionality and complexity. Second, characterizing human behavior amounts to analyzing a sequence of video frames that contains both spatial and temporal information.
In this thesis, we propose human action recognition techniques based on the learning mechanism of hidden Markov models (HMMs). The star skeleton is used to represent the posture features of each human action effectively and efficiently. The feature vector sequence extracted from the captured video is converted into a series of symbols, each corresponding to a codeword in a codebook created by vector quantization; a clustering algorithm generates the codewords in the codebook. To handle recognition of continuous actions, action segmentation is conducted by combining a sliding window scheme with stable contact detection. Extreme points of the star skeleton that remain in the same place for a long enough period are the stable contacts. Primitive motion units (PMUs) that have a consistent number of stable contacts are regarded as a segmented action; when a PMU lasts too long, a sliding window is used to segment the continuous actions instead. We build an HMM for each action type except “fall-down,” and the recognition result is the category whose model best matches the observed sequence. In addition, we exploit the moving speed of the center of the star skeleton in each segmented action sequence to recognize the “fall-down” action type.
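As a rough illustration of the star-skeleton idea described above, the sketch below extracts the silhouette centroid and takes local maxima of the smoothed centroid-to-boundary distance as the extreme points (head, hands, feet). It assumes an ordered contour as a NumPy array; the function name and smoothing window are our own choices, not values from the thesis:

```python
import numpy as np

def star_skeleton(contour, smooth_window=9):
    """Star-skeleton sketch: extreme points are local maxima of the
    smoothed distance from the silhouette centroid to the ordered
    boundary. contour: (N, 2) array of ordered boundary points."""
    contour = np.asarray(contour, dtype=float)
    centroid = contour.mean(axis=0)
    dist = np.linalg.norm(contour - centroid, axis=1)
    # circular moving-average smoothing to suppress boundary noise
    kernel = np.ones(smooth_window) / smooth_window
    padded = np.concatenate([dist[-smooth_window:], dist, dist[:smooth_window]])
    smoothed = np.convolve(padded, kernel, mode="same")[smooth_window:-smooth_window]
    # strict local maxima of the smoothed (circular) distance signal
    prev = np.roll(smoothed, 1)
    nxt = np.roll(smoothed, -1)
    extremes = contour[(smoothed > prev) & (smoothed > nxt)]
    return centroid, extremes
```

In practice the ordered contour would come from a contour-following step on the foreground mask after background subtraction and morphological cleaning.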
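The recognition step described above, building one HMM per action type and selecting the model that best matches the observed symbol sequence, amounts to comparing forward-algorithm likelihoods. A minimal sketch with the standard scaled forward recursion follows; the parameter names and toy model dictionary are our own, not the thesis's notation:

```python
import numpy as np

def forward_log_likelihood(obs, pi, A, B):
    """Scaled forward algorithm: log P(obs | model) for a discrete HMM.
    pi: (S,) initial state probs, A: (S, S) transitions, B: (S, M) emissions."""
    alpha = pi * B[:, obs[0]]
    scale = alpha.sum()
    log_p = np.log(scale)
    alpha = alpha / scale
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]   # forward recursion step
        scale = alpha.sum()
        log_p += np.log(scale)          # accumulate in log space
        alpha = alpha / scale           # rescale to avoid underflow
    return log_p

def classify(obs, models):
    """Return the action whose HMM assigns the symbol sequence
    the highest likelihood."""
    return max(models, key=lambda name: forward_log_likelihood(obs, *models[name]))
```

The observation symbols here stand for the vector-quantized codewords of the star-skeleton features.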
In total, 9 different actions are involved in our system. The experimental results show that our proposed action segmentation method performs better than the plain sliding window scheme. For single-action recognition, the average recognition rate reaches 95.93%. For continuous-action recognition, the experiments show that our system effectively recognizes the 9 actions in videos containing a sequence of actions, with a delay of at most 28 frames. The video sequences were captured with a fixed camera operating at 25-30 frames per second at a resolution of 320 × 240 pixels.
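The “fall-down” branch described above uses the moving speed of the star-skeleton center rather than an HMM. A hedged sketch of such a speed test follows; the threshold value and default frame rate are illustrative assumptions, not values reported in the thesis:

```python
import numpy as np

def is_fall_down(centers, fps=25.0, speed_thresh=120.0):
    """Flag a segmented action as 'fall-down' when the star-skeleton
    center moves faster than speed_thresh (pixels per second).
    centers: (T, 2) array of per-frame center positions."""
    centers = np.asarray(centers, dtype=float)
    # per-frame displacement magnitude, converted to pixels/second
    speeds = np.linalg.norm(np.diff(centers, axis=0), axis=1) * fps
    return bool(speeds.max() > speed_thresh)
```

A real deployment would calibrate the threshold to the camera setup, since apparent pixel speed depends on subject distance and resolution.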

Contents
  Chinese Abstract
  Abstract
  Acknowledgements
  Contents
  List of Figures
  List of Tables
  Chapter 1 Introduction
    1.1 Overview
    1.2 Motivation
    1.3 System Description
    1.4 Thesis Organization
  Chapter 2 Background and Related Works
    2.1 Hidden Markov Model (HMM)
      2.1.1 Outline
      2.1.2 Recognition
      2.1.3 Learning
    2.2 Reviews of Human Posture Representation
    2.3 Reviews of Human Action Segmentation
    2.4 Reviews of Human Action Recognition
  Chapter 3 Proposed Human Action Recognition Method
    3.1 Preliminary Processing
      3.1.1 Background Subtraction
      3.1.2 Shadow Removal
      3.1.3 Morphological Processing
      3.1.4 Connected Component Labeling
    3.2 Star Skeletonization
    3.3 Star Symbolization
    3.4 Action Segmentation
    3.5 Action Recognition
  Chapter 4 Experimental Results and Discussions
  Chapter 5 Conclusions and Future Works
  References
