
Student: Chuang Wei-Cheng (莊崴丞)
Thesis title: Hand Gesture Recognition System with the Use of Basic Loss Functions in a Complicated and Dynamic Environment (在複雜動態環境下利用損失函數方法的手勢操控系統)
Advisor: Shun-Feng Su (蘇順豐)
Committee members: Chung-Hsien Kuo (郭重顯), Wei-Yen Wang (王偉彥), Chen-Chia Chuang (莊鎮嘉)
Degree: Master
Department: Department of Electrical Engineering, College of Electrical Engineering and Computer Science
Year of publication: 2012
Academic year of graduation: 100 (ROC calendar)
Language: English
Number of pages: 108
Chinese keywords: 複雜背景, 動態環境, 膚色偵測, 損失函數
Foreign-language keywords: complicated background, dynamic environment, skin-color detection, loss function
• The purpose of this study is to let people with limited mobility guide and control a wheelchair directly through hand gestures, rather than pushing it by hand. Most of the hand gesture recognition literature discusses only how to improve recognition accuracy; this thesis instead discusses how to process images against a dynamic background and how to extract a complete gesture foreground from a complicated background. The proposed system works in four steps. First, background subtraction is used to obtain the hand foreground. Second, since the background is dynamic, the background model is updated with an updating rate of 0.05, and a region of interest (ROI) is set up to judge whether the hand reaches a threshold, which decides whether the system starts detecting. Third, the loss-function concept from robust learning algorithms, which extends the statistical notion of an outlier, is applied to skin-color detection to filter out unnecessary skin-color noise so that the complete hand gesture can be extracted. Finally, the hand pixels are counted in polar coordinates to locate the fingertips, which then conveniently guide the wheelchair. The method proves effective in complicated scenes and against skin-colored backgrounds. The experimental results show that, except when the skin-color noise in the background is too large or the light on the hand is distributed too unevenly, the proposed system is quite successful overall.


The purpose of this study is to let people with physical disabilities guide and control a wheelchair directly by hand gestures instead of driving the wheels by hand. Unlike other studies that focus only on the hand gesture recognition stage, this thesis discusses how to process images in a dynamic and complicated background to obtain the complete hand foreground. There are four stages in the proposed system. First, background subtraction is used to obtain the hand image. Since the background is dynamic, the background model is updated with an updating rate of 0.05. Second, a region of interest (ROI) is considered to decide whether the foreground reaches a threshold value and whether the system should start to detect the hand. Third, in order to extract the complete gesture from the image, unnecessary skin-color noise is removed by a skin-color detection approach adapted from the idea of robust learning algorithms, which use loss functions to handle outliers. Finally, polar coordinates are employed to count the pixels of the hand, and the fingertips are located so that they can conveniently guide the wheelchair. The method performs promisingly in complicated scenes and against skin-colored backgrounds. Experimental results show that the proposed system is quite promising; only a very small number of frames are misjudged, in cases where the noise area is too large or the light source is distributed too unevenly over the hand region.

Contents

Chinese Abstract I
Abstract II
Acknowledgements III
Contents IV
Figure List VI
Table List VIII
Chapter 1 Introduction 1
1.1 Research Motivation and Objective 1
1.2 Environment 3
1.3 Thesis Organization 4
Chapter 2 Related Work 5
2.1 Microsoft Kinect Introduction 5
2.2 Hand Gesture Recognition Techniques 7
2.2.1 Glove-Based Techniques 8
2.2.2 Vision-Based Techniques 9
2.3 Methods of Segmentation 11
2.4 Skin Color Detection Techniques 12
2.4.1 RGB Color Space 12
2.4.2 HSV Color Space 14
2.4.3 YCbCr Color Space 15
Chapter 3 System Description 18
3.1 Detection of the Appearance of Hands 19
3.1.1 Background Subtraction 21
3.1.2 Background Updating 22
3.2 Segmentation of Hand Regions 25
3.2.1 Morphological Techniques 27
3.2.2 Component Labeling 30
3.3 Robust Skin Color Detection 32
3.3.1 Filtering by the Concept of Outliers 32
3.3.2 The Least-Squares Fit of the Major Skin Color 34
3.3.3 The Concept of Using Loss Functions 37
3.3.4 Categories of Loss Functions 39
3.4 Segmentation of the Full Palm 43
3.5 Hand Gesture Recognition 48
Chapter 4 Experimental Results 54
4.1 Experimental Results in a Simple Background 56
4.2 Experimental Results in a Complicated Background 66
4.3 Problem Discussion 89
Chapter 5 Conclusion 93
5.1 Conclusion 93
5.2 Future Work 94
References 95

Figure List
Figure 1.1 The structure of the system 3
Figure 1.2 The Logitech webcam 3
Figure 2.1 The structure of the Kinect device 5
Figure 2.2 The effective range of the Kinect 7
Figure 2.3 Categories of gloves 9
Figure 2.4 Example of applying background subtraction while the user stays still 12
Figure 2.5 RGB color space model [25] 13
Figure 2.6 Result of skin color detection in RGB color space 14
Figure 2.7 HSV color space model [28] 14
Figure 2.8 Result of skin color detection in HSV color space 15
Figure 2.9 Result of skin color detection in YCbCr color space 16
Figure 3.1 The flowchart of the whole system 19
Figure 3.2 A period of background 20
Figure 3.3 The average of Figure 3.2 21
Figure 3.4 The foreground of the hand 22
Figure 3.5 Updated background images under different updating rates 23
Figure 3.6 Variation in the number of foreground pixels 24
Figure 3.7 ROI area 25
Figure 3.8 The variation of pixels in the ROI 26
Figure 3.9 Image after detection of the appearance of the hand 27
Figure 3.10 3x3 structuring element 28
Figure 3.11 The erosion operation example 28
Figure 3.12 The dilation operation example 28
Figure 3.13 Opening result example 29
Figure 3.14 The erosion operation example 29
Figure 3.15 Applying the morphological techniques to the input image 30
Figure 3.16 Example of component labeling [40] 31
Figure 3.17 The result of component labeling 32
Figure 3.18 The outlier is far from the median [43] 33
Figure 3.19 Example of filtering by outlier 34
Figure 3.20 Hand and skin-color-like objects 35
Figure 3.21 The CbCr distribution 35
Figure 3.22 Different environments with a hand 36
Figure 3.23 The linear regression of Figure 3.22 37
Figure 3.24 The mapping of Figure 3.23 37
Figure 3.25 The concept of using a loss function in this system 38
Figure 3.26 The distribution of errors of CbCr 39
Figure 3.27 The comparison of loss functions 40
Figure 3.28 The comparison of influence functions corresponding to Figure 3.27 41
Figure 3.29 The results of loss functions from Figure 3.20 42
Figure 3.30 The final result by the concept of loss functions 43
Figure 3.31 Dividing the edge of the image into 4 line segments [3] 44
Figure 3.32 The hand considered for segmentation 45
Figure 3.33 Example of calculating the angle of the reaching hand 46
Figure 3.34 An example after rotating 46
Figure 3.35 The range where the wrist line is located 47
Figure 3.36 The adjustment of the hand palm 47
Figure 3.37 A binary image transformed into a polar image 48
Figure 3.38 Applying the threshold rule to the example 49
Figure 3.39 Applying the threshold rule to the example 50
Figure 3.40 Applying the threshold rule to the example 50
Figure 3.41 Applying the threshold rule to the example 51
Figure 3.42 Finding fingertips by the polar image 51
Figure 3.43 The direction of the fingertip 52
Figure 3.44 When the user raises even fingers 52
Figure 3.45 The gestures in our system 53
Figure 4.1 The comparison of approaches 59
Figure 4.2 The comparison of approaches 66
Figure 4.3 The results of comparison 72
Figure 4.4 The results of comparison 80
Figure 4.5 The results of comparison 88
Figure 4.6 Misjudged case 90
Figure 4.7 Misjudged case 91
Figure 4.8 Misjudged case 92

Table List
Table 1.1 The specification of the webcam 4
Table 3.1 The loss functions used in the literature 40
Table 4.1 Test environment 54
Table 4.2 Representations of characters below the frames 55
Table 4.3 The result of experiment 1 59
Table 4.4 The result of experiment 2 65
Table 4.5 The result of experiment 3 70
Table 4.6 The experiment 3 result of adopting traditional skin color detection 71
Table 4.7 The experiment 3 result of adopting the approach in [3] 71
Table 4.8 The result of experiment 4 78
Table 4.9 The experiment 4 result of adopting traditional skin color detection 79
Table 4.10 The experiment 4 result of adopting the approach in [3] 79
Table 4.11 The result of experiment 5 86
Table 4.12 The experiment 5 result of adopting traditional skin color detection 87
Table 4.13 The experiment 5 result of adopting the approach in [3] 88

References

[1] Department of Statistics, Ministry of the Interior, Taiwan, http://www.moi.gov.tw/stat/
[2] Chen-Chia Chuang, "Robust Modeling in the Presence of Gross Errors" (in Chinese), Ph.D. dissertation, Department of Electrical Engineering, National Taiwan University of Science and Technology, 2000.
[3] 廖崇儒, "A Computer-Vision-Based Hand Gesture Recognition System for Wheelchair Users" (in Chinese), Master's thesis, Department of Electrical Engineering, National Taiwan University of Science and Technology, 2010.
[4] Kinect for Windows, http://www.microsoft.com/en-us/kinectforwindows/
[5] 張榮財 and 陳建宏, "Applying Principal Component Analysis and Back-Propagation Neural Networks to Sign Language Gesture Recognition" (in Chinese), 2010 9th Conference on Information Technology and Applications in Outlying Islands, pp. 2-5, 2010.
[6] K.H. Park, H.E. Lee, Y. Kim, and Z.Z. Bien, "A Steward Robot for Human-Friendly Human-Machine Interaction in a Smart House Environment," IEEE Transactions on Automation Science and Engineering, pp. 21-25, 2008.
[7] S.W. Lee, "Automatic Gesture Recognition for Intelligent Human-Robot Interaction," Proceedings of the 7th International Conference on Automatic Face and Gesture Recognition, pp. 1-6, 2006.
[8] Y. Sato, M. Saito, and H. Koike, "Real Time Input of 3D Pose and Gestures of a User's Hand and Its Application for HCI," Virtual Reality Annual International Symposium, pp. 79-86, 2001.
[9] Y. Wu and T.S. Huang, "Capturing Articulated Human Hand Motion: A Divide-and-Conquer Approach," Proceedings of the Seventh IEEE International Conference on Computer Vision, pp. 606-611, 1999.
[10] J. Lee and T.L. Kunii, "Model-Based Analysis of Hand Posture," IEEE Computer Graphics and Applications, pp. 77-86, 1995.
[11] K. Kwon, H. Zhang, and F. Dornaika, "Hand Pose Recovery with a Single Video Camera," IEEE International Conference on Robotics and Automation, pp. 1194-1200, 2001.
[12] Y. Shirai and Y. Kuno, "Hand Posture Estimation by Combining 2-D Appearance-Based and 3-D Model-Based Approaches," Proceedings of the 15th International Conference on Pattern Recognition, vol. 3, pp. 705-708, 2000.
[13] N. Shimada, K. Kimura, Y. Shirai, and Y. Kuno, "Rapid Hand Posture Recognition Using Adaptive Histogram Template of Skin and Hand Edge Contour," 6th Iranian Conference on Machine Vision and Image Processing (MVIP), pp. 1-5, Oct. 2010.
[14] L.W. Howe, F. Wong, and A. Chekima, "Comparison of Hand Segmentation Methodologies for Hand Gesture Recognition," International Symposium on Information Technology (ITSim 2008), vol. 2, pp. 1-7, 2008.
[15] 葉宗樺, "Fingertip Trajectory Pattern Recognition Using a Webcam" (in Chinese), Master's thesis, Southern Taiwan University of Technology, 2006.
[16] J. Isaacs and S. Foo, "Hand Pose Estimation for American Sign Language Recognition," Proceedings of the Thirty-Sixth Southeastern Symposium on System Theory, pp. 132-136, 2004.
[17] Wikipedia, http://en.wikipedia.org/wiki/Principal_component_analysis
[18] 曹文潔, "A Rock-Paper-Scissors Machine" (in Chinese), Master's thesis, Department of Electrical Engineering, National Central University, 2007.
[19] 黃俊捷, "Design and Implementation of an Interactive Biped Robot (I): Hand Gesture Recognition" (in Chinese), Master's thesis, Department of Electrical Engineering, National Central University, 2008.
[20] C. Kim and J.N. Hwang, "Fast and Automatic Video Object Segmentation and Tracking for Content-Based Applications," IEEE Transactions on Circuits and Systems for Video Technology, vol. 12, no. 7, pp. 597-612, 2002.
[21] S. Murali and R. Girisha, "Segmentation of Motion Objects from Surveillance Video Sequences Using Temporal Differencing Combined with Multiple Correlation," 6th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS 2009), pp. 472-477, 2009.
[22] J.L. Barron, D.J. Fleet, and S.S. Beauchemin, "Performance of Optical Flow Techniques," International Journal of Computer Vision, vol. 12, no. 1, pp. 43-77, 1994.
[23] C. Kim and J.N. Hwang, "Fast and Automatic Video Object Segmentation and Tracking for Content-Based Applications," IEEE Transactions on Circuits and Systems for Video Technology, vol. 12, no. 7, pp. 597-612, 2002.
[24] Z. Zivkovic, "Improved Adaptive Gaussian Mixture Model for Background Subtraction," Proceedings of the International Conference on Pattern Recognition, vol. 2, pp. 28-31, 2004.
[25] R.C. Gonzalez and R.E. Woods, Digital Image Processing, 2nd ed., Prentice Hall, 2002.
[26] L.K. Simone and D.G. Kamper, "Design Considerations for a Wearable Monitor to Measure Finger Posture," Journal of NeuroEngineering and Rehabilitation, 2005.
[27] S. Singh, D.S. Chauhan, M. Vatsa, and R. Singh, "A Robust Skin Color Based Face Detection Algorithm," Tamkang Journal of Science and Engineering, pp. 227-234, 2003.
[28] Wikipedia, http://zh.wikipedia.org/wiki/File:HSV_cone.jpg
[29] M.H. Yang, D.J. Kriegman, and N. Ahuja, "Detecting Faces in Images: A Survey," IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 34-58, 2002.
[30] R.L. Hsu, M. Abdel-Mottaleb, and A.K. Jain, "Face Detection in Color Images," IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 696-706, 2002.
[31] V. Tartter and K. Knowlton, "Perception of Sign Language from an Array of 27 Moving Spots," Nature, vol. 289, pp. 676-678, 1981.
[32] R.L. Hsu, M. Abdel-Mottaleb, and A.K. Jain, "Face Detection in Color Images," IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 696-706, 2002.
[33] VPL Research Inc., DataGlove Model 2 User's Manual, Redwood City, CA, 1987.
[34] Virtex Co., Company Brochure, Stanford, CA, October 1992.
[35] Polhemus, 3Space User's Manual, A Kaiser Aerospace & Electronics Company, 1987.
[36] V. Frati and D. Prattichizzo, "Using Kinect for Hand Tracking and Rendering in Wearable Haptics," IEEE World Haptics Conference, pp. 317-321, 2011.
[37] Z. Ren, J. Meng, and J. Yuan, "Depth Camera Based Hand Gesture Recognition and Its Applications in Human-Computer-Interaction," International Conference on Information and Communication Systems, 2011.
[38] 辛柏陞, "Design, Development, and Effectiveness Evaluation of a Virtual-Reality Hand Function Training System" (in Chinese), Ph.D. dissertation, Department of Mechanical Engineering, National Central University, 2005.
[39] 李慶銘, "Implementation of a Real-Time Audio-Video Teaching and Broadcasting System" (in Chinese), Master's thesis, Department of Electrical Engineering, National Taiwan University of Science and Technology, 2011.
[40] 王貞元, "Fire Detection and Region Localization for Video Surveillance" (in Chinese), Master's thesis, Department of Electrical Engineering, National Taiwan University of Science and Technology, 2008.
[41] 趙于翔, "A Portable Taiwanese Sign Language Speech System" (in Chinese), Master's thesis, Department of Electrical Engineering, Tamkang University, 2002.
[42] V. Barnett and T. Lewis, Outliers in Statistical Data, 3rd ed., John Wiley & Sons, 1994.
[43] Math Open Reference, http://www.mathopenref.com/tocs/trigfunctionstoc.html
[44] 涂又仁, "A Human-Machine Interface Using Face and Hand Gesture Recognition" (in Chinese), Master's thesis, Department of Electrical Engineering, National Chung Cheng University, 2007.
[45] E.J. Holden and R. Owens, "Recognizing Moving Hand Shapes," Proceedings of the 12th International Conference on Image Analysis and Processing, pp. 14-19, 2003.
[46] Median absolute deviation, http://en.wikipedia.org/wiki/Median_absolute_deviation
[47] Regression analysis, http://en.wikipedia.org/wiki/Regression_analysis

Full-text release date: 2017/07/31 (campus network)
Full text not authorized for public release (off-campus network)
Full text not authorized for public release (National Central Library: Taiwan NDLTD system)