
Student: Chuang Wei-Cheng (莊崴丞)
Thesis title: Hand Gesture Recognition System with the Use of Basic Loss Functions in a Complicated and Dynamic Environment (在複雜動態環境下利用損失函數方法的手勢操控系統)
Advisor: Shun-Feng Su (蘇順豐)
Committee members: Chung-Hsien Kuo (郭重顯), Wei-Yen Wang (王偉彥), Chen-Chia Chuang (莊鎮嘉)
Degree: Master
Department: Department of Electrical Engineering, College of Electrical Engineering and Computer Science
Year of publication: 2012
Academic year of graduation: 100 (ROC calendar)
Language: English
Number of pages: 108
Chinese keywords: 複雜背景, 動態環境, 膚色偵測, 損失函數
Foreign-language keywords: complicated background, dynamic environment, skin-color detection, loss function
• The purpose of this study is to let people with limited mobility guide and control a wheelchair directly through hand gestures, rather than pushing it by hand. Most of the hand gesture recognition literature discusses only how to improve recognition accuracy; this thesis instead discusses how to process images against a dynamic background and how to extract a complete gesture foreground from a complicated background. The proposed system works in four steps. First, background subtraction is used to obtain the hand foreground. Second, since the background is dynamic, the background model is updated with an updating rate of 0.05, and a region of interest (ROI) is set up to judge whether the hand reaches a threshold, which decides whether the system starts detecting. Third, the loss-function concept from robust learning algorithms, which extends the statistical notion of an outlier, is applied to skin-color detection to filter out unnecessary skin-color noise so that the complete hand gesture can be extracted. Finally, the hand pixels are counted in polar coordinates to locate the fingertips, which then conveniently guide the wheelchair. The method proves effective in complicated scenes and against skin-colored backgrounds. The experimental results show that, except when the skin-color noise in the background is too large or the light on the hand is distributed too unevenly, the proposed system is quite successful overall.


The purpose of this study is to let people with physical disabilities guide and control a wheelchair directly by hand gestures instead of driving the wheels by hand. Unlike other studies that focus only on the hand gesture recognition stage, this thesis discusses how to process images in a dynamic and complicated background to obtain the complete hand foreground. There are four stages in the proposed system. First, background subtraction is used to obtain the hand image. Since the background is dynamic, the background model is updated with an updating rate of 0.05. Second, a region of interest (ROI) is considered to decide whether the foreground reaches a threshold value and whether the system should start to detect the hand. Third, in order to extract the complete gesture from the image, unnecessary skin-color noise is removed by a skin-color detection approach adapted from the idea of robust learning algorithms, which use loss functions to handle outliers. Finally, polar coordinates are employed to count the pixels of the hand, and the fingertips are located so that they can conveniently guide the wheelchair. The method performs promisingly in complicated scenes and against skin-colored backgrounds. Experimental results show that the proposed system is quite promising; only a very small number of frames are misjudged, in cases where the noise area is too large or the light source is distributed too unevenly over the hand region.

Contents

Chinese Abstract I
Abstract II
Acknowledgements III
Contents IV
Figure List VI
Table List VIII
Chapter 1 Introduction 1
1.1 Research Motivation and Objective 1
1.2 Environment 3
1.3 Thesis Organization 4
Chapter 2 Related Work 5
2.1 Microsoft Kinect Introduction 5
2.2 Hand Gesture Recognition Techniques 7
2.2.1 Glove-Based Techniques 8
2.2.2 Vision-Based Techniques 9
2.3 Methods of Segmentation 11
2.4 Skin Color Detection Techniques 12
2.4.1 RGB Color Space 12
2.4.2 HSV Color Space 14
2.4.3 YCbCr Color Space 15
Chapter 3 System Description 18
3.1 Detection of the Appearance of Hands 19
3.1.1 Background Subtraction 21
3.1.2 Background Updating 22
3.2 Segmentation of Hand Regions 25
3.2.1 Morphological Techniques 27
3.2.2 Component Labeling 30
3.3 Robust Skin Color Detection 32
3.3.1 Filtering by the Concept of Outliers 32
3.3.2 The Least-Squares Fit of the Major Skin Color 34
3.3.3 The Concept of Using Loss Functions 37
3.3.4 Categories of Loss Functions 39
3.4 Segmentation of the Full Palm 43
3.5 Hand Gesture Recognition 48
Chapter 4 Experimental Results 54
4.1 Experimental Results in a Simple Background 56
4.2 Experimental Results in a Complicated Background 66
4.3 Problem Discussion 89
Chapter 5 Conclusion 93
5.1 Conclusion 93
5.2 Future Work 94
References 95

Figure List
Figure 1.1 The structure of the system 3
Figure 1.2 The Logitech webcam 3
Figure 2.1 The structure of the Kinect device 5
Figure 2.2 The effective range of the Kinect 7
Figure 2.3 Categories of gloves 9
Figure 2.4 Example of applying background subtraction while the user stays still 12
Figure 2.5 RGB color space model [25] 13
Figure 2.6 Result of skin color detection in RGB color space 14
Figure 2.7 HSV color space model [28] 14
Figure 2.8 Result of skin color detection in HSV color space 15
Figure 2.9 Result of skin color detection in YCbCr color space 16
Figure 3.1 The flowchart of the whole system 19
Figure 3.2 A period of background 20
Figure 3.3 The average of Figure 3.2 21
Figure 3.4 The foreground of the hand 22
Figure 3.5 Updated background images under different updating rates 23
Figure 3.6 Variation in the number of foreground pixels 24
Figure 3.7 ROI area 25
Figure 3.8 The variation of pixels in the ROI 26
Figure 3.9 Image after detection of the appearance of the hand 27
Figure 3.10 3x3 structuring element 28
Figure 3.11 The erosion operation example 28
Figure 3.12 The dilation operation example 28
Figure 3.13 Opening result example 29
Figure 3.14 The erosion operation example 29
Figure 3.15 Applying the morphological techniques to the input image 30
Figure 3.16 Example of component labeling [40] 31
Figure 3.17 The result of component labeling 32
Figure 3.18 The outlier is far from the median [43] 33
Figure 3.19 Example of filtering by outlier 34
Figure 3.20 Hand and skin-color-like objects 35
Figure 3.21 The CbCr distribution 35
Figure 3.22 Different environments with a hand 36
Figure 3.23 The linear regression of Figure 3.22 37
Figure 3.24 The mapping of Figure 3.23 37
Figure 3.25 The concept of using a loss function in this system 38
Figure 3.26 The distribution of errors of CbCr 39
Figure 3.27 The comparison of loss functions 40
Figure 3.28 The comparison of influence functions corresponding to Figure 3.27 41
Figure 3.29 The results of loss functions from Figure 3.20 42
Figure 3.30 The final result by the concept of loss functions 43
Figure 3.31 Dividing the edge of the image into 4 line segments [3] 44
Figure 3.32 The hand considered for segmentation 45
Figure 3.33 Example of calculating the angle of the reaching hand 46
Figure 3.34 An example after rotating 46
Figure 3.35 The range where the wrist line is located 47
Figure 3.36 The adjustment of the hand palm 47
Figure 3.37 A binary image transformed into a polar image 48
Figure 3.38 Applying the threshold rule to the example 49
Figure 3.39 Applying the threshold rule to the example 50
Figure 3.40 Applying the threshold rule to the example 50
Figure 3.41 Applying the threshold rule to the example 51
Figure 3.42 Finding fingertips by the polar image 51
Figure 3.43 The direction of the fingertip 52
Figure 3.44 When the user raises even fingers 52
Figure 3.45 The gestures in our system 53
Figure 4.1 The comparison of approaches 59
Figure 4.2 The comparison of approaches 66
Figure 4.3 The results of comparison 72
Figure 4.4 The results of comparison 80
Figure 4.5 The results of comparison 88
Figure 4.6 Misjudged case 90
Figure 4.7 Misjudged case 91
Figure 4.8 Misjudged case 92

Table List
Table 1.1 The specification of the webcam 4
Table 3.1 The loss functions used in the literature 40
Table 4.1 Test environment 54
Table 4.2 Representations of characters below the frames 55
Table 4.3 The result of experiment 1 59
Table 4.4 The result of experiment 2 65
Table 4.5 The result of experiment 3 70
Table 4.6 The experiment 3 result of adopting traditional skin color detection 71
Table 4.7 The experiment 3 result of adopting the approach in [3] 71
Table 4.8 The result of experiment 4 78
Table 4.9 The experiment 4 result of adopting traditional skin color detection 79
Table 4.10 The experiment 4 result of adopting the approach in [3] 79
Table 4.11 The result of experiment 5 86
Table 4.12 The experiment 5 result of adopting traditional skin color detection 87
Table 4.13 The experiment 5 result of adopting the approach in [3] 88

References

[1] Department of Statistics, Ministry of the Interior, Taiwan, http://www.moi.gov.tw/stat/
[2] Chen-Chia Chuang, "Robust Modeling in the Presence of Gross Errors" (in Chinese), Ph.D. dissertation, Department of Electrical Engineering, National Taiwan University of Science and Technology, 2000.
[3] 廖崇儒, "A Computer-Vision-Based Hand Gesture Recognition System for Wheelchair Users" (in Chinese), Master's thesis, Department of Electrical Engineering, National Taiwan University of Science and Technology, 2010.
[4] Kinect for Windows, http://www.microsoft.com/en-us/kinectforwindows/
[5] 張榮財 and 陳建宏, "Applying Principal Component Analysis and Back-Propagation Neural Networks to Sign Language Gesture Recognition" (in Chinese), 2010 9th Conference on Information Technology and Applications in Outlying Islands, pp. 2-5, 2010.
[6] K.H. Park, H.E. Lee, Y. Kim, and Z.Z. Bien, "A Steward Robot for Human-Friendly Human-Machine Interaction in a Smart House Environment," IEEE Transactions on Automation Science and Engineering, pp. 21-25, 2008.
[7] S.W. Lee, "Automatic Gesture Recognition for Intelligent Human-Robot Interaction," Proceedings of the 7th International Conference on Automatic Face and Gesture Recognition, pp. 1-6, 2006.
[8] Y. Sato, M. Saito, and H. Koike, "Real Time Input of 3D Pose and Gestures of a User's Hand and Its Application for HCI," Virtual Reality Annual International Symposium, pp. 79-86, 2001.
[9] Y. Wu and T.S. Huang, "Capturing Articulated Human Hand Motion: A Divide-and-Conquer Approach," Proceedings of the Seventh IEEE International Conference on Computer Vision, pp. 606-611, 1999.
[10] J. Lee and T.L. Kunii, "Model-Based Analysis of Hand Posture," IEEE Computer Graphics and Applications, pp. 77-86, 1995.
[11] K. Kwon, H. Zhang, and F. Dornaika, "Hand Pose Recovery with a Single Video Camera," IEEE International Conference on Robotics and Automation, pp. 1194-1200, 2001.
[12] Y. Shirai and Y. Kuno, "Hand Posture Estimation by Combining 2-D Appearance-Based and 3-D Model-Based Approaches," Proceedings of the 15th International Conference on Pattern Recognition, vol. 3, pp. 705-708, 2000.
[13] N. Shimada, K. Kimura, Y. Shirai, and Y. Kuno, "Rapid Hand Posture Recognition Using Adaptive Histogram Template of Skin and Hand Edge Contour," 6th Iranian Conference on Machine Vision and Image Processing (MVIP), pp. 1-5, Oct. 2010.
[14] L.W. Howe, F. Wong, and A. Chekima, "Comparison of Hand Segmentation Methodologies for Hand Gesture Recognition," International Symposium on Information Technology (ITSim 2008), vol. 2, pp. 1-7, 2008.
[15] 葉宗樺, "Fingertip Trajectory Pattern Recognition Using a Webcam" (in Chinese), Master's thesis, Southern Taiwan University of Technology, 2006.
[16] J. Isaacs and S. Foo, "Hand Pose Estimation for American Sign Language Recognition," Proceedings of the Thirty-Sixth Southeastern Symposium on System Theory, pp. 132-136, 2004.
[17] Wikipedia, http://en.wikipedia.org/wiki/Principal_component_analysis
[18] 曹文潔, "A Rock-Paper-Scissors Machine" (in Chinese), Master's thesis, Department of Electrical Engineering, National Central University, 2007.
[19] 黃俊捷, "Design and Implementation of an Interactive Biped Robot (I): Hand Gesture Recognition" (in Chinese), Master's thesis, Department of Electrical Engineering, National Central University, 2008.
[20] C. Kim and J.N. Hwang, "Fast and Automatic Video Object Segmentation and Tracking for Content-Based Applications," IEEE Transactions on Circuits and Systems for Video Technology, vol. 12, no. 7, pp. 597-612, 2002.
[21] S. Murali and R. Girisha, "Segmentation of Motion Objects from Surveillance Video Sequences Using Temporal Differencing Combined with Multiple Correlation," 6th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS 2009), pp. 472-477, 2009.
[22] J.L. Barron, D.J. Fleet, and S.S. Beauchemin, "Performance of Optical Flow Techniques," International Journal of Computer Vision, vol. 12, no. 1, pp. 43-77, 1994.
[23] C. Kim and J.N. Hwang, "Fast and Automatic Video Object Segmentation and Tracking for Content-Based Applications," IEEE Transactions on Circuits and Systems for Video Technology, vol. 12, no. 7, pp. 597-612, 2002.
[24] Z. Zivkovic, "Improved Adaptive Gaussian Mixture Model for Background Subtraction," Proceedings of the International Conference on Pattern Recognition, vol. 2, pp. 28-31, 2004.
[25] R.C. Gonzalez and R.E. Woods, Digital Image Processing, 2nd ed., Prentice Hall, 2002.
[26] L.K. Simone and D.G. Kamper, "Design Considerations for a Wearable Monitor to Measure Finger Posture," Journal of NeuroEngineering and Rehabilitation, 2005.
[27] S. Singh, D.S. Chauhan, M. Vatsa, and R. Singh, "A Robust Skin Color Based Face Detection Algorithm," Tamkang Journal of Science and Engineering, pp. 227-234, 2003.
[28] Wikipedia, http://zh.wikipedia.org/wiki/File:HSV_cone.jpg
[29] M.H. Yang, D.J. Kriegman, and N. Ahuja, "Detecting Faces in Images: A Survey," IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 34-58, 2002.
[30] R.L. Hsu, M. Abdel-Mottaleb, and A.K. Jain, "Face Detection in Color Images," IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 696-706, 2002.
[31] V. Tartter and K. Knowlton, "Perception of Sign Language from an Array of 27 Moving Spots," Nature, vol. 289, pp. 676-678, 1981.
[32] R.L. Hsu, M. Abdel-Mottaleb, and A.K. Jain, "Face Detection in Color Images," IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 696-706, 2002.
[33] VPL Research Inc., DataGlove Model 2 User's Manual, Redwood City, CA, 1987.
[34] Virtex Co., Company Brochure, Stanford, CA, October 1992.
[35] Polhemus, 3Space User's Manual, A Kaiser Aerospace & Electronics Company, 1987.
[36] V. Frati and D. Prattichizzo, "Using Kinect for Hand Tracking and Rendering in Wearable Haptics," IEEE World Haptics Conference, pp. 317-321, 2011.
[37] Z. Ren, J. Meng, and J. Yuan, "Depth Camera Based Hand Gesture Recognition and Its Applications in Human-Computer-Interaction," International Conference on Information and Communication Systems, 2011.
[38] 辛柏陞, "Design, Development, and Effectiveness Evaluation of a Virtual-Reality Hand Function Training System" (in Chinese), Ph.D. dissertation, Department of Mechanical Engineering, National Central University, 2005.
[39] 李慶銘, "Implementation of a Real-Time Audio-Video Teaching and Broadcasting System" (in Chinese), Master's thesis, Department of Electrical Engineering, National Taiwan University of Science and Technology, 2011.
[40] 王貞元, "Fire Detection and Region Localization for Video Surveillance" (in Chinese), Master's thesis, Department of Electrical Engineering, National Taiwan University of Science and Technology, 2008.
[41] 趙于翔, "A Portable Taiwanese Sign Language Speech System" (in Chinese), Master's thesis, Department of Electrical Engineering, Tamkang University, 2002.
[42] V. Barnett and T. Lewis, Outliers in Statistical Data, 3rd ed., John Wiley & Sons, 1994.
[43] Math Open Reference, http://www.mathopenref.com/tocs/trigfunctionstoc.html
[44] 涂又仁, "A Human-Machine Interface Using Face and Hand Gesture Recognition" (in Chinese), Master's thesis, Department of Electrical Engineering, National Chung Cheng University, 2007.
[45] E.J. Holden and R. Owens, "Recognizing Moving Hand Shapes," Proceedings of the 12th International Conference on Image Analysis and Processing, pp. 14-19, 2003.
[46] Median absolute deviation, http://en.wikipedia.org/wiki/Median_absolute_deviation
[47] Regression analysis, http://en.wikipedia.org/wiki/Regression_analysis

Full-text release date: 2017/07/31 (campus network)
Full text not authorized for public release (off-campus network)
Full text not authorized for public release (National Central Library: Taiwan NDLTD system)