
Author: Ting-hsuan Lin (林婷萱)
Thesis Title: A Novel 3-D Motion Control System Based on Binocular Stereo Vision and Fingertip Detection Techniques (一個基於雙眼立體視覺及指尖偵測技術的三維動作控制系統)
Advisor: Chin-shyurng Fahn (范欽雄)
Committee Members: Jung-Hua Wang (王榮華), Wei-Min Jeng (鄭為民), Tai-Lin Chin (金台齡)
Degree: Master
Department: Department of Computer Science and Information Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2015
Graduation Academic Year: 103 (2014-2015)
Language: English
Number of Pages: 88
Keywords: human-computer interaction, motion control, fingertip detection, computational geometry, binocular stereo vision, gesture recognition
Hits: 266; Downloads: 3
Abstract:

    With the advance of technology, people's lives have become inseparable from all kinds of electronic devices. To make these devices more convenient and user-friendly, human-computer interaction has become an important topic. Take the development of the mobile phone as an example: the shift from traditional buttons to smart touch panels allowed many more functions to be implemented. In recent years, with the emergence of somatosensory technology, a physical controller is no longer needed to interact with a system, which greatly narrows the gap between systems and people. Among these technologies, interacting with the hands is considered the most intuitive approach.

    This thesis proposes a fingertip interaction system based on binocular stereo vision that, with the simplest possible equipment and environment setup, accurately locates each fingertip in three-dimensional space and demonstrates the recognition of several common fingertip trajectories. First, to detect fingertips, we use color information and geometric features to compute the fingertip positions in the two-dimensional image plane. Next, a stereo vision method constructs a disparity map to obtain the depth of each fingertip in three-dimensional space. Finally, we extract three-dimensional features of the fingertip trajectories and apply machine learning methods to train on and recognize the trajectories.

    Using a dual-camera mobile phone, we conducted two sets of experiments with different testers under different lighting conditions; the system accurately detects fingertip positions and recognizes dynamic fingertip trajectories. The first experiment concerns fingertip detection, counting the extended fingers from zero to five, with an average accuracy of 91.78%. The second experiment concerns fingertip trajectory recognition, distinguishing three dynamic fingertip trajectories, with an average accuracy of 88.40%. The proposed system runs in real time: for images of 320x180 resolution, the overall average throughput is about 25 frames per second.
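    The two-dimensional stage of this pipeline (Chapter 3 of the thesis) combines Gaussian blurring, skin-color detection, morphological cleanup, contour extraction, convex hulls, and convexity defects. The following is a minimal OpenCV sketch of such a stage, not the author's implementation: the YCrCb skin bounds, kernel size, and defect-depth threshold are illustrative assumptions.

```python
# Minimal sketch of the 2-D fingertip detection stage: skin-colour
# segmentation, morphology, largest contour, convex hull, convexity defects.
# All thresholds here are illustrative assumptions, not the thesis's values.
import cv2
import numpy as np

SKIN_LOW = np.array((0, 133, 77), np.uint8)      # assumed YCrCb skin bounds
SKIN_HIGH = np.array((255, 173, 127), np.uint8)

def detect_fingertips(frame_bgr):
    # 1. Gaussian blur suppresses sensor noise before thresholding.
    blurred = cv2.GaussianBlur(frame_bgr, (5, 5), 0)

    # 2. Skin-colour detection in the YCrCb colour space.
    ycrcb = cv2.cvtColor(blurred, cv2.COLOR_BGR2YCrCb)
    mask = cv2.inRange(ycrcb, SKIN_LOW, SKIN_HIGH)

    # 3. Morphological opening and closing clean up the binary hand mask.
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (7, 7))
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)

    # 4. Keep the largest contour as the hand region (OpenCV 4.x signature).
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return []
    hand = max(contours, key=cv2.contourArea)

    # 5. Convexity defects mark the "valleys" between fingers; the hull points
    #    at each defect's start/end are kept as fingertip candidates.
    hull_idx = cv2.convexHull(hand, returnPoints=False)
    if hull_idx is None or len(hull_idx) < 4:
        return []
    defects = cv2.convexityDefects(hand, hull_idx)
    tips = []
    if defects is not None:
        for start, end, _far, fix_depth in defects[:, 0]:
            if fix_depth > 10000:        # depth is in 1/256 px, so ~40 px
                tips.append(tuple(hand[start][0]))
                tips.append(tuple(hand[end][0]))
    return tips
```

    In practice, candidates from adjacent defects that land on the same finger would be merged before counting the extended fingers from zero to five, as in the first experiment.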

Table of Contents:
Chinese Abstract
Abstract
Acknowledgements
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
  1.1 Overview
  1.2 Motivation
  1.3 System Description
  1.4 Thesis Organization
Chapter 2 Background and Related Work
Chapter 3 Preprocessing and Fingertip Detection
  3.1 Hand Segmentation
    3.1.1 Gaussian Blur
    3.1.2 Skin Color Detection
    3.1.3 Morphology Processing
  3.2 Fingertip Detection
    3.2.1 Calculating Contour
    3.2.2 Convex Hull
    3.2.3 Convexity Defect
Chapter 4 Dynamic Gesture Recognition
  4.1 Stereo Vision
    4.1.1 Synopsis of Stereo Vision
    4.1.2 Camera Calibration
    4.1.3 Stereo Rectification
    4.1.4 Block Matching
  4.2 Features for Training
  4.3 Support Vector Machine
  4.4 Random Forest
Chapter 5 Experimental Results and Discussions
  5.1 Experiment Setup
  5.2 The Result of Fingertip Detection
  5.3 The Result of Gesture Recognition
Chapter 6 Conclusions and Future Works
  6.1 Conclusions
  6.2 Future Works
References
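    The depth-recovery and recognition steps listed under Chapter 4 above (block matching on a calibrated, rectified stereo pair, followed by SVM or Random Forest classification of trajectory features) can likewise be sketched as follows. The focal length, baseline, matcher parameters, and the dummy trajectory features are assumptions for illustration only, not values from the thesis.

```python
# Sketch of the depth-recovery stage: block matching over a rectified stereo
# pair gives a disparity map, and metric depth follows from Z = f * B / d.
# FOCAL_PX and BASELINE_M are assumed placeholders, not calibration results.
import cv2
import numpy as np

FOCAL_PX = 500.0    # assumed focal length in pixels
BASELINE_M = 0.03   # assumed camera baseline in metres

def fingertip_depth(left_gray, right_gray, tip_xy):
    """Return metric depth at a fingertip pixel, or None if unmatched.
    Both inputs must be 8-bit single-channel, already rectified images."""
    matcher = cv2.StereoBM_create(numDisparities=64, blockSize=15)
    disparity = matcher.compute(left_gray, right_gray).astype(np.float32) / 16.0
    x, y = tip_xy
    d = disparity[y, x]
    if d <= 0:                      # no reliable match at this pixel
        return None
    return FOCAL_PX * BASELINE_M / d
```

    Once each recorded gesture has been summarized into a fixed-length 3-D trajectory feature vector (Section 4.2 of the thesis), training a support vector machine on those vectors takes only a few lines; random data stands in for the real features here.

```python
from sklearn.svm import SVC
import numpy as np

rng = np.random.default_rng(0)
train_features = rng.normal(size=(60, 12))      # dummy trajectory features
train_labels = rng.integers(0, 3, size=60)      # three gesture classes
clf = SVC(kernel="rbf").fit(train_features, train_labels)
print(clf.predict(train_features[:5]))          # recognise sample trajectories
```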


Full text release date: 2020/02/03 (campus network)
    Full text release date: 2025/02/03 (off-campus network)
    Full text release date: 2025/02/03 (National Central Library: Taiwan NDLTD system)