| Field | Content |
|---|---|
| Author | 郭昱慶 (Yu-Cing Guo) |
| Thesis Title | 融合色彩與深度影像資訊建立三維場景模型 (Fusion of Color and Range Images for 3D Scene Modeling) |
| Advisor | 徐繼聖 (Gee-Sern Hsu) |
| Committee Members | 鍾國亮 (Kuo-Liang Chung), 鐘聖倫 (Sern-Lun Chung), 洪一平 (Yi-Ping Hung), 郭景明 (Jing-Ming Guo) |
| Degree | Master |
| Department | Department of Mechanical Engineering, College of Engineering |
| Year of Publication | 2011 |
| Academic Year of Graduation | 99 |
| Language | Chinese |
| Pages | 50 |
| Keywords (Chinese) | SFM, SFS, MSCR, MSER, depth, 3D scene |
| Keywords (English) | SFM, SFS, MSCR, MSER, Depth, 3D scene |
Most 3D scene modeling methods combine multiple sets of 2D pixel observations with a mathematical model to derive 3D scene coordinates, as in Shape From Motion (SFM) or Shape From Shading (SFS). However, such methods are sensitive to illumination when estimating depth, which introduces errors into the scene model. This thesis proposes a 3D scene modeling method that combines depth images with color images: depth features, which are largely insensitive to illumination, are extracted and fused with color feature regions to segment the scene into regions, and real-time foreground and face detection is then applied to confirm the background, so that a 3D scene model can be built effectively. The proposed method has two main parts. First, exploiting the distinct gradient characteristics of different planes in the depth image, Maximally Stable Extremal Regions (MSER) are used to extract the major planes of the scene; Maximally Stable Colour Regions (MSCR) then segment the regions and edges of the color image so that finer regions can be added to the scene model. Second, because people frequently appear in a scene and occlude parts of the background, interfering with scene acquisition, foreground and face detection is applied to exclude the foreground from the scene model and to distinguish objects that have remained in the scene for an extended time. Experimental results demonstrate that the proposed method works under a variety of indoor lighting conditions.
Unlike most image-based 3D scene modeling, which is easily affected by illumination variations, this research proposes a scene modeling system robust to such variations through a fusion of color and depth images. The gradient taken on the depth images and processed by an MSER (Maximally Stable Extremal Region) detector offers an efficient way to extract large planes from the scene, while smaller planes are extracted by an MSCR (Maximally Stable Colour Region) detector operating on the color images. Fusing the outcomes of the MSER and MSCR channels yields an accurate model of the 3D scene. When an object or a person moves into the scene, the system applies human and face detection to differentiate humans from stationary objects, and updates the scene model with the object. Experiments show that the proposed system models most indoor scenes well across various viewpoints and illumination conditions, as long as the scene lies within the working range of the depth camera.
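Both abstracts describe extracting large planes from the gradient of the depth image, where planar surfaces produce uniform, low gradients. As a rough illustration of that cue only (not the thesis implementation, which feeds the gradient to an MSER detector), the sketch below thresholds a depth-gradient magnitude and groups low-gradient pixels into candidate planes; the synthetic depth map, the threshold value, and the flood-fill grouping are all illustrative assumptions.

```python
# Illustrative sketch: find candidate planar regions in a depth image by
# thresholding the depth gradient. The thesis uses MSER on this gradient;
# here a simple threshold + flood fill stands in for the stable-region step.
from collections import deque

def gradient_magnitude(depth):
    """Approximate per-pixel gradient magnitude with forward differences."""
    h, w = len(depth), len(depth[0])
    g = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx = depth[y][x + 1] - depth[y][x] if x + 1 < w else 0.0
            dy = depth[y + 1][x] - depth[y][x] if y + 1 < h else 0.0
            g[y][x] = abs(dx) + abs(dy)  # L1 magnitude suffices for a sketch
    return g

def planar_regions(depth, grad_thresh=0.5, min_size=4):
    """Group connected low-gradient pixels; each group is a candidate plane."""
    g = gradient_magnitude(depth)
    h, w = len(depth), len(depth[0])
    seen = [[False] * w for _ in range(h)]
    regions = []
    for sy in range(h):
        for sx in range(w):
            if seen[sy][sx] or g[sy][sx] > grad_thresh:
                continue
            # BFS flood fill over the low-gradient mask (4-connectivity)
            queue, member = deque([(sy, sx)]), []
            seen[sy][sx] = True
            while queue:
                y, x = queue.popleft()
                member.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w and not seen[ny][nx]
                            and g[ny][nx] <= grad_thresh):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(member) >= min_size:
                regions.append(member)
    return regions

# Synthetic 4x8 depth map: two flat planes at depths 1.0 and 3.0 with a step
# between them; the step produces a high-gradient column that splits them.
depth = [[1.0] * 4 + [3.0] * 4 for _ in range(4)]
regions = planar_regions(depth)
```

On this toy input the step edge separates the two flat areas, so `planar_regions` returns two candidate planes. In the actual system, MSER's stability criterion replaces the fixed `grad_thresh`, making the extraction far less sensitive to the choice of threshold.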