Student: Shu-You Lin (林書佑)
Thesis Title: Study of the Detection Technique for Visual Three-Dimensional Street View (視覺化三維街景偵測技術之研究)
Advisor: Sheng-Dong Xu (徐勝均)
Committee Members: Chung-Cheng Chiu (瞿忠正), Cheng-Hao Ko (柯正浩), Sheng-Dong Xu (徐勝均)
Degree: Master
Department: Graduate Institute of Automation and Control, College of Engineering
Publication Year: 2023
Graduation Academic Year: 111
Language: Chinese
Number of Pages: 67
Chinese Keywords: 立體視覺、三維重建、街景影像
Foreign Keywords: Stereo vision, three-dimensional reconstruction, street view
Abstract: In recent years, street view information has played an important role in people's lives. With the widespread adoption of the Google Street View service and the development of street-scene detection and recognition technologies for self-driving cars, 3D street view information has become an indispensable part of modern life. At present, single-camera stereo vision generally relies on Visual Simultaneous Localization and Mapping (VSLAM) methods to localize within the street environment and reconstruct a 3D map. This study implements stereo vision detection with a single camera: the camera is mounted on a mobile vehicle and captures street view images parallel to the direction of travel. From the sequence of images captured along a straight path, two images at a time are selected in order for 3D street view detection. Since no sufficiently accurate method currently exists to estimate the travel distance between two images, this thesis proposes a relative baseline analysis algorithm that estimates this distance with comparatively high accuracy, thereby establishing the coordinate transformation between consecutive images so that the 3D street views can be stitched correctly. The proposed method is verified and analyzed using both the image sequences captured by our own equipment and street view images taken by Google Street View cars. Experimental results show that the method achieves 3D street view reconstruction and stitching, and that the position of, and relative distance between, individual objects can be clearly observed and distinguished from different viewing angles.
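The full text is not reproduced in this record, but the abstract's central point, that monocular two-view geometry fixes the inter-frame translation only up to scale, can be illustrated with a short sketch. The following Python/OpenCV example is an assumption-laden illustration, not the thesis's implementation: the ORB detector, the intrinsic matrix K, the baseline_length parameter, and the function name two_view_points are all stand-ins. It recovers the relative pose between two frames and shows where an externally estimated baseline length must enter the reconstruction.

    import cv2
    import numpy as np

    def two_view_points(img1, img2, K, baseline_length):
        # Detect and match features. ORB is an assumption; the record does
        # not say which detector/matcher the thesis uses.
        orb = cv2.ORB_create(nfeatures=2000)
        kp1, des1 = orb.detectAndCompute(img1, None)
        kp2, des2 = orb.detectAndCompute(img2, None)
        matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
        matches = matcher.match(des1, des2)
        pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
        pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])

        # Essential matrix and relative pose. recoverPose returns the
        # translation t as a UNIT vector: two-view monocular geometry
        # cannot determine the metric baseline on its own.
        E, mask = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC,
                                       prob=0.999, threshold=1.0)
        _, R, t, mask = cv2.recoverPose(E, pts1, pts2, K, mask=mask)

        # Scale the unit translation by the externally estimated travel
        # distance between the two shots, then triangulate inlier matches.
        P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
        P2 = K @ np.hstack([R, t * baseline_length])
        inliers = mask.ravel() > 0
        pts4d = cv2.triangulatePoints(P1, P2, pts1[inliers].T, pts2[inliers].T)
        return (pts4d[:3] / pts4d[3]).T  # N x 3 points, first camera's frame

Without that external scale, reconstructions from successive image pairs would each float at an arbitrary and mutually inconsistent scale and could not be stitched into one coherent 3D street view; supplying a consistent per-pair travel distance is exactly the gap the relative baseline analysis algorithm described in the abstract targets.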

    Table of Contents:
    Acknowledgments II
    Abstract (Chinese) III
    Abstract IV
    Contents V
    List of Figures VI
    List of Tables IX
    Chapter 1: Introduction 1
      1.1 Research Motivation and Background 1
      1.2 Research Objectives 7
      1.3 Thesis Organization 10
    Chapter 2: Literature Review 11
    Chapter 3: 3D Reconstruction Processing Algorithm 23
      3.1 Image Matching 23
      3.2 Relative Baseline Analysis Algorithm 31
      3.3 Coordinate Transformation 32
    Chapter 4: Experimental Analysis and Results 40
    Chapter 5: Conclusions and Future Work 59
      5.1 Conclusions 59
      5.2 Future Work 59
    References 61


    Full-Text Release Date: 2025/08/15 (campus network)
    Full-Text Release Date: 2025/08/15 (off-campus network)
    Full-Text Release Date: 2025/08/15 (National Central Library: Taiwan NDLTD system)