
Author: Pham The Thinh
Thesis Title: Autonomous Orchid Buds Robot-arm Cutting Parameter Generation Based on Vision and Deep Learning Techniques
Advisor: Chyi-Yeu Lin
Committee Members: Chyi-Yeu Lin, Wei-Chen Lee, Po-Ting Lin
Degree: Master
Department: College of Engineering - Department of Mechanical Engineering
Year of Publication: 2023
Graduation Academic Year: 111
Language: English
Number of Pages: 71
Keywords: 2D instance segmentation, deep learning, 3D reconstruction, orchid plant
Abstract:

    Floriculture is a major industry in Taiwan, and Taiwan is one of the largest moth orchid exporters in the world. All orchid tissue culture production processes are still carried out manually, both in Taiwan and globally. Manual orchid tissue culture operation has two major drawbacks: inconsistent operation quality and the human operator as a source of contamination. This study aims to build vision-based algorithms to support autonomous orchid bud separation and trimming operations performed by a robot arm. Orchid buds vary in shape and size, which makes it difficult to determine their pose, and subsequently the cutting line for separating them, using basic computer vision techniques such as SIFT or HOG. Therefore, both 2D and 3D image information of the orchid buds is needed. Deep-learning-based 2D vision and 3D vision techniques are integrated to provide cutting position estimates for the autonomous bud cutting (separation) and leaf-and-dark-area trimming system. The Mask R-CNN model developed in this thesis performs stably and reliably, achieving a bounding-box AP of 81.6% and a mask AP of 78.6%, together with 88.57% accuracy on the same dataset at confidence-score thresholds of 0.5 and 0.9, respectively. For the bud-body slitting process, the system combines 2D object detection with 3D object pose estimation, and the bud slitting position estimation reaches 88% accuracy. For the leaf-and-dark-area trimming position estimation, deep learning and classical computer vision techniques are integrated to estimate the target location; the deep learning model achieves a bounding-box AP of 55.7% and a mask AP of 56.5%, and the success rate of this trimming position estimation over a large number of test experiments exceeds 97%. In summary, the algorithms proposed in this thesis demonstrate the value and great potential of bud-body slitting and leaf-and-dark-area trimming position estimation for an advanced, fully autonomous, robot-conducted orchid tissue culture process.
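
    As a hedged illustration of the 2D-plus-3D pipeline summarized above, the following minimal sketch shows how one instance mask produced by a 2D segmentation network (e.g., Mask R-CNN) might be combined with a stereo depth map and pinhole camera intrinsics to obtain a 3D cutting position in the camera frame. The function name, the choice of the mask centroid as the cut point, and the intrinsic values in the demo are illustrative assumptions rather than the implementation used in the thesis.

    import numpy as np

    def cut_point_from_mask(mask, depth_m, fx, fy, cx, cy):
        """Back-project the centroid of a 2D instance mask to a 3D point.

        mask    : HxW boolean array from the instance-segmentation network
        depth_m : HxW depth map in metres, aligned with the mask
        fx, fy, cx, cy : pinhole intrinsics of the (left) camera
        Returns an (X, Y, Z) point in the camera coordinate frame.
        """
        ys, xs = np.nonzero(mask)
        if xs.size == 0:
            raise ValueError("empty mask")
        # Use the mask centroid as an illustrative cut point; the thesis may
        # derive the actual cutting line differently.
        u, v = xs.mean(), ys.mean()
        # Median depth over the mask is robust to stereo-matching outliers.
        z = float(np.median(depth_m[ys, xs]))
        x = (u - cx) * z / fx
        y = (v - cy) * z / fy
        return np.array([x, y, z])

    # Usage with synthetic data (the intrinsics are placeholders, not the
    # calibration values of the cameras used in the thesis).
    if __name__ == "__main__":
        h, w = 480, 640
        mask = np.zeros((h, w), dtype=bool)
        mask[200:260, 300:360] = True        # pretend this is one detected bud
        depth = np.full((h, w), 0.45)        # flat 0.45 m depth for the demo
        print(cut_point_from_mask(mask, depth, fx=615.0, fy=615.0, cx=320.0, cy=240.0))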

    Table of Contents
    摘要 (Chinese Abstract) I
    ABSTRACT II
    ACKNOWLEDGEMENTS III
    TABLE OF CONTENTS IV
    LIST OF FIGURES VII
    LIST OF TABLES X
    Chapter 1 Introduction 1
      1.1 Background and Literature Review 1
      1.2 The Objective and Scope of the Study 2
      1.3 Thesis Organization 2
    Chapter 2 Vision Technique Elements 3
      2.1 Camera Model 3
        2.1.1 Camera Imaging Principle 3
        2.1.2 Intrinsic Parameters 4
        2.1.3 Extrinsic Parameters 5
        2.1.4 Distortion 6
      2.2 The Stereo Vision System 7
        2.2.1 Stereo Cameras 7
        2.2.2 Two-view Geometry 10
        2.2.3 Stereo Vision Calibration 10
        2.2.4 Stereo Matching 12
        2.2.5 Depth Map 13
      2.3 The Object Detection 13
      2.4 Deep Learning for Instance Segmentation 16
    Chapter 3 Deep Learning-based Detection Algorithms for the Orchid Sprouts 19
      3.1 Introduction 19
      3.2 Deep Learning Algorithms 19
        3.2.1 Mask R-CNN 19
        3.2.2 BlendMask 20
        3.2.3 CondInst 21
        3.2.4 SOLOv2 22
        3.2.5 BoxInst 23
      3.3 Implementation 23
        3.3.1 The Architecture of the System 23
        3.3.2 Orchid Dataset 24
        3.3.3 Data Augmentation 25
        3.3.4 Performance Evaluation 26
        3.3.5 Setting Parameters 27
      3.4 Results and Conclusion 27
        3.4.1 Results 27
        3.4.2 Conclusion 34
    Chapter 4 Deep Learning-based Cutting Position Estimation for Orchid Sprouts 35
      4.1 Overall System 35
        4.1.1 Definitions and Descriptions of Terms of the Process 35
      4.2 Deep Learning-based Bud Slitting Position Estimation of the Orchid Sprouts 37
        4.2.1 The Architecture of the System 37
        4.2.2 Methodology and Experiment 38
        4.2.3 Results and Conclusion 41
      4.3 Deep Learning-based Leaf and Dark Area Trimming Position Estimation of the Orchid Sprouts 43
        4.3.1 The Architecture of the System 43
        4.3.2 Methodology and Experiment 44
        4.3.3 Results and Conclusion 48
    Chapter 5 Conclusion and Future Works 51
      5.1 Conclusion 51
      5.2 Future Works 52
    References 53


    Full-Text Availability: 2033/01/05 (campus network)
    Full-Text Availability: full text not authorized for public access (off-campus network)
    Full-Text Availability: 2113/01/05 (National Central Library: Taiwan NDLTD system)