
Author: Faisal Fuad Nursyahid (明言林)
Thesis Title: Foreign Objects Detection Using Deep Learning - A Case Study on Graphic Card Assembly Line (應用深度學習於外來物偵測-以顯示卡組裝線為例)
Advisor: Ren-Jieh Kuo (郭人介)
Committee Members: Shih-Che Lo (羅士哲), Chao Ou-Yang (歐陽超)
Degree: Master
Department: Department of Industrial Management, College of Management
Publication Year: 2021
Academic Year of Graduation: 109 (ROC calendar)
Language: English
Number of Pages: 87
Keywords: Foreign Object Detection, Attention, CNN, U-net (外來物檢測, 注意力, 卷積神經網路, U-net)


    An assembly process manufactures goods from semi-finished parts into finished goods, and is carried out by operators and machines. At the end of the assembly line, it is necessary to implement quality inspection and to ensure that no foreign object is left on the conveyor. However, this manual inspection process is error-prone and time-consuming. Therefore, this study uses a case of foreign object detection in graphic card manufacturing to develop a convolutional neural network (CNN) model capable of detecting and labeling foreign objects.
    This study used Inception-Resnet v2 to classify foreign objects and Attention Residual U-net ++ to segment them. A comprehensive comparison of attention mechanisms and activation functions in Inception-Resnet v2 was conducted: the squeeze-and-excitation network (SE Net) and the convolutional block attention module (CBAM) were each combined with the Swish and Mish activation functions and evaluated on the Cifar-10 and Cifar-100 datasets. The results showed that Inception-Resnet outperforms ZF-Net and VGG19 on the Cifar-10 and Cifar-100 datasets, with accuracies of 79.22% and 49.44%, respectively. Adding the SE module and the Mish activation function to Inception-Resnet v2 then improved accuracy by a further 4.59% and 5.06%, respectively, on the same datasets.
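    As a rough illustration of the two components credited with these gains, the following is a minimal PyTorch sketch of a squeeze-and-excitation block paired with the Mish activation. The reduction ratio, the placement of Mish inside the bottleneck, and all layer sizes are illustrative assumptions; the abstract does not specify how the thesis wires these into Inception-Resnet v2.

        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        class Mish(nn.Module):
            # Mish (Misra, 2019): x * tanh(softplus(x)), smooth and non-monotonic.
            def forward(self, x):
                return x * torch.tanh(F.softplus(x))

        class SEBlock(nn.Module):
            # Squeeze-and-excitation (Hu et al., 2018): global-average-pool each
            # channel ("squeeze"), then re-weight channels with a bottleneck MLP
            # ("excite"). reduction=16 is the original paper's default, not
            # necessarily the thesis's setting.
            def __init__(self, channels: int, reduction: int = 16):
                super().__init__()
                self.pool = nn.AdaptiveAvgPool2d(1)
                self.fc = nn.Sequential(
                    nn.Linear(channels, channels // reduction),
                    Mish(),  # assumption: Mish inside the bottleneck as well
                    nn.Linear(channels // reduction, channels),
                    nn.Sigmoid(),  # per-channel gates in (0, 1)
                )

            def forward(self, x: torch.Tensor) -> torch.Tensor:
                b, c, _, _ = x.shape
                w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
                return x * w  # rescale each feature map by its gate

    A quick smoke test such as SEBlock(64)(torch.randn(2, 64, 32, 32)) confirms the block preserves its input shape, which is what lets it drop into an existing residual branch.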
    Furthermore, this study compared the existing U-net models, such as U-net, Residual U-net, Attention U-net, and U-net ++, with the proposed model, Attention Residual U-net ++, on the Oxford-IIIT Pet and Carvana datasets. The results showed that the proposed model performs best on the Carvana dataset, with an IoU of 94.13%, while Attention U-net outperforms the proposed model on the Oxford-IIIT Pet dataset, with an IoU of 87.83%. In the case study, Inception-Resnet v2 and Attention Residual U-net ++ achieved the highest accuracy and IoU, making them the best classification and segmentation models, respectively.
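    For reference, the IoU figures quoted above measure the overlap between a predicted mask and its ground truth. Below is a minimal sketch of the metric for binary segmentation, assuming PyTorch tensors, a sigmoid model output, and a 0.5 threshold; the threshold and the eps guard are illustrative choices, not taken from the thesis.

        import torch

        def iou_score(logits: torch.Tensor, target: torch.Tensor,
                      threshold: float = 0.5, eps: float = 1e-7) -> float:
            # Binarize predictions, then compute intersection over union.
            pred = torch.sigmoid(logits) > threshold
            true = target.bool()
            intersection = (pred & true).sum().item()
            union = (pred | true).sum().item()
            return (intersection + eps) / (union + eps)

        # e.g., iou_score(model_output, mask) returning 0.9413 would match the
        # 94.13% reported for Attention Residual U-net ++ on Carvana.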

    ACKNOWLEDGEMENT i
    摘要 (Chinese Abstract) ii
    ABSTRACT iii
    TABLE OF CONTENTS iv
    LIST OF TABLES vii
    LIST OF FIGURES viii
    CHAPTER 1 INTRODUCTION 1
      1.1 Background and Motivation 1
        1.1.1 History of computer vision 2
        1.1.2 Development of CNN 4
      1.2 Research Objectives 5
      1.3 Research Scope, Constraints, and Assumptions 5
      1.4 Thesis Organization 6
    CHAPTER 2 LITERATURE REVIEW 8
      2.1 Foreign Object Detection 8
      2.2 Convolutional Neural Network (CNN) 11
        2.2.1 Convolutional layer 11
        2.2.2 Sub-sampling layer (Pooling layer) 12
        2.2.3 Fully connected layer 12
      2.3 CNN Architecture 13
        2.3.1 Alex network 13
        2.3.2 Zeiler-Fergus network (ZF net) 14
        2.3.3 Inception network 14
        2.3.4 VGG net 16
        2.3.5 Residual network 16
      2.4 U-network (U-net) 18
        2.4.1 Attention U-net 19
        2.4.2 Recurrent Residual U-net (R2U-net) 20
        2.4.3 U-net ++ 21
      2.5 Attention Module for CNN 22
        2.5.1 Squeeze-excited network (SE net) 22
        2.5.2 Convolutional block attention module (CBAM) 23
    CHAPTER 3 METHODOLOGY 24
      3.1 Methodology Framework 24
      3.2 Data Collection 26
        3.2.1 Benchmark dataset 26
        3.2.2 Case study dataset 28
      3.3 The Proposed Model 30
        3.3.1 Classification task 30
        3.3.2 Segmentation task 35
      3.4 Model Verification 37
        3.4.1 Accuracy 37
        3.4.2 Intersection over Union (IoU) 38
    CHAPTER 4 EXPERIMENTAL RESULTS 39
      4.1 Classification Task 39
        4.1.1 Cifar-10 dataset 39
        4.1.2 Cifar-100 dataset 43
      4.2 Segmentation Task 46
        4.2.1 Oxford-IIIT Pet dataset 46
        4.2.2 Carvana dataset 48
    CHAPTER 5 CASE STUDY 52
      5.1 Case Study Description 52
      5.2 Training Parameters 53
        5.2.1 Classification task 53
        5.2.2 Segmentation task 54
      5.3 Classification Task Analysis 54
        5.3.1 Method comparison 54
        5.3.2 Effect of changing activation function on Inception Resnet v2 56
        5.3.3 Effect of adding attention mechanism to Inception Resnet v2 58
      5.4 Segmentation Task Analysis 59
        5.4.1 Statistical hypothesis 61
    CHAPTER 6 CONCLUSIONS AND FUTURE RESEARCH 63
      6.1 Conclusions 63
      6.2 Research Contributions 64
      6.3 Research Limitations 64
      6.4 Suggestions for Future Research 64
    REFERENCES 66
    APPENDIX 71

    Alom, M. Z., Hasan, M., Yakopcic, C., Taha, T. M., & Asari, V. K. (2018). Recurrent residual convolutional neural network based on U-net (R2U-net) for medical image segmentation. arXiv preprint arXiv:1802.06955.
    Dieleman, S., Willett, K. W., & Dambre, J. (2015). Rotation-invariant convolutional neural networks for galaxy morphology prediction. Monthly Notices of the Royal Astronomical Society, 450(2), 1441-1459.
    Fukushima, K. (1988). Neocognitron: A hierarchical neural network capable of visual pattern recognition. Neural Networks, 1(2), 119-130.
    Hu, J., Shen, L., & Sun, G. (2018). Squeeze-and-excitation networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 7132-7141), Salt Lake City, USA.
    Hubel, D. H., & Wiesel, T. N. (1959). Receptive fields of single neurones in the cat's striate cortex. The Journal of Physiology, 148(3), 574-591. doi:10.1113/jphysiol.1959.sp006308
    Kirsch, R. A., Cahn, L., Ray, C., & Urban, G. H. (1957). Experiments in processing pictorial information with a digital computer. Proceedings of the Eastern Joint Computer Conference: Computers with Deadlines to Meet, Washington, D.C., USA.
    Kon, S., Watabe, K., & Horibe, M. (2018). Nondestructive method using transmission line for detection of foreign objects in food. Proceedings of the 2018 IEEE Sensors Applications Symposium (SAS) (pp. 1-4), Seoul, South Korea.
    Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2017). ImageNet classification with deep convolutional neural networks. Communications of the ACM, 60(6), 84-90.
    Kwon, J.-S., Lee, J.-M., & Kim, W.-Y. (2008). Real-time detection of foreign objects using X-ray imaging for dry food manufacturing line. Proceedings of the 2008 IEEE International Symposium on Consumer Electronics (pp. 1-4), Vilamoura, Portugal.
    LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11), 2278-2324. doi:10.1109/5.726791
    Lu, L., Shin, Y., Su, Y., & Karniadakis, G. E. (2019). Dying ReLU and initialization: Theory and numerical examples. arXiv preprint arXiv:1903.06733.
    Misra, D. (2019). Mish: A self-regularized non-monotonic neural activation function. arXiv preprint arXiv:1908.08681.
    Mittal, S., Chopra, C., Trivedi, A., & Chanak, P. (2019). Defect segmentation in surfaces using deep learning. Proceedings of the 2019 International Conference on Issues and Challenges in Intelligent Computing Techniques (ICICT) (pp. 1-6), Ghaziabad, India.
    Moghadas, S. M., & Rabbani, N. (2010). Detection and classification of foreign substances in medical vials using MLP neural network and SVM. Proceedings of the 2010 6th Iranian Conference on Machine Vision and Image Processing (pp. 1-5), Isfahan, Iran.
    Oktay, O., Schlemper, J., Folgoc, L. L., Lee, M., Heinrich, M., Misawa, K., & Kainz, B. (2018). Attention U-net: Learning where to look for the pancreas. arXiv preprint arXiv:1804.03999.
    Ramachandran, P., Zoph, B., & Le, Q. V. (2017). Searching for activation functions. arXiv preprint arXiv:1710.05941.
    Roberts, L. G. (1963). Machine perception of three-dimensional solids (Doctoral dissertation). Massachusetts Institute of Technology, USA.
    Rong, D., Xie, L., & Ying, Y. (2019). Computer vision detection of foreign objects in walnuts using deep learning. Computers and Electronics in Agriculture, 162, 1001-1010.
    Ronneberger, O., Fischer, P., & Brox, T. (2015). U-net: Convolutional networks for biomedical image segmentation. Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention (pp. 234-241), Munich, Germany.
    Sarakon, P., Kawano, H., & Serikawa, S. (2019). Surface-defect segmentation using U-shaped inverted residuals. Proceedings of the 2019 12th Biomedical Engineering International Conference (BMEiCON) (pp. 1-4), Ubon Ratchathani, Thailand.
    Simonyan, K., & Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556.
    Szegedy, C., Ioffe, S., Vanhoucke, V., & Alemi, A. (2017). Inception-v4, Inception-ResNet and the impact of residual connections on learning. Proceedings of the AAAI Conference on Artificial Intelligence, 31(1).
    Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., & Rabinovich, A. (2015). Going deeper with convolutions. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 1-9), Boston, USA.
    Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., & Wojna, Z. (2016). Rethinking the Inception architecture for computer vision. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 2818-2826), Las Vegas, USA.
    Viola, P., & Jones, M. (2004). Robust real-time face detection. International Journal of Computer Vision, 57(2), 137-154.
    Wang, W., Yang, Y., Wang, X., Wang, W., & Li, J. (2019). Development of convolutional neural network and its application in image classification: A survey. Optical Engineering, 58(4), 040901.
    Woo, S., Park, J., Lee, J.-Y., & Kweon, I. S. (2018). CBAM: Convolutional block attention module. Proceedings of the European Conference on Computer Vision (ECCV) (pp. 3-19), Munich, Germany.
    Xie, S., Girshick, R., Dollár, P., Tu, Z., & He, K. (2017). Aggregated residual transformations for deep neural networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA.
    Zeiler, M. D., & Fergus, R. (2014). Visualizing and understanding convolutional networks. Proceedings of the European Conference on Computer Vision (pp. 818-833), Zurich, Switzerland.
    Zhao, X., Li, X., Yin, L., Feng, W., Zhang, N., & Zhang, X. (2019). Foreign body recognition for coal mine conveyor based on improved PCANSet. Proceedings of the 2019 11th International Conference on Wireless Communications and Signal Processing (WCSP) (pp. 1-6), Xi'an, China.
    Zhou, Z., Siddiquee, M. M. R., Tajbakhsh, N., & Liang, J. (2018). UNet++: A nested U-net architecture for medical image segmentation. In Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support (pp. 3-11). Springer.

    Full text available from 2026/01/27 (campus network).
    Full text available from 2026/01/27 (off-campus network).
    Full text not authorized for release through the National Central Library (Taiwan NDLTD system).