
Author: Faisal Fuad Nursyahid (明言林)
Thesis Title: Foreign Objects Detection Using Deep Learning - A Case Study on Graphic Card Assembly Line (應用深度學習於外來物偵測-以顯示卡組裝線為例)
Advisor: Ren-Jieh Kuo (郭人介)
Committee Members: Shih-Che Lo (羅士哲), Chao Ou-Yang (歐陽超)
Degree: Master
Department: Department of Industrial Management, College of Management
Publication Year: 2021
Academic Year of Graduation: 109 (ROC calendar)
Language: English
Number of Pages: 87
Keywords: Foreign Object Detection, Attention, CNN, U-net (外來物檢測, 注意力, 卷積神經網路, U-net)


    An assembly process manufactures goods from semi-finished parts into finished goods, and is carried out by operators and machines. At the end of the assembly line, it is necessary to implement quality inspection and to ensure that no foreign object is left on the conveyor. However, this manual inspection process is error-prone and time-consuming. Therefore, this study uses a case of foreign object detection in graphic card manufacturing to develop a convolutional neural network (CNN) model capable of detecting and labeling foreign objects.
    This study used Inception-Resnet v2 to classify foreign objects and Attention Residual U-net ++ to segment them. A comprehensive comparison of attention mechanisms and activation functions in Inception-Resnet v2 was conducted: the squeeze-and-excitation network (SE Net) and the convolutional block attention module (CBAM) were each combined with the Swish and Mish activation functions and evaluated on the Cifar-10 and Cifar-100 datasets. The results showed that Inception-Resnet outperforms ZF-Net and VGG19 on the Cifar-10 and Cifar-100 datasets, with accuracies of 79.22% and 49.44%, respectively. Adding the SE module and the Mish activation function to Inception-Resnet v2 then improved accuracy by a further 4.59% and 5.06%, respectively, on the same datasets.
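    As a rough illustration of the two components credited with these gains, the following is a minimal PyTorch sketch of a squeeze-and-excitation block paired with the Mish activation. The reduction ratio, the placement of Mish inside the bottleneck, and all layer sizes are illustrative assumptions; the abstract does not specify how the thesis wires these into Inception-Resnet v2.

        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        class Mish(nn.Module):
            # Mish (Misra, 2019): x * tanh(softplus(x)), smooth and non-monotonic.
            def forward(self, x):
                return x * torch.tanh(F.softplus(x))

        class SEBlock(nn.Module):
            # Squeeze-and-excitation (Hu et al., 2018): global-average-pool each
            # channel ("squeeze"), then re-weight channels with a bottleneck MLP
            # ("excite"). reduction=16 is the original paper's default, not
            # necessarily the thesis's setting.
            def __init__(self, channels: int, reduction: int = 16):
                super().__init__()
                self.pool = nn.AdaptiveAvgPool2d(1)
                self.fc = nn.Sequential(
                    nn.Linear(channels, channels // reduction),
                    Mish(),  # assumption: Mish inside the bottleneck as well
                    nn.Linear(channels // reduction, channels),
                    nn.Sigmoid(),  # per-channel gates in (0, 1)
                )

            def forward(self, x: torch.Tensor) -> torch.Tensor:
                b, c, _, _ = x.shape
                w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
                return x * w  # rescale each feature map by its gate

    A quick smoke test such as SEBlock(64)(torch.randn(2, 64, 32, 32)) confirms the block preserves its input shape, which is what lets it drop into an existing residual branch.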
    Furthermore, this study compared the existing U-net models, such as U-net, Residual U-net, Attention U-net, and U-net ++, with the proposed model, Attention Residual U-net ++, on the Oxford-IIIT Pet and Carvana datasets. The results showed that the proposed model performs best on the Carvana dataset, with an IoU of 94.13%, while Attention U-net outperforms the proposed model on the Oxford-IIIT Pet dataset, with an IoU of 87.83%. In the case study, Inception-Resnet v2 and Attention Residual U-net ++ achieved the highest accuracy and IoU, making them the best classification and segmentation models, respectively.
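    For reference, the IoU figures quoted above measure the overlap between a predicted mask and its ground truth. Below is a minimal sketch of the metric for binary segmentation, assuming PyTorch tensors, a sigmoid model output, and a 0.5 threshold; the threshold and the eps guard are illustrative choices, not taken from the thesis.

        import torch

        def iou_score(logits: torch.Tensor, target: torch.Tensor,
                      threshold: float = 0.5, eps: float = 1e-7) -> float:
            # Binarize predictions, then compute intersection over union.
            pred = torch.sigmoid(logits) > threshold
            true = target.bool()
            intersection = (pred & true).sum().item()
            union = (pred | true).sum().item()
            return (intersection + eps) / (union + eps)

        # e.g., iou_score(model_output, mask) returning 0.9413 would match the
        # 94.13% reported for Attention Residual U-net ++ on Carvana.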

    ACKNOWLEDGEMENT i
    摘要 (Chinese Abstract) ii
    ABSTRACT iii
    TABLE OF CONTENTS iv
    LIST OF TABLES vii
    LIST OF FIGURES viii
    CHAPTER 1 INTRODUCTION 1
      1.1 Background and Motivation 1
        1.1.1 History of computer vision 2
        1.1.2 Development of CNN 4
      1.2 Research Objectives 5
      1.3 Research Scope, Constraints, and Assumptions 5
      1.4 Thesis Organization 6
    CHAPTER 2 LITERATURE REVIEW 8
      2.1 Foreign Object Detection 8
      2.2 Convolutional Neural Network (CNN) 11
        2.2.1 Convolutional layer 11
        2.2.2 Sub-sampling layer (Pooling layer) 12
        2.2.3 Fully connected layer 12
      2.3 CNN Architecture 13
        2.3.1 Alex network 13
        2.3.2 Zeiler-Fergus network (ZF net) 14
        2.3.3 Inception network 14
        2.3.4 VGG net 16
        2.3.5 Residual network 16
      2.4 U-network (U-net) 18
        2.4.1 Attention U-net 19
        2.4.2 Recurrent Residual U-net (R2U-net) 20
        2.4.3 U-net ++ 21
      2.5 Attention Module for CNN 22
        2.5.1 Squeeze-excited network (SE net) 22
        2.5.2 Convolutional block attention module (CBAM) 23
    CHAPTER 3 METHODOLOGY 24
      3.1 Methodology Framework 24
      3.2 Data Collection 26
        3.2.1 Benchmark dataset 26
        3.2.2 Case study dataset 28
      3.3 The Proposed Model 30
        3.3.1 Classification task 30
        3.3.2 Segmentation task 35
      3.4 Model Verification 37
        3.4.1 Accuracy 37
        3.4.2 Intersection over Union (IoU) 38
    CHAPTER 4 EXPERIMENTAL RESULTS 39
      4.1 Classification Task 39
        4.1.1 Cifar-10 dataset 39
        4.1.2 Cifar-100 dataset 43
      4.2 Segmentation Task 46
        4.2.1 Oxford-IIIT Pet dataset 46
        4.2.2 Carvana dataset 48
    CHAPTER 5 CASE STUDY 52
      5.1 Case Study Description 52
      5.2 Training Parameters 53
        5.2.1 Classification task 53
        5.2.2 Segmentation task 54
      5.3 Classification Task Analysis 54
        5.3.1 Method comparison 54
        5.3.2 Effect of changing activation function on Inception Resnet v2 56
        5.3.3 Effect of adding attention mechanism to Inception Resnet v2 58
      5.4 Segmentation Task Analysis 59
        5.4.1 Statistical hypothesis 61
    CHAPTER 6 CONCLUSIONS AND FUTURE RESEARCH 63
      6.1 Conclusions 63
      6.2 Research Contributions 64
      6.3 Research Limitations 64
      6.4 Suggestions for Future Research 64
    REFERENCES 66
    APPENDIX 71

    Alom, M. Z., Hasan, M., Yakopcic, C., Taha, T. M., & Asari, V. K. (2018). Recurrent residual convolutional neural network based on U-net (R2U-net) for medical image segmentation. arXiv preprint arXiv:1802.06955.
    Dieleman, S., Willett, K. W., & Dambre, J. (2015). Rotation-invariant convolutional neural networks for galaxy morphology prediction. Monthly Notices of the Royal Astronomical Society, 450(2), 1441-1459.
    Fukushima, K. (1988). Neocognitron: A hierarchical neural network capable of visual pattern recognition. Neural Networks, 1(2), 119-130.
    Hu, J., Shen, L., & Sun, G. (2018). Squeeze-and-excitation networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 7132-7141), Salt Lake City, USA.
    Hubel, D. H., & Wiesel, T. N. (1959). Receptive fields of single neurones in the cat's striate cortex. The Journal of Physiology, 148(3), 574-591. doi:10.1113/jphysiol.1959.sp006308
    Kirsch, R. A., Cahn, L., Ray, C., & Urban, G. H. (1957). Experiments in processing pictorial information with a digital computer. Proceedings of the Eastern Joint Computer Conference: Computers with Deadlines to Meet, Washington, D.C., USA.
    Kon, S., Watabe, K., & Horibe, M. (2018). Nondestructive method using transmission line for detection of foreign objects in food. Proceedings of the 2018 IEEE Sensors Applications Symposium (SAS) (pp. 1-4), Seoul, South Korea.
    Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2017). ImageNet classification with deep convolutional neural networks. Communications of the ACM, 60(6), 84-90.
    Kwon, J.-S., Lee, J.-M., & Kim, W.-Y. (2008). Real-time detection of foreign objects using X-ray imaging for dry food manufacturing line. Proceedings of the 2008 IEEE International Symposium on Consumer Electronics (pp. 1-4), Vilamoura, Portugal.
    LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11), 2278-2324. doi:10.1109/5.726791
    Lu, L., Shin, Y., Su, Y., & Karniadakis, G. E. (2019). Dying ReLU and initialization: Theory and numerical examples. arXiv preprint arXiv:1903.06733.
    Misra, D. (2019). Mish: A self-regularized non-monotonic neural activation function. arXiv preprint arXiv:1908.08681.
    Mittal, S., Chopra, C., Trivedi, A., & Chanak, P. (2019). Defect segmentation in surfaces using deep learning. Proceedings of the 2019 International Conference on Issues and Challenges in Intelligent Computing Techniques (ICICT) (pp. 1-6), Ghaziabad, India.
    Moghadas, S. M., & Rabbani, N. (2010). Detection and classification of foreign substances in medical vials using MLP neural network and SVM. Proceedings of the 2010 6th Iranian Conference on Machine Vision and Image Processing (pp. 1-5), Isfahan, Iran.
    Oktay, O., Schlemper, J., Folgoc, L. L., Lee, M., Heinrich, M., Misawa, K., & Kainz, B. (2018). Attention U-net: Learning where to look for the pancreas. arXiv preprint arXiv:1804.03999.
    Ramachandran, P., Zoph, B., & Le, Q. V. (2017). Searching for activation functions. arXiv preprint arXiv:1710.05941.
    Roberts, L. G. (1963). Machine perception of three-dimensional solids (Doctoral dissertation). Massachusetts Institute of Technology, USA.
    Rong, D., Xie, L., & Ying, Y. (2019). Computer vision detection of foreign objects in walnuts using deep learning. Computers and Electronics in Agriculture, 162, 1001-1010.
    Ronneberger, O., Fischer, P., & Brox, T. (2015). U-net: Convolutional networks for biomedical image segmentation. Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention (pp. 234-241), Munich, Germany.
    Sarakon, P., Kawano, H., & Serikawa, S. (2019). Surface-defect segmentation using U-shaped inverted residuals. Proceedings of the 2019 12th Biomedical Engineering International Conference (BMEiCON) (pp. 1-4), Ubon Ratchathani, Thailand.
    Simonyan, K., & Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556.
    Szegedy, C., Ioffe, S., Vanhoucke, V., & Alemi, A. (2017). Inception-v4, Inception-ResNet and the impact of residual connections on learning. Proceedings of the AAAI Conference on Artificial Intelligence, 31(1).
    Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., & Rabinovich, A. (2015). Going deeper with convolutions. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 1-9), Boston, USA.
    Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., & Wojna, Z. (2016). Rethinking the Inception architecture for computer vision. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 2818-2826), Las Vegas, USA.
    Viola, P., & Jones, M. (2004). Robust real-time face detection. International Journal of Computer Vision, 57(2), 137-154.
    Wang, W., Yang, Y., Wang, X., Wang, W., & Li, J. (2019). Development of convolutional neural network and its application in image classification: A survey. Optical Engineering, 58(4), 040901.
    Woo, S., Park, J., Lee, J.-Y., & Kweon, I. S. (2018). CBAM: Convolutional block attention module. Proceedings of the European Conference on Computer Vision (ECCV) (pp. 3-19), Munich, Germany.
    Xie, S., Girshick, R., Dollár, P., Tu, Z., & He, K. (2017). Aggregated residual transformations for deep neural networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA.
    Zeiler, M. D., & Fergus, R. (2014). Visualizing and understanding convolutional networks. Proceedings of the European Conference on Computer Vision (pp. 818-833), Zurich, Switzerland.
    Zhao, X., Li, X., Yin, L., Feng, W., Zhang, N., & Zhang, X. (2019). Foreign body recognition for coal mine conveyor based on improved PCANSet. Proceedings of the 2019 11th International Conference on Wireless Communications and Signal Processing (WCSP) (pp. 1-6), Xi'an, China.
    Zhou, Z., Siddiquee, M. M. R., Tajbakhsh, N., & Liang, J. (2018). UNet++: A nested U-net architecture for medical image segmentation. In Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support (pp. 3-11). Springer.

    Full text available from 2026/01/27 (campus network).
    Full text available from 2026/01/27 (off-campus network).
    Full text not authorized for release through the National Central Library (Taiwan NDLTD system).