
Graduate Student: 張仙女 (Lucy Sanjaya)
Thesis Title: 基於卷積神經網路之組裝程序技能移轉模型
Deep Learning-based Skill Transfer Model in Assembly Process
Advisor: 王孔政 (Kung-Jeng Wang)
Committee Members: 陳怡永 (Yi-Yung Chen), 郭人介 (Ren-Jieh Kuo), 王孔政 (Kung-Jeng Wang)
Degree: Master
Department: College of Management - Department of Industrial Management
Year of Publication: 2020
Academic Year of Graduation: 108
Language: English
Pages: 58
Chinese Keywords: 技能移轉、人機互動、深度學習、卷積神經網路、更快速區域卷積神經網路
English Keywords: skill transfer, human-machine interaction, deep learning, convolutional neural network, faster region-based convolutional neural network
Access Count: Views: 403, Downloads: 0

While Industry 4.0 opens a new chapter in the era of industrial revolutions, the skill requirements placed on production operators are also rising. As products and manufacturing processes grow more complex, developing flexible training methods is essential for passing on operator skills. Accordingly, this study proposes a skill transfer model that, in a manufacturing environment, extracts (learns) experts' working techniques, together with the decisions arising from production scenarios composed of different actions and their related objects, and integrates both into a computational model. The proposed skill transfer model is built on two types of deep learning architectures: a Convolutional Neural Network (CNN) for action recognition, and a Faster Region-based Convolutional Neural Network (Faster R-CNN) for object detection. To evaluate model performance, a case study of high-end graphics card (GPU card) assembly was conducted. After training, the CNN and Faster R-CNN achieved high recognition accuracies of 95.4% and 96.8%, respectively. The trained skill transfer model is intended to guide junior operators by providing step-by-step work instructions during assembly, helping them quickly adapt to complex procedures. The core contribution of this study lies in the concept of extracting experts' skills and transferring them to junior operators, thereby facilitating the development of flexible training models.


Industry 4.0 refers to a new phase in the Industrial Revolution that requires workers to have higher capabilities in carrying out their duties. As the variety of products and manufacturing processes increases, the expansion of flexible training approaches is indispensable to support the development of human skills. This study proposes a skill transfer support model for a manufacturing scenario in which the model extracts the experts' relevant skills or control strategies as actions, together with the objects relevant to each action, into a computational model. The proposed model engages two types of deep learning as the groundwork: a Convolutional Neural Network (CNN) for action recognition and a Faster Region-based Convolutional Neural Network (Faster R-CNN) for object detection. To evaluate the performance of the proposed model, a case study of GPU card final assembly was conducted. The accuracies of the CNN and Faster R-CNN are 95.4% and 96.8%, respectively. The final outcome of this model is to guide junior operators while they are doing the assembly, providing step-by-step instructions for performing complex tasks. The contribution of the present study is to facilitate flexible training models in terms of transferring skills from skilled operators to junior operators.
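The abstract names convolution and pooling layers as the groundwork of the action-recognition CNN. As a minimal, framework-free sketch of those two building blocks (the array sizes and kernel values here are illustrative assumptions, not the thesis's actual architecture):

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D cross-correlation: the basic CNN layer operation."""
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    """Rectified linear activation, applied element-wise."""
    return np.maximum(x, 0.0)

def max_pool(x, size=2):
    """Non-overlapping max pooling; trims edges that do not fit."""
    h, w = x.shape[0] // size, x.shape[1] // size
    return x[:h * size, :w * size].reshape(h, size, w, size).max(axis=(1, 3))

# Toy 4x4 "image" and a diagonal-difference kernel (illustrative only).
image = np.arange(16, dtype=float).reshape(4, 4)
kernel = np.array([[-1.0, 0.0], [0.0, 1.0]])
feature_map = max_pool(relu(conv2d(image, kernel)))  # shape (1, 1)
```

A full CNN stacks many such conv/ReLU/pool stages and ends with dense layers that classify the resulting feature maps into action labels.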

Abstract 摘要
Acknowledgement
Table of Contents
List of Figures
List of Tables
Chapter 1. Introduction
Chapter 2. Literature Review
  2.1 Skill Transfer
  2.2 Action Recognition
  2.3 Object Detection
  2.4 Summary
Chapter 3. Methodology
  3.1 Research Framework
  3.2 Image Collection and Pre-processing
  3.3 Deep Learning for Training
    3.3.1 CNN Architecture for Action Recognition
    3.3.2 Faster R-CNN Architecture for Object Detection
  3.4 Skill Transferring
Chapter 4. Experiment Results and Discussions
  4.1 Experiment Setup
  4.2 Model Evaluation
  4.3 Skill Transfer Support Model
Chapter 5. Conclusion
References
Appendix 1. Convolutional Neural Network
Appendix 2. Faster Region-based Convolutional Neural Network
Appendix 3. Circle Hough Transform
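The table of contents lists the Circle Hough Transform (Appendix 3) among the methods used. As a minimal sketch of its center-voting scheme for a circle of known radius — a simplified illustration on synthetic points, not the thesis's actual implementation:

```python
import numpy as np

def hough_circle_center(edge_points, radius, shape):
    """Vote for circle centers at a fixed radius.

    Each edge point votes for every candidate center that would
    place it on a circle of the given radius; the accumulator cell
    with the most votes is the detected center.
    """
    acc = np.zeros(shape, dtype=int)
    thetas = np.linspace(0, 2 * np.pi, 360, endpoint=False)
    for (x, y) in edge_points:
        a = np.round(x - radius * np.cos(thetas)).astype(int)
        b = np.round(y - radius * np.sin(thetas)).astype(int)
        ok = (a >= 0) & (a < shape[0]) & (b >= 0) & (b < shape[1])
        np.add.at(acc, (a[ok], b[ok]), 1)  # unbuffered accumulation
    return np.unravel_index(np.argmax(acc), acc.shape)

# Synthetic edge points on a circle of radius 10 centered at (25, 25).
angles = np.linspace(0, 2 * np.pi, 60, endpoint=False)
pts = [(25 + 10 * np.cos(t), 25 + 10 * np.sin(t)) for t in angles]
center = hough_circle_center(pts, radius=10, shape=(50, 50))
```

Detecting circles of unknown radius extends the same idea with a third accumulator dimension over candidate radii.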


Full-text release date: 2025/06/29 (campus network)
Full text not authorized for public release (off-campus network)
Full text not authorized for public release (National Central Library: Taiwan electronic theses and dissertations system)