
Graduate student: 陳學勤
Hsueh-Chin Chen
Thesis title: 居家機器人之色情物件識別:可解釋的複合與分散式人工智慧
Pornographic Object Recognition for Home Robot: Explainable Composite and Distributed Artificial Intelligence
Advisor: 李敏凡
Min-Fan Lee
Oral defense committee: 李敏凡
Min-Fan Lee
蔡明忠
湯梓辰
Joni-Tzuchen Tang
Degree: Master
Department: College of Engineering - Graduate Institute of Automation and Control
Publication year: 2023
Graduation academic year: 111 (ROC calendar, 2022-2023)
Language: English
Pages: 75
Keywords (translated from Chinese): Artificial intelligence, Deep learning, Object recognition, Home service robots
Keywords (English): Artificial intelligence, Deep learning, Object recognition, Service robots, Vision transformer

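Centralized servers carry risks of privacy leakage and scalability problems, which can lead to data leaks and inefficiency in managing large volumes of data. Recognizing pornographic objects is a challenging task that involves object occlusion, ambiguity in target recognition, and changes in scene viewpoint. These challenges often lead to high false positive and false negative rates. Compounding the problem is the general lack of explainability in artificial intelligence, which raises legitimate concerns about its reliability and user acceptance. This work presents a distributed artificial intelligence system that uses processing on cloud, edge, and end devices to train and test pornographic object recognition. In the training phase, the proposed model uses composite artificial intelligence, which includes the Vision Transformer for enhancing feature representation and You Only Look Once 4 for object recognition and subsequent blurring. To improve the model's interpretability and clarity, this work integrates explainable artificial intelligence methods, specifically Local Interpretable Model-agnostic Explanations. Preliminary results indicate that this distributed approach can reduce training time and resource usage by 40%, enabling more seamless edge-cloud collaboration. In the overall evaluation, the model achieves an accuracy of 96%, a precision of 98.5%, a recall of 96%, and an F1 score of 97.5%. The system's mean average precision reaches 94%, a notable improvement over existing models in content filtering and object recognition.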


Centralized servers pose privacy and scalability risks: they can expose user data and handle large data volumes inefficiently. Recognizing pornographic objects is itself difficult because of object occlusion, ambiguity in target recognition, and variation in scene viewpoint, all of which often lead to high false positive and false negative rates. The widespread lack of explainability in Artificial Intelligence (AI) compounds the problem and raises valid concerns about reliability and user acceptance. This thesis introduces a distributed AI system that uses cloud, edge, and on-device processing (specifically, home robots) to train and test pornographic object recognition. In its offline phase, the proposed model uses a composite AI: the Vision Transformer (ViT) strengthens feature representation, and the You Only Look Once version 4 (YOLOv4) algorithm performs object recognition and subsequent blurring. To improve interpretability, the work integrates explainable AI components, specifically Local Interpretable Model-agnostic Explanations (LIME). Preliminary findings suggest that the distributed approach can cut training time and resource use by 40%, which facilitates smoother edge-cloud collaboration. Evaluation of the model shows an accuracy of 96%, a precision of 98.5%, a recall of 96%, and an F1 score of 97.5%. The system's mean average precision (mAP) is 94%, which the author reports as a notable improvement over existing models in content filtering and object recognition.
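The metrics quoted in the abstract are linked by a standard definition: F1 is the harmonic mean of precision and recall. The following is a minimal Python sketch of that arithmetic, assuming the reported precision and recall are single aggregate values (the abstract does not say how they were averaged). The function name `f1_score` is illustrative and is not taken from the thesis.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        # Avoid division by zero when both inputs are zero.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract, expressed as fractions.
reported_precision = 0.985
reported_recall = 0.96

f1 = f1_score(reported_precision, reported_recall)
print(f"F1 from reported precision and recall: {f1:.3f}")
```

Under this assumption the result is about 0.972, close to the reported 97.5%. A small gap of this kind can arise when F1 is averaged per class instead of being computed from aggregate precision and recall; the abstract does not state which was used.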

Acknowledgements
Abstract (Chinese)
Abstract (English)
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
Chapter 2 Method
  2.1 Related Work
    2.1.1 Distributed AI
    2.1.2 Composite AI
    2.1.3 Explainable AI
  2.2 Problem Statement
  2.3 AIoT Scenario
    2.3.1 Training Phase
    2.3.2 Prediction Phase
  2.4 Distributed AI System
  2.5 Composite AI Framework
    2.5.1 Feature Enhancement
    2.5.2 Object Detection
  2.6 Explainable AI
Chapter 3 Result
  3.1 Experimental Environment
    3.1.1 Home Service Robot
    3.1.2 Edge Server
  3.2 Experimental Architecture Parameter
  3.3 Data Management
  3.4 Performance Analysis
    3.4.1 Performance Analysis with ViT
    3.4.2 Performance Analysis with YOLOv4
    3.4.3 Explainable Model
  3.5 Distributed AI System Performance and Comparison of Other Papers
Chapter 4 Conclusion
References


Full-text release date: 2025/08/23 (campus network, off-campus network, and the National Central Library's Taiwan theses and dissertations system)