Author: 蔡永楨 Yung-Chen Tsai
Thesis title: 直接端對端具注意力之多重表示潛藏特徵轉移學習 (Direct Edge-to-Edge Attention-based Multiple Representation Latent Feature Transfer Learning)
Advisor: 陸敬互 Ching-Hu Lu
Committee members: 陸敬互 Ching-Hu Lu, 鍾聖倫 Sheng-Luen Chung, 蘇順豐 Shun-Feng Su, 廖峻鋒 Chun-Feng Liao, 馬尚彬 Shang-Pin Ma
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of publication: 2022
Academic year of graduation: 110
Language: Chinese
Pages: 122
Keywords: direct edge-to-edge, many-to-many transfer learning, deep learning, advanced latent features, multi-representation, edge model, edge computing, Internet of Things
In recent years, cameras that leverage edge computing (hereinafter referred to as edge cameras) have enabled smart living with "low-touch services" such as unmanned stores. However, deploying a large number of edge cameras in unmanned stores and training their edge models is very time-consuming and labor-intensive. To address these problems, some studies have proposed direct edge-to-edge (e2e) latent feature transfer learning that requires no centralized server, but they do not exploit the benefits of advanced latent features and even require repeated tuning of the latent-feature extraction position and the weights of the domain loss functions. This not only creates a time bottleneck in initial edge-model training but also runs counter to the Internet of Things goal of reducing human intervention. Therefore, this study proposes an enhancement mechanism in which a "Lightweight Multi-representation Attention-based Residual (LiMAR)" module reduces the computational burden on an edge device through its lightweight architecture, while its multi-scale and diverse feature-extraction design extracts richer and more complete advanced latent features. In addition, a "Lightweight Joint-distribution Autonomous Domain Adaptation" module further reduces the number of model retraining rounds, shortens edge-camera deployment time, and reduces human intervention. This study also integrates these modules into direct e2e transfer learning: one-to-many transfer lets one edge camera serve as the latent-feature source for multiple edge cameras, improving knowledge reuse and accelerating model building, while many-to-one transfer lets a single edge camera learn more effectively from the advanced latent features of multiple edge cameras. Experimental results show that, compared with the latest research, the enhancement mechanism combined with one-to-many transfer improves total accuracy by 6.30%, and it still achieves total accuracy comparable to existing studies while transmitting 75% less data. Combined with many-to-one transfer, it reduces transmission cost by 83.33% and improves total accuracy by 5.10%.
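The abstract does not give LiMAR's layer-level design. As a rough illustration only, the sketch below shows one plausible PyTorch realization of the ideas it names: parallel dilated depthwise branches for multi-scale representations, a squeeze-and-excitation-style channel attention gate, and a residual connection. All class names, branch counts, and dilation rates here are assumptions, not the thesis's actual architecture.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """SE-style channel attention: squeeze by global average
    pooling, excite through a bottleneck MLP, gate with sigmoid."""
    def __init__(self, channels, reduction=4):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        return x * self.gate(x)  # reweight channels

class MultiRepresentationResidualBlock(nn.Module):
    """Parallel branches with different dilation rates extract
    multi-scale representations; depthwise-separable convolutions
    keep the block lightweight; channel attention reweights the
    fused features before the residual addition."""
    def __init__(self, channels, dilations=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Sequential(
                # depthwise 3x3 with padding == dilation keeps spatial size
                nn.Conv2d(channels, channels, 3, padding=d, dilation=d,
                          groups=channels, bias=False),
                nn.Conv2d(channels, channels, 1, bias=False),  # pointwise
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True),
            ) for d in dilations
        ])
        self.fuse = nn.Conv2d(channels * len(dilations), channels, 1, bias=False)
        self.attn = ChannelAttention(channels)

    def forward(self, x):
        multi = torch.cat([branch(x) for branch in self.branches], dim=1)
        return x + self.attn(self.fuse(multi))
```

Because every branch preserves the spatial resolution and the 1x1 fusion restores the channel count, the block is a drop-in residual unit and can replace a standard bottleneck in an edge model's backbone.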
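Joint-distribution domain adaptation typically aligns both the marginal feature distribution and the class-conditional distributions between a source and a target camera, commonly measured with the kernel maximum mean discrepancy (MMD). The thesis's exact loss is not specified in this abstract; the following NumPy sketch (function names, the RBF kernel choice, and the unweighted per-class sum are all assumptions) only illustrates the general pattern:

```python
import numpy as np

def rbf_mmd2(X, Y, sigma=1.0):
    """Squared maximum mean discrepancy between sample sets X and Y
    (rows are feature vectors) under an RBF kernel -- a standard
    marginal-distribution alignment term in domain adaptation."""
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2.0 * sigma ** 2))
    return k(X, X).mean() + k(Y, Y).mean() - 2.0 * k(X, Y).mean()

def joint_mmd2(Xs, ys, Xt, yt_pseudo, num_classes, sigma=1.0):
    """Joint-distribution variant: the marginal MMD plus per-class
    MMD terms, where unlabeled target samples are grouped by
    pseudo-labels predicted by the current model."""
    loss = rbf_mmd2(Xs, Xt, sigma)            # marginal term
    for c in range(num_classes):              # conditional terms
        Xs_c, Xt_c = Xs[ys == c], Xt[yt_pseudo == c]
        if len(Xs_c) and len(Xt_c):
            loss += rbf_mmd2(Xs_c, Xt_c, sigma)
    return loss
```

In a transfer setting, a loss of this shape would be minimized alongside the classification loss so that the latent features received from another edge camera remain discriminative after adaptation.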