
Graduate Student: Yung-Chen Tsai (蔡永楨)
Thesis Title: Direct Edge-to-Edge Attention-based Multiple Representation Latent Feature Transfer Learning
Advisor: Ching-Hu Lu (陸敬互)
Committee Members: Ching-Hu Lu (陸敬互), Sheng-Luen Chung (鍾聖倫), Shun-Feng Su (蘇順豐), Chun-Feng Liao (廖峻鋒), Shang-Pin Ma (馬尚彬)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of Publication: 2022
Graduation Academic Year: 110
Language: Chinese
Number of Pages: 122
Keywords: direct edge-to-edge, many-to-many transfer learning, deep learning, advanced latent features, multi-representation, edge model, edge computing, Internet of Things
Hits: 345; Downloads: 5
  • In recent years, cameras that leverage edge computing capabilities (hereinafter, edge cameras) have made possible the smart living of "unmanned stores" offering "low-touch services". However, deploying a large number of edge cameras in unmanned stores and training their edge models is very costly in time and labor. To address this, some studies have proposed direct edge-to-edge latent feature transfer that requires no centralized server, but they do not exploit the benefits of advanced latent features and even require repeated manual tuning of the latent-feature extraction location and the weights of the domain loss function. Besides the time bottleneck caused by lengthy initial edge-model training, this also contradicts the IoT goal of reducing human intervention. This study therefore proposes an enhancement mechanism in which a "Lightweight Multi-representation Attention-based Residual" module uses a lightweight architecture to lower the computational burden on edge devices, while its multi-scale and diverse feature-extraction design extracts richer and more complete advanced latent features. In addition, a "Lightweight Joint-distribution Autonomous Domain Adaptation" module further reduces the number of model retraining rounds, saving edge-camera deployment time and reducing human intervention. This study also integrates these modules into direct edge-to-edge transfer: one-to-many transfer learning lets one edge camera serve as the latent-feature transfer source for multiple edge cameras, improving knowledge reuse and accelerating model building, while many-to-one transfer lets a single edge camera learn more effectively from the advanced latent features of multiple edge cameras. Experimental results show that, compared with the latest research, the enhancement mechanism combined with one-to-many transfer improves total accuracy by 6.30%, and still achieves total accuracy comparable to existing work while transmitting 75% less data. Combined with many-to-one transfer, it reduces transmission cost by 83.33% and improves total accuracy by 5.10%.
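The domain loss mentioned above is typically a distribution-distance measure such as Maximum Mean Discrepancy (MMD), which several works cited by the thesis use for unsupervised domain adaptation. The following is a minimal illustrative sketch only, not the thesis's module: it assumes 1-D latent features and a Gaussian kernel, and the sample values are made up.

```python
import math

def gaussian_kernel(x, y, sigma=1.0):
    # RBF kernel on scalar features; real latent features would be vectors
    return math.exp(-((x - y) ** 2) / (2 * sigma ** 2))

def mmd2(xs, ys, sigma=1.0):
    """Biased estimate of squared Maximum Mean Discrepancy between two samples."""
    kxx = sum(gaussian_kernel(a, b, sigma) for a in xs for b in xs) / len(xs) ** 2
    kyy = sum(gaussian_kernel(a, b, sigma) for a in ys for b in ys) / len(ys) ** 2
    kxy = sum(gaussian_kernel(a, b, sigma) for a in xs for b in ys) / (len(xs) * len(ys))
    return kxx + kyy - 2 * kxy

source = [0.1, 0.2, 0.15]
near   = [0.12, 0.18, 0.16]   # target domain close to the source
far    = [2.0, 2.1, 1.9]      # target domain far from the source
print(mmd2(source, near) < mmd2(source, far))  # aligned domains score lower
```

Minimizing such a distance between source and target latent features is what aligns the two domains without target labels.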


    Cameras that make use of edge computing (hereinafter referred to as edge cameras) have enabled smart living with "low-touch services". However, deploying a large number of edge cameras and training their edge models is very time-consuming and labor-intensive. Although some studies have proposed direct edge-to-edge (e2e) latent feature transfer learning without centralized servers to solve these problems, they do not make good use of the benefits of advanced latent features and even need to adjust some hyperparameters constantly, which not only takes a lot of time in initial model training but also conflicts with the Internet of Things goal of reducing human intervention. Therefore, this study proposes an enhanced mechanism whose "Lightweight Multi-representation Attention-based Residual (LiMAR)" module reduces the computational cost on an edge device through its lightweight architecture, while its multi-scale and diverse feature extraction framework extracts richer and more complete advanced latent features. In the mechanism, the "Lightweight Joint-distribution Autonomous Domain Adaptation" module further reduces the number of model training repetitions, saves the time required for edge camera deployment, and reduces human intervention. Our study also combines LiMAR with a direct e2e one-to-many transfer learning technique, which allows one edge camera to be the source of latent feature transfer for multiple edge cameras, improving knowledge reuse and accelerating model building. In addition, through many-to-one transfer learning, the system allows a single edge camera to utilize the advanced latent features from multiple edge cameras more effectively. The experimental results show that the enhanced mechanism combined with one-to-many transfer improves total accuracy by 6.30% compared to the latest study, and total accuracy remains comparable to existing work with 75% less transmitted data. In addition, combining it with many-to-one transfer learning reduces transmission cost by 83.33% and improves total accuracy by 5.10%.
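The lightweight architecture referred to above rests on depthwise separable convolutions (Chapter 3 of the thesis). A standard k×k convolution with C_in input and C_out output channels needs k·k·C_in·C_out weights, whereas the depthwise separable factorization needs only k·k·C_in (depthwise) plus C_in·C_out (1×1 pointwise). A minimal parameter-count comparison follows; the channel sizes are illustrative assumptions, not values taken from the thesis:

```python
def conv_params(k, c_in, c_out):
    # standard k x k convolution: every output channel mixes all input channels
    return k * k * c_in * c_out

def depthwise_separable_params(k, c_in, c_out):
    # depthwise: one k x k filter per input channel, then a 1x1 pointwise mix
    return k * k * c_in + c_in * c_out

k, c_in, c_out = 3, 64, 128
std = conv_params(k, c_in, c_out)                  # 73728 weights
sep = depthwise_separable_params(k, c_in, c_out)   # 576 + 8192 = 8768 weights
print(std, sep, round(sep / std, 3))               # prints: 73728 8768 0.119
```

For these (hypothetical) channel sizes the factorization keeps roughly 12% of the weights, which is the kind of saving that makes such modules practical on an edge camera.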

    Chinese Abstract
    Abstract
    Acknowledgments
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1 Introduction
      1.1 Research Motivation
      1.2 Literature Review
        1.2.1 Server-oriented Transfer Learning
          One-to-one Transfer Learning
          One-to-many Transfer Learning
          Many-to-one Transfer Learning
        1.2.2 Edge-computing-oriented Transfer Learning
          Server-Edge Collaborative Transfer Learning
          Direct Edge-to-Edge Transfer Learning
          The Issue of "Underutilized Advanced Latent Features"
          The Issue of "Inability to Effectively and Automatically Align Target-domain Classes"
      1.3 Contributions and Thesis Organization
    Chapter 2 System Design Rationale and Architecture Overview
      2.1 System Application Scenario
      2.2 System Architecture and Workflow within an Edge Camera
      2.3 Overall System Sequence Diagram
      2.4 Overall System Algorithm
    Chapter 3 Lightweight Multi-representation Attention-based Residual Module
      3.1 Depthwise Separable Convolution
      3.2 Depthwise Separable Dilated Convolution
      3.3 Attention Mechanism
      3.4 Lightweight Multi-representation Attention-based Residual Module
        3.4.1 Linear Depthwise Separable Convolution
        3.4.2 Linear Depthwise Separable Convolution Block with Attention (M2O only)
        3.4.3 Dilated Convolution and Hierarchical Feature Fusion
        3.4.4 Shortcut Connections and Residual Learning
        3.4.5 Architecture Design of the Lightweight Multi-representation Attention-based Residual Module
    Chapter 4 Lightweight Joint-distribution Autonomous Domain Adaptation Module
      4.1 Unsupervised Domain Adaptation
      4.2 Selection and Evaluation of the Domain Loss Function
      4.3 Domain Distribution Distance Estimation and Update
    Chapter 5 Direct Edge-to-Edge One-to-Many Latent Feature Transfer Learning Module Based on Multi-representation and Attention
      5.1 Domain Weight Estimation
      5.2 Advanced Elite Latent Feature Extraction
      5.3 Evaluation of Model Training Loss Functions
      5.4 MA-e2e-O2M Transfer Workflow
    Chapter 6 Direct Edge-to-Edge Many-to-One Latent Feature Transfer Learning Module Based on Multi-representation and Attention
      6.1 Model Framework Design and Evaluation of Training Loss Functions
      6.2 Advanced Elite Latent Feature Extraction
      6.3 Similarity-aware Weighting
      6.4 MA-e2e-M2O Transfer Workflow
    Chapter 7 Experimental Results and Discussion
      7.1 Experimental Platform
      7.2 Datasets
      7.3 Direct Edge-to-Edge One-to-Many Latent Feature Transfer Learning Based on Multi-representation and Attention
        7.3.1 Advanced Elite Latent Feature Layer Experiments
        7.3.2 Best Combination Experiments
        7.3.3 Domain Loss Function Experiments
        7.3.4 Label Smoothing Experiments
        7.3.5 Dropout Layer Experiments
        7.3.6 LiMAR Ablation Experiments
        7.3.7 Data Transmission Volume and Model Training Time Experiments
        7.3.8 MA-e2e-O2M Transfer Power Consumption Experiments
        7.3.9 MA-e2e-O2M Transfer Memory Usage Experiments
      7.4 Direct Edge-to-Edge Many-to-One Latent Feature Transfer Learning Based on Multi-representation and Attention
        7.4.1 MA-e2e-M2O Transfer Experiments
        7.4.2 Attention Mechanism Experiments
        7.4.3 Best Combination Experiments
        7.4.4 Similarity-aware Weighting Experiments
        7.4.5 Data Transmission Volume and Model Training Experiments
        7.4.6 MA-e2e-M2O Transfer Power Consumption Experiments
        7.4.7 MA-e2e-M2O Transfer Memory Usage Experiments
    Chapter 8 Conclusion and Future Research Directions
    References
    Committee Comments and Responses

