
Graduate student: 林晉德
CHIN-TE LIN
Thesis title: 遷移學習與三元組網絡在醫院處方審核系統中的應用:自我升級的適應性學習識別網絡
Application of Transfer Learning and Triplet Networks in Hospital Prescription Verification Systems: A Self-Upgrading Adaptive Learning Identification Network
Advisor: 鍾聖倫
Sheng-Luen Chung
Oral defense committee: 陸敬互
蘇順豐
吳啟瑞
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of publication: 2023
Academic year of graduation: 112
Language: Chinese
Number of pages: 80
Chinese keywords: 適應學習, 遷移學習, 三元組網路, 影像辨識, 對比損失函數, 深度學習, 電腦視覺
Foreign-language keywords: Adaptive learning, Transfer learning, Triplet networks, Image recognition, Contrastive loss, Deep learning, Computer vision
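  • Deep learning recognition networks trained for a specific set of recognition targets typically face two main limitations in practice: the target set cannot be modified, and the network parameters cannot be adjusted to the statistical characteristics of the actual test site. This study examines two key questions: (1) how to quickly reconfigure a trained recognition network to accommodate updated recognition targets; and (2) how to adjust a previously trained classifier, using test results accumulated during operation at the test site, particularly misclassified samples, to improve recognition performance.
    To address these challenges, we use few-shot learning from transfer learning to adapt to updates of the recognition targets, and we classify by ranking Euclidean-distance similarity computed with a previously trained feature extractor. In addition, we use the concept of the triplet network: triplet samples assembled from data accumulated at the test site are used to fine-tune the existing feature extractor, improving its ability to capture intra-class similarity and inter-class contrast.
    Our approach first trains a deep learning network that extracts drug tag (藥牌) features, in a fixed scene and for a fixed set of drug tags. Then, for a specific application site and the drug tag types that actually need to be recognized, prototyping from a few sample examples determines the test result. In the subsequent online fine-tuning, misrecognized data are used, and we examine how three mechanisms improve the training of the feature extractor: (1) batch comparison (contrastive loss), which improves the network's discriminative power; (2) regularization, which reduces overfitting; and (3) cross-batch memory (XBM), which effectively enlarges the equivalent batch size seen by the model during training.
    From the experimental results we conclude that, when training the feature extractor, using ViT as the backbone, contrastive loss as the loss function, and TriContr as the dataloader, together with the KoLeo and XBM mechanisms, yields a good feature extractor. In online fine-tuning, combining contrastive loss with TriContr performs well when the number of fine-tuning iterations is small. We provide a labeled dataset of 62,700 photographed drug tag images.
    As a demonstration of the proposed adaptive learning recognition framework, we propose a self-upgrading adaptive learning identification network framework (SFALIC) for a hospital prescription verification system. The framework allows the recognized drug types to be updated and the feature recognition network to be adjusted in batches from continuously accumulated test samples. Within the intelligent dispensing verification framework, the probability of dispensing the wrong drug is less than one in a million dispensations. Under the broad framework of transfer learning, this study improves overall recognition performance in application scenarios where the recognition targets need continual updating and the actual test set differs significantly in its statistics from the original training set.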


    In practical applications, deep learning identification networks trained for specific target sets often face two major limitations: the identification target set cannot be modified, and the network parameters cannot be adjusted to the statistical characteristics of the actual testing environment. This study addresses two critical questions: (1) how to rapidly reconfigure a pre-trained identification network to adapt to updated identification targets and (2) how to adjust previously trained classifiers based on the test results accumulated in the testing environment, particularly misclassified samples, to enhance identification performance.
    To tackle these challenges, we employ few-shot learning from transfer learning to adapt to updated identification targets and utilize Euclidean distance similarity ranking based on previously trained feature extractors for classification. Furthermore, we leverage the concept of triplet networks to fine-tune existing feature extractors based on triplet samples composed of data accumulated from the testing environment, thereby enhancing the extractor's ability to recognize intra-class similarities and inter-class differences.
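    The few-shot classification step described above can be illustrated with a minimal sketch. It assumes that each class prototype is the mean of its few support embeddings and that a query is assigned to the nearest prototype; the function names, the 2-D toy embeddings, and the labels `drug_A` and `drug_B` are hypothetical and stand in for the output of a trained feature extractor, not for the thesis's actual implementation.

```python
import math
from collections import defaultdict


def build_prototypes(support_embeddings, support_labels):
    """Average the few-shot support embeddings of each class into one prototype."""
    grouped = defaultdict(list)
    for vec, label in zip(support_embeddings, support_labels):
        grouped[label].append(vec)
    return {
        label: [sum(dim) / len(vecs) for dim in zip(*vecs)]
        for label, vecs in grouped.items()
    }


def rank_by_distance(query, prototypes):
    """Return (label, distance) pairs sorted by Euclidean distance, nearest first."""
    return sorted(
        ((label, math.dist(query, proto)) for label, proto in prototypes.items()),
        key=lambda pair: pair[1],
    )


# Toy 2-D embeddings (hypothetical); a real system would use extractor outputs.
support = [[0.0, 0.0], [0.2, 0.0], [1.0, 1.0], [1.0, 0.8]]
labels = ["drug_A", "drug_A", "drug_B", "drug_B"]

prototypes = build_prototypes(support, labels)
ranking = rank_by_distance([0.9, 0.9], prototypes)
predicted = ranking[0][0]  # label of the nearest prototype
```

    Under this scheme, updating the identification targets only requires adding or removing support examples and rebuilding the prototypes; the feature extractor itself is left unchanged.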
    Our approach first trains a deep learning network to extract drug-related features in a fixed setting for a fixed set of drug categories. We then apply prototyping with few-shot examples to classify the drugs relevant to a particular application site. In the subsequent online fine-tuning process, we use misclassified samples and examine three mechanisms: (1) contrastive loss to improve discrimination, (2) regularization to mitigate overfitting, and (3) cross-batch memory (XBM) to enlarge the effective batch size during network adjustment.
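    As a rough illustration of how contrastive loss and cross-batch memory fit together, the sketch below keeps embeddings from earlier mini-batches in a fixed-size FIFO queue and pairs the current batch against them. It is a simplified sketch under stated assumptions: the class `CrossBatchMemory`, the margin value, and the toy data are hypothetical, the loss is the standard pairwise contrastive form, and no gradients are computed.

```python
import math
from collections import deque


class CrossBatchMemory:
    """FIFO queue of (embedding, label) pairs kept from earlier mini-batches."""

    def __init__(self, capacity):
        # deque with maxlen drops the oldest entries automatically when full.
        self.queue = deque(maxlen=capacity)

    def enqueue(self, embeddings, labels):
        self.queue.extend(zip(embeddings, labels))


def contrastive_loss(batch, batch_labels, memory, margin=1.0):
    """Mean pairwise contrastive loss between the current batch and the memory."""
    total, pairs = 0.0, 0
    for anchor, a_label in zip(batch, batch_labels):
        for other, o_label in memory.queue:
            d = math.dist(anchor, other)
            if a_label == o_label:
                total += d ** 2  # pull same-class pairs together
            else:
                total += max(0.0, margin - d) ** 2  # push other classes beyond the margin
            pairs += 1
    return total / pairs if pairs else 0.0


# Hypothetical toy data: two stored embeddings and one new sample.
memory = CrossBatchMemory(capacity=4)
memory.enqueue([[0.0, 0.0], [1.0, 0.0]], ["drug_A", "drug_B"])
loss = contrastive_loss([[0.0, 0.5]], ["drug_A"], memory)
```

    Because the queue holds embeddings from several past batches, each new sample is compared against more positives and negatives than a single mini-batch contains, which is the sense in which the effective batch size grows.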
    As a demonstration of the proposed adaptive learning identification framework, we have developed a self-upgrading adaptive learning identification network framework (SFALIC) for a hospital prescription verification system. This framework allows the identified drug types to be updated and the feature identification network to be adjusted in batches based on continuously accumulated test samples. Within the intelligent medication verification framework, the probability of dispensing an incorrect medication is less than one in a million dispensations. This study enhances overall identification performance in application scenarios where identification targets need continuous updates and significant statistical differences exist between the actual testing set and the original training set.
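    The batch adjustment from accumulated test samples could look like the following sketch, which builds triplets from misclassified samples and scores them with a standard triplet margin loss. How the thesis actually selects positives and negatives is not stated here, so using the true-class prototype as the positive and the wrongly predicted prototype as the negative is an assumption, as are all names and the margin value.

```python
import math


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge on the gap between anchor-positive and anchor-negative distances."""
    return max(0.0, math.dist(anchor, positive) - math.dist(anchor, negative) + margin)


def triplets_from_errors(errors, prototypes):
    """Build (anchor, positive, negative) triplets from misclassified samples.

    Each error is (embedding, true_label, predicted_label). The true-class
    prototype is used as the positive and the wrongly predicted prototype as
    the negative (an assumption made for this sketch).
    """
    return [(emb, prototypes[true], prototypes[pred]) for emb, true, pred in errors]


# Hypothetical prototypes and one accumulated error: a drug_A sample that
# landed closer to the drug_B prototype.
prototypes = {"drug_A": [0.0, 0.0], "drug_B": [1.0, 0.0]}
errors = [([0.6, 0.0], "drug_A", "drug_B")]

triplets = triplets_from_errors(errors, prototypes)
losses = [triplet_loss(a, p, n) for a, p, n in triplets]
```

    A positive loss marks a sample that is still closer to the wrong class than the margin allows; in a real fine-tuning loop these losses would be backpropagated through the feature extractor.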

    Abstract (Chinese)
    Abstract (English)
    Acknowledgments
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1. Introduction
      1.1 Challenges of object recognition
      1.2 Intelligent dispensing verification system
      1.3 Significance of the research
      1.4 Organization of the thesis
    Chapter 2. Related Work
      2.1 Transfer learning
        2.1.1 Few-shot learning
      2.2 Metric learning
        2.2.1 Triplet network
        2.2.2 Contrastive loss
      2.3 Pre-trained models
      2.4 Training techniques for deep learning networks
        2.4.1 Cross-Batch Memory (XBM)
        2.4.2 Regularization
    Chapter 3. Methodology
      3.1 Two stages: training the feature extraction network and online fine-tuning on error samples
        3.1.1 Training the feature extraction network
        3.1.2 Online fine-tuning on error samples
      3.2 DataLoader and negative-sample sampling methods
      3.3 Using pre-trained models
      3.4 Prototype
      3.5 Methods and combinations for training the network
        3.5.1 Loss function
        3.5.2 Regularization (KoLeo)
        3.5.3 XBM (Cross-Batch Memory)
    Chapter 4. Experiments and Results
      4.1 Dataset
      4.2 Evaluation metrics
      4.3 Experimental environment
      4.4 Results
        4.4.1 Stage-one experiments (training the feature extraction network)
        4.4.2 Stage-two experiments (online fine-tuning on error samples)
    Chapter 5. Conclusion
      5.1 Training methods for adapting to changed scenes and recognition categories
      5.2 Considerations for practical deployment
      5.3 Future work
    References
    Committee members' suggestions and responses


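    Full-text release date: 2026/01/11 (campus network)
    Full-text release date: 2026/01/11 (off-campus network)
    Full-text release date: 2026/01/11 (National Central Library: Taiwan theses and dissertations system)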