
Student: Chang-Lin Cho (卓長霖)
Thesis Title: Toward End-to-End Identification of Handheld Pharmaceutic Blister Packages
Advisor: Sheng-Luen Chung (鍾聖倫)
Oral Defense Committee: Sheng-Luen Chung (鍾聖倫), Shun-Feng Su (蘇順豐), Chung-Hsien Kuo (郭重顯), Gee-Sern Hsu (徐繼聖), Wen-Hsien Fang (方文賢)
Degree: Master
Department: Department of Electrical Engineering, College of Electrical Engineering and Computer Science
Publication Year: 2020
Graduation Academic Year: 108 (2019-2020)
Language: English
Pages: 61
Keywords (Chinese): blister package identification, end-to-end, object detection, data augmentation
Keywords (English): pharmaceutic blister package identification, end-to-end object detection, data augmentation

Abstract:
Error-free prescription dispensing is the ultimate goal of drug safety. Nonetheless, however low the probability, humans inevitably err. To attain this goal, automated pharmaceutic blister package identification (PBPI) is regarded as the most effective assistive technology, in particular techniques that identify handheld packages in an open environment without disrupting the pharmacist's dispensing workflow. However, identifying handheld blister packages in open spaces is challenging because of the large number of package types (more than 230), partial occlusion by the holding hand, and the uncertain background and lighting conditions of the open environment. Thus far, the best reported solution, HBIN, which relies on complementarily paired front and back images of the handheld package, is a two-stage process: the first stage crops the handheld package from both side images and juxtaposes the crops onto a fixed-size template; the second stage then identifies the joint template. Although it achieves a 92% F1-score, the two-stage solution requires more resources for implementation and training, in addition to more computational time. To approach the error-free dispensing goal effectively, this study contributes in three respects. First, better network architectures: this is pursued in two directions, one being an end-to-end trainable solution that uses only one deep learning network while still operating on paired two-sided images, the other being solutions that rely on only a single-sided handheld package image, for which representative object detection approaches are systematically examined. Second, performance boosting through data augmentation: by pre-training on multiplied and diversified synthetic images that capture the uncertainty posed in open spaces, the two-sided ROR and the single-sided YOLO and SSD networks all attain an F1-score of more than 95% in new testing environments and 100% in familiar ones, both significantly boosted. Third, an integrated Dispensation Verification System (DVS): even for an exaggerated human error rate of 1%, the resultant error probability of the DVS, which integrates PBPI with the hospital's dispensing reminder module, is drastically reduced to less than 2.1 per million on average. The proposed DVS has been implemented and successfully field-tested at Mackay Hospital, demonstrating its feasibility and superiority in assuring nearly error-free prescription dispensing.
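The abstract describes the augmentation strategy only at the level of "multiplied and diversified synthetic images." As a minimal sketch of one common way such images are composed, not the thesis's actual pipeline (the function name `synthesize`, the canvas size, and all parameter ranges below are illustrative assumptions), a package crop with an alpha mask can be pasted onto varied backgrounds under random pose and lighting:

```python
import random
from PIL import Image, ImageEnhance

def synthesize(package_rgba: Image.Image, background: Image.Image,
               canvas_size=(416, 416)):
    """Paste one blister-package crop onto a background with random
    pose and lighting, returning the image and its bounding-box label."""
    canvas = background.resize(canvas_size).convert("RGB")

    # Random in-plane rotation to diversify handheld poses.
    pkg = package_rgba.convert("RGBA").rotate(random.uniform(0, 360),
                                              expand=True)

    # Rescale so the crop's longer side spans 30-60% of the canvas,
    # which guarantees the paste box below always fits.
    scale = canvas_size[0] * random.uniform(0.3, 0.6) / max(pkg.size)
    w, h = max(1, int(pkg.width * scale)), max(1, int(pkg.height * scale))
    pkg = pkg.resize((w, h))

    # Random placement; the alpha channel masks out the crop's corners.
    x = random.randint(0, canvas_size[0] - w)
    y = random.randint(0, canvas_size[1] - h)
    canvas.paste(pkg, (x, y), mask=pkg)

    # Random brightness to mimic uncontrolled lighting in open spaces.
    canvas = ImageEnhance.Brightness(canvas).enhance(random.uniform(0.5, 1.5))

    return canvas, (x, y, x + w, y + h)
```

Run over a handful of background photos per package type, a generator like this multiplies a small captured set into thousands of diversified pre-training images, which is the effect the abstract credits for the F1-score boost.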
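The 2.1-per-million figure can be read as a product of failure probabilities. As a minimal sketch, assuming the pharmacist's error and the verifier's miss are independent (a simplifying assumption here; the thesis's error analysis in Section 6.3 gives the exact model), the residual undetected-error rate is:

```latex
P_{\text{undetected}}
  = \underbrace{P_{\text{human}}}_{\approx\, 0.01}
    \times
    \underbrace{P_{\text{miss}}}_{\text{PBPI fails to flag}}
  \;\le\; 2.1 \times 10^{-6}
```

Under this independence assumption, the reported bound would correspond to a PBPI miss rate of roughly $2.1 \times 10^{-4}$ on mismatched packages.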

Table of Contents:
Abstract (Chinese); Abstract; Acknowledgements; Contents; List of Figures; List of Tables
Chapter 1: Introduction
  1.1 Blister Identification
  1.2 Proposed Solutions
  1.3 Thesis Organization
Chapter 2: Related Work
  2.1 License Plate Recognition
  2.2 Scene Text Recognition
  2.3 Pharmaceutical Blister Identification
  2.4 End-to-End vs. Two-Staged Solutions
  2.5 Methods to Improve Object Detection
  2.6 End-to-End Unified Object Detection
Chapter 3: The ROR Method
  3.1 ROR Overview
  3.2 Feature Extractor, Localization Network, and RoIRotate Transform
    3.2.1 Feature Extractor
    3.2.2 Localization Network
    3.2.3 RoIRotate Transform
  3.3 Recognition Network
  3.4 Loss Functions
Chapter 4: Experiments
  4.1 Datasets
  4.2 Two-Stage Method
  4.3 Training Strategy
  4.4 Comparison with Two-Stage and End-to-End Methods
Chapter 5: Single-Side Pharmaceutic Blister Package Identification
  5.1 End-to-End Solutions to Object Detection
    5.1.1 Region-Based Framework
    5.1.2 Unified Framework
  5.2 Comparison of Experiment Results
    5.2.1 Implementation Details
    5.2.2 Experiment Results
Chapter 6: Dispensation Verification System (DVS)
  6.1 System Architecture
  6.2 System Implementation Details
  6.3 Verification System Error Analysis
Chapter 7: Conclusion
References
Appendix A: Glossary

Full-Text Release Date: 2025/08/06 (campus network, off-campus network, and National Central Library: Taiwan NDLTD system)