
Graduate student: Xin Zhang (張馨)
Thesis title: Using mobile devices to measure pupil size via distillation method based on the convolutional neural network (基於卷積神經網路之知識蒸餾法,應用於行動裝置上量測瞳孔尺寸)
Advisors: Allen Jong-Woei Whang (黃忠偉); Yi-Yung Chen (陳怡永)
Oral defense committee: Allen Jong-Woei Whang (黃忠偉); Yi-Yung Chen (陳怡永); Jui-Chu Lin (林瑞珠); Kung-Jeng Wang (王孔政)
Degree: Master
Department: Department of Electronic and Computer Engineering, College of Electrical Engineering and Computer Science
Year of publication: 2021
Graduation academic year: 109 (2020-2021)
Language: Chinese
Number of pages: 60
Chinese keywords: pupil measurement, deep learning, knowledge distillation, pupil size, convolutional neural network, dilated convolution
Foreign keywords: Pupil Detection, Deep Learning, Knowledge Distillation, Pupil Size, Convolutional Neural Network, Dilated Convolutions
    Ophthalmologists assess a patient's physiological state through the pupil's response to light. When examining a patient's pupils, a doctor uses a penlight and a pupil gauge as measurement aids, but the result still depends mainly on the doctor's past experience; different doctors therefore reach different judgments, which introduces measurement error. To remove this inter-observer variability, traditional algorithms were introduced. A traditional algorithm computes pupil position and size through extensive mathematical processing (for example, noise filtering and edge extraction) and iterative correction. Although this approach has a low error rate, high accuracy, and fast measurement, the resulting instruments are bulky and their hardware is expensive, so busy doctors cannot carry them for measurement at the point of care.
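    As a concrete point of reference for the classical approach described above, the following is a minimal sketch of such a pipeline in Python with OpenCV (noise filtering, thresholding, contour extraction, and ellipse fitting). The threshold value, the millimetre-per-pixel scale, and the function name are illustrative assumptions, not parameters taken from the thesis or from any particular instrument.

    import cv2

    # Minimal sketch of a classical pupil-measurement pipeline (OpenCV 4.x):
    # filter noise, isolate the dark pupil blob, and fit an ellipse to it.
    def estimate_pupil_size(gray_eye_image, mm_per_pixel=0.05):
        # 1. Suppress sensor noise before thresholding.
        blurred = cv2.GaussianBlur(gray_eye_image, (7, 7), 0)
        # 2. The pupil is the darkest region; isolate it with a fixed threshold.
        _, mask = cv2.threshold(blurred, 40, 255, cv2.THRESH_BINARY_INV)
        # 3. Keep the largest dark blob as the pupil candidate.
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)
        if not contours:
            return None
        pupil = max(contours, key=cv2.contourArea)
        if len(pupil) < 5:  # cv2.fitEllipse needs at least five points
            return None
        # 4. Fit an ellipse; its mean axis length approximates the diameter.
        (cx, cy), (minor, major), _ = cv2.fitEllipse(pupil)
        diameter_mm = 0.5 * (minor + major) * mm_per_pixel
        return (cx, cy), diameter_mm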
    In recent years, with the rise of artificial intelligence, it has been widely applied across many fields. AI can deliver results as accurate, fast, and low-error as traditional algorithms while, unlike those algorithms, not being tied to bulky, costly hardware. By deploying AI on mobile devices, lightweight, low-cost, real-time pupil measurement therefore becomes feasible.
    This thesis proposes an architecture based on convolutional neural networks (CNN) to measure pupil size and uses knowledge distillation to transfer the learning capacity of a large model to small models, so that the small models reach accuracy close to that of the large model and run successfully on mobile devices. The average error of every small model is below 10%, and their inference speed on mobile devices exceeds 35 FPS, achieving real-time operation. Grad-CAM (Gradient-weighted Class Activation Mapping) is used to confirm where the models focus their attention, which is mostly on the pupil region. According to the results, the best model in this thesis, Type II, has an average error of 5.54% and an inference speed of 37.77 FPS. Finally, the best model is validated on clinical images from a collaborating partner and further improved through transfer learning; the final model's predictions are compared with ophthalmologists' diagnoses and with measurements from the G4 medical device.
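    For readers unfamiliar with knowledge distillation, the sketch below shows the standard soft-target formulation with a distillation temperature, written in Python with TensorFlow (the framework referenced later in this thesis). The temperature, the loss weighting, and the assumption of classification-style logits are illustrative; the teacher and student architectures and the exact loss actually used in the thesis may differ.

    import tensorflow as tf

    # Minimal sketch of a soft-target distillation loss with temperature T.
    # The student is trained on a weighted sum of (a) ordinary cross-entropy
    # against the ground-truth labels and (b) the teacher's softened outputs.
    def distillation_loss(teacher_logits, student_logits, labels,
                          temperature=4.0, alpha=0.1):
        # Soft targets: both distributions are softened by dividing logits by T.
        soft_teacher = tf.nn.softmax(teacher_logits / temperature)
        log_soft_student = tf.nn.log_softmax(student_logits / temperature)
        kd = -tf.reduce_mean(
            tf.reduce_sum(soft_teacher * log_soft_student, axis=-1))
        kd *= temperature ** 2  # rescale so gradients match the hard-label term
        # Hard targets: standard cross-entropy on the true labels.
        ce = tf.reduce_mean(
            tf.nn.sparse_softmax_cross_entropy_with_logits(
                labels=labels, logits=student_logits))
        return alpha * ce + (1.0 - alpha) * kd

    A higher temperature softens the teacher's output distribution, so the student also learns from the relative weight the teacher assigns to non-target classes rather than only from the single hard label.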


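    The Grad-CAM check mentioned in the abstract can be reproduced in outline with a short Keras routine such as the hedged sketch below; the model handle, the convolutional layer name, and the output index are placeholders rather than the actual network trained in the thesis.

    import tensorflow as tf

    # Minimal Grad-CAM sketch: weight the feature maps of a chosen
    # convolutional layer by the pooled gradients of the selected output,
    # giving a coarse heat map of where the CNN is "looking".
    def grad_cam(model, image, conv_layer_name, output_index=0):
        grad_model = tf.keras.Model(
            inputs=model.inputs,
            outputs=[model.get_layer(conv_layer_name).output, model.output])
        with tf.GradientTape() as tape:
            conv_maps, preds = grad_model(image[tf.newaxis, ...])
            target = preds[:, output_index]
        grads = tape.gradient(target, conv_maps)
        weights = tf.reduce_mean(grads, axis=(1, 2))        # pooled gradients
        cam = tf.reduce_sum(
            conv_maps * weights[:, tf.newaxis, tf.newaxis, :], axis=-1)
        cam = tf.nn.relu(cam)[0]                            # drop batch axis
        return (cam / (tf.reduce_max(cam) + 1e-8)).numpy()  # normalize to [0, 1]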

    Chinese Abstract
    Abstract
    Acknowledgements
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1  Introduction
      1.1  Research Background
      1.2  Research Motivation
      1.3  Thesis Organization
    Chapter 2  The Eye
      2.1  The Human Eye
        2.1.1  Structure of the Human Eye
      2.2  The Pupil
      2.3  Pupillary Responses
        2.3.1  Pupil Dilation
        2.3.2  Pupil Constriction
      2.4  Overview of Pupil Examination
        2.4.1  Physician Diagnosis
    Chapter 3  Deep Learning
      3.1  Artificial Intelligence, Machine Learning, and Deep Learning
        3.1.1  Machine Learning
        3.1.2  Deep Learning
      3.2  Convolutional Neural Networks
        3.2.1  Convolutional Layer
        3.2.2  Dilated Convolution
        3.2.3  Pooling Layer
        3.2.4  Flatten Layer
        3.2.5  Fully Connected Layer
        3.2.6  Activation Function
        3.2.7  Output Layer
        3.2.8  Loss Function
          3.2.8.1  Mean Square Error
          3.2.8.2  Mean Absolute Error
          3.2.8.3  Cross-Entropy
        3.2.9  Model Fitting
          3.2.9.1  Underfitting
          3.2.9.2  Overfitting
      3.3  VGGNet
      3.4  Knowledge Distillation
        3.4.1  Theory of Knowledge Distillation
        3.4.2  Key Elements of Knowledge Distillation
        3.4.3  Knowledge Distillation Methods
        3.4.4  Distillation Temperature
      3.5  Explainable AI Methods
        3.5.1  CAM (Class Activation Mapping)
        3.5.2  Grad-CAM (Gradient-weighted Class Activation Mapping)
      3.6  TensorFlow Model Optimization
        3.6.1  Model Optimization Techniques
      3.7  Transfer Learning
    Chapter 4  Research Methods
      4.1  Image Sources
      4.2  Traditional Algorithm
      4.3  Deep Learning
        4.3.1  Network Architecture
          4.3.1.1  Teacher Model
          4.3.1.2  Student Model
          4.3.1.3  Knowledge Distillation
      4.4  Model Training
      4.5  Model Evaluation
        4.5.1  Model Error Tests
          4.5.1.1  First Test
          4.5.1.2  Second Test
        4.5.2  Model Speed Test
        4.5.3  Model Validation Test
      4.6  Research Results
    Chapter 5  Experiments and Discussion
      5.1  Experimental Procedure
        5.1.1  Clinical Background
        5.1.2  Model Evaluation 2
        5.1.3  Model Optimization
        5.1.4  Model Evaluation 3
      5.2  Experimental Results
        5.2.1  Comparison with the Traditional Algorithm
        5.2.2  Comparison with Hospital Instrument Measurements
        5.2.3  Comparison with Ophthalmologists' Measurements
      5.3  Discussion
    Chapter 6  Conclusions and Future Work
      6.1  Conclusions
      6.2  Future Work
    References


    Full-text release date: 2024/10/25 (campus network)
    Full-text release date: 2026/10/25 (off-campus network)
    Full-text release date: 2026/10/25 (National Central Library: Taiwan thesis and dissertation system)