
Author: Chien-Yu Chen (陳建祐)
Title: Fall Detection with Spatial-Temporal Correlation Encoded by the Sequence-to-Sequence Denoised GAN
Advisor: Jing-Ming Guo (郭景明)
Committee Members: Shin-Hsuan Yang (楊士萱), Nai-Jian Wang (王乃堅), Kuo-Liang Chung (鍾國亮), Chih-Hsien Hsia (夏至賢)
Degree: Master
Department: Department of Electrical Engineering, College of Electrical Engineering and Computer Science
Publication Year: 2020
Graduation Academic Year: 108 (2019-2020)
Language: Chinese
Pages: 105
Keywords (Chinese): Generative Adversarial Network, Fall Detection, Time-of-Flight Camera, Infrared Depth Images
Keywords (English): GAN, Fall Detection, Time of Flight, IR-depth Images
Abstract (Chinese):
    Falls among the elderly often lead to a higher risk of hospitalization and death, or to subsequent complications. Enabling an alarm system to trigger an immediate medical response at the moment a fall occurs has therefore become a crucial part of long-term elderly care. Most previous vision-based fall detection methods lack practical considerations, such as the camera mounting angle, day/night lighting variations, and the privacy concerns raised by where the camera is installed. In this study, depth maps and thermal images are used as the detection inputs; their advantage is that they protect user privacy and are unaffected by external light sources and day/night conditions, enabling long-term continuous detection and alerting. Under a deep learning framework, the occasional nature of falls leads to an imbalance between positive and negative samples, so an unsupervised learning architecture is adopted: the model is trained only on ordinary normal behavior, and fall accidents are defined as anomalous events. This study proposes a generative adversarial network with a denoising mechanism that predicts future frames from sequences of input frames (S2SdGAN), combining a generator network, a discriminator network, an optical flow network, and a denoising mechanism based on encoded-feature comparison. The generator adopts an encoder-decoder structure: given a sequence of frames of normal behavior, it predicts the corresponding future frame, thereby extracting the continuous motion features of the subject's normal activities, while the discriminator judges whether the generated future frame is structurally correct. An optical flow network additionally helps the generator learn the temporal dynamics across consecutive frames. Finally, the encoded-feature-comparison denoising mechanism suppresses the effect of jitter noise in the depth maps and judges whether the temporal dynamics of the future frame are normal, thereby achieving fall detection. Compared with state-of-the-art techniques, the proposed architecture achieves (1) multi-person fall detection, (2) real-time decisions on whether a fall has occurred, (3) a solution for subjects occluded after falling, and (4) a network that filters depth-image noise. A depth-map dataset containing various daily activities and fall accidents was collected to verify the stability and effectiveness of the system. The proposed fall detection framework better matches the conditions under which real falls occur and effectively achieves long-term continuous fall detection.
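    The encoder-decoder generator described above can be illustrated with a minimal PyTorch-style sketch: it stacks T consecutive depth frames as input channels and predicts the next frame. All layer counts, channel widths, and the FutureFrameGenerator name are illustrative assumptions, not the thesis's actual configuration.

    # Minimal sketch of a sequence-input encoder-decoder future-frame generator.
    # Layer sizes and names are assumptions for illustration only.
    import torch
    import torch.nn as nn

    class FutureFrameGenerator(nn.Module):
        def __init__(self, num_input_frames: int = 4):
            super().__init__()
            # Encoder: compress the stacked input frames into a latent feature map.
            self.encoder = nn.Sequential(
                nn.Conv2d(num_input_frames, 64, 4, stride=2, padding=1),
                nn.LeakyReLU(0.2),
                nn.Conv2d(64, 128, 4, stride=2, padding=1),
                nn.LeakyReLU(0.2),
            )
            # Decoder: upsample the features back to a single predicted frame.
            self.decoder = nn.Sequential(
                nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1),
                nn.ReLU(),
                nn.ConvTranspose2d(64, 1, 4, stride=2, padding=1),
                nn.Tanh(),  # predicted depth values normalized to [-1, 1]
            )

        def forward(self, frames: torch.Tensor) -> torch.Tensor:
            # frames: (batch, T, H, W) -> predicted future frame: (batch, 1, H, W)
            return self.decoder(self.encoder(frames))

    # Example: predict the 5th frame from 4 stacked 64x64 depth frames.
    pred = FutureFrameGenerator(4)(torch.randn(2, 4, 64, 64))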


Abstract (English):
    Falling is a major cause of personal injury and accidental death worldwide, especially among the elderly. For aged care, a fall alarm system is in high demand so that medical aid can be provided immediately when a fall occurs. Previous studies on fall detection lack the practical considerations required for real-world deployment, including the camera's mounting angle, lighting differences between day and night, and privacy protection for users. In our experiments, IR-depth images and thermal images are used as the input sources for fall detection; as a result, detailed facial information is not captured by the system, and detection is invariant to lighting conditions. Owing to the occasional nature of falls, data imbalance between falling samples and normal samples may occur, which is a major drawback for supervised learning approaches. Therefore, in this study, anomaly detection is performed with an unsupervised learning approach: the models are trained only on normal cases, and a falling accident is defined as an anomalous event. In this thesis, the Sequence-to-Sequence Denoised GAN (S2SdGAN) is proposed to perform spatial-temporal correlation analysis for fall detection. The proposed network comprises a future-frame generator, a frame discriminator, FlowNet, and a denoising scheme based on encoded-feature comparison. The designed framework provides (1) multi-subject detection, (2) a real-time fall alarm triggered by motion, (3) a solution for situations in which subjects become unseen after falling, and (4) a denoising scheme for depth images. Experimental results show that the proposed system achieves state-of-the-art performance on public datasets. In addition, a dataset that includes real-world falling accidents and other regular activities was further collected to verify the validity and robustness of the framework; the results suggest that the proposed system can successfully detect falls in real-world cases.
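    To make the anomaly-scoring idea concrete: the thesis (Section 4.4.2) compares a PSNR-based scoring rule against its encoded-feature comparison mechanism. The following minimal sketch shows only the generic PSNR variant, assuming frames normalized to [0, 1]; the threshold is an arbitrary placeholder, not a value from the thesis.

    # Minimal sketch of frame-prediction anomaly scoring via PSNR.
    import torch

    def psnr_score(pred: torch.Tensor, target: torch.Tensor) -> float:
        # High PSNR: the observed frame matches the predicted "normal" future.
        # Low PSNR: the motion deviates from the learned normal behavior.
        mse = torch.mean((pred - target) ** 2)
        return float(10.0 * torch.log10(1.0 / mse))

    def is_fall(pred: torch.Tensor, target: torch.Tensor, threshold: float = 20.0) -> bool:
        # A frame with large prediction error (low PSNR) is flagged as anomalous.
        return psnr_score(pred, target) < threshold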

Table of Contents:
    Abstract (Chinese) / Abstract (English) / Acknowledgements / Contents / List of Figures / List of Tables
    Chapter 1: Introduction
        1.1 Research Background and Motivation
        1.2 Thesis Organization
    Chapter 2: Literature Review
        2.1 Time-of-Flight Cameras
            2.1.1 Measurement Principle
            2.1.2 Depth-Map Imaging Principle
            2.1.3 Time-of-Flight vs. Other 3D Imaging Technologies and Application Domains
        2.2 Neural Networks
            2.2.1 Fully Convolutional Network (FCN) Architecture
            2.2.2 U-Net Architecture
            2.2.3 Visualization Techniques
        2.3 Generative Adversarial Networks
            2.3.1 DCGAN Architecture
            2.3.2 CGAN Architecture
            2.3.3 CycleGAN Architecture
            2.3.4 Pix2pix Architecture
        2.4 Review of Fall Detection Algorithms
            2.4.1 Vision-Based Fall Detection with Convolutional Neural Networks
            2.4.2 RGB-D Fall Detection via Deep Residual Convolutional LSTM Networks
            2.4.3 Implementation of Fall Detection System Based on 3D Skeleton for Deep Learning Technique
            2.4.4 DeepFall: Non-Invasive Fall Detection with Deep Spatio-Temporal Convolutional Autoencoders
            2.4.5 Strengths and Weaknesses of Fall Detection Algorithms
    Chapter 3: Fall Detection with Spatial-Temporal Correlation Encoded by the Sequence-to-Sequence Denoised GAN
        3.1 System Overview
        3.2 Network Architecture
            3.2.1 Generator Network Structure
            3.2.2 Discriminator Network Structure
            3.2.3 Encoded-Feature Comparison Mechanism
        3.3 Loss Functions
            3.3.1 L2 Loss
            3.3.2 Gradient Loss
            3.3.3 Optical Flow Loss
        3.4 Training Procedure
    Chapter 4: Experimental Data and Results
        4.1 Test Environment
        4.2 Datasets and Data Preprocessing
        4.3 Evaluation Metrics
            4.3.1 Area Under the Curve (AUC)
            4.3.2 Fall Detection Performance Metric ΔS
        4.4 Analysis and Comparison
            4.4.1 Comparison of Different Loss Function Combinations
            4.4.2 PSNR vs. the Encoded-Feature Comparison Mechanism as Fall Scoring Rules
            4.4.3 Comparison with Previous Work
        4.5 Experimental Results
            4.5.1 Fall Detection on Public Datasets
            4.5.2 Fall Detection on the Self-Recorded Dataset
    Chapter 5: Conclusion and Future Work
    References
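    The loss terms named in Section 3.3 above (L2, gradient, and optical-flow losses, plus the adversarial term) can be combined as in the hedged sketch below. The weights, the flow_net optical-flow estimator, and the discriminator interface (assumed to output sigmoid probabilities) are placeholder assumptions; the thesis's exact formulation may differ.

    # Hedged sketch of a combined generator loss: L2 + gradient + optical flow + adversarial.
    import torch
    import torch.nn.functional as F

    def image_gradients(x: torch.Tensor):
        # Horizontal and vertical finite differences of an image batch (B, C, H, W).
        return x[..., :, 1:] - x[..., :, :-1], x[..., 1:, :] - x[..., :-1, :]

    def generator_loss(pred, target, prev_frame, flow_net, disc,
                       w_l2=1.0, w_grad=1.0, w_flow=2.0, w_adv=0.05):
        # L2 (intensity) loss between predicted and ground-truth future frames.
        l2 = F.mse_loss(pred, target)
        # Gradient loss: match edge structure to discourage blurry predictions.
        (pdx, pdy), (tdx, tdy) = image_gradients(pred), image_gradients(target)
        grad = F.l1_loss(pdx, tdx) + F.l1_loss(pdy, tdy)
        # Optical-flow loss: the motion implied by the prediction should match
        # the true motion (flow_net is an assumed pretrained flow estimator).
        flow = F.l1_loss(flow_net(prev_frame, pred), flow_net(prev_frame, target))
        # Adversarial term: push the discriminator to score the prediction as real.
        adv = F.binary_cross_entropy(disc(pred), torch.ones_like(disc(pred)))
        return w_l2 * l2 + w_grad * grad + w_flow * flow + w_adv * adv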


    Full-Text Release Date: 2025/08/24 (campus network, off-campus network, and National Central Library: Taiwan NDLTD system)