
Graduate Student: TING-EN, YE (葉庭恩)
Thesis Title: An Improved GAN-based Method for Cross-domain Gait Recognition (利用改良式生成對抗網路進行跨領域步態辨識)
Advisor: Hsing-Kuo Pao (鮑興國)
Committee Members: Tien-Ruey Hsiang (項天瑞), Wei-Chung Teng (鄧惟中)
Degree: Master
Department: Department of Computer Science and Information Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2019
Graduation Academic Year: 107 (ROC calendar)
Language: English
Pages: 67
Keywords: Gait Recognition, Generative Adversarial Network, Gait Energy Image, Triplet loss, Attention mechanism

Abstract (translated from Chinese):

    In today's society, technologies in fields such as artificial intelligence, deep learning, and computer vision are maturing rapidly. Face recognition systems, for example, have already been put to practical use: people can pass identification checks safely and conveniently, although the subject must remain fairly close to the camera for recognition to work. In recent years, many researchers have studied recognizing pedestrians at long range, and one approach identifies pedestrians from images of their walking gait. Such methods can recognize pedestrians not only at long distances but also over a much wider range of camera angles, so people need not deliberately face the camera. This makes the technique applicable in many more settings, for example installing cameras in a shopping mall to recognize customers, track them, and analyze their shopping preferences.

    Although gait recognition could be applied widely in daily life, it also faces challenges. When we extract a pedestrian's gait silhouettes for analysis, the silhouettes change dramatically with clothing and camera viewpoint, so the same person produces many different gait silhouettes (intra-class variations), and recognition accuracy drops when the gait images come from different domains. This study focuses on making recognition over cross-domain gait silhouettes more reliable.

    This thesis uses a generative adversarial network to transform gait images from arbitrary domains into the gait image of a specific target domain (for example, the 90-degree view), achieving domain invariance: the resulting gait image is unaffected by appearance and viewpoint and thus helps improve recognition. Furthermore, since researchers have recently proposed methods that stabilize GAN image generation, this study builds on a more reliable GAN formulation to strengthen the realism of the generator's domain-transformed images; it also combines a triplet loss so that the generated gait images of the same identity are more faithful in the target domain.
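    For concreteness, the two objectives combined above can be written in their standard forms from the literature (the GAN minimax game of Goodfellow et al. and a FaceNet-style triplet loss); the notation below is generic and not taken from the thesis itself:

        % Standard GAN minimax objective:
        \min_G \max_D \;
            \mathbb{E}_{x \sim p_{\text{data}}}[\log D(x)]
          + \mathbb{E}_{z \sim p_z}[\log(1 - D(G(z)))]

        % Triplet loss for an embedding f with anchor a, positive p,
        % negative n, and margin \alpha:
        L_{\text{tri}} = \max\big(
            \lVert f(a) - f(p) \rVert_2^2
          - \lVert f(a) - f(n) \rVert_2^2 + \alpha,\; 0 \big)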


Abstract:

    In recent years, artificial intelligence, deep learning, and computer vision have developed rapidly. For example, face recognition systems have been deployed successfully in daily life, and people can pass through them safely and conveniently. However, their limitation is that the camera cannot be far from the subject if identification is to work. Research on recognizing people at long range has therefore been growing, and one prevalent approach analyzes gait-based images. In this way, we can recognize people not only at a longer distance but also from a wider angle, so they need not face the camera directly. Gait-based recognition can thus be used in many more places; for example, cameras installed in a shopping mall, even far from the customers, can identify guests in order to track them and analyze their shopping preferences.

    Although gait recognition can be widely used in daily life, the field still faces challenges. We take pedestrians' gait silhouettes for analysis, but a silhouette changes substantially when a person wears different clothes or is captured from a different camera view. The gait images of one person therefore show many variations, a phenomenon called intra-class variation, and recognition accuracy drops in such cross-domain situations. We focus on cross-domain gait recognition to deal with this challenging problem.
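    The keyword list and the experiments chapter indicate that the silhouettes are summarized as Gait Energy Images (GEIs): per-pixel averages of the aligned binary silhouettes over a gait cycle. A minimal NumPy sketch of that representation, assuming the silhouettes are already size-normalized and centered (names here are illustrative, not from the thesis):

        import numpy as np

        def gait_energy_image(silhouettes):
            """Average a stack of aligned binary silhouettes (T, H, W) into a GEI.

            Pixels that are foreground in every frame approach 1 (static body
            shape); pixels swept by the limbs take intermediate values (motion).
            """
            frames = np.asarray(silhouettes, dtype=np.float32)  # values in {0, 1}
            return frames.mean(axis=0)                          # (H, W) in [0, 1]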

    This thesis uses the generative adversarial network (GAN) approach to transform gait images from various view domains into a specific target domain (for example, a 90-degree view under the normal-clothing condition), yielding a view-invariant gait image. In addition, in view of recent proposals for improving GAN training, we apply a more reliable GAN formulation to generate more realistic gait images. We also introduce a triplet network so that the generated target-domain gait images better preserve the identity of the same person.
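    The table of contents (WGAN-GP in Section 3.1.4, triplet loss in Section 3.3.2) suggests the adversarial term follows WGAN-GP while a triplet term constrains the generator's outputs. The PyTorch sketch below illustrates one such combined update under that assumption; the networks G, D, and embed, the data arguments, and the weight lambda_tri are hypothetical placeholders rather than the thesis's actual implementation:

        import torch
        import torch.nn.functional as F

        def gradient_penalty(critic, real, fake):
            # WGAN-GP: push the critic's gradient norm toward 1 on interpolates.
            eps = torch.rand(real.size(0), 1, 1, 1, device=real.device)
            mixed = (eps * real + (1 - eps) * fake).requires_grad_(True)
            grads, = torch.autograd.grad(outputs=critic(mixed).sum(),
                                         inputs=mixed, create_graph=True)
            grads = grads.view(grads.size(0), -1)
            return ((grads.norm(2, dim=1) - 1) ** 2).mean()

        def train_step(G, D, embed, opt_g, opt_d,
                       src_gei, tgt_gei, pos_gei, neg_gei,
                       lambda_gp=10.0, lambda_tri=1.0):
            # src_gei: GEIs from arbitrary domains; tgt_gei: real GEIs in the
            # canonical domain (e.g., 90-degree view); pos/neg: same-/different-
            # identity GEIs in the target domain for the triplet term.

            # Critic (discriminator) update with the WGAN-GP loss.
            fake = G(src_gei).detach()
            d_loss = (D(fake).mean() - D(tgt_gei).mean()
                      + lambda_gp * gradient_penalty(D, tgt_gei, fake))
            opt_d.zero_grad(); d_loss.backward(); opt_d.step()

            # Generator update: adversarial term plus identity-preserving triplet.
            fake = G(src_gei)
            tri = F.triplet_margin_loss(embed(fake), embed(pos_gei),
                                        embed(neg_gei), margin=1.0)
            g_loss = -D(fake).mean() + lambda_tri * tri
            opt_g.zero_grad(); g_loss.backward(); opt_g.step()
            return d_loss.item(), g_loss.item()

    One natural pairing, consistent with the abstract though not confirmed by this record, is to anchor the triplet on the generated target-domain image, with real target-domain images of the same and of a different identity as positive and negative.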

Table of Contents:

    Recommendation Letter
    Approval Letter
    摘要 (Chinese Abstract)
    Abstract
    Acknowledgements
    Contents
    List of Figures
    List of Tables
    List of Algorithms
    1 Introduction
        1.1 Background and Challenges
        1.2 Proposed Method
        1.3 Thesis outline
    2 Related work
    3 Methodology
        3.1 Methods
            3.1.1 Generative Adversarial Network
            3.1.2 DCGAN
            3.1.3 Wasserstein GAN
            3.1.4 WGAN-GP
            3.1.5 Self-Attention GAN
            3.1.6 Triplet loss
        3.2 GaitGAN
            3.2.1 Generator
            3.2.2 Discriminator
            3.2.3 Adversarial Training
        3.3 Proposed Method
            3.3.1 Applying different GAN methods
            3.3.2 Combine Triplet loss
            3.3.3 Proposed method
    4 Experiments and Results
        4.1 Dataset
        4.2 Data Preprocessing
        4.3 Experiment settings
        4.4 Recognition pipeline and Evaluation
        4.5 Training details for proposed method
        4.6 Experiments of cross-view situation
            4.6.1 Compared methods: GEI-PCA and GaitGAN
            4.6.2 Proposed method
            4.6.3 Comparison with state-of-the-art
            4.6.4 Human visualization
        4.7 Experiments of cross-condition and cross-view situation
            4.7.1 Result of proposed method on probeBG
            4.7.2 Result of proposed method on probeCL
    5 Conclusions and Future work
    References
    Appendix A: Experimental result of tuning parameter
        A.1 Experiments of tuning parameter


Full-text Release Date: 2022/08/22 (campus network)
Full-text Release Date: 2024/08/22 (off-campus network)
Full-text Release Date: 2024/08/22 (National Central Library: Taiwan NDLTD system)