| Field | Value |
|---|---|
| Graduate Student | 程梓恩 Zih-En Cheng |
| Thesis Title | Residual Bi-LSTM-Based Wireless Navigation of Social Robot Authenticating Far-Distance Humans by Adjusted Attention YOLOX-Based Face Recognition |
| Advisor | 黃志良 Chih-Lyang Hwang |
| Committee Members | 黃志良 Chih-Lyang Hwang, 洪敏雄, 吳修明, 陳永耀 |
| Degree | Master |
| Department | College of Electrical Engineering and Computer Science, Department of Electrical Engineering |
| Year of Publication | 2023 |
| Academic Year of Graduation | 111 |
| Language | Chinese |
| Number of Pages | 56 |
| Keywords | Wireless localization and navigation; Residual Bi-LSTM; Face recognition at a far distance; Adjusted attention; YOLOX; Social robot |
| Views / Downloads | 255 / 0 |
To achieve effective wireless localization of a social robot in a globally GPS-denied region, a distributed modular UWB network (DM-UWBN) is trained and validated by a Residual Bi-LSTM (RBLSTM) model. For a robot moving along a figure-eight trajectory in a region containing two pillars and ten anchors, the average 2D pose localization error is (0.04 m, ). To integrate face recognition from a far distance, an adjusted attention-based YOLOX (AA-YOLOX) model for multiple faces is designed with an appropriate segmentation of the incipient image, so that faces at 15 m retain enough pixels for effective training, validation, and testing. On an AI platform (e.g., Jetson-AGX), the processing time for a full-resolution image with 8 segmentations is 296.3 ms, compared with 95.4 ms for its down-sampled counterpart. However, the down-sampling technique fails to recognize a face at a far distance (e.g., 30 m) or under different illumination (e.g., shade on a face). The average online video-based recognition rate at distances from 10 m to 15 m is 92.5%. Furthermore, the processing time of the RBLSTM, including transmission among tags and anchors, on the CPU of the Jetson-AGX is 150 ms. Finally, the task of the social robot guiding invited guests to designated locations is verified by the integration of wireless navigation, face recognition, and path tracking control.
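The abstract attributes the far-distance capability to segmenting the incipient high-resolution image into 8 pieces, so that small, distant faces keep their native pixels instead of being shrunk away by down-sampling before detection. A minimal sketch of such a tiling step is shown below; the 2×4 tile grid, the overlap ratio, and all function names are assumptions for illustration, not the thesis' actual AA-YOLOX implementation.

```python
import numpy as np

def tile_image(img, rows=2, cols=4, overlap=0.1):
    """Split an image into rows*cols overlapping tiles.

    Returns a list of (tile, (y0, x0)) pairs; the offsets let detections
    found in tile coordinates be mapped back to the full image.
    """
    h, w = img.shape[:2]
    th, tw = h // rows, w // cols
    oy, ox = int(th * overlap), int(tw * overlap)  # overlap margin in pixels
    tiles = []
    for r in range(rows):
        for c in range(cols):
            y0 = max(r * th - oy, 0)
            x0 = max(c * tw - ox, 0)
            y1 = min((r + 1) * th + oy, h)
            x1 = min((c + 1) * tw + ox, w)
            tiles.append((img[y0:y1, x0:x1], (y0, x0)))
    return tiles

def map_box_to_image(box, offset):
    """Map a detector box (x, y, w, h) from tile to full-image coordinates."""
    x, y, w, h = box
    y0, x0 = offset
    return (x + x0, y + y0, w, h)
```

Each tile would then be passed to the face detector at full resolution, and the per-tile boxes merged (e.g., with non-maximum suppression across tiles) after being offset back into image coordinates; the overlap margin keeps faces that straddle a tile boundary from being cut in half.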