
Graduate Student: Wen-Pin Chiang (江文彬)
Thesis Title: Text Plates Detection and Recognition Techniques for an Autonomous Robot to Assist Visually Impaired Persons in Natural Scenes (自主性機器人輔助視障者於自然環境下的字牌偵測與辨識技術)
Advisor: Chin-Shyurng Fahn (范欽雄)
Committee Members: Yi-Ping Hung (洪一平), Sheng-Jyh Wang (王聖智), Jung-Hua Wang (王榮華), Hahn-Ming Lee (李漢銘)
Degree: Master
Department: Department of Computer Science and Information Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2010
Graduation Academic Year: 98 (2009-2010)
Language: English
Number of Pages: 83
Chinese Keywords: 字牌偵測、字牌辨識、區域切割、線性鑑別分析、自主性機器人、自然環境
Keywords: text plate detection, text plate recognition, region segmentation, linear discriminant analysis, autonomous robot, natural environment
For most people, visually displayed information, such as message boards, door plates, and neon signs, is the fastest and most direct way to obtain information about the outside world; the visually impaired, however, cannot acquire information through these visual channels. In recent years, many researchers have devoted their efforts to Optical Character Recognition (OCR), whose main purpose is to help people extract text messages from images, so that the visually impaired also have the right to read text. In daily life, service robots can provide support in many areas, such as home security, entertainment, and guidance. For natural environments, how to use these robots to guide or assist the visually impaired in reaching a destination, and to inform them of the surrounding conditions and related text messages, is a problem worth investigating. This thesis presents text plate detection and recognition techniques for an autonomous robot in natural environments; the recognized results are read aloud promptly by synthesized electronic speech, so that the visually impaired can obtain the textual information in the environment. In addition, the data from a laser range finder and ultrasonic sensors are fused to assess environmental conditions and help the visually impaired travel safely.

In the autonomous robot we have constructed with text plate detection and recognition capabilities, the vision system is the main equipment for capturing surrounding information; it uses a PTZ camera to acquire sequential images of the indoor environment. In our experiments, text plates with high contrast in image intensity provide the information for region segmentation, and the characters on the plates are restricted to uppercase and lowercase English letters and Arabic numerals. While the autonomous robot travels, text plate detection is performed on the acquired image sequence: color information and shape ratios are analyzed first, and after the candidate regions are obtained, edge detection is applied to count the edge points in each region; regions with too few edge points are filtered out, and simple rules then remove the remaining non-text regions. Next, linear discriminant analysis is used for text plate recognition; if the result contains no unknown characters, the robot reads the content aloud through its speaker. In our experiments, the text plate detection rate is about 95.75% under the influence of lights and shadows, and about 87.5% under occlusion; in addition, the average text plate recognition rate is about 92.44%. In our system, the image resolution is 640×480, and the detection and recognition procedures process about sixteen frames per second.


For most people, visually displayed information, such as billboards and neon signs, is the fastest and most direct way to receive information about the outside world. The visually impaired, however, cannot obtain information through such visual channels. In recent years, many researchers have devoted great effort to the development of Optical Character Recognition. The main purpose of this research is to help people retrieve text messages from images, giving the visually impaired the right to read text as well. In our daily life, service robots can offer many kinds of support, for example, home security, entertainment, and road guidance. For natural environments, how to use these robots to guide or assist the visually impaired in reaching an intended destination is an issue worth investigating. In this thesis, we present text plate detection and recognition techniques for an autonomous robot operating in natural environments, and feed the recognition results to an electronic speech synthesizer to be read out promptly, so that the visually impaired can access the textual information in their environment. To help the visually impaired walk safely, data from a laser range finder and an ultrasonic sensor are fused to determine environmental conditions for obstacle avoidance.
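The abstract states only that laser range finder and ultrasonic readings are fused to judge environmental conditions; the fusion rule itself is not given in this record. Below is a minimal sketch of one plausible per-sector rule (keeping the shorter, more cautious reading), where the sector layout, function names, and the 0.8 m threshold are illustrative assumptions, not the thesis's parameters.

```python
# Hypothetical sketch of laser/ultrasonic range fusion for obstacle
# avoidance; sector layout, names, and the 0.8 m threshold are assumed.
from typing import List

SAFE_DISTANCE_M = 0.8  # assumed minimum clearance before warning

def fuse_ranges(laser_m: List[float], ultrasonic_m: List[float]) -> List[float]:
    """Fuse per-sector readings by keeping the shorter (more cautious) one."""
    return [min(l, u) for l, u in zip(laser_m, ultrasonic_m)]

def blocked_sectors(fused_m: List[float]) -> List[int]:
    """Return the indices of sectors whose clearance is below the threshold."""
    return [i for i, d in enumerate(fused_m) if d < SAFE_DISTANCE_M]

if __name__ == "__main__":
    laser = [2.5, 0.6, 3.1]       # left, front, right (meters)
    ultrasonic = [2.4, 0.9, 0.7]  # same sectors from the ultrasonic ring
    fused = fuse_ranges(laser, ultrasonic)
    print("fused:", fused)                      # [2.4, 0.6, 0.7]
    print("blocked:", blocked_sectors(fused))   # [1, 2]
```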

The vision system of our autonomous robot, which provides the text plate detection and recognition abilities, is the main equipment for capturing surrounding information. With a PTZ camera, we obtain sequential images of natural environments. In the experiments, we adopt text plates with high contrast in image intensity to provide the information for region segmentation, and the characters on the text plates are restricted to English letters and Arabic numerals. As the autonomous robot moves forward, the acquired image sequence is used for text plate detection. We analyze the color information and shape ratios in these images to determine candidate regions, then perform edge detection and count the edge points in each candidate region; regions containing too few edge points are filtered out, and several simple rules remove the remaining non-character regions. We recognize the text plates by applying the linear discriminant analysis method. If the result contains no unknown words, the robot reads the content aloud through its speaker. The experimental results reveal that the text plate detection rate is 95.75% under the influence of lights and shadows, and 87.5% when the text plates are partially occluded by blots. Besides, the average text plate recognition rate is about 92.44%. In our system, each captured frame contains 640×480 pixels, and the text plate detection and recognition procedures run at about 16 frames per second.
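To make the detection stages in the abstract concrete, here is a minimal sketch of the candidate-region pipeline, assuming OpenCV as the image library. The table of contents below names Otsu's binarization and connected component labeling, and the abstract names edge-point counting and simple shape rules; all numeric thresholds in this sketch are illustrative assumptions rather than the thesis's actual values.

```python
# Illustrative sketch of the detection stages named in the abstract:
# Otsu binarization, connected-component labeling, edge-point counting,
# and simple shape-ratio rules. All thresholds are assumed.
import cv2
import numpy as np

MIN_EDGE_POINTS = 50        # assumed: discard regions with too few edge points
ASPECT_RANGE = (0.2, 10.0)  # assumed: plausible width/height ratio of a plate
MIN_AREA = 300              # assumed: ignore tiny components

def detect_candidate_plates(frame_bgr: np.ndarray) -> list:
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    # Otsu's method picks the threshold maximizing the between-class variance
    # (Fisher's discriminant rate over the gray-level histogram).
    _, binary = cv2.threshold(gray, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    edges = cv2.Canny(gray, 100, 200)
    n, labels, stats, _ = cv2.connectedComponentsWithStats(binary)
    candidates = []
    for i in range(1, n):  # label 0 is the background
        x, y, w, h, area = stats[i]
        if area < MIN_AREA:
            continue
        aspect = w / float(h)
        if not (ASPECT_RANGE[0] <= aspect <= ASPECT_RANGE[1]):
            continue  # simple shape-ratio rule
        edge_points = int(np.count_nonzero(edges[y:y + h, x:x + w]))
        if edge_points < MIN_EDGE_POINTS:
            continue  # text regions contain many edges; drop flat regions
        candidates.append((x, y, w, h))
    return candidates
```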
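For the recognition stage, the abstract applies linear discriminant analysis to normalized letters but does not state the feature set here; the sketch below simply flattens normalized glyph images and uses scikit-learn's LinearDiscriminantAnalysis as a stand-in for the thesis's own classifier, so the glyph size and toy training data are assumptions.

```python
# Hedged sketch of LDA-based letter classification; the flattened-glyph
# features and toy data are assumptions, not the thesis's actual features.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

GLYPH_SIZE = (8, 8)  # assumed letter-normalization size

def glyph_features(glyph: np.ndarray) -> np.ndarray:
    """Flatten a normalized glyph image into a feature vector."""
    return glyph.astype(np.float64).ravel()

# Toy training set: noisy stand-in "glyphs" for three character classes.
rng = np.random.default_rng(0)
classes = ["A", "B", "0"]
X = np.vstack([glyph_features(rng.random(GLYPH_SIZE) + i)
               for i, _ in enumerate(classes) for _ in range(30)])
y = np.array([c for c in classes for _ in range(30)])

clf = LinearDiscriminantAnalysis()
clf.fit(X, y)

probe = glyph_features(rng.random(GLYPH_SIZE) + 1)  # near the "B" cluster
print(clf.predict([probe]))  # expected: ['B']
```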

Acknowledgements
Chinese Abstract
Abstract
Contents
List of Figures
Chapter 1 Introduction
  1.1 Overview
  1.2 Background and motivation
  1.3 Thesis organization
Chapter 2 System Description
  2.1 Hardware system description
  2.2 Vision system description
Chapter 3 Related Works
  3.1 Review of text detection
  3.2 Review of text recognition
Chapter 4 Text Plate Detection
  4.1 Otsu's binarization of blocks
    4.1.1 Computing Fisher's discriminant rate
    4.1.2 Global and local image binarizing
  4.2 Connected component processing
    4.2.1 Connected component labeling
    4.2.2 Edge detection
    4.2.3 Labeled component filtering
Chapter 5 Text Plate Recognition
  5.1 Image preprocessing
    5.1.1 Lighting compensation
    5.1.2 Letter normalization
  5.2 Letter classification
    5.2.1 Feature extraction
    5.2.2 Linear discriminant analysis
Chapter 6 Experimental Results and Discussions
  6.1 The results of text plate detection
  6.2 The results of text plate recognition
  6.3 The results of obstacle avoidance using the scanning laser range finder
Chapter 7 Conclusions and Future Works
  7.1 Conclusions
  7.2 Future works
References

