
Graduate student: 郭逸奇
Yi-qi Kuo
Thesis title: 基於支向機的方法在實際場景影像定位文字
A Support-Vector-Machine-Based Approach to Locating Texts in Real Scene Images
Advisor: 范欽雄
Chin-Shyurng Fahn
Oral examination committee: 李建德
Jiann-Der Lee
洪西進
Shi-Jinn Horng
范國清
Kuo-Chin Fan
莊仁輝
Jen-Hui Chuang
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Year of publication: 2005
Academic year of graduation: 93
Language: Chinese
Number of pages: 55
Keywords (Chinese): 文字偵測, 支向機, 獨立成分分析
Keywords (English): Independent component analysis, Text detection, Support vector machine

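Text in images is often affected by human and environmental factors: its size, style, color, orientation, and arrangement all vary, which makes automatic text localization a challenging problem. In this thesis, we propose a text localization technique based on support vector machines (SVMs). Its architecture has three parts: generation of candidate text blocks, feature extraction, and verification. In the candidate generation part, Canny edge detection is used to obtain a binary thin-edge image, connected components are then labeled with the method proposed by Suzuki et al., and preliminary filtering conditions yield the basic candidate text blocks. In the feature extraction part, a two-dimensional wavelet decomposition first splits each connected-component block into four sub-bands, and histogram and co-occurrence matrix features are extracted from each sub-band. Principal component analysis, linear discriminant analysis, and independent component analysis are then each used to reduce, through a transformation matrix, the dimensionality of the features taken from the four sub-bands, and the effects of the three methods on the text localization system are compared. The verification part uses a 1-norm soft-margin SVM: a kernel function first computes the inner products of the existing training samples in the feature space, and sequential minimal optimization then yields the optimal decision function, which verifies whether a candidate block contains text. Finally, the verified text blocks are merged into text lines. Experiments show that reducing the dimensionality of the block features greatly improves the running efficiency of the system while the localization accuracy remains high.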
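The feature extraction step described above (four wavelet sub-bands, then statistics per sub-band) can be illustrated with a minimal sketch. It rests on several assumptions that the abstract does not confirm: the Haar wavelet is used purely for simplicity, the function names `haar_dwt2`, `band_statistics`, and `block_features` are hypothetical, and only the mean and variance stand in for the histogram features (the co-occurrence matrix features are omitted).

```python
def haar_dwt2(image):
    """One-level 2D Haar decomposition of a grayscale image (list of rows).

    Assumption: the Haar wavelet is chosen for illustration only.
    Returns a dict with the four sub-bands LL, LH, HL, and HH.
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("image height and width must be even")
    bands = {"LL": [], "LH": [], "HL": [], "HH": []}
    for r in range(0, rows, 2):
        out = {name: [] for name in bands}
        for c in range(0, cols, 2):
            # 2x2 neighborhood: a b on the top row, d e on the bottom row.
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            out["LL"].append((a + b + d + e) / 2.0)  # smooth approximation
            # Sub-band naming conventions differ between texts; here LH is
            # the top-minus-bottom difference and HL the left-minus-right one.
            out["LH"].append((a + b - d - e) / 2.0)
            out["HL"].append((a - b + d - e) / 2.0)
            out["HH"].append((a - b - d + e) / 2.0)  # diagonal detail
        for name in bands:
            bands[name].append(out[name])
    return bands


def band_statistics(band):
    """Mean and (population) variance of the coefficients in one sub-band."""
    values = [v for row in band for v in row]
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, variance


def block_features(image):
    """Concatenate the per-band statistics into one feature vector."""
    bands = haar_dwt2(image)
    features = []
    for name in ("LL", "LH", "HL", "HH"):
        features.extend(band_statistics(bands[name]))
    return features
```

For a candidate block given as a list of pixel rows, `block_features(block)` returns eight numbers (two per sub-band). A vector of this kind would then be passed to a dimensionality reduction step such as PCA, LDA, or ICA.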


Owing to artificial and natural disturbances, text in real scene images varies in size, style, color, orientation, and arrangement, which makes automatic text localization an extremely challenging task. In this thesis, we propose an efficient text localization approach based on support vector machines (SVMs). It comprises three main processing phases: generation of candidate text blocks, feature extraction, and verification. First, we apply the Canny edge detection method to obtain a binary thin-edge image, and then acquire the candidate text blocks by labeling connected components. After noise removal, each block is decomposed by the discrete wavelet transform into four sub-band images (LL, LH, HL, and HH), and features consisting of a gray-level histogram and a co-occurrence matrix are extracted from each sub-band. Principal component analysis, linear discriminant analysis, and independent component analysis are then employed to reduce the dimensionality of the features, and the performance of the three methods is compared. Finally, a 1-norm soft-margin SVM verifies whether each candidate block contains text: a kernel function computes the inner products of the training samples in the feature space, and the optimal decision function is obtained by sequential minimal optimization. The verified blocks are then merged into text lines. Experimental results show that, after dimensionality reduction of the block features, our approach speeds up the text localization system while maintaining high accuracy.
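The verification phase can be sketched as the evaluation of a kernel SVM decision function, f(x) = sum_i alpha_i y_i K(x_i, x) + b, with a block accepted as text when f(x) >= 0. This is an illustration under stated assumptions, not the thesis implementation: the RBF kernel, the `gamma` value, the sign convention (+1 for text), and all function names are hypothetical, and the multipliers are set by hand rather than found by sequential minimal optimization.

```python
import math


def rbf_kernel(x, z, gamma=0.5):
    """Gaussian (RBF) kernel. Assumption: the kernel choice is illustrative."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


def satisfies_dual_constraints(alphas, labels, c, tol=1e-9):
    """Check the 1-norm soft-margin dual constraints:
    0 <= alpha_i <= C for every i, and sum_i alpha_i * y_i = 0."""
    in_box = all(-tol <= a <= c + tol for a in alphas)
    balanced = abs(sum(a * y for a, y in zip(alphas, labels))) <= tol
    return in_box and balanced


def decision_value(x, support_vectors, labels, alphas, bias, kernel=rbf_kernel):
    """f(x) = sum_i alpha_i * y_i * K(x_i, x) + b."""
    total = sum(
        a * y * kernel(sv, x)
        for sv, y, a in zip(support_vectors, labels, alphas)
    )
    return total + bias


def is_text_block(x, support_vectors, labels, alphas, bias):
    """Classify a feature vector as text (True) when f(x) >= 0."""
    return decision_value(x, support_vectors, labels, alphas, bias) >= 0.0


# Hand-set toy model (hypothetical values, not trained):
# one non-text support vector at the origin, one text support vector at (2, 2).
support_vectors = [(0.0, 0.0), (2.0, 2.0)]
labels = [-1, 1]
alphas = [1.0, 1.0]
bias = 0.0
```

With this toy model, `is_text_block((2.0, 2.0), support_vectors, labels, alphas, bias)` is true and the same call for `(0.0, 0.0)` is false. In a trained classifier the multipliers and the bias would come from a solver such as sequential minimal optimization, and `satisfies_dual_constraints` expresses the conditions that solver must maintain.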

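Abstract (Chinese)
Abstract (English)
Acknowledgments
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
  1.1 Motivation
  1.2 Related Work
  1.3 Organization of the Thesis and Architecture of the Text Localization System
Chapter 2 Determining Candidate Text Blocks
  2.1 Edge Detection
    2.1.1 Gaussian Blur
    2.1.2 Computing the Gradient and Angle
    2.1.3 Non-maximum suppression
    2.1.4 Hysteresis
  2.2 Determining the Gradient Threshold
    2.2.1 Otsu's Method
    2.2.2 Kapur's Method
    2.2.3 PNN Method
  2.3 Generating Candidate Blocks
    2.3.1 Labeling Connected Components
    2.3.2 Filtering Connected Components
Chapter 3 Block Feature Extraction
  3.1 Discrete Wavelet Transform
    3.1.1 One-Dimensional Wavelet Transform
    3.1.2 Two-Dimensional Wavelet Transform
  3.2 Computing Statistical Features
  3.3 Reducing the Dimensionality of the Feature Data
    3.3.1 Principal Component Analysis
    3.3.2 Linear Discriminant Analysis
    3.3.3 Independent Component Analysis
Chapter 4 Verifying Candidate Blocks and Obtaining Text Lines
  4.1 Linear SVM Classifier
  4.2 Nonlinear SVM Classifier
  4.3 Sequential Minimal Optimization
  4.4 Bootstrap
  4.5 Obtaining Text Lines
Chapter 5 Experimental Results
  5.1 SVM Training Samples
  5.2 Experimental Data and Analysis
  5.3 Text Localization Results
Chapter 6 Conclusions and Future Work
  6.1 Conclusions
  6.2 Future Work
References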

[1] K. Jung, K. I. Kim and A. K. Jain, “Text information extraction in images and video: a survey,” Pattern Recognition, vol. 37, no. 5, pp. 977-997, 2004.
[2] V. Wu, R. Manmatha and E. M. Riseman, “Textfinder: an automatic system to detect and recognize text in images,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 21, no. 11, pp. 1224-1229, 1999.
[3] M. Cai, J. Song and M. Lyu, “A new approach for video text detection,” IEEE Int. Conf. on Image Processing, vol. 1, no 22-25, pp. 117-120, 2002.
[4] Q. Ye, W. Gao, W. Wang and W. Zeng, “A robust text detection algorithm in images and video frames,” Proc. of the IEEE Conf. on Information, Communications and Signal Processing, vol. 2, no 15-18, pp. 802 – 806, 2003.
[5] R. Lienhart and A. Wernicke, “Localizing and segmenting text in images and videos,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 12, no. 4, pp. 256-268, 2002.
[6] D. Chen, H. Bourlard and J. P. Thiran, “Text identification in complex background using SVM,” Proc. of the 2001 IEEE Computer Society Conf. on Computer Vision and Pattern Recognition, vol. 2, pp. 621-626, 2001.
[7] X. Chen, J. Yang, J. Zhang and A. Waibel, “Automatic detection and recognition of signs from natural scenes,” IEEE Transactions on Image Processing, vol. 13, no. 1, pp. 87-99, 2004.
[8] Li Huiping, D. Doermann and O. Kia, “Automatic text detection and tracking in digital video,” IEEE Transactions on Image Processing, vol. 9, no. 1, pp. 147-156, 2000.
[9] K. I. Kim, K. Jung and J. H. Kim, “Texture-based approach for text detection in images using support vector machines and continuously adaptive mean shift algorithm,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 25, no. 12, pp. 1631-1639, 2003.
[10] R. C. Gonzalez and R. E. Woods, “Digital Image Processing,” Addison-Wesley Publishing Company, 2nd Edition, 1992.
[11] J. Canny, “A Computational Approach to Edge Detection,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 8, no. 6, 1986.
[12] N. Otsu, “A threshold selection method from gray-level histograms,” IEEE Transactions on Systems, Man, and Cybernetics, vol. SMC-9, no. 1, pp. 62-66, 1979.
[13] J. N. Kapur, P. K. Sahoo and A. K. C. Wong, “A new method for gray-level picture thresholding using the entropy of the histogram,” Computer Vision, Graphics, and Image Processing, vol. 29, no. 3, pp. 273-285, 1985.
[14] K. L. Chung and W. Y. Chen, “Fast adaptive PNN-based thresholding algorithms,” Research Report, Department of Computer Science and Information Engineering, National Taiwan Univ. of Sci. and Tech., 2001.
[15] K. Suzuki, I. Horiba and N. Sugie, “Linear-time connected-component labeling based on sequential local operations,” Computer Vision and Image Understanding, vol. 89, no. 1, pp. 1-23, 2003.
[16] Y. Zhu, T. Tan and Y. Wang, “Font recognition based on global texture analysis,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 10, pp. 1192-1200, 2001.
[17] 林咸仁, “改良線性鑑別式分析在少量訓練樣本下之人臉辨識研究” [A study of face recognition using improved linear discriminant analysis with few training samples], Master's thesis, Department of Computer Science and Information Engineering, National Cheng Kung University, 2002.
[18] M. Welling, “Fisher Linear Discriminant Analysis,” Department of Computer Science, University of Toronto.
[19] A. Hyvärinen and E. Oja, “Independent Component Analysis: Algorithms and Applications,” Neural Networks, vol. 13, no. 4-5, pp. 411-430, 2000.
[20] L. J. Cao and W. K. Chong, “Feature extraction in support vector machine: a comparison of PCA, KPCA and ICA,” Proceedings of the 9th International Conference on Neural Information Processing, vol. 2, no. 18-22, pp. 1001-1005, 2002.
[21] N. Cristianini and J. Shawe-Taylor, “An Introduction to Support Vector Machines and Other Kernel-based Learning Methods,” Cambridge University Press, 2000.
[22] http://research.microsoft.com/users/jplatt/smo.html
