
Graduate student: 羅盛麟 (Sheng-lin Lo)
Thesis title: 一種擷取自照相機的扭曲文件影像之復原方法
A Dewarping Method for the Distorted Document Image Captured by a Camera
Advisor: 范欽雄 (Chin-shyurng Fahn)
Oral defense committee: 鍾國亮 (Kuo-liang Chung), 廖弘源 (Hung-yuan Liao), 曾定章 (Din-chang Tseng), 王榮華 (Jung-hua Wang)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Year of publication: 2005
Academic year of graduation: 93 (ROC calendar; 2004-2005)
Language: Chinese
Number of pages: 54
Keywords (Chinese): 立方雲規配適法, 文字行, 復原扭曲方法, 扭曲文件影像, 傾斜文件影像
Keywords (English): distorted document image, dewarping method, tilted document image, text line, cubic spline fitting
    Cameras are portable and capture images quickly, which makes them a more efficient and more convenient document-image input device than scanners. As cameras built into mobile devices such as personal digital assistants (PDAs) and smartphones become increasingly common, document image processing on these devices grows in importance. When a document is photographed with a hand-held camera or a mobile device, distortion of the document itself is a common problem. This thesis proposes a method for dewarping distorted document images that needs only a single camera and no auxiliary devices. It relaxes several restrictions of existing dewarping methods. Functionally, it handles bound documents that span two pages, two-column documents, and compound documents that contain figures or tables. In principle, it is not limited by the degree of distortion, and it also rectifies tilted document images. The method first analyzes the document type and locates the text lines in the document image. It then fits a mathematical function to each text line by cubic spline interpolation. Next, following an optimal linear combination principle, it selects the two most representative text lines to reconstruct the distortion model, which is then used to rectify the image. Experimental results show that the method effectively corrects distorted and tilted document images and improves the accuracy of subsequent optical character recognition (OCR).
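
    The curve-fitting step in the abstract can be illustrated with a small sketch. The record does not give the thesis's actual formulas, so everything below is an assumption made for illustration: the function names `natural_cubic_spline` and `straighten_row`, the choice of natural boundary conditions (zero second derivative at both ends), and the linear blend between two fitted lines are hypothetical and are not taken from the thesis. The sketch fits a natural cubic spline through the midpoints sampled along a text line, then uses two such fitted lines to map a pixel to the row it would occupy on a flat page.

```python
from bisect import bisect_right


def natural_cubic_spline(xs, ys):
    """Return a callable natural cubic spline through the points (xs[i], ys[i]).

    xs must be strictly increasing and contain at least three points.
    Natural boundary conditions are an assumption of this sketch.
    """
    n = len(xs) - 1
    if n < 2:
        raise ValueError("need at least three points")
    h = [xs[i + 1] - xs[i] for i in range(n)]

    # Solve the tridiagonal system for the second derivatives m[1..n-1]
    # (m[0] = m[n] = 0) with the Thomas algorithm.
    cp = [0.0] * (n + 1)
    dp = [0.0] * (n + 1)
    for i in range(1, n):
        a, b, c = h[i - 1], 2.0 * (h[i - 1] + h[i]), h[i]
        d = 6.0 * ((ys[i + 1] - ys[i]) / h[i] - (ys[i] - ys[i - 1]) / h[i - 1])
        denom = b - a * cp[i - 1]
        cp[i] = c / denom
        dp[i] = (d - a * dp[i - 1]) / denom
    m = [0.0] * (n + 1)
    for i in range(n - 1, 0, -1):
        m[i] = dp[i] - cp[i] * m[i + 1]

    def spline(x):
        # Pick the interval containing x; clamp so the end pieces extrapolate.
        i = min(max(bisect_right(xs, x) - 1, 0), n - 1)
        left, right = x - xs[i], xs[i + 1] - x
        return ((m[i] * right ** 3 + m[i + 1] * left ** 3) / (6.0 * h[i])
                + (ys[i] / h[i] - m[i] * h[i] / 6.0) * right
                + (ys[i + 1] / h[i] - m[i + 1] * h[i] / 6.0) * left)

    return spline


def straighten_row(x, y, top, bottom, y_top, y_bottom):
    """Map pixel (x, y) between two fitted text lines to a flat-page row.

    Hypothetical simplification: the pixel's relative position between the
    two reference curves at column x is kept between two straight target rows.
    """
    t = (y - top(x)) / (bottom(x) - top(x))
    return y_top + t * (y_bottom - y_top)


# Made-up midpoints of two curved text lines (x = column, y = row).
top_line = natural_cubic_spline([0, 50, 100], [10, 14, 10])
bottom_line = natural_cubic_spline([0, 50, 100], [110, 118, 110])

# A pixel halfway between the two curves at column 50 lands halfway
# between the two target rows 10 and 110.
mid_y = 0.5 * (top_line(50) + bottom_line(50))
flat_row = straighten_row(50, mid_y, top_line, bottom_line, 10, 110)
```

    A spline passes exactly through every sampled midpoint, which suits text lines with locally varying curvature. A library routine with the same boundary condition could replace the hand-written solver.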

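    Abstract (Chinese)
    Abstract (English)
    Acknowledgments
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1 Introduction
      1.1 Motivation
      1.2 Related Work
      1.3 Research Method
      1.4 System Architecture
    Chapter 2 Preprocessing
      2.1 Lens Distortion Correction
      2.2 Binarization
      2.3 Document Type Analysis
      2.4 Locating Text Lines
        2.4.1 Dilating the Image
        2.4.2 Finding the Midpoints of Vertical Blocks
        2.4.3 Linking Midpoints into Text Lines and Smoothing
      2.5 Text and Graphics Separation
    Chapter 3 Dewarping Distorted Images
      3.1 Distorted Image Model
      3.2 Curve Fitting
        3.2.1 Least-Squares Regression
        3.2.2 Lagrange Interpolation
        3.2.3 Cubic Spline Interpolation
        3.2.4 Choosing a Curve-Fitting Method
      3.3 Building the Model from Text Lines
      3.4 Dewarping Distorted Document Images
      3.5 Rectifying Tilted Document Images
    Chapter 4 System Implementation and Experimental Results
      4.1 System Implementation
      4.2 Experimental Results
      4.3 Discussion
        4.3.1 Improvement in Text-Line Curvature
        4.3.2 OCR Recognition Rate Before and After Dewarping
    Chapter 5 Conclusions and Future Work
      5.1 Conclusions
      5.2 Future Work
    References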

