
Author: Ronny Haryanto
Thesis Title: Mobile Book Recognition using Text and Feature Extraction
Advisor: Chang Hong Lin (林昌鴻)
Committee: Jenq-Shiou Leu (呂政修), Ching Shun Lin (林敬舜), Chia Han Lee (李佳翰)
Degree: Master
Department: Department of Electronic and Computer Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2013
Graduation Academic Year: 101
Language: English
Pages: 71
Keywords: augmented reality, book cover recognition, text extraction, feature extraction
Access count: 191 views, 0 downloads
As smartphones become more popular, mobile Augmented Reality (AR) applications are growing, especially applications that recognize an object and overlay relevant information on the smartphone’s video viewfinder. Many applications outsource the recognition process to a server over a wireless connection; however, this approach suffers from low performance due to restricted bandwidth, which limits scalability in the number of client devices. In this thesis, we propose a scalable mobile augmented reality framework that minimizes the bandwidth reliance of the client-server augmented reality model and divides the work between server and client.

As a case study, we propose a mobile book cover recognition system based on the client-server model. On the mobile device, the user captures a query image, which is sent to the server. On the server side, both title extraction and feature extraction are performed. Text detection: wavelet decomposition is used to measure the energy intensity across the captured book image and locate the approximate position of the text; an OCR engine then extracts the textual information. Feature extraction: the binary feature detector ORB is used to detect interest points, and the binary descriptor BRISK is used to describe them. This yields two types of features: a set of words and a set of interest-point descriptors, which are matched against databases of book titles and image descriptors, respectively. The recognized text is matched against the title database with a string-similarity ranking algorithm to find the most similar title. The extracted image descriptors are matched against the descriptor database with a nearest-neighbor algorithm, and the book with the most matched descriptors is identified. Finally, fuzzy decision making combines both cues to select the best-matching book.
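The text-detection step relies on the observation that text strokes concentrate high-frequency energy. A minimal pure-Python sketch of that idea, using a one-level 2-D Haar decomposition over non-overlapping 2x2 blocks (the thesis's actual wavelet family, decomposition depth, and block sizes are not specified in this record), could look like:

```python
def haar_detail_energy(img):
    """One-level 2-D Haar transform of a grayscale image (list of rows).

    For each non-overlapping 2x2 block, compute the horizontal (LH),
    vertical (HL) and diagonal (HH) detail coefficients and return a
    half-resolution map of their combined energy. High-energy cells mark
    high-frequency regions, which is where text strokes tend to lie.
    """
    h, w = len(img), len(img[0])
    energy = []
    for y in range(0, h - 1, 2):
        row = []
        for x in range(0, w - 1, 2):
            a, b = img[y][x],     img[y][x + 1]
            c, d = img[y + 1][x], img[y + 1][x + 1]
            lh = (a + b - c - d) / 4.0   # horizontal detail
            hl = (a - b + c - d) / 4.0   # vertical detail
            hh = (a - b - c + d) / 4.0   # diagonal detail
            row.append(lh * lh + hl * hl + hh * hh)
        energy.append(row)
    return energy
```

Thresholding this map would give candidate text regions to pass to the OCR engine; a uniform background produces zero detail energy, while sharp stroke edges produce large values.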
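ORB and BRISK both produce binary descriptors, which are compared with Hamming distance rather than Euclidean distance. The matching-and-voting step described above can be sketched as a brute-force nearest-neighbor vote (the thesis would more likely use an approximate index in practice, cf. reference [23]; the descriptor values and database layout below are illustrative assumptions):

```python
def hamming(a, b):
    # XOR then popcount: distance between binary descriptors stored as ints
    return bin(a ^ b).count("1")

def best_book_by_features(query_desc, db):
    """db maps book_id -> list of binary descriptors (ints).

    Each query descriptor votes for the book owning its nearest
    database descriptor; the book with the most votes wins.
    """
    votes = {}
    for q in query_desc:
        best_id, best_dist = None, None
        for book_id, descs in db.items():
            for d in descs:
                dist = hamming(q, d)
                if best_dist is None or dist < best_dist:
                    best_id, best_dist = book_id, dist
        votes[best_id] = votes.get(best_id, 0) + 1
    return max(votes, key=votes.get)
```

Real ORB/BRISK descriptors are 256 or 512 bits; storing them as Python ints keeps the XOR-popcount comparison exact regardless of width.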
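For the string-similarity ranking of OCR output against the title database, one standard choice, consistent with the Dice coefficient cited in the references ([25]), is the Dice coefficient over character bigrams; the exact ranking algorithm used in the thesis is not stated in this record:

```python
def bigrams(s):
    # Overlapping character pairs, case-folded
    s = s.lower()
    return [s[i:i + 2] for i in range(len(s) - 1)]

def dice_similarity(a, b):
    """Dice coefficient over bigram multisets: 2*|A ∩ B| / (|A| + |B|)."""
    ba, bb = bigrams(a), bigrams(b)
    if not ba or not bb:
        return 0.0
    remaining = list(bb)
    overlap = 0
    for g in ba:
        if g in remaining:
            remaining.remove(g)
            overlap += 1
    return 2.0 * overlap / (len(ba) + len(bb))

def rank_titles(query, titles):
    # Best-matching title first
    return sorted(titles, key=lambda t: dice_similarity(query, t), reverse=True)
```

Bigram overlap tolerates the single-character substitutions and drops typical of OCR errors, which is why it suits noisy recognized text better than exact matching.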



    Table of Contents
    Abstract
    Table of Contents
    List of Figures
    List of Tables
    1. INTRODUCTION
       1.1. Motivation
       1.2. Objective and Contribution
       1.3. Thesis Organization
    2. RELATED WORKS
       2.1. Marker-based Augmented Reality
       2.2. Markerless Augmented Reality
    3. PROPOSED METHODS
       3.1. Client-Server Communication
       3.2. Book Cover Features Extraction
            3.2.1. Features Detection
            3.2.2. Descriptor Extraction
            3.2.3. Feature Matching Algorithm
       3.3. Title Extraction
            3.3.1. Candidate Text Pixels Detection
            3.3.2. Text Recognition
            3.3.3. Similarity Matching
       3.4. Decision Making
    4. EXPERIMENTAL RESULTS
       4.1. Developing Platform
       4.2. Experimental Results
            4.2.1. Text Detection
            4.2.2. Text Recognition and Similarity Matching
            4.2.3. Features Matching
            4.2.4. Execution Time
       4.3. Environment Testing
            4.3.1. Scale Differences
            4.3.2. Tilt
            4.3.3. Occlusion
            4.3.4. Motion Blur
            4.3.5. Recognition Result
    5. CONCLUSION AND FUTURE WORKS
       5.1. Conclusions
       5.2. Future Works
    References

    References

    [1] R. Azuma, “A survey of augmented reality,” Presence: Teleoperators and Virtual Environments, vol. 6, no. 4, Aug. 1997, pp. 355-385.
    [2] D. Chen, S. Tsai, R. Vedantham, R. Grzeszczuk, and B. Girod, “Streaming mobile augmented reality on mobile phones,” in International Symposium on Mixed and Augmented Reality (ISMAR), Orlando, FL, USA, Oct. 2009, pp. 181-182.
    [3] G. Takacs, V. Chandrasekhar, S. Tsai, D. Chen, R. Grzeszczuk, and B. Girod, “Unified real-time tracking and recognition with rotation-invariant fast features,” in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), San Francisco, CA, USA, June 2010, pp. 934-941.
    [4] D. G. Lowe, “Distinctive image features from scale-invariant keypoints,” International Journal of Computer Vision, vol. 60, no. 2, Nov. 2004, pp. 91-110.
    [5] H. Bay, A. Ess, T. Tuytelaars, and L. V. Gool, “Speeded-up robust features (SURF),” Computer Vision and Image Understanding, vol. 110, June 2008, pp. 346-359.
    [6] V. Chandrasekhar, Y. Reznik, G. Takacs, D. Chen, S. Tsai, R. Grzeszczuk, and B. Girod, “Quantization schemes for low bitrate compressed histogram of gradients descriptors,” in IEEE Computer Vision and Pattern Recognition Workshops (CVPRW), San Francisco, CA, USA, June 2010, pp. 33-40.
    [7] D. Wagner and D. Schmalstieg, “First steps towards handheld augmented reality,” in Proc. 7th Int'l Conf. on Wearable Computers (ISWC '03), 2003, pp. 127-135.
    [8] D. Wagner and D. Schmalstieg, “ARToolKitPlus for pose tracking on mobile devices,” in Proc. 12th Computer Vision Winter Workshop (CVWW '07), 2007, pp. 139-146.
    [9] I. Skrypnyk and D. Lowe, “Scene modeling, recognition and tracking with invariant image features,” in Proc. Int'l Symp. on Mixed and Augmented Reality (ISMAR '04), 2004, pp. 110-119.
    [10] V. Lepetit, P. Lagger, and P. Fua, “Randomized trees for real-time keypoint recognition,” in Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR '05), 2005, pp. 775-781.
    [11] G. Takacs, V. Chandrasekhar, N. Gelfand, Y. Xiong, W.-C. Chen, T. Bismpigiannis, R. Grzeszczuk, K. Pulli, and B. Girod, “Outdoors augmented reality on mobile phone using loxel-based visual feature organization,” IEEE Trans. Pattern Analysis and Machine Intelligence, 2008.
    [12] J. Ha, K. Cho, F. A. Rojas, and H. S. Yang, “Real-time scalable recognition and tracking based on the server-client model for mobile augmented reality,” in IEEE International Symposium on Virtual Reality Innovations, March 2011, pp. 267-272.
    [13] D. Chen, S. Tsai, C. H. Hsu, J. P. Singh, and B. Girod, “Mobile augmented reality for books on a shelf,” in IEEE International Conference on Multimedia and Expo (ICME), July 2011, pp. 1-6.
    [14] B. R. Huang, C. H. Lin, and C. H. Lee, “Mobile augmented reality based on cloud computing,” in International Conference on Anti-Counterfeiting, Security and Identification, Aug. 2012, pp. 1-5.
    [15] B. P. Lin, W.-H. Tsai, C. C. Wu, P. H. Hsu, J. Y. Huang, and T.-H. Liu, “The design of cloud-based 4G/LTE for mobile augmented reality with smart mobile devices,” in IEEE International Symposium on Service Oriented System Engineering (SOSE), March 2013, pp. 561-566.
    [16] M. Y. Hsieh and W.-H. Tsai, “A study on indoor navigation by augmented reality and down-looking omni-vision techniques using mobile devices,” Technical Report, Institute of Multimedia Engineering, Department of Computer Science, NCTU, Hsinchu, Taiwan, July 2012.
    [17] A. B. Tillon, I. Marchal, and P. Houlier, “Mobile augmented reality in the museum: Can a lace-like technology take you closer to works of art?,” in International Symposium on Mixed and Augmented Reality - Arts, Media, and Humanities (ISMAR-AMH), Oct. 2011, pp. 41-47.
    [18] D. Wagner, G. Reitmayr, A. Mulloni, T. Drummond, and D. Schmalstieg, “Real-time detection and tracking for augmented reality on mobile phones,” IEEE Trans. Visualization and Computer Graphics, vol. 16, no. 3, June 2010, pp. 355-368.
    [19] M. Ozuysal, P. Fua, and V. Lepetit, “Fast keypoint recognition in ten lines of code,” in Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR '07), 2007, pp. 1-8.
    [20] E. Rublee, V. Rabaud, K. Konolige, and G. Bradski, “ORB: an efficient alternative to SIFT or SURF,” in IEEE International Conference on Computer Vision (ICCV), Barcelona, Spain, Nov. 2011, pp. 2564-2571.
    [21] S. Leutenegger, M. Chli, and R. Y. Siegwart, “BRISK: binary robust invariant scalable keypoints,” in IEEE International Conference on Computer Vision (ICCV), Barcelona, Spain, Nov. 2011, pp. 2548-2555.
    [22] O. Miksik and K. Mikolajczyk, “Evaluation of local detectors and descriptors for fast feature matching,” in International Conference on Pattern Recognition (ICPR), Tsukuba, Japan, Nov. 2012, pp. 2681-2684.
    [23] M. Muja and D. G. Lowe, “Fast approximate nearest neighbors with automatic algorithm configuration,” in International Conference on Computer Vision Theory and Applications (VISAPP), 2009.
    [24] Q. Ye, Q. Huang, W. Gao, and D. Zhao, “Fast and robust text detection in images and video frames,” Image and Vision Computing, vol. 23, no. 6, June 2005, pp. 565-576.
    [25] L. R. Dice, “Measures of the amount of ecologic association between species,” Ecology, vol. 26, no. 3, July 1945, pp. 297-302.
