
Graduate student: 盧凱傑 (Kai-jay Lu)
Thesis title: 機器人的仿真人閱讀鋼琴譜技術 (Humanoid Reading Printed Piano Scores Techniques for Robots)
Advisor: 范欽雄 (Chin-shyurng Fahn)
Committee members: 傅立成 (Li-chen Fu), 宋開泰 (Kai-tai Song), 李漢銘 (Hahn-ming Lee)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Year of publication: 2009
Academic year of graduation: 97 (ROC calendar)
Language: English
Number of pages: 51
Chinese keywords: 娛樂型機器人, 仿真人閱讀鋼琴譜, 光學樂譜辨識, 邊看邊辨識方式
English keywords: entertainment robot, humanoid reading, printed piano score, optical music recognition, recognizing while looking manner



In this thesis, we present a novel document analysis technique that differs from traditional methods, which recognize images produced by a scanner. Instead, we use an off-the-shelf camera to capture an image of the document in front of it and recognize its content. Motivated by the recent surge of research interest in robots, especially entertainment robots, we apply this technique to printed piano scores. Our goal is for an entertainment robot to read the same printed piano scores that people use and to perform what it reads in real time.
To build the robot vision system, we adopt a PTZ camera of the kind used in video surveillance. Although this type of camera does not offer the high resolution of a digital camera, it can pan and tilt over wide angles, focus automatically, and zoom at high magnification. We use these functions to imitate how people read printed piano scores. As we understand it, human reading differs from conventional document analysis, which processes the entire content of an image at once; instead, a person looks at one part of the document at a time and recognizes it in reading order. To read actual printed piano scores in real time, we implement this behavior with the PTZ camera: following a planned order, the camera captures an enlarged partial image of the score and the system recognizes the music symbols in it. The camera then moves to the next position, and the same steps repeat until every part of the score has been read. Because this “recognizing while looking” manner resembles human reading behavior, we call it the humanoid reading technique for printed piano scores.
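The "recognizing while looking" loop described above can be sketched as follows. This is a minimal illustration, not the thesis implementation: the `View` fields, the `read_score` function, and the four callbacks (`move_camera`, `capture`, `recognize`, `play`) are hypothetical names introduced here, and the stand-in callbacks in the demo replace a real PTZ camera and OMR module.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class View:
    """Camera pose for one enlarged partial image (hypothetical fields)."""
    pan_deg: float
    tilt_deg: float
    zoom: float


def read_score(
    views: Iterable[View],
    move_camera: Callable[[View], None],
    capture: Callable[[], object],
    recognize: Callable[[object], List[str]],
    play: Callable[[List[str]], None],
) -> List[str]:
    """Visit each planned view in order: move, capture, recognize, play."""
    recognized: List[str] = []
    for view in views:
        move_camera(view)           # pan/tilt/zoom to the next part of the score
        image = capture()           # grab the enlarged partial image
        symbols = recognize(image)  # OMR on this partial image only
        play(symbols)               # perform before looking at the next part
        recognized.extend(symbols)
    return recognized


if __name__ == "__main__":
    # Stand-in callbacks (assumptions) so the sketch runs without a camera.
    current = {}
    plan = [View(-10.0, 5.0, 8.0), View(0.0, 5.0, 8.0), View(10.0, 5.0, 8.0)]
    result = read_score(
        plan,
        move_camera=lambda v: current.update(view=v),
        capture=lambda: current["view"],
        recognize=lambda img: [f"symbols at pan {img.pan_deg}"],
        play=print,
    )
    print(len(result), "partial images read")
```

Passing the camera and recognizer as callbacks keeps the loop independent of any particular camera SDK, which is why no vendor API appears in the sketch.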
To recognize printed piano scores, we design an optical music recognition (OMR) module that identifies the music symbols within an enlarged partial image. The module is invoked repeatedly during reading, interleaved with the camera movements. In the experiments, we evaluate both the detection and the recognition results on the printed piano scores of 10 songs at a preset image resolution. The OMR module takes only 200 to 300 milliseconds to execute. The system recognizes and plays the printed piano scores in real time, with an average recognition rate above 87%. To verify the system in practice, we test reading and playing actual printed piano scores in three working environments: an ordinary indoor room, an outdoor square, and the stage of a concert hall. The results show that the system maintains the same level of recognition and playing performance in all three and is almost unaffected by external factors.
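One way to reason about the real-time claim is a simple time budget, sketched below under loudly stated assumptions: each partial image covers one measure, the robot plays one measure while the next is being captured and recognized, and the time signature, tempo, and camera movement time are illustrative values that do not come from the thesis. Only the 200 to 300 millisecond OMR time is taken from the abstract.

```python
def measure_duration_s(beats_per_measure: int, tempo_bpm: float) -> float:
    """Time needed to play one measure at the given tempo."""
    return beats_per_measure * 60.0 / tempo_bpm


def keeps_up(omr_ms: float, camera_ms: float,
             beats_per_measure: int, tempo_bpm: float) -> bool:
    """True if moving, capturing, and recognizing the next measure
    fits within the time it takes to play the current one."""
    look_time_s = (omr_ms + camera_ms) / 1000.0
    return look_time_s <= measure_duration_s(beats_per_measure, tempo_bpm)


if __name__ == "__main__":
    # Worst-case OMR time from the abstract; the 500 ms camera time,
    # 4 beats per measure, and 120 bpm are assumptions for illustration.
    print(keeps_up(omr_ms=300, camera_ms=500, beats_per_measure=4, tempo_bpm=120))
```

Under these assumptions a measure lasts 2 seconds, so the OMR time alone leaves most of the budget for camera movement and capture.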

Acknowledgements
Chinese Abstract
Abstract
Contents
List of Figures
List of Tables
Chapter 1 Introduction
  1.1 Overview
  1.2 Background and motivation
  1.3 System description
  1.4 Thesis organization
Chapter 2 Related Works
  2.1 Some works about staff
  2.2 Some works about note recognition
Chapter 3 Humanoid Reading Principle
  3.1 Essential apparatus
  3.2 Detection of printed piano scores
    3.2.1 Structure of printed piano scores
    3.2.2 Image processing methods used for detection
  3.3 Sub-image capturing order planning
    3.3.1 The transform of image’s coordinates to camera’s angles
    3.3.2 Sequential image capture
Chapter 4 Optical Music Recognition
  4.1 Overview of the measure-based OMR module
  4.2 The functions of the measure-based OMR module
    4.2.1 Main component retaining
    4.2.2 Bar-lines detection
    4.2.3 Staff removal
    4.2.4 Clef recognition and removal
    4.2.5 Note segmentation and analysis
  4.3 The output of the measure-based OMR module
Chapter 5 Experimental Results and Discussions
  5.1 The installation of the experimental equipment
  5.2 The procedures of the experiment
  5.3 Detection result and available image rate
  5.4 Optical music recognition rate
Chapter 6 Conclusions and Future Works
  6.1 Conclusions
  6.2 Future works
References

