Graduate student: 蘇冠武
Kuan-Wu Su
Thesis title: 在適應性雲端架構下具主觀性質的音樂資訊擷取之研究
Study on retrieving subjective knowledge from music using adaptive cloud-based structure
Advisor: 呂政修
Jenq-Shiou Leu
Committee members: 鄭瑞光
Ray-Guang Cheng
陳省隆
Hsing-Lung Chen
石維寬
Wei-Kuan Shih
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electronic and Computer Engineering
Year of publication: 2012
Academic year of graduation: 100 (ROC calendar)
Language: English
Number of pages: 83
Keywords (Chinese): 音樂探索, 音樂資訊擷取, 音樂知識, 主觀認知差異, 雲端計算, 適應性系統架構
Keywords (English): Music Information Retrieval, Music Discovery, Music Knowledge, Subjective Perspective, Cloud Computing, Adaptive System
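  • This study examines, from several angles, the development of Music Information Retrieval techniques in recent years. It analyzes in depth how music-related information and the features of the music itself affect the associated knowledge, given that subjective perception differs from person to person. Based on this assumption, it discusses how the rapidly developing concept of cloud computing can be combined with earlier research to build a feasible adaptive cloud-based structure. It further investigates, from a practical standpoint, how this structure can give rise to usable music discovery services. Finally, simulation experiments are used to verify the structure and its related assumptions, and to show how they can lead toward future music retrieval services.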


    This study discusses many aspects of Music Information Retrieval (MIR) techniques and their recent developments. Drawing on extensive research, it analyzes the assumption that knowledge of music, and of its related content and context, is especially susceptible to the subjective perspectives of the human mind. Based on this concept, and with the help of previous work and recent developments in cloud computing, it explores the possibility of constructing an adaptive cloud-based structure, in order to better understand the implications of implementing integrated, practical music discovery services. A proof of concept based on simulation offers a glimpse of what the future of MIR might be.

    Table of Contents
    Abstract (in Chinese)
    Abstract
    Acknowledgments
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1: Introduction
    Chapter 2: Music Perception and Subjective Knowledge
        2.1 Music Associated Features
        2.2 Music Seeking Behavior
        2.3 Subjective Knowledge and Music Similarity
    Chapter 3: Music Discovery Process
        3.1 Low-level Features Identification
        3.2 Mid-level Features Extraction
        3.3 Music Matching
        3.4 Music Identification
    Chapter 4: Adaptive Cloud-based Structure
        4.1 Simple Engine Structure
        4.2 Cloud-based Structure
        4.3 Adaptive Cloud-based Structure
    Chapter 5: Evaluation and Discussion
        5.1 Baseline and Operation Time
            5.1.1 Measurement Metrics
            5.1.2 Subjective Perspective Simulation
        5.2 Cloud-based Structure with Profiling Simulation
        5.3 Adaptive Cloud-based Structure Simulation
    Chapter 6: Conclusion and Future Work
    References

