簡易檢索 / 詳目顯示

研究生: 蔡宛砡
Wan-Yu Tsai
論文名稱: 基於單一四分樹之最近k點空間關鍵字查詢演算法
An Efficient Algorithm for Top-k Spatial Keyword Query based on Single-Quadtree Traversal
指導教授: 邱舉明
Ge-Ming Chiu
口試委員: 吳秀陽
Shiow-Yang Wu
項天瑞
Tien-Ruey Hsiang
金台齡
Tai-Lin Chin
學位類別: 碩士
Master
系所名稱: 電資學院 - 資訊工程系
Department of Computer Science and Information Engineering
論文出版年: 2016
畢業學年度: 104
語文別: 中文
論文頁數: 50
中文關鍵詞: 空間關鍵字查詢最近k點空間關鍵字查詢
外文關鍵詞: Spatial keyword query, Top-k spatial keyword query
相關次數: 點閱:210下載:0
分享至:
查詢本校圖書館目錄 查詢臺灣博碩士論文知識加值系統 勘誤回報
  • 隨著行動網路、社群網路及智慧型行動裝置的普及,以服務為導向之應用不斷地推陳出新,其中,資料查詢之相關應用更是應用服務裡面重要的一環。手機結合適地性服務(Location Based Service, LBS)的應用,目前在全球已蔚為風潮,但是只有提供位置的資訊是不夠的,隨著LBS逐漸成熟,藉由LBS所提供的地理資訊以及附加於地理位置的相關資訊來達到虛實整合,創造行動服務與行動應用的趨勢,正是LBS最有潛力的發展方向,也因此在近年來,空間關鍵字查詢議題被廣泛地研究與探討。本論文主要探討最近k點空間關鍵字查詢(Top-k Spatial Keyword Query),簡稱TkSKQ,其挑戰在於如何同時考慮空間與文本資訊,建立有效率的索引結構並快速地找到具有查詢者關鍵字且距離最近的k個物件。
    在本論文中,為了有效地解決TkSKQ問題並改進目前研究方法的缺點,我們提出一種針對各個關鍵字建立其空間樹狀結構,並且只要走訪一個樹狀結構即可快速地得到查詢結果的方法,此外,我們也針對我們所提出的研究方法以實際資料集與合成資料集進行效能評估。


    With the popularity of mobile network, social network and smart mobile devices, the service-oriented applications innovate constantly. Among of them, the applications which are associated with data query play a significant role in the application service. The applications of cell phone which combine with location based service are so popular. However, queries involving only conditions on object’s geometric properties are not enough. As the location based service is getting mature, the novel forms of spatial queries that are integrated with keyword search are the most potential development in the future. Consequently, the study of spatial keyword search which explores both location and textual description of the objects has attracted great attention from the commercial organizations and research communities. In this paper, we study the problem of top-k spatial keyword query (TkSKQ). The main challenge is how to effectively combine the spatial index and textual index such that each of the k closest objects contains all keywords in the query can be rapidly retrieved.
    In this paper, to answer TkSKQ more efficiently and improve the performance of the existing approaches, we propose an efficient method that can retrieve the k nearest neighbors satisfying the keyword constraint in real time. For each keyword t_i, we build a linear quadtree for the objects which contain the keyword t_i.We only need to traverse one linear quadtree to obtain the query result. Besides, we also employ both real data and synthetic data in the experiments to show that our method outperforms the IL-Quadtree significantly.

    摘要 I Abstract II 目錄 III 圖目錄 V 表目錄 VII 第一章 緒論 1 1.1 背景 1 1.2 論文目標 3 1.3 論文架構 4 第二章 相關研究 5 2.1 Top-k kNN Query 7 2.2 Boolean Range Query 8 2.3 Boolean kNN Query 9 2.4 Inverted R-tree 10 2.5 IR2-tree 12 2.6 IL-Quadtree 15 第三章 最近k點空間關鍵字查詢 19 3.1 問題描述 19 3.2 以物件為基礎之資料結構 20 3.2.1 混合索引結構 21 3.2.2 關鍵字序號的訂定 25 3.3 最近k點空間關鍵字查詢演算法 26 3.4 時間複雜度分析 31 第四章 實驗與模擬結果 33 4.1 環境設定與模擬參數 33 4.2 模擬結果 35 4.2.1 不同資料集 36 4.2.2 查詢的關鍵字個數 36 4.2.3 最近鄰居個數 39 4.2.4 每個物件的關鍵字個數 41 4.2.5 關鍵字頻率分布的偏斜程度 43 4.3 模擬結果分析 44 第五章 結論與未來研究方向 45 重要參考文獻 47

    [1] N. Beckmann, H. Kriegel, R. Schneider, and B. Seeger, “The R*-tree: An Efficient and Robust Access Method for Points and Rectangles,” in Proc. ACM Int’l Conf. on Management of data(SIGMOD), 1990, pp. 322-331.
    [2] A. Guttman, “R-trees: a dynamic index structure for spatial searching,” in Proc. ACM Int’l Conf. on Management of data(SIGMOD),1984, pp. 47-57.
    [3] G. R. Hjaltason and H. Samet, “Distance browsing in spatial databases,” ACM Transactions on Database Systems(TODS), vol. 24, no. 2, pp. 265–318, 1999.
    [4] N. Roussopoulos, S. Kelley, and F. Vincent, “Nearest neighbor queries,” in Proc. ACM Int’l Conf. on Management of data(SIGMOD), 1995, pp. 71-79.
    [5] G. Cong, L. Wang, C. Y. Lin, Y. I. Song, and Y. Sun, “Finding question-answer pairs from online forums,” in Proc. ACM Int’l Conf. on Research and development in information retrieval(SIGIR), 2008, pp. 467-474.
    [6] J. M. Ponte and W. B. Croft, “A language modeling approach to information retrieval,” in Proc. ACM Int’l Conf. on Research and development in information retrieval(SIGIR), 1998, pp. 275-281.
    [7] G. Salton and M. McGill, “Introduction to Modern Information Retrieval,” McGraw-Hill, 1983.
    [8] Y. Zhou, X. Xie, C. Wang, Y. Gong, and W. Y. Ma, “Hybrid index structures for location-based web search,” in Proc. ACM Int’l Conf. on Information and Knowledge Management(CIKM), 2005, pp.155-162.
    [9] I. D. Felipe, V. Hristidis, and N. Rishe, “Keyword search on spatial databases,” in Proc. IEEE Int’l Conf. on Data Engineering(ICDE), 2008, pp.656-665.
    [10] G. Cong, C. S. Jensen, and D. Wu, “Efficient retrieval of the top-k most relevant spatial web objects,” ACM VLDB Endowment, vol. 2, no. 1, pp.337-348, 2009.
    [11] R. Gobel, A. Henrich, R. Niemann, and D. Blank, “A hybrid index structure for geo-textual searches,” in Proc. ACM Int’l Conf. on Information and Knowledge Management(CIKM), 2009, pp. 1625–1628.
    [12] R. Hariharan, B. Hore, C. Li, and S. Mehrotra, “Processing spatial-keyword (sk) queries in geographic information retrieval (gir)systems,” in Proc. Int’l Conf. on Scientific and Statistical Database Management(SSDBM) , 2007, pp.16.
    [13] Z. Li, K. C. K. Lee, B. Zheng, W. C. Lee, D. L. Lee, and X. Wang, “Ir-tree: An efficient index for geographic document search,” IEEE Transactions on Knowledge and Data Engineering, vol. 23, no. 4, pp. 585-599, 2011.
    [14] J. B. Rocha-Junior, O. Gkorgkas, S. Jonassen, and K. Nrvåg, “Efficient processing of top-k spatial keyword queries,” in Proc. Int’l Conf. on Advances in Spatial and Temporal Databases(SSTD), 2011, pp. 205-222.
    [15] A. Cary, O. Wolfson, and N. Rishe, “Efficient and scalable method for processing top-k spatial boolean queries,” in Proc. Int’l Conf. on Scientific and Statistical Database Management(SSDBM),2010, pp. 87-95.
    [16] D. Wu, M. L. Yiu, G. Cong, and C. S. Jensen, “Joint top-k spatial keyword query processing,” IEEE Transactions on Knowledge and Data Engineering, vol. 24, no. 10, pp. 1889-1903, 2012.
    [17] D. Wu, G. Cong, and C. S. Jensen, “A framework for efficient spatial web object retrieval,” The VLDB Journal, vol. 21, no. 6, pp. 797-822, 2012.
    [18] S. Vaid, C. B. Jones, H. Joho, and M. Sanderson, “Spatio-textual indexing for geographical search on the web,” in Proc. Int’l Conf. on Advances in Spatial and Temporal Databases(SSTD), 2005, pp. 218-235.
    [19] A. Khodaei, C. Shahabi, and C. Li, “Hybrid indexing and seamless ranking of spatial and textual features of web documents,” in Proc. Int’l Conf. on Database and Expert Systems Applications, 2010,pp. 450-466.
    [20] Y. Y. Chen, T. Suel, and A. Markowetz, “Efficient query processing in geographic web search engines,” in Proc. ACM Int’l Conf. on Management of data(SIGMOD) , 2006, pp. 277-288.
    [21] M. Christoforaki, J. He, C. Dimopoulos, A. Markowetz, and T. Suel, “Text vs. space: efficient geo-search query processing,” in Proc. ACM Int’l Conf. on Information and Knowledge Management(CIKM), 2011,pp. 423-432.
    [22] D. Papadias, P. Kalnis, J. Zhang, and Y. Tao, “Efficient OLAP operations in spatial data warehouses,” in Proc. Int’l Conf. on Advances in Spatial and Temporal Databases(SSTD), 2001, pp. 443-459.
    [23] H. Yan, S. Ding, and T. Suel, “Inverted index compression and query processing with optimized document ordering,” in Proc. Int’l Conf. on World wide web(WWW), 2009, pp. 401-410.
    [24] C. Faloutsos and S. Christodoulakis, “Signature Files: An Access Method for Documents and Its Analytical Performance Evaluation,” ACM Transactions on Information Systems(TOIS), vol. 2, no. 4, pp. 267-288,1984.
    [25] S. Stiassny, “Mathematical Analysis of Various Superimposed Coding Methods,” Am. Doc., vol. 11, no. 2, pp. 155-169, 1960.
    [26] C. Zhang, Y. Zhang, W. Zhang, and X. Lin, “Inverted linear quadtree: Efficient top k spatial keyword search,” in Proc. IEEE Int’l Conf. on Data Engineering(ICDE), 2013, pp. 901-912.
    [27] D. Comer, “The ubiquitous B-tree,” ACM Computing Surveys, vol. 11, no. 2, pp.121-137, 1979.
    [28] I. Gargantini, “An effective way to represent quadtrees,” Commun. ACM, vol. 25, no. 12, pp. 905-910, 1982.
    [29] G. M. Morton, “A computer oriented geodetic data base and a new technique in file sequencing,” Technical Report, Ottawa, Canada: IBM Ltd, 1966.
    [30] J. A. Orenstein and F. A. Manola, “PROBE spatial data modeling and query processing in an image database application,” IEEE Transactions on Software Engineering, vol. 14, no.5, pp. 611-629, 1988.
    [31] C. Faloutsos and S. Roseman, “Fractals for secondary key retrieval,” in Proc. ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems (PODS), 1989, pp. 247-252.
    [32] H.V. Jagadish, “Linear clustering of objects with multiple attributes,” in Proc. ACM Int’l Conf. on Management of data(SIGMOD), 1990, pp. 332-342.
    [33] C. Faloutsos, “Gray codes for partial match and range queries,” IEEE Transactions on Software Engineering, vol. 14, no. 10, pp. 1381-1393, 1988.
    [34] C. Faloutsos, “Multiattribute hashing using gray codes,” in Proc. ACM Int’l Conf. on Management of data(SIGMOD), 1990, pp. 227-238.
    [35] http://geonames.usgs.gov
    [36] L. Breslau, P. Cao, L. Fan, G. Phillips, and S. Shenker, “Web Caching and Zipf-Like Distribution: Evidence and Implication,” in Proc. IEEE Int’l Conf. on Computer Communications(INFOCOM), 1999, pp. 126-134.
    [37] G. Zipf, Human Behavior and the Principle of Least Effort. Addison Wesley, 1949.
    [38] S. Sagiroglu and D. Sinanc, “Big data: a review,” in Proc. Int’l Conf. on Collaboration Technologies and Systems (CTS), 2013, pp. 42-47.
    [39] H. Hu, Y. Wen, T. S. Chua and X. Li, “Toward Scalable Systems for Big Data Analytics: A Technology Tutorial,” IEEE Access, vol. 2, pp. 652 -687, 2014.
    [40] M. Chen, S. Mao and Y. Liu, “Big Data: A Survey,” Mobile Networks and Applications, vol. 19, no. 2, pp. 171-209, 2014.
    [41] I. A. T. Hashem, I. Yaqoob, N. B. Anuar, S. Mokhtar, A. Gani and S. U. Khan, “The rise of big data on cloud computing: Review and open research issues,” Information Systems, vol. 47, pp. 98-115, 2015.
    [42] D. E. Baz, “IoT and the Need for High Performance Computing,” in Proc. Int’l Conf. on Identification, Information and Knowledge in the Internet of Things (IIKI), 2014, pp.1-6.
    [43] M. Nitti, L. Atzori, and I. P. Cvijik, “Friendship Selection in the Social Internet of Things: Challenges and Possible Strategies,” IEEE Internet of Things Journal, vol. 2, no. 3, 2015.
    [44] John A. Stankovic, “Research Directions for the Internet of Things,” IEEE Internet of Things Journal, vol. 1, no. 1, 2014.
    [45] A. Zanella, N. Bui, A. Castellani, L. Vangelista, and M. Zorzi, “Internet of Things for Smart Cities,” IEEE Internet of Things Journal, vol. 1, no. 1, 2014.
    [46] Beacon, https://en.wikipedia.org/wiki/Beacon
    [47] iBeacon, https://zh.wikipedia.org/wiki/IBeacon

    QR CODE