Graduate student: | 何紹威 Shou-Wei Ho |
---|---|
Thesis title: | 基於概念向量萃取維基百科分類網路建構模糊領域本體論 Mining Fuzzy Domain Ontology Based on Concept Vector from Wikipedia Category Network |
Advisor: | 李漢銘 Hahn-Ming Lee |
Oral defense committee: | 何建明 Jan-Ming Ho, 李育杰 Yuh-Jye Lee, 王榮英 Jung-Ying Wang, 陳志銘 Chih-Ming Chen |
Degree: | 碩士 Master |
Department: | College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering |
Year of publication: | 2011 |
Academic year of graduation: | 99 |
Language: | English |
Number of pages: | 50 |
Chinese keywords: | 本體論, 維基百科探勘, 概念向量, 領域分類 |
English keywords: | Ontology, Wikipedia Mining, Concept Vector, Domain Classification |
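An ontology is a standardized language for expressing domain knowledge, and it can be used in systems that require human-computer communication (for example, expert recommendation systems and document classification). A domain ontology expresses the particular meanings that key terms take on in different domains. Many researchers use fuzzy domain ontologies to measure the similarity between concepts. However, building a domain ontology is labor-intensive and time-consuming. According to recent research, Wikipedia can be used to build and update ontologies, because it provides up-to-date information through the wisdom of crowds. In this thesis, we propose a method that constructs a fuzzy domain ontology by mining the Wikipedia Category Network based on concept vectors, and we use concept vectors to establish the fuzzy relations between key terms and concepts. The knowledge of a domain is composed of several concepts, where each concept is represented by a specific Wikipedia category. The fuzzy relations are then used to measure the semantic relatedness among key terms, concepts, and domains. The constructed fuzzy domain ontology comprises the concept sets, the domain sets, and the fuzzy relations among them. Our method can: (1) build an up-to-date fuzzy domain ontology from Wikipedia; (2) conceptualize a domain as several concepts made up of Wikipedia categories; and (3) establish fuzzy relations between key terms and domains using the fuzzy domain ontology. Experimental results on the Text REtrieval Conference (TREC) document collection show that the fuzzy domain ontology helps improve document retrieval.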
Ontologies are essential for formalizing domain knowledge to support effective human-computer interaction (e.g., recommendation systems and document classification). A domain ontology, in particular, represents the specific meanings that terms take on in a given domain. Many researchers have proposed approaches that measure the similarity between concepts by consulting a fuzzy domain ontology. However, constructing domain ontologies is labor-intensive and tedious. Recently, Wikipedia mining has facilitated ontology construction, because Wikipedia provides up-to-date concept information organized in a socially annotated category structure. In this thesis, we propose an approach that mines domain concepts from the Wikipedia Category Network and generates fuzzy relations, based on a concept vector extraction method, to measure the relatedness between a single term and a concept.
Domain knowledge is composed of several concepts, each represented by a specific Wikipedia category. The fuzzy relations are then used to measure relatedness scores among key terms, concepts, and domains. The constructed fuzzy domain ontology comprises the concept sets of the domains and the fuzzy relations between terms and domains. Our methodology conceptualizes domain knowledge by mining the Wikipedia Category Network, and ontology-based systems can be built on the resulting fuzzy domain ontology. An empirical experiment evaluates its robustness on a textual dataset from the Text REtrieval Conference (TREC). The experimental results show that the proposed approach yields a robust fuzzy domain ontology that improves performance on information retrieval tasks.
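The structure described above (domains made of concepts, with fuzzy relations linking terms to concepts and domains) can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the class `FuzzyDomainOntology`, the helper `normalize_concept_vector`, the peak normalization of raw weights, and the max aggregation from term-concept degrees to a term-domain degree are all assumptions chosen for illustration, since the abstract does not specify these details. The sample categories and weights are made up.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple


def normalize_concept_vector(raw: Dict[str, float]) -> Dict[str, float]:
    """Scale raw term-to-concept weights into [0, 1] by dividing by the peak.

    Assumption: peak normalization is one simple way to obtain membership
    degrees; the thesis may use a different scheme.
    """
    peak = max(raw.values(), default=0.0)
    if peak <= 0.0:
        return {concept: 0.0 for concept in raw}
    return {concept: weight / peak for concept, weight in raw.items()}


@dataclass
class FuzzyDomainOntology:
    # domain name -> concepts (each concept stands for one Wikipedia category)
    domain_concepts: Dict[str, Set[str]] = field(default_factory=dict)
    # (term, concept) -> membership degree in [0, 1]
    term_concept: Dict[Tuple[str, str], float] = field(default_factory=dict)

    def add_concept(self, domain: str, concept: str) -> None:
        self.domain_concepts.setdefault(domain, set()).add(concept)

    def add_relation(self, term: str, concept: str, degree: float) -> None:
        if not 0.0 <= degree <= 1.0:
            raise ValueError("membership degree must lie in [0, 1]")
        self.term_concept[(term, concept)] = degree

    def term_domain(self, term: str, domain: str) -> float:
        """Relatedness of a term to a domain.

        Assumption: take the strongest term-concept degree among the
        domain's concepts (max aggregation); unknown pairs count as 0.
        """
        concepts = self.domain_concepts.get(domain, set())
        return max(
            (self.term_concept.get((term, c), 0.0) for c in concepts),
            default=0.0,
        )


# Illustrative data only: the categories and weights are invented.
ontology = FuzzyDomainOntology()
ontology.add_concept("computing", "Category:Machine learning")
ontology.add_concept("computing", "Category:Databases")
ontology.add_concept("biology", "Category:Genetics")

vector = normalize_concept_vector(
    {"Category:Machine learning": 4.0, "Category:Genetics": 1.0}
)
for concept, degree in vector.items():
    ontology.add_relation("classifier", concept, degree)

print(ontology.term_domain("classifier", "computing"))
print(ontology.term_domain("classifier", "biology"))
```

Max aggregation keeps a term strongly tied to a domain as soon as one of the domain's concepts matches it well. An average or a weighted sum would be an equally plausible choice if breadth of coverage across concepts should matter.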