
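Author: 陳俊佑 (Chun-Yu Chen)
Thesis title: 基於維基百科鏈結分析為主之專家搜尋系統
EFS: Expert Finding System Based on Wikipedia Link Analysis
Advisors: 何建明 (Jan-Ming Ho), 李漢銘 (Hahn-Ming Lee)
Oral defense committee: 蔡明祺 (Mi-Ching Tsai), 鮑興國 (Hsing-Kuo Kenneth Pao), 莊庭瑞 (Tyng-Ruey Chuang)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Year of publication: 2008
Academic year of graduation: 96 (ROC calendar)
Language: English
Number of pages: 54
Chinese keywords: 專家搜尋, 自動片語辨識, 維基百科 (expert finding, automatic term recognition, Wikipedia)
English keywords: Expert finding, ATR, Wikipedia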
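  • An expert finding system takes a user's question and returns a list of experts who can address it. Traditional approaches either catalog each expert's expertise by hand, leaving users to browse the database's categories for experts suited to their problem, or rate each expert's relevance to the question by how often the expert appears in documents on related topics. These approaches consume considerable manpower and cannot fully express how familiar each expert is with each topic.
    The proposed system builds each expert's expertise profile from that expert's publications and uses the category link relationships in Wikipedia to compute how close each expert is to the field of a proposal. For evaluation, the thesis uses as its dataset the 668 project proposals received in 2007 by Information Science Division II (資訊學門二) of the National Science Council, together with 853 candidate experts. The experimental results show that the system can assign the proposals in the dataset's eight categories to experts with matching expertise, and that the expert ranking also performs well on the MRR3 metric.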
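As a rough illustration of scoring experts by their closeness to a proposal in a category graph, the sketch below uses a tiny hand-made graph and takes relatedness to be 1 / (1 + shortest-path distance) between categories. The graph, the expert profiles, the function names, and the scoring formula are all hypothetical assumptions for illustration; the abstract does not state the measure the thesis actually uses.

```python
from collections import deque

# Hypothetical undirected category graph (toy data, not real Wikipedia links).
CATEGORY_LINKS = {
    "Computer science": ["Artificial intelligence", "Databases"],
    "Artificial intelligence": ["Computer science", "Machine learning"],
    "Machine learning": ["Artificial intelligence"],
    "Databases": ["Computer science"],
}


def category_distance(start, goal, links):
    """Breadth-first search; returns the hop count, or None if unreachable."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for nxt in links.get(node, []):
            if nxt == goal:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def relatedness(proposal_cats, expert_cats, links):
    """Assumed score: for each proposal category, take the best
    1 / (1 + distance) to any expert category, then average."""
    if not proposal_cats or not expert_cats:
        return 0.0
    total = 0.0
    for p in proposal_cats:
        best = 0.0
        for e in expert_cats:
            d = category_distance(p, e, links)
            if d is not None:
                best = max(best, 1.0 / (1.0 + d))
        total += best
    return total / len(proposal_cats)


def rank_experts(proposal_cats, profiles, links):
    """Sort experts by descending relatedness, breaking ties by name."""
    scored = [(relatedness(proposal_cats, cats, links), name)
              for name, cats in profiles.items()]
    return [name for _, name in sorted(scored, key=lambda t: (-t[0], t[1]))]


# Hypothetical expert profiles: categories derived from each expert's publications.
profiles = {
    "expert_a": ["Machine learning"],
    "expert_b": ["Databases"],
}
ranking = rank_experts(["Artificial intelligence"], profiles, CATEGORY_LINKS)
```

Here `expert_a` sits one hop from the proposal's category and `expert_b` two hops, so `expert_a` is ranked first. A real system would need a mapping from extracted terms to Wikipedia categories, which this sketch leaves out.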


    An expert finding system returns a list of experts in response to a user's query. Traditionally, this is done by looking up expert-expertise databases and ranking the experts by how often their names appear in related documents. However, this makes the process time-consuming and creates database maintenance problems.
    This thesis focuses on an expert search scenario in a science organization. A science organization receives hundreds of proposals per year. Traditionally, it distributes these proposals to reviewers manually, and the manpower this consumes is a critical problem. To address these problems, this thesis proposes an expert finding system that builds the expertise-expert profile automatically and ranks the experts by the relatedness between proposals and experts.
    The proposed expert finding system uses the publications of candidate experts to build the expertise-expert profile. The link structure of Wikipedia is also adopted to calculate the relatedness between proposals and experts. For the system evaluation, this thesis uses a 2007 dataset from the NSC (http://www.nsc.com.tw) that contains 668 proposals and 853 candidate experts. The experimental results show that the system can distribute the proposals to experts with the same expertise domain. The MRR experiments also show that the expert ranking achieves good performance.
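For readers unfamiliar with the MRR (mean reciprocal rank) measure named in the abstract, the following minimal sketch computes it under its standard definition: for each query, take the reciprocal of the rank of the first correct result, then average over all queries. The proposal and expert identifiers are made-up placeholders, not data from the thesis, and a query with no correct expert in its list is assumed to contribute zero.

```python
def mean_reciprocal_rank(rankings, relevant):
    """rankings: query -> ordered list of candidates.
    relevant: query -> set of correct candidates.
    Queries with no correct candidate in the list contribute 0."""
    if not rankings:
        return 0.0
    total = 0.0
    for query, ranked in rankings.items():
        correct = relevant.get(query, set())
        for position, candidate in enumerate(ranked, start=1):
            if candidate in correct:
                total += 1.0 / position
                break
    return total / len(rankings)


# Placeholder data: two proposals, each with a ranked list of experts.
rankings = {
    "proposal_1": ["expert_a", "expert_b", "expert_c"],
    "proposal_2": ["expert_b", "expert_c", "expert_a"],
}
relevant = {
    "proposal_1": {"expert_a"},  # first correct expert at rank 1
    "proposal_2": {"expert_c"},  # first correct expert at rank 2
}
mrr = mean_reciprocal_rank(rankings, relevant)  # (1/1 + 1/2) / 2 = 0.75
```

An MRR of 1.0 would mean a correct expert is ranked first for every proposal; values closer to 0 mean correct experts appear far down the list or not at all.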

    Chapter 1 Introduction
      1.1 Motivation
      1.2 The Challenges of Expert Finding Problem
      1.3 Goals
      1.4 Outlines of the Thesis
    Chapter 2 Background
      2.1 Background of Expert Finding System
      2.2 Background of Automatic Term Recognition
        2.2.1 Filtering Stage
        2.2.2 Statistic Stage
      2.3 Wikipedia
    Chapter 3 Expert Finding System
      3.1 Concept of Expert Finding System (EFS)
      3.2 System Architecture
        3.2.1 Expert Profile Building Stage
        3.2.2 Querying Stage
        3.2.3 Term Extraction
        3.2.4 Wikipedia Mapping
        3.2.5 Expertise Indexing
        3.2.6 Expert Searching & Ranking
        3.2.7 Summary of Proposed Expert Finding System
      3.3 Characteristics of Proposed Approach
    Chapter 4 Experiments
      4.1 Dataset
      4.2 Experimental Methodology
      4.3 Experimental Results
        4.3.1 The Performance of Proposals Distribution Task
        4.3.2 The Effectives of Term Extraction
    Chapter 5 Conclusion and Further Work
      5.1 Discussion
      5.2 Conclusion
      5.3 Further Work
    References
    Vita

