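Graduate student: 金冠辰
Kuan-Chen Chin
Thesis title: 使用平滑支撐向量機達到匯總式搜尋引擎個人化排序之目的
Personalized Ranking for Meta-Search Engine by Using SSVM
Advisor: 李育杰
Yuh-Jye Lee
Oral examination committee: 張源俊
Yuan-Chin Chang
Hahn-Ming Lee
Cheng-Seen Ho
Chih-Jen Lin
Degree: Master's
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Year of publication: 2005
Academic year of graduation: 93 (ROC calendar)
Language: English
Number of pages: 67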
Chinese keywords: 個人化排序, 網路資訊檢索, 匯總式搜尋引擎, 文件分類, 平滑支撐向量機
English keywords: personalized ranking, web information retrieval, meta-search engine, text classification, smooth support vector machines
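With the rapid development of the World Wide Web, the amount of information available online has grown enormously. How a search engine should rank the returned web pages according to the purpose of a user's search has become an important research topic. Current search engines are built to serve all users, so search results are usually ranked by the interests of the general public. In addition, current search engines return the same results to every user, so the results often fail to satisfy all users. In this thesis, we build a personalized meta-search engine that lets users rank search results according to their personal interests. Assuming that users store articles in specific folders according to their own preferences, we use text classification techniques and the smooth support vector machine to extract a model of the user's interests (a user profile) from the documents in those folders.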

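Because a meta-search engine covers more information than an ordinary search engine, our system treats the meta-search engine as a data collector dedicated to gathering web pages. For the retrieved pages, we propose two non-personalized meta-ranking methods for presenting the search results to the user. For personalized ranking, we re-rank the retrieved pages according to their similarity to the user profile and a search engine preference mechanism. In our experiments, we simulate 6 scenarios to test the personalized ranking performance of our system. The experimental results show that we have successfully established a method that ranks retrieved web pages by the user's personal interests, so that users can find the information they need among the first few returned pages.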

With the fast growth of the World Wide Web, the amount of information on the Web has become overwhelming. Because current search engines are built to serve all users, search results are usually ranked according to the interests of the general public. Furthermore, current search engines return the same results to every user, regardless of the user's field. Hence, some users have to browse the search results laboriously to find the desired web pages. In this thesis, we build a personalized meta-search engine (PMSE) that allows users to rank web pages according to their personal interests. A user's interests are represented in a user profile, which is learned from the documents stored on the user's personal computer. Owing to the superior performance of support vector machines (SVMs), the smooth support vector machine (SSVM) and several other text-classification techniques are applied to obtain user profiles.

We take advantage of a meta-search engine's better coverage of the Web to collect a wide variety of web pages. The meta-search engine works as a data collector, and two basic meta-ranking algorithms are proposed in our system. For personalized ranking, we re-rank the collected web pages by consulting the user profile and the search engine preference (SEP) mechanism. To evaluate the performance of our system, we simulate 6 scenarios in which conventional search engines could not provide satisfactory search results for the users. Our experimental results indicate that we successfully provide a way for users to rank web pages according to their interests, and also show that personalized ranking is worth pursuing further.
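The personalized re-ranking step described above can be illustrated with a minimal sketch. The abstract does not give the scoring formula, so everything here is an assumption made for illustration: the profile is a plain term-frequency vector rather than an SSVM-learned model, the score is a simple weighted sum of cosine similarity and a per-engine preference value, and the names `personalized_rank`, `engine_pref`, and `alpha` are hypothetical.

```python
import math
from collections import Counter


def tf_vector(text):
    """Build a lower-cased bag-of-words term-frequency vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def personalized_rank(pages, profile, engine_pref, alpha=0.7):
    """Re-rank pages given as (url, snippet, engine) tuples.

    Assumed score (not the thesis's formula):
        alpha * similarity(snippet, profile) + (1 - alpha) * engine preference
    """
    scored = []
    for url, snippet, engine in pages:
        sim = cosine(tf_vector(snippet), profile)
        pref = engine_pref.get(engine, 0.0)  # unknown engines get no bonus
        scored.append((alpha * sim + (1 - alpha) * pref, url))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [url for _, url in scored]


# Hypothetical user profile, engine preferences, and collected pages.
profile = tf_vector("support vector machine kernel classification learning")
engine_pref = {"engine_a": 0.6, "engine_b": 0.4}
pages = [
    ("http://example.com/coffee", "java coffee beans and roasting", "engine_a"),
    ("http://example.com/svm", "support vector machine classification tutorial", "engine_b"),
]
ranking = personalized_rank(pages, profile, engine_pref)
```

In this toy input the page about support vector machines moves ahead of the coffee page even though it comes from the less preferred engine, because its similarity to the profile outweighs the engine preference term at `alpha=0.7`.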

1. Introduction
1.1 Web Information Retrieval
1.2 Conventional Search Services
1.3 Ranking Problem
1.4 Organization of Thesis
2. Text Classification
2.1 Text Representation
2.2 Feature Selection
2.3 Term Weighting
2.4 Support Vector Machines
2.5 Multi-Class Classification
2.6 Performance Measures
3. Framework of Personalized Meta-Search Engine
3.1 Meta-Search Engine
3.2 Personalized Filter
3.3 Off-Line Training
4. Experiments
4.1 Experimental Setup
4.2 Numerical Results
5. Conclusion

