
Graduate student: 楊可東 (Ke-Dong Yang)
Thesis title: 利用LightGCN實現電子商務平台上的快速更新個人化搜尋模型
Implementing a Rapidly Updatable Personalized Search Model on E-Commerce Platforms Using LightGCN
Advisor: 鍾聖倫 (Sheng-Luen Chung)
Oral defense committee: 鍾聖倫 (Sheng-Luen Chung), 蘇順豐 (Shun-Feng Su), 邱裕明 (Yu-Ming Qiu), 沈哲州 (Che-Chou Shen), 黃騰毅 (Teng-Yi Huang)
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Electrical Engineering
Year of publication: 2023
Academic year of graduation: 112 (ROC calendar)
Language: Chinese
Number of pages: 213
Keywords (Chinese): 個人化搜尋, LightGCN模型, 電子商務推薦系統, 模型更新, 數據集建立
Keywords (English): personalized search, LightGCN model, e-commerce recommendation systems, model updating, dataset construction

    This thesis addresses core challenges of personalized search and recommendation in e-commerce and proposes solutions aimed at making the user experience more efficient and more accurate. Its main contributions are:
    1. Dataset construction and optimization: Starting from real clickstream data from a major domestic e-commerce platform, we systematically filtered and cleaned the records to build a dataset that reflects user-product interactions in detail. The dataset serves as a foundation for training and evaluating recommendation and personalized search systems. We also selected a subset of 40,000 users and 200,000 products to form a smaller yet representative dataset, while keeping the dataset extensible in the future.
    2. Personalized search model: Using LightGCN, a graph collaborative filtering model, we learn to predict user preferences from user-product interactions. A weighted re-ranking strategy then reorders the search results for each user, so that the ranking satisfies the search intent while also reflecting individual preferences. Compared with traditional methods, this improves search accuracy at rank 1 (@1) by up to 40%.
    3. Rapid model updating mechanism: E-commerce data changes quickly and updates need to be cheap, so we propose a model updating strategy that updates the graph structure and adjusts the normalized adjacency matrix to quickly generate embedding vectors reflecting the latest interactions. The model is updated without retraining, which significantly shortens update time and reduces cost.
    In summary, this thesis provides technical solutions for personalized search and recommendation, and it also offers practical guidance on data processing, model training, and model updating in the e-commerce domain, improving users' personalized experience and satisfaction.
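
The ideas behind contributions 2 and 3 can be illustrated with a small NumPy sketch. It is not the thesis implementation, and it rests on several assumptions: the trained layer-0 embeddings stay fixed during an update, the set of users and products does not change, dense matrices are used for readability (a real system would likely use sparse ones), and the re-ranking is a simple linear blend. The function names and the weight `alpha` are hypothetical.

```python
import numpy as np

def normalized_adjacency(interactions, n_users, n_items):
    """Symmetric normalized adjacency D^-1/2 A D^-1/2 of the user-item bipartite graph."""
    n = n_users + n_items
    A = np.zeros((n, n))
    for u, i in interactions:
        A[u, n_users + i] = 1.0  # user -> item edge
        A[n_users + i, u] = 1.0  # item -> user edge
    deg = A.sum(axis=1)
    d_inv_sqrt = np.zeros_like(deg)
    nz = deg > 0
    d_inv_sqrt[nz] = deg[nz] ** -0.5  # isolated nodes keep a zero row
    return d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]

def propagate(A_hat, E0, n_layers=3):
    """LightGCN-style propagation: E^(k+1) = A_hat @ E^(k), then average all layers."""
    layers = [E0]
    for _ in range(n_layers):
        layers.append(A_hat @ layers[-1])
    return np.mean(layers, axis=0)

def rerank(candidates, search_scores, user_vec, item_vecs, alpha=0.5):
    """Assumed weighted re-ranking: blend search score with preference score."""
    pref = item_vecs[candidates] @ user_vec  # inner-product preference
    final = alpha * np.asarray(search_scores) + (1 - alpha) * pref
    order = np.argsort(-final)  # highest blended score first
    return [candidates[j] for j in order]

# Toy data: 3 users, 4 products; E0 stands in for trained layer-0 embeddings.
rng = np.random.default_rng(0)
n_users, n_items, dim = 3, 4, 8
E0 = rng.normal(scale=0.1, size=(n_users + n_items, dim))

old = [(0, 0), (0, 1), (1, 1), (2, 3)]
new = old + [(1, 2), (2, 2)]  # newly observed interactions

# Update without retraining: rebuild the normalized adjacency, propagate again.
E_old = propagate(normalized_adjacency(old, n_users, n_items), E0)
E_new = propagate(normalized_adjacency(new, n_users, n_items), E0)

# Re-rank four search candidates for user 1 with the refreshed embeddings.
ranked = rerank([0, 1, 2, 3], [0.9, 0.8, 0.7, 0.6],
                E_new[1], E_new[n_users:], alpha=0.7)
print(ranked)
```

In this sketch no gradient step is taken during the update; only the graph normalization and the propagation are recomputed, which is why it is cheap compared with retraining. Setting `alpha=1.0` reduces the ranking to the original search order.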

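    Abstract (Chinese)
    Abstract (English)
    Acknowledgements
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1. Introduction
        1.1 Research Background and Motivation
        1.2 Challenges of Personalized Search in E-Commerce
            1.2.1 Personalized Datasets
            1.2.2 Personalized Search
            1.2.3 Rapidly Updating the Personalized Model
        1.3 Contributions
        1.4 Thesis Organization
    Chapter 2. Literature Review
        2.1 Personalized Search
        2.2 Evolution of Recommender System Techniques
            2.2.1 Content-based Recommender Systems
            2.2.2 Collaborative Filtering Recommender Systems
            2.2.3 Hybrid Recommender Systems
        2.3 Graph Collaborative Filtering Recommender Systems
            2.3.1 Graph Theory in Recommender Systems
            2.3.2 NGCF
        2.4 Continually Updatable Recommender Systems
    Chapter 3. Dataset
        3.1 E-Commerce Data Sources
            3.1.1 Tracking Data Table
            3.1.2 Product Data Table
        3.2 Data Preprocessing
            3.2.1 Selecting Interaction Behaviors
            3.2.2 Data Cleaning
            3.2.3 Removing Internal Test Records
            3.2.4 Removing Suspected Crawler Records
            3.2.5 Preprocessing Results
        3.3 Dataset Construction
            3.3.1 Excluding the Cold-Start Problem
            3.3.2 Training and Validation/Test Split
            3.3.3 Small-Scale Dataset
            3.3.4 Application to Personalized Search
        3.4 New-Interaction Dataset
    Chapter 4. Methodology
        4.1 Personalized Search Architecture
            4.1.1 Semantic Search Model ABRSS
            4.1.2 Personalized Search Score Calculation
        4.2 Personalized Model
            4.2.1 Matrix Factorization
            4.2.2 LightGCN
            4.2.3 Loss Function: BPR Loss
            4.2.4 Model Training
        4.3 Product Embedding Initialization Methods
            4.3.1 Random Initialization
            4.3.2 Initialization Based on Semantic Relationships
        4.4 Rapidly Updating the Personalized Model
    Chapter 5. Experiments and Discussion
        5.1 Personalized Search Evaluation Protocol
            5.1.1 Use of the Personalized Search Test Set
            5.1.2 Recall
            5.1.3 NDCG
            5.1.4 IDCG and DCG Curve
            5.1.5 NDCG Curve
            5.1.6 MRR
            5.1.7 Relative Improvement
        5.2 Personalized Search Experiments
            5.2.1 Comparison of Personalized Model Architectures
            5.2.2 Comparison of Product Embedding Initialization Methods
            5.2.3 Visualization Analysis
            5.2.4 Personalized Search Comparison
        5.3 Rapid Personalized Model Update Experiments
            5.3.1 Comparison of Data Volume and Training-Data Time Span
            5.3.2 Comparison of Rapid Personalized Model Updates
        5.4 Observations on Experimental Results
            5.4.1 Observations on Personalized Search Results
            5.4.2 Summary of Experimental Results
    Chapter 6. Conclusion and Future Work
        6.1 Conclusion
        6.2 Future Work
            6.2.1 Online A/B Test
            6.2.2 Recommender Systems
            6.2.3 Knowledge-Graph-Based Recommender Systems
    References
    Appendix A. User-Product Interaction Records
    Appendix B. Chinese-English Glossary
    Appendix C. Committee Members' Suggestions and Responses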

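    Full-text release date: 2028/12/28 (campus network)
    Full-text release date: 2028/12/28 (off-campus network)
    Full-text release date: 2028/12/28 (National Central Library: Taiwan theses and dissertations system)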