| Graduate Student | 鄒承翰 Cheng-Han Chou |
|---|---|
| Thesis Title | 基於文字之語意分析及偏好汲取用於資料非均衡問題 Semantic Analysis and Preference Capturing on Data Imbalance Problem |
| Advisor | 戴碧如 Bi-Ru Dai |
| Oral Defense Committee | 戴志華 Chih-Hua Tai, 沈之涯 Chih-Ya Shen, 陳怡伶 Yi-Ling Chen |
| Degree | Master |
| Department | Department of Computer Science and Information Engineering, College of Electrical Engineering and Computer Science |
| Publication Year | 2021 |
| Academic Year (ROC) | 109 |
| Language | English |
| Pages | 46 |
| Keywords (Chinese) | 推薦系統, 注意力機制, 深度學習, 資料非均衡 |
| Keywords (English) | Recommendation System, Attention, Deep Learning, Data Imbalance |
Nowadays, people receive an enormous amount of information every day. However, they are only interested in information that matches their preferences. Thus, retrieving such information becomes a significant task. Matrix Factorization (MF) based methods achieve fairly good performance on recommendation tasks, in our case, reviews on e-commerce platforms. However, MF-based methods suffer from several crucial issues, such as the cold-start problem and data sparsity. To address these issues, numerous recommendation models have been proposed and have obtained strong performance. Nonetheless, we observe that no existing framework comprehensively improves performance by capturing both user preferences and item trends. Hence, we propose a novel approach to tackle the aforementioned issues. A hierarchical structure is employed in the proposed framework. Furthermore, additional sampling techniques are proposed and evaluated in order to further enhance the performance of the proposed model. The proposed model outperforms state-of-the-art models on several real-world datasets. Experimental results verify that our framework can extract useful features even from sparse data.
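The abstract cites MF-based methods as the baseline family for rating prediction. As background only (this is a generic textbook sketch, not the thesis's model; the function name, hyperparameters, and toy data are illustrative), a minimal matrix factorization that fills missing ratings by SGD on the observed entries:

```python
import numpy as np

def matrix_factorization(R, mask, k=2, lr=0.01, reg=0.02, epochs=2000, seed=0):
    """Factor a partially observed rating matrix R ~ U @ V.T by SGD,
    updating only on observed entries (mask == 1)."""
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    U = rng.random((n_users, k))  # latent user factors
    V = rng.random((n_items, k))  # latent item factors
    rows, cols = np.nonzero(mask)
    for _ in range(epochs):
        for u, i in zip(rows, cols):
            err = R[u, i] - U[u] @ V[i]
            U[u] += lr * (err * V[i] - reg * U[u])
            V[i] += lr * (err * U[u] - reg * V[i])
    return U, V

# Toy 4x4 rating matrix; 0 marks a missing rating.
R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [1, 0, 0, 4]], dtype=float)
mask = (R > 0).astype(int)
U, V = matrix_factorization(R, mask)
pred = U @ V.T  # dense prediction; missing entries are now filled
```

The cold-start and sparsity issues mentioned in the abstract are visible here: a user or item with no observed entries never receives a gradient update, so its factors stay at their random initialization.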
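The abstract proposes additional sampling techniques for the data imbalance problem without detailing them. As background on the standard approach in this area (a SMOTE-style interpolation sketch; the helper name and toy points are hypothetical, not the sampling technique actually proposed in the thesis), synthetic minority samples can be generated by interpolating between a minority point and one of its nearest minority neighbours:

```python
import random

def smote_samples(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic points by interpolating between a random
    minority point and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest neighbours of x within the minority class (excluding x itself)
        neighbours = sorted(
            minority,
            key=lambda p: sum((a - b) ** 2 for a, b in zip(p, x)),
        )[1:k + 1]
        nb = rng.choice(neighbours)
        gap = rng.random()  # random position on the segment between x and nb
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, nb)))
    return synthetic

minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_pts = smote_samples(minority, n_new=4)
```

Because each synthetic point lies on a segment between two existing minority points, the new samples stay inside the minority region rather than being exact duplicates, which is what distinguishes this family of methods from naive oversampling.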