
Graduate student: Yi-Ming Wang (汪怡銘)
Thesis title: Feature Importance Evaluation under Fuzzy Group Based Recommendation System (模糊化群體導向推薦系統之特徵重要度評估)
Advisor: Chao-Lung Yang (楊朝龍)
Oral defense committee: Chao-Lung Yang (楊朝龍), Ren-Jieh Kuo (郭人介), Kai-Lung Hua (花凱龍)
Degree: Master
Department: School of Management - Department of Industrial Management
Year of publication: 2019
Academic year of graduation: 107 (ROC calendar)
Language: English
Pages: 83
Keywords: Feature Selection, Feature Importance/Contribution, Random Forest, Fisher Discriminant Ratio, Group Based Recommendation Systems, Fuzzy Theory
  • This research proposes a fuzzy group-based recommendation system framework that derives feature importance from a dataset using feature selection methods. Feature importance is treated as the degree to which each feature influences group-level purchasing behavior, and is then used to compute a rating that quantifies a customer group's desire to buy a given item. Under a simulated recommendation mechanism, the effect of each feature selection method on the framework is evaluated by the coverage rate: the proportion of customers whose purchased merchandise is covered by the model's predictions. A financial dataset and a shopping transaction dataset are used to verify the model's performance. Three feature selection methods are compared: Fisher Discriminant Ratio (FDR), Fisher Discriminant Ratio with Membership Degree (FDRMD), and Random Forest (RF). The results indicate that FDR can uncover singular customer groups: a few groups whose coverage rate under the recommendation is far higher than that of the remaining, low-coverage groups. Although these groups account for only a small share of all customers, identifying them can help companies optimize their recommendation budgets and run more precisely targeted promotions.
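    The abstract names two quantities, a Fisher Discriminant Ratio used as feature importance and a coverage rate used for evaluation, without giving their formulas. The sketch below is only an illustration under stated assumptions, not the thesis's implementation: it uses the common two-class form of FDR with population variance, and it counts a customer as covered when at least one purchased item appears in that customer's recommendation list. All function names and the toy data are hypothetical.

```python
import numpy as np


def fisher_discriminant_ratio(X, y):
    """Per-feature two-class FDR: (mean_1 - mean_0)^2 / (var_1 + var_0).

    Assumption: binary labels (0 = did not buy, 1 = bought) and
    population variance (NumPy's default ddof=0).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    X0, X1 = X[y == 0], X[y == 1]
    numerator = (X1.mean(axis=0) - X0.mean(axis=0)) ** 2
    denominator = X1.var(axis=0) + X0.var(axis=0)
    # Guard against zero within-class variance (an illustrative choice).
    denominator = np.where(denominator == 0, np.finfo(float).eps, denominator)
    return numerator / denominator


def coverage_rate(recommended, purchased):
    """Share of customers with at least one purchased item in their list.

    Assumption: both arguments map a customer id to a list of item ids.
    """
    if not purchased:
        return 0.0
    hits = sum(
        1
        for customer, items in purchased.items()
        if set(items) & set(recommended.get(customer, []))
    )
    return hits / len(purchased)


# Toy data: two demographic features, binary purchase label.
X = [[1.0, 5.0], [2.0, 5.5], [8.0, 5.2], [9.0, 5.1]]
y = [0, 0, 1, 1]
fdr = fisher_discriminant_ratio(X, y)
weights = fdr / fdr.sum()  # normalized importance, sums to 1

recommended = {"a": ["p1", "p2"], "b": ["p3"], "c": ["p1"]}
purchased = {"a": ["p2"], "b": ["p4"], "c": ["p1", "p5"], "d": ["p2"]}
rate = coverage_rate(recommended, purchased)
```

    In this toy data the first feature separates buyers from non-buyers far better than the second, so it receives nearly all of the normalized weight. Customers "a" and "c" are covered, while "b" and "d" are not.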

    Abstract (Chinese)
    Abstract
    Acknowledgements
    CONTENTS
    FIGURE LIST
    TABLE LIST
    CHAPTER 1 INTRODUCTION
      1.1 Research Background
        1.1.1 Recommendation System
        1.1.2 Customer Segmentation on Recommendation Systems
        1.1.3 Fuzzy Methodology with Score Based Evaluation
      1.2 Research Problem
        1.2.1 Assumptions of this Research
        1.2.2 Contribution of Demographic Feature against Purchasing Behavior
        1.2.3 Singular Recommendation with Budget Optimization
      1.3 Structure of this Research
    CHAPTER 2 LITERATURE REVIEW
      2.1 Group Based Recommendation System (GBRS)
      2.2 Fuzzy Methodology
        2.2.1 Fuzzy Set and Membership Degree
        2.2.2 Fuzzy Measure
        2.2.3 Fuzzy Integral
        2.2.4 Fuzzy C-Means
      2.3 Fuzzy Methodology on Recommendation System
      2.4 Fuzzy Personalized Scoring Model for Recommendation System
      2.5 Indicators of Feature Selection
        2.5.1 Random Forest Importance/Contribution (Machine Learning)
        2.5.2 Complexity Measurement
        2.5.3 Meta Heuristics Weighting
    CHAPTER 3 METHODOLOGY
      3.1 Research framework of the Key Paper
        3.1.1 User Clustering
        3.1.2 Importance Finding
        3.1.3 Fuzzy Integral Scoring
        3.1.4 Conclusion, Differences and Extensions
      3.2 Calculation and evaluation of feature importance
        3.2.1 Notations
        3.2.2 Random Forest Importance
        3.2.3 Fisher Discriminant Ratio (FDR)
        3.2.4 Fisher Discriminant Ratio with C-means Membership Degree (FDRMD)
      3.3 Generation of Recommendation List and Verification
        3.3.1 Generation of Recommendation List (Customer Recommendation)
        3.3.2 Recommendation Verification
    CHAPTER 4 EXPERIMENTS
      4.1 Dataset Introduction
        4.1.1 Santander Dataset
        4.1.2 Black Friday Dataset
      4.2 Customer Clustering Results
        4.2.1 Santander Dataset (April & Mar, Apr, May)
        4.2.2 Black Friday Dataset
        4.2.3 Less Clustering Amount
      4.3 Choosing the Recommendation Amount
        4.3.1 Santander Dataset (April & Mar, Apr, May)
        4.3.2 Black Friday Dataset
      4.4 Experiment Results
        4.4.1 Random Forest Importance
        4.4.2 Fisher Discriminant Ratio (FDR)
        4.4.3 Fisher Discriminant Ratio with C-means Membership Degree (FDRMD)
        4.4.4 Evaluation/Comparison of Average Coverage rate
        4.4.5 Evaluation/Comparison of Singularity
        4.4.6 Cost of Computation Time
      4.5 Chapter Recap
    CHAPTER 5 CONCLUSION
      5.1 Summary of the Research
      5.2 Future Works
    Reference
    Appendix
      A.1 Dataset Column Description
        A.1.1 Santander Dataset
        A.1.2 Black Friday Dataset
      A.2 User Clustering Results of Other Datasets
      A.3 Choosing the Best Recommendation Amount
      A.4 Experiment Results
        A.4.1 Group Accuracy of Other Datasets
        A.4.2 Group Accuracy of Different Clustering Amount under Same Dataset
        A.4.3 Group Accuracy of Recommending 1 or 2 merchandise
        A.4.4 Group Accuracy of Random Importance

    Full-text release date: 2024/07/17 (campus network, off-campus network, and National Central Library: Taiwan theses and dissertations system)