
Graduate student: 朱永燾 (Yung-Tao Chu)
Thesis title: Ranking Instances via Combining Decision Trees and Weighted Naive Bayes
Advisor: 李育杰 (Yuh-Jye Lee)
Committee members: 鮑興國 (H.K. Pao), 戴碧如 (B.R. Dai), 黃文瀚 (Wen-Han Hwang)
Degree: Master
Department: Department of Computer Science and Information Engineering, College of Electrical Engineering and Computer Science
Year of publication: 2008
Academic year of graduation: 96
Language: English
Number of pages: 35
Keywords: Ranking, Decision Tree
In this thesis, we propose a novel method that can distinguish instances falling into the same leaf node. Traditionally, decision trees rank instances using a local probability estimate computed for each leaf node. This approach cannot rank instances within the same leaf node, because they all receive the same estimated probability. To rank instances accurately, each test case should receive a distinct probability estimate.
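As a minimal illustration (not code from the thesis), the Python sketch below shows why leaf-local estimation ties instances: every test case reaching the same leaf receives the identical Laplace-corrected probability. The function name and counts are invented for the example.

```python
# A hypothetical sketch of leaf-local probability estimation in a decision tree.
def leaf_probability(n_pos: int, n_neg: int) -> float:
    """Laplace-corrected estimate of P(positive | leaf) from the leaf's training counts."""
    return (n_pos + 1) / (n_pos + n_neg + 2)

# Any two test instances falling into a leaf that holds 7 positive and
# 3 negative training examples receive the same score, so the leaf-local
# estimate cannot rank them against each other.
print(leaf_probability(7, 3))  # 0.666... for both instances
```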
This thesis provides a way of combining a decision tree with a weighted Naive Bayes model. The combined method can rank both the leaf nodes and the instances within the same leaf node. We use it to build three different trees from the same dataset and assign a different weight to each tree in order to combine the local probability estimates that the three trees produce for the same instance. In this way, each instance receives a distinct probability.
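The abstract does not spell out the exact combination rule; a weighted average of the per-tree leaf estimates is one plausible reading, sketched here with invented weights and probabilities.

```python
# A hypothetical sketch of combining per-tree leaf estimates; the thesis may
# use a different rule. The weights and probabilities below are invented.
def combined_score(leaf_probs, tree_weights):
    """Weighted average of the leaf probabilities from several trees."""
    total = sum(tree_weights)
    return sum(w * p for w, p in zip(tree_weights, leaf_probs)) / total

# Three trees built from the same dataset place one instance in leaves with
# different local estimates; unequal tree weights break ties between instances.
print(combined_score([0.67, 0.80, 0.55], [0.5, 0.3, 0.2]))  # 0.685
```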
This method breaks the limitation of ranking only within the leaf node into which an instance falls. However, relying solely on probability estimates from Naive Bayes may yield poor results, so we also investigate several other techniques that have been proposed to modify Naive Bayes. Our experiments show that the proposed method performs significantly better than traditional methods. All results are evaluated using the AUC (area under the ROC curve) rather than classification accuracy.
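For reference, the AUC used in the evaluation can be computed as the Wilcoxon-Mann-Whitney statistic: the fraction of positive/negative score pairs in which the positive instance is scored higher, with ties counted as half. A minimal sketch:

```python
# AUC as the Wilcoxon-Mann-Whitney statistic over all positive/negative pairs.
def auc(scores, labels):
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

print(auc([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0]))  # 0.75
```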



Table of contents:
1 Introduction
2 Fundamental Learning Methods: Decision Tree and Naive Bayes
  2.1 Discretization for Numerical Attributes
  2.2 Decision Tree Model
    2.2.1 Generating Decision Trees
    2.2.2 Pruning Strategies
  2.3 Naive Bayes Models
    2.3.1 Conventional Naive Bayes Model
    2.3.2 Weighted Naive Bayes Model
3 Ranking the Instances of Outputs
  3.1 Ranking Leaf Nodes
  3.2 Ranking Instances with a Hierarchical Structure
  3.3 Breaking the Limitation of Hierarchical Ranking
4 Numerical Results and Comparisons
  4.1 Comparing the Proposed Method with Other Methods
  4.2 Parameters Tuning
5 Conclusion and Future Work

