
Author: Li-wei Chen (陳立偉)
Title: A Study on Performance Comparisons and Sensitivity Analysis of Software Quality Classification Models (軟體品質分類模式績效比較與敏感度分析之研究)
Advisor: Sun-jen Huang (黃世禎)
Committee Members: Yun-jong Lee (李允中), Bing-chiang Jeng (鄭炳強), De-ron Liang (梁德容), Yue-li Wang (王有禮), Chiun-chieh Hsu (徐俊傑), Gwo-guang Lee (李國光)
Degree: Doctorate
Department: Department of Information Management, College of Management
Year of Publication: 2008
Graduation Academic Year: 96 (ROC calendar; academic year 2007-2008)
Language: English
Number of Pages: 81
Keywords: Software Quality Classification, Software Measurement Data Quality, Classification Modeling Technique, Classification Accuracy, Classification Efficiency, Sensitivity Analysis, Software Measurement and Analysis
Abstract (Chinese; translated):

For both the software industry and the academic community, predicting software quality accurately and reliably has long been a serious challenge. The software quality classification model is a widely adopted software quality prediction tool: from its classification results, software development managers can identify the modules most likely to contain faults and then carry out more cost-effective software quality improvement activities. In the software measurement and analysis literature, however, most studies of software quality classification models examine only their classification accuracy. When software development managers select a model suited to their actual needs, the model's classification efficiency matters as well, because classification must be performed frequently to control software quality in a timely and accurate manner, while collecting the software measurement data that such activities require is time-consuming, labor-intensive, and costly. Moreover, real-world software projects involve many uncertainties, so the measurement data gathered over the software development life cycle often carry varying degrees of imprecision; yet the sensitivity of the established models (i.e., the impact of imprecise software measurement data on their classification accuracy) has not been fully explored in the related literature. The main purpose of this dissertation is therefore to investigate these two important issues in depth through an empirical study.

From the perspective of how each model collects software measurement data at the prediction stage, this study divides the models built with discriminant analysis, decision trees, and logistic regression into single-cycled and multi-cycled software quality classification models, examines the differences in classification accuracy and classification efficiency between the two kinds of models, and further explores how imprecision in the software measurement data affects their classification accuracy. According to the empirical results, the discriminant analysis and logistic regression models perform better in overall classification accuracy (overall misclassification rate, overall MR), software re-appraisal cost (Type I MR), and sensitivity, whereas the decision tree models hold the advantage in software failure cost (Type II MR) and in the cost of collecting software measurement data. This study therefore suggests that software development managers weigh all of these performance factors simultaneously when choosing a software quality classification model suited to their own needs, so as to reach a more precise and comprehensive trade-off.


Abstract (English):

More accurate and reliable prediction of software quality has always been a challenge for both the software industry and the academic community. Software quality classification models have been widely adopted for this purpose, and as various modeling techniques are available, comparative studies of such models have become increasingly important. Most past comparative studies, however, were devoted only to comparing the models' classification accuracy. To generate a classification result, numerous related base measures must first be collected, and in software practice the collection and verification of these measures is labor-intensive and is carried out in an incomplete, vague, and uncertain data collection environment. A model's classification efficiency and the sensitivity of its classification accuracy therefore also play important roles in choosing a software quality classification model suited to a software development project manager's specific environment and needs. Unfortunately, these topics have not been adequately explored in the software measurement and analysis literature. This dissertation therefore investigates them through an empirical study.

First, an industrial software quality dataset is used to compare, at the prediction stage, the classification accuracy and efficiency of the single-cycled models built with discriminant analysis (DA) and logistic regression (LR) against the multi-cycled models built with decision trees (DT). Second, a sensitivity analysis of the established models' classification accuracy is performed using software measurement data degraded by different levels of imprecision. The experimental results show that the DA- and LR-based models are better in overall misclassification rate (MR), Type I MR, and the sensitivity of their classification accuracy, while the DT-based models yield better results in Type II MR and classification efficiency. Accordingly, this dissertation suggests that the re-appraisal cost of Type I MR, the software failure cost of Type II MR, the cost of collecting software measurement data, and the impact of imprecise measurement data on classification accuracy should all be considered simultaneously when a software development project manager chooses a software quality classification model for a particular set of requirements.
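
To make the comparison concrete, the sketch below shows, in outline only, how the three modeling techniques can be fitted to software-metrics data and scored on the accuracy indicators used above (overall, Type I, and Type II misclassification rates). It is an illustration, not the dissertation's experimental code: the synthetic dataset, the choice of scikit-learn estimators, the hold-out split, and all parameter values are assumptions standing in for the NASA KC2 data and the experiment design described in Chapter 3.

# Minimal sketch: comparing DA-, LR-, and DT-based software quality
# classification models on overall / Type I / Type II misclassification
# rates. The synthetic dataset stands in for KC2-style module metrics.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 500
# Hypothetical base measures: e.g. LOC, cyclomatic complexity, operand count.
X = rng.gamma(shape=2.0, scale=30.0, size=(n, 3))
# Label modules fault-prone (1) with probability rising in complexity.
p = 1 / (1 + np.exp(-(X[:, 1] / 40.0 - 2.0)))
y = rng.binomial(1, p)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

models = {
    "DA": LinearDiscriminantAnalysis(),
    "LR": LogisticRegression(max_iter=1000),
    "DT": DecisionTreeClassifier(max_depth=4, random_state=0),
}
for name, model in models.items():
    pred = model.fit(X_tr, y_tr).predict(X_te)
    # Type I MR: fault-free modules flagged fault-prone (re-appraisal cost).
    type1 = np.mean(pred[y_te == 0] == 1)
    # Type II MR: fault-prone modules passed as fault-free (failure cost).
    type2 = np.mean(pred[y_te == 1] == 0)
    overall = np.mean(pred != y_te)
    print(f"{name}: overall={overall:.3f} typeI={type1:.3f} typeII={type2:.3f}")

In this framing, Type I errors trigger needless re-appraisal of good modules while Type II errors let fault-prone modules slip through, so the two rates carry different costs and should be read separately rather than folded into a single accuracy figure.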
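
The sensitivity analysis can be sketched in the same spirit by injecting controlled imprecision into the test measures and observing how the overall misclassification rate shifts. Again, this is only an assumed setup: the uniform +/-10% perturbation and the single noise draw below are illustrative, not the imprecision levels or replication scheme actually used in the dissertation.

# Minimal sketch: sensitivity of classification accuracy to imprecise
# measurement data. Reuses rng, X_te, y_te, and the fitted models above.
def overall_mr(model, X, y):
    """Overall misclassification rate of a fitted model on (X, y)."""
    return np.mean(model.predict(X) != y)

for name, model in models.items():
    base = overall_mr(model, X_te, y_te)
    # Perturb each base measure by up to +/-10% (assumed imprecision level).
    noise = rng.uniform(0.9, 1.1, size=X_te.shape)
    degraded = overall_mr(model, X_te * noise, y_te)
    print(f"{name}: MR {base:.3f} -> {degraded:.3f} under 10% imprecision")

A model whose rate barely moves under such perturbation is, in the dissertation's terms, less sensitive to imprecise software measurement data.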

Table of Contents:
摘要 (Chinese Abstract) I
Abstract II
Acknowledgement III
Table of Contents IV
List of Figures VI
List of Tables VII
1. Introduction 1
1.1 Research Background and Motivation 1
1.2 Research Purpose, Questions and Scope 6
1.3 Outline of the Dissertation 7
2. Literature Review 9
2.1 Evolution of Software Quality Classification Studies 9
2.1.1 Studies on Performance Comparison of Software Quality Classification Models 11
2.1.2 Studies on Quality of Software Measurement Data 13
2.2 Definitions of Single- and Multi-cycled Models 17
2.3 Description of Classification Modeling Techniques 20
2.3.1 Discriminant Analysis (DA) 20
2.3.2 Logistic Regression (LR) 21
2.3.3 Decision Tree (DT) 22
2.3.4 Operation of DA, DT and LR at Prediction Stage 22
3. Research Method 25
3.1 Research Procedure 26
3.2 NASA KC2 Software Quality Dataset 27
3.3 Experiment Design for Accuracy and Efficiency Analysis 29
3.4 Experiment Design for Sensitivity Analysis on Accuracy 31
3.5 Performance Evaluation Criteria 34
3.5.1 Classification Accuracy Indicators 34
3.5.2 Classification Efficiency Indicators 36
4. Accuracy and Efficiency Analysis 39
4.1 Experiment Results 39
4.1.1 Comparison of Classification Accuracy 41
4.1.2 Comparison of Classification Efficiency 43
4.2 Suggestions and Discussions 44
5. Sensitivity Analysis on Accuracy 49
5.1 Experiment Results 49
5.2 Suggestions and Discussions 62
6. Conclusions, Limitations and Future Work 65
6.1 Conclusions 65
6.2 Limitations 70
6.3 Future Work 71
References 73
Curriculum Vitae 79
Publication List 81

