
Graduate student: 鄭歆蓉
Hsin-Jung Cheng
Thesis title: 發展整合性演化式方法於無雜訊與有雜訊事件記錄檔之流程探勘適應值改善
Developing integrated evolutionary approaches to improve process mining fitness for noise-free and noisy logs
Advisor: 歐陽超
Chao Ou-Yang
Oral defense committee: 郭人介
Ren-Jieh Kuo
王孔政
Kung-Jeng Wang
王福琨
Fu-Kwun Wang
蔡瑞煌
Rua-Huan Tsaih
姚銘忠
Ming-Jong Yao
鄭元杰
Yuan-Jye Tseng
Degree: Doctor
Department: College of Management - Department of Industrial Management
Year of publication: 2015
Graduation academic year: 103 (ROC calendar)
Language: English
Number of pages: 87
Keywords (translated from Chinese): Process mining, Parallel structures, Process model, Genetic algorithm, Particle swarm optimization, Differential evolution algorithm, Simulated annealing, Noise, Fitness, Evolutionary approach
Keywords (English): Process mining, Parallel structures, Process model, Genetic algorithm, Particle swarm optimization, Differential evolution, Simulated annealing, Noisy log, Fitness, Evolutionary algorithm
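    With the rapid change of information technology, business activities have grown increasingly complicated. Once an enterprise reaches a certain scale, its business processes are bound to be quite complex, and understanding the information hidden in them is often not easy. A business process model makes all of an enterprise's internal business activities or behavior visible at a glance, so the enterprise can learn the state of its existing processes in a relatively simple way. Process mining is a technique for discovering business process models; its goal is to mine, from an event log, a model that represents the process behavior recorded in that log. A mined process model with high fitness reflects most of the process behavior recorded in the event log. However, an event log may contain incomplete or erroneous process data. This dissertation therefore addresses process mining for both complete and noisy event logs.
    This dissertation first examines process mining on complete event logs. A typical business process model may contain several kinds of process structures at once, for example parallel structures, choice structures, non-free-choice structures, and loops. Parallel structures, however, are the hardest to mine. This dissertation therefore develops an integrated evolutionary approach based on genetic process mining, particle swarm optimization, and differential evolution, with the aim of finding process models with high fitness from complete event logs that contain compound parallel structures. The experimental results show that the proposed approach effectively mines high-fitness process models from event logs containing compound parallel structures, and that the mined models can be used for continuous process improvement.
    Next, this dissertation examines process mining on noisy event logs. A noisy event log is one whose event records are incomplete or contain errors. This dissertation therefore develops an evolutionary approach that integrates particle swarm optimization and simulated annealing, using the ability of these two methods to reduce the interference of noise, to mine high-fitness process models from noisy event logs. Experiments show that the proposed approach mines high-fitness process models from noisy event logs more efficiently than the best existing process mining method.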


    As information technology changes, business activities have become markedly more complex, and the information implicit in complex business processes is difficult to understand. A process model describes the internal activities or behavior of a business. Process mining (PM) is a technique that extracts a process model from an event log to represent the process behavior recorded in that log. A mined process model with high fitness reflects most of the process behavior recorded in the event log. Previous studies have shown that mined models with high fitness can be used in process improvement, such as fraud detection, continuous process improvement, and benchmarking. However, event logs may contain incomplete or incorrect process data; such logs are called noisy logs. This dissertation therefore studies the discovery of process models with high fitness from both noise-free and noisy event logs.
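    The notion of fitness above can be illustrated with a minimal sketch. It assumes a deliberately simplified model representation (a set of allowed directly-follows pairs plus start and end activities) and a simplified measure (the fraction of traces the model can replay). This record does not state the fitness measure used in the dissertation, so the names `trace_fits` and `log_fitness` and the measure itself are illustrative assumptions only.

```python
from typing import Iterable, Sequence, Set, Tuple

Trace = Sequence[str]
Edge = Tuple[str, str]


def trace_fits(trace: Trace, allowed: Set[Edge],
               starts: Set[str], ends: Set[str]) -> bool:
    """Return True if the toy model can replay the whole trace.

    Assumption: the model is just a set of allowed directly-follows
    pairs with start and end activities, not a Petri net.
    """
    if not trace:
        return False
    if trace[0] not in starts or trace[-1] not in ends:
        return False
    return all((a, b) in allowed for a, b in zip(trace, trace[1:]))


def log_fitness(log: Iterable[Trace], allowed: Set[Edge],
                starts: Set[str], ends: Set[str]) -> float:
    """Fraction of traces in the log that the toy model can replay."""
    traces = list(log)
    if not traces:
        return 0.0
    fitting = sum(trace_fits(t, allowed, starts, ends) for t in traces)
    return fitting / len(traces)


if __name__ == "__main__":
    # Toy model: A, then B and C in either order (parallel), then D.
    allowed = {("A", "B"), ("A", "C"), ("B", "C"),
               ("C", "B"), ("B", "D"), ("C", "D")}
    log = [
        ["A", "B", "C", "D"],
        ["A", "C", "B", "D"],
        ["A", "D"],  # noisy trace: B and C are missing
    ]
    # Two of the three traces replay, so the fitness is 2/3.
    print(log_fitness(log, allowed, {"A"}, {"D"}))
```

    In this toy log the third trace plays the role of a noisy record: it lowers the fitness of an otherwise correct model, which is why noisy logs are treated as a separate problem.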
    First, this dissertation considers discovering parallel (AND) structures in PM. Several constructs are problematic for PM, including parallel (AND) structures, exclusive-choice (XOR) structures, non-free-choice structures, loops, and noise. Some PM approaches have been developed to address only one or a few of these constructs. Genetic process mining (GPM) is a well-known PM method that can handle most of them simultaneously. However, GPM still cannot effectively discover parallel structures from noise-free logs. This dissertation proposes a PM approach that integrates GPM, particle swarm optimization (PSO), and differential evolution (DE) to find process models with high fitness for noise-free logs involving multiple parallel structures. The results show that the proposed approach improves the fitness of the process models mined from event logs involving multiple parallel structures.
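    To make the PSO and DE operators named above concrete, the following sketch applies the textbook PSO velocity update and the textbook DE/rand/1 mutation with binomial crossover to a toy continuous objective. It is not the dissertation's method: this record does not describe how process models are encoded or how GPM, PSO, and DE are coupled. The function `pso_de_minimize`, the `sphere` objective, and all parameter values are assumptions chosen for illustration.

```python
import random
from typing import Callable, List, Tuple

Vector = List[float]


def sphere(x: Vector) -> float:
    """Toy objective (assumed for illustration): sum of squares."""
    return sum(v * v for v in x)


def pso_de_minimize(objective: Callable[[Vector], float], dim: int,
                    swarm_size: int = 20, iterations: int = 100,
                    w: float = 0.7, c1: float = 1.5, c2: float = 1.5,
                    f: float = 0.5, cr: float = 0.9, bound: float = 5.0,
                    seed: int = 0) -> Tuple[Vector, float]:
    """Alternate a PSO move with a DE step on the personal bests."""
    if swarm_size < 4:
        raise ValueError("DE/rand/1 needs at least 4 individuals")
    rng = random.Random(seed)
    pos = [[rng.uniform(-bound, bound) for _ in range(dim)]
           for _ in range(swarm_size)]
    vel = [[0.0] * dim for _ in range(swarm_size)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(swarm_size), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(iterations):
        # PSO: inertia + pull toward personal best + pull toward global best.
        for i in range(swarm_size):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val

        # DE/rand/1/bin on the personal bests, with greedy selection.
        for i in range(swarm_size):
            others = [j for j in range(swarm_size) if j != i]
            a, b, c = rng.sample(others, 3)
            j_rand = rng.randrange(dim)  # guarantees one mutated component
            trial = [
                pbest[a][d] + f * (pbest[b][d] - pbest[c][d])
                if (rng.random() < cr or d == j_rand) else pbest[i][d]
                for d in range(dim)
            ]
            trial_val = objective(trial)
            if trial_val < pbest_val[i]:
                pbest[i], pbest_val[i] = trial, trial_val

        g = min(range(swarm_size), key=pbest_val.__getitem__)
        if pbest_val[g] < gbest_val:
            gbest, gbest_val = pbest[g][:], pbest_val[g]

    return gbest, gbest_val


if __name__ == "__main__":
    best, value = pso_de_minimize(sphere, dim=3, seed=1)
    print(best, value)
```

    Applying DE to the personal bests rather than to the current positions is one of several possible couplings; it is used here only because it keeps the two operators easy to tell apart.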
    Second, this dissertation considers discovering process models from noisy logs. In some settings (such as hospitals), data must sometimes be recorded manually, and data on tasks performed during emergencies may be missing. Manual logging may introduce logging errors, and missing events leave logs incomplete. Event logs may therefore contain incorrect or incomplete process data, which makes them noisy logs. This dissertation develops a PSOSA approach that combines particle swarm optimization (PSO) with simulated annealing (SA) to discover process models with high fitness from noisy logs. The results show that the PSOSA approach improves the fitness of process models mined from noisy logs.
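    One common way to combine PSO with SA is to let each particle accept a new guiding position through the Metropolis criterion, so that worse candidates are occasionally accepted while the temperature is high. The sketch below shows that coupling on a toy continuous objective. It is an assumption made for illustration: this record does not say how the dissertation's PSOSA approach couples the two methods, and `metropolis_accept`, `pso_sa_minimize`, `sphere`, and every parameter value are hypothetical.

```python
import math
import random
from typing import Callable, List, Tuple

Vector = List[float]


def sphere(x: Vector) -> float:
    """Toy objective (assumed for illustration): sum of squares."""
    return sum(v * v for v in x)


def metropolis_accept(delta: float, temperature: float,
                      rng: random.Random) -> bool:
    """SA rule: always accept improvements (delta <= 0); accept a
    worsening of size delta with probability exp(-delta / temperature)."""
    if delta <= 0:
        return True
    if temperature <= 0:
        return False
    return rng.random() < math.exp(-delta / temperature)


def pso_sa_minimize(objective: Callable[[Vector], float], dim: int,
                    swarm_size: int = 15, iterations: int = 100,
                    w: float = 0.7, c1: float = 1.5, c2: float = 1.5,
                    t0: float = 1.0, cooling: float = 0.95,
                    bound: float = 5.0, seed: int = 0
                    ) -> Tuple[Vector, float]:
    """PSO whose per-particle guide is updated by Metropolis acceptance."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-bound, bound) for _ in range(dim)]
           for _ in range(swarm_size)]
    vel = [[0.0] * dim for _ in range(swarm_size)]
    guide = [p[:] for p in pos]            # SA-accepted state per particle
    guide_val = [objective(p) for p in pos]
    b = min(range(swarm_size), key=guide_val.__getitem__)
    best, best_val = guide[b][:], guide_val[b]  # best-ever solution
    temperature = t0

    for _ in range(iterations):
        for i in range(swarm_size):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (guide[i][d] - pos[i][d])
                             + c2 * r2 * (best[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = objective(pos[i])
            # The guide may move to a worse point while it is still hot.
            if metropolis_accept(val - guide_val[i], temperature, rng):
                guide[i], guide_val[i] = pos[i][:], val
            # The best-ever solution is tracked separately and never worsens.
            if val < best_val:
                best, best_val = pos[i][:], val
        temperature *= cooling             # geometric cooling schedule

    return best, best_val


if __name__ == "__main__":
    best, value = pso_sa_minimize(sphere, dim=2, seed=3)
    print(best, value)
```

    Tracking the best-ever solution separately from the SA-accepted guides means that accepting a worse candidate can help a particle leave a poor region without ever degrading the returned result.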

    Abstract (Chinese)
    Abstract
    Acknowledgements
    Contents
    List of Figures
    List of Tables
    Chapter 1 Introduction
        1.1. Research motivation
        1.2. Research objectives
        1.3. Organization of dissertation
    Chapter 2 Literature Review
        2.1. Event logs and types of noise
        2.2. Process mining
        2.3. Petri net
        2.4. Genetic algorithm
        2.5. Particle swarm optimization
        2.6. Differential evolution
        2.7. PSO-DE
        2.8. Simulated annealing
    Chapter 3 Discovering Process Models from Noise-free Logs
        3.1. Problem statement
        3.2. Proposed approach
            3.2.1 GPM
            3.2.2 PSO
            3.2.3 DE
        3.3. Cases study and results discussion
            3.3.1 Event log of case study
            3.3.2 Parameter design
            3.3.3 Experimental setup
            3.3.4 Results discussion
        3.4. Summary
    Chapter 4 Discovering Process Models from Noisy Logs
        4.1. Problem statement
        4.2. Proposed approach
            4.2.1 PSO
            4.2.2 SA
        4.3. Experiments
            4.3.1 Event log used in experiments
            4.3.2 Parameter setting
            4.3.3 Experimental setup
            4.3.4 Results and discussion
        4.4. Scenario-based study and results discussion
            4.4.1 Scenario-based study of acute ischemic stroke
            4.4.2 Event log used in scenario-based study
            4.4.3 Experimental setup
            4.4.4 Results and discussion
        4.5. Summary
    Chapter 5 Conclusions and Future Work
        5.1. Conclusions
        5.2. Future research
    References
    Appendix
        Appendix A. Related curves of 100 generations for 100 runs in four cases for Chapter 3
        Appendix B. Author Resume

