Graduate student: | Denny Kusoemo |
---|---|
Thesis title: | Text Mining-based Construction Site Accident Classification Using Hybrid Supervised Machine Learning Model |
Advisor: | 鄭明淵 Min-Yuan Cheng |
Oral defense committee: | 呂守陞 Lu-Shou Sheng, 曾仁杰 Ceng-Ren Jie, 高明秀 Gao-Ming Xiu |
Degree: | Master |
Department: | College of Engineering - Department of Civil and Construction Engineering |
Publication year: | 2019 |
Graduation academic year: | 107 |
Language: | English |
Number of pages: | 79 |
Keywords: | Construction Project Safety, Construction Safety Document, Gated Recurrent Unit, GRU, SOS, Classification of Accidents Cause |
Safety performance is a major concern in the construction industry. Construction accidents not only cause severe health issues but also lead to large financial losses. These accidents are usually documented as accident narratives, each consisting of an accident summary and a cause classification. Because documenting and classifying hundreds of these narratives by hand requires considerable resources and effort, AI models are considered a favorable solution to this classification problem. Nevertheless, previously implemented models leave room for improvement in performance. For instance, Decision Tree, k-nearest neighbors (KNN), Naïve Bayes, support vector machine (SVM), and logistic regression (LR) are categorized as weak learners that display a substantial error rate. This study therefore proposes a hybrid of the Gated Recurrent Unit (GRU) and Symbiotic Organisms Search (SOS), named the Symbiotic Gated Recurrent Unit (SGRU). The SOS algorithm searches for the best GRU parameters to optimize the performance of the model. The proposed model is applied to and evaluated on real construction project accident narratives as a case study. The experimental results demonstrate promising performance of SGRU in classifying accident causes. By providing notable classification performance and outperforming the other AI models applied, SGRU shows the capability to support the development of accident prevention strategies in the construction industry.
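The parameter search described in the abstract can be illustrated with a minimal Python sketch of a basic SOS loop (mutualism, commensalism, and parasitism phases with greedy acceptance). This is not the thesis implementation: the objective `toy_validation_error` is a hypothetical stand-in for training a GRU and measuring its validation error, and the three searched parameters (hidden units, dropout rate, log10 learning rate) and their bounds in `BOUNDS` are assumptions chosen only for illustration.

```python
import random


def sos_minimize(objective, bounds, pop_size=10, iterations=30, seed=0):
    """Minimize `objective` inside box `bounds` with a basic SOS loop."""
    rng = random.Random(seed)
    dim = len(bounds)

    def clip(vec):
        # Keep every coordinate inside its [low, high] interval.
        return [min(max(v, lo), hi) for v, (lo, hi) in zip(vec, bounds)]

    def pick_other(i):
        # Random index different from i (requires pop_size >= 2).
        j = rng.randrange(pop_size - 1)
        return j if j < i else j + 1

    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    fit = [objective(x) for x in pop]

    for _ in range(iterations):
        for i in range(pop_size):
            best = pop[min(range(pop_size), key=fit.__getitem__)]

            # Mutualism: organisms i and j both move toward the best organism.
            j = pick_other(i)
            mutual = [(a + b) / 2 for a, b in zip(pop[i], pop[j])]
            bf1, bf2 = rng.randint(1, 2), rng.randint(1, 2)  # benefit factors
            new_i = clip([a + rng.random() * (b - m * bf1)
                          for a, b, m in zip(pop[i], best, mutual)])
            new_j = clip([a + rng.random() * (b - m * bf2)
                          for a, b, m in zip(pop[j], best, mutual)])
            for k, cand in ((i, new_i), (j, new_j)):
                f = objective(cand)
                if f < fit[k]:  # greedy acceptance
                    pop[k], fit[k] = cand, f

            # Commensalism: organism i benefits from j; j is unaffected.
            j = pick_other(i)
            cand = clip([a + rng.uniform(-1, 1) * (b - c)
                         for a, b, c in zip(pop[i], best, pop[j])])
            f = objective(cand)
            if f < fit[i]:
                pop[i], fit[i] = cand, f

            # Parasitism: a mutated copy of i replaces j if it is fitter.
            j = pick_other(i)
            parasite = list(pop[i])
            for d in rng.sample(range(dim), rng.randint(1, dim)):
                parasite[d] = rng.uniform(*bounds[d])
            f = objective(parasite)
            if f < fit[j]:
                pop[j], fit[j] = parasite, f

    b = min(range(pop_size), key=fit.__getitem__)
    return pop[b], fit[b]


def toy_validation_error(params):
    # Hypothetical stand-in for "train a GRU, return validation error".
    units, dropout, log_lr = params
    return ((units - 64) / 64) ** 2 + (dropout - 0.2) ** 2 + (log_lr + 3) ** 2


# Assumed search ranges: hidden units, dropout rate, log10(learning rate).
BOUNDS = [(8, 256), (0.0, 0.5), (-5.0, -1.0)]
best_params, best_err = sos_minimize(toy_validation_error, BOUNDS)
```

In a real setting, the objective would train and validate a model for each candidate, so the population size and iteration count directly control the number of training runs; continuous values such as the unit count would also need rounding before use.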