
Graduate student: Husni Mubarok
Thesis title: Optical-Inspired Deep Machine Learning For Text Classification Of Contractual Risk Clause
Advisor: 鄭明淵 (Min-Yuan Cheng)
Oral defense committee: 曾惠斌 (Zeng Huibin), 謝佑明 (Yo-Ming Hsieh)
Degree: Master
Department: College of Engineering - Department of Civil and Construction Engineering
Year of publication: 2023
Academic year of graduation: 111
Language: English
Number of pages: 82
Keywords: Contractual Risk, Specification Document


Automated classification of contractual risk clauses plays a vital role in construction engineering, enabling efficient risk analysis and project management. This research investigates the fusion of BERT (Bidirectional Encoder Representations from Transformers) and BiGRU (Bidirectional Gated Recurrent Unit) architectures for text classification, using the classification of contractual risk clauses from construction specifications as a case study. BERT produces contextualized word embeddings, while BiGRU models sequential patterns. The proposed fused model combines these strengths to improve classification accuracy. In addition, the Optical Microscope Algorithm (OMA) searches for the best BERT and BiGRU parameters so that the fused model performs as well as possible. The model's effectiveness is evaluated using precision, recall, and F1-score. The results show that the hybrid OMA-BERT-BiGRU model classifies risk clause categories with an F1-score of 0.956 and outperforms several other machine learning models. Its advantages include enhanced representation and the ability to capture long-range dependencies. The research contributes to text classification in construction engineering and offers insights for automated risk analysis and management.
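The evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes single-label classification and a macro (unweighted) average of per-class F1; the record does not state which averaging the thesis uses, and the function names and clause labels below are hypothetical.

```python
from collections import Counter


def per_class_scores(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for single-label predictions."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted label was wrong
            fn[truth] += 1  # true label was missed
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
        scores[label] = (precision, recall, f1)
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (assumed averaging scheme)."""
    scores = per_class_scores(y_true, y_pred)
    if not scores:
        raise ValueError("no labels to score")
    return sum(f1 for _, _, f1 in scores.values()) / len(scores)


# Hypothetical clause categories, for illustration only.
y_true = ["payment", "payment", "delay", "delay", "safety"]
y_pred = ["payment", "delay", "delay", "delay", "safety"]
print(round(macro_f1(y_true, y_pred), 3))
```

Macro averaging weights every class equally, which is one common choice when class sizes are imbalanced; a support-weighted average would give a different number on the same predictions.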

ACKNOWLEDGEMENT
ABSTRACT
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
ABBREVIATIONS AND SYMBOLS
CHAPTER 1: INTRODUCTION
  1.1 Research Background
  1.2 Research Objective
  1.3 Scope and Assumptions
  1.4 Research Methodology
  1.5 Research Outline
CHAPTER 2: LITERATURE REVIEW
  2.1 Related Research Contractual Risk Classification
  2.2 Natural Language Processing
  2.3 Bidirectional Gated Recurrent Unit
  2.4 Bidirectional Encoder Representations from Transformers
  2.5 Fusion Feature
  2.6 Optical Microscope Optimization
  2.7 The Class Imbalance Problem
CHAPTER 3: MODEL CONSTRUCTION
  3.1 Conceptual Mechanism of Proposed Model
  3.2 Model Framework of OMA-BERT-BiGRU for Contractual Clause Classification
    3.2.1 Natural Language Processing Phase
    3.2.2 Data Partition
    3.2.3 Parameter Initialization
    3.2.4 BERT-BiGRU Classifier Phase
    3.2.5 Objective Function (Average F-1 Score)
    3.2.6 Optical Microscope Algorithm Searching
    3.2.7 Termination Criteria
    3.2.8 Optimized Inference Model
    3.2.9 Model Performance Evaluation
  3.3 Performance Evaluation Criteria
CHAPTER 4: MODEL IMPLEMENTATION AND VALIDATION
  4.1 Data Collection
  4.2 Data Preparation
    4.2.1 Stop Word Removal Process
    4.2.2 Tokenization Process
    4.2.3 Lemmatization Process
    4.2.4 Embedding Process
  4.3 OMA-BERT-BiGRU Implementation and Evaluation
    4.3.1 OMA-BERT-BiGRU Training Result
    4.3.2 OMA-BERT-BiGRU Testing Result
    4.3.3 Result Comparison with Other AI Models
CHAPTER 5: CONCLUSION
  5.1 Conclusion
  5.2 Recommendation
REFERENCES


Full-text release date: 2025/08/28 (off-campus network)
Full-text release date: 2025/08/28 (National Central Library: Taiwan theses and dissertations system)