
Graduate student: 蘇彥融 (Yen-Jung Su)
Thesis title: 以文字探勘技術應用在臺灣營造業重大職災案例
Using text mining approach to analyze construction site accidents in Taiwan
Advisor: 洪嫦闈 (Cathy C.W. Hung)
Oral examination committee: 呂守陞 (Sou-Sen Leu), 王翰翔 (Han-Hsiang Wang), 徐書謙 (Shu-Chang Hsu)
Degree: Master
Department: College of Engineering - Department of Civil and Construction Engineering
Year of publication: 2023
Academic year of graduation: 111
Language: Chinese
Pages: 112
Keywords (Chinese): 營造業, 職業災害, 文字探勘, LDA分析, 共現分析
Keywords (English): Construction Industry, Occupational Hazards, Text Mining, LDA Analysis, Co-occurrence Analysis

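The construction industry has long had the highest occupational fatality rate in Taiwan, and compared with other industries a single accident often causes severe loss of life and property. Past research on construction occupational accidents has mostly used data mining to analyze associations among accident characteristics. That approach is constrained by category attributes defined by the research team, which raises the concern that accident information cannot be mined completely, and by complex, tedious data processing. Scholars abroad have therefore begun applying text mining to construction occupational accidents, which can extract information without being confined to predefined category attributes. This study uses unsupervised text mining, which is rarely applied in Taiwan, to analyze major occupational accident reports from the construction industry. Key terms are mined from the report cases as accident factors, the analysis is then extended to their association with the violated legal provisions (parent statutes), and the key terms and potentially violated provisions are summarized for each accident type.
The sample consists of construction occupational accident report cases from 2018 to 2021: 337 cases of the "falls and slips" type, 45 of the "object collapse" type, and 34 of the "electrocution" type. According to the content extracted, the cases are divided into four corpora: (A) indirect causes & administrative measures, (B) indirect causes & criminal penalties, (C) basic causes & administrative measures, and (D) basic causes & criminal penalties. Text preprocessing (word segmentation, stop-word removal, and the tf-idf algorithm) removes textual noise and identifies key terms. An LDA topic model then clusters the texts, and a Jaccard text co-occurrence algorithm is used to focus on the terms with the strongest co-occurrence links within each topic (term). Finally, text co-occurrence network diagrams are drawn to discuss the linking rules between key terms and legal provisions (parent statutes).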
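The tf-idf step named in the preprocessing stage can be illustrated with a minimal sketch. It is not the thesis's implementation: the function name `tf_idf`, the plain `tf * log(N / df)` weighting, and the English stand-in tokens are all assumptions made for illustration, and the reports are assumed to be already segmented with stop words removed.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    docs is a list of token lists. The weighting is tf * log(N / df),
    one common tf-idf variant (an assumption; the record does not say
    which variant the thesis uses).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical, already-segmented reports (stand-ins for the Chinese tokens).
reports = [
    ["roof", "fall", "harness"],
    ["roof", "opening", "fall"],
    ["scaffold", "fall", "guardrail"],
]
weights = tf_idf(reports)
# Terms sorted from most to least distinctive for the first report.
ranked = sorted(weights[0], key=weights[0].get, reverse=True)
```

A term that appears in every report, such as "fall" here, gets weight zero, which is why tf-idf is useful for filtering out words that carry no distinguishing information within a single accident type.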


The construction industry consistently has the highest occupational fatality rate in Taiwan. Compared with other industries, a single construction accident may lead to severe loss of life and property. Previous research on construction occupational accidents has often relied on data mining to explore associations among accident characteristics. However, that approach is limited by categories predefined by the research team, which may leave accident information incompletely explored, and by the complexity of data processing.
In response, scholars abroad have applied text mining to construction occupational accidents in order to reveal more information. This study uses an unsupervised text mining approach to analyze reports of major construction accidents. Key terms are systematically extracted from these cases as occupational accident factors, and the analysis is then extended to the relationship between these terms and the legal provisions violated, summarizing the key terms and potential legal violations for each major type of construction accident.
The sample for this study consists of 337 accident report cases related to "falls and slips," 45 cases related to "object collapse," and 34 cases related to "electrocution" in the construction industry from 2018 to 2021. These cases are divided into four text sets: (A) indirect causes & administrative measures, (B) indirect causes & criminal penalties, (C) basic causes & administrative measures, and (D) basic causes & criminal penalties. Text preprocessing, including word segmentation, stop-word removal, and the tf-idf algorithm, is performed to remove textual noise and identify key terms. The LDA topic model is employed for text clustering, followed by a Jaccard text co-occurrence algorithm that identifies the terms with the strongest co-occurrence links within each topic (term). Additionally, text co-occurrence network diagrams are drawn to further explore the association between key terms and legal provisions. The outcomes are expected to provide valuable insights into the factors contributing to construction occupational accidents and their potential legal implications.
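The Jaccard co-occurrence step can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's code: co-occurrence is counted at the level of whole reports (the record does not say whether the thesis uses reports, sentences, or another window), and the function name `jaccard_cooccurrence`, the `min_score` parameter, and the English stand-in tokens are hypothetical.

```python
from itertools import combinations


def jaccard_cooccurrence(docs, min_score=0.0):
    """Return {(term_a, term_b): score} for term pairs scoring above min_score.

    docs is a list of token lists. For two terms A and B, the Jaccard
    coefficient is |D_A & D_B| / |D_A | D_B|, where D_X is the set of
    documents that contain X.
    """
    # Map each term to the set of document indices in which it occurs.
    occurrences = {}
    for idx, doc in enumerate(docs):
        for term in set(doc):
            occurrences.setdefault(term, set()).add(idx)

    scores = {}
    for a, b in combinations(sorted(occurrences), 2):
        both = len(occurrences[a] & occurrences[b])
        either = len(occurrences[a] | occurrences[b])
        score = both / either
        if score > min_score:
            scores[(a, b)] = score
    return scores


# Hypothetical, already-segmented reports (stand-ins for the Chinese tokens).
reports = [
    ["roof", "fall", "harness"],
    ["roof", "opening", "fall"],
    ["scaffold", "fall", "guardrail"],
]
edges = jaccard_cooccurrence(reports)
# Strongest links first; these pairs would become edges in a network diagram.
strongest = sorted(edges.items(), key=lambda item: item[1], reverse=True)
```

Each retained pair can be read as a weighted edge between two terms, which is the form a co-occurrence network diagram needs. Raising `min_score` keeps only the stronger links and thins the diagram.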

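Chinese Abstract
English Abstract
Acknowledgements
Chapter 1 Introduction
1.1 Research Background
1.2 Research Motivation and Objectives
1.3 Research Process and Scope
Chapter 2 Literature Review
2.1 Overview of Construction Occupational Accident Research in Taiwan and Abroad
2.1.1 Overview of Construction Occupational Accident Research in Taiwan and Related Literature
2.1.2 Literature on Construction Occupational Accidents Abroad
2.2 Text Mining
2.3 Summary
Chapter 3 Research Methods
3.1 Research Process and Framework
3.2 Data Collection
3.3 Text Preprocessing
3.4 LDA Topic Model Construction and Clustering
3.5 Text Co-occurrence Analysis
Chapter 4 Analysis of Results
4.1 "Falls and Slips" Accident Type
4.1.1 "Falls and Slips" Type: tf-idf Key Term Selection
4.1.2 "Falls and Slips" Type: Determining the Number of LDA Topics and Clustering
4.1.3 "Falls and Slips" Type: Text Co-occurrence Analysis
4.1.3.1 "Falls and Slips" Type: Roof Fall Topic
4.1.3.2 "Falls and Slips" Type: Opening Fall Topic
4.1.3.3 "Falls and Slips" Type: Scaffold Fall Topic
4.1.3.4 "Falls and Slips" Type: Other Fall Topic
4.2 "Object Collapse" Accident Type
4.2.1 "Object Collapse" Type: tf-idf Key Term Selection
4.2.2 "Object Collapse" Type: Text Co-occurrence Analysis
4.3 "Electrocution" Accident Type
4.3.1 "Electrocution" Type: tf-idf Key Term Selection
4.3.2 "Electrocution" Type: Text Co-occurrence Analysis
4.4 Conclusions of the Text Mining Analysis
4.5 Comparison of Data Mining and Text Mining Methods
Chapter 5 Conclusions and Recommendations
5.1 Conclusions
5.2 Recommendations for Future Research
References
Appendix: Text Co-occurrence Network Diagrams for Other Topics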


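Full-text release date: 2024/08/14 (campus network)
Full-text release date: 2024/08/14 (off-campus network)
Full-text release date: 2025/08/14 (National Central Library: National Digital Library of Theses and Dissertations in Taiwan)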