
Author: 蕭凱駿 (Kai-Chun Hsiao)
Thesis title: 應用集成方法於半導體蝕刻製程晶圓分類之研究
Application ensemble method for wafer classification in semiconductor etching process
Advisor: 王福琨 (Fu-Kwun Wang)
Oral defense committee: 李文義, 歐陽超, 朱道鵬
Degree: Master
Department: College of Management, Department of Industrial Management
Year of publication: 2019
Academic year of graduation: 107 (ROC calendar)
Language: Chinese
Number of pages: 76
Keywords: Imbalanced Data, Random forest, Multilayer perceptron, Ensemble method
  • In recent years, the Internet of Things, Industry 4.0, big data analytics, and smart manufacturing have flourished, accelerating the adoption of prognostics and health management. In the semiconductor industry, the losses incurred when any failure abruptly halts a production line are difficult to estimate. As semiconductor manufacturing processes grow increasingly complex, traditional inspection and control methods can no longer meet process requirements, so major semiconductor manufacturers have invested heavily in developing new monitoring systems, and advanced process control technology has been applied to process monitoring. This study addresses the classification of wafers in the etching process. Its main purpose is to provide an ensemble method built on a deep-learning multilayer perceptron (MLP) and a machine-learning random forest (RF) classifier, and to compare the resulting classification performance. The study uses the public data set of the SEMATECH J-88 project, collected at Texas Instruments Inc. and distributed by Eigenvector Research, Inc. The data set contains 129 wafers: 108 normal and 21 abnormal. The synthetic minority oversampling technique (SMOTE) is first applied to address the class imbalance, RF and MLP classifiers are then trained, and an ensemble combines the two. Performance is evaluated by TPR, FPR, and AUC. The ensemble achieves a TPR, FPR, and AUC of 99.85%, 3.67%, and 99.94%, respectively, all better than the corresponding values for the Whole Trace Stats, Manual Windowing, and Semi-automated FD approaches reported by Moyne and Iskandar (2017).
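The pipeline summarized in the abstract (oversample the minority class, score wafers with two classifiers, combine the scores, then report TPR, FPR, and AUC) can be sketched in plain Python. This is only an illustrative sketch under stated assumptions, not the thesis implementation: the record does not say how the two classifiers are combined, so simple probability averaging (soft voting) is assumed; the abnormal class is assumed to be the positive class; and the function names `smote`, `soft_vote`, `tpr_fpr`, and `auc`, along with the toy scores, are hypothetical.

```python
import random


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority samples (needs at least 2 real ones).

    Each synthetic point lies on the segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest minority neighbours of x (excluding x itself),
        # ranked by squared Euclidean distance.
        neighbours = sorted(
            (p for p in minority if p is not x),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)),
        )[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(x, nb)])
    return synthetic


def soft_vote(prob_a, prob_b):
    """Assumed ensemble rule: average the two abnormal-class probabilities."""
    return [(a + b) / 2 for a, b in zip(prob_a, prob_b)]


def tpr_fpr(y_true, scores, threshold=0.5):
    """True and false positive rates, with label 1 = abnormal wafer."""
    tp = fp = fn = tn = 0
    for y, s in zip(y_true, scores):
        predicted = 1 if s >= threshold else 0
        if y == 1 and predicted == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted == 1:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), fp / (fp + tn)


def auc(y_true, scores):
    """AUC as the fraction of (abnormal, normal) pairs ranked correctly."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy scores standing in for RF and MLP outputs on five wafers.
y_true = [1, 1, 0, 0, 0]
rf_prob = [0.9, 0.4, 0.2, 0.6, 0.1]
mlp_prob = [0.8, 0.7, 0.3, 0.3, 0.2]
ensemble_prob = soft_vote(rf_prob, mlp_prob)
print(tpr_fpr(y_true, rf_prob), tpr_fpr(y_true, ensemble_prob))
print(auc(y_true, ensemble_prob))
```

In this toy case the averaged scores classify every wafer correctly at a threshold of 0.5, while the stand-in RF scores alone miss one abnormal wafer and flag one normal wafer. In practice the oversampling would be applied to the training split only, so that synthetic samples do not leak into the evaluation data.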

    Abstract (Chinese)
    Abstract
    Acknowledgements
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1 Introduction
    1.1 Research Background
    1.2 Research Motivation
    1.3 Research Problem and Objectives
    1.4 Research Scope and Limitations
    1.5 Research Process
    Chapter 2 Literature Review
    2.1 Semiconductor Industry
    2.2 Etching Process
    2.3 Imbalanced Data
    2.4 Detection and Classification Methods
    2.4.1 Support Vector Machine
    2.4.2 Random Forest
    2.4.3 Guided Random Forest
    2.4.4 Weighted Subspace Random Forest
    2.4.5 Multilayer Perceptron Neural Network
    2.5 Confusion Matrix
    Chapter 3 Methodology
    3.1 Synthetic Minority Oversampling Technique
    3.2 Ensemble Learning Method
    3.3 Random Forest
    3.4 Multilayer Perceptron
    Chapter 4 Case Study
    4.1 Semiconductor Etching Process
    4.2 Data Description
    4.3.1 Support Vector Machine
    4.3.2 Random Forest
    4.3.3 Weighted Subspace Random Forest
    4.3.4 Guided Random Forest
    4.3.5 Multilayer Perceptron
    4.3.6 Ensemble Method
    Chapter 5 Conclusion
    References
    Appendices
    Appendix A Machine Learning
    Appendix B Multilayer Perceptron
    Appendix C Multilayer Perceptron and Random Forest

    Amit, Y., & Geman, D. (1997). Shape quantization and recognition with randomized trees. Neural Computation, 9(7), 1545-1588.
    An, D., Ko, H. H., Gulambar, T., Kim, J., Baek, J. G., & Kim, S. S. (2009). A semiconductor yields prediction using stepwise support vector machine. In 2009 IEEE International Symposium on Assembly and Manufacturing (pp. 130-136).
    Baly, R., & Hajj, H. (2012). Wafer classification using support vector machines. IEEE Transactions on Semiconductor Manufacturing, 25(3), 373-383.
    Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5-32.
    Breiman, L., Friedman, J., Stone, C. J., & Olshen, R. A. (1984). Classification and Regression Trees. CRC Press.
    Chao, J., Di, Y., Moyne, J., Iskandar, J., Hao, H., Schulze, B., Armacost, M., & Lee, J. (2018). Extensible framework for pattern recognition-augmented feature extraction (PRAFE) in robust prognostics and health monitoring. Working paper.
    Deng, H. (2013). Guided random forest in the RRF package. arXiv preprint arXiv:1306.0237.
    Deng, H., & Runger, G. (2013). Gene selection with guided regularized random forest. Pattern Recognition, 46(12), 3483-3489.
    Dietterich, T. G. (2000). Ensemble methods in machine learning. In International Workshop on Multiple Classifier Systems (pp. 1-15). Springer, Berlin, Heidelberg.
    Fletcher, R. (2013). Practical methods of optimization. John Wiley & Sons.
    Guo, L., Chehata, N., Mallet, C., & Boukir, S. (2011). Relevance of airborne lidar and multispectral image data for urban scene classification using random forests. ISPRS Journal of Photogrammetry and Remote Sensing, 66(1), 56-66.
    Gupta, P., Doermann, D., & DeMenthon, D. (2002). Beam search for feature selection in automatic SVM defect classification. In Object Recognition Supported by User Interaction for Service Robots (Vol. 2, pp. 212-215).
    Han, K., Kim, S., Park, K. J., Yoon, E. S., & Chae, H. (2008). Principal component analysis based support vector machine for the end point detection of the metal etch process. IFAC Proceedings Volumes, 41(2), 4560-4565.
    Hanley, J. A., & McNeil, B. J. (1982). The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology, 143(1), 29-36.
    Ho, T. K. (1998, August). Nearest neighbors in random subspaces. In Joint IAPR International Workshops on Statistical Techniques in Pattern Recognition (SPR) and Structural and Syntactic Pattern Recognition (SSPR) (pp. 640-648). Springer, Berlin, Heidelberg.
    Jüngel, A. (2009). Transport equations for semiconductors (Vol. 773). Springer.
    Kim, J., Han, Y., & Lee, J. (2016). Data imbalance problem solving for smote based oversampling: Study on fault detection prediction model in semiconductor manufacturing process. Advanced Science and Technology Letters, 133, 79-84.
    Schölkopf, B. (2001). Statistical learning and kernel methods. In Data Fusion and Perception (pp. 3-24). Springer, Vienna.
    Kwok, S. W., & Carter, C. (1990). Multiple decision trees. In Machine Intelligence and Pattern Recognition (Vol. 9, pp. 327-335). North-Holland.
    Moyne, J., & Iskandar, J. (2017). Big data analytics for smart manufacturing: Case studies in semiconductor manufacturing. Processes, 5(3), 39.
    Provost, F., & Kohavi, R. (1998). Glossary of terms. Journal of Machine Learning, 30(2-3), 271-274.
    Puggini, L., Doyle, J., & McLoone, S. (2015). Fault detection using random forest similarity distance. IFAC-PapersOnLine, 48(21), 583-588.
    Quinlan, J. R. (1993). C4.5: programs for machine learning. Elsevier.
    Quirk, M., & Serda, J. (2001). Semiconductor manufacturing technology (Vol. 1). Upper Saddle River, NJ: Prentice Hall.
    Rosenblatt, F. (1958). The perceptron: A theory of statistical separability in cognitive systems. United States Department of Commerce.
    Shul, R. J., & Pearton, S. J. (2011). Handbook of advanced plasma processing techniques. Springer Science & Business Media.
    Storn, R., & Price, K. (1997). Differential evolution – a simple and efficient heuristic for global optimization over continuous spaces. Journal of Global Optimization, 11(4), 341-359.
    Sumsion, G. R., Bradshaw, M. S., Hill, K. T., Pinto, L. D., & Piccolo, S. R. (2018). Remote sensing tree classification with a multilayer perceptron. PeerJ, 7, e6101.
    Tan, S. C., Wang, S., & Watada, J. (2018). A self-adaptive class-imbalance TSK neural network with applications to semiconductor defects detection. Information Sciences, 427, 1-17.
    Uddin, M. T., & Uddiny, M. A. (2015). A guided random forest based feature selection approach for activity recognition. In 2015 International Conference on Electrical Engineering and Information Communication Technology (ICEEICT) (pp. 1-6).
    Vapnik, V. (1995). The nature of statistical learning theory. Springer Science & Business Media.
    Williams, P. F. (2013). Plasma processing of semiconductors (Vol. 336). Springer Science & Business Media.
    Xu, B., Huang, J. Z., Williams, G., Wang, Q., & Ye, Y. (2012). Classifying very high-dimensional data with random forests built from small subspaces. International Journal of Data Warehousing and Mining (IJDWM), 8(2), 44-63.
    Zhang, J., & Zulkernine, M. (2006). Anomaly based network intrusion detection with unsupervised outlier detection. In 2006 IEEE International Conference on Communications (Vol. 5, pp. 2388-2393).
    Zhao, H., Williams, G. J., & Huang, J. Z. (2017). WSRF: An R package for classification with scalable weighted subspace random forests. Journal of Statistical Software, 77, 1-30.
    Online sources
    Eigenvector Research, Inc. (1999). URL: https://eigenvector.com/resources/data-sets/. Accessed: 2018-08-12.
    科普雜誌 (2017). URL: https://kopu.chat/2017/03/24/ic-terms/. Accessed: 2019-04-25.
    產業價值鏈資訊平台 (Industry Value Chain Information Platform) (2019). URL: https://ic.tpex.org.tw/introduce.php?ic=D000. Accessed: 2019-05-01.

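    Full-text release date: 2024/07/12 (campus network)
    Full-text release date: full text not authorized for public release (off-campus network)
    Full-text release date: full text not authorized for public release (National Central Library: National Digital Library of Theses and Dissertations in Taiwan)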