
Graduate Student: Chi-Yuan Liu (劉繼遠)
Thesis Title: Anomalous Railway Signaling Logs Detection with Deep Convolutional Autoencoder
Advisor: Cheng-Hsiung Yang (楊振雄)
Oral Defense Committee: Chin-Sheng Chen (陳金聖), Chang-Shi Wu (吳長熙), Yong-Lin Kuo (郭永麟), Cheng-Hsiung Yang (楊振雄)
Degree: Master
Department: College of Engineering, Graduate Institute of Automation and Control
Year of Publication: 2021
Academic Year of Graduation: 109 (2020-2021)
Language: Chinese
Number of Pages: 83
Keywords: anomaly detection, outlier detection, deep learning, ensemble learning, railway signaling
Hits: 234; Downloads: 2

This thesis uses a deep convolutional autoencoder (DCAE) to analyze the event logs of a railway signaling system, drawing on information such as train location, speed, and acceleration to detect anomalous records (slip/slide events and vehicle performance errors). A series of experiments tunes the model and examines whether the DCAE can identify the records logged during slip/slide with high accuracy even though the data are discontinuous and lack more detailed information such as axle speeds and acceleration bounds.
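As an illustration of the model class, the following is a minimal sketch of a 1D convolutional autoencoder in Keras. The window length, feature count, and layer sizes are illustrative assumptions, not the architecture used in the thesis.

```python
# A minimal 1D convolutional autoencoder sketch (illustrative shapes only).
import tensorflow as tf
from tensorflow.keras import layers, models

WINDOW, FEATURES = 32, 8  # hypothetical sample shape: 32 time steps x 8 signals

def build_dcae() -> tf.keras.Model:
    inp = layers.Input(shape=(WINDOW, FEATURES))
    # Encoder: strided convolutions compress the window into a bottleneck
    x = layers.Conv1D(16, 3, strides=2, padding="same", activation="relu")(inp)
    x = layers.Conv1D(8, 3, strides=2, padding="same", activation="relu")(x)
    # Decoder: transposed convolutions restore the original shape
    x = layers.Conv1DTranspose(8, 3, strides=2, padding="same", activation="relu")(x)
    x = layers.Conv1DTranspose(16, 3, strides=2, padding="same", activation="relu")(x)
    out = layers.Conv1D(FEATURES, 3, padding="same", activation="linear")(x)
    model = models.Model(inp, out)
    model.compile(optimizer="adam", loss="mae")  # train to reconstruct inputs
    return model

dcae = build_dcae()
dcae.summary()
```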
Because the event logs are unlabeled, the thesis first explains the meaning of each value in the logs and the conditions used to label anomalous (slip/slide and vehicle performance error) samples. The labeled anomalies are then removed from the training set so that the DCAE learns the appearance of normal samples only, a semi-supervised approach. To help the training and validation losses converge quickly, both the training and test sets are normalized before being fed to the DCAE for training or prediction, and the differences between normalization methods are compared.
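The setup described above can be sketched as follows. The feature array, the rule-based labels, and the choice of StandardScaler are hypothetical stand-ins for the log fields, labeling conditions, and normalization methods compared in the thesis.

```python
# Semi-supervised setup sketch: train only on samples not labeled anomalous,
# and fit the scaler on the training data before transforming both sets.
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 8))         # stand-in for decoded log features
is_anomaly = rng.random(1000) < 0.002  # stand-in for rule-based anomaly labels

X_train = X[~is_anomaly]               # the DCAE sees normal samples only

scaler = StandardScaler().fit(X_train) # fit on training data only, then apply
X_train_norm = scaler.transform(X_train)
X_test_norm = scaler.transform(X)      # the test set keeps its anomalies
```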
The DCAE reconstructs each input sample from the experience accumulated on the training set, and anomalies are detected by measuring the error between its input and output. Two error measures are used: the mean absolute error (MAE) and the Mahalanobis distance. The former is intuitive: the larger the difference between input and output, the larger the error. The latter is a standard statistical method well suited to multivariate data; the farther a sample lies from the cluster center, the more likely it is an anomaly (outlier).
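Both scores can be sketched as below, assuming one row per sample and a robust minimum covariance determinant fit (scikit-learn's MinCovDet) for the Mahalanobis estimate; the arrays are synthetic stand-ins for DCAE inputs and reconstructions.

```python
# Two anomaly scores from reconstruction errors: per-sample MAE and
# Mahalanobis distance from the center of the error distribution.
import numpy as np
from sklearn.covariance import MinCovDet

rng = np.random.default_rng(1)
x = rng.normal(size=(500, 16))                   # stand-in inputs
x_hat = x + rng.normal(scale=0.1, size=x.shape)  # stand-in reconstructions

# Score 1: mean absolute error per sample
mae = np.mean(np.abs(x - x_hat), axis=1)

# Score 2: robust Mahalanobis distance of each error vector
err = x - x_hat
mcd = MinCovDet(random_state=0).fit(err)
mahal = np.sqrt(mcd.mahalanobis(err))  # mahalanobis() returns squared distances

# Under either score, larger values are more likely anomalies (outliers)
```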
In each test set used in the experiments, anomalies make up less than 0.2% of the samples. The DCAE achieves 92% accuracy if a false positive rate of roughly 20% is acceptable. When the fraction of samples flagged as anomalous is limited to about 0.3% by empirical rules, the accuracy is about 80%.
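Limiting the flagged fraction can be read as a percentile threshold on the anomaly score. A sketch with synthetic scores, where the 0.3% cap is the only figure taken from the text:

```python
# Percentile thresholding sketch: flag roughly the top 0.3% of scores.
import numpy as np

rng = np.random.default_rng(2)
scores = rng.gamma(2.0, size=10_000)        # stand-in anomaly scores

threshold = np.quantile(scores, 1 - 0.003)  # leave ~0.3% above the threshold
flagged = scores > threshold
print(f"flagged {flagged.sum()} of {scores.size} samples")
```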

Abstract (Chinese)
Abstract (English)
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
  1.1 Foreword
  1.2 Literature Review
  1.3 Research Motivation
  1.4 Thesis Outline
Chapter 2 Railway Signaling Systems
  2.1 Automatic Train Control and the CBTC System in the IEEE Standard
  2.2 Signaling System Architecture
  2.3 Automatic Train Protection
  2.4 Automatic Train Supervision (ATS) and Automatic Train Operation (ATO)
  2.5 Track Network Diagram Representation
Chapter 3 Deep Learning Networks
  3.1 Autoencoders
  3.2 Deep Convolutional Autoencoders
  3.3 Identifying Anomalous Samples
    3.3.1 Mean Absolute Error
    3.3.2 Mahalanobis Distance
    3.3.3 Determining the Threshold
Chapter 4 Event Log Characteristics and Preprocessing
  4.1 Event Log Contents
  4.2 Event Preprocessing
  4.3 Labeling Known Anomalous Samples
  4.4 Selecting Data Samples
  4.5 Converting String and Enumeration Data
  4.6 Normalization
Chapter 5 Experiments and Results
  5.1 Performance Metrics
  5.2 Experimental Procedure and Results
    5.2.1 Network Capacity
    5.2.2 Loss Functions
    5.2.3 Activation Functions
    5.2.4 Learning Rate
    5.2.5 Batch Normalization and Dropout
Chapter 6 Conclusions
  6.1 Analysis of Experimental Results
  6.2 Conclusions and Future Work
References


Full-Text Release Date: Full text not authorized for public access (off-campus network)
Full-Text Release Date: Full text not authorized for public access (National Central Library: Taiwan NDLTD system)