
Author: 朱璟軒 (Ching-Hsuan Boris Chu)
Thesis Title: 一個基於深度學習之刑度預測模型—以臺灣地區竊盜案件為例 (A Deep Learning-based Legal Penalty Prediction Model—Taking Offense of Larceny in Taiwan as an Example)
Advisor: 范欽雄 (Chin-Shyurng Fahn)
Committee Members: 王榮華 (Jung-Hua Wang), 林啟芳 (Chi-Fang Lin), 馮輝文 (Huei-Wen Ferng), 范欽雄 (Chin-Shyurng Fahn)
Degree: Master
Department: Department of Computer Science and Information Engineering, College of Electrical Engineering and Computer Science
Publication Year: 2021
Graduation Academic Year: 109 (ROC calendar, i.e., 2020–2021)
Language: English
Pages: 61
Keywords (Chinese): 刑度預測, 深度學習, 機器學習, 注意力機制, 學習可視化, 文本處理, 卷積神經網路, 雙向門控制循環神經網路
Keywords (English): Legal Penalty Prediction, Deep Learning, Machine Learning, Attention Mechanism, Learning Visualization, Text Processing, Convolutional Neural Network, Bidirectional Gated Recurrent Unit
Views: 286; Downloads: 0
  由於最近幾年來的刑度案件數量不斷地增長,且在法官的數量無法跟隨增加之下,法官的判決壓力,可以說是愈來愈大,何況法官的判決也常與平民的想法有所差距;而鑑於近年來人工智慧技術的發展,也開始有關於人工智慧在法律領域中的研究。我們從臺灣各級法院的判決書中,讀取「事實及理由」的部分,並把法院的判決結果當作標籤,讓模型學習如何進行刑度的判決。藉此訓練出來的模型,可讓使用者進行案件的刑度預測;對於法官來說,可以當作一個參考,而有效地減少處理案件所需要的人力與時間,對於平民而言,是與司法之間的一座橋梁,可以縮短與法律之間的距離。
    本文提出一個刑度預測系統,首先對各級法院的判決書進行文本清理及篩選,再進行中文斷詞,並把這些斷詞後的資料轉換成詞向量,成為模型可以讀取的資料,來訓練深度學習模型;在學習的過程中,我們需要藉由學習的結果逐步地調整參數,以達到最好的表現。實驗結果表明我們最好的模型使用了文本分類卷積神經網路模型(TextCNN),在竊盜案件上可以達到76.22%的準確度,並且在執行時間上可以達到每一筆預測只需9.43ms。除此之外,藉由雙向門控制循環神經網路模型(Bi-GRU)加上注意力機制,我們可以將原本無法讓人們理解的深度學習過程,利用可視化的方式,不僅可以確認學習的結果,也能讓使用者對於判決系統所做出的判決,有一個能信服的依據。
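The pre-processing pipeline described in the abstract (clean the judgment text, segment it into words, then convert words into data a model can read) can be illustrated with a minimal, self-contained sketch. All helper names here are hypothetical; a whitespace tokenizer stands in for the thesis's Chinese word segmenter, and toy integer indices stand in for trained word vectors.

```python
import re

# Reserved indices for padding and out-of-vocabulary tokens (an assumption
# of this sketch, not the thesis's actual encoding scheme).
PAD, UNK = 0, 1

def clean(text: str) -> str:
    """Drop punctuation/markup and collapse whitespace (the cleaning step)."""
    return re.sub(r"\s+", " ", re.sub(r"[^0-9A-Za-z\u4e00-\u9fff ]+", " ", text)).strip()

def segment(text: str) -> list[str]:
    """Stand-in for a Chinese word segmenter: split on whitespace."""
    return text.split()

def build_vocab(docs: list[list[str]]) -> dict[str, int]:
    """Assign each distinct token an index, keeping 0 and 1 reserved."""
    vocab: dict[str, int] = {}
    for doc in docs:
        for tok in doc:
            vocab.setdefault(tok, len(vocab) + 2)
    return vocab

def encode(tokens: list[str], vocab: dict[str, int], max_len: int = 6) -> list[int]:
    """Map tokens to indices, truncate, and pad to a fixed model input length."""
    ids = [vocab.get(t, UNK) for t in tokens[:max_len]]
    return ids + [PAD] * (max_len - len(ids))

docs = [segment(clean("the defendant stole a bicycle!")),
        segment(clean("the defendant entered the shop"))]
vocab = build_vocab(docs)
print(encode(docs[0], vocab))  # a fixed-length index sequence ready for an embedding layer
```

In a real system each index would then be looked up in a pre-trained embedding matrix; reserving index 0 for padding lets the model mask out the padded positions.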


    Due to the continuous increase in the number of criminal cases in recent years, and the fact that the number of judges cannot keep pace with it, the pressure on judges can be said to be ever increasing; moreover, judges' decisions often differ from ordinary people's expectations. In view of the recent development of artificial intelligence technology, research on applying artificial intelligence to the legal field has also begun. We read the "Facts and Reasons" (事實及理由) section from the judgments of courts at all levels in Taiwan and use the courts' verdicts as labels, letting the model learn how to render a penalty judgment. The model trained in this manner allows users to predict the legal penalty of a case. For judges, it can serve as a reference that effectively reduces the manpower and time required to handle cases; for ordinary people, it serves as a bridge to the judiciary that can shorten the distance between them and the law.
    This thesis proposes a legal penalty prediction system. First, it cleans and filters the judgments of courts at all levels, then performs Chinese word segmentation and converts the segmented data into word vectors that the deep learning model can read for training. During learning, we adjust the parameters step by step according to the learning results to achieve the best performance. The experimental results reveal that our best model uses a text-classification convolutional neural network (TextCNN), which achieves 76.22% accuracy on larceny cases with an execution time of only 9.43 ms per prediction. In addition, with a Bidirectional Gated Recurrent Unit (Bi-GRU) equipped with an attention mechanism, we can visualize the otherwise opaque deep learning process, which not only confirms the learning results but also gives users a convincing basis for the judgments the system makes.
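How an attention mechanism yields the interpretable per-word weights used for visualization can be sketched as follows. The token strings, raw scores, and helper names below are hypothetical stand-ins; in the thesis the scores would come from a trained Bi-GRU attention layer, not be hand-picked.

```python
import math

def softmax(scores: list[float]) -> list[float]:
    """Normalize raw attention scores into weights that sum to 1."""
    m = max(scores)                              # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(hidden_states: list[list[float]], weights: list[float]) -> list[float]:
    """Weighted sum of per-token hidden states, yielding one context vector."""
    dim = len(hidden_states[0])
    return [sum(w * h[i] for w, h in zip(weights, hidden_states)) for i in range(dim)]

tokens = ["竊取", "腳踏車"]   # hypothetical segmented tokens from a judgment
scores = [2.0, 0.5]           # hypothetical pre-softmax attention scores
weights = softmax(scores)
context = attend([[1.0, 0.0], [0.0, 1.0]], weights)
for t, w in zip(tokens, weights):
    # Shading each word by its weight is exactly the visualization idea:
    # heavily weighted words are the ones the model attended to.
    print(f"{t}: {w:.2f}")
```

Because the weights form a probability distribution over the input words, coloring each word by its weight gives users a human-readable account of which parts of the "Facts and Reasons" text drove the prediction.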

    Table of Contents: 中文摘要 (Chinese Abstract); Abstract; 誌謝 (Acknowledgements); List of Figures; List of Tables
    Chapter 1 Introduction: 1.1 Overview; 1.2 Motivation; 1.3 System Description; 1.4 Thesis Organization
    Chapter 2 Related Work: 2.1 Early LJP Using Machine Learning; 2.2 Recent LJP Using Deep Learning; 2.3 Online System
    Chapter 3 Data Pre-processing: 3.1 Data Acquisition and Cleaning; 3.2 Word Segmentation; 3.3 Word Embedding
    Chapter 4 Legal Penalty Prediction Method: 4.1 Artificial Neural Network; 4.2 Bidirectional Gated Recurrent Unit; 4.3 Transformer; 4.4 BERT; 4.5 CNN for Text Classification (4.5.1 CNN; 4.5.2 TextCNN); 4.6 Legal Penalty Prediction Model
    Chapter 5 Experimental Results and Discussions: 5.1 Experimental Environment Setup; 5.2 Larceny Dataset; 5.3 The Results of Legal Penalty Prediction (5.3.1 The Prediction Results; 5.3.2 The Prediction Results from Bi-GRU with Attention Visualization)
    Chapter 6 Conclusions and Future Work: 6.1 Conclusions; 6.2 Future Work

    Full text release dates: 2024/08/30 (campus network); 2026/08/30 (off-campus network); 2031/08/30 (National Central Library, Taiwan NDLTD system)