Graduate student: | 游景翔 Ching-Hsiang Yu |
---|---|
Thesis title: | 混合式電腦程式抄襲偵測 Hybrid Plagiarism Detection in Computer Programs |
Advisor: | 林彥君 Yen-Chun Lin |
Oral defense committee: | 黃為德 Wei-Te Huang, 陳恭 Kung Chen, 林伯慎 Bor-Shen Lin, 吳怡樂 Yi-Leh Wu |
Degree: | Master |
Department: | College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering |
Year of publication: | 2007 |
Academic year of graduation: | 95 |
Language: | Chinese |
Number of pages: | 46 |
Chinese keywords: | 程式碼 (source code), 抄襲偵測 (plagiarism detection), 相似度 (similarity) |
English keywords: | program plagiarism, plagiarism detection, program similarity |
Plagiarism of computer programs is reported frequently, and a plagiarized program is usually rewritten from the original in ways that leave its behavior unchanged. To address this problem, this thesis proposes a new approach to measuring the similarity of two programs that combines two existing techniques: tree matching-based comparison and string matching-based comparison. A tree matching algorithm first finds the similar structures in the two programs; a string matching algorithm then compares the sequential statements inside those similar structures. Each pair of functions from the two programs is compared in this way, and the similarity of a function pair is computed from its structure similarity and its sequential-statement similarity. We also built a comparison tool based on this approach. Experiments show that it finds the similar functions in two programs and saves much of the time and labor that manual comparison requires.
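The two-stage idea in the abstract can be illustrated with a minimal sketch. Everything concrete in it is an assumption made for illustration, not the thesis's actual method: the `Node` representation, the greedy top-down tree matching in `match_structures`, the use of Python's `difflib.SequenceMatcher` as a stand-in for the string-matching step, and the equal weighting in `function_similarity` are all hypothetical choices. Statements are assumed to be already normalized into tokens such as `"ASSIGN"` or `"CALL"`.

```python
from dataclasses import dataclass, field
from difflib import SequenceMatcher


@dataclass
class Node:
    """One control structure (e.g. 'func', 'if', 'for') inside a function."""
    kind: str
    statements: list = field(default_factory=list)  # sequential statements directly inside
    children: list = field(default_factory=list)    # nested control structures


def size(node):
    """Count the structure nodes in a subtree."""
    return 1 + sum(size(child) for child in node.children)


def match_structures(a, b):
    """Greedy top-down tree matching (an assumed, simplified algorithm).

    Two nodes match only if they have the same kind. Each child of `a`
    is then paired with the unused child of `b` that yields the most
    matched nodes. Returns a list of (node_a, node_b) pairs.
    """
    if a.kind != b.kind:
        return []
    pairs = [(a, b)]
    used = set()
    for child_a in a.children:
        best, best_index = [], None
        for j, child_b in enumerate(b.children):
            if j in used:
                continue
            candidate = match_structures(child_a, child_b)
            if len(candidate) > len(best):
                best, best_index = candidate, j
        if best_index is not None:
            used.add(best_index)
            pairs.extend(best)
    return pairs


def statement_similarity(xs, ys):
    """String-matching step: similarity of two statement sequences in [0, 1]."""
    if not xs and not ys:
        return 1.0
    return SequenceMatcher(None, xs, ys, autojunk=False).ratio()


def function_similarity(a, b, structure_weight=0.5):
    """Combine structure similarity and sequential-statement similarity.

    The weighting scheme is hypothetical; the abstract only says that
    both similarities contribute to the function similarity.
    """
    pairs = match_structures(a, b)
    if not pairs:
        return 0.0
    structure = 2 * len(pairs) / (size(a) + size(b))
    sequential = sum(
        statement_similarity(x.statements, y.statements) for x, y in pairs
    ) / len(pairs)
    return structure_weight * structure + (1 - structure_weight) * sequential
```

Because children are paired by best match instead of by position, this sketch is insensitive to reordering sibling blocks, one of the simple rewriting tricks a plagiarist might use. A real tool would also need a parser to build the trees and a normalization pass for identifiers, both of which are outside this sketch.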