
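Author: 游景翔 (Ching-Hsiang Yu)
Thesis title: 混合式電腦程式抄襲偵測 (Hybrid Plagiarism Detection in Computer Programs)
Advisor: 林彥君 (Yen-Chun Lin)
Committee members: 黃為德 (Wei-Te Huang), 陳恭 (Kung Chen), 林伯慎 (Bor-Shen Lin), 吳怡樂 (Yi-Leh Wu)
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Computer Science and Information Engineering
Year of publication: 2007
Academic year of graduation: 95 (ROC calendar)
Language: Chinese
Number of pages: 46
Keywords (Chinese): 程式碼, 抄襲偵測, 相似度 (program code, plagiarism detection, similarity)
Keywords (English): program plagiarism, plagiarism detection, program similarity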
    Plagiarism in computer programs is common, and a copied program is usually disguised with rewriting techniques that alter its text without changing its behavior. This thesis proposes a new approach to measuring the similarity of two programs that combines two existing techniques: tree matching-based comparison and string matching-based comparison. A tree-matching algorithm first identifies similar structures in the two programs; a string-matching algorithm then compares the sequential statements within those similar structures. The approach is applied to each pair of functions drawn from the two programs, and the similarity of a function pair is computed from its structure similarity and its sequential-statement similarity. A comparison tool built on this approach was evaluated experimentally: it finds the similar functions in two programs and saves much of the time and labor that manual comparison requires.
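
    The two-stage idea in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the `Node` representation, the function names, the top-down child alignment via `difflib`, the simplified greedy string tiling, and the equal weighting of the two scores are all assumptions made for illustration. The thesis's own definition of similarity is not reproduced in this record.

```python
from dataclasses import dataclass, field
from difflib import SequenceMatcher


@dataclass
class Node:
    """Hypothetical tree node: a control structure plus its own statements."""
    kind: str                                      # e.g. "function", "if", "while"
    tokens: list = field(default_factory=list)     # tokenized sequential statements
    children: list = field(default_factory=list)   # nested structures


def size(node):
    """Number of nodes in the tree rooted at node."""
    return 1 + sum(size(c) for c in node.children)


def match_trees(x, y, pairs):
    """Top-down structural match; returns matched node count, fills pairs."""
    if x.kind != y.kind:
        return 0
    pairs.append((x, y))
    matched = 1
    # Assumption: align children in order by their kind labels.
    sm = SequenceMatcher(None, [c.kind for c in x.children],
                         [c.kind for c in y.children], autojunk=False)
    for block in sm.get_matching_blocks():
        for k in range(block.size):
            matched += match_trees(x.children[block.a + k],
                                   y.children[block.b + k], pairs)
    return matched


def greedy_string_tiling(a, b, min_match=1):
    """Simplified greedy string tiling: tokens of a covered by tiles shared with b."""
    marked_a, marked_b = [False] * len(a), [False] * len(b)
    covered = 0
    while True:
        max_len, matches = 0, []
        for i in range(len(a)):
            for j in range(len(b)):
                k = 0
                while (i + k < len(a) and j + k < len(b)
                       and a[i + k] == b[j + k]
                       and not marked_a[i + k] and not marked_b[j + k]):
                    k += 1
                if k > max_len:
                    max_len, matches = k, [(i, j)]
                elif k == max_len and k > 0:
                    matches.append((i, j))
        if max_len < min_match:
            return covered
        for i, j in matches:
            # Skip a candidate that overlaps a tile placed earlier in this round.
            if any(marked_a[i:i + max_len]) or any(marked_b[j:j + max_len]):
                continue
            for k in range(max_len):
                marked_a[i + k] = marked_b[j + k] = True
            covered += max_len


def token_similarity(a, b, min_match=1):
    """Fraction of tokens in both sequences covered by tiles."""
    if not a and not b:
        return 1.0
    return 2 * greedy_string_tiling(a, b, min_match) / (len(a) + len(b))


def function_similarity(f, g, weight=0.5):
    """Assumed combination: weighted mean of structure and statement scores."""
    pairs = []
    matched = match_trees(f, g, pairs)
    structure = 2 * matched / (size(f) + size(g))
    statement = (sum(token_similarity(x.tokens, y.tokens) for x, y in pairs)
                 / len(pairs)) if pairs else 0.0
    return weight * structure + (1 - weight) * statement
```

    In this sketch, tokens are assumed to be already normalized (for example, every identifier replaced by `id`), so renaming variables does not change the score. Tiling also tolerates reordered statements: `token_similarity(["x", "y", "z", "p", "q"], ["p", "q", "x", "y", "z"])` covers every token. Practical tools typically use a `min_match` larger than 1 to suppress coincidental single-token matches.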

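    Abstract (Chinese)
    Abstract (English)
    Acknowledgments
    Table of Contents
    List of Figures and Tables
    Chapter 1 Introduction
      1.1 Motivation and Objectives
      1.2 Plagiarism of Program Code
      1.3 Related Work
      1.4 Thesis Organization
    Chapter 2 Program Plagiarism and Detection Techniques
      2.1 Common Plagiarism Techniques
      2.2 Existing Plagiarism Detection Techniques
        2.2.1 Fingerprint-Based
        2.2.2 String Matching-Based
        2.2.3 Tree Structure Matching-Based
        2.2.4 Other Related Work
      2.3 A New Comparison Approach
    Chapter 3 Hybrid Comparison
      3.1 Tree Structure Representation
      3.2 Comparing Program Structure
      3.3 Comparing Sequential Statements
      3.4 Definition of Similarity
    Chapter 4 Comparison Algorithms
      4.1 Node Structure
      4.2 Building and Comparing Tree Structures
      4.3 Tokenization
      4.4 The GST Algorithm
    Chapter 5 The Comparison Tool and Examples
      5.1 A Usage Example of the Comparison Tool
      5.2 Another Example
    Chapter 6 Discussion
    Chapter 7 Conclusions and Future Directions
    References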

