
Author: 洪兆怡 (Chao-Yi Hung)
Thesis title: A Study of the Keys to Better Post-Editing in Machine-Aided Human Translation (提高機器輔助翻譯後編譯效率與品質之關鍵探究)
Advisor: 陳献忠 (Shian-Jung Chen)
Oral defense committee: 鄧慧君 (Hui-Chun Teng), 鄭錦桂 (Chin-Kuei Cheng)
Degree: Master
Department: College of Humanities and Social Sciences - Department of Applied Foreign Languages
Year of publication: 2016
Academic year of graduation: 104
Language: English
Number of pages: 189
Chinese keywords: 機器翻譯, 谷哥翻譯工具包, 機器輔助人工翻譯, 後編譯, 格位, 格框語法, 語言剖析器
English keywords: machine translation, Google Translate, machine-assisted human translation, post-editing, case grammar, language parser

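Machine translation has been under development for more than 60 years, yet it still cannot replace human translation. As translation memory and automatic translation platforms have matured, a translation mode has emerged that pairs machine translation with human post-editing. Compared with earlier practice, this mode raises translation output while improving translation quality, and it has therefore made translation companies and professional translators more willing to use machine translation. However, the errors in machine-translated output make post-editing time-consuming and labor-intensive, and the quality of the resulting translation is unstable.
To investigate why post-editing is time-consuming and labor-intensive and why translation quality is unstable, and to find ways to further improve post-editing efficiency, this study poses four questions: (1) What is the main cause that makes machine-translated text hard to read? (2) What is the bottleneck in the efficiency of post-editing in machine-aided human translation? (3) What kind of training do post-editors need in order to sort out misplaced elements in machine-translated text? (4) What kind of help can a language parser provide in post-editing training for machine-translated text?
The study uses Google Translate (GT) as the machine translation model. Five different types of text were fed into GT to produce translations, and the source texts were then parsed by a language parser with built-in linguistic grammar. The two sets of results were compared to identify where the GT output diverged from the meaning of the source text and where it was wrong, and these divergences and errors were then summarized into a 40-point checklist of post-editing strategies.
The study finds that: (1) the main cause that makes machine-translated text hard to read is misplaced word order in the output (reordering); (2) once the word order of the output is scrambled, post-editors must spend a great deal of time rearranging it, which hurts efficiency and becomes the biggest bottleneck; (3) the training post-editors need most is the ability to see that the scrambling stems from the number of events and the relations between events (relation-events), together with a stronger grounding in linguistic theory and machine translation concepts. The study therefore proposes 19 training items that help post-editors see at a glance why the output is misplaced; (4) by parsing the source texts with a language parser, the study derives a checklist of 40 points, seven guidelines, and a training program of 19 items targeting post-editing skills. Post-editors who work through these items from basic to advanced over a sufficient training period should be able to speed up post-editing while maintaining translation quality.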


Machine translation (MT) has been under development for over sixty years, and a workflow that combines MT with post-editing (PE) has gradually emerged. Although this combination increases translation productivity and quality, some problems derived from MT, such as word sense disambiguation (WSD) and reordering, still keep PE from being efficient.
To address these MT problems, this study raises four research questions: (1) What are the major problems that affect the quality of current MT? (2) What are the bottlenecks that compromise the efficiency of post-editing in machine-aided human translation (MAHT)? (3) What kinds of training do post-editors need in order to address the problems derived from MT's poor reordering? (4) How can a language parser help post-editing training untangle cluttered MT output?
This study uses Google Translate (GT) as the test model, first producing output from different sample texts. The GT texts were then compared with the original texts (OT), which had been parsed with a language parser, to identify the errors in GT relative to OT. Finally, these problems were summarized into a checklist of 40 checkpoints and 7 guidelines. In addition, 19 training goals are laid out for reference in post-editor training programs.
The results answer the four research questions: (1) the major problems that affect the quality of current MT are WSD and reordering, mainly reordering; (2) the bottleneck that compromises the efficiency of post-editing in MAHT is reordering; (3) post-editors need training that instills and strengthens a linguistic perspective in order to address the problems derived from MT's poor reordering; (4) the guidelines and checkpoints derived with the language parser can help post-editing training untangle cluttered MT output. All in all, with the aid of the checklist, translators should gradually improve their PE skills, as well as the productivity and quality of their translation.

Chinese Abstract
ABSTRACT
ACKNOWLEDGEMENT
List of Tables
List of Figures
Abbreviations
Chapter One Introduction
1.1 Background and Motivation
1.1.1 Machine Translation and Post-editing
1.1.2 Translation memory and MT workbenches
1.1.3 What Really Slows Things Down?
1.2 Significance of the Study
1.3 Purpose of the Study
1.4 Research Questions
1.5 Terminology
1.6 Structure of the Study
Chapter Two Literature Review
2.1 History and Development of MT
2.2 Google Translate
2.2.1 Features of GT
2.2.2 Limitation of GT
2.3 MT Evaluation
2.4 MT Problem in MT
2.4.1 Lexical Ambiguity and Word Sense Disambiguation
2.4.2 Reordering
2.5 Case Grammar
2.6 Post-editing
2.6.1 Efficiency of post-editing
2.6.2 Training of post-editing skill
2.6.3 Guideline of post-editing
2.7 English parser
2.8 Summary
Chapter Three Methodology
3.1 Corpora
3.2 Sample Texts
3.3 Data Collection
3.3.1 Three stages of data collection
3.2.2 Google Translate as the MT test model
3.3 English Parser
3.4 How Google Translate MT translate?
3.4.1 Google MT does chunking instead of using events as translation units
3.4.2 Google MT ignores inter-event relations
3.5 The concept of event-based translation and post-editing strategies
3.6 Data Analysis
Chapter Four Data Analysis and Post-editing in MAHT
4.1 Introduction
4.2 Evaluation tools and fundamentals in translation studies
4.2.1 Sentence alignment and alignment in general
4.2.2 Case Grammar and the building of the case frame of Who Did What to Whom
4.2.3 Parsing and parser annotation
4.2.4 Using parallel (bilingual) annotations to help detect MT problems
4.3 Data analysis
4.3.1 MT problems
4.3.2 Causes of MT problems
4.3.2.1 Google MT’s WSD
4.3.2.2 Google MT’s reordering
4.3.2.3 Relational meaning and relational function words
4.3.2.4 Split structures mishandled
4.3.2.5 Other function words
4.3.2.6 Nominalization and event-related problems
4.3.2.7 WSD problems
4.4 Post-editing in MAHT
4.4.1 Event finding in the course of post-editing
4.4.2 Use of the finding of PADS gaps to build case frames for events
4.4.3 Reordering and inter-event relations
4.4.4 WSD and non-relational meaning
4.5 Summary
Chapter Five Post-editing Guidelines and Training
5.1 Guidelines for post-editing
5.2 Post-editor training
Chapter Six Conclusion
6.1 Research Questions Recalled
6.2 Remark, Limitation and Future Research
6.2.1 Final remark
6.2.2 Limitation of this study
6.2.3 Future research
References
Appendix A: Source of example 1
Appendix B: Source of example 2
Appendix C: Source of example 3
Appendix D: Source of example 4
Appendix E: Source of example 5
Appendix F: Source of example 6
Appendix G: Source of example 7
Appendix H: Source of example 8
Appendix I: Source of example 9
Appendix J: Scenario of Case frame
Appendix K: Noun Phrases in Sentence

