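Graduate student: | 劉育哲 Yu-Che Liu |
---|---|
Thesis title: | 應用文件探勘技術於接觸式影像感測器專利分析之研究 Application of Text Mining Techniques for Contact Image Sensor Patent Analysis |
Advisor: | 郭中豐 Chung-Feng Kuo |
Oral defense committee: | 蔡鴻文 Hung-Wen Tsai, 葉雲卿 (no English name listed), 呂永和 Yung-Ho Leu |
Degree: | Master |
Department: | College of Applied Sciences - Graduate Institute of Patent |
Year of publication: | 2014 |
Academic year of graduation: | 102 |
Language: | Chinese |
Number of pages: | 115 |
Chinese keywords (translated): | Contact image sensor, ontology, text mining, cluster analysis, classification analysis |
English keywords: | Contact Image Sensor, Ontology, Text Mining, Clu |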
Taiwan is a major producer of image sensors and holds a leading position worldwide in both manufacturing and R&D. As companies in Taiwan and abroad place growing weight on intellectual property, they have been actively building patent portfolios around contact image sensor technology, and a large number of patent applications are filed every year. This volume of patent data calls for systematic patent management and analysis if a sustained advantage in future technology competition is to be secured.

In current patent analysis practice, R&D personnel or patent engineers who want to follow technology trends through patents must first carry out a patent search. Understanding the technical content of the retrieved documents then usually requires a great deal of reading time. If text mining is used to identify the technical keywords in the patent data, the data can be parsed efficiently, and a set of defined ontology keywords can be used to analyze the patents quickly, which shortens the processing time for large data sets.

This study retrieved U.S. patents on contact image sensors and used patent maps to examine the technical development trend in the field. Text mining was applied to the retrieved patents to extract the keywords of the technical field, and these keywords were used to build an ontological framework for contact image sensors. The patents were then analyzed with this ontology from two angles: cluster analysis using the sequential information bottleneck method, and classification analysis using a Bayes classifier. Cluster analysis is an unsupervised method. The algorithm divided the sample patents into 5 clusters, which were presented through a visual graphical interface, and the technical category of each cluster was discussed. Classification analysis is a supervised method. After data processing, entropy was used to screen the TF-IDF-weighted features. With the feature set optimized in this way, the classifier reached a precision of 90.4%, a recall of 91.7%, and an F-measure of 91.1%. Applying this text mining approach to large volumes of contact image sensor patents can reduce the time R&D personnel or patent engineers spend processing patent documents, and thereby improve the accuracy and efficiency of patent analysis.
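The abstract does not say how the entropy screening of TF-IDF-weighted features was done, so the following is only a minimal sketch of one common reading: weight each term by TF-IDF, measure how evenly its weight is spread across the class labels, and keep the terms with the lowest entropy (the most class-specific ones). The toy documents, the class labels, and all function names are hypothetical and are not taken from the thesis.

```python
import math
from collections import Counter, defaultdict

def tf_idf(docs):
    """Return one {term: weight} dict per document (docs are token lists)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    weighted = []
    for doc in docs:
        tf = Counter(doc)
        weighted.append({t: (c / len(doc)) * math.log(n / df[t])
                         for t, c in tf.items()})
    return weighted

def term_class_entropy(weighted, labels):
    """Shannon entropy (bits) of each term's TF-IDF mass across the classes."""
    mass = defaultdict(lambda: defaultdict(float))
    for vec, label in zip(weighted, labels):
        for term, w in vec.items():
            mass[term][label] += w
    entropy = {}
    for term, per_class in mass.items():
        total = sum(per_class.values())
        if total == 0:  # term occurs in every document, so its IDF is zero
            continue
        entropy[term] = sum(-(m / total) * math.log2(m / total)
                            for m in per_class.values() if m > 0)
    return entropy

def select_terms(entropy, k):
    """Keep the k terms with the lowest entropy (ties broken alphabetically)."""
    return sorted(entropy, key=lambda t: (entropy[t], t))[:k]

def f_measure(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Hypothetical toy corpus: two made-up technical classes of patents.
docs = [
    ["led", "light", "guide", "sensor"],
    ["led", "light", "source", "sensor"],
    ["lens", "array", "light", "sensor"],
    ["lens", "array", "focus", "sensor"],
]
labels = ["illumination", "illumination", "optics", "optics"]

weighted = tf_idf(docs)
entropy = term_class_entropy(weighted, labels)
kept = select_terms(entropy, 6)
print(kept)
print(round(f_measure(0.904, 0.917), 4))
```

In this toy corpus, "sensor" appears in every document and is dropped because its IDF is zero, while "light" occurs in both classes and therefore has the highest entropy and is screened out first. Applying `f_measure` to the reported precision and recall gives roughly 0.910, close to the reported 91.1%. A small gap of this kind can come from rounding of the inputs or from averaging the F-measure per class, and the abstract does not say which applies.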