Graduate student: | 張富祺 Fu-Chi Chang |
---|---|
Thesis title: | 基於巨量資料分析之流感趨勢預測 Trend Forecasting of Influenza Using Big Data Analysis |
Advisor: | 陳俊良 Jiann-Liang Chen |
Committee members: | 郭斯彥 Sy-Yen Kuo, 黎碧煌 Bih-Hwang Lee, 張耀中 Yao-Chung Chang, 馬奕葳 Yi-Wei Ma |
Degree: | Master |
Department: | College of Electrical Engineering and Computer Science - Department of Electrical Engineering |
Year of publication: | 2017 |
Academic year of graduation: | 105 |
Language: | Chinese |
Number of pages: | 69 |
Chinese keywords: | big data, influenza, correlation analysis, nonlinear trend forecasting |
English keywords: | Big data, Influenza, Analysis of the relationships, Nonlinear tracking model |
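Accurately tracking the trend of an infectious disease such as influenza helps public health agencies make timely and meaningful decisions, which greatly helps to reassure the public and save lives. Traditional disease surveillance systems compile statistics from confirmed cases reported after the fact, so their reports lag the actual epidemic peak by at least one week, whereas messages in active online communities may already have revealed collective concern. Monitoring disease activity signals through rapid information detection therefore makes it possible to detect and track the spread of a disease in advance. Yahoo and Google, search engine providers that hold massive amounts of community data, have both conducted related research. This thesis takes the epidemic trend of influenza as its subject. It collects data for Taiwan from 2010 to 2016 from three sources: statistics from the Centers for Disease Control of the Ministry of Health and Welfare, online data from Google Trends, and data from the King Net national online medical service. Linear and nonlinear methods are used to analyze the correlations among the three sources and to build a model of their relationships. Tests on historical data show that in the years in which the three sources are more strongly linearly correlated, the nonlinear analysis gives the same result. The nonlinear trend forecasting framework proposed in this thesis captures the trend changes in every year with a larger epidemic, which demonstrates that massive online community data can serve as a medium for indirectly monitoring influenza activity signals. The resulting epidemic trend forecasting model can support recommendations for responding early to influenza epidemics.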
Accurately tracking the outbreak of an infectious disease such as influenza helps public health agencies make timely and significant decisions that can calm public fear and save lives. A traditional disease surveillance system based on confirmed cases typically reports an outbreak with a lag of at least one week. Surveillance systems that monitor indirect signals of influenza activity have therefore been proposed to provide faster detection. The volume of these signals is huge, and they can be extracted from social networks or search databases. Yahoo and Google, two leading internet search providers that hold such big data, have both carried out research on disease tracking. In this study, we first extract influenza signals from the CDC (Centers for Disease Control, Taiwan) database, the Google Trends database, and the King Net database. We then analyze the linear and nonlinear relationships among the three databases. We found a high correlation among the series drawn from the three databases in the years surveyed (2011-2016), in both the linear and the nonlinear analysis. Furthermore, we propose a nonlinear tracking model that captures changes in the epidemic trend and can detect influenza outbreaks earlier in years with severe epidemics. These results show that signals exposed on networks provide rich material for tracking trends in events in human society.
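The abstract describes a linear correlation analysis among three series and notes that confirmed-case reports lag actual activity by at least one week. As a rough illustration of how such a lead-lag relationship can be checked, the sketch below computes a lagged Pearson correlation between two weekly series. It is not the procedure used in the thesis, which the abstract does not specify. The function names and the synthetic data are assumptions made purely for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def lagged_pearson(reference, candidate, max_lag):
    """Correlate reference[t] with candidate[t - lag] for lag = 0..max_lag.

    A positive best lag means the candidate series leads the reference
    series by that many samples (weeks, in this illustration).
    """
    if len(reference) != len(candidate):
        raise ValueError("series must have the same length")
    n = len(reference)
    return {
        lag: pearson(reference[lag:], candidate[: n - lag])
        for lag in range(max_lag + 1)
    }


# Synthetic weekly data (NOT from the thesis): one epidemic-like bump, with a
# hypothetical search-interest series peaking two weeks before the case counts.
weeks = range(52)
search_interest = [math.exp(-0.5 * ((w - 20) / 3.0) ** 2) for w in weeks]
reported_cases = [math.exp(-0.5 * ((w - 22) / 3.0) ** 2) for w in weeks]

correlations = lagged_pearson(reported_cases, search_interest, max_lag=4)
best_lag = max(correlations, key=correlations.get)
print(f"best lag: {best_lag} week(s)")
```

Here the lag with the highest correlation estimates how far the online signal runs ahead of the reported cases. On real data the series would first need to be aligned to the same weekly calendar, and a linear measure like this one would not capture the nonlinear relationships the thesis also examines.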