
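Graduate student: 余春賢 (Chun-Hsien Yu)
Thesis title: Goal-oriented IoT Data Stream Analytics with On-line Learning Monitoring and Post-Learning Reflection (目標導向並兼具線上學習監督與習後反饋之即時物聯網資訊分析)
Advisor: 陸敬互 (Ching-Hu Lu)
Oral defense committee: 鍾聖倫 (Sheng-Luen Chung), 蘇順豐 (Shun-Feng Su), 蔣宗哲 (Tsung-Che Chiang), 古倫維 (Lun-Wei Ku)
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Electrical Engineering
Year of publication: 2017
Academic year of graduation: 105 (ROC calendar)
Language: Chinese
Number of pages: 63
Keywords (Chinese, translated): self-regulated learning, machine learning, data stream analytics, concept drift, Internet of Things
Keywords (English): self-regulation learning, machine learning, data stream, concept drift, IoT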
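  • With the arrival of the Internet of Things (IoT) era, the massive data streams generated by tens of thousands of connected devices require real-time analysis by computers, and machine learning has drawn more attention in response to this trend. However, existing machine learning models often overlook the influence that the semantic content of the data has on the model: incomplete or sensitive semantic content can bias the learning results. In addition, because the true labels of streaming data are unavailable, previous studies have also neglected real-time review of what was learned, so that subsequent or future learning cannot be adjusted for the better; the data distribution itself is also prone to hidden concept drift. This study therefore proposes a self-regulation learning framework, based on human pedagogy, for real-time analysis of data streams with concept drift, in order to remedy these shortcomings of existing machine learning models. The framework comprises three cyclic elements: Forethought before learning, Performance monitoring during learning, and Reflection after learning. Forethought means that the learner clearly understands the learning goal before learning begins. Performance monitoring means that the learner monitors and adjusts its own learning in real time during the learning process. Reflection means that the learner evaluates the learning results afterwards and corrects the way it learns. To give a machine these three elements, for Forethought this study proposes a "pre-learning goal-oriented filtering technique" that searches Wikipedia to check whether the content of the data to be trained on conflicts with the learning goal. For Performance, this study proposes an "in-learning metacognition real-time monitoring learning framework" that lets users strengthen a chosen machine learning model with metacognitive abilities: "What to learn" selects data features, "How to learn" adjusts the model, and "When to learn" triggers a model update once enough low-confidence data has accumulated, so that the model adapts to concept drift. Finally, for Reflection, this study uses a "post-learning expert-model correction and reflection technique" that further checks data classified with low confidence and uses a dynamic particle auxiliary model to supply guidance labels for learning, overcoming the limitation that the true labels of streaming data are unavailable. Experimental results show that, for real-time analysis of data with concept drift in an IoT environment without human intervention, the proposed method brings accuracy close to the baseline that relies on human intervention and even improves accuracy on some datasets. It also provides an initial assurance that input data is interpreted at the semantic level, so that learning follows the specified goal.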
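The pre-learning, goal-oriented filtering step can be illustrated with a minimal sketch. The abstract does not say how the Wikipedia lookups or the relation network are implemented, so the code below replaces the knowledge base with a small in-memory dictionary and flags a feature when a sensitive topic is reachable from it within a few links. `TOY_RELATIONS`, `reachable_concepts`, `filter_features`, `max_hops`, and the example terms are all hypothetical names chosen for illustration, not taken from the thesis.

```python
from collections import deque

# Hypothetical stand-in for knowledge-base lookups: term -> related concepts.
# A real system would obtain these relations from an external source.
TOY_RELATIONS = {
    "race": {"ethnicity", "discrimination"},
    "ethnicity": {"demographics"},
    "occupation": {"employment"},
    "employment": {"income"},
}


def reachable_concepts(term, relations, max_hops=2):
    """Return the concepts reachable from `term` within `max_hops` links (term included)."""
    seen = {term}
    queue = deque([(term, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in relations.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, depth + 1))
    return seen


def filter_features(feature_names, sensitive_topics, relations, max_hops=2):
    """Split features into (kept, excluded) by whether they touch a sensitive topic."""
    sensitive = set(sensitive_topics)
    kept, excluded = [], []
    for name in feature_names:
        if reachable_concepts(name, relations, max_hops) & sensitive:
            excluded.append(name)
        else:
            kept.append(name)
    return kept, excluded


kept, excluded = filter_features(
    ["race", "occupation", "age"], {"discrimination"}, TOY_RELATIONS
)
print("kept:", kept)
print("excluded:", excluded)
```

The hop limit is one possible way to bound how far a relation is followed; the thesis may use a different notion of relatedness.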


    In the upcoming IoT era, a plethora of data needs to be processed in real time. However, most prior research on data streams has ignored the semantic meaning of the input data, which can lead to unexpected or even unacceptable models. This issue may become worse under concept drift. Because streaming data lacks ground-truth labels, it is also challenging to reflect on what has been learned in order to keep improving learning performance. To address these issues, this study incorporates self-regulation learning from pedagogy. Self-regulation learning has three elements: Forethought, Performance, and Self-reflection. Forethought refers to setting a goal before learning. Performance refers to monitoring and controlling how well one learns during learning. Self-reflection refers to correcting mistakes after learning. To equip a machine with these abilities, this study proposes a goal-oriented filtering mechanism for the Forethought stage that excludes input data unrelated to the learning goal. Next, the Performance stage is implemented as an on-line metacognition-based learning framework, which empowers a selected model with cognitive abilities under concept drift through replaceable “What to learn,” “How to learn,” and “When to learn” components. Finally, we propose a post-learning expert-model-assisted correction for the Self-reflection stage, in which the dynamic particle algorithm serves as an expert model to correct low-confidence data under concept drift. This design reduces human intervention in IoT-enabled data stream analytics. With these enhancements, the experimental results of our enhanced model are as good as those of prior studies that still require human intervention. Some results even show slight improvement, while ensuring that data straying from the goal is excluded.
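The interplay between the “When to learn” component and the post-learning correction can be sketched as follows. This is an illustration under stated assumptions, not the thesis implementation: a toy one-dimensional centroid classifier stands in for the selected model, a fixed rule stands in for the dynamic particle expert model, and `CentroidModel`, `ConfidenceMonitor`, `threshold`, and `buffer_size` are hypothetical names and parameters.

```python
class CentroidModel:
    """Toy 1-D two-class model: predicts the class of the nearer centroid."""

    def __init__(self, centroids):
        self.centroids = dict(centroids)  # label -> centroid position

    def predict(self, x):
        # Confidence is 1 minus the nearest distance's share of the total distance.
        dists = {label: abs(x - c) for label, c in self.centroids.items()}
        label = min(dists, key=dists.get)
        total = sum(dists.values())
        confidence = 1.0 if total == 0 else 1.0 - dists[label] / total
        return label, confidence

    def update(self, labelled):
        # Move each centroid to the mean of its newly labelled samples.
        for label in self.centroids:
            xs = [x for x, y in labelled if y == label]
            if xs:
                self.centroids[label] = sum(xs) / len(xs)


class ConfidenceMonitor:
    """Buffers low-confidence samples and triggers an update when enough accumulate."""

    def __init__(self, model, expert, threshold=0.7, buffer_size=3):
        self.model = model
        self.expert = expert          # callable: sample -> guidance label
        self.threshold = threshold
        self.buffer_size = buffer_size
        self.buffer = []
        self.updates = 0

    def process(self, x):
        label, confidence = self.model.predict(x)
        if confidence < self.threshold:
            self.buffer.append(x)
            if len(self.buffer) >= self.buffer_size:
                self._update()
        return label, confidence

    def _update(self):
        # No ground truth is available, so the expert supplies guidance labels.
        labelled = [(x, self.expert(x)) for x in self.buffer]
        self.model.update(labelled)
        self.buffer.clear()
        self.updates += 1


def toy_expert(x):
    """Assumed stand-in for the expert model: a fixed decision rule."""
    return "a" if x < 5 else "b"


model = CentroidModel({"a": 0.0, "b": 10.0})
monitor = ConfidenceMonitor(model, toy_expert)
results = [monitor.process(x) for x in [0.5, 9.5, 4.0, 5.5, 4.5]]
print(model.centroids, monitor.updates)
```

In this run the first two samples are classified confidently, the last three fall below the threshold, and the third low-confidence sample triggers a single update using the expert's labels. Updating in batches rather than per sample is one way to avoid reacting to isolated noisy points.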

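    Table of Contents
    Chinese Abstract
    Abstract
    Acknowledgements
    List of Figures
    List of Tables
    Chapter 1 Introduction
    1.1 Challenges Posed by Data Streams in IoT Applications
    1.2 Uses and Characteristics of the Self-Regulation Learning Framework
    1.3 Related Work
    1.4 Contributions and Organization
    Chapter 2 System Design Concepts and Architecture Overview
    Chapter 3 Pre-learning Goal-oriented Filtering Technique
    3.1 Determining Sensitive Topics
    3.2 Sending Requests to the Knowledge Base
    3.3 Building the Ontology Relation Network
    3.4 Handling Data That Conflicts with the Learning Goal
    Chapter 4 In-learning Metacognition Real-time Monitoring Learning Framework
    4.1 Selection and Introduction of the Learning Model
    4.2 Data Feature Selection (What to Learn)
    4.3 Adding/Removing Hidden-Layer Nodes (How to Learn)
    4.4 Classification Confidence Monitoring (When to Learn)
    Chapter 5 Post-learning Expert-Model Correction and Reflection Technique
    5.1 Selection and Assistance of the Expert Model
    5.2 Adopting Expert-Model Opinions and Updating the Model
    Chapter 6 Experimental Validation
    6.1 Results of Pre-learning Goal-oriented Filtering
    6.2 Effectiveness of In-learning Metacognition Real-time Monitoring
    6.3 Effectiveness of Post-learning Cross-checking Reflection
    Chapter 7 Conclusion and Future Work
    References
    Appendix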

