
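Graduate student: 仲崇瑞
Tsung-Jui Chung
Thesis title: 主動式取樣用於行車時間預測
Active Learning in Trip Time Prediction
Advisor: 鮑興國
Hsing-kuo Pao
Oral defense committee: 孫敏德
戴碧如
項天瑞
鮑興國
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science and Information Engineering
Year of publication: 2017
Academic year of graduation: 105 (ROC calendar)
Language: English
Number of pages: 36
Chinese keywords: 主動式學習, 行車時間預測
English keywords: Active learning, Trip time prediction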
Usage counts: 201 views, 12 downloads

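In today's rapidly advancing technological society, people's time has become very precious, so knowing an accurate trip time before driving or taking a ride is an important issue. Accurate trip time prediction, however, depends on a sufficient amount of useful historical data. Collecting data blindly, without any strategy, is highly inefficient and has several negative consequences: it places a heavy burden on the server, and local devices may waste battery power collecting large amounts of unnecessary data.
We apply strategic sampling to the collection of historical data for trip time prediction, aiming to achieve, with less data, the same results as collection without any sampling strategy. We focus on space and time, the two key factors that most readily affect trip time in practice, and additionally introduce the concept of data density: by collecting less data in high-density regions, we hope to discard unnecessary or redundant data. Finally, besides comparing the results of the individual methods, we visualize the collected data to better understand its overall distribution.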


In modern society, people often need to know how long a drive will take, so predicting travel time accurately is an important problem. Accurate prediction depends on a sufficient amount of useful historical data. However, collecting historical data without any strategy has negative consequences: the burden on the server becomes too heavy, and on the local side, collecting the data causes unnecessary power consumption.
In this thesis, we focus on the problem of sampling data for trip time prediction. We want to achieve the same result using less data. Space and time are the two main factors we consider, and we also propose the concept of data density: by reducing the data collected in high-density areas, we can ignore some redundant data. Finally, we show the results of our methods and compare them with each other. In addition, we present visualizations of the sampled data so that its distribution can be clearly understood.
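The idea of collecting less data in high-density areas can be illustrated with a minimal sketch. The thesis body is not reproduced in this record, so the code below is not the author's method: the grid cell size, the hourly time bins, the keep-probability rule `base_rate / (1 + n)`, and the names `cell_of` and `density_aware_sample` are all assumptions chosen for illustration.

```python
import math
import random
from collections import defaultdict


def cell_of(lon, lat, hour, cell_deg=0.01, hour_bin=1):
    """Map a record to a discrete space-time cell.

    The grid resolution (cell_deg, hour_bin) is an illustrative assumption.
    """
    return (
        math.floor(lon / cell_deg),
        math.floor(lat / cell_deg),
        math.floor(hour / hour_bin),
    )


def density_aware_sample(stream, base_rate=1.0, rng=None):
    """Stream-based sampling that thins out dense space-time cells.

    Each incoming record (lon, lat, hour, trip_time) is kept with
    probability base_rate / (1 + n), where n is the number of records
    already kept in the same cell. This rule is a hypothetical stand-in
    for the sampling-rate adjustment described in the abstract.
    """
    rng = rng or random.Random(0)
    kept_per_cell = defaultdict(int)
    kept = []
    for lon, lat, hour, trip_time in stream:
        cell = cell_of(lon, lat, hour)
        p_keep = base_rate / (1 + kept_per_cell[cell])
        if rng.random() < p_keep:
            kept_per_cell[cell] += 1
            kept.append((lon, lat, hour, trip_time))
    return kept
```

With `base_rate=1.0`, the first record seen in any cell is always kept, so sparse regions stay represented, while each further record in an already well-covered cell is kept with decreasing probability.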

Abstract in Chinese
Abstract in English
Acknowledgements
Contents
List of Figures
List of Tables
1 Introduction
  1.1 Motivation
  1.2 Contribution
  1.3 Related Work
  1.4 Thesis Outline
2 Methodology
  2.1 Active Learning
    2.1.1 Scenarios
    2.1.2 Stream-based Sampling
  2.2 Modified Stream-based Sampling
  2.3 Sampling Approach
  2.4 Adjust Sampling Rate
  2.5 Data Density Analysis
  2.6 Spatio-temporal Sampling
3 Experiments and Results
  3.1 Dataset
  3.2 Evaluation
  3.3 Support Vector Regression
  3.4 Experiment Setting
  3.5 Result
  3.6 Analysis on sampled data
4 Conclusions
References
Letter of Authority

