
Author: Ming-der Lin (林明德)
Title: Varied Bandwidth Mean Shift for Semi-supervised Learning
Advisor: Hsing-kuo Pao (鮑興國)
Committee members: Yuh-jye Lee (李育杰), Su-yun Huang (陳素雲), Yuan-chin Chang (張源俊)
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Computer Science and Information Engineering
Year of publication: 2009
Graduation academic year: 98
Language: English
Pages: 41
Chinese keywords: semi-supervised learning, mean shift, kernel density estimation, multiple classifiers, ensemble classifier
Foreign keywords: fine-to-coarse, multiple classifier system
Mean shift is a non-parametric algorithm for unsupervised learning. By estimating the density gradient and iterating, it locates the positions of highest local density among the samples. In this thesis, we propose a new mean shift-based algorithm for semi-supervised learning, which we call Varied Bandwidth Mean Shift. Whereas ordinary mean shift operates on unlabeled data only, we additionally supply labeled data during learning to improve its quality. The key parameter of the mean shift procedure is the bandwidth. To find an appropriate bandwidth, we search for a neighborhood radius for each point in the data set: by examining how the density changes as the bandwidth varies, we choose the most suitable bandwidth to cover each point. An ensemble classifier then labels the unlabeled portion of the data. In the experiments, we compare our method against other algorithms on synthetic and real data and discuss its advantages. We also discuss some special data types, such as high-dimensional data, which may suffer from the curse of dimensionality; for these we apply additional techniques.
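The mean shift iteration described above can be sketched as follows. This is a minimal illustration of generic fixed-bandwidth mean shift with a Gaussian kernel, not the thesis's varied-bandwidth method; the function names and the convergence tolerance are assumptions for illustration.

```python
import numpy as np

def mean_shift_step(x, data, bandwidth):
    """One mean shift update: move x toward the local density mode.

    Uses a Gaussian kernel; the new position is the kernel-weighted
    mean of the data, which follows the density-gradient direction.
    """
    d2 = np.sum((data - x) ** 2, axis=1)
    w = np.exp(-d2 / (2.0 * bandwidth ** 2))
    return np.sum(w[:, None] * data, axis=0) / np.sum(w)

def mean_shift(x0, data, bandwidth, tol=1e-6, max_iter=500):
    """Iterate mean shift steps from x0 until it converges to a mode."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        x_new = mean_shift_step(x, data, bandwidth)
        if np.linalg.norm(x_new - x) < tol:
            break
        x = x_new
    return x

# Two well-separated 1-D clusters: a start near the first cluster
# converges to that cluster's mode rather than the global mean.
data = np.array([[0.0], [0.1], [-0.1], [5.0], [5.1], [4.9]])
mode = mean_shift([0.2], data, bandwidth=0.5)
```

Because the kernel weight of the distant cluster is negligible at this bandwidth, the iteration settles near the center of the cluster around the origin.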


    Mean shift is a non-parametric method for unsupervised learning. Based
    on density gradient estimation, it is applied iteratively to find the
    modes of a density. In this work, we propose a mean shift-based algorithm
    for semi-supervised learning, called Varied Bandwidth Mean Shift (VBMS).
    Unlike common mean shift algorithms, its input combines labeled and
    unlabeled data to improve learning performance, compared to using the
    data in a purely supervised or unsupervised fashion. To decide the modes
    of the data points, our proposed VBMS tries to find an appropriate
    bandwidth, called the neighborhood radius, for each point. Starting from
    the point itself, we run a fine-to-coarse procedure and select the first
    "low density" place as the radius of the point's territory. We then use
    an ensemble to label the points given partially known label information.
    In the experiments, we evaluate our method on synthetic and real data
    sets; the results show that our method is superior to other related
    approaches. We also discuss some particular situations, such as clustered
    data and high-dimensional data with a possible curse of dimensionality,
    to illustrate our results.
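The fine-to-coarse radius selection can be sketched as follows. This is only one plausible reading of "select the first low-density place": scan candidate radii from small to large and stop at the first radius where a simple density proxy (neighbor count divided by the volume scale r^d) drops. The function name and the density proxy are assumptions for illustration, not the thesis's exact criterion.

```python
import numpy as np

def neighborhood_radius(x, data, radii):
    """Fine-to-coarse scan for a per-point bandwidth (illustrative).

    For each candidate radius r (smallest first), count the neighbors
    of x within r and form the density proxy count / r**d.  Return the
    first radius at which this proxy decreases, i.e. the first "low
    density" shell around x.
    """
    d = data.shape[1]
    dist = np.linalg.norm(data - x, axis=1)
    prev = None
    for r in radii:
        density = np.sum(dist <= r) / r ** d
        if prev is not None and density < prev:
            return r
        prev = density
    return radii[-1]

# A tight cluster near the origin and a far cluster: the density proxy
# around the origin drops as soon as the radius grows past the tight
# cluster without capturing new neighbors.
data = np.array([[0.0], [0.1], [-0.1], [5.0], [5.1], [4.9]])
radii = np.linspace(0.1, 3.0, 30)
r = neighborhood_radius(np.array([0.0]), data, radii)
```

Each point thus receives its own bandwidth, which the ensemble step can then exploit; coarser radii would only be accepted if they brought in enough new neighbors to raise the density proxy again.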

    1 Introduction
      1.1 Motivation
      1.2 Proposed Method
    2 Semi-supervised Learning
    3 Mean Shift
      3.1 Kernel Density Estimation
      3.2 Mean Shift
        3.2.1 Mean Shift Algorithm
        3.2.2 Selecting the Bandwidth
      3.3 Mean Shift Applications
        3.3.1 Pattern Recognition
        3.3.2 Image Processing
        3.3.3 Tracking
    4 Varied Bandwidth Mean Shift
      4.1 Learning Problem Setup
      4.2 Varied Bandwidth Mean Shift
        4.2.1 Measure the Density Estimator
        4.2.2 Fine-to-Coarse Framework
      4.3 An Ensemble of Classifiers: Multiple Bandwidth Mean Shift
        4.3.1 Varied Bandwidth Mean Shift Algorithm
    5 Experiment Results
      5.1 Synthetic Data Sets
      5.2 Real Data Sets
        5.2.1 Low-Density Separation Problems
        5.2.2 Data Set in High Dimensions
    6 Conclusion & Future Work
      6.1 Conclusion
      6.2 Future Work


    Full-text release date: 2012/11/17 (campus network)
    Full text not authorized for public release (off-campus network)
    Full text not authorized for public release (National Central Library: Taiwan thesis and dissertation system)