
Author: 呂俊毅 (Jyun-Yi Lyu)
Thesis title: Exploring Gradient Estimation Network of Score Matching for Data Clustering, Image Segmentation and Color Transformation
Original title: 應用梯度估測網路於資料分群、影像分割與色彩轉換之初步探索
Advisor: 林伯慎 (Bor-Shen Lin)
Oral defense committee: 林伯慎 (Bor-Shen Lin), 楊傳凱 (Chuan-Kai Yang), 賴源正 (Yuan-Cheng Lai)
Degree: Master
Department: College of Management - Department of Information Management
Year of publication: 2023
Academic year of graduation: 111 (ROC calendar)
Language: Chinese
Pages: 60
Keywords: Score Matching, Gradient Estimation Network, Data Clustering, Image Segmentation, Color Transformation, Image Stylization (image abstraction)
    The gradient of a data distribution points in the direction in which its probability density increases. Earlier research proposed a parametric model, known as score matching, for estimating this gradient. Building on score matching, this study proposes a clustering method based on a gradient estimation network (GEN) that aggregates data according to the gradients estimated from the data themselves, without knowing the shape of the data distribution or fixing the number of clusters in advance. Preliminary tests on two-dimensional data show that the method behaves similarly to K-means clustering but aggregates the data better. On the Iris and Wine classification datasets, which have four-dimensional and thirteen-dimensional features respectively, we compare four clustering methods: GEN-based clustering, K-means clustering, agglomerative hierarchical clustering, and DBSCAN. Experimental results show that GEN-based clustering performs best among the four methods, reaching Rand indices of 0.95 and 0.69, respectively. Although GEN-based clustering neither estimates the distribution directly nor defines a similarity or distance measure on the feature space, it captures the overall trend of the distribution effectively and shows strong inductive ability on homogeneous data.
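The clustering idea described above can be sketched in a few lines. This is only an illustration under loud assumptions: the record does not give the network architecture, training loss, or update rule, so the sketch swaps in the analytic score of a Gaussian kernel density estimate as a stand-in for a trained GEN (with that stand-in, the update reduces to a damped mean shift). All function names, the bandwidth, the step size, and the merge tolerance are hypothetical choices.

```python
import math
import random


def kde_score(x, data, bandwidth):
    """Gradient of the log-density of a Gaussian KDE at x.

    ASSUMPTION: stands in for the output of a trained gradient estimation network.
    """
    weights = [math.exp(-math.dist(x, p) ** 2 / (2 * bandwidth ** 2)) for p in data]
    total = sum(weights)
    return [
        sum(w * (p[d] - x[d]) for w, p in zip(weights, data)) / (total * bandwidth ** 2)
        for d in range(len(x))
    ]


def ascend(points, score_fn, step, n_steps):
    """Move a copy of every point uphill along the estimated gradient."""
    moved = [list(p) for p in points]
    for _ in range(n_steps):
        for p in moved:
            g = score_fn(p)
            for d in range(len(p)):
                p[d] += step * g[d]
    return moved


def group_by_proximity(points, tol):
    """Give the same label to points that ended up within tol of each other."""
    reps, labels = [], []
    for p in points:
        for k, r in enumerate(reps):
            if math.dist(p, r) < tol:
                labels.append(k)
                break
        else:
            reps.append(p)
            labels.append(len(reps) - 1)
    return labels


def rand_index(labels_a, labels_b):
    """Fraction of point pairs on which two labelings agree (same vs. different cluster)."""
    n = len(labels_a)
    agree = 0
    for i in range(n):
        for j in range(i + 1, n):
            agree += (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
    return agree / (n * (n - 1) / 2)


# Toy two-dimensional data: two well-separated blobs of 20 points each.
rng = random.Random(0)
centers = [(0.0, 0.0), (5.0, 5.0)]
data = [(cx + rng.gauss(0, 0.3), cy + rng.gauss(0, 0.3)) for cx, cy in centers for _ in range(20)]
truth = [k for k in range(2) for _ in range(20)]

moved = ascend(data, lambda x: kde_score(x, data, bandwidth=1.0), step=0.5, n_steps=100)
labels = group_by_proximity(moved, tol=0.1)
print(len(set(labels)), rand_index(labels, truth))
```

The number of clusters is never passed in; it falls out of how many distinct density modes the points climb to, which mirrors the property claimed in the abstract.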
    Further, we apply this clustering method to two image processing tasks: image segmentation and color transformation. For image segmentation, we compare three-dimensional RGB features against five-dimensional features that combine RGB with pixel coordinates; the five-dimensional features give better segmentation results with lower color error. The segmented color regions are then used to draw region boundaries, and the accuracy of object boundary detection is evaluated. On pictures of different types, GEN-based clustering detects boundaries more accurately than K-means clustering. For color transformation, we first train the GEN on a style image and then shift the pixel colors of a separate content image according to the estimated gradients. Experimental results show that this method produces a natural stylization effect on landscape photos and that the degree of stylization is controllable. Finally, we perform color transformation on a randomly generated noise image using a GEN learned from a natural image. The transformed images have an abstract, oil-painting-like style and roughly reflect the main colors of the natural image and their approximate positions, which suggests that the approach has potential for image abstraction and image region tagging.
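The two image tasks can be illustrated with a small sketch, again under stated assumptions rather than as the thesis's implementation. A Gaussian kernel density estimate over the style image's colors stands in for the trained GEN, colors are assumed to be scaled to [0, 1], pixel coordinates are normalized to [0, 1] (the record does not say how coordinates are scaled), and the number of gradient steps is used as a hypothetical knob for the degree of stylization. The names `color_score`, `transfer_colors`, and `pixel_features` are invented for this example.

```python
import math


def color_score(color, style_colors, bandwidth):
    """Score of a Gaussian KDE over the style colors.

    ASSUMPTION: stands in for a GEN trained on the style image.
    """
    weights = [math.exp(-math.dist(color, s) ** 2 / (2 * bandwidth ** 2)) for s in style_colors]
    total = sum(weights)
    return [
        sum(w * (s[c] - color[c]) for w, s in zip(weights, style_colors)) / (total * bandwidth ** 2)
        for c in range(3)
    ]


def transfer_colors(content, style, bandwidth=0.5, step=0.1, n_steps=10):
    """Shift each content pixel's color along the score estimated from the style image.

    More steps pull the colors further toward the style palette.
    """
    style_colors = [px for row in style for px in row]
    out = []
    for row in content:
        new_row = []
        for px in row:
            c = list(px)
            for _ in range(n_steps):
                g = color_score(c, style_colors, bandwidth)
                # Clamp so that colors stay inside the valid [0, 1] range.
                c = [min(1.0, max(0.0, v + step * g[i])) for i, v in enumerate(c)]
            new_row.append(tuple(c))
        out.append(new_row)
    return out


def pixel_features(image):
    """Five-dimensional (R, G, B, x, y) features, one per pixel, for segmentation."""
    h, w = len(image), len(image[0])
    return [
        (*image[y][x], x / max(w - 1, 1), y / max(h - 1, 1))
        for y in range(h)
        for x in range(w)
    ]


# Tiny images as nested lists of (R, G, B) tuples in [0, 1].
style = [[(0.9, 0.2, 0.1), (0.8, 0.3, 0.1)], [(0.9, 0.1, 0.2), (1.0, 0.2, 0.1)]]
content = [[(0.1, 0.2, 0.9), (0.5, 0.5, 0.5)]]

mild = transfer_colors(content, style, n_steps=2)
strong = transfer_colors(content, style, n_steps=20)
print(mild[0][0], strong[0][0])
print(pixel_features(content))
```

Feeding the output of `pixel_features` to a gradient-based clustering routine, instead of the raw RGB triples, is what lets spatially distant regions of similar color end up in different segments.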

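    Chapter 1 Introduction
      1.1 Research Background and Motivation
      1.2 Research Contributions
      1.3 Thesis Organization and Structure
    Chapter 2 Literature Review
      2.1 Score Matching
      2.2 Gradient-Based Resampling
      2.3 Data Clustering Methods
        2.3.1 K-means
        2.3.2 Agglomerative Hierarchical Clustering
        2.3.3 DBSCAN
        2.3.4 Evaluation Metric: Rand Index
      2.4 Image Processing Techniques
        2.4.1 Image Segmentation
        2.4.2 Style Transfer
        2.4.3 Structural Similarity Index
        2.4.4 Color Mean Squared Error
        2.4.5 F1 Score
      2.5 Chapter Summary
    Chapter 3 Gradient Estimation Network Clustering
      3.1 Clustering with a Score Matching-Based Gradient Estimation Network
        3.1.1 Backbone Network
        3.1.2 Network Training Stage
        3.1.3 Clustering Stage
      3.2 Experimental Setup
      3.3 Evaluation and Analysis
        3.3.1 Clustering of Two-Dimensional Data
        3.3.2 Accuracy Evaluation on the Iris Dataset
        3.3.3 Accuracy Evaluation on the Wine Dataset
      3.4 Chapter Summary
    Chapter 4 Applying the Gradient Estimation Network to Image Processing
      4.1 Image Segmentation
        4.1.1 Segmentation with Three-Dimensional RGB Features
        4.1.2 Segmentation with RGB and Pixel-Coordinate Features
        4.1.3 Evaluation and Analysis of Image Segmentation
      4.2 Color Transformation
        4.2.1 Gradient-Based Color Transformation
        4.2.2 Image Abstraction
      4.3 Chapter Summary
    Chapter 5 Conclusions and Future Work
    References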

