Author: | 呂俊毅 Jyun-Yi Lyu |
---|---|
Thesis Title: | 應用梯度估測網路於資料分群、影像分割與色彩轉換之初步探索 Exploring Gradient Estimation Network of Score Matching for Data Clustering, Image Segmentation and Color Transformation |
Advisor: | 林伯慎 Bor-Shen Lin |
Committee: | 林伯慎 Bor-Shen Lin, 楊傳凱 Chuan-Kai Yang, 賴源正 Yuan-Cheng Lai |
Degree: | Master |
Department: | School of Management - Department of Information Management |
Thesis Publication Year: | 2023 |
Graduation Academic Year: | 111 |
Language: | Chinese |
Pages: | 60 |
Keywords (in Chinese): | Score Matching 、梯度估測網路 、資料分群 、影像分割 、色彩轉換 、影像抽象化 |
Keywords (in other languages): | Score Matching, Gradient Estimation Network, Data Clustering, Image Segmentation, Color Transformation, Image Stylization |
The gradient of a data distribution points in the direction in which its probability density increases. Earlier research proposed a parametric model, known as score matching, to estimate this gradient. Building on the score matching model, this study proposes a clustering method based on a gradient estimation network (GEN) that aggregates data according to the gradients estimated from the data itself, without knowing the shape of the data distribution or fixing the number of clusters in advance. We first ran preliminary clustering tests on two-dimensional data and found that the method behaves similarly to K-means clustering but aggregates data better. Next, on the Iris and wine classification datasets, we used four-dimensional and thirteen-dimensional features, respectively, to compare four clustering methods: GEN-based clustering, K-means clustering, agglomerative hierarchical clustering, and DBSCAN. The experimental results show that GEN-based clustering performs best among the four methods, reaching Rand indices of 0.95 and 0.69, respectively. Although GEN-based clustering neither estimates the distribution directly nor defines a similarity or distance measure in the feature space, it effectively captures the overall trend of the distribution and shows excellent inductive ability for homogeneous data.
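The idea of clustering by following an estimated gradient can be illustrated with a minimal sketch. It is not the thesis implementation: the trained gradient estimation network is replaced here by the score of a Gaussian kernel density estimate so that the example stays self-contained, and the names `kde_score`, `ascend`, `cluster`, `bandwidth`, `step`, and `merge_radius` are assumptions made for illustration. Each point is moved uphill along the estimated gradient, points that end up at the same density mode share a label, and the Rand index compares the result with reference labels.

```python
import math
from itertools import combinations

def kde_score(x, data, bandwidth):
    """Gradient of the log-density of a Gaussian KDE at x.

    Assumption: this stands in for the trained gradient estimation network.
    """
    weights = [
        math.exp(-sum((a - b) ** 2 for a, b in zip(x, p)) / (2 * bandwidth ** 2))
        for p in data
    ]
    total = sum(weights) or 1e-300  # guard against underflow far from the data
    return [
        sum(w * (p[i] - x[i]) for w, p in zip(weights, data)) / (total * bandwidth ** 2)
        for i in range(len(x))
    ]

def ascend(x, data, bandwidth=0.5, step=0.1, iters=100):
    """Follow the estimated gradient uphill from x toward a density mode."""
    x = list(x)
    for _ in range(iters):
        g = kde_score(x, data, bandwidth)
        x = [xi + step * gi for xi, gi in zip(x, g)]
    return x

def cluster(data, merge_radius=0.5, **kwargs):
    """Label each point by the mode it converges to; no cluster count is preset."""
    modes, labels = [], []
    for p in data:
        end = ascend(p, data, **kwargs)
        for k, m in enumerate(modes):
            if math.dist(end, m) < merge_radius:
                labels.append(k)
                break
        else:  # no existing mode is close enough, so start a new cluster
            modes.append(end)
            labels.append(len(modes) - 1)
    return labels

def rand_index(a, b):
    """Fraction of point pairs on which two labelings agree (same vs. different)."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)

# Toy data: two small 3x3 grids of points centered at (0, 0) and (5, 5).
offsets = [(dx, dy) for dx in (-0.3, 0.0, 0.3) for dy in (-0.3, 0.0, 0.3)]
data = offsets + [(5 + dx, 5 + dy) for dx, dy in offsets]
truth = [0] * 9 + [1] * 9

labels = cluster(data)
print(len(set(labels)), rand_index(labels, truth))
```

The number of clusters is never passed in; it emerges from how many distinct modes the points reach. In this stand-in, `bandwidth` and `merge_radius` control how finely modes are resolved, whereas a trained network would presumably absorb that role into its training setup.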
We then apply this clustering method to two image processing tasks: image segmentation and color transformation. For image segmentation, we compared the results of using three-dimensional RGB features with those of using five-dimensional features that combine RGB with pixel coordinates; RGB with pixel coordinates gives better segmentation results with lower color error. The segmented color regions were then used to draw region boundaries and to evaluate the accuracy of object boundary detection. On test pictures of different types, GEN-based clustering detects boundaries more accurately than K-means clustering. For color transformation, we first train a GEN on a style image and then convert the pixel colors of a separate content image according to the estimated gradients. The experimental results show that this method produces a natural stylization effect on landscape photos and that the degree of stylization is controllable. Finally, we performed color transformation on a randomly generated noise image using a GEN trained on a natural image. The transformed images have an abstract, oil-painting-like style and roughly reflect the main colors of the natural image and their approximate locations, which suggests that the proposed approach is potentially applicable to image stylization and image tagging.
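The two image tasks can be sketched under stated assumptions. `pixel_features` builds the three-dimensional (RGB) or five-dimensional (RGB plus pixel coordinates) feature vectors; scaling the coordinates to [0, 1] is an assumption, since the abstract does not say how they are weighted against color. `transfer_colors` moves each content pixel's color along an estimated gradient of the style image's color distribution, with the number of steps acting as one possible control for the degree of stylization. As a stand-in for the trained network, the gradient again comes from a Gaussian kernel density estimate, and all function and parameter names are hypothetical.

```python
import math

def pixel_features(image, use_coords=True):
    """Flatten an image (rows of (r, g, b) tuples in [0, 1]) into feature vectors.

    With use_coords=True each vector is (r, g, b, x, y); scaling x and y
    to [0, 1] is an assumption of this sketch.
    """
    height, width = len(image), len(image[0])
    feats = []
    for y, row in enumerate(image):
        for x, (r, g, b) in enumerate(row):
            if use_coords:
                feats.append((r, g, b, x / max(width - 1, 1), y / max(height - 1, 1)))
            else:
                feats.append((r, g, b))
    return feats

def kde_score(x, data, bandwidth):
    """Gradient of the log-density of a Gaussian KDE at x (stand-in for a trained GEN)."""
    weights = [
        math.exp(-sum((a - b) ** 2 for a, b in zip(x, p)) / (2 * bandwidth ** 2))
        for p in data
    ]
    total = sum(weights) or 1e-300  # guard against underflow far from the data
    return [
        sum(w * (p[i] - x[i]) for w, p in zip(weights, data)) / (total * bandwidth ** 2)
        for i in range(len(x))
    ]

def transfer_colors(content, style_colors, steps, step_size=0.02, bandwidth=0.2):
    """Shift every content color uphill on the style color density.

    More steps pull colors closer to the style palette; steps=0 leaves
    the content image unchanged.
    """
    out = []
    for row in content:
        new_row = []
        for color in row:
            c = list(color)
            for _ in range(steps):
                s = kde_score(c, style_colors, bandwidth)
                # Clamp so the result stays a valid color.
                c = [min(1.0, max(0.0, ci + step_size * si)) for ci, si in zip(c, s)]
            new_row.append(tuple(c))
        out.append(new_row)
    return out

# Tiny example: a 1x3 "style image" and a 1x2 "content image".
style_image = [[(0.9, 0.5, 0.1), (1.0, 0.6, 0.0), (0.1, 0.2, 0.6)]]
content = [[(0.8, 0.4, 0.2), (0.2, 0.2, 0.7)]]

style_colors = pixel_features(style_image, use_coords=False)
mild = transfer_colors(content, style_colors, steps=1)
strong = transfer_colors(content, style_colors, steps=20)
print(mild[0][0], strong[0][0])
```

For segmentation, the five-dimensional vectors from `pixel_features(image)` would be clustered instead of the raw colors, so that pixels are grouped by position as well as color.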