
Graduate Student: Shyang-Yih Wang (王翔毅)
Thesis Title: Inverse Halftoning using Autoencoder based on Fully Convolutional Neural Network and Hybrid Loss Function (全卷積自編碼神經網路與混合式損失函數之逆半色調技術)
Advisor: Jing-Ming Guo (郭景明)
Committee Members: Kai-Lung Hua (花凱龍), Jian-Jiun Ding (丁建君), Gee-Sern Hsu (徐繼聖)
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Electrical Engineering
Year of Publication: 2018
Graduating Academic Year: 106 (ROC calendar)
Language: Chinese
Number of Pages: 120
Chinese Keywords: 數位半色調 (digital halftoning), 逆半色調 (inverse halftoning), 深度學習 (deep learning), 卷積神經網路 (convolutional neural network), 自編碼器 (autoencoder), 感知損失 (perceptual loss)
English Keywords: Digital halftoning, autoencoder, inverse halftoning, deep learning, convolutional neural network, perceptual loss
Views: 343; Downloads: 2
  • Digital halftoning is a traditional conversion technique for obtaining a binary pattern, consisting of only black (0) and white (1) tones, from a continuous-tone grayscale or color image. It has been widely applied in many devices with a limited number of colorants, such as printers and electronic paper. In many situations, inverse halftoning is required to reconstruct and process printed images. Digital inverse halftoning restores a halftone image to its original continuous-tone image. However, because inverse halftoning must determine how to map a two-valued signal back to 256 gray levels, countless combinations are possible, and it is difficult to find an explicit mathematical formula that defines or expresses how to generate a high-quality image. This makes it a challenging problem.
    In recent years, with the development of technology and the maturation of deep learning, we depart from the previous ways of handling inverse halftoning and introduce deep learning techniques. This thesis therefore proposes to apply powerful deep learning techniques, used in the recovery of highly distorted compressed images, to obtain reconstructed inverse-halftone images. We develop an effective deep learning architecture that incorporates the concept of the autoencoder, which can extract the important features of a halftone image and reconstruct the image from them. In addition, we employ a hybrid loss function to enhance the texture and details of the reconstructed image. Finally, experimental results show that the deep-learning-based solution clearly outperforms previous methods and proves to be a feasible and more promising approach.
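    The forward halftoning process described in the abstract can be illustrated with a minimal sketch of Floyd-Steinberg error diffusion, one of the classical methods the thesis surveys (reference [1]); the 32x32 test image, the threshold of 128, and the 0/1 output convention here are illustrative choices, not code or parameters from the thesis:

```python
import numpy as np

def floyd_steinberg(gray):
    """Binarize a grayscale image (values in [0, 255]) by Floyd-Steinberg
    error diffusion: threshold each pixel, then push the quantization
    error onto the four unprocessed neighbours with the classical
    weights 7/16, 3/16, 5/16, 1/16."""
    img = gray.astype(np.float64).copy()
    h, w = img.shape
    out = np.zeros((h, w), dtype=np.uint8)
    for y in range(h):
        for x in range(w):
            old = img[y, x]
            new = 255.0 if old >= 128 else 0.0
            out[y, x] = 1 if new else 0          # 1 = white, 0 = black
            err = old - new                      # quantization error
            if x + 1 < w:
                img[y, x + 1] += err * 7 / 16    # right
            if y + 1 < h:
                if x > 0:
                    img[y + 1, x - 1] += err * 3 / 16  # below-left
                img[y + 1, x] += err * 5 / 16          # below
                if x + 1 < w:
                    img[y + 1, x + 1] += err * 1 / 16  # below-right

    return out
```

Because the diffused error preserves the local mean, a constant mid-gray input yields a binary pattern whose density of white pixels approximates the input intensity, which is exactly the many-to-one mapping that makes the inverse problem ill-posed.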


    Digital halftoning is a traditional transformation technique for obtaining a binary pattern, consisting only of black (0) and white (1), from a grayscale or color image. It has been widely used in many devices with a limited number of colors, such as printers and electronic paper. In many instances, inverse halftoning is required to reconstruct and process printed images. Inverse halftoning restores a halftone image to its original continuous-tone image. However, since inverse halftoning must consider how to transform a two-level binary signal into 256 gray levels, there are countless possible combinations, making it difficult to rely on a simple mathematical formula to reconstruct a high-quality image. Thus, it remains a challenging problem.
    In recent years, with the progress of hardware and the maturity of deep learning technology, it has become possible to change the way inverse halftoning is tackled. Consequently, in this study we propose to employ deep learning, normally used in recovering highly distorted compressed images, to reconstruct a halftone image. Specifically, this study develops an effective deep learning framework that integrates an autoencoder to extract important features from halftone images. In addition, a hybrid loss function is employed to enhance the texture details of the reconstructed image. Experimental results demonstrate that the proposed deep-learning-based solution is significantly superior to the existing methods.
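    The hybrid-loss idea mentioned above (the thesis combines L1, SSIM, and perceptual terms in Sections 4.2.1-4.2.4) can be sketched as follows. This is a simplified, self-contained approximation: it uses a global single-window SSIM instead of the windowed SSIM, omits the VGG-based perceptual term to stay dependency-free, and the weights `w_l1` and `w_ssim` are hypothetical, not the thesis's values:

```python
import numpy as np

def ssim_global(x, y, c1=0.01 ** 2, c2=0.03 ** 2):
    """Global (single-window) SSIM between two images in [0, 1] --
    a simplification of the windowed SSIM index."""
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

def hybrid_loss(pred, target, w_l1=1.0, w_ssim=0.5):
    """Weighted sum of an L1 term and an SSIM dissimilarity term.
    The perceptual (deep-feature) term is omitted in this sketch."""
    l1 = np.abs(pred - target).mean()
    return w_l1 * l1 + w_ssim * (1.0 - ssim_global(pred, target))
```

The design intuition is that L1 drives pixel-wise fidelity, while the SSIM term (and, in the full method, the perceptual term) rewards structural and textural agreement that pure pixel losses tend to blur away.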

    Chinese Abstract
    Abstract
    Acknowledgements
    Table of Contents
    List of Figures and Tables
    Chapter 1  Introduction
        Background
            1.1.1 Halftoning
            1.1.3 Research Motivation and Objectives
        Thesis Organization
    Chapter 2  Literature Review of Halftoning and Inverse Halftoning Techniques
        2.1 Error Diffusion (ED)
        2.2 Ordered Dither (OD)
        2.3 Dot Diffusion (DD)
        2.4 Direct Binary Search (DBS)
        2.5 Look-up Table Inverse Halftoning (LIH)
        2.6 Feature-Classified Inverse Halftoning (FCIH)
        2.7 Inverse Halftoning with Grouping Singular Value Decomposition (G-SVD)
    Chapter 3  Literature Review of Convolutional Neural Network Techniques
        3.1 Operation of Neural Networks
            3.1.1 Forward Propagation
            3.1.2 Backward Propagation
        3.2 Factors Affecting Neural Network Performance
        3.3 Convolutional Neural Networks
            3.3.1 Convolution
            3.3.2 Nonlinear Activation Functions
            3.3.3 Pooling
            3.3.4 Training Methods
            3.3.5 Visualization
        3.4 Autoencoder
            3.4.1 Deep Autoencoder
            3.4.2 Denoising Autoencoder
        3.5 Fully Convolutional Network (FCN) [28]
    Chapter 4  Inverse Halftoning Using a Fully Convolutional Autoencoder Network and a Hybrid Loss Function
        4.1 Encoder-Decoder Network
            4.1.1 Encoder
            4.1.2 Decoder
            4.1.3 Skip Connection
        4.2 Loss Function
            4.2.1 L1 Loss
            4.2.2 SSIM Loss
            4.2.3 Perceptual Loss
            4.2.4 Hybrid Loss
        4.3 Experimental Results
            4.3.1 Database
            4.3.2 Implementation Details
            4.3.3 Comparison Results
    Chapter 5  Conclusion and Future Work
    References
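    As one concrete example from the surveyed baselines, the look-up table inverse halftoning (LIH) approach of Section 2.5 can be sketched as follows: index each 3x3 binary neighbourhood as a 9-bit pattern (512 entries) and map it to the average original gray value that pattern covered in a set of training pairs. The 3x3 window size and the fallback value of 128 for unseen patterns are illustrative simplifications, not the exact configuration used in the cited LUT methods:

```python
import numpy as np

BITS = 1 << np.arange(9)  # weights 1, 2, 4, ..., 256 for the 9-bit index

def build_lut(halftones, originals):
    """For each 3x3 binary pattern, accumulate the mean original gray
    value observed under that pattern across the training pairs."""
    sums = np.zeros(512)
    counts = np.zeros(512)
    for h, g in zip(halftones, originals):
        H, W = h.shape
        for y in range(1, H - 1):
            for x in range(1, W - 1):
                idx = int(h[y - 1:y + 2, x - 1:x + 2].ravel() @ BITS)
                sums[idx] += g[y, x]
                counts[idx] += 1
    # Unseen patterns fall back to mid-gray (illustrative choice).
    return np.where(counts > 0, sums / np.maximum(counts, 1), 128.0)

def lut_inverse(halftone, lut):
    """Reconstruct a continuous-tone image by one table lookup per pixel."""
    H, W = halftone.shape
    out = np.full((H, W), 128.0)
    for y in range(1, H - 1):
        for x in range(1, W - 1):
            out[y, x] = lut[int(halftone[y - 1:y + 2, x - 1:x + 2].ravel() @ BITS)]
    return out.astype(np.uint8)
```

A fixed table of 512 local averages is fast but cannot disambiguate patterns that arise from different tones, which is one motivation the thesis gives for learning the mapping with a deep network instead.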

    [1] R. W. Floyd and L. Steinberg, “An adaptive algorithm for spatial gray scale,” in Proc. SID 75 Digest. Soc. Inf. Display, 1975, pp. 36–37.
    [2] J. F. Jarvis, C. N. Judice, and W. H. Ninke, “A survey of techniques for the display of continuous-tone pictures on bilevel displays,” Comput. Graph. Image Proc., vol. 5, no. 1, pp. 13–40, 1976.
    [3] P. Stucki, “MECCA-A multiple-error correcting computation algorithm for bilevel image hardcopy reproduction,” Res. Rep. RZ1060, IBM Res. Lab., Zurich, Switzerland, 1981.
    [4] J. N. Shiau and Z. Fan, “A set of easily implementable coefficients in error diffusion with reduced worm artifacts,” in Proc. SPIE, vol. 2658, pp. 222–225, 1996.
    [5] P. Li and J. P. Allebach, “Block interlaced pinwheel error diffusion,” J. Electron. Imaging, vol. 14, no. 2, Apr.–Jun. 2005.
    [6] V. Ostromoukhov, “A simple and efficient error diffusion algorithm,” Computer Graphics (Proceedings of SIGGRAPH 2001), pp. 567-572, 2001.
    [7] R. Ulichney, Digital Halftoning. Cambridge, MA: MIT Press, 1987.
    [8] D. E. Knuth, “Digital halftones by dot diffusion,” ACM Trans. Graph., vol. 6, no. 4, pp. 245–273, 1987.
    [9] M. Mese and P. P. Vaidyanathan, “Optimized halftoning using dot diffusion and methods for inverse halftoning,” IEEE Trans. Image Process., vol. 9, no. 4, pp. 691–709, Apr. 2000.
    [10] J. M. Guo and Y. F. Liu, “Improved dot diffusion by diffused matrix and class matrix co-optimization,” IEEE Trans. Image Processing, 18(8), pp. 1804-1816, Aug. 2009.
    [11] M. Analoui and J. P. Allebach, “Model based halftoning using direct binary search,” in Proc. SPIE, Human Vision, Visual Proc., Digital Display III, San Jose, CA, Feb. 1992, vol. 1666, pp. 96–108.
    [12] J. P. Allebach, “FM screen design using DBS algorithm,” in Proc. IEEE ICIP, vol. 1, Lausanne, Switzerland, pp. 549-552, 1996.
    [13] D. J. Lieberman and J. P. Allebach, “Efficient model based halftoning using direct binary search,” in Proc. IEEE ICIP, pp. 775-778, 1997.
    [14] D. J. Lieberman and J. P. Allebach, “A dual interpretation for direct binary search and its implications for tone reproduction and texture quality,” IEEE Trans. Image Processing, 9(11), pp. 1950-1963, 2000.
    [15] M. Mese and P. P. Vaidyanathan, “Look-Up Table (LUT) Method for Inverse Halftoning,” IEEE Trans. on Image Processing, vol. 10, no. 10, pp. 1566-1578, Oct. 2001.
    [16] J.-M. Guo and J.-H. Chen, “Inverse halftoning with variance classified filtering,” in Proc. IEEE Int. Conf. Acoustics, Speech, and Signal Processing (ICASSP), pp. 1293–1296, Apr. 2009.
    [17] J. Yang, J. Guo, and H. Chao, “Inverse halftoning with grouping singular value decomposition,” in Proc. IEEE Int. Conf. Image Processing (ICIP), 2015, pp. 1463–1467.
    [18] Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). Imagenet classification with deep convolutional neural networks. In Advances in neural information processing systems (pp. 1097-1105).
    [19] Simonyan, K., & Zisserman, A. (2014). Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556.
    [20] Nair, V., & Hinton, G. E. (2010). Rectified linear units improve restricted Boltzmann machines. In Proceedings of the 27th International Conference on Machine Learning (ICML-10) (pp. 807-814).
    [21] Zeiler, M. D., Krishnan, D., Taylor, G. W., & Fergus, R. (2010, June). Deconvolutional networks. In Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on (pp. 2528-2535). IEEE.
    [22] Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., ... & Ghemawat, S. (2016). Tensorflow: Large-scale machine learning on heterogeneous distributed systems. arXiv preprint arXiv:1603.04467.
    [23] Jia, Y., Shelhamer, E., Donahue, J., Karayev, S., Long, J., Girshick, R., ... & Darrell, T. (2014, November). Caffe: Convolutional architecture for fast feature embedding. In Proceedings of the 22nd ACM international conference on Multimedia (pp. 675-678). ACM.
    [24] H. Bourlard and Y. Kamp, “Auto-association by multilayer perceptrons and singular value decomposition,” Biological Cybernetics, vol. 59, no. 4–5, pp. 291–294, 1988.
    [25] Hinton, G.E. and Salakhutdinov, R.R. Reducing the dimensionality of data with neural networks. Science, 313 (5786):504–507, 2006.
    [26] P. Smolensky, Parallel Distributed Processing: Volume 1: Foundations, D. E. Rumelhart, J. L. McClelland, Eds. (MIT Press, Cambridge, 1986), pp. 194–281.
    [27] P. Vincent, H. Larochelle, Y. Bengio, and P.-A. Manzagol. Extracting and composing robust features with denoising autoencoders. In Proceedings of the 25th International Conference on Machine Learning, pages 1096–1103. ACM, 2008.
    [28] Long, J., Shelhamer, E., & Darrell, T. (2015). Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 3431-3440).
    [29] A. Siddiqua and G. Fan, “Supervised deep autoencoder for depth image-based 3D model retrieval,” in 2018 IEEE Winter Conference on Applications of Computer Vision (WACV).
    [30] J. Johnson, A. Alahi, and L. Fei-Fei, “Perceptual losses for real-time style transfer and super-resolution,” in European Conference on Computer Vision. Springer, 2016, pp. 694–711.
    [31] K. Fukushima. Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biological cybernetics, 1980.
    [32] Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, and L. D. Jackel. Backpropagation applied to handwritten zip code recognition. Neural Computation, 1989.
    [33] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 770–778.
    [34] A. Radford, L. Metz, and S. Chintala. Unsupervised representation learning with deep convolutional generative adversarial networks. ICLR, 2016.
    [35] J. Bruna, P. Sprechmann, and Y. LeCun. Super resolution with deep convolutional sufficient statistics. In ICLR, 2016.
    [36] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, “Image quality assessment: From error visibility to structural similarity,” IEEE Transactions on Image Processing, vol. 13, no. 4, pp. 600–612, Apr. 2004.
    [37] T.-Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollár, and C. L. Zitnick. Microsoft COCO: Common objects in context. In ECCV, pages 740–755. Springer, 2014.
    [38] D. Kingma and J. Ba. Adam: A method for stochastic optimization. In ICLR, 2015.
    [39] H. Zhao, O. Gallo, I. Frosio, and J. Kautz, “Loss Functions for Image Restoration with Neural Networks,” IEEE Transactions on Computational Imaging, vol. 3, no. 1, Mar. 2017.
    [40] R. Neelamani, R. D. Nowak, and R. G. Baraniuk, “WInHD: Wavelet-based inverse halftoning via deconvolution,” IEEE Transactions on Image Processing, 2002.
    [41] J. Yang, J. Guo, and H. Chao, “Inverse halftoning with grouping singular value decomposition,” in Proc. IEEE Int. Conf. Image Processing (ICIP), 2015, pp. 1463–1467.
    [42] D. Chicco, P. Sadowski, and P. Baldi, “Deep autoencoder neural networks for gene ontology annotation predictions,” in Proceedings of the 5th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics (BCB '14), p. 533, 2014.
    [43] NTUST MSP Lab Image Database. (2014) [Online].Available: http://msp.ee.ntust.edu.tw/public%20file/ImageSet.rar
    [44] S. Hein and A. Zakhor, “Halftone to continuous-tone conversion of error-diffusion coded images,” IEEE Trans. Image Process., vol. 4, no. 2, pp. 208–216, Feb. 1995.
    [45] P. W. Wong, “Inverse halftoning and kernel estimation for error diffusion,” IEEE Trans. Image Process., vol. 4, no. 4, pp. 486–498, Apr. 1995.
    [46] Z. Xiong, M. T. Orchard, and K. Ramchandran, “Inverse halftoning using wavelets,” in Proc. IEEE Int. Conf. Image Process., vol. 1. Lausanne, Switzerland, Sep. 1996, pp. 569–572.
    [47] T. D. Kite, N. Damera-Venkata, B. L. Evans, and A. C. Bovik, “A fast, high-quality inverse halftoning algorithm for error diffused halftones,” IEEE Trans. Image Process., vol. 9, no. 9, pp. 1583–1592, Sep. 2000.
    [48] N. Damera-Venkata, T. D. Kite, M. Venkataraman, and B. L. Evans, “Fast blind inverse halftoning,” in Proc. IEEE Int. Conf. Image Process., vol. 2. Chicago, IL, USA, Oct. 1998, pp. 64–68.
    [49] T. D. Kite, B. L. Evans, and A. C. Bovik, “Modeling and quality assessment of halftoning by error diffusion,” IEEE Transactions on Image Processing, vol. 9, no. 5, pp. 909–922, 2000.
    [50] T. D. Kite, N. Damera-Venkata, B. L. Evans, and A. C. Bovik, “A fast, high-quality inverse halftoning algorithm for error diffused halftones,” IEEE Transactions on Image Processing, vol. 9, no. 9, pp. 1583–1592, 2000.
    [51] P. C. Chang, C. S. Yu, and T. H. Lee, “Hybrid LMS-MMSE inverse halftoning technique,” IEEE Trans. Image Process., vol. 10, no. 1, pp. 95–103, Jan. 2001.
    [52] M. Mese and P. P. Vaidyanathan, “Look-up table (LUT) method for inverse halftoning,” IEEE Trans. Image Process., vol. 10, no. 10, pp. 1566–1578, Oct. 2001.
    [53] K. L. Chung and S. T. Wu, “Inverse halftoning algorithm using edge-based lookup table approach,” IEEE Trans. Image Process., vol. 14, no. 10, pp. 1583–1589, Oct. 2005.
    [54] U. F. Siddiqi and S. M. Sait, “Algorithm for parallel inverse halftoning using partitioning of Look-Up Table (LUT),” in Proc. IEEE Int. Symp. Circuits Syst., May 2008, pp. 3554–3557.
    [55] N. Atamena, Z. Tifedjadjine, Z. Dibi, and A. Bouridane, “A fast inverse halftoning algorithm using LUT approach for grey-level images,” in Proc. ICTTA, Apr. 2008, pp. 1–4.
    [56] Y. F. Liu, J. M. Guo, and J. D. Lee, “Inverse halftoning based on the Bayesian theorem,” IEEE Trans. Image Process., vol. 20, no. 4, pp. 1077–1084, Apr. 2011.
    [57] X. Li, “Inverse halftoning with nonlocal regularization,” in Proc. 18th IEEE International Conference on Image Processing (ICIP), 2011, pp. 1717–1720.
    [58] J.-M. Guo, Y.-F. Liu, J.-H. Chen, and J.-D. Lee, “Inverse halftoning with context driven prediction,” IEEE Trans. on Image Processing, vol. 23, no. 4, pp. 1923–1924, Apr. 2014.
    [59] M. Hirao and T. Aida, “Sparse representation approach to inverse halftoning by means of K-SVD dictionary,” in 2015 15th International Conference on Control, Automation and Systems (ICCAS), 2015.
