Author: 王翔毅 (Shyang-Yih Wang)
Title: Inverse Halftoning using Autoencoder based on Fully Convolutional Neural Network and Hybrid Loss Function (全卷積自編碼神經網路與混合式損失函數之逆半色調技術)
Advisor: 郭景明 (Jing-Ming Guo)
Committee members: 花凱龍 (Kai-Lung Hua), 丁建君 (Jian-Jiun Ding), 徐繼聖 (Gee-Sern Hsu)
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Electrical Engineering
Year of publication: 2018
Graduation academic year: 106
Language: Chinese
Number of pages: 120
Keywords (Chinese): digital halftoning, inverse halftoning, deep learning, convolutional neural network, autoencoder, perceptual loss
Keywords (English): Digital halftoning, autoencoder, inverse halftoning, deep learning, convolutional neural network, perceptual loss
Digital halftoning is a traditional conversion technique used to obtain a binary pattern, with only two tones, black (0) and white (1), from a continuous-tone grayscale or color image. It has been widely applied in devices with a limited number of colorants, such as printers and electronic paper. In many situations, inverse halftoning is needed to reconstruct and process printed images. Digital inverse halftoning restores a halftone image to its original continuous-tone image. However, because inverse halftoning must decide how to generate 256 gray levels from a binary-pattern signal, there are countless possible combinations; it is difficult to find an explicit mathematical formula that defines how to generate a high-quality image, which makes this a challenging problem.
In recent years, with the development of technology and the maturation of deep learning, we change the conventional approach to inverse halftoning and introduce deep learning techniques. This thesis therefore proposes applying powerful deep learning techniques, as used in the recovery of highly distorted compressed images, to obtain reconstructed images for inverse halftoning. It develops an effective deep learning architecture that incorporates the concept of an autoencoder, extracting the important features of a halftone image and reconstructing the image from them. In addition, a hybrid loss function is used to enhance the texture and details of the reconstructed image. Finally, experimental results show that the deep learning based solution clearly outperforms previous methods and proves to be a feasible and more promising approach.
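As a rough structural illustration of the fully convolutional encoder-decoder idea, a forward pass might look like the sketch below. This is not the thesis's actual network: the layer count, kernel sizes, random untrained kernels, and the nearest-neighbour upsampling standing in for learned deconvolution are all assumptions made for the sketch.

```python
import numpy as np

def conv2d(x, k, stride=1):
    """'Valid' 2-D convolution (really cross-correlation, as in CNN libraries)."""
    kh, kw = k.shape
    oh = (x.shape[0] - kh) // stride + 1
    ow = (x.shape[1] - kw) // stride + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            patch = x[i*stride:i*stride+kh, j*stride:j*stride+kw]
            out[i, j] = np.sum(patch * k)
    return out

def upsample(x, factor=2):
    """Nearest-neighbour upsampling, a simple stand-in for transposed convolution."""
    return np.repeat(np.repeat(x, factor, axis=0), factor, axis=1)

def relu(x):
    return np.maximum(x, 0.0)

rng = np.random.default_rng(0)
halftone = rng.integers(0, 2, size=(32, 32)).astype(np.float64)

# Encoder: two stride-2 convolutions shrink the spatial map, forcing the
# network to keep only the important features of the halftone pattern.
k1 = rng.normal(scale=0.1, size=(3, 3))
k2 = rng.normal(scale=0.1, size=(3, 3))
f1 = relu(conv2d(np.pad(halftone, 1), k1, stride=2))  # 32x32 -> 16x16
f2 = relu(conv2d(np.pad(f1, 1), k2, stride=2))        # 16x16 -> 8x8

# Decoder: upsample back to the input resolution, where a trained network
# would emit the reconstructed continuous-tone image.
d1 = upsample(f2)   # 8x8  -> 16x16
d2 = upsample(d1)   # 16x16 -> 32x32
print(d2.shape)     # (32, 32)
```

The point of the sketch is only the shape bookkeeping: the bottleneck is spatially smaller than the input, and the decoder restores the original resolution so the output can be compared pixel-wise with the ground-truth continuous-tone image.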
Digital halftoning is a traditional conversion technique for obtaining a binary pattern, containing only black (0) and white (1), from a continuous-tone grayscale or color image. It has been widely used in devices with a limited number of colorants, such as printers and electronic paper. In many instances, inverse halftoning is required to reconstruct and process printed images. Inverse halftoning restores a halftone image to its original continuous-tone image. However, since inverse halftoning must map a binary signal back to 256 gray levels, there are countless possible combinations, making it difficult to rely on a simple mathematical formula to reconstruct a high-quality image. Thus, it remains a challenging problem.
In recent years, with the progress of hardware and the maturation of deep learning, it has become possible to change the way inverse halftoning is tackled. Consequently, this study proposes to employ deep learning techniques, normally used to recover highly distorted compressed images, to reconstruct continuous-tone images from halftones. Specifically, it develops an effective deep learning framework that integrates an autoencoder to extract the important features of halftone images. In addition, a hybrid loss function is employed to enhance the texture details of the reconstructed image. Experimental results demonstrate that the proposed deep learning based solution is significantly superior to existing methods.
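One classic halftoning method of this kind is Floyd-Steinberg error diffusion. The sketch below is a minimal illustration (not necessarily the halftoning used to produce this thesis's training data): thresholding each pixel to black or white and diffusing the quantization error to its neighbours yields a binary dot pattern whose local average approximates the original tone.

```python
import numpy as np

def floyd_steinberg(gray):
    """Floyd-Steinberg error diffusion: grayscale in [0,1] -> binary halftone."""
    img = gray.astype(np.float64).copy()
    h, w = img.shape
    out = np.zeros((h, w))
    for y in range(h):
        for x in range(w):
            old = img[y, x]
            new = 1.0 if old >= 0.5 else 0.0   # quantize to black/white
            out[y, x] = new
            err = old - new                     # diffuse the quantization error
            if x + 1 < w:
                img[y, x + 1] += err * 7 / 16
            if y + 1 < h:
                if x > 0:
                    img[y + 1, x - 1] += err * 3 / 16
                img[y + 1, x] += err * 5 / 16
                if x + 1 < w:
                    img[y + 1, x + 1] += err * 1 / 16
    return out

# A flat mid-gray patch halftones to an even mix of black and white dots,
# so the mean of the binary output stays close to 0.5.
patch = np.full((32, 32), 0.5)
ht = floyd_steinberg(patch)
print(ht.mean())
```

Inverting this mapping is exactly the hard part: many distinct continuous-tone patches produce near-identical dot patterns, so no simple closed-form inverse exists.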
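The abstract does not spell out the hybrid loss, but a common recipe (suggested by the "perceptual loss" keyword) combines a pixel-wise term with a feature-space term. In the sketch below, image gradients serve as a stand-in for the feature maps of a pretrained network, so the gradient-domain term and the weights `alpha`/`beta` are illustrative assumptions only.

```python
import numpy as np

def mse(a, b):
    """Mean squared error between two arrays."""
    return float(np.mean((a - b) ** 2))

def grad_maps(img):
    """Horizontal/vertical gradients, used here as stand-in 'feature maps'."""
    return np.diff(img, axis=1), np.diff(img, axis=0)

def hybrid_loss(pred, target, alpha=1.0, beta=0.1):
    """Pixel MSE plus a gradient-domain term standing in for a perceptual loss."""
    pixel = mse(pred, target)
    gx_p, gy_p = grad_maps(pred)
    gx_t, gy_t = grad_maps(target)
    perceptual = mse(gx_p, gx_t) + mse(gy_p, gy_t)
    return alpha * pixel + beta * perceptual

# Identical images incur zero loss; the gradient term additionally penalizes
# reconstructions whose edges and textures drift from the target.
identical = np.full((8, 8), 0.5)
print(hybrid_loss(identical, identical))  # 0.0
```

The design intent matches the abstract: the pixel term drives tonal accuracy, while the second term pushes the reconstruction toward the target's texture and edge structure rather than an over-smoothed average.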