
Author: 鄭子文 (Tzu-Wen Cheng)
Title: 基於條件擴散模型之炫光去除 / FlareDiffusion: Conditional Diffusion Model for Flare Removal
Advisor: 林昌鴻 (Chang-Hong Lin)
Committee Members: 林昌鴻 (Chang-Hong Lin), 陳維美 (Wei-Mei Chen), 吳晉賢 (Chin-Hsien Wu), 呂政修 (Jenq-Shiou Leu)
Degree: Master
Department: Department of Electronic and Computer Engineering, College of Electrical Engineering and Computer Science
Publication Year: 2024
Graduation Academic Year: 112
Language: English
Pages: 68
Keywords: Flare Removal, Image Restoration, Image Processing, Diffusion Model, Convolutional Neural Network (CNN), Deep Learning, Semi-supervised Learning



In photography, dirt on the lens or imperfections in the lens itself can scatter or reflect light within the lens, producing unwanted artifacts such as lens flare, glare, and halos that degrade image quality. Similar defects can also arise when a camera points directly at a strong light source, especially at night. The goal of flare removal is therefore to eliminate these artifacts, restoring the corrupted parts of the image naturally while preserving all the details.
In this thesis, we present FlareDiffusion, a novel conditional diffusion model designed for the flare removal task. Our approach leverages the strengths of diffusion models and incorporates diverse flare patterns during training to improve the model's generalization. By conditioning the model on the input image and integrating a specially designed loss function, FlareDiffusion effectively removes flares while preserving light sources, ensuring high-quality image restoration.
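To make the conditioning scheme concrete, the sketch below shows one training step of a conditional DDPM in the spirit of [12], where the flare-corrupted input conditions the denoiser via channel-wise concatenation. This is a minimal illustration under assumed settings: `denoise_net`, the noise schedule values, and the omission of the thesis's flare loss are all placeholders, not the actual implementation.

```python
# Minimal sketch of one conditional-DDPM training step (assumed, not the
# thesis's code). The flare-corrupted image conditions the denoiser by
# channel-wise concatenation; the specially designed flare loss is omitted.
import torch
import torch.nn.functional as F

T = 1000
betas = torch.linspace(1e-4, 0.02, T)           # linear noise schedule (assumed)
alphas_bar = torch.cumprod(1.0 - betas, dim=0)  # cumulative alpha products

def training_step(denoise_net, flare_img, clean_img):
    """flare_img is the condition; clean_img is the restoration target."""
    b = clean_img.shape[0]
    t = torch.randint(0, T, (b,), device=clean_img.device)
    noise = torch.randn_like(clean_img)
    a_bar = alphas_bar.to(clean_img.device)[t].view(b, 1, 1, 1)
    # Forward process: diffuse the clean image to timestep t.
    x_t = a_bar.sqrt() * clean_img + (1.0 - a_bar).sqrt() * noise
    # Condition the denoiser on the flare-corrupted input.
    pred_noise = denoise_net(torch.cat([x_t, flare_img], dim=1), t)
    return F.mse_loss(pred_noise, noise)
```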
Quantitative comparisons on the Flare7K test set show that our method outperforms state-of-the-art methods, demonstrating its effectiveness in the flare removal task. Moreover, our method produces more natural and clearer images in visual comparisons, showing the model's robustness and generalization across various types of flares.
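For reference, the metrics behind the quantitative comparisons (SSIM [41], PSNR [42], LPIPS [43]) can be computed as below. This is a small evaluation sketch using scikit-image and the `lpips` package, with placeholder inputs rather than the thesis's evaluation code.

```python
# Evaluation sketch for PSNR [42], SSIM [41], and LPIPS [43] on one
# restored/ground-truth pair. Inputs are assumed float32 HxWx3 arrays
# in [0, 1]; this is illustrative, not the thesis's pipeline.
import numpy as np
import torch
import lpips
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate_pair(restored: np.ndarray, gt: np.ndarray) -> dict:
    psnr = peak_signal_noise_ratio(gt, restored, data_range=1.0)
    ssim = structural_similarity(gt, restored, channel_axis=2, data_range=1.0)
    # LPIPS expects NCHW tensors scaled to [-1, 1].
    to_t = lambda x: torch.from_numpy(x).permute(2, 0, 1)[None] * 2.0 - 1.0
    loss_fn = lpips.LPIPS(net='alex')
    with torch.no_grad():
        lp = loss_fn(to_t(restored), to_t(gt)).item()
    return {"PSNR": psnr, "SSIM": ssim, "LPIPS": lp}
```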

ABSTRACT (CHINESE)
ABSTRACT
ACKNOWLEDGEMENTS
LIST OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
CHAPTER 1 INTRODUCTION
  1.1 Motivation
  1.2 Contributions
  1.3 Thesis Organization
CHAPTER 2 RELATED WORKS
  2.1 Flare Removal
  2.2 Diffusion-based Networks for Similar Tasks
CHAPTER 3 PROPOSED METHODS
  3.1 Data Preprocessing
    3.1.1 Light Source Detection
    3.1.2 Light and Flare Synthesis
      3.1.2.1 Inverse Gamma Correction
      3.1.2.2 Intensity and Noise Adjustments
      3.1.2.3 Geometric Transformations
      3.1.2.4 Brightness and Blur Adjustment
  3.2 Diffusion Network
    3.2.1 Denoising Diffusion Probabilistic Models [12]
    3.2.2 Conditional Denoising Diffusion Models
    3.2.3 Model Architecture
      3.2.3.1 Denoising U-Net
      3.2.3.2 Time Embedding Block
      3.2.3.3 Residual Block
      3.2.3.4 Attention Block
      3.2.3.5 Downsampling Block
      3.2.3.6 Upsampling Block
  3.3 Loss Functions
    3.3.1 Reconstruction Loss
    3.3.2 Flare Loss
  3.4 Sampling Method
CHAPTER 4 EXPERIMENTAL RESULTS
  4.1 Training Details
  4.2 Flare7K Dataset [2]
  4.3 Evaluation Metrics
    4.3.1 Structural Similarity Index (SSIM) [41]
    4.3.2 Peak Signal-to-Noise Ratio (PSNR) [42]
    4.3.3 Learned Perceptual Image Patch Similarity (LPIPS) [43]
  4.4 Comparisons with State-of-the-art Methods
    4.4.1 Quantitative Comparisons
    4.4.2 Qualitative Comparisons
  4.5 Ablation Studies
    4.5.1 Comparison of Flare Synthesis Methods
    4.5.2 Impact of Loss Functions
CHAPTER 5 CONCLUSIONS AND FUTURE WORKS
  5.1 Conclusions
  5.2 Future Works
REFERENCES
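Section 3.1.2 of the outline lists the flare-synthesis steps (inverse gamma correction, intensity and noise adjustments, geometric transformations, brightness and blur adjustment). These follow the additive flare-compositing recipe common to Flare7K-style pipelines [2, 3]; the sketch below assumes gamma = 2.2 and uses illustrative jitter ranges, not the thesis's actual settings.

```python
# Additive flare compositing in linear space (illustrative sketch; the
# jitter ranges and noise level are assumptions, not the thesis's values).
import numpy as np

def synthesize_flare(clean: np.ndarray, flare: np.ndarray,
                     gamma: float = 2.2, rng=np.random) -> np.ndarray:
    """clean, flare: float32 HxWx3 images in [0, 1] of the same size."""
    clean_lin = clean ** gamma                      # inverse gamma correction
    flare_lin = flare ** gamma
    flare_lin = flare_lin * rng.uniform(0.8, 3.0)   # intensity jitter
    if rng.rand() < 0.5:
        flare_lin = flare_lin[:, ::-1]              # simple geometric transform
    noisy = clean_lin + rng.normal(0.0, 0.01, clean_lin.shape)  # Gaussian noise
    mixed = np.clip(noisy + flare_lin, 0.0, 1.0)    # additive composite
    return mixed ** (1.0 / gamma)                   # back to display space
```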

[1] M. Hullin, E. Eisemann, H. Seidel, and S. Lee, "Physically-based real-time lens flare rendering," ACM Transactions on Graphics, vol. 30, no. 4, pp. 1-10, 2011.
[2] Y. Dai, C. Li, S. Zhou, R. Feng, and C. C. Loy, "Flare7k: A phenomenological nighttime flare removal dataset," Advances in Neural Information Processing Systems, vol. 35, pp. 3926-3937, 2022.
[3] Y. Wu, Q. He, T. Xue, R. Garg, J. Chen, A. Veeraraghavan, and J. T. Barron, "How to train neural networks for flare removal," in Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 2239-2247, 2021.
[4] X. Li, B. Zhang, J. Liao, and P. V. Sander, "Let's see clearly: Contaminant artifact removal for moving cameras," in Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 2011-2020, 2021.
[5] R. Raskar, A. Agrawal, C. A. Wilson, and A. Veeraraghavan, "Glare aware photography: 4D ray sampling for reducing glare effects of camera lenses," ACM Transactions on Graphics, vol. 27, no. 3, pp. 1-10, 2008.
[6] C. Asha, S. K. Bhat, D. Nayak, and C. Bhat, "Auto removal of bright spot from images captured against flashing light source," in Proceedings of the 2019 IEEE International Conference on Distributed Computing, VLSI, Electrical Circuits and Robotics, pp. 1-6, 2019.
[7] P. Vitoria and C. Ballester, "Automatic flare spot artifact detection and removal in photographs," Journal of Mathematical Imaging and Vision, vol. 61, no. 4, pp. 515-533, 2019.
[8] Q. Sun, E. Tseng, Q. Fu, W. Heidrich, and F. Heide, "Learning rank-1 diffractive optics for single-shot high dynamic range imaging," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 1386-1396, 2020.
[9] X. Qiao, G. P. Hancke, and R. W. Lau, "Light source guided single-image flare removal from unpaired data," in Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 4177-4185, 2021.
[10] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, "Generative adversarial networks," Communications of the ACM, vol. 63, no. 11, pp. 139-144, 2020.
[11] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, "Attention is all you need," Advances in Neural Information Processing Systems, vol. 30, pp. 5998-6008, 2017.
[12] J. Ho, A. Jain, and P. Abbeel, "Denoising diffusion probabilistic models," Advances in Neural Information Processing Systems, vol. 33, pp. 6840-6851, 2020.
[13] J. Zhu, T. Park, P. Isola, and A. A. Efros, "Unpaired image-to-image translation using cycle-consistent adversarial networks," in Proceedings of the IEEE International Conference on Computer Vision, pp. 2223-2232, 2017.
[14] Y. Dai, C. Li, S. Zhou, R. Feng, Y. Luo, and C. C. Loy, "Flare7k++: Mixing synthetic and real datasets for nighttime flare removal and beyond," arXiv preprint arXiv:2306.04236, 2023.
[15] Y. Zhou, D. Liang, S. Chen, S.-J. Huang, S. Yang, and C. Li, "Improving lens flare removal with general-purpose pipeline and multiple light sources recovery," in Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 12969-12979, 2023.
[16] D. Zhang, J. Ouyang, G. Liu, X. Wang, X. Kong, and Z. Jin, "Ff-former: Swin fourier transformer for nighttime flare removal," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2824-2832, 2023.
[17] Z. Liu, Y. Lin, Y. Cao, H. Hu, Y. Wei, Z. Zhang, S. Lin, and B. Guo, "Swin transformer: Hierarchical vision transformer using shifted windows," in Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 10012-10022, 2021.
[18] L. Chi, B. Jiang, and Y. Mu, "Fast fourier convolution," Advances in Neural Information Processing Systems, vol. 33, pp. 4479-4488, 2020.
[19] C. Saharia, W. Chan, H. Chang, C. Lee, J. Ho, T. Salimans, D. Fleet, and M. Norouzi, "Palette: Image-to-image diffusion models," in Proceedings of the ACM SIGGRAPH Conference, pp. 1-10, 2022.
[20] P. Dhariwal and A. Nichol, "Diffusion models beat gans on image synthesis," Advances in Neural Information Processing Systems, vol. 34, pp. 8780-8794, 2021.
[21] G. Rosh, B. P. Prasad, L. R. Boregowda, and K. Mitra, "Deep unsupervised reflection removal using diffusion models," in Proceedings of the IEEE International Conference on Image Processing, pp. 2045-2049, 2023.
[22] T. Wang, W. Lu, K. Zhang, W. Luo, T.-K. Kim, T. Lu, H. Li, and M.-H. Yang, "PromptRR: Diffusion models as prompt generators for single image reflection removal," arXiv preprint arXiv:.02374, 2024.
[23] O. Özdenizci and R. Legenstein, "Restoring vision in adverse weather conditions with patch-based denoising diffusion models," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 45, no. 8, pp. 10346-10357, 2023.
[24] Y. Li, L. Liu, and B. Ma, "Image rain removal algorithm based on conditional diffusion model," in Proceedings of the IEEE International Conference on Image Processing, Computer Vision and Machine Learning, pp. 68-71, 2023.
[25] M. Ren, M. Delbracio, H. Talebi, G. Gerig, and P. Milanfar, "Multiscale structure guided diffusion for image deblurring," in Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 10721-10733, 2023.
[26] K. Chen and Y. Liu, "Efficient image deblurring networks based on diffusion models," arXiv preprint arXiv:.05907, 2024.
[27] X. Zhang, R. Ng, and Q. Chen, "Single image reflection separation with perceptual losses," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 4786-4794, 2018.
[28] P. Babakhani and P. Zarei, "Automatic gamma correction based on average of brightness," Advances in Computer Science, vol. 4, no. 6, pp. 156-159, 2015.
[29] D. P. Kingma and M. Welling, "Auto-encoding variational bayes," in Proceedings of the International Conference on Learning Representations, 2014.
[30] O. Ronneberger, P. Fischer, and T. Brox, "U-net: Convolutional networks for biomedical image segmentation," in Proceedings of the Medical Image Computing and Computer-Assisted Intervention, pp. 234-241, 2015.
[31] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770-778, 2016.
[32] T. Karras, S. Laine, M. Aittala, J. Hellsten, J. Lehtinen, and T. Aila, "Analyzing and improving the image quality of stylegan," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 8110-8119, 2020.
[33] Y. Song, J. Sohl-Dickstein, D. P. Kingma, A. Kumar, S. Ermon, and B. Poole, "Score-based generative modeling through stochastic differential equations," in Proceedings of the International Conference on Learning Representations, 2021.
[34] Y. Wu and K. He, "Group normalization," in Proceedings of the European Conference on Computer Vision, pp. 3-19, 2018.
[35] S. Ioffe and C. Szegedy, "Batch normalization: Accelerating deep network training by reducing internal covariate shift," in Proceedings of the International Conference on Machine Learning, pp. 448-456, 2015.
[36] A. Q. Nichol and P. Dhariwal, "Improved denoising diffusion probabilistic models," in Proceedings of the International Conference on Machine Learning, pp. 8162-8171, 2021.
[37] J. Johnson, A. Alahi, and F.-F. Li, "Perceptual losses for real-time style transfer and super-resolution," in Proceedings of the European Conference on Computer Vision, pp. 694-711, 2016.
[38] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," arXiv preprint arXiv:1409.1556, 2014.
[39] I. Loshchilov and F. Hutter, "SGDR: Stochastic gradient descent with warm restarts," in Proceedings of the International Conference on Learning Representations, 2017.
[40] I. Loshchilov and F. Hutter, "Decoupled weight decay regularization," in Proceedings of the International Conference on Learning Representations, 2019.
[41] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, "Image quality assessment: from error visibility to structural similarity," IEEE Transactions on Image Processing, vol. 13, no. 4, pp. 600-612, 2004.
[42] D. H. Johnson, "Signal-to-noise ratio," Scholarpedia, vol. 1, no. 12, p. 2088, 2006.
[43] R. Zhang, P. Isola, A. A. Efros, E. Shechtman, and O. Wang, "The unreasonable effectiveness of deep features as a perceptual metric," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 586-595, 2018.
[44] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "Imagenet classification with deep convolutional neural networks," Advances in Neural Information Processing Systems, vol. 25, 2012.
[45] L. Chen, X. Lu, J. Zhang, X. Chu, and C. Chen, "Hinet: Half instance normalization network for image restoration," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 182-192, 2021.
[46] S. W. Zamir, A. Arora, S. Khan, M. Hayat, F. S. Khan, and M.-H. Yang, "Restormer: Efficient transformer for high-resolution image restoration," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 5728-5739, 2022.
[47] Z. Wang, X. Cun, J. Bao, W. Zhou, J. Liu, and H. Li, "Uformer: A general u-shaped transformer for image restoration," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 17683-17693, 2022.
[48] J. Zhang, Y. Cao, Z.-J. Zha, and D. Tao, "Nighttime dehazing with a synthetic benchmark," in Proceedings of the ACM International Conference on Multimedia, pp. 2355-2363, 2020.
[49] A. Sharma and R. T. Tan, "Nighttime visibility enhancement by increasing the dynamic range and suppression of light effects," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11977-11986, 2021.
[50] L. Qu, S. Zhou, J. Pan, J. Shi, D. Chen, and J. Yang, "Harmonizing light and darkness: A symphony of prior-guided data synthesis and adaptive focus for nighttime flare removal," arXiv preprint arXiv:.00313, 2024.

Full text not yet available for download.
Full-text release date: 2026/07/29 (campus network)
Full-text release date: 2026/07/29 (off-campus network)
Full-text release date: 2026/07/29 (National Central Library: Taiwan NDLTD)