



Dual-Domain Cooperative Recovery of Atmospheric Turbulence Degradation Images

1 Science Island Branch of Graduate School, University of Science and Technology of China, Hefei 230026, China
2 Key Laboratory of Atmospheric Optics, Anhui Institute of Optics and Fine Mechanics, Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei 230031, China
3 Advanced Laser Technology Laboratory of Anhui Province, Hefei 230037, China
4 School of Artificial Intelligence and Big Data, Hefei University, Hefei 230601, China
* Author to whom correspondence should be addressed.
Remote Sens. 2024, 16(16), 2972; https://doi.org/10.3390/rs16162972
Submission received: 26 June 2024 / Revised: 9 August 2024 / Accepted: 12 August 2024 / Published: 14 August 2024

Abstract

Atmospheric turbulence is a key factor contributing to data distortion in mid-to-long-range target observation tasks. Neural networks have become a powerful tool for dealing with such problems due to their strong ability to fit nonlinearities in the spatial domain. However, the degradation in data is not confined solely to the spatial domain but is also present in the frequency domain. In recent years, the academic community has come to recognize the significance of frequency domain information within neural networks. There remains a gap in research on how to combine dual-domain information to reconstruct high-quality images in the field of blind turbulence image restoration. Drawing upon the close association between spatial and frequency domain degradation information, we introduce a novel neural network architecture, termed Dual-Domain Removal Turbulence Network (DDRTNet), designed to improve the quality of reconstructed images. DDRTNet incorporates multiscale spatial and frequency domain attention mechanisms, combined with a dual-domain collaborative learning strategy, effectively integrating global and local information to achieve efficient restoration of atmospheric turbulence-degraded images. Experimental findings demonstrate significant advantages in performance for DDRTNet compared to existing methods, validating its effectiveness in the task of blind turbulence image restoration.

1. Introduction

Imaging systems within the atmospheric medium are susceptible to geometric distortions and optical blurring, primarily due to atmospheric turbulence [1]. Further analysis indicates that the severity of turbulence disturbances correlates with the optical path integral, with turbulence-induced image degradation becoming more pronounced at greater imaging distances [2]. Specifically, atmospheric turbulence causes random fluctuations in the air refractive index, introducing stochastic disturbances in amplitude and phase during light propagation. This complexity and unpredictability in the imaging process pose a challenging inverse problem for computer vision and adaptive optics to mitigate atmospheric turbulence effects [3,4]. The significance of this issue lies in the fact that if degraded images are utilized for advanced visual tasks such as target detection, classification, remote sensing, surveillance, autonomous driving, and high-resolution Earth observation, the deterioration in image quality will directly compromise the performance and accuracy of these tasks, potentially leading to erroneous outcomes [5]. Such effects are intolerable in demanding application scenarios. Therefore, restoring turbulence-degraded images is a critically important low-level visual task to meet the quality requirements of advanced visual tasks.
In recent years, with the rapid advancement of deep learning technology, end-to-end deep learning-based image restoration techniques have increasingly become mainstream. Consequently, the application of deep learning methods to mitigate atmospheric turbulence effects in degraded images has garnered significant attention [6]. Unlike traditional adaptive optics methods, deep learning offers substantial potential in addressing nonlinear problems by leveraging data-driven learning of prior knowledge. Through extensive training on large datasets, deep neural networks can establish robust nonlinear mappings between inputs and outputs. With sufficient data and strong generalization capabilities, these networks can tackle challenges that traditional methods cannot, such as image denoising [7,8], deblurring [9,10], and super-resolution [11,12]. Unlike non-blind turbulence image restoration tasks, which often require cumbersome extraction of prior knowledge [13,14], neural networks can autonomously learn the necessary prior information through data-driven approaches. As a result, deep learning methods based on various neural network architectures have emerged as the leading approach for blind turbulence image restoration tasks in recent years. For instance, Kangfu Mei et al. [15] proposed LTT-GAN, based on generative adversarial networks, to mitigate turbulence interference in long-distance facial recognition. Zihao Cai et al. [16] addressed turbulence interference in solar telescopes by enhancing CycleGAN. Santiago Lopez-Tapia et al. [17] introduced variational deep networks for the first time to correct atmospheric turbulence interference. P. Hill et al. [18] introduced the concept of “accelerated Deep Image Prior” in the context of blind turbulence image restoration. Xingguang Zhang et al. [19] proposed the Turbulence Mitigation Transformer, while Xijun Wang et al. [20] designed the “Real-World Atmospheric Turbulence Mitigation Model” by combining supervised and unsupervised methods. Abu Bucker Siddik et al. [21] utilized a deep neural network architecture based on AlexNet to predict corrected Zernike coefficients. Xingguang Zhang et al. [22] introduced “DATUM”, which efficiently performs long-distance spatiotemporal aggregation using a recursive approach. Lizhen Duan et al. [23] tackled interference in the pre-denoising step of turbulence-degraded images using a novel deblurring kernel. Viktor Sineglazov et al. [24] developed a “fully convolutional encoder-decoder network” to address turbulence blur in drone-captured images. Yiming Guo et al. [25] designed “a novel blind restoration network” to improve the generalization capabilities of existing models. Huimin Ma et al. [26] employed “GoogLeNet” to develop a method for correcting turbulence aberrations between two image frames. Ripon Kumar Saha et al. [27] pioneered a segmented restoration channel technique to restore dynamic scenes in turbulent environments. Nantheera Anantrasirichai et al. [28] proposed a new framework that supports dynamic scene restoration by utilizing short time spans. Weiyun Jiang et al. [29] introduced a universal implicit neural representation (NeRT) for unsupervised turbulence mitigation. Shengqi Xu et al. [30] won the “CVPR 2023 UG2+ Track 2.2” by combining image matching and deblurring methods. Shuyuan Zhang et al. [31] proposed a dual-patch pixel (DPP) prior for effective blind deblurring of turbulence images. Inspired by the characteristics of turbulence phase diagrams, Xiangxi Li et al. 
[32] introduced a novel deep neural network named DeturNet. Ajay Jaiswal et al. [33] integrated a physics-based simulator directly into the training process to help the network separate turbulence randomness from degraded and underlying images. Zhiyuan Mao et al. [34] proposed a transformer model inspired by physics to improve atmospheric turbulence imaging and introduced a unified approach for mitigating atmospheric turbulence in both static and dynamic sequences [35]. In summary, although existing neural network models have achieved impressive results in restoration tasks, many primarily rely on redundant spatial domain information for representation learning, often overlooking the potential value of frequency domain information in blind turbulence image restoration tasks.
In image restoration theory, high-frequency information in the frequency domain represents textures and details, while low-frequency information is primarily associated with smooth areas of an image [36]. According to signal processing theory, filtering different frequency domain components can more effectively restore degraded images. Additionally, advancements in fast Fourier transform (FFT) technology have significantly enhanced the speed of processing frequency domain signals. As a result, the role of frequency domain information in neural networks has garnered increasing attention. For example, to address spatial domain degradation in deep learning-based exposure correction algorithms, Jie Huang et al. [37] proposed the Fourier Transform-Based Deep Exposure Correction Network (FECNet). In end-to-end image-deblurring networks, Xintian Mao et al. [38] introduced the Residual Fast Fourier Transform-Convolution Block (Res FFT-Conv Block). Chongyi Li et al. [39] enhanced low-light images by correcting phase and amplitude through neural networks, while Shi Guo et al. [40] proposed the Spatial-Frequency Attention Network (SFANet) to improve convolutional networks’ ability to capture long-range dependencies. Xuanhua He et al. [41] achieved dual-domain complementary learning by using spatial information for local feature learning and frequency domain information for global feature learning. Lei Lu et al. [42] introduced the Denoising Frequency Attention Network (DFANet) based on frequency differences, and Kangzhen Yang et al. [43] separated and enhanced high-frequency and low-frequency information in image enhancement tasks. Xin Yuan et al. [44] proposed SFUNet, which employs frequency-aware convolution and attention modules to jointly model complementary information from both spatial and frequency domains in wavelet data. Tian Zhou et al. [45] demonstrated that applying the Fourier transform in transformers can better capture the global characteristics of time series, while Badri Narayana Patro et al. [46] combined frequency domain information with multi-head attention layers to create a more efficient transformer architecture. These studies collectively underscore the critical importance of frequency domain information in neural network models. However, a significant research gap remains in how to effectively combine dual-domain information to reconstruct high-quality images in the context of blind turbulence image restoration.
In the intersection of deep learning and computational optics, this paper introduces a novel neural network architecture, the Dual-Domain Removal Turbulence Network (DDRTNet). This model is inspired by the intricate relationship between spatial and frequency domain degradation in atmospheric turbulence imaging. DDRTNet is built upon cutting-edge neural network modules, with the primary goal of significantly enhancing the accuracy of image restoration and the quality of reconstructed images. At the heart of DDRTNet are its innovative multiscale spatial and frequency domain attention mechanisms, coupled with a dual-domain collaborative learning strategy. These features not only improve the model’s ability to capture and represent multiscale features but also effectively integrate degradation information from both the spatial and frequency domains, leading to precise restoration of turbulence-degraded images. To thoroughly assess DDRTNet’s performance, extensive simulations and real-world experiments were conducted. The results demonstrate that DDRTNet outperforms existing methods in image restoration tasks, highlighting its efficiency and practical value in blind turbulence image restoration. This research not only introduces new approaches for restoring turbulence-degraded images but also serves as a valuable reference for the application of deep learning in the broader field of image processing.

2. Materials and Methods

2.1. Atmospheric Turbulence Imaging Model

In this section, the properties and simplified models of atmospheric turbulence in imaging are discussed in detail. Figure 1 illustrates the process of how real image information is distorted by atmospheric turbulence and captured by a camera as a degraded image. In optical systems, the degradation function caused by turbulence can be considered a spatially invariant function; thus, the image degradation process can be modeled as a degradation system consisting of a degradation function and additive noise [47], as shown in Equation (1), with the polar coordinates used later defined in Equation (2):
$$ g(x, y) = h(x, y) \otimes o(x, y) + n(x, y) \tag{1} $$
$$ r = \sqrt{x^2 + y^2}, \qquad \theta = \arctan\!\left(\frac{y}{x}\right) \tag{2} $$
where $g(x, y)$ is the degraded image, $h(x, y)$ is the degradation function, $o(x, y)$ is the real image, $n(x, y)$ is additive noise, and $\otimes$ denotes the convolution operation.
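To make Equation (1) concrete, the following sketch simulates the spatially invariant degradation with an FFT-based circular convolution plus additive Gaussian noise; the Gaussian PSF and the noise level are placeholder assumptions for illustration and are not the turbulence PSF of Equation (3).

```python
import numpy as np

def degrade(o, h, noise_sigma=0.01, rng=np.random.default_rng(0)):
    """Apply Equation (1): g = h (*) o + n, with circular convolution via the FFT.

    o           : 2-D real image, values in [0, 1]
    h           : point spread function, same shape as o
    noise_sigma : standard deviation of the additive Gaussian noise n
    """
    # Convolution theorem: FFT(h (*) o) = FFT(h) * FFT(o)
    G = np.fft.fft2(h) * np.fft.fft2(o)
    g = np.real(np.fft.ifft2(G))
    n = rng.normal(0.0, noise_sigma, size=o.shape)
    return g + n

def gaussian_psf(shape, sigma=2.0):
    """Placeholder Gaussian PSF standing in for the turbulence PSF h(r, theta)."""
    y, x = np.indices(shape)
    cy, cx = (shape[0] - 1) / 2, (shape[1] - 1) / 2
    psf = np.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2))
    psf /= psf.sum()                      # normalize so image brightness is preserved
    return np.fft.ifftshift(psf)          # move the PSF peak to the origin

o = np.random.default_rng(1).random((256, 256))   # stand-in "real image"
g = degrade(o, gaussian_psf(o.shape))
print(g.shape, g.dtype)
```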
The commonly accepted degradation function for atmospheric turbulence is derived from the Kolmogorov spectrum [32], as in Equation (3):
$$ h(r, \theta) = \left| \mathcal{F}\!\left\{ p(r, \theta)\, \exp\!\left[ i\, \varphi(r, \theta) \right] \right\} \right|^{2} \tag{3} $$
where $p(r, \theta)$ is the optical pupil function of the camera, $\varphi(r, \theta)$ is the wavefront phase distribution, and $\mathcal{F}$ denotes the Fourier transform. The pupil function [5] is given in Equation (4):
$$ p(r, \theta) = \begin{cases} \dfrac{2}{\pi}\left[ \arccos\!\left( \dfrac{\Omega}{\Omega_0} \right) - \dfrac{\Omega}{\Omega_0}\sqrt{1 - \left( \dfrac{\Omega}{\Omega_0} \right)^{2}} \,\right], & \Omega \le \Omega_0 \\[2ex] 0, & \text{otherwise} \end{cases} \tag{4} $$
where $\Omega = 2r/\lambda$ is the spatial angular frequency, $\Omega_0 = D_0/\lambda$ is the cut-off spatial angular frequency of the optical system, $D_0$ is the diameter of the camera’s lens, and $r$ is the polar coordinate radius.
The wavefront phase distribution function after degradation by atmospheric turbulence can be decomposed into orthogonal Zernike polynomials [48] as in Equation (5):
$$ \varphi(r, \theta) = \sum_{i=1}^{\infty} a_i\, z_i(r, \theta) \tag{5} $$
$$ a_i = \int \mathrm{d}\mathbf{r}\; p(r, \theta)\, \varphi(r, \theta)\, z_i(r, \theta) \tag{6} $$
where $z_i(r, \theta)$ is the i-th Zernike polynomial and $a_i$ is the corresponding Zernike coefficient. Since $\varphi(r, \theta)$ follows a Gaussian distribution at each point in space, $a_i$ is also Gaussian. The Zernike coefficients can be calculated by Equation (6).
From the Kolmogorov turbulence theory, the relationship between the Zernike polynomial coefficients and the covariance matrix elements [48] is determined as follows:
$$ \left\langle a_i a_j \right\rangle = \begin{cases} c_{ij} \left( \dfrac{D_0}{r_0} \right)^{5/3}, & i - j \ \text{even} \\[1.5ex] 0, & i - j \ \text{odd} \end{cases} \tag{7} $$
where $c_{ij}$ is the covariance matrix element that can be calculated from the refractive fluctuation exponent characterized by the Kolmogorov power spectrum, $D_0$ is the diameter of the camera’s lens, and $r_0$ is the atmospheric coherence length. Because each Zernike polynomial coefficient is related to $D_0 / r_0$, this ratio is often used to characterize the turbulence intensity.
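As an illustration of Equation (7), the sketch below draws one random realization of correlated Gaussian Zernike coefficients from a given covariance matrix via Cholesky factorization; the small covariance matrix used here is a made-up placeholder, not the Kolmogorov-derived $c_{ij}$ values.

```python
import numpy as np

def sample_zernike_coeffs(cov, d_over_r0, rng=np.random.default_rng(0)):
    """Draw Gaussian Zernike coefficients a_i with <a_i a_j> = c_ij * (D0/r0)^(5/3).

    cov : (N, N) covariance matrix of the c_ij elements (symmetric, positive definite)
    """
    scaled = cov * d_over_r0 ** (5.0 / 3.0)
    L = np.linalg.cholesky(scaled)        # scaled = L @ L.T
    return L @ rng.standard_normal(cov.shape[0])

# Placeholder 4x4 covariance (positive definite, nonzero only for even i - j),
# not the real c_ij values
c = np.array([[0.45, 0.00, 0.01, 0.00],
              [0.00, 0.45, 0.00, 0.01],
              [0.01, 0.00, 0.02, 0.00],
              [0.00, 0.01, 0.00, 0.02]])
a = sample_zernike_coeffs(c, d_over_r0=1.0)
print(a)   # one random realization of the first four coefficients
```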

2.2. The Structure of Network

2.2.1. Res FFT-GeLU Block

To integrate spatial and frequency domain information, we propose the Res FFT-GeLU Block, as illustrated in Figure 2. This structure consists of two parallel branches: a spatial domain branch and a frequency domain branch. The spatial domain branch focuses on extracting spatial domain features of the image to capture spatial degradation information. In the spatial domain branch, two stacked 3 × 3 convolutions provide the same receptive field as one 5 × 5 convolution while using fewer parameters, which expands the receptive field of the network and allows it to learn richer features. The 1 × 1 convolution kernel used in the frequency domain branch is equivalent to a multiplication operation. Before entering the Res FFT-GeLU module, different texture information in the image has already been extracted separately. Because different texture information is composed of different frequency domain information, we assume that the 1 × 1 convolution parameter can be approximated as the reciprocal of the degradation function value of different frequency information, thus alleviating turbulence interference in the frequency domain information of different channel dimensions. The residual connection in each branch serves to accelerate training, alleviate the vanishing gradient problem during training, and reduce the loss of feature information. The frequency domain branch acts like an adaptive filter, capable of filtering the frequency domain signals of different channels to adaptively process high- and low-frequency features. Through carefully designed filters, we can selectively enhance or suppress specific frequency components, effectively filtering and handling frequency domain degradation information. Finally, the outputs of the spatial domain branch and the frequency domain branch are merged using an additive fusion strategy, resulting in a feature representation that integrates spatial and frequency domain information. This dual-domain fusion approach leverages the complementary aspects of both domains, thereby enhancing the performance of the image restoration model.
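The following PyTorch sketch is one minimal reading of this dual-branch layout (spatial branch with two 3 × 3 convolutions, frequency branch applying 1 × 1 convolutions and GeLU to the spectrum, residual connections in each branch, additive fusion); stacking the real and imaginary parts along the channel axis and omitting normalization layers are implementation assumptions, and this is not the authors' released code.

```python
import torch
import torch.nn as nn

class ResFFTGeLUBlock(nn.Module):
    """Sketch of the dual-branch block: spatial 3x3 convs + frequency-domain 1x1 convs."""

    def __init__(self, channels: int):
        super().__init__()
        # Spatial branch: two 3x3 convs (receptive field of one 5x5 conv, fewer parameters)
        self.spatial = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.GELU(),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
        )
        # Frequency branch: 1x1 convs act as per-channel multiplications on the spectrum.
        # Real and imaginary parts are stacked along the channel axis (2C channels).
        self.freq = nn.Sequential(
            nn.Conv2d(2 * channels, 2 * channels, kernel_size=1),
            nn.GELU(),
            nn.Conv2d(2 * channels, 2 * channels, kernel_size=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        _, _, h, w = x.shape
        # --- spatial branch with residual connection ---
        x_s = self.spatial(x) + x
        # --- frequency branch with residual connection ---
        spec = torch.fft.rfft2(x, norm="ortho")             # (B, C, H, W//2+1), complex
        spec_ri = torch.cat([spec.real, spec.imag], dim=1)  # (B, 2C, H, W//2+1), real
        spec_ri = self.freq(spec_ri) + spec_ri
        real, imag = torch.chunk(spec_ri, 2, dim=1)
        x_f = torch.fft.irfft2(torch.complex(real, imag), s=(h, w), norm="ortho")
        # --- additive fusion of the two branches ---
        return x_s + x_f

# Quick shape check
block = ResFFTGeLUBlock(channels=32)
print(block(torch.randn(1, 32, 64, 64)).shape)   # torch.Size([1, 32, 64, 64])
```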
Based on the atmospheric turbulence imaging model of Section 2.1, we will analyze the adaptive filtering capability of the frequency domain branch in detail. In order to process the frequency domain information, Equation (1) was Fourier transformed to the following:
$$ G(u, v) = H(u, v)\, O(u, v) + N(u, v) \tag{8} $$
where $G(u, v)$, $H(u, v)$, $O(u, v)$, and $N(u, v)$ are the Fourier transforms of the degraded image, the degradation function, the real image, and the additive noise, respectively. Compared with Equation (1), Equation (8) is easier for the neural network to fit.
The mathematical model of the frequency domain branch is shown in Equation (9), and the input feature of the branch is shown in Equation (10):
$$ X_2^{C \times H \times W}(u, v) = W_2^{C \times 1 \times 1}\, \mathrm{GeLU}\!\left( W_1^{C \times 1 \times 1} X_1^{C \times H \times W}(u, v) \right) + X_1^{C \times H \times W}(u, v) \tag{9} $$
$$ X_1^{C \times H \times W}(u, v) = H^{C \times H \times W}(u, v)\, O^{C \times H \times W}(u, v) + N(u, v) \tag{10} $$
where $O^{C \times H \times W}(u, v)$ is the frequency domain information of the real image texture features, $X_1^{C \times H \times W}(u, v)$ is the spatial domain feature $X_{\mathrm{in}}^{C \times H \times W}(x, y)$ after the fast Fourier transform, and $X_2^{C \times H \times W}(u, v)$ is the output of the frequency domain branch.
Since the 1 × 1 convolution operation is equivalent to the multiplication operation, combining Equations (9) and (10) yields Equation (11), as follows:
$$ W_1^{C \times 1 \times 1} X_1^{C \times H \times W}(u, v) = W_1^{C \times 1 \times 1} H^{C \times H \times W}(u, v)\, O^{C \times H \times W}(u, v) + W_1^{C \times 1 \times 1} N(u, v) \tag{11} $$
When $(u, v)$ is confined to a small range, $H^{C \times H \times W}(u, v)$ can be regarded as a constant. The weights $W_1^{C \times 1 \times 1}$ learned adaptively by the neural network can then approximate $1 / H^{C \times H \times W}(u, v)$ at the corresponding frequencies, so that $W_1^{C \times 1 \times 1} H^{C \times H \times W}(u, v) \approx 1$ and the transformed noise becomes $N'(u, v) = W_1^{C \times 1 \times 1} N(u, v)$. In this case, Equation (9) can be simplified to Equation (12):
$$ X_2^{C \times H \times W}(u, v) = W_2^{C \times 1 \times 1}\, \mathrm{GeLU}\!\left( O^{C \times H \times W}(u, v) + N'(u, v) \right) + X_1^{C \times H \times W}(u, v) \tag{12} $$
$$ \mathrm{GeLU}(x) = 0.5\, x \left( 1 + \tanh\!\left[ \sqrt{2/\pi} \left( x + 0.044715\, x^{3} \right) \right] \right) \tag{13} $$
From Equation (12), we can see that the feature $O^{C \times H \times W}(u, v) + N'(u, v)$ still contains the random noise interference of the degraded system, so a filtering process can be applied to achieve noise reduction. As depicted in Figure 3a,b, the GeLU [49] function, whose expression is presented in Equation (13), and its derivative exhibit a striking resemblance to the Butterworth filter [36], whereas the ReLU function possesses characteristics analogous to an ideal filter with superior noise suppression [50]. In the context of neural networks, however, GeLU is deemed more suitable as the filter function for the Res FFT-GeLU Block. This is due to several reasons: First, ideal filters are physically unattainable, whereas Butterworth filters offer a more physically interpretable alternative [36]. Second, as we postulate that some noise is inherently coupled with the image frequencies, preserving relevant frequency information is conducive to image restoration. Last, the ReLU function is prone to the “dying ReLU” issue during network training [51], which GeLU effectively circumvents. After filtering, the frequency information is integrated by the convolution $W_2^{C \times 1 \times 1}$ and combined with the residual connection to improve the training speed and prevent the loss of key information [52]; we finally obtain the frequency domain information $X_2^{C \times H \times W}(u, v)$, which is close to that of the real image.
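For reference, the tanh approximation of Equation (13) can be checked numerically against the exact erf-based definition of GeLU; this is only a sanity check of the formula, not part of the network.

```python
import math

def gelu_tanh(x: float) -> float:
    """Equation (13): tanh approximation of GeLU."""
    return 0.5 * x * (1.0 + math.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)))

def gelu_exact(x: float) -> float:
    """Exact GeLU: x * Phi(x), with Phi the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

for x in (-3.0, -1.0, 0.0, 0.5, 2.0):
    print(f"x={x:+.1f}  tanh-approx={gelu_tanh(x):+.5f}  exact={gelu_exact(x):+.5f}")
```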

2.2.2. Overall Architecture

To mitigate turbulence effects in images, we propose a deep learning model called the Dual-Domain Removal Turbulence Network (DDRTNet), illustrated in Figure 4. Inspired by state-of-the-art dual-domain neural network models, our model adopts the U-Net [53] encoder–decoder framework, which includes three scales in both the encoding and decoding stages. Research by Hamidreza Fazlali et al. [54] has shown that downsampling operations can alleviate image blur, and clearer small-scale outputs can help restore clearer large-scale images. Based on these findings, we construct corresponding inputs and outputs at three scales within the network. By computing losses on the downsampled outputs, we aim to further enhance the network’s ability to restore degraded regions at different scales. Each scale branch is shaped like a U-Net variant, consisting mainly of an encoder and a decoder. After encoding features of different scales and dimensions, corresponding decoding processes are performed to obtain restored images of different scales, making the model capable of multi-scale restoration. This design makes fuller use of feature information at different scales and enhances the model’s ability to represent multi-scale features. To enable the model to efficiently learn features from downsampled inputs, we use the SCM block, as shown in Figure 5a, which can rapidly increase the number of channels of the downsampled inputs to match the required channels for the corresponding scale of the model. As shown in Figure 5b, the ResGroup comprises a ResBlock, FFT-GeLU Block, and the Dual-Domain Strip Attention Mechanism (DSAM) module [55]. Initially, the ResBlock, consisting of two layers of 3 × 3 convolution and GeLU non-linear activation functions, is used to efficiently extract features with doubled or halved channels. Once features are extracted, they are quickly integrated with spatial and frequency domain information through the residual block composed of the FFT-GeLU Block and DSAM. The DSAM module, which efficiently refines spectra and integrates contextual information, is placed after the Res FFT-GeLU Block, with its structure depicted in Figure 5c. To reduce computational complexity while enhancing the network’s multiscale feature learning capability, we also embed the Omni-Kernel Module (OKM) [56] between the encoding and decoding stages, with the OKM structure shown in Figure 5d. The 3 × 3 convolution before each of the three scale outputs fuses the features extracted by the ResGroup into a three-channel restored image, while the other 3 × 3 convolutions increase or decrease the number of channels. These operations are crucial for adjusting the network structure, controlling model complexity, and improving model performance. Concatenation connects features or outputs of different network modules by splicing the feature information extracted from the downsampled image and the original image along the channel dimension, thereby forming a richer feature representation (see the sketch after this paragraph). Then, a 3 × 3 convolution fuses the features of different channels while increasing or decreasing the channel count, enabling interaction between feature information across channels. Finally, to minimize the loss of image information during the downsampling and upsampling processes, we design residual connections at each scale of the encoding and decoding stages, as well as in the output results.
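A minimal sketch of how a downsampled input might be folded into the encoder at one scale, assuming an SCM-style channel expansion followed by channel-wise concatenation and a 3 × 3 fusion convolution; the internal SCM layers and channel counts are illustrative guesses based on the description above, not the released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SCM(nn.Module):
    """Shallow module that lifts a 3-channel downsampled input to `channels` features."""
    def __init__(self, channels: int):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Conv2d(3, channels // 2, kernel_size=3, padding=1), nn.GELU(),
            nn.Conv2d(channels // 2, channels, kernel_size=3, padding=1), nn.GELU(),
        )

    def forward(self, img):
        return self.proj(img)

def fuse_scale(encoder_feat: torch.Tensor, img_full: torch.Tensor,
               scm: SCM, fuse_conv: nn.Conv2d) -> torch.Tensor:
    """Concatenate SCM features of the downsampled image with encoder features,
    then merge channels back with a 3x3 convolution."""
    scale = encoder_feat.shape[-1] / img_full.shape[-1]
    img_ds = F.interpolate(img_full, scale_factor=scale, mode="bilinear",
                           align_corners=False)
    fused = torch.cat([encoder_feat, scm(img_ds)], dim=1)   # channel-wise concatenation
    return fuse_conv(fused)                                  # back to encoder_feat channels

# Shape check at the 1/2-scale stage (channel counts are illustrative)
c = 64
feat = torch.randn(1, c, 128, 128)        # encoder features at 1/2 scale
img = torch.randn(1, 3, 256, 256)         # full-resolution degraded input
out = fuse_scale(feat, img, SCM(c), nn.Conv2d(2 * c, c, kernel_size=3, padding=1))
print(out.shape)                           # torch.Size([1, 64, 128, 128])
```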
Through the design of DDRTNet, the potential of frequency domain information in reconstructing high-quality images is fully exploited. Our model not only effectively reduces the number of ResBlock modules typically used in traditional spatial domain models, but also enhances the network’s nonlinear mapping capability. Aimed at reconstructing high-quality images from those degraded by atmospheric turbulence, our model design makes full use of the characteristics of the atmospheric turbulence imaging model.

2.2.3. The Implementation of DDRTNet

In this section, we will detail the construction of the training dataset and test samples. First, 1200 images were randomly selected from the ImageNet dataset [57] to serve as the original dataset. Degraded images were generated using the turbulence simulation algorithm proposed by Zhiyuan Mao et al. [58]. Relevant parameters of the simulation were $L = 3000\ \mathrm{m}$, $D_0 = 0.1\ \mathrm{m}$, and $r_0 = 0.1\ \mathrm{m}$. The ImageNet dataset [57] has been widely used in research on computer vision algorithms since it was introduced in 2010 for the ImageNet Large Scale Visual Recognition Challenge. The turbulence simulation algorithm proposed by Zhiyuan Mao et al. [58] has also gained widespread recognition and was adopted as the official simulation algorithm for the CVPR UG2+ Challenge Track 2.2-Coded Target Restoration through Atmospheric Turbulence. A subset of background images and their corresponding simulated degraded versions are shown in Figure 6. Visually, the results of the simulation closely resemble the actual effects of atmospheric turbulence, accurately replicating the distortions typically seen in real-world scenarios.
In our dataset, we used 1000 images for the training dataset, applying data augmentation techniques to expand the training set’s size. An additional 100 images were used as the validation dataset, and the remaining 100 images were designated as the test dataset. The network’s three input image sizes were the original size of 256 × 256, half-size downsampled to 128 × 128, and quarter-size downsampled to 64 × 64. To balance the representation learning of both the spatial and frequency domains, we employed the L1 loss function for both the spatial and frequency domain signals in the model’s restoration results. The loss functions for the spatial domain, frequency domain, and the overall loss are given by Equations (14), (15), and (16), respectively.
$$ L_c = \frac{1}{E} \left\| I^{*} - Y \right\|_1 \tag{14} $$
$$ L_f = \frac{1}{E} \left\| \mathcal{F}\!\left( I^{*} \right) - \mathcal{F}\!\left( Y \right) \right\|_1 \tag{15} $$
$$ L_0 = \left( L_c + L_{c,1/2} + L_{c,1/4} \right) + \lambda \left( L_f + L_{f,1/2} + L_{f,1/4} \right) \tag{16} $$
where $I^{*}$ and $Y$ represent the predicted image and the background reality, and $E$ indicates the number of output elements. $L_{c,1/2}$, $L_{c,1/4}$, $L_{f,1/2}$, and $L_{f,1/4}$ denote the spatial and frequency domain losses of the predicted images downsampled by factors of 1/2 and 1/4, and $\lambda$ is the weighting factor, which is set to 0.01 here [50].
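A minimal PyTorch reading of Equations (14)–(16), assuming three prediction–target pairs (full, 1/2, and 1/4 scale) and λ = 0.01; averaging over all output elements (the 1/E factor) and taking the modulus of the complex spectral difference for the frequency L1 term are interpretation choices.

```python
import torch

def dual_domain_loss(preds, targets, lam: float = 0.01) -> torch.Tensor:
    """Equations (14)-(16): summed spatial + frequency L1 losses over three scales.

    preds, targets : lists of three tensors (full, 1/2 and 1/4 resolution), shape (B, C, H, W)
    """
    total = preds[0].new_zeros(())
    for I_star, Y in zip(preds, targets):
        l_c = torch.mean(torch.abs(I_star - Y))            # spatial L1, Eq. (14)
        F_I = torch.fft.fft2(I_star)
        F_Y = torch.fft.fft2(Y)
        l_f = torch.mean(torch.abs(F_I - F_Y))             # frequency L1, Eq. (15)
        total = total + l_c + lam * l_f                    # accumulate Eq. (16)
    return total

# Example with random stand-in outputs at 256, 128 and 64 pixels
preds   = [torch.rand(2, 3, s, s) for s in (256, 128, 64)]
targets = [torch.rand(2, 3, s, s) for s in (256, 128, 64)]
print(dual_domain_loss(preds, targets))
```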

3. Results and Discussion

3.1. The Test of Image Restoration in Simulation

In this section, we will validate the performance and robustness of DDRTNet using a simulated dataset. The initial learning rate was set to $8 \times 10^{-4}$, and the training process took approximately 3 h. We employed a cosine annealing learning rate schedule, with the minimum learning rate set to $1 \times 10^{-6}$. The batch size was set to 20, using the Adam optimizer with a weight decay rate of $1 \times 10^{-8}$. Our training platform consisted of a Supermicro X11SCA-F server (Supermicro, Shenzhen, China), equipped with an Intel Xeon Silver 4210R CPU (Intel, Shenzhen, China) and an NVIDIA GeForce RTX 3090 GPU (Nvidia, Shenzhen, China). The model was trained and tested using PyTorch 1.8.1 and Python 3.7.3. The loss curves of DDRTNet are shown in Figure 7.
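The optimizer and schedule described above map directly onto standard PyTorch components, as sketched below; `model`, `train_loader`, and the number of epochs are placeholders rather than the actual DDRTNet training script.

```python
import torch
from torch.optim import Adam
from torch.optim.lr_scheduler import CosineAnnealingLR

# Placeholders: `model` and `train_loader` stand in for DDRTNet and the simulated dataset.
model = torch.nn.Conv2d(3, 3, 3, padding=1)
train_loader = [(torch.rand(20, 3, 256, 256), torch.rand(20, 3, 256, 256))]  # batch size 20
epochs = 100                                                                  # assumed

optimizer = Adam(model.parameters(), lr=8e-4, weight_decay=1e-8)
scheduler = CosineAnnealingLR(optimizer, T_max=epochs, eta_min=1e-6)  # cosine annealing

for epoch in range(epochs):
    for degraded, clean in train_loader:
        optimizer.zero_grad()
        restored = model(degraded)
        loss = torch.mean(torch.abs(restored - clean))   # stand-in for Eq. (16)
        loss.backward()
        optimizer.step()
    scheduler.step()   # anneal the learning rate from 8e-4 toward 1e-6
```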
In Figure 8, we present the restoration results of DDRTNet on the test set. From the figure, it is evident that our model effectively suppresses the turbulence effects in the images, significantly enhancing their clarity and detail representation. To validate the superiority of DDRTNet, we compared it with several state-of-the-art deep learning-based image restoration methods, including frequency domain-based Frequency Selection Network (FSNet) [59], spatial domain-based MIRNet [60], and Nonlinear Activation Free Network (NAFNet) [61]. To ensure a fair comparison, all methods were trained using the same training protocol. The restoration results of these methods are also displayed in Figure 8. The comparison shows that DDRTNet not only preserves the texture of the images better but also more accurately corrects texture distortions and local blurring. From subjective evaluation, DDRTNet’s restoration results are the closest to the true background, demonstrating the best performance. Although other methods exhibit some restoration capabilities, they still leave certain areas blurry and some textures distorted. In contrast, DDRTNet shows higher accuracy and robustness in handling turbulence effects.
To objectively assess the quality of reconstructed images, we employed three widely used metrics: Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Index (SSIM), and Normalized Root Mean Squared Error (NRMSE). These metrics measure the differences and similarities between the restored images and the original images from different perspectives. By calculating these metrics, we can further validate the superiority of DDRTNet.
$$ \mathrm{PSNR} = 10 \times \log_{10}\!\left( \frac{\max\!\left[ I^{*}(x, y) \right]^{2}}{\frac{1}{HW} \sum_{x=0}^{W-1} \sum_{y=0}^{H-1} \left[ I^{*}(x, y) - Y(x, y) \right]^{2}} \right) \tag{17} $$
$$ \mathrm{SSIM}\!\left( I^{*}, Y \right) = \frac{\left( 2 \mu_{I^{*}} \mu_{Y} + (0.01 R)^{2} \right) \left( 2 \sigma_{I^{*} Y} + (0.03 R)^{2} \right)}{\left( \mu_{I^{*}}^{2} + \mu_{Y}^{2} + (0.01 R)^{2} \right) \left( \sigma_{I^{*}}^{2} + \sigma_{Y}^{2} + (0.03 R)^{2} \right)} \tag{18} $$
$$ \mathrm{NRMSE} = \sqrt{ \frac{\sum_{x=0}^{W-1} \sum_{y=0}^{H-1} \left[ I^{*}(x, y) - Y(x, y) \right]^{2}}{\sum_{x=0}^{W-1} \sum_{y=0}^{H-1} Y(x, y)^{2}} } \tag{19} $$
where $H$ and $W$ represent the height and width of the image; $\mu_{I^{*}}$ and $\sigma_{I^{*}}^{2}$ represent the mean and variance of the recovered image $I^{*}$; $\mu_{Y}$ and $\sigma_{Y}^{2}$ represent the mean and variance of $Y$; $\sigma_{I^{*}Y}$ is the covariance between $I^{*}$ and $Y$; and $R$ represents the range of pixel values. PSNR is an important objective metric for evaluating the fidelity of reconstructed images. Higher PSNR values indicate greater fidelity of the reconstructed images. SSIM, on the other hand, objectively measures the structural similarity between the reconstructed image and the original image. When the reconstructed image closely resembles the original, SSIM approaches 1. NRMSE quantifies the difference between the reconstructed image and the original image; a lower NRMSE indicates higher similarity, approaching 0 as the reconstructed image becomes more similar to the original. Additionally, the standard deviations of PSNR (PSNR_td) and SSIM (SSIM_td) are employed to assess the stability of the model.
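A NumPy sketch of PSNR and NRMSE exactly as written in Equations (17) and (19); SSIM involves windowed local statistics, so in practice an existing implementation such as skimage.metrics.structural_similarity is typically used rather than coding Equation (18) by hand.

```python
import numpy as np

def psnr(I_star: np.ndarray, Y: np.ndarray) -> float:
    """Equation (17): peak signal-to-noise ratio in dB."""
    mse = np.mean((I_star - Y) ** 2)
    return float(10.0 * np.log10(I_star.max() ** 2 / mse))

def nrmse(I_star: np.ndarray, Y: np.ndarray) -> float:
    """Equation (19): RMSE normalized by the energy of the reference image."""
    return float(np.sqrt(np.sum((I_star - Y) ** 2) / np.sum(Y ** 2)))

restored  = np.clip(np.random.rand(256, 256), 0, 1)                    # stand-in restored image
reference = np.clip(restored + 0.05 * np.random.randn(256, 256), 0, 1) # stand-in ground truth
print(f"PSNR  = {psnr(restored, reference):.2f} dB")
print(f"NRMSE = {nrmse(restored, reference):.4f}")
```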
The results of the objective evaluation metrics are presented in Table 1. As shown, DDRTNet outperforms all other methods across all metrics. Specifically, DDRTNet achieves a 1.02 dB improvement in PSNR over the second-best method, indicating that our method reconstructs images with lower error and higher quality. In terms of SSIM, DDRTNet shows an increase of 0.0338, suggesting that our method better preserves the structural information of the images, making the restoration closer to the original. Additionally, DDRTNet reduces NRMSE by 0.0089, further demonstrating the superiority of our method in image restoration. Moreover, the standard deviations of PSNR (PSNR_td) and SSIM (SSIM_td) were calculated to assess the stability of different methods on the test data. As seen in Table 1, DDRTNet has lower SSIM_td values compared to other methods, indicating higher stability when handling various test data. In summary, DDRTNet exhibits exceptional performance in restoration tasks. Whether from the perspective of subjective perception or objective evaluation metrics, DDRTNet significantly outperforms other methods.
To comprehensively evaluate the performance of DDRTNet, we further compared DDRTNet with three other state-of-the-art image restoration methods in terms of algorithm complexity and runtime. As shown in Table 2, we recorded the total floating-point operations (FLOPs) and the number of parameters (Params) for each method, along with their average runtime on the test set. The total floating-point operations (FLOPs) indicate the time complexity of the algorithm, while the number of parameters (Params) represents the spatial complexity of the algorithm. These two metrics are crucial for evaluating the performance of the model, especially in real-time image restoration tasks. Smaller parameter counts and shorter runtimes imply better portability and real-time capability of the model.
As shown in Table 2, although our method does not achieve the best results in either computational complexity or space complexity, it is second best in both, striking a balance between the two. Specifically, the parameter count of DDRTNet is relatively low, which helps in minimizing the storage requirements and enhancing its portability across different devices. Additionally, DDRTNet’s lower FLOPs indicate that it requires fewer computational resources to perform the restoration tasks, allowing for faster image data processing and meeting real-time requirements. In terms of average runtime, DDRTNet also excels. Compared to the other three methods, DDRTNet has a shorter average runtime, further demonstrating its superiority in real-time image restoration tasks. This advantage makes DDRTNet particularly suitable for applications requiring quick response times, such as autonomous driving and video surveillance.
In summary, DDRTNet not only excels in image restoration quality but also shows significant advantages in algorithm complexity and runtime. This makes DDRTNet an ideal choice for real-time image restoration tasks, offering broad application prospects and potential in various fields.

3.2. Ablation

To further investigate the impact of different components within DDRTNet on its performance, we conducted a series of detailed ablation experiments. In these experiments, we focused primarily on the activation functions within the Res FFT-GeLU Block and the specific modules within DDRTNet. Firstly, we replaced the GeLU function in the Res FFT-GeLU Block with a ReLU function to observe the effect of different activation functions on network performance. Additionally, we added or removed specific modules within DDRTNet (indicated by ✓/✗) to verify their contributions to the network’s ability to mitigate turbulence effects.
In the experiments, we ensured that each variant adopted the same hyperparameters and training strategies to ensure fairness and comparability of the results. We selected the model parameters that performed best on the validation set for testing and recorded the performance metrics of each variant. The experimental results are shown in Table 3, from which we can observe that both replacing the activation function and removing specific modules led to a performance decline in DDRTNet, resulting in a weakened ability to remove turbulence. This result clearly demonstrates the rationality of our selection of various modules in DDRTNet and the importance of these modules working together to enhance network performance.
Specifically, the GeLU function in the Res FFT-GeLU Block exhibited better performance than the ReLU function in the experiments, effectively validating the viewpoint we proposed when designing the Res FFT-GeLU Block. Additionally, each module within DDRTNet played a crucial role in the process of removing turbulence effects, and their combination enabled the network to better understand and handle complex degradation patterns in the images. Through this experiment, we not only gained a deeper understanding of the roles of each module in DDRTNet but also provided valuable insights for further optimizing the network structure in the future.

3.3. The Robustness of DDRTNet on Different Noises

To verify the robustness of DDRTNet against noise interference, we added four sets of Gaussian noise to the test set, with mean values of 0 and variances of 0.09, 0.16, 0.25, and 0.36, respectively. We directly input these noisy images into the trained model and obtained the corresponding restoration results as shown in Figure 9. The relevant objective evaluation metrics are presented in Table 4.
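The noisy test inputs can be generated as sketched below, assuming images normalized to [0, 1]; the variances 0.09–0.36 correspond to standard deviations of 0.3–0.6.

```python
import numpy as np

def add_gaussian_noise(img: np.ndarray, variance: float,
                       rng=np.random.default_rng(0)) -> np.ndarray:
    """Add zero-mean Gaussian noise of the given variance and clip back to [0, 1]."""
    noisy = img + rng.normal(0.0, np.sqrt(variance), size=img.shape)
    return np.clip(noisy, 0.0, 1.0)

degraded = np.random.default_rng(1).random((256, 256, 3))     # stand-in test image
noisy_sets = {v: add_gaussian_noise(degraded, v) for v in (0.09, 0.16, 0.25, 0.36)}
print({v: float(x.std()) for v, x in noisy_sets.items()})
```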
In Figure 9, we can clearly observe the impact of additive noise on the quality of reconstructed images. Specifically, when the noise is weak, its influence on the reconstruction quality is relatively minor, and the image can still maintain a high level of clarity and detailed information. However, as the noise intensity gradually increases, the quality of the reconstructed image begins to decline, manifested by an increase in noise points in the image and a blurring of detailed information. It is worth noting that even with the increasing noise intensity, DDRTNet’s reconstruction results can still preserve a significant amount of detailed information. From a subjective visual perception standpoint, despite the image being subjected to a certain degree of noise interference, the target information can still be accurately identified and perceived. This indicates that DDRTNet exhibits strong robustness when handling noise interference. Additionally, the quantitative results in Table 4 further validate this observation. With the increase in noise variance, the PSNR and SSIM metrics of the reconstructed images do decrease, but the decrease is relatively small. This indicates that DDRTNet can maintain relatively stable performance when faced with noise interference, without causing a significant decline in image quality due to noise.
In conclusion, DDRTNet demonstrates strong robustness and excellent reconstruction capabilities when dealing with degraded images affected by additive noise interference. This characteristic makes DDRTNet highly promising for practical applications, especially in complex and dynamic natural scenes, where it can effectively address image degradation issues and provide high-quality image reconstruction results.

3.4. The Performance of DDRTNet under Different Turbulence Intensities

To evaluate the ability of DDRTNet to remove turbulence of different intensities, we generated datasets using the same simulation method with varying turbulence strengths ($D_0/r_0 = 0.2$ and $D_0/r_0 = 1.3$). Utilizing these new datasets, we applied identical hyperparameters and training strategies to retrain the model. The experimental results are presented in Figure 10 and Table 5.
When the turbulence intensity is $D_0/r_0 = 0.2$, the degradation of the images mainly manifests as atmospheric blur, causing partial loss of details in objects. However, DDRTNet demonstrates the ability to effectively reconstruct these lost image details. For example, in the test results shown in Figure 10, DDRTNet accurately restored the details of the billboard text, the small house behind the hot air balloon, and the background behind the plane. This indicates that DDRTNet exhibits good reconstruction capability when dealing with low-intensity turbulence degradation. As the turbulence intensity increases to $D_0/r_0 = 1.3$, the degradation of image quality becomes more severe. Key information such as text on billboards, images on hot air balloons, and aircraft outlines have suffered significant degradation. Nonetheless, DDRTNet is still able to reconstruct most of the texture information, although some details are lost. This indicates that DDRTNet still maintains a certain degree of robustness and reconstruction ability when faced with images with severe degradation.
From Table 5, it is evident that DDRTNet effectively enhances the quality of reconstructed images for different turbulence intensities. By comparing the PSNR and SSIM image quality evaluation metrics under different turbulence intensities, we can further validate DDRTNet’s reconstruction ability under varying turbulence strengths. The experimental results demonstrate that DDRTNet significantly improves the PSNR and SSIM values of reconstructed images under both turbulence intensities, thereby proving its excellent reconstruction performance under different turbulence strengths.

3.5. Outdoor Experiment Results and Discussions

In this section, the test dataset under real turbulence conditions is sourced from relevant videos on YouTube [62,63]. To verify the performance of DDRTNet under real turbulence conditions, we present the comparative experimental results and no-reference evaluation metrics (entropy) from the test set in Figure 11. Figure 11a shows the degraded image, while Figure 11b–d depict the restoration results of the comparative algorithms mentioned above, and Figure 11e showcases the restoration result of our proposed DDRTNet. In experiments under real turbulence conditions, due to the lack of real reference images, we utilize some no-reference evaluation metrics to assess the quality of the restored images. Table 6 presents the average values of corresponding evaluation metrics for different restoration algorithms under real scenarios. Entropy effectively evaluates the texture details of images, calculated using Equation (20). Average Gradient (AG) measures the clarity of the restored images, with larger AG values indicating better image clarity and restoration quality, calculated using Equation (21). NIQE is a no-reference image quality evaluation metric based on statistical features of natural scenes [64], primarily used to assess image naturalness. Smaller NIQE values indicate images closer to background reality, calculated using Equation (22). Laplacian Gradient (LG) is a commonly used sharpness evaluation metric in image processing, where larger LG values indicate clearer images.
$$ \mathrm{Entropy} = -\sum_{i=1}^{L} p_i \log p_i \tag{20} $$
where $p_i$ represents the ratio of the number of pixels with grey value $i$ to the total number of pixels, and $L$ is the number of grey levels. In general, entropy is an evaluation index of image texture details; the larger its value, the more complex the image texture is.
$$ \mathrm{AG} = \frac{1}{(H-1)(W-1)} \sum_{x=1}^{W-1} \sum_{y=1}^{H-1} \sqrt{ \frac{\left[ I^{*}(x+1, y) - I^{*}(x, y) \right]^{2} + \left[ I^{*}(x, y+1) - I^{*}(x, y) \right]^{2}}{2} } \tag{21} $$
where $H$ and $W$ represent the height and width of the image and $(x, y)$ are the pixel coordinates. In general, the average gradient is an evaluation index of image clarity; the larger its value, the clearer the image and the more detailed the information.
$$ \mathrm{NIQE} = \sqrt{ \left( V_1 - V_2 \right)^{T} \left( \frac{\Sigma_1 + \Sigma_2}{2} \right)^{-1} \left( V_1 - V_2 \right) } \tag{22} $$
where $V_1$, $V_2$ and $\Sigma_1$, $\Sigma_2$ are the mean vectors and covariance matrices of the multivariate Gaussian model for natural images and the multivariate Gaussian model for distorted images, respectively. In general, NIQE is an evaluation index of the naturalness of an image; the smaller its value, the more natural the image is.
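Entropy and average gradient can be computed directly from Equations (20) and (21), as sketched below for an 8-bit greyscale image (L = 256, log base 2 assumed); NIQE depends on a pretrained natural-scene statistics model, so an existing implementation is normally used rather than evaluating Equation (22) by hand.

```python
import numpy as np

def entropy(img_u8: np.ndarray) -> float:
    """Equation (20): Shannon entropy of the grey-level histogram (L = 256 bins)."""
    hist, _ = np.histogram(img_u8, bins=256, range=(0, 256))
    p = hist / hist.sum()
    p = p[p > 0]                            # skip empty bins so the log is defined
    return float(-np.sum(p * np.log2(p)))

def average_gradient(img: np.ndarray) -> float:
    """Equation (21): mean of sqrt((dx^2 + dy^2) / 2) over interior pixels."""
    dx = img[:-1, 1:] - img[:-1, :-1]       # I(x+1, y) - I(x, y): horizontal difference
    dy = img[1:, :-1] - img[:-1, :-1]       # I(x, y+1) - I(x, y): vertical difference
    return float(np.mean(np.sqrt((dx ** 2 + dy ** 2) / 2.0)))

restored = (np.random.default_rng(0).random((256, 256)) * 255).astype(np.uint8)
print(f"Entropy = {entropy(restored):.3f} bits, "
      f"AG = {average_gradient(restored.astype(float)):.3f}")
```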
Based on the results in Table 6, our proposed DDRTNet achieved the best scores in various evaluation metrics for blind turbulence image restoration tasks. Despite being trained only on simulated data, DDRTNet demonstrated outstanding generalization ability when facing unknown natural scenes. This achievement significantly proves the effectiveness of DDRTNet in handling turbulence-degraded images. It is noteworthy that compared to traditional networks based solely on spatial domain mapping, DDRTNet exhibited higher reconstruction quality in restoring turbulence-degraded images. This is mainly attributed to the integration of frequency domain information in DDRTNet’s design, allowing it to capture image details and textures more comprehensively.
From a visual perception perspective in Figure 11, while there are still some residual artifacts in the images reconstructed by DDRTNet that cannot be completely eliminated, compared to other methods, its reconstructed images are richer and clearer in texture information and image details. This is further corroborated by the quantified assessment of entropy, where DDRTNet’s reconstructed images exhibit higher entropy values, confirming its advantages in image texture restoration and detail preservation. Through both objective evaluation metrics and subjective visual perception judgments, we can confidently assert that DDRTNet not only achieves optimal performance in blind turbulence image restoration tasks but also demonstrates superior generalization capabilities.

4. Conclusions

In this paper, we introduce DDRTNet, an innovative deep learning network designed to leverage both spatial and frequency domain information for high-quality reconstruction of turbulence-degraded images. DDRTNet represents a pioneering approach to utilizing frequency domain information for blind turbulence image restoration tasks and is closely aligned with the atmospheric turbulence imaging model. Through extensive simulations and real-world tests, we demonstrate DDRTNet’s effectiveness and reliability. In simulations with additive noise, DDRTNet exhibits remarkable robustness to unknown noise interference, attributed to its filter-like structural design. In real-world scenarios, DDRTNet, trained exclusively on simulated data, successfully reconstructs degraded images under actual turbulence conditions, showcasing its impressive generalization capabilities. The network employs a dual-domain information fusion strategy, achieving superior restoration performance without increasing network depth by incorporating a modest number of frequency domain parameters at each feature scale. This design feature underscores DDRTNet’s potential for real-time applications. Furthermore, DDRTNet excels in image texture restoration and detail preservation due to its effective use of frequency domain information, surpassing existing methods in the quality of restored images. Overall, our experimental results affirm that DDRTNet offers a novel and effective solution for image restoration in the presence of atmospheric turbulence.
Despite these advancements, there is still room for improvement, particularly concerning residual artifacts in real-world restoration results. Future work will focus on optimizing the network architecture to mitigate these artifacts and refining network parameters by incorporating a larger dataset of real turbulence-degraded images.

Author Contributions

Conceptualization, D.S., R.J., B.H. and Y.W.; validation, formal analysis, data curation, instrument, writing—original draft preparation, visualization, J.Q.; writing—review and editing, J.Q., W.M. and D.S.; supervision, B.H., D.S. and Y.W.; project administration and funding acquisition, D.S. All authors have read and agreed to the published version of the manuscript.

Funding

The Youth Innovation Promotion Association of the Chinese Academy of Sciences, Chinese Academy of Sciences (No. 2020438); the Anhui International Joint Research Center for Ancient Architecture Intellisencing and Multi-Dimensional Modeling, Anhui Provincial Department of Science and Technology (No. GJZZX2022KF02); the HFIPS Director’s Fund, Hefei Institutes of Physical Science (grant No. YZJJ202404-CX and YZJJ202303-TS); and the Anhui Provincial Key Research and Development Project, Anhui Provincial Department of Science and Technology (No. 202304a05020053).

Data Availability Statement

The data were prepared and analyzed in this study. Data will be made available upon request.

Acknowledgments

We thank all anonymous reviewers for their comments and suggestions. In addition, the authors would like to thank Dongfeng Shi for their patience, help, and guidance.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Lau, C.P.; Lai, Y.H.; Lui, L.M. Restoration of atmospheric turbulence-distorted images via RPCA and quasiconformal maps. Inverse Probl. 2019, 35, 074002. [Google Scholar] [CrossRef]
  2. Fante, R.L. Electromagnetic beam propagation in turbulent media. Proc. IEEE 1975, 63, 1669–1692. [Google Scholar] [CrossRef]
  3. Hufnagel, R.; Stanley, N. Modulation transfer function associated with image transmission through turbulent media. JOSA 1964, 54, 52–61. [Google Scholar] [CrossRef]
  4. Halder, K.K.; Tahtali, M.; Anavatti, S.G. Geometric correction of atmospheric turbulence-degraded video containing moving objects. Opt. Express 2015, 23, 5091–5101. [Google Scholar] [CrossRef] [PubMed]
  5. Zou, H.; Li Qy, Z.Q. Research on influence of atmospheric turbulence parameters on image degradation. J. Chang. Univ. Sci. Technol. Nat. Sci. Ed. 2018, 41, 95–99. [Google Scholar]
  6. Cheng, J.; Li, J.; Dai, C.; Ren, Y.; Xu, G.; Li, S.; Chen, X.; Zhu, W. Research on atmospheric turbulence-degraded image restoration based on generative adversarial networks. In Proceedings of the First International Conference on Spatial Atmospheric Marine Environmental Optics (SAME 2023), Shanghai, China, 7–9 April 2023; Volume 12706, pp. 37–44. [Google Scholar]
  7. Huang, T.; Li, S.; Jia, X.; Lu, H.; Liu, J. Neighbor2neighbor: Self-supervised denoising from single noisy images. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 14781–14790. [Google Scholar]
  8. Cheng, S.; Wang, Y.; Huang, H.; Liu, D.; Fan, H.; Liu, S. Nbnet: Noise basis learning for image denoising with subspace projection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 4896–4906. [Google Scholar]
  9. Kupyn, O.; Budzan, V.; Mykhailych, M.; Mishkin, D.; Matas, J. Deblurgan: Blind motion deblurring using conditional adversarial networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 8183–8192. [Google Scholar]
  10. Tao, X.; Gao, H.; Shen, X.; Wang, J.; Jia, J. Scale-recurrent network for deep image deblurring. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 8174–8182. [Google Scholar]
  11. Guo, Y.; Chen, J.; Wang, J.; Chen, Q.; Cao, J.; Deng, Z.; Xu, Y.; Tan, M. Closed-loop matters: Dual regression networks for single image super-resolution. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 5407–5416. [Google Scholar]
  12. Liu, J.; Zhang, W.; Tang, Y.; Tang, J.; Wu, G. Residual feature aggregation network for image super-resolution. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 2359–2368. [Google Scholar]
  13. Rigaut, F.; Ellerbroek, B.L.; Northcott, M.J. Comparison of curvature-based and Shack–Hartmann-based adaptive optics for the Gemini telescope. Appl. Opt. 1997, 36, 2856–2868. [Google Scholar] [CrossRef] [PubMed]
  14. Krishnan, D.; Fergus, R. Fast image deconvolution using hyper-Laplacian priors. Adv. Neural Inf. Process. Syst. 2009, 22, 1033–1041. [Google Scholar]
  15. Mei, K.; Patel, V.M. Ltt-gan: Looking through turbulence by inverting gans. IEEE J. Sel. Top. Signal Process. 2023, 17, 587–598. [Google Scholar] [CrossRef]
  16. Cai, Z.; Zhong, Z.; Zhang, B. High-resolution restoration of solar images degraded by atmospheric turbulence effect using improved CycleGAN. New Astron. 2023, 101, 102018. [Google Scholar] [CrossRef]
  17. López-Tapia, S.; Wang, X.; Katsaggelos, A.K. Variational Deep Atmospheric Turbulence Correction for Video. In Proceedings of the 2023 IEEE International Conference on Image Processing (ICIP), Kuala Lumpur, Malaysia, 8–11 October 2023; pp. 3568–3572. [Google Scholar]
  18. Hill, P.; Anantrasirichai, N.; Achim, A.; Bull, D. Atmospheric Turbulence Removal with Video Sequence Deep Visual Priors. arXiv 2024, arXiv:2402.19041. [Google Scholar]
  19. Zhang, X.; Mao, Z.; Chimitt, N.; Chan, S.H. Imaging through the atmosphere using turbulence mitigation transformer. IEEE Trans. Comput. Imaging 2024, 10, 115–128. [Google Scholar] [CrossRef]
  20. Wang, X.; López-Tapia, S.; Katsaggelos, A.K. Real-World Atmospheric Turbulence Correction via Domain Adaptation. arXiv 2024, arXiv:2402.07371. [Google Scholar]
  21. Siddik, A.B.; Sandoval, S.; Voelz, D.; Boucheron, L.E.; Varela, L. Estimation of modified Zernike coefficients from turbulence-degraded multispectral imagery using deep learning. Appl. Opt. 2024, 63, E28–E34. [Google Scholar] [CrossRef]
  22. Zhang, X.; Chimitt, N.; Chi, Y.; Mao, Z.; Chan, S.H. Spatio-Temporal Turbulence Mitigation: A Translational Perspective. arXiv 2024, arXiv:2401.04244. [Google Scholar]
  23. Duan, L.; Zhong, L.; Zhang, J. Turbulent image deblurring using a deblurred blur kernel. J. Opt. 2024, 26, 065702. [Google Scholar] [CrossRef]
  24. Sineglazov, V.; Lesohorskyi, K.; Chumachenko, O. Faster Image Deblurring for Unmanned Aerial Vehicles. In Proceedings of the 2024 2nd International Conference on Unmanned Vehicle Systems-Oman (UVS), Muscat, Oman, 12–14 February 2024; pp. 1–6. [Google Scholar]
  25. Guo, Y.; Wu, X.; Qing, C.; Liu, L.; Yang, Q.; Hu, X.; Qian, X.; Shao, S. Blind Restoration of a Single Real Turbulence-Degraded Image Based on Self-Supervised Learning. Remote Sens. 2023, 15, 4076. [Google Scholar] [CrossRef]
  26. Ma, H.; Zhang, W.; Ning, X.; Liu, H.; Zhang, P.; Zhang, J. Turbulence Aberration Restoration Based on Light Intensity Image Using GoogLeNet. Photonics 2023, 10, 265. [Google Scholar] [CrossRef]
  27. Saha, R.K.; Qin, D.; Li, N.; Ye, J.; Jayasuriya, S. Turb-Seg-Res: A Segment-then-Restore Pipeline for Dynamic Videos with Atmospheric Turbulence. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 17–21 June 2024; pp. 25286–25296. [Google Scholar]
  28. Anantrasirichai, N. Atmospheric turbulence removal with complex-valued convolutional neural network. Pattern Recognit. Lett. 2023, 171, 69–75. [Google Scholar] [CrossRef]
  29. Jiang, W.; Boominathan, V.; Veeraraghavan, A. Nert: Implicit neural representations for general unsupervised turbulence mitigation. arXiv 2023, arXiv:2308.00622. [Google Scholar]
  30. Xu, S.; Cao, S.; Liu, H.; Xiao, X.; Chang, Y.; Yan, L. 1st Solution Places for CVPR 2023 UG2+ Challenge Track 2.2-Coded Target Restoration through Atmospheric Turbulence. arXiv 2023, arXiv:2306.09379. [Google Scholar]
  31. Zhang, S.; Rao, P.; Chen, X. Blind turbulent image deblurring through dual patch-wise pixels prior. Opt. Eng. 2023, 62, 033104. [Google Scholar] [CrossRef]
  32. Li, X.; Liu, X.; Wei, W.; Zhong, X.; Ma, H.; Chu, J. A DeturNet-Based Method for Recovering Images Degraded by Atmospheric Turbulence. Remote Sens. 2023, 15, 5071. [Google Scholar] [CrossRef]
  33. Jaiswal, A.; Zhang, X.; Chan, S.H.; Wang, Z. Physics-driven turbulence image restoration with stochastic refinement. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Paris, France, 2–6 October 2023; pp. 12170–12181. [Google Scholar]
  34. Mao, Z.; Jaiswal, A.; Wang, Z.; Chan, S.H. Single frame atmospheric turbulence mitigation: A benchmark study and a new physics-inspired transformer model. In Proceedings of the European Conference on Computer Vision, Tel Aviv, Israel, 23–27 October 2022; pp. 430–446. [Google Scholar]
  35. Mao, Z.; Chimitt, N.; Chan, S.H. Image reconstruction of static and dynamic scenes through anisoplanatic turbulence. IEEE Trans. Comput. Imaging 2020, 6, 1415–1428. [Google Scholar] [CrossRef]
  36. Gonzales, R.C.; Woods, R.E. Digital Image Processing; Pearson: New York, NY, USA, 2010. [Google Scholar]
  37. Huang, J.; Liu, Y.; Zhao, F.; Yan, K.; Zhang, J.; Huang, Y.; Zhou, M.; Xiong, Z. Deep fourier-based exposure correction network with spatial-frequency interaction. In Proceedings of the European Conference on Computer Vision, Tel Aviv, Israel, 23–27 October 2022; pp. 163–180. [Google Scholar]
  38. Mao, X.; Liu, Y.; Shen, W.; Li, Q.; Wang, Y. Deep residual fourier transformation for single image deblurring. arXiv 2021, arXiv:2111.11745. [Google Scholar]
  39. Li, C.; Guo, C.L.; Zhou, M.; Liang, Z.; Zhou, S.; Feng, R.; Loy, C.C. Embedding fourier for ultra-high-definition low-light image enhancement. arXiv 2023, arXiv:2302.11831. [Google Scholar]
  40. Guo, S.; Yong, H.; Zhang, X.; Ma, J.; Zhang, L. Spatial-frequency attention for image denoising. arXiv 2023, arXiv:2302.13598. [Google Scholar]
  41. He, X.; Yan, K.; Li, R.; Xie, C.; Zhang, J.; Zhou, M. Pyramid Dual Domain Injection Network for Pan-sharpening. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Paris, France, 2–6 October 2023; pp. 12908–12917. [Google Scholar]
  42. Lu, L.; Liu, T.; Jiang, F.; Han, B.; Zhao, P.; Wang, G. DFANet: Denoising Frequency Attention Network for Building Footprint Extraction in Very-High-Resolution Remote Sensing Images. Electronics 2023, 12, 4592. [Google Scholar] [CrossRef]
  43. Yang, K.; Hu, T.; Dai, K.; Chen, G.; Cao, Y.; Dong, W.; Wu, P.; Zhang, Y.; Yan, Q. CRNet: A Detail-Preserving Network for Unified Image Restoration and Enhancement Task. arXiv 2024, arXiv:2404.14132. [Google Scholar]
  44. Yuan, X.; Li, L.; Wang, J.; Yang, Z.; Lin, K.; Liu, Z.; Wang, L. Spatial-Frequency U-Net for Denoising Diffusion Probabilistic Models. arXiv 2023, arXiv:2307.14648. [Google Scholar]
  45. Zhou, T.; Ma, Z.; Wen, Q.; Wang, X.; Sun, L.; Jin, R. Fedformer: Frequency enhanced decomposed transformer for long-term series forecasting. In Proceedings of the International Conference on Machine Learning, PMLR, Baltimore, MD, USA, 17–23 July 2022; pp. 27268–27286. [Google Scholar]
  46. Patro, B.N.; Namboodiri, V.P.; Agneeswaran, V.S. SpectFormer: Frequency and Attention is what you need in a Vision Transformer. arXiv 2023, arXiv:2304.06446. [Google Scholar]
  47. Li, D.; Simske, S. Atmospheric turbulence degraded-image restoration by kurtosis minimization. IEEE Geosci. Remote Sens. Lett. 2009, 6, 244–247. [Google Scholar]
  48. Roggemann, M.C.; Welsh, B.M. Imaging through Turbulence; CRC Press: Boca Raton, FL, USA, 2018. [Google Scholar]
  49. Hendrycks, D.; Gimpel, K. Gaussian error linear units (gelus). arXiv 2016, arXiv:1606.08415. [Google Scholar]
  50. Mao, X.; Liu, Y.; Liu, F.; Li, Q.; Shen, W.; Wang, Y. Intriguing findings of frequency selection for image deblurring. In Proceedings of the AAAI Conference on Artificial Intelligence, Washington, DC, USA, 7–14 February 2023; Volume 37, pp. 1905–1913. [Google Scholar]
  51. Lu, L.; Shin, Y.; Su, Y.; Karniadakis, G.E. Dying relu and initialization: Theory and numerical examples. arXiv 2019, arXiv:1903.06733. [Google Scholar] [CrossRef]
  52. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar]
  53. Ronneberger, O.; Fischer, P.; Brox, T. U-net: Convolutional networks for biomedical image segmentation. In Proceedings of the Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015: 18th International Conference, Munich, Germany, 5–9 October 2015; Proceedings, Part III 18. Springer: Berlin, Germany, 2015; pp. 234–241. [Google Scholar]
  54. Fazlali, H.; Shirani, S.; Bradford, M.; Kirubarajan, T. Atmospheric turbulence removal in long-range imaging using a data-driven-based approach. Int. J. Comput. Vis. 2022, 130, 1031–1049. [Google Scholar] [CrossRef]
  55. Cui, Y.; Knoll, A. Dual-domain strip attention for image restoration. Neural Netw. 2024, 171, 429–439. [Google Scholar] [CrossRef] [PubMed]
  56. Cui, Y.; Ren, W.; Knoll, A. Omni-Kernel Network for Image Restoration. In Proceedings of the AAAI Conference on Artificial Intelligence, Vancouver, BC, Canada, 20–27 February 2024; Volume 38, pp. 1426–1434. [Google Scholar]
  57. Deng, J.; Dong, W.; Socher, R.; Li, L.J.; Li, K.; Fei-Fei, L. Imagenet: A large-scale hierarchical image database. In Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA, 20–25 June 2009; pp. 248–255. [Google Scholar]
  58. Mao, Z.; Chimitt, N.; Chan, S.H. Accelerating atmospheric turbulence simulation via learned phase-to-space transform. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 11–17 October 2021; pp. 14759–14768. [Google Scholar]
  59. Cui, Y.; Ren, W.; Cao, X.; Knoll, A. Image Restoration via Frequency Selection. IEEE Trans. Pattern Anal. Mach. Intell. 2023, 46, 1093–1108. [Google Scholar] [CrossRef] [PubMed]
  60. Zamir, S.W.; Arora, A.; Khan, S.; Hayat, M.; Khan, F.S.; Yang, M.H.; Shao, L. Learning enriched features for fast image restoration and enhancement. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 45, 1934–1948. [Google Scholar] [CrossRef]
  61. Chen, L.; Chu, X.; Zhang, X.; Sun, J. Simple baselines for image restoration. In Proceedings of the European Conference on Computer Vision, Tel Aviv, Israel, 23–27 October 2022; pp. 17–33. [Google Scholar]
  62. HKT Heat Haze. Available online: https://www.youtube.com/watch?v=oF3x1BsQir8/ (accessed on 18 June 2024).
  63. PENTAX PAIR II Fog&Heat Haze Reduction DEMO. Available online: https://www.youtube.com/watch?v=D-xNKZyKjFc/ (accessed on 18 June 2024).
  64. Mittal, A.; Soundararajan, R.; Bovik, A.C. Making a “completely blind” image quality analyzer. IEEE Signal Process. Lett. 2012, 20, 209–212. [Google Scholar] [CrossRef]
Figure 1. Schematic of the atmospheric turbulence imaging model. (a) The original image; (b) turbulence disturbance; (c) DSLR camera; (d) the degraded image captured by the camera.
Figure 2. The structure of the Res FFT-GeLU Block. FFT and IFFT denote the fast Fourier transform and its inverse operation, respectively.
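For illustration, a residual block that combines a spatial convolution path with an FFT-based frequency path, in the spirit of Figure 2, can be sketched in PyTorch as follows. This is a minimal sketch only: the layer widths, kernel sizes, and placement of the GeLU activations are assumptions for readability and do not reproduce the authors' exact Res FFT-GeLU Block.

```python
import torch
import torch.nn as nn

class ResFFTGeLUBlock(nn.Module):
    """Minimal sketch of a residual block with a parallel frequency branch.

    Not the published implementation; layer counts, kernel sizes, and
    normalization choices are illustrative assumptions.
    """
    def __init__(self, channels: int):
        super().__init__()
        # Spatial branch: two 3x3 convolutions with a GeLU in between.
        self.spatial = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.GELU(),
            nn.Conv2d(channels, channels, 3, padding=1),
        )
        # Frequency branch: 1x1 convolutions applied to the stacked
        # real/imaginary parts of the 2-D FFT of the feature map.
        self.freq = nn.Sequential(
            nn.Conv2d(2 * channels, 2 * channels, 1),
            nn.GELU(),
            nn.Conv2d(2 * channels, 2 * channels, 1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        _, _, h, w = x.shape
        # FFT: move features to the frequency domain (real FFT over H, W).
        spec = torch.fft.rfft2(x, norm="backward")
        spec = torch.cat([spec.real, spec.imag], dim=1)
        spec = self.freq(spec)
        real, imag = torch.chunk(spec, 2, dim=1)
        # IFFT: return to the spatial domain at the original resolution.
        freq_out = torch.fft.irfft2(torch.complex(real, imag), s=(h, w), norm="backward")
        # Residual connection fuses the input, spatial, and frequency paths.
        return x + self.spatial(x) + freq_out
```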
Figure 3. (a) GeLU Function and (b) GeLU Derivative.
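For reference, the GeLU activation shown in Figure 3 and its derivative have the closed forms below, where Φ and φ denote the standard normal cumulative distribution function and density [49]; the tanh expression is the commonly used approximation.

```latex
\mathrm{GeLU}(x) = x\,\Phi(x),
\qquad
\frac{\mathrm{d}}{\mathrm{d}x}\,\mathrm{GeLU}(x) = \Phi(x) + x\,\phi(x),
\qquad
\phi(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^{2}/2},
%
\qquad
\mathrm{GeLU}(x) \approx \tfrac{1}{2}\, x \left( 1 + \tanh\!\left[ \sqrt{2/\pi}\,\bigl(x + 0.044715\, x^{3}\bigr) \right] \right).
```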
Figure 4. Overall architecture of the proposed DDRTNet, which is composed of a three-scale encoder-decoder structure.
Figure 5. (a) The structure of SCM; (b) the structure of ResGroup; (c) the structure of DSAM; (d) the structure of OKM.
Figure 6. The first row shows real background images from the ImageNet dataset; the second row shows the corresponding degraded images produced by applying the simulation algorithm to the background images.
Figure 7. Loss curves of DDRTNet during the training process.
Figure 8. Restoration effect comparison of images in the test sets: (a) original image; (b) degraded images; (c) NAFNet; (d) MIRNet; (e) FSNet; (f) DDRTNet. The numbers in the images are the PSNR and SSIM. The best results are shown in red.
Figure 9. Robustness test results under different noise intensities. The first row of each set shows the degraded image with added noise, and the second row shows the recovered results for comparison. The last four columns represent images with different noise levels. The numbers in the images are PSNR and SSIM.
Figure 10. Recovery results under different turbulence intensities. The first three columns show the ground truth, degraded image, and recovery result at turbulence intensity D0/r0 = 0.2. The last three columns show the ground truth, degraded image, and recovery result at turbulence intensity D0/r0 = 1.3. The numbers in the images are PSNR and SSIM.
Figure 11. Recovery results in real turbulence scenarios. (a) Degraded images; (b) NAFNet; (c) MIRNet; (d) FSNet; (e) DDRTNet. The numbers in the images are the entropy. The best results are shown in red.
Table 1. Objective assessment of the above restoration methods.
             Degraded   MIRNet    NAFNet    FSNet     DDRTNet
PSNR ↑       24.54      25.26     24.98     25.23     26.28
SSIM ↑       0.7149     0.7654    0.7610    0.7723    0.8061
NRMSE ↓      0.1297     0.1240    0.1281    0.1207    0.1118
PSNR_td ↓    12.0471    13.2617   13.1166   13.5189   14.7575
SSIM_td ↓    0.01479    0.01141   0.01136   0.01031   0.00853
The best results are bolded. ↑ indicates that higher scores correspond to better image quality; ↓ indicates the opposite.
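As a reproducibility aid, the standard full-reference metrics reported in Table 1 (PSNR, SSIM, and NRMSE) can be computed with scikit-image as in the sketch below. The data range and channel handling are assumptions for 8-bit RGB images; PSNR_td and SSIM_td follow the definitions given in the main text and are not reproduced here.

```python
import numpy as np
from skimage.metrics import (
    peak_signal_noise_ratio,
    structural_similarity,
    normalized_root_mse,
)

def full_reference_scores(gt: np.ndarray, restored: np.ndarray) -> dict:
    """Full-reference metrics for an 8-bit RGB image pair (HxWx3 uint8).

    The data_range/channel_axis settings assume this input format.
    """
    return {
        "PSNR": peak_signal_noise_ratio(gt, restored, data_range=255),
        "SSIM": structural_similarity(gt, restored, data_range=255, channel_axis=-1),
        "NRMSE": normalized_root_mse(gt, restored),
    }
```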
Table 2. Comparison of the average time, FLOPs, and Params consumed by methods to recover images.
Index/Unit   MIRNet    NAFNet    FSNet    DDRTNet
FLOPs/G      19.88     140.75    59.71    34.63
Params/MB    29.2      5.8       7.0      7.0
Time/ms      123.801   120.806   92.851   62.902
The best results are bolded.
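The parameter counts and average inference times in Table 2 can be measured with a short PyTorch routine such as the sketch below; the input resolution, warm-up schedule, and number of timed runs are assumptions rather than the authors' benchmarking protocol, and FLOPs would additionally require a profiler such as thop or ptflops.

```python
import time
import torch

def count_params_m(model: torch.nn.Module) -> float:
    """Trainable parameters, reported in millions."""
    return sum(p.numel() for p in model.parameters() if p.requires_grad) / 1e6

@torch.no_grad()
def average_inference_ms(model, input_shape=(1, 3, 256, 256), runs=50, device="cuda"):
    """Average forward-pass latency in milliseconds (assumed input size and run count)."""
    model = model.to(device).eval()
    x = torch.randn(*input_shape, device=device)
    for _ in range(10):                      # warm-up iterations
        model(x)
    if device == "cuda":
        torch.cuda.synchronize()             # wait for queued GPU work before timing
    start = time.perf_counter()
    for _ in range(runs):
        model(x)
    if device == "cuda":
        torch.cuda.synchronize()
    return (time.perf_counter() - start) / runs * 1e3
```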
Table 3. The results of ablation experiments.
Method      ReLU   RFGB   DSAM   OKM   PSNR    SSIM
                                       25.92   0.7981
                                       25.97   0.7964
DDRTNet                                25.87   0.7942
                                       25.34   0.7648
                                       26.28   0.8061
The best results are bolded.
Table 4. Comparison of the noise test sets.
                          Var = 0   Var = 0.09   Var = 0.16   Var = 0.25   Var = 0.36
Noise test sets    PSNR   24.5861   24.5830      24.5794      24.5758      24.5721
                   SSIM   0.7131    0.7125       0.7117       0.7109       0.7101
Recovery results   PSNR   26.2832   26.1369      25.9011      25.7171      25.5145
                   SSIM   0.8061    0.8018       0.7942       0.7888       0.7817
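The noisy test sets in Table 4 can be approximated by adding zero-mean noise of the listed variance to the degraded images; the sketch below assumes additive Gaussian noise on images normalized to [0, 1], which is an assumption about the protocol rather than a statement of it.

```python
import numpy as np

def add_gaussian_noise(image: np.ndarray, variance: float, seed: int = 0) -> np.ndarray:
    """Add zero-mean Gaussian noise with the given variance to an image in [0, 1]."""
    rng = np.random.default_rng(seed)
    noisy = image + rng.normal(0.0, np.sqrt(variance), size=image.shape)
    return np.clip(noisy, 0.0, 1.0)

# Example: build the noisy variants used in the robustness test.
# variances = [0.09, 0.16, 0.25, 0.36]
```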
Table 5. Comparison of the different turbulence intensity sets.
        D0/r0 = 0.2          D0/r0 = 1.0          D0/r0 = 1.3
        Degraded   Ours      Degraded   Ours      Degraded   Ours
PSNR    27.48      31.70     24.54      26.28     23.74      24.48
SSIM    0.8332     0.9366    0.7149     0.8061    0.6739     0.7405
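For context, D0/r0 in Table 5 and Figure 10 is the conventional turbulence-strength parameter, i.e., the ratio of the aperture diameter D0 to the Fried parameter r0; for plane-wave propagation the Fried parameter follows the standard expression below [48], so larger D0/r0 corresponds to stronger turbulence.

```latex
% Fried parameter for plane-wave propagation over path length L,
% with optical wavenumber k = 2\pi/\lambda and structure constant C_n^2(z).
r_0 = \left[ 0.423\, k^{2} \int_{0}^{L} C_n^{2}(z)\, \mathrm{d}z \right]^{-3/5}
```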
Table 6. Objective assessment of the results of different methods in real scenarios.
            Degraded   NAFNet    MIRNet    FSNet     DDRTNet
Entropy ↑   6.0399     6.1034    6.1057    6.1139    6.1284
AG ↑        1.2478     1.7810    1.7149    1.7984    2.1132
NIQE ↓      9.1128     7.5898    7.7245    7.6107    7.0817
LG ↑        12.0088    45.0503   35.2508   30.7535   50.7452
The best results are bolded. ↑ indicates that higher scores correspond to better image quality; ↓ indicates the opposite.
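Among the no-reference scores in Table 6, entropy and average gradient (AG) have simple closed forms and can be computed as sketched below; NIQE relies on a pretrained natural-scene-statistics model [64] and is not re-implemented here, and LG follows the definition given in the main text. The grayscale input and the particular AG normalization are assumptions.

```python
import numpy as np

def image_entropy(gray: np.ndarray) -> float:
    """Shannon entropy (bits) of an 8-bit grayscale image histogram."""
    hist, _ = np.histogram(gray, bins=256, range=(0, 256))
    p = hist / hist.sum()
    p = p[p > 0]                              # drop empty bins before the log
    return float(-np.sum(p * np.log2(p)))

def average_gradient(gray: np.ndarray) -> float:
    """Average gradient: mean RMS of horizontal and vertical differences."""
    g = gray.astype(np.float64)
    dx = np.diff(g, axis=1)[:-1, :]           # crop so dx and dy share a common shape
    dy = np.diff(g, axis=0)[:, :-1]
    return float(np.mean(np.sqrt((dx ** 2 + dy ** 2) / 2.0)))
```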
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
