Project background
Hyperspectral images, with the rich spectral information provided by their high spectral resolution (hundreds of spectral bands), excel in many practical applications such as image classification, anomaly detection, change detection, and quantitative agriculture. However, limited by the signal-to-noise ratio and the satellite revisit cycle, the spatial resolution of hyperspectral images is very low. For example, MODIS imagery has a spatial resolution of 500 m; this severe loss of spatial detail greatly limits the scope and accuracy of its applications.
Hyperspectral-panchromatic fusion refers to fusing a hyperspectral image (high spectral, low spatial resolution) with a single-band panchromatic image (high spatial resolution) to obtain an image with both high spectral and high spatial resolution. It is an effective way to improve the spatial resolution of hyperspectral data.
However, existing fusion methods mainly target small-ratio fusion tasks (i.e., a small spatial-resolution ratio between the hyperspectral and panchromatic images, e.g., 4), which limits their practical applicability. For example, the hyperspectral camera on the ZY-1 02D (Ziyuan-1 02D) satellite provides 166 bands at 30 m spatial resolution, while its panchromatic image has 2.5 m resolution, so the ratio of the two resolutions reaches 12. At present there are no good research results for this kind of large-ratio fusion.
Therefore, this project, built on the PaddlePaddle framework, focuses on the large-ratio fusion task for the first time (the ratio is 16). To address the ill-posedness of the fusion problem (i.e., predicting the reflectance of a multi-band hyperspectral image from a single-band panchromatic image), this paper proposes a fusion network operating in the projected hyperspectral abundance space. In addition, based on the linear and nonlinear relationships between panchromatic intensity and abundance features, a detail-injection sub-network is designed to inject panchromatic details into the low-dimensional abundance space instead of the original high-dimensional space, which effectively alleviates the ill-posedness of the fusion problem and improves the quality of the fused image.
Technical scheme
Based on the linear and nonlinear relationships between panchromatic features and abundance features, a panchromatic detail injection sub-network (PDIN) is constructed to inject panchromatic details.
1. Linear relationship
The linear relationship between panchromatic intensity and abundance follows from the linear mixing model; a brief sketch is given below. Panchromatic intensity is a linear combination of the abundances, just as it is a linear combination of the hyperspectral bands. Therefore, injecting panchromatic features into the abundance space is equivalent to injecting them into the hyperspectral space.
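Since the derivation itself appears as a figure in the source, here is a brief sketch of it under the standard linear mixing model (the notation is ours):

\[
\mathbf{h} = \mathbf{E}\,\mathbf{a}, \qquad
p = \frac{\mathbf{s}^{\top}\mathbf{h}}{\mathbf{s}^{\top}\mathbf{1}}
  = \frac{\mathbf{s}^{\top}\mathbf{E}}{\mathbf{s}^{\top}\mathbf{1}}\,\mathbf{a}
  = \mathbf{c}^{\top}\mathbf{a},
\]

where \(\mathbf{h}\) is a hyperspectral pixel, \(\mathbf{E}\) the endmember matrix, \(\mathbf{a}\) the abundance vector, \(\mathbf{s}\) the panchromatic spectral response function, and \(p\) the panchromatic intensity. Once the endmembers are fixed, \(\mathbf{c}^{\top}\) is fixed, so the panchromatic intensity is linear in the abundances.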
2. Nonlinear relationship
Figure 1: simulated scatter distribution between abundance features and panchromatic intensity
The nonlinear relationship is obtained through a simulation experiment. First, spectral curves are selected from the USGS spectral library as pure endmembers, and abundance pixel data are generated randomly; the standard deviation computed from each abundance pixel forms the horizontal axis of Figure 1. The random abundances multiplied by the endmembers give the constructed hyperspectral data, which is then multiplied by the spectral response function of WorldView-2 to generate the panchromatic data. Figure 1 is the scatter plot of abundance standard deviation versus panchromatic intensity. Because the shape of this distribution closely resembles a fish, we call it the 'fish distribution' in this paper.
Run the following code to generate the distribution diagram in Figure 1.
import numpy as np
import matplotlib.pyplot as plt

srf_pan = np.load('/home/aistudio/data/data124050/srf_pan.npy')  # spectral response function

filename = '/home/aistudio/work/file/envi_plot3.txt'  # pure spectra of the selected materials
curve_data = np.loadtxt(filename, skiprows=13)[:, 1:]

pixel_num = 5000
abun = np.random.rand(pixel_num, curve_data.shape[1])
abun_sum = np.sum(abun, axis=1)
abun = abun / np.expand_dims(abun_sum, axis=1)  # enforce sum-to-one abundances

# panchromatic intensity: abundances -> hyperspectral (via endmembers) -> PAN (via SRF)
intensity = np.dot(srf_pan, np.dot(curve_data[:, :], np.transpose(abun))[:srf_pan.shape[0], :]) / np.sum(srf_pan)
x_var = np.std(abun, axis=1)  # per-pixel standard deviation of the endmember abundances

plt.scatter(x_var, intensity, color='b', s=10)
plt.xlabel('Abundance STD')
plt.ylabel('PAN Intensity')
plt.show()
3. Detail injection sub-network (PDIN)
The distribution shown in Figure 1 is the correct one, but during forward propagation the data distribution produced by some modules, such as the upsampling module, is incorrect. Our idea is to inject panchromatic details indirectly by adjusting the data distribution. The envisaged adjustment process is as follows:
Figure 2: process of hypothetical data distribution transformation
As shown in Figure 2, suppose the green dot is an incorrect point (its pixel abundance contains errors) and the blue dots are correct points that follow the 'fish distribution'. We want to move the green dot into the blue region so that it matches the 'fish distribution'. Since the vertical axis is the corresponding panchromatic intensity, which is observed data, it can serve as supervision or a constraint: while the green dot moves, the 'panchromatic intensity' decoded from its abundance pixel must not deviate from the real panchromatic intensity.
Moving a green dot can be accomplished by changing the standard deviation of its abundance pixel, i.e., by multiplying the pixel by a weight. As Figure 2 shows, different green dots lie at different distances from the red line (the green-blue boundary), so the distances they must move also differ; the weight applied to each abundance pixel should therefore adapt to each green dot. Owing to limited space, please see the source paper for the detailed derivation.
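Concretely (a one-line illustration of ours, not taken from the paper): for an abundance vector \(\mathbf{a}\) and scalars \(w\) and \(c\),

\[
\operatorname{std}(w\,\mathbf{a} + c) = |w|\,\operatorname{std}(\mathbf{a}),
\]

so a per-pixel multiplicative weight moves a point horizontally in Figure 2 by changing its standard deviation, while an additive term only shifts the mean. This matches the roles of the multiplicative and additive weights in the PDIN code below.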
Figure 3: detail injection sub network (PDIN)
The following is the PDIN class built with PaddlePaddle:
# A fusion sub-network based on the linear and nonlinear relations
class pan_abunstd(nn.Layer):
    def __init__(self, endmember_num=30):
        super(pan_abunstd, self).__init__()
        # simple_net_res is a small convolutional block defined elsewhere in the project
        self.multiply_weight = simple_net_res(1, endmember_num, kernelsize=3)  # multiplicative weight, changes the variance
        self.plus_weight = simple_net_res(1, endmember_num, kernelsize=3)      # additive weight, changes the mean

    def forward(self, w1, w2, b1, b2, pan, abun, dim_band=1):
        # weights marking the positions (pixels) that should be updated
        update_weight = F.sigmoid(F.relu(pan - w1 * paddle.std(abun, axis=1, keepdim=True) - b1))
        update_weight2 = F.sigmoid(F.relu(w2 * paddle.std(abun, axis=1, keepdim=True) + b2 - pan))
        update_weight = update_weight + update_weight2
        result0 = self.multiply_weight(pan) * update_weight * abun + self.plus_weight(pan) * update_weight
        return result0 + abun
Here, w1, w2, b1, and b2 in the forward function are defined in the main network class via paddle's create_parameter() function.
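For illustration only, the declaration might look like the following minimal sketch (the class name, the scalar shape [1], and the constant initial values are our assumptions):

import paddle
import paddle.nn as nn

class main_net_stub(nn.Layer):  # hypothetical container, not the paper's actual main class
    def __init__(self):
        super(main_net_stub, self).__init__()
        # create_parameter registers a trainable tensor with the layer
        self.w1 = self.create_parameter(shape=[1], dtype='float32',
                                        default_initializer=nn.initializer.Constant(1.0))
        self.b1 = self.create_parameter(shape=[1], dtype='float32',
                                        default_initializer=nn.initializer.Constant(0.0))
        self.w2 = self.create_parameter(shape=[1], dtype='float32',
                                        default_initializer=nn.initializer.Constant(1.0))
        self.b2 = self.create_parameter(shape=[1], dtype='float32',
                                        default_initializer=nn.initializer.Constant(0.0))

These scalars would then be passed to pan_abunstd.forward as w1, w2, b1, b2.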
4. Overall network framework
Figure 4: fusion network framework PGNet
The fusion network proposed in this paper is divided into four parts, as shown in Figure 4.
Part 1 is the encoding part: the input hyperspectral image is projected into the low-dimensional abundance space, using a Conv+BN+LReLU network structure for the projection (a sketch follows below).
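As a minimal sketch of such a Conv+BN+LReLU projection block (the class name, channel widths, kernel size, and negative slope are our assumptions, not values from the paper):

import paddle.nn as nn

class encode_block(nn.Layer):  # hypothetical name
    def __init__(self, band, endmember_num=30):
        super(encode_block, self).__init__()
        self.body = nn.Sequential(
            nn.Conv2D(band, endmember_num, kernel_size=3, padding=1),
            nn.BatchNorm2D(endmember_num),
            nn.LeakyReLU(0.2),  # slope assumed
        )

    def forward(self, x):
        return self.body(x)  # projects the B-band image into the K-dimensional abundance space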
Part 2 is the upsampling part. Because this paper focuses on the large-ratio fusion task, two-step upsampling is adopted to raise the image resolution gradually. The green block is the upsampling block, consisting of bicubic interpolation and a single convolution (a sketch follows below). The purple 'double net' block is a two-convolution operation used to match the abundance resolution and extract features.
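A minimal sketch of one upsampling block, assuming bicubic interpolation followed by a single convolution as described (the class name and per-block scale are assumptions; two scale-4 blocks would realize the overall ratio of 16):

import paddle.nn as nn
import paddle.nn.functional as F

class up_block(nn.Layer):  # hypothetical name
    def __init__(self, channels, scale=4):
        super(up_block, self).__init__()
        self.scale = scale
        self.conv = nn.Conv2D(channels, channels, kernel_size=3, padding=1)

    def forward(self, x):
        x = F.interpolate(x, scale_factor=self.scale, mode='bicubic')  # bicubic upsampling
        return self.conv(x)  # single convolution to refine the interpolated features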
Figure 5: SAB block
Part 3 is the attention enhancement part, which adopts a cascaded pixel-level attention mechanism to further enhance the features, as shown in Figure 5. A PDIN is embedded in each SAB block to further inject panchromatic spatial details; a hedged sketch of the pixel-level attention idea follows.
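Purely to illustrate the pixel-level attention idea (the actual SAB structure is given in Figure 5 and the paper; this block is our assumption):

import paddle.nn as nn

class pixel_attention(nn.Layer):  # hypothetical block, not the paper's SAB definition
    def __init__(self, channels):
        super(pixel_attention, self).__init__()
        self.att = nn.Sequential(
            nn.Conv2D(channels, channels, kernel_size=1),  # per-pixel, per-channel weight map
            nn.Sigmoid(),
        )

    def forward(self, x):
        return x * self.att(x)  # reweight every pixel of the feature map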
Part 4 is the decoding part: the abundance features are restored to a hyperspectral image. It is worth noting that the convolution parameters in this part can be regarded as the endmembers in the linear mixing model (a sketch follows below).
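Following this interpretation, the decoder can be sketched as a 1x1 convolution whose kernel acts as the endmember matrix of the linear mixing model (class and attribute names are our assumptions):

import paddle.nn as nn

class decode_block(nn.Layer):  # hypothetical name
    def __init__(self, endmember_num, band):
        super(decode_block, self).__init__()
        # kernel shape [band, endmember_num, 1, 1]: each output band is a linear
        # combination of the abundance maps, so the weights play the role of endmember spectra
        self.endmember_conv = nn.Conv2D(endmember_num, band, kernel_size=1, bias_attr=False)

    def forward(self, abun):
        return self.endmember_conv(abun)  # abundance maps -> hyperspectral image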
Experimental results
This paper verifies the fusion performance of the proposed PGNet on four datasets: JiaXing, Chikusei, XiongAn, and a real dataset. In addition, to verify the generalization of the network to small-ratio fusion tasks, we conducted experiments with ratios 4 and 8. Ablation experiments were also designed to verify the necessity of each network module and the correctness of the distribution-change law, i.e., that injecting spatial details by changing the data distribution in PDIN is not just an assumption. The experimental results confirm the hypothesis proposed by the simulation experiment.
Data generation code. Step 1 generates the low-resolution hyperspectral and panchromatic data; Step 2 crops them into training patches:
# numpy (np) and paddle are assumed imported; Downsampler and Crop_traindata
# are helper utilities defined elsewhere in the project.
def generate_data(path, path_srf):  # Step 1
    ratio_hs = 16
    # spectral response function of WorldView-2
    srf_pan = np.expand_dims(np.expand_dims(np.load(path_srf), axis=-1), axis=-1)

    noise_mean = 0.0
    noise_var = 0.0001
    downsample16 = Downsampler(ratio_hs)  # Gaussian-kernel downsampling
    original_msi = np.float32(np.load(path))
    band, row, col = original_msi.shape
    original_msi = original_msi[:, :(row - row % (ratio_hs * 4)), :(col - col % (ratio_hs * 4))]

    # ratio 16
    temp_blur = downsample16(paddle.to_tensor(np.expand_dims(original_msi, axis=0), dtype='float32'))
    print(temp_blur.shape)
    temp_blur = np.squeeze(temp_blur.numpy())
    _, rows, cols = temp_blur.shape
    blur_data = []
    for i3 in range(temp_blur.shape[0]):  # add noise band by band
        blur_data.append(np.expand_dims(temp_blur[i3, :, :] +
                         np.random.normal(noise_mean, noise_var ** 0.5, [rows, cols]), axis=0))
    blur_data = np.concatenate(blur_data, axis=0)
    print('blur_data.shape:' + str(blur_data.shape))

    # simulated pan image
    temp_pan = np.expand_dims(np.sum(original_msi * srf_pan, axis=0) / np.sum(srf_pan), axis=0)
    print('temp_pan.shape:' + str(temp_pan.shape))

    # return the noisy low-resolution bands (the original snippet returned temp_blur,
    # which would have discarded the added noise)
    return original_msi, blur_data, temp_pan

def crop_data(hrhs, lrhs, pan, ratio=16, train_ratio=0.8):  # Step 2
    training_size = 64    # training patch size
    testing_size = 256    # testing patch size
    idx = int(lrhs.shape[2] * train_ratio)  # column index at which to split

    '''Generate training and test data'''
    train_hs_image, train_hrpan_image, train_label = \
        Crop_traindata(lrhs[:, :, :idx],
                       pan[:, :, :idx * ratio],
                       hrhs[:, :, :idx * ratio],
                       ratio=ratio, size=training_size,
                       test=False)
    test_hs_image, test_hrpan_image, test_label = \
        Crop_traindata(lrhs[:, :, idx:],
                       pan[:, :, idx * ratio:],
                       hrhs[:, :, idx * ratio:],
                       ratio=ratio, size=testing_size,
                       test=True)
    return train_hs_image, train_hrpan_image, train_label, test_hs_image, test_hrpan_image, test_label
Training code:
# time, paddle, paddle.nn as nn and paddle.optimizer as optim are assumed imported;
# Mydata, SAMLoss, Pg_net, opt and ratio come from the project code.

# Define data and models
dataset0 = Mydata(train_hs_image, train_hrpan_image, train_label)  # training data
train_loader = paddle.io.DataLoader(dataset0, num_workers=0, batch_size=opt.batch_size,
                                    shuffle=True, drop_last=True)

dataset1 = Mydata(test_hs_image, test_hrpan_image, test_label)  # test data
test_loader = paddle.io.DataLoader(dataset1, num_workers=0, batch_size=opt.test_batch_size,
                                   shuffle=False, drop_last=False)

model = Pg_net(band=opt.in_nc, endmember_num=opt.endmember, ratio=ratio)

L2_loss = nn.loss.MSELoss()  # loss functions
samloss = SAMLoss()

scheduler = optim.lr.StepDecay(opt.lr, opt.step, gamma=opt.momentum, verbose=False)  # learning-rate schedule
optimizer = optim.Adam(learning_rate=scheduler, parameters=model.parameters())
for epoch in range(opt.num_epochs):
    time0 = time.time()
    loss_total = 0.0

    scheduler.step(epoch)
    model.train()
    for i, (images_hs, images_pan, labels) in enumerate(train_loader()):
        result = model(images_hs, images_pan)

        loss_l2 = L2_loss(result, labels)
        loss_sam = samloss(result, labels)
        loss = loss_l2 + 0.01 * loss_sam

        loss.backward()
        optimizer.step()
        optimizer.clear_grad()
        loss_total += float(loss)

    if ((epoch + 1) % 10) == 0:
        print('epoch %d of %d, using time: %.2f , loss of train: %.4f' %
              (epoch + 1, opt.num_epochs, time.time() - time0, loss_total))

# Test output:
model.eval()
image_all = []
with paddle.no_grad():
    for (images_hs, images_pan, _) in test_loader:
        outputs_temp = model(images_hs, images_pan)
        image_all.append(outputs_temp)
fused_image = paddle.concat(image_all, axis=0)  # all fused test patches (returned by the enclosing function in the original code)
JiaXing experimental results:
Figure 6: fusion results of each method on the JiaXing dataset. The first column is the ground-truth image; the subsequent columns are the results of the GSA, SFIM, Wavelet, MTF_GLP_HPM, CNMF, HyperPNN2, HSpeNet2, and HyperKite methods (see the paper for method details), with the proposed method in the last column. Odd rows show the fused hyperspectral images; even rows show the difference from the ground truth.
Table 1: quantitative experimental results on the JiaXing dataset.
From the above experimental results, it can be seen that, compared with the other methods, ours shows better spectral and spatial fidelity. The quantitative results across multiple indices also show that our method achieves the best performance. Due to space limitations, the results on the other three datasets are not shown here, but our method achieved the best results on them as well.
Verification of data distribution transformation hypothesis
To verify the effect of the proposed PDIN on the data distribution transformation, we analyzed the intermediate results for a test image; the results are as follows:
Figure 7: evolution of the intermediate results. Each row corresponds to a different dataset. Taking the JiaXing data in the first row as an example, the first three scatter plots (horizontal axis: abundance standard deviation; vertical axis: panchromatic intensity) show the intermediate results at the end of the second upsampling block, the second PDIN, and Part 3, respectively. The fourth to sixth images are the hyperspectral images decoded from these intermediate results, and the last one is the ground-truth image.
As shown in Figure 7, from the first column to the second, i.e., after one PDIN detail injection, the data distribution starts to resemble that of the simulated data (the 'fish distribution'), and the quality of the corresponding hyperspectral image also improves (column 5). After the attention stage with PDIN, i.e., the third column, the data distribution becomes nearly identical to the simulated one, and the corresponding hyperspectral image is very close to the real one (column 6). The transformation of the intermediate results shown in the figure verifies that the proposed PDIN injects details and improves image quality by correcting the data distribution, and also confirms the validity of the 'fish distribution'.
Conclusion
This paper focuses on the large-ratio hyperspectral-panchromatic fusion task and constructs a stepwise-upsampling fusion network in the projected abundance subspace using the PaddlePaddle framework. Exploiting the linear and nonlinear relationships between abundance and panchromatic features, a panchromatic detail injection sub-network (PDIN) is built, which effectively injects panchromatic details into the abundance features and alleviates the ill-posedness of the fusion problem. The effectiveness of the proposed network is verified qualitatively and quantitatively, and the visualization of intermediate results further shows that PDIN injects panchromatic details and improves image quality by correcting the data distribution. The lightweight network design (only 49,100 parameters) and the experimental results bring the practical application of hyperspectral-panchromatic fusion a step closer.