Histogram Equalization on Sentinel-2 L1C Satellite Images
For those wrestling with satellite imagery, clouds can be a real pain. They overexpose scenes, making it tough to get a clear picture of both the atmospheric and terrestrial elements. No worries, though! Today we’re tackling this issue with a technique called histogram equalization on Sentinel-2 L1C images. 🌦️
Why Histogram Equalization?
Imagine trying to look at a bright cloud and a shaded forest in the same image. Standard visualization will wash out one or the other. Histogram equalization adjusts the range of brightness values in your image, giving both clouds and land their moment in the sun, or well, your screen.
The Code Explained
Let’s jump into our Python code. We’re using the libraries numpy, rasterio, pathlib and tqdm.
import numpy as np
import rasterio as rio
from pathlib import Path
from tqdm.auto import tqdm
Paths and Bands
We first define the paths to our input and output directories. We also specify the RGB bands we’re interested in (4, 3, 2).
s2_l1c_dir = Path('/home/nick/Downloads/test2/in/S2A_MSIL1C...')
output_dir = Path('/home/nick/Downloads/test2/out')
required_bands = [4,3,2]
Histogram Functions
We have two core functions: get_histogram
to calculate the histogram and histogram_equalization
to apply equalization.
def get_histogram(array, bins, data_pct=1):
# Calculate the subset size based on the data_pct parameter
subset_size = int(len(array) * data_pct / 100)
# Randomly choose a subset from the original array
array = np.random.choice(array, subset_size)
histogram = np.zeros(bins)
# Count occurrences of each value in the array and populate the histogram
for value in array:
histogram[value] += 1
return histogram
def histogram_equalization(array):
# Flatten the array to a 1D array
flat = array.flatten()
# Get the histogram of the flattened array
hist = get_histogram(flat, 65536)
# Calculate the cumulative sum of the histogram
cs = np.cumsum(hist)
# Normalize the cumulative sum
cs = ((cs - cs.min()) * 65535 / (cs.max() - cs.min())).astype('uint16')
# Reshape the normalized values back to the original array shape
return np.reshape(cs[flat], array.shape)
Looping Through Bands
Then we loop through each band, applying histogram equalization.
rgb_stack = []
for band in tqdm(required_bands):
# find band path
band_path = list(s2_l1c_dir.rglob(f'*GRANULE/*/IMG_DATA/*_B0{band}.jp2'))[0]
# open array
src = rio.open(band_path)
array = src.read(1)
# equalize array
array = histogram_equalization(array)
# convert to uint8
array = (array / 256).astype('uint8')
rgb_stack.append(array)
Why uint8 and JPEG Compression?
We convert the equalized array to uint8
to save space. It still maintains good enough quality for visualization. We then use JPEG compression in our GeoTIFF to save even more space.
# stack bands into 3D array
rgb_stack = np.array(rgb_stack)
# copy and update band metadata
meta = src.meta.copy()
meta.update({
'count': rgb_stack.shape[0],
'dtype': rgb_stack.dtype,
'compress': 'JPEG',
'driver': 'GTiff'
})
Finale: Writing the Image
Finally, we write the image into a GeoTIFF file.
# export equalized compressed RGB image
with rio.open(output_dir / f'{s2_l1c_dir.stem}_RGB.tif', 'w', **meta) as dst:
dst.write(rgb_stack)
Before and After
Before Equalization
After Equalization
Notice how you can see both clouds and land features clearly in the equalized image. So the next time clouds crash your image party, you know how to make them behave!
That’s all folks, happy equalizing! 🌈🛰️
Acknowledgements
Some of the ideas and techniques discussed in this blog post were inspired by Tory Walker’s GitHub repository on histogram equalization, which you can find here.