Mapping the Coverage Extent of Sentinel-2

The Sentinel-2 satellite constellation plays a pivotal role in remote sensing and data science, especially in monitoring terrestrial environments. However, its data coverage over oceanic regions is inconsistent, particularly at the edges of its coverage. This inconsistency poses difficulties for marine and coastal research, most notably in time series analysis. To mitigate this issue, we’ve developed a Python tool that uses the Microsoft Planetary Computer API. The tool calculates and visualises the global capture extent of Sentinel-2 by processing and merging extent polygons into a global raster. The resulting raster gives a pixel-wise count of observation frequency, helping researchers and analysts work around irregular data coverage.
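
To give a feel for the "polygons into a frequency raster" idea, here is a rough sketch rather than the tool itself. It assumes each capture extent has been simplified to a (west, south, east, north) box in degrees, and the function name `coverage_counts` is made up for illustration. It ignores boxes that cross the antimeridian.

```python
def coverage_counts(boxes, cell_size=1.0):
    """Count, per grid cell, how many boxes cover the cell centre.

    `boxes` is an iterable of (west, south, east, north) in degrees.
    Row 0 is the northernmost row, column 0 the westernmost column.
    """
    n_cols = round(360 / cell_size)
    n_rows = round(180 / cell_size)
    grid = [[0] * n_cols for _ in range(n_rows)]

    for west, south, east, north in boxes:
        for row in range(n_rows):
            lat = 90 - (row + 0.5) * cell_size  # latitude of the cell centre
            if not south <= lat <= north:
                continue
            for col in range(n_cols):
                lon = -180 + (col + 0.5) * cell_size  # longitude of the cell centre
                if west <= lon <= east:
                    grid[row][col] += 1
    return grid
```

Each cell ends up holding the number of extents that cover it, which is the observation frequency the raster is meant to show.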

Histogram Equalization on Sentinel-2 L1C Satellite Images

For those wrestling with satellite imagery, clouds can be a real pain. They overexpose scenes, making it tough to get a clear picture of both the atmospheric and terrestrial elements. No worries, though! Today we’re tackling this issue with a technique called histogram equalization on Sentinel-2 L1C images. 🌦️
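
If you have not met histogram equalization before, here is a minimal sketch of the idea in plain Python. It assumes the band has already been scaled to integers in the range 0 to 255 and flattened into a list; the function name is just for illustration, and a real workflow would more likely use an array library.

```python
def equalize_histogram(pixels, levels=256):
    """Histogram-equalise a flat list of integer pixel values in [0, levels)."""
    # Count how often each intensity occurs.
    hist = [0] * levels
    for value in pixels:
        hist[value] += 1

    # Cumulative distribution function (CDF) of the intensities.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)

    # Smallest non-zero CDF value, so the darkest occupied level maps to 0.
    cdf_min = next(c for c in cdf if c > 0)
    total = len(pixels)
    if total == cdf_min:
        # Constant image: there is nothing to stretch.
        return list(pixels)

    # Map every level onto the full output range via the normalised CDF.
    lookup = [
        max(0, round((c - cdf_min) / (total - cdf_min) * (levels - 1)))
        for c in cdf
    ]
    return [lookup[value] for value in pixels]
```

Values that were bunched together in a narrow range get spread across the whole output range, which is what pulls detail back out of washed-out scenes.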

Pathlib.Path - The Path to Enlightenment (and Better File Management)

Greetings, fellow code warriors! Today, I come to you with an embarrassing confession: I’ve been doing it all wrong. And by “it,” I mean using strings for file paths like someone who’s never tasted the sweet, sweet nectar of object-oriented goodness. 🙈 But fear not, for I have seen the light, and that light is called pathlib.Path. So buckle up, my friends, as I guide you through this wonderful world of file management with Python’s pathlib.Path, while simultaneously questioning my life choices and reminiscing about the old ways. 😅
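
For a quick taste, here is a small sketch of the kind of thing pathlib.Path makes pleasant. The folder layout and the helper name `list_tifs` are invented for the example.

```python
from pathlib import Path


def list_tifs(folder):
    """Return the sorted names of all .tif files under `folder`, recursively."""
    return sorted(p.name for p in Path(folder).rglob("*.tif"))


# Paths are joined with `/` instead of string concatenation.
scene = Path("data") / "sentinel2" / "scene_01.tif"  # hypothetical path

print(scene.name)                      # scene_01.tif
print(scene.stem)                      # scene_01
print(scene.suffix)                    # .tif
print(scene.with_suffix(".png").name)  # scene_01.png
```

No more splitting strings on slashes and hoping for the best.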

Improving segmentation model accuracy with Test Time Augmentation

Test Time Augmentation (TTA) is a technique used to improve the accuracy of a machine learning model by generating additional predictions on modified data during inference time and combining them to produce a final (hopefully improved) prediction. TTA is useful when the model is underperforming and cannot be directly improved. While TTA is available in Fastai, it does not yet work for segmentation. However, it is possible to manually implement TTA for any segmentation model, regardless of the modelling library you are using. This can be done by applying an augmentation function to input images before generating predictions, then reversing any geometric augmentation on each prediction so the outputs line up with the original image before they are combined.
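
As a library-free sketch of manual TTA, the snippet below uses plain 2D lists in place of image arrays and flips as the only augmentations. The `predict` argument is a placeholder for whatever segmentation model you have; all names here are illustrative assumptions, not Fastai API.

```python
def identity(grid):
    """Return the grid unchanged."""
    return grid


def hflip(grid):
    """Mirror a 2D list left-to-right."""
    return [row[::-1] for row in grid]


def vflip(grid):
    """Mirror a 2D list top-to-bottom."""
    return grid[::-1]


def predict_with_tta(predict, image):
    """Average predictions over flipped copies of `image`.

    `predict` takes a 2D image and returns a 2D grid of per-pixel
    scores with the same shape.
    """
    # Each flip is its own inverse, so the same function undoes it.
    augmentations = [identity, hflip, vflip]
    aligned = []
    for augment in augmentations:
        prediction = predict(augment(image))
        aligned.append(augment(prediction))  # back to the original orientation

    # Pixel-wise mean of the aligned predictions.
    rows, cols = len(image), len(image[0])
    return [
        [sum(p[r][c] for p in aligned) / len(aligned) for c in range(cols)]
        for r in range(rows)
    ]
```

The important part for segmentation is the un-flipping step: the prediction is a map, so it has to be put back in the original orientation before averaging.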

How to efficiently create millions of overlapping raster tiles

When dealing with large amounts of raster spatial data, you will often find that operations become very slow to perform or just won’t run at all. Often this is the result of single-core processes or simply running out of RAM. Fortunately, in most situations, there is a solution to this issue: chop your raster data into smaller parts and run multiple operations simultaneously. In this post, I will be covering the first half of this workflow: chopping up your data, AKA ‘tiling’.
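
The core of tiling is just working out where each tile starts. Here is a minimal sketch of that bookkeeping, assuming square tiles measured in pixels; the function names are my own for this example, and reading the pixels for each window is left to whichever raster library you use.

```python
def _offsets(length, tile_size, stride):
    """Start offsets along one axis, stopping once a tile reaches the edge."""
    offsets = [0]
    while offsets[-1] + tile_size < length:
        offsets.append(offsets[-1] + stride)
    return offsets


def tile_windows(width, height, tile_size, overlap):
    """Yield (col_off, row_off, tile_width, tile_height) for overlapping tiles.

    Tiles on the right and bottom edges are clipped to the raster size.
    """
    if not 0 <= overlap < tile_size:
        raise ValueError("overlap must be non-negative and smaller than tile_size")
    stride = tile_size - overlap
    for row_off in _offsets(height, tile_size, stride):
        for col_off in _offsets(width, tile_size, stride):
            yield (
                col_off,
                row_off,
                min(tile_size, width - col_off),
                min(tile_size, height - row_off),
            )
```

Because each window is independent of the others, the list of windows can be handed out to multiple worker processes.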

Train a Deep Learning image classifier in 5 minutes with Python

Image classification is the process of assigning a label to an image. This guide will outline how to train a Deep Learning image classifier with a very small amount of code and with limited training data. The approach covered in this post is very powerful and, as such, I find myself using it frequently.

Applying an Azure AutoML model to raster GIS data

This is a walkthrough of a Jupyter Notebook I created to run a vegetation classification model over the Nullarbor Land Conservation District. This notebook assumes you are trying to execute your own geographic classification task and that you already have a trained model from Azure AutoML. It is also assumed that your input raster data is already prepared and all your data has the same extent, pixel size and projection. For my particular application, I was using 100 raster layers at 80 m resolution, which covers 150,000 km² and equates to about 20,000,000 pixels in total.

Point sampling multiple raster files

When performing spatial modeling, it is often necessary to extract raster values at xy point locations. If you’re working with small to moderate amounts of data, this operation can be done within QGIS or similar. However, if you are working with larger datasets or many small datasets, it becomes useful to use a script to do the work for you. I ran into this issue recently when I needed to extract 150,000 point values from several hundred raster files. Needless to say, QGIS did not appreciate me trying to load all of this data into it, so I resorted to building a Jupyter notebook instead.
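
Under the hood, point sampling comes down to turning an xy coordinate into a row and column. The sketch below shows that step for a north-up raster held as a 2D list; the function names and the explicit origin and pixel size arguments are assumptions for illustration, and a raster library would normally read these from the file.

```python
import math


def world_to_pixel(x, y, origin_x, origin_y, pixel_width, pixel_height):
    """Convert map coordinates to (row, col) for a north-up raster.

    (origin_x, origin_y) is the top-left corner of the top-left pixel.
    """
    col = math.floor((x - origin_x) / pixel_width)
    row = math.floor((origin_y - y) / pixel_height)
    return row, col


def sample_points(band, points, origin_x, origin_y, pixel_width, pixel_height):
    """Return the raster value under each (x, y) point, or None if outside."""
    n_rows, n_cols = len(band), len(band[0])
    values = []
    for x, y in points:
        row, col = world_to_pixel(
            x, y, origin_x, origin_y, pixel_width, pixel_height
        )
        inside = 0 <= row < n_rows and 0 <= col < n_cols
        values.append(band[row][col] if inside else None)
    return values
```

Repeating this over several hundred files is then just a loop, with one column of sampled values per raster.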

Raster extent to polygon

Recently, I needed to visualise the extent of a large number of DEM files. I initially tried loading them into QGIS, but this was very clunky and slow. I only wanted to see the extent of the files, so I attempted to use the inbuilt QGIS function ‘Extract layer extent’ and run it as a batch process. Unfortunately, the batch processing window did not appreciate me trying to load a couple of hundred large DEM files and it promptly crashed. So I put a Jupyter Notebook together to do the work for me. This notebook will crawl all files within a directory and all subdirectories and extract the bounding geometry for each file. These bounds are then grouped by projection and saved out as a GeoPackage ready to be viewed in QGIS or similar.
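
The geometry side of this is simple enough to sketch without any GIS libraries. The snippet below assumes the bounds and projection of each file have already been read, and it writes polygons as WKT strings; the function names and the (name, crs, bounds) tuple layout are invented for this example and are not taken from the notebook.

```python
from collections import defaultdict


def bounds_to_polygon_wkt(left, bottom, right, top):
    """Return a bounding box as a closed, counter-clockwise WKT polygon."""
    corners = [
        (left, bottom),
        (right, bottom),
        (right, top),
        (left, top),
        (left, bottom),  # repeat the first corner to close the ring
    ]
    ring = ", ".join(f"{x} {y}" for x, y in corners)
    return f"POLYGON (({ring}))"


def group_extents_by_crs(rasters):
    """Group extent polygons by projection.

    `rasters` is an iterable of (name, crs, (left, bottom, right, top)).
    Returns {crs: [(name, wkt), ...]}.
    """
    groups = defaultdict(list)
    for name, crs, bounds in rasters:
        groups[crs].append((name, bounds_to_polygon_wkt(*bounds)))
    return dict(groups)
```

Grouping by projection matters because a single vector layer can only carry one coordinate reference system.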

Batch compressing raster files

When I’m finished with a project, I often have a large number of raster files on disk which I need to archive for later use. Depending on the structure of the data, it can be highly advantageous to compress the data before storing it away. To streamline the task, I have put together some Python code in a Jupyter Notebook that compresses all raster files in a folder.

Detection of corrupt raster files with Python

Have you ever needed to find a corrupt GeoTIFF file among a large number of valid files? I recently had this issue and put together a useful little Python script to do the work for me. The script is written within a Jupyter notebook so you can run it interactively.