Hoverview is a series of blog posts throughout the ICAERUS project covering all things drone technology. This second instalment presents the uavgeo Python library and a tutorial for orthomosaic analysis.
Traditional geoinformation software packages such as QGIS and ArcGIS offer a great depth of tooling and visual starting points for orthomosaic analysis. The high-resolution imagery of UAV-acquired orthomosaics in particular is pleasant to interact with in these packages. However, as more machine and deep learning approaches come to the forefront of analysis methods, their integration and processing capabilities can be lacking. Luckily for us, Python has much to offer nowadays, with Deep Learning frameworks such as PyTorch and geoinformation libraries such as geopandas and rioxarray.
This inspired me to combine some of the features of both into a single library called uavgeo.
The uavgeo version at the time of writing is v0.2.1.
This library has a set of helper functions built around the aforementioned geopandas and rioxarray that can serve as a step between the raw orthomosaic from the photogrammetry pipeline and Deep Learning. The main contribution is a set of methods to crop the large orthomosaic into smaller sections: a process called ‘chipping’. These chips are sized to fit in the Deep Learning model of choice. Additionally, as we deal with spatial data, we want these chips to be able to return to their original location. This tutorial goes over the whole process, as well as some extra features implemented in uavgeo.
Installation
Make sure to have your Python environment installed and ready. I find the easiest way is to use the Anaconda or Miniforge prompt, as it comes with good environment management and both conda and pip to install packages. To get your Python environment ready, follow the tutorial over at https://docs.anaconda.com/free/anaconda/install/.
Once you have this ready, best practice is to set up an environment specific to each Python project you have, from the anaconda prompt terminal. Here, I will call mine drone.
conda create -n drone python=3.10
Then you need to activate the environment to continue to run the commands from this environment:
conda activate drone
Then I usually install JupyterLab, as I like its Python editing environment; it really is well suited for the data scientist. This can be done through conda or pip, whatever you prefer!
pip install jupyterlab
Finally, uavgeo is made available through PyPI. The simplest way is therefore to install it using the pip command in your Python environment, and then you are ready! This not only installs uavgeo, but also its whole dependency tree: scipy, sklearn, fiona, rasterio, geocube, geopandas, xarray, etc.
pip install uavgeo
You can use the library as is, but I tend to also install the PyTorch frameworks to really get things started from a single environment. This can be quite a hassle, however.
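For reference, installing a CUDA-enabled PyTorch build usually looks something like the command below, but the exact wheel index depends on your CUDA version, so check the selector on https://pytorch.org rather than copying this verbatim.
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu118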
Another way I often work is through Docker containers. These containers hold everything I need to get started immediately: a Jupyter environment, PyTorch, NVIDIA drivers, and also some geo packages. You can install Docker following the tutorial here: https://docs.docker.com/get-docker/. Then you can run the container I made a while ago, which contains all these things and should already launch the JupyterLab editor to work from.
docker run --rm -it --runtime=nvidia -p 8888:8888 --gpus 1 \
    --shm-size=5gb --network=host -v /path_to_local/dir:/home/jovyan \
    jurrain/drone-ml:gpu-torch11.8-uavgeoformers
Now we can get started from the notebook environment!
Loading an orthomosaic
I’d recommend thinking of a simple folder structure for the project. I tend to use a home folder with a data subfolder and a notebooks subfolder: one for the data, one for the notebooks. Something like this:
uavgeo_tutorial/
    data/
        - ortho.tif
        - dsm.tif
    notebooks/
        - chipping.ipynb
As uavgeo presents a large set of helper functions for orthomosaic processing, it is easiest to import all the required packages up front, as it is a bit of a mix-and-match.
import rioxarray as rxr
import geopandas as gpd
import numpy as np
import xarray as xr
import uavgeo as ug
For this example, I will work with an openly available orthomosaic from the ICAERUS project, available from Zenodo over at https://zenodo.org/records/8123870. Extract it and place the .tif orthomosaic in the data/ folder, leaving us with the simple task of just loading the dataset as an xarray array.
ortho = rxr.open_rasterio("../data/ortho.tif")

xarray works similarly to R’s raster package, although a bit more pythonic, and it is much nicer than using rasterio directly, even though on the backend it does use rasterio for all the geographical transformations. It is my preferred way to work with large orthomosaics in Python. There is a lot of nice utility built into xarray, so also take a look at their documentation!
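For example, a couple of one-liners are enough to check what we just loaded (shape, rio.crs and rio.resolution() are standard xarray/rioxarray accessors):
# quick sanity check of the loaded raster
print(ortho.shape)             # (bands, y, x)
print(ortho.rio.crs)           # coordinate reference system
print(ortho.rio.resolution())  # pixel size in CRS units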
Chipping the orthomosaic
The main idea here is to create square images from the orthomosaic, so that they fit the dimensions of a Deep Learning model. Let’s say our model takes in 640 by 640 pixels by 3 bands (Red, Green, Blue). If we rescaled the whole orthomosaic to that size, we would lose a lot of detail. So we chop up the whole orthomosaic into individual smaller pieces, and we store their geographical locations in a new file. Later, we can place these squares, or predictions, or whatever, back into their geographical location.
We have the size of the orthomosaic, and together with the input-pixel dimensions of the Deep Learning model that tells us how many chips we need to cut.
# define the size of the chips
input_dims = {"x": 640, "y": 640}
# the size of the total orthomosaic defines the shape
s_x = ortho.shape[2]
s_y = ortho.shape[1]
# and a crs to apply to the GDF
crs = ortho.rio.crs

chip_gdf = ug.compute.create_chip_bounds_gdf(input_dims = input_dims, shape_x = s_x, shape_y = s_y, crs = crs)
There is also an input_overlap setting, which allows the user to set an overlap between the chips in both x and y. It is given as a dictionary, just like input_dims.
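As a sketch of how that might look, assuming the overlap is given in pixels just like input_dims (check the uavgeo docstring for the exact expected units):
# hypothetical: chips overlapping their neighbours by 64 pixels in x and y
overlap_gdf = ug.compute.create_chip_bounds_gdf(input_dims = input_dims, input_overlap = {"x": 64, "y": 64}, shape_x = s_x, shape_y = s_y, crs = crs)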
Now we have a GeoPandas GeoDataFrame; however, it is missing the coordinates of the boxes in geographical space. So we add those in the next step, using the imgref_to_crsref_boxes function.
chip_gdf = ug.compute.imgref_to_crsref_boxes(raster=ortho, gdf = chip_gdf)
# then apply this column as the new geometry column
chip_gdf = chip_gdf.set_geometry(chip_gdf["c_geom"])

Now the boxes and the orthomosaic are aligned!
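A quick way to check that alignment visually is to draw the chip outlines over the orthomosaic (a minimal sketch assuming matplotlib is available; plotting the full-resolution mosaic can be slow):
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize = (8, 8))
ortho.sel(band = [1, 2, 3]).plot.imshow(ax = ax)                 # RGB orthomosaic as background
chip_gdf.boundary.plot(ax = ax, color = "red", linewidth = 0.5)  # chip outlines on top
plt.show()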
This chip_gdf can now be used to subset chips for training, testing and evaluation images, or whatever else you like. For our final step, we want to extract the square images and save the created files.
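For instance, a simple random 80/20 train/validation split of the chip geometries could be made with plain (geo)pandas sampling; the ratio and random_state here are just illustrative:
# hypothetical 80/20 split of the chips
train_gdf = chip_gdf.sample(frac = 0.8, random_state = 42)
val_gdf = chip_gdf.drop(train_gdf.index)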
First, we store the GeoDataFrame of chips to a GeoJSON file called chips.geojson.
# now is also a good time to store the dataframe for later reference
chip_gdf.to_file("../data/chips.geojson")
Then, with a Python list comprehension, we iterate over the GeoDataFrame. For every box, we cut out a square of the original orthomosaic and place it in a list. This list then contains all the image chips.
# takes in ortho, and iterates over the chip_gdf
chip_list = [ortho.rio.clip_box(minx = row.geometry.bounds[0],
                                miny = row.geometry.bounds[1],
                                maxx = row.geometry.bounds[2],
                                maxy = row.geometry.bounds[3]) for i, row in chip_gdf.iterrows()]
So, for example, we can visualize chip 456 as follows:
#let's look at chip 456
chip_list[456].plot.imshow()
#and we can also take a look at what is actually there:
print(chip_list[456])

So this is just a georeferenced tiny square of the original orthomosaic. With 4 image bands?! Red, Green, Blue and Alpha. This is not quite what my Deep Learning model uses: it likes three bands and pixels that are not geographically located! The following box adjusts this nicely.
# Usually ML models do not like CRS coordinates, and want R,G,B as bands
# -> basically the xr.DataArray format is nice for geo-stuff, but not for ML.
# So let's reformat them to more classical numpy images.

#my example ortho has 4 bands: r,g,b and alpha: let's only select the first three
chip_list = [img.sel(band = [1, 2, 3]) for img in chip_list]
#now we can transpose the x,y,band to the expected numpy colour image format
np_imgs = [np.transpose(img.values, (1, 2, 0)) for img in chip_list]
You can immediately feed these into a PyTorch dataloader, or really do whatever you like with them from this point. Perhaps saving the images to disk, making them more accessible for other research activities, is also a good step.
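As a minimal sketch of the dataloader route (assuming PyTorch is installed and the unlabeled chips are used as-is, e.g. for inference), the list of numpy images can be wrapped like this; ChipDataset is just a made-up helper name:
import torch
from torch.utils.data import Dataset, DataLoader

class ChipDataset(Dataset):
    # wraps the list of HxWxC uint8 numpy chips as CxHxW float tensors
    def __init__(self, imgs):
        self.imgs = imgs

    def __len__(self):
        return len(self.imgs)

    def __getitem__(self, idx):
        return torch.from_numpy(self.imgs[idx]).permute(2, 0, 1).float() / 255.0

loader = DataLoader(ChipDataset(np_imgs), batch_size = 8, shuffle = False)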
# this could be a training set or whatever, so it can all be saved to disk
import os
from PIL import Image

def export_image_list(path, img_list):
    # create the output folder once, if needed
    if not os.path.exists(path):
        os.mkdir(path)
    for i in range(len(img_list)):
        filename = f"{i:0>6}.jpg"
        filepath = os.path.join(path, filename)
        im = Image.fromarray(img_list[i])
        im.save(filepath)

export_image_list(path = "../data/chips", img_list = np_imgs)
Reconstructing the orthomosaic
Let’s say you trained and evaluated your model, and applied it to the RGB chips we created earlier. Now you have many RGB chips, but are missing the orthomosaic! We should reconstruct it, based on the coordinates in the chip_gdf we saved earlier.
# To start we need an empty array to fill, which is georeferenced etc.
# wait: we have the original ortho to do that!
empty_raster = xr.full_like(ortho.sel(band = [1, 2, 3]), fill_value = np.nan)
# we also need the input chip-gdf from earlier: which is why we saved it
chip_gdf = gpd.read_file("../data/chips.geojson")
# and the image chips as a list of numpy arrays; I'll just reuse np_imgs from earlier
# this takes a little while; clip should be set to True when your chip list
# and chip_geoms are based on a smaller subset of the input raster
reconstructed = ug.compute.chips_to_single(chiplist = np_imgs,
                                           empty_raster = empty_raster,
                                           chip_geoms = chip_gdf,
                                           clip = False)
It is important here to load your image chips back as numpy images, placed in order in a list that follows the formatting from earlier! Now we have reconstructed the loaded chips back into an orthomosaic. Whilst for this example we just used the RGB chips we clipped earlier, it is easy to imagine that the chips could be a classification map, mask, style-transfer result, etc.! It is nice to have it back in the original orthomosaic format for more GIS analysis down the pipeline.
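To close the loop, the reconstructed mosaic can be written straight back to a GeoTIFF with rioxarray, ready for QGIS or ArcGIS (the output filename is just an example):
# save the reconstructed raster as a georeferenced GeoTIFF
reconstructed.rio.to_raster("../data/reconstructed.tif")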
Extra 1: Index calculation
uavgeo also contains a variety of index calculations, based on the list from FieldImageR, with some additional indices added. They can be accessed through the uavgeo.compute module. The list of available indices can be found in the readme, or at the bottom of this page! By default the functions rescale the output floats back to uint8 (0-255). This behaviour can be turned off with the rescale = False parameter.
So for the RGBVI index, we could use the following code:
import rioxarray as rxr
import geopandas as gpd
import uavgeo as ug
#load our orthomosaic from earlier:
ortho = rxr.open_rasterio('../data/ortho.tif')
#then calculate + plot RGBVI:
rgbvi = ug.compute.calc_rgbvi(ortho, red_id = 2, green_id =1, blue_id=0, rescale = False)
rgbvi.plot(cmap = 'RdYlGn', vmin = -1, vmax = 1)

Extra 2: DTM calculation
Whilst the surface model (DSM) out of the photogrammetry software might be nice, you might want to work with just the terrain model (DTM) or even the tree canopy. uavgeo has some functions to get you started on calculating a terrain model (DTM) from the DSM.
Starting with the imports, and loading our DSM.
import rioxarray as rxr
import geopandas as gpd
import uavgeo as ug
#load our dsm from the data folder:
dsm = rxr.open_rasterio('../data/dsm.tif')
As input to the functions, we want to identify the size of the pixels in geographical space, as well as the size to sample the DTM from. I usually set this second value between 1 and 2 meters.
The function then creates squares of size sampling_meters. Within each square, the minimum value is found, which is finally interpolated over the input DSM. We get an image of the same size and shape, but containing only the low points: a DTM!
pixel_size = dsm.rio.transform()[0]
print(pixel_size)

# sampling_meters defines the grid size from which to take the minimum value; in the case of an orchard,
# we can expect every square meter to contain at least 1 pixel with the lowest ground-point available.
sampling_meters = 2

#the function will take a bit longer than the progress bar shows
dtm = ug.compute.calc_dtm_from_dsm(dsm, pixel_size, sampling_meters)
#and calculate the canopy height model:
chm = dsm - dtm
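The resulting DTM and CHM are regular georeferenced DataArrays, so they can be inspected and exported just like the orthomosaic; for example (the output path is only a suggestion):
# visualise the canopy height model (squeeze drops a singleton band dimension, if present)
chm.squeeze().plot(cmap = "viridis", robust = True)
# and write it out for further GIS analysis
chm.rio.to_raster("../data/chm.tif")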


Included indices
| Index | calc_indexname | Description | Formula | Related Traits | References |
|-------|----------------|-------------|---------|----------------|------------|
| BI | calc_bi | Brightness Index | sqrt((R^2+G^2+B^2)/3) | Vegetation coverage, water content | Richardson and Wiegand (1977) |
| SCI | calc_sci | Soil Color Index | (R-G)/(R+G) | Soil color | Mathieu et al. (1998) |
| GLI | calc_gli | Green Leaf Index | (2*G-R-B)/(2*G+R+B) | Chlorophyll | Louhaichi et al. (2001) |
| HI | calc_hi | Hue Index | (2*R-G-B)/(G-B) | Soil color | Escadafal et al. (1994) |
| NGRDI | calc_ngrdi | Normalized Green Red Difference Index | (G-R)/(G+R) | Chlorophyll, biomass, water content | Tucker (1979) |
| SI | calc_si | Saturation Index | (R-B)/(R+B) | Soil color | Escadafal et al. (1994) |
| VARI | calc_vari | Visible Atmospherically Resistant Index | (G-R)/(G+R-B) | Canopy, biomass, chlorophyll | Gitelson et al. (2002) |
| HUE | calc_hue | Overall Hue Index | atan(2*(B-G-R)/30.5*(G-R)) | Soil color | Escadafal et al. (1994) |
| BGI | calc_bgi | Blue Green Pigment Index | B/G | Chlorophyll | Zarco-Tejada et al. (2005) |
| PSRI | calc_psri | Plant Senescence Reflectance Index | (R-G)/RE | Chlorophyll, LAI | Merzlyak et al. (1999) |
| NDVI | calc_ndvi | Normalized Difference Vegetation Index | (NIR-R)/(NIR+R) | Chlorophyll, nitrogen, maturity | Rouse et al. (1974) |
| GNDVI | calc_gndvi | Green Normalized Difference Vegetation Index | (NIR-G)/(NIR+G) | Chlorophyll, LAI, biomass, yield | Gitelson et al. (1996) |
| RVI | calc_rvi | Ratio Vegetation Index | NIR/R | Chlorophyll, LAI, nitrogen, protein content, water content | Pearson and Miller (1972) |
| NDRE | calc_ndre | Normalized Difference Red Edge Index | (NIR-RE)/(NIR+RE) | Biomass, water content, nitrogen | Gitelson and Merzlyak (1994) |
| TVI | calc_tvi | Triangular Vegetation Index | 0.5*(120*(NIR-G)-200*(R-G)) | Chlorophyll | Broge and Leblanc (2000) |
| CVI | calc_cvi | Chlorophyll Vegetation Index | (NIR*R)/(G^2) | Chlorophyll | Vincini et al. (2008) |
| EVI | calc_evi | Enhanced Vegetation Index | 2.5*(NIR-R)/(NIR+6*R-7.5*B) | Nitrogen, chlorophyll | Huete et al. (2002) |
| CIG | calc_cig | Chlorophyll Index - Green | (NIR/G)-1 | Chlorophyll | Gitelson et al. (2003) |
| CIRE | calc_cire | Chlorophyll Index - Red Edge | (NIR/RE)-1 | Chlorophyll | Gitelson et al. (2003) |
| DVI | calc_dvi | Difference Vegetation Index | NIR-RE | Nitrogen, chlorophyll | Jordan (1969) |
| RGBVI | calc_rgbvi | RGB Vegetation Index | ((G^2)-(R*B))/((G^2)+(R*B)) | Nitrogen, chlorophyll | Bendig et al. (2015) |
| SAVI | calc_savi | Soil Adjusted Vegetation Index | (NIR-R)/(NIR+R+L)*(1+L) | Vegetation coverage, LAI | Huete (1988) |
| NDWI | calc_ndwi | Normalized Difference Water Index | (G-NIR)/(G+NIR) | Water coverage, water content | McFeeters (1996) |
| MNDWI | calc_mndwi | Modified Normalized Difference Water Index | (G-SWIR)/(G+SWIR) | Water coverage, water content | McFeeters (1996) |
| AWEIsh | calc_aweish | Automated Water Extraction Index (sh) | B + 2.5*G - 1.5*(NIR-SWIR1) - 0.25*SWIR2 | Water coverage, water content | Fayeisha (2014) |
| AWEInsh | calc_aweinsh | Automated Water Extraction Index (nsh) | 4*(G-SWIR1) - (0.25*NIR + 2.75*SWIR1) | Water coverage, water content | Fayeisha (2014) |