Shadow Detection in Machine Learning for Wildlife Management

How the conservation of semi-arid savanna ecosystems can benefit from data science and artificial intelligence algorithms

This project has been developed in collaboration with

As ecosystems are increasingly threatened by rising land-use pressure from human population growth and by climate change, effective wildlife management becomes vital to preserving and protecting them. Traditional methods, such as manually counting animals of different species or recognizing landscape features in bounded areas, are often too complex or expensive to implement. In cooperation with the Kuzikus Wildlife Reserve in Namibia, the non-profit organization Wild Intelligence Lab aims to address these issues and to have a lasting impact on the conservation of ecosystems through innovative technologies such as tailored machine learning algorithms. Together with TechLabs Aachen, a volunteer organization focused on equipping young people with tech expertise, Wild Intelligence Lab has established a multidisciplinary team of students and experts combining knowledge from mechanical engineering, automation, economics and computer science.

So far, counting and detecting different species and landscape features from the ground has been a slow, time-consuming process that can hardly provide good coverage of the landscape. Earlier methods, such as manual recording from a helicopter, disturb the animals through noise and the shadow the aircraft casts. To overcome these problems, high-resolution photographic maps from unmanned aerial vehicles (UAVs) or satellites are used to train machine learning algorithms.

By automatically extracting data from the images with machine learning algorithms, detailed information about ecosystems, such as animal behavior during long droughts, can be obtained. To automate the extraction of these data points, machine learning models need to be developed. To identify trees and animals, we have chosen an approach based on classifying the shadows of animals and vegetation. The following paragraphs describe the developments to date and are intended for readers of all backgrounds. We hope to give you a detailed insight into the intersection of wildlife conservation and technology, and we encourage you to contact us with any questions.

Slicing Orthomosaics

The raw data captured by the UAVs is delivered as high-resolution orthomosaics (large photographic maps in the GeoTIFF format). To reduce the computational cost of further processing, the orthomosaics are sliced into smaller images, as shown in Figure 1.
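The slicing step can be sketched in a few lines of NumPy. This is a minimal illustration, not the project's actual code: the function name `slice_into_tiles`, the tile size default and the synthetic array are assumptions made for the example, and the orthomosaic is assumed to be already loaded as a rows × columns × channels array.

```python
import numpy as np

def slice_into_tiles(image, tile_size=1000):
    """Yield (top, left, tile) for non-overlapping tiles of an H x W x C array.

    Tiles on the bottom and right edges are smaller when H or W is not a
    multiple of tile_size.
    """
    height, width = image.shape[:2]
    for top in range(0, height, tile_size):
        for left in range(0, width, tile_size):
            yield top, left, image[top:top + tile_size, left:left + tile_size]

# Small synthetic stand-in for an orthomosaic (real ones are far larger).
mosaic = np.zeros((2500, 3000, 4), dtype=np.uint8)
tiles = list(slice_into_tiles(mosaic, tile_size=1000))
```

Keeping the pixel offset (`top`, `left`) with each tile makes it possible to map a tile back to its position in the orthomosaic later. For a real GeoTIFF, rasterio's windowed reading could be used instead, so that the whole mosaic never has to be held in memory at once.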

One of these sliced images, with a resolution of 1000×1000 pixels, is shown in Figure 2. It includes some characteristic features of an arid sub-Saharan ecosystem, such as soil, aardvark holes, trees and animals. Each image consists of four bands (red, green, blue and an alpha channel), which combine to form the image we see. For the subsequent image analysis, the first three channels are extracted, since they are the ones needed for the calculations. In each color band, every pixel is assigned a value between 0 and 255 describing the intensity of that color at that pixel position. A perfectly red pixel, for example, has a value of 255 in the red channel and 0 in the green and blue channels.
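As a small illustration of the channel layout, the following sketch drops the alpha channel from a tiny made-up tile and checks the pure-red example. The 2×2 tile and its values are invented for the example, and the array is assumed to be stored channels-last; rasterio returns bands first, so its output would need to be transposed before this applies.

```python
import numpy as np

# Hypothetical 2 x 2 RGBA tile: channels last, values in 0..255.
tile = np.array(
    [[[255, 0, 0, 255], [0, 255, 0, 255]],
     [[0, 0, 255, 255], [40, 35, 30, 255]]],
    dtype=np.uint8,
)

rgb = tile[..., :3]  # drop the alpha channel, keep red, green, blue
red, green, blue = (rgb[..., i] for i in range(3))

# The top-left pixel is pure red: 255 in red, 0 in green and blue.
is_pure_red = tuple(rgb[0, 0]) == (255, 0, 0)
```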

After these steps, each sliced image is available as a stacked array of its color channels and is stored in a pandas data frame. To check that the slicing algorithm works correctly, the sliced and restacked images should be written to a new directory. For further calculations, a second data frame is created containing the image names and the GPS coordinates, which can be extracted from the orthomosaic metadata. In addition, the mean pixel intensity and the standard deviation are calculated for each image. A histogram is used to determine the most frequent color value in each channel, which is also saved in the data frame. For these calculations, implementations from NumPy, OpenCV and rasterio can be used.
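The per-image statistics could be computed roughly as follows. This is a sketch under assumptions: the function name `tile_statistics`, the dictionary keys and the file name are hypothetical, and the statistics are taken over all three channels together, which may differ from how the project defines them.

```python
import numpy as np

def tile_statistics(rgb, name):
    """Return summary statistics for one H x W x 3 uint8 tile as a dict."""
    stats = {
        "name": name,
        "mean_intensity": float(rgb.mean()),
        "std_intensity": float(rgb.std()),
    }
    for index, channel in enumerate(("red", "green", "blue")):
        # 256-bin histogram: counts[v] is the number of pixels with value v.
        counts = np.bincount(rgb[..., index].ravel(), minlength=256)
        stats[f"{channel}_mode"] = int(counts.argmax())
    return stats

# Synthetic tile: uniform value 100 except for one darker pixel.
tile = np.full((4, 4, 3), 100, dtype=np.uint8)
tile[0, 0] = (10, 20, 30)
row = tile_statistics(tile, "tile_0_0.tif")
```

A list of such dictionaries, one per tile, can be passed directly to `pandas.DataFrame` to build the data frame described above.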

Land Classification using K-Nearest Neighbors (KNN)

In the previous steps, the raw data was prepared for implementing a classification algorithm for feature detection.

“Tell me who your neighbours are, and I’ll tell you who you are”

Although the quote above could just as well apply to human relationships, here it refers to the KNN algorithm, a simple supervised classification algorithm. In terms of data, it means that the class of a data point is determined by the classes of the training data points closest to it. In terms of pixels, a pixel in an image can be classified as shadow by looking at its nearest “pixel neighbours” in the training data. Comparing the previously calculated values, such as the pixel intensities and their maxima and minima, gives a first idea of where features are located in an image.

One of our first objectives is to create a hand-labelled dataset for the shadow classifier by manually classifying shadow masks. To create the shadow masks, each pixel is predicted from a second dataset that contains pixel intensity combinations and their corresponding classes; this dataset is used to construct the KNN classifier.
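To make the idea concrete, here is a minimal KNN pixel classifier written directly in NumPy. It is a sketch, not the project's implementation: the training pixels, their labels, the choice of k = 3 and the function name `knn_predict` are all assumptions made for illustration. In practice a library implementation such as scikit-learn's `KNeighborsClassifier` would be the more usual choice.

```python
import numpy as np

def knn_predict(train_pixels, train_labels, query_pixels, k=3):
    """Classify each query pixel by majority vote among its k nearest training pixels."""
    train = train_pixels.astype(np.float64)
    query = query_pixels.astype(np.float64)
    # Squared Euclidean distance between every query and every training pixel.
    distances = ((query[:, None, :] - train[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(distances, axis=1)[:, :k]
    votes = train_labels[nearest]  # shape: (n_queries, k)
    # Labels are 0 (no shadow) or 1 (shadow), so the majority is a mean above 0.5.
    return (votes.mean(axis=1) > 0.5).astype(np.uint8)

# Hypothetical hand-labelled RGB samples: dark pixels are shadow (1), bright are not (0).
train_pixels = np.array([[20, 20, 25], [35, 30, 30], [40, 45, 40],
                         [180, 150, 120], [200, 170, 140], [160, 140, 110]])
train_labels = np.array([1, 1, 1, 0, 0, 0])

# A 2 x 2 tile is flattened to a list of pixels, classified, and reshaped into a mask.
tile = np.array([[[30, 30, 30], [190, 160, 130]],
                 [[170, 145, 115], [25, 22, 28]]], dtype=np.uint8)
mask = knn_predict(train_pixels, train_labels, tile.reshape(-1, 3)).reshape(tile.shape[:2])
```

The full distance matrix grows with the number of query pixels times the number of training pixels, so this direct form is only practical for small inputs; it is meant to show the voting logic rather than to scale to 1000×1000 tiles.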

Identifying and Extracting Shadow Data

To create the dataset for the shadow classifier, we wanted to automate large parts of the process. The identification of shadows relies on the contours derived from the KNN output, which is provided in the form of a binary mask (as shown in Figure 2). Once the contour analysis is done, a sub-image has to be cropped out and labeled for every shadow contour.
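The cropping step can be illustrated without OpenCV by finding connected regions in the binary mask and cutting out their bounding boxes. This is a simplified stand-in for contour analysis, with a hypothetical function name and toy data; with OpenCV, `cv2.findContours` together with `cv2.boundingRect` would typically serve the same purpose.

```python
from collections import deque
import numpy as np

def shadow_bounding_boxes(mask):
    """Return (top, left, bottom, right) for each 4-connected region of nonzero pixels.

    bottom and right are exclusive, so they can be used directly as slice ends.
    """
    height, width = mask.shape
    seen = np.zeros(mask.shape, dtype=bool)
    boxes = []
    for row in range(height):
        for col in range(width):
            if not mask[row, col] or seen[row, col]:
                continue
            # Breadth-first flood fill over the region containing (row, col).
            queue = deque([(row, col)])
            seen[row, col] = True
            top, left, bottom, right = row, col, row, col
            while queue:
                r, c = queue.popleft()
                top, bottom = min(top, r), max(bottom, r)
                left, right = min(left, c), max(right, c)
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if (0 <= nr < height and 0 <= nc < width
                            and mask[nr, nc] and not seen[nr, nc]):
                        seen[nr, nc] = True
                        queue.append((nr, nc))
            boxes.append((top, left, bottom + 1, right + 1))
    return boxes

# Toy mask with two separate shadow regions, and a matching toy RGB image.
mask = np.array([[1, 1, 0, 0, 0],
                 [1, 0, 0, 0, 1],
                 [0, 0, 0, 1, 1]], dtype=np.uint8)
image = np.arange(mask.size * 3, dtype=np.uint8).reshape(3, 5, 3)

boxes = shadow_bounding_boxes(mask)
crops = [image[t:b, l:r] for t, l, b, r in boxes]
```

Each entry in `crops` is a sub-image that could then be saved and labeled by hand.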

The green elements mark the tree shadows detected by the algorithm. To smooth the output, it has proven helpful to apply morphological transformations, which are also included in the OpenCV package. With erosion and dilation, the area enclosed by a detected shadow contour can be decreased or increased, respectively. The opening function, which is an erosion followed by a dilation, is used to remove noise. Identifying the structural outlines of the shadows helps to approximate the contour. Furthermore, contour functions can extract specific information such as area or perimeter values, which can be saved to a data frame. In addition, the height of a tree can be estimated by fitting a line through its shadow: the length of this line, combined with the angle of the sun, can be used to predict the vertical size of the tree.
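Two of these ideas lend themselves to a short sketch: the opening operation and the height estimate. The code below implements opening with a fixed 3×3 structuring element in plain NumPy to show what the operation does; with OpenCV, `cv2.morphologyEx` with `cv2.MORPH_OPEN` would normally be used. The height formula assumes flat ground, a vertical tree, a shadow length already converted from pixels to metres, and a sun elevation measured from the horizon. These assumptions, the function names and the example numbers are not taken from the project.

```python
import math
import numpy as np

def _shifted_stack(mask, fill):
    """Stack the nine 3 x 3 neighbourhood shifts of a 2-D boolean mask, padding with `fill`."""
    padded = np.pad(mask, 1, constant_values=fill)
    height, width = mask.shape
    return np.stack([padded[dr:dr + height, dc:dc + width]
                     for dr in range(3) for dc in range(3)])

def erode(mask):
    # A pixel survives only if its whole 3 x 3 neighbourhood is set.
    return _shifted_stack(mask, True).all(axis=0)

def dilate(mask):
    # A pixel is set if any pixel in its 3 x 3 neighbourhood is set.
    return _shifted_stack(mask, False).any(axis=0)

def opening(mask):
    """Erosion followed by dilation: removes specks smaller than the 3 x 3 element."""
    return dilate(erode(mask))

def tree_height(shadow_length_m, sun_elevation_deg):
    """Height of a vertical object on flat ground from its shadow length."""
    return shadow_length_m * math.tan(math.radians(sun_elevation_deg))

mask = np.zeros((7, 7), dtype=bool)
mask[1:4, 1:4] = True  # a 3 x 3 shadow blob
mask[5, 5] = True      # a single-pixel speck of noise
opened = opening(mask)

# With the sun 45 degrees above the horizon, a 6 m shadow implies a tree about 6 m tall.
height_m = tree_height(shadow_length_m=6.0, sun_elevation_deg=45.0)
```

In the example, the opening keeps the 3×3 blob intact and removes the isolated pixel, which is the noise-suppressing behaviour described above.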

What’s next: The Neural Network

At this point in the project, the team has developed an algorithm that processes large amounts of image data and extracts the shadow information from the images for post-processing. For data augmentation and training, a reliably labeled shadow dataset has to be created. Once the training dataset exists, the next step is to implement a working Convolutional Neural Network (CNN) for binary classification. After this step, the algorithms will be ready to be tested on new, unseen data.

Conclusion

By developing a machine learning algorithm, the shadow detection team of the Kuzikus Wildlife Reserve, Wild Intelligence Lab and TechLabs Aachen is well on its way to developing a tool that makes a lasting contribution to automating ecosystem quantification for wildlife conservation. Extracting information from the contours of the shadows yields a method for characterizing the ecosystem without disturbing its natural balance. In this way, the natural habitats of different species can be determined, or conclusions about the fertility of the land can be drawn from the accumulation of particular tree species in an area. By combining innovative technologies and machine learning with the experience gathered in previous projects by the locally active Kuzikus team, important data for the protection of an ecosystem can be obtained.