Document Type

Theses, Ph.D

Rights

Available under a Creative Commons Attribution Non-Commercial Share Alike 4.0 International Licence

Disciplines

1.2 COMPUTER AND INFORMATION SCIENCE

Publication Details

Thesis submitted for the degree of Doctor of Philosophy, Technological University Dublin, December 2021.

Abstract

To ensure that pasture-based farming meets production and environmental targets for a growing population under increasing resource constraints, producers need to know pastureland traits. Current proximal pastureland trait prediction methods largely rely on vegetation indices to determine biomass and moisture content. The development of new techniques relies on the challenging task of collecting labelled pastureland data, leading to small datasets. Classical computer vision has already been applied to weed identification and recognition of fruit blemishes using morphological features, but machine learning algorithms can parameterise models without the provision of explicit features, and deep learning can extract even more abstract knowledge although typically this is assumed to be based around very large datasets.

This work hypothesises that through the advantages of state-of-the-art deep learning systems, pastureland crop traits can be accurately assessed in a just-in-time fashion, based on data retrieved from an inexpensive sensor platform, under the constraint of limited amounts of labelled data. However the challenges to achieve this overall goal are great, and for applications such as just-in-time yield and moisture estimation for farm-machinery, this work must bring together systems development, knowledge of good pastureland practice, and also techniques for handling low-volume datasets in a machine learning context.

Given these challenges, this thesis makes a number of contributions. The first of these is a comprehensive literature review, relating pastureland traits to ruminant nutrient requirements and exploring trait estimation methods, from contact to remote sensing methods, including details of vegetation indices and the sensors and techniques required to use them.

The second major contribution is a high-level specification of a platform for collecting and labelling pastureland data. This includes the collection of four-channel Blue, Green, Red and NIR (VISNIR) images, narrowband data, height and temperature differential, using inexpensive proximal sensors and provides a basis for holistic data analysis. Physical data platforms built around this specification were created to collect and label pastureland data, involving computer scientists, agricultural, mechanical and electronic engineers, and biologists from academia and industry, working with farmers.

Using the developed platform and a set of protocols for data collection, a further contribution of this work was the collection of a multi-sensor multimodal dataset for pastureland properties. This was made up of four-channel image data, height data, thermal data, Global Positioning System (GPS) and hyperspectral data, and is available and labelled with biomass (Kg/Ha) and percentage dry matter, ready for use in deep learning.

However, the most notable contribution of this work was a systematic investigation of various machine learning methods applied to the collected data in order to maximise model performance under the constraints indicated above. The initial set of models focused on collected hyperspectral datasets. However, due to their relative complexity in real-time deployment, the focus was instead on models that could best leverage image data.

The main body of these models centred on image processing methods and, in particular, the use of the so-called Inception Resnet and MobileNet models to predict fresh biomass and percentage dry matter, enhancing performance using data fusion, transfer learning and multi-task learning.

Images were subdivided to augment the dataset, using two different patch sizes, resulting in around 10,000 small patches of size 156 x 156 pixels and around 5,000 large patches of size 240 x 240 pixels. Five-fold cross validation was used in all analysis. Prediction accuracy was compared to older mechanisms, albeit using hyperspectral data collected, with no provision made for lighting, humidity or temperature.

Hyperspectral labelled data did not produce accurate results when used to calculate Normalized Difference Vegetation Index (NDVI), or to train a neural network (NN), a 1D Convolutional Neural Network (CNN) or Long Short Term Memory (LSTM) models. Potential reasons for this are discussed, including issues around the use of highly sensitive devices in uncontrolled environments.

The most accurate prediction came from a multi-modal hybrid model that concatenated output from an Inception ResNet based model, run on RGB data with ImageNet pre-trained RGB weights, output from a residual network trained on NIR data, and LiDAR height data, before fully connected layers, using the small patch dataset with a minimum validation MAPE of 28.23% for fresh biomass and 11.43% for dryness. However, a very similar prediction accuracy resulted from a model that omitted NIR data, thus requiring fewer sensors and training resources, making it more sustainable. Although NIR and temperature differential data were collected and used for analysis, neither improved prediction accuracy, with the Inception ResNet model’s minimum validation MAPE rising to 39.42% when NIR data was added. When both NIR data and temperature differential were added to a multi-task learning Inception ResNet model, it yielded a minimum validation MAPE of 33.32%.

As more labelled data are collected, the models can be further trained, enabling sensors on mowers to collect data and give timely trait information to farmers. This technology is also transferable to other crops. Overall, this work should provide a valuable contribution to the smart agriculture research space.


Share

COinS