External Use Case

EyeOnWater

Measuring Life in Water

Objective

MARIS develops and operates the successful EyeOnWater app, which invites users to capture and share images of water surfaces with their smartphones. This helps researchers classify rivers, lakes, coastal waters, seas, and oceans by colour. It can be used for both fresh and saline natural waters. The observations made via the app extend a long-term (over 100 years) series of water colour observations made by scientists in the past.

MARIS’s iMagine challenge has been to use iMagine AI services and approaches to automatically detect and reject invalid images.

Research Activities

The use case comprised the following research activities:  

  • Analysis of object detection and image segmentation models within the domain of Computer Vision, including identifying available models.
  • Evaluation of suitable object detection and image segmentation models for use in EyeOnWater and the iMagine platform.
  • In-depth research into the functionality and implementation of YOLOv8, both locally and on the iMagine platform (a training sketch follows this list), with the following subgoals:
    • Identification of a suitable training dataset.
    • Investigation of hyperparameters to optimise the training model.
    • Analysis of false positives and negatives to gain insights into ambiguous predictions by YOLOv8.
    • Study of augmentation techniques and their applicability to improve the EyeOnWater training dataset.
    • Determination of the most suitable YOLOv8 model (classification, detection, segmentation) for approving or rejecting images.
    • Identification of the most appropriate labels/classes for classification in the context of image approval.
  • Research into the iMagine platform, focusing on setting up deployments and uploading models and datasets, while testing the user experience.
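As an illustration of the YOLOv8 work described above, here is a minimal sketch of a classification training run with the ultralytics package. The dataset path, model variant, and hyperparameter values are illustrative assumptions, not the settings actually used in the use case.

```python
# Minimal sketch of a YOLOv8 classification training run (ultralytics).
# Dataset path, model size, and hyperparameters are illustrative only.
from ultralytics import YOLO

# Start from a pretrained classification checkpoint (nano variant here).
model = YOLO("yolov8n-cls.pt")

# For classification, `data` points at a folder containing train/ and val/
# subfolders, each with one subfolder per class
# (e.g. water_good/, water_bad/, bad/).
model.train(
    data="datasets/eyeonwater",  # hypothetical dataset location
    epochs=50,                   # example values only; tuning such
    imgsz=224,                   # hyperparameters was part of the research
    batch=32,
)

# Evaluate the trained model on the validation split.
metrics = model.val()
print(metrics.top1)  # top-1 accuracy
```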

Execution

  • Manual classification of an extensive training dataset (10,000+ images) into the correct labels. The dataset is categorised into three distinct classes: “water_good,” “water_bad,” and “bad.” The “water_good” class contains images that meet the requirements of EyeOnWater. The “water_bad” class contains images of water that do not meet these requirements; it sharpens the distinction between acceptable and unacceptable water images. Finally, the “bad” class contains user submissions that do not depict water.
  • Development of a script to map the “classified” folders to the folder structure used by YOLOv8 (see the first sketch after this list).
  • Multiple training runs of the dataset with YOLOv8, based on the “classified” folders.
  • Validation and testing of the trained model.
  • Modification of the YOLOv8 code to make false positives and negatives visible.
  • Development of a script for quality control of images in the dataset, including checks for minimum size, corruption, and correct extension (see the second sketch after this list).
  • Development of a script for dataset augmentation with storage in the correct folder structure (see the third sketch after this list). Using this script, the set of original images (1,700 in total) was augmented by rotating, displacing, and resizing them. The initial distribution of images was uneven, with 11,406 images in the “water_good” class, 1,789 in “water_bad,” and only 466 in “bad.” To achieve a more balanced dataset, the images were augmented so that each class contained approximately 3,000 images. This balancing helps the model learn from each class more effectively and gives a more consistent representation across categories.
  • Development of a prediction script that approves or rejects images based on a confidence percentage.
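The folder-mapping script is not published with this text, so the following is a hypothetical sketch of how the manually classified folders could be copied into the train/val layout that YOLOv8 classification expects. The paths, class names, and split ratio are assumptions.

```python
# Hypothetical sketch: copy manually classified images into the
# train/val folder layout consumed by YOLOv8 classification.
import random
import shutil
from pathlib import Path

CLASSES = ["water_good", "water_bad", "bad"]
SRC = Path("classified")           # one folder per class, labelled manually
DST = Path("datasets/eyeonwater")  # layout consumed by YOLOv8
VAL_FRACTION = 0.2                 # illustrative train/val split

random.seed(42)  # reproducible split
for cls in CLASSES:
    images = sorted((SRC / cls).glob("*.jpg"))
    random.shuffle(images)
    n_val = int(len(images) * VAL_FRACTION)
    for split, subset in (("val", images[:n_val]), ("train", images[n_val:])):
        out_dir = DST / split / cls
        out_dir.mkdir(parents=True, exist_ok=True)
        for img in subset:
            shutil.copy2(img, out_dir / img.name)
```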
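The quality-control script is likewise sketched below under assumptions: a hypothetical minimum side length, a hypothetical set of accepted extensions, and Pillow's verify() as the corruption check.

```python
# Sketch of the quality-control checks: extension, corruption, minimum size.
# Thresholds and accepted extensions are illustrative assumptions.
from pathlib import Path
from PIL import Image

MIN_SIDE = 224                       # hypothetical minimum width/height (px)
ALLOWED = {".jpg", ".jpeg", ".png"}  # hypothetical accepted extensions

def check_image(path: Path) -> bool:
    """Return True if the image passes all quality checks."""
    if path.suffix.lower() not in ALLOWED:
        return False
    try:
        with Image.open(path) as img:
            img.verify()               # detects truncated or corrupt files
        with Image.open(path) as img:  # verify() invalidates the handle, reopen
            width, height = img.size
    except Exception:
        return False
    return min(width, height) >= MIN_SIDE

bad = [p for p in Path("classified").rglob("*") if p.is_file() and not check_image(p)]
print(f"{len(bad)} files failed quality control")
```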
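Finally, a sketch of the augmentation step. The rotation, displacement, and resize operations come from the description above; the parameter ranges, Pillow as the image library, and the file naming are illustrative assumptions.

```python
# Sketch: augment images (rotate, displace, resize) until each class
# holds roughly 3,000 images. Parameter ranges are illustrative.
import random
from pathlib import Path
from PIL import Image

TARGET_PER_CLASS = 3000

def augment(img: Image.Image) -> Image.Image:
    """Apply a random rotation, translation, and resize to one image."""
    img = img.rotate(random.uniform(-25, 25))
    dx, dy = random.randint(-20, 20), random.randint(-20, 20)
    img = img.transform(img.size, Image.AFFINE, (1, 0, dx, 0, 1, dy))
    scale = random.uniform(0.8, 1.2)
    w, h = img.size
    return img.resize((int(w * scale), int(h * scale)))

for cls_dir in Path("datasets/eyeonwater/train").iterdir():
    originals = sorted(cls_dir.glob("*.jpg"))
    count = 0
    while originals and len(originals) + count < TARGET_PER_CLASS:
        src = random.choice(originals)
        with Image.open(src) as img:
            augment(img.convert("RGB")).save(cls_dir / f"aug_{count}_{src.name}")
        count += 1
```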

Results

The image showcases a set of images that meet the quality criteria of EyeOnWater and are classified as “water_good” by the AI solution.

The image shows water-related samples from a validation batch during training, with predicted labels such as “water_good,” “water_bad,” and “bad.” To verify the training, predicted labels are compared with the ground truth to analyse patterns in correct and incorrect classifications and to check that the model generalises well to unseen data.

The use case achieved its goal of developing a script that retrieves unanalysed images from the database based on n_code, collects the images from the server, and checks them for minimum size, correct extension, and corruption. The script then predicts the status of each image and updates the database with the results, indicating whether the image should be approved, rejected, or manually reviewed.
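The approve/reject decision can be illustrated with a short sketch. The trained-weights path, the confidence threshold, and the exact decision rule are assumptions; the real script also reads from and writes to the EyeOnWater database, which is omitted here.

```python
# Sketch of the prediction step: classify an image with the trained
# YOLOv8 model and decide on a status from the top-1 confidence.
# Threshold and decision rule are illustrative assumptions.
from ultralytics import YOLO

REVIEW_THRESHOLD = 0.60  # hypothetical confidence cut-off

model = YOLO("runs/classify/train/weights/best.pt")  # trained checkpoint

def predict_status(image_path: str) -> str:
    result = model(image_path)[0]
    label = result.names[result.probs.top1]  # predicted class name
    conf = float(result.probs.top1conf)      # top-1 confidence
    if conf < REVIEW_THRESHOLD:
        return "manual_review"
    return "approved" if label == "water_good" else "rejected"

print(predict_status("example.jpg"))
```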

This AI solution has been integrated into the EyeOnWater production line, which MARIS runs on its own infrastructure. Once a day, predictions are made for all unprocessed images, and the results, as shown in the image below, are displayed in a grid where the admin can manually verify them. Each tile shows the image, its name, the date it was taken, the coordinates where it was captured, the predicted class, and the confidence score for that class. By clicking on an image, the admin can flag it to prevent its use in the Secchi disk analysis. The page also includes various filters, allowing the admin to search for specific classes or confidence scores.