Aim
To establish an operational image handling service on the iMagine platform that ingests, stores, and processes images of marine water samples taken by the ZooScan instrument and uploads the resulting regions of interest to the EcoTaxa platform for later taxonomic identification.
Development actions during iMagine
Upgrading the ZooProcess software for image analytics with the use of AI (instance segmentation)
Setting up an operational environment on the iMagine platform, with the ZooProcess – EcoTaxa workflow for processing ZooScan images and storing the resulting data
Connecting this iMagine-hosted environment to the EcoTaxa database, using its API, for ingesting data
Developing guidance and training material for uptake of the new ZooProcess – EcoTaxa workflow services
Reaching out to users for uptake and providing support and training
Objective and challenge
The objective of this project is to create an image handling service on the iMagine platform for processing zooplankton images captured with the ZooScan. The service will ingest, store, and process images of marine water samples, and upload the resulting regions of interest to the EcoTaxa platform for taxonomic identification. The workflow processes 356-megapixel grayscale images using classical image segmentation and measurement methods, enhanced by neural network algorithms, specifically instance segmentation. EcoTaxa uses a combination of deep and classical machine learning techniques to predict likely identifications for the uploaded images, which can then be validated through a dedicated user interface.
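To make the "classical image segmentation and measurement" step concrete, here is a minimal sketch of one common approach: thresholding followed by connected-component labelling. It is not the ZooProcess implementation. The function name `segment_regions`, the threshold value, and the assumption that organisms appear darker than the background are illustrative only.

```python
from collections import deque

def segment_regions(image, threshold, min_area=1):
    """Find connected dark regions in a grayscale image.

    image: list of rows of gray levels (0 = black, 255 = white).
    Assumption: pixels darker than `threshold` belong to an object.
    Returns one dict per region with its area, bounding box and mean gray level.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    regions = []
    for r in range(height):
        for c in range(width):
            if seen[r][c] or image[r][c] >= threshold:
                continue
            # Breadth-first flood fill over 4-connected foreground neighbours.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and not seen[ny][nx] and image[ny][nx] < threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(pixels) >= min_area:
                ys = [p[0] for p in pixels]
                xs = [p[1] for p in pixels]
                regions.append({
                    "area": len(pixels),
                    # (top, left, bottom, right), inclusive pixel indices
                    "bbox": (min(ys), min(xs), max(ys), max(xs)),
                    "mean_gray": sum(image[y][x] for y, x in pixels) / len(pixels),
                })
    return regions

# Tiny synthetic image: two dark objects on a white background.
SAMPLE = [
    [255, 255, 255, 255, 255, 255],
    [255,  40,  50, 255, 255, 255],
    [255,  45, 255, 255,  90, 255],
    [255, 255, 255, 255,  80, 255],
    [255, 255, 255, 255, 255, 255],
]

for region in segment_regions(SAMPLE, threshold=128):
    print(region)
```

In a scheme like this, two organisms that touch are merged into a single region, because their pixels are connected. That limitation is what instance segmentation is intended to address.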
Currently, a technician responsible for digitizing plankton samples spends several hours manually handling and processing the images. They use custom software to correct processing errors and to manually separate organisms that touch each other in the images, so that the data are accurate. Importing the images into EcoTaxa and sorting them taxonomically is also done manually, which makes the process tedious.
To publish and analyze the dataset effectively, metadata such as observed volume, imaging instrument, and imaging settings need to be documented using controlled BODC vocabularies and included in a DarwinCore Archive (DwCA) file. However, researchers such as plankton ecologists often have limited time to look up and incorporate the necessary metadata, so better automation is needed in data processing, management, and distribution.
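One small piece of the metadata work described above could be automated by checking, before publication, that every sample record carries the required metadata. The sketch below is illustrative only. The field names `observed_volume`, `imaging_instrument` and `imaging_settings` are hypothetical keys derived from the text; they are not actual Darwin Core terms or BODC vocabulary identifiers.

```python
# Hypothetical field names taken from the metadata listed in the text;
# a real pipeline would map these to controlled vocabulary terms.
REQUIRED_FIELDS = ("observed_volume", "imaging_instrument", "imaging_settings")

def missing_metadata(record):
    """Return the required field names that are absent or empty in `record`."""
    return [name for name in REQUIRED_FIELDS
            if record.get(name) in (None, "", {})]

# Example record with the imaging settings left empty.
SAMPLE_RECORD = {
    "observed_volume": 12.5,
    "imaging_instrument": "ZooScan",
    "imaging_settings": {},
}

print(missing_metadata(SAMPLE_RECORD))
```

A check of this kind could run at ingestion time, so that gaps are reported to the researcher before the dataset is packaged.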
Timeline and progress
Expected Results
Plankton is an integral and vital component of pelagic food webs and provides many ecosystem services, such as oxygen production and carbon storage. Plankton indicators are used within several descriptors of the EU Marine Strategy Framework Directive (MSFD) and Water Framework Directive (WFD). Datasets showing spatial and long-term trends in the concentration of phyto- and zooplankton are essential to understand the dynamics of food availability for commercially exploited species and the effects of climate change. Their description at global scale indicates the health of marine ecosystems and their response to anthropogenic stressors.
The provision of the ZooScan – EcoTaxa pipeline in iMagine will accelerate and standardise the processing of plankton samples, yielding more numerous and more interoperable zooplankton data with which to better describe and understand these systems.