Interactive Segmentation with Deep Metric Learning

Abstract:

Segmenting regions of interest in images, such as identifying roots in minirhizotron root images, is a critical step in several applications. However, manually annotating large image datasets to train a reliable segmentation model is time-consuming. To address this challenge, we propose a deep interactive segmentation framework to reduce the annotation burden. Interactive segmentation allows users to provide dynamic input, such as bounding boxes or scribbles, to support automated segmentation. Existing methods fall into two categories: classical techniques that rely on low-level features such as color and texture, and advanced methods that use deep networks but require either user interaction for each image or offline training. To overcome these challenges, our proposed framework utilizes transfer learning. A pre-trained deep network extracts high-level features that are interactively fine-tuned by annotators. This fine-tuned network can then be applied to new, unlabeled images to extract similar objects. To enable real-time interaction, we adapted the deep network by adding lightweight embedding layers to improve efficiency. In addition, we introduced prototype learning to capture data variations in the embedding space, making fine-tuning effective even on unseen datasets. The proposed framework was tested on various data types, including synthetic data, RGB minirhizotron root images, and hyperspectral images. The results consistently showed the effectiveness of our prototype learning (PL) model in identifying unseen categories and sub-categories, outperforming comparative models. We then developed an interactive system called interXRoot to support a real-world application of plant root annotation. The design of the interXRoot system was based on implications derived from an interview study with eight expert annotators. A user study was conducted to explore the interplay between user and model behavior. 
The results showed that the PL model, with its stable predictions, led to greater user engagement, whereas the regular UNet model introduced prediction variance and confusion. User preferences for annotation tools were influenced by object characteristics, dataset complexity, and model stability. In addition, providing probability maps of model predictions as explanations improved the annotation process, especially for the PL model. This analysis contributes to the foundation for future work on interpretable interactive annotation systems in eXplainable Artificial Intelligence (XAI).
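
The abstract does not spell out how the prototypes are learned, so the following is only a minimal sketch of one common form of prototype-based classification in an embedding space, not necessarily the formulation used in this work. It assumes each class prototype is the mean of the embeddings a user has labeled (for example, through scribbles) and that an unlabeled embedding takes the label of the nearest prototype. The function names, the 2-D embeddings, and the "root"/"background" labels are all hypothetical and chosen for illustration.

```python
def class_prototypes(embeddings, labels):
    """Average the embeddings of each labeled class into one prototype.

    Assumption: one mean-vector prototype per class; the thesis may learn
    prototypes differently (e.g., several per class, trained end to end).
    """
    sums, counts = {}, {}
    for vec, label in zip(embeddings, labels):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, value in enumerate(vec):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {
        label: tuple(total / counts[label] for total in acc)
        for label, acc in sums.items()
    }


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict(embedding, prototypes):
    """Return the label whose prototype is closest to the embedding."""
    return min(
        prototypes,
        key=lambda label: squared_distance(embedding, prototypes[label]),
    )


# Hypothetical 2-D embeddings of pixels a user scribbled on.
scribbled = [(0.9, 1.1), (1.1, 0.9), (-1.0, -0.8), (-0.8, -1.0)]
scribble_labels = ["root", "root", "background", "background"]

prototypes = class_prototypes(scribbled, scribble_labels)

# Embeddings of unlabeled pixels are assigned to the nearest prototype.
label_near_root = predict((0.7, 0.6), prototypes)
label_near_background = predict((-0.5, -0.9), prototypes)
```

In an interactive setting, the prototypes would be recomputed or fine-tuned as the annotator adds corrections, so each new scribble shifts the decision boundary in the embedding space.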

Citation:

X. Guo, “Interactive Segmentation With Deep Metric Learning.” University of Florida, 2023.

@book{guo2023interactive,
title={Interactive Segmentation with Deep Metric Learning},
author={Guo, Xiaolei},
year={2023},
publisher={University of Florida}
}