Domain Translation and Image Registration for Multi-Look Synthetic Aperture Sonar Scene Understanding


Multi-look scene understanding covers scenarios in which the same area has been surveyed in multiple passes and the information from those passes must be combined. For example, in remotely sensed SAS surveys, the same location on the seafloor is captured from multiple views whose Universal Transverse Mercator (UTM) coordinates may not fully overlap. Additionally, error in the coordinates from the inertial navigation system (INS) can be on the order of several meters, so alignment based on INS data alone can result in significant coregistration error. Properties of SAS imaging also pose barriers to adequate registration. Sound returns are inherently aspect dependent, so the same location surveyed at different orientations may not correlate in the beamformed imagery. This dissertation explores domain translation to improve correlation between multi-look SAS imagery and investigates its utility for multi-aspect alignment. Approaches for domain translation and image registration are reviewed, and new methods are proposed to adapt them and improve performance for SAS alignment.

Domain translation is the process of transforming a sample from one representation to another. In the context of this work, domain translation refers to estimating height directly from intensity imagery. Three types of models, with varying complexity, are applied to translate intensity imagery to height: a Gaussian Markov Random Field (GMRF) approach, a conditional Generative Adversarial Network (cGAN; pix2pix), and UNet architectures. The methods are compared on coregistered simulated datasets. Additionally, predictions on simulated and real SAS imagery are shown. Finally, the models are compared on two datasets of hand-aligned SAS imagery and evaluated across multiple aspects against using intensity directly. Our comprehensive experiments show that the proposed UNet architecture outperforms the GMRF and pix2pix models on height estimation for simulated and real SAS imagery.

Image registration is the process of transforming imagery to a shared coordinate system. In this dissertation, pairs of SAS images are aligned. Various models are tested to align circular and sidescan SAS imagery, and constraints based on metadata are applied to improve performance. The presented algorithms are shown to be effective in aligning circular SAS imagery with intensity imagery as inputs. For multi-aspect sidescan SAS, however, estimated height samples provide a significant boost in alignment performance.
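As a minimal illustration of what registering an image pair involves, the sketch below estimates a purely translational offset between two images with phase correlation. This is a generic textbook technique chosen for illustration, not one of the registration models evaluated in the dissertation. The function name `estimate_shift` and the synthetic test image are assumptions, and the sketch handles only integer, circular shifts.

```python
import numpy as np


def estimate_shift(reference, moving):
    """Estimate the integer (row, col) shift that maps `reference` onto `moving`.

    Hypothetical helper for illustration: uses phase correlation and assumes
    the two images differ only by a circular translation.
    """
    f_ref = np.fft.fft2(reference)
    f_mov = np.fft.fft2(moving)

    # Normalized cross-power spectrum: keeps only the phase difference.
    cross_power = f_mov * np.conj(f_ref)
    cross_power /= np.abs(cross_power) + 1e-12  # avoid division by zero

    # The inverse transform peaks at the translation between the images.
    correlation = np.fft.ifft2(cross_power).real
    peak = np.unravel_index(np.argmax(correlation), correlation.shape)

    # Peaks past the midpoint correspond to negative (wrapped) shifts.
    shift = [p if p <= s // 2 else p - s for p, s in zip(peak, correlation.shape)]
    return tuple(int(v) for v in shift)


# Synthetic example: shift a random image by a known offset and recover it.
rng = np.random.default_rng(0)
reference = rng.random((64, 64))
moving = np.roll(reference, shift=(5, -3), axis=(0, 1))
print(estimate_shift(reference, moving))
```

In this synthetic case the two images are exact shifted copies, so the correlation peak is sharp. For aspect-dependent imagery such as multi-look SAS, the intensity patterns may not correlate across views, which is the difficulty that motivates registering on estimated height instead.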



D. Stewart, "Domain Translation and Image Registration for Multi-Look Synthetic Aperture Sonar Scene Understanding," Ph.D. dissertation, Univ. of Florida, Gainesville, FL, 2022.
author = {Dylan Stewart},
title = {Domain Translation and Image Registration for Multi-Look Synthetic Aperture Sonar Scene Understanding},
school = {Univ. of Florida},
year = {2022},
address = {Gainesville, FL},