
About
Master's student in LIESMARS, Wuhan University
Projects
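- Detected the overlapping regions of image pairs using CNN multi-scale feature extraction and attention mechanisms, obtaining a scale prior to guide downstream wide-baseline image matching
- Reformulated the overlap estimation problem as an object detection problem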
Intelligent Interpretation of Remote Sensing Images
Global Natural Disaster Assessment System
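- Reconstructed 3D scenes from image sequences using state-of-the-art deep learning algorithms, and implemented coarse-to-fine hierarchical visual localization
- Responsible for building the overall framework and for improving and integrating the algorithms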
Writing
We introduce CoSAM, a novel matching framework designed to predict precise correspondences in uncontrolled environments. Unlike existing methods that rely on pixel-level correlation volumes spanning the entire image, we redefine the matching problem between image pairs as a co-visible region segmentation task within unified images. Our approach establishes binocular correspondences using a monocular vision foundation model. We derive hierarchical anchor points from learnable queries within a scale-aware transformer to guide pixel-wise co-visible region segmentation, enabling high-fidelity correspondences via the integration of diverse matching plugins. This prompt-based segmentation paradigm distinguishes our method from traditional semantic segmentation, which focuses on region-based class identification. It addresses the challenges posed by vast solution search spaces and complex one-to-many correspondence relationships, particularly under variations in viewpoint and scale. Experimental results in challenging scenarios demonstrate that CoSAM is both efficient and effective in feature matching and various downstream tasks.
Establishing reliable point correspondences between images with significant scale differences is a persistent challenge in photogrammetry and remote sensing. The scale differences often cause semantic ambiguities and positional shifts in the estimated correspondences. Limited by the direct point correspondence framework, existing methods struggle to address the scale issue at the local feature level. To overcome this, we detect co-visible regions between image pairs and propose SCoDe (Scale-aware Co-visible region Detector), which identifies and aligns co-visible regions for highly robust hierarchical point correspondence matching. Specifically, SCoDe employs a novel Scale Head Attention mechanism to map and correlate features across multiple scale subspaces, and leverages a learnable query to gather information from the extracted scale-aware features of both images for co-visible region detection. In this way, correspondences can be determined hierarchically from region level to point level, with semantic and location uncertainty eliminated. Extensive experiments conducted on three challenging datasets demonstrate the clear superiority of SCoDe over state-of-the-art methods, improving point matching precision by 9.04%. SCoDe also exhibits a notable advantage when dealing with images that have large scale differences.