The paper introduces VISA, a two-stream segmentation network that decouples radiance and vegetation index cues for improved weed segmentation in UAV multispectral imagery. VISA uses residual spectral-spatial attention on calibrated reflectance bands and windowed self-attention with state-space layers on vegetation indices to better discriminate sparse weeds under canopy mixing. The authors also present BAWSeg, a new four-year UAV multispectral dataset of barley paddocks with dense crop, weed, and other labels, which they use to demonstrate VISA's superior performance (75.6% mIoU and 63.5% weed IoU) and robustness compared to a SegFormer baseline, especially under cross-plot and cross-year evaluation.
Decoupling calibrated reflectance and vegetation indices into two streams outperforms a single-stream multispectral SegFormer-B1 baseline on barley weed segmentation, and the model stays robust when evaluated on unseen plots and years.
Accurate weed mapping in cereal fields requires pixel-level segmentation from UAV imagery that remains reliable across fields, seasons, and illumination. Existing multispectral pipelines often depend on thresholded vegetation indices, which are brittle under radiometric drift and mixed crop-weed pixels, or on single-stream CNN and Transformer backbones that ingest stacked bands and indices. In the latter, radiance cues and normalized index cues interfere, reducing sensitivity to small weed clusters embedded in crop canopies. We propose VISA (Vegetation-Index and Spectral Attention), a two-stream segmentation network that decouples these cues and fuses them at native resolution. The radiance stream learns from calibrated five-band reflectance using residual spectral-spatial attention to preserve fine textures and row boundaries that are attenuated by ratio indices. The index stream operates on vegetation-index maps with windowed self-attention to model local structure efficiently, state-space layers to propagate field-scale context without quadratic attention cost, and Slot Attention to form stable region descriptors that improve discrimination of sparse weeds under canopy mixing. To support supervised training and deployment-oriented evaluation, we introduce BAWSeg, a four-year UAV multispectral dataset collected over commercial barley paddocks in Western Australia, providing radiometrically calibrated blue, green, red, red edge, and near-infrared orthomosaics, derived vegetation indices, and dense crop, weed, and other labels with leakage-free block splits. On BAWSeg, VISA achieves 75.6% mIoU and 63.5% weed IoU with 22.8M parameters, outperforming a multispectral SegFormer-B1 baseline by 1.2 points of mIoU and 1.9 points of weed IoU. Under cross-plot and cross-year protocols, VISA maintains 71.2% and 69.2% mIoU, respectively. The BAWSeg data, VISA code, and trained models will be released upon publication.
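The reported figures are per-class intersection-over-union (IoU) for the weed class and its mean over classes (mIoU). As a minimal sketch of how such numbers are typically computed, the snippet below builds a confusion matrix from flattened label maps and derives both metrics. The integer class ids (0 = crop, 1 = weed, 2 = other), the function names, and the toy labels are assumptions made for illustration; the abstract does not describe the authors' evaluation code.

```python
from typing import List, Sequence

# Label set named in the abstract; the integer ids are an assumption.
CLASSES = ("crop", "weed", "other")


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels; rows index the reference class, columns the prediction."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def per_class_iou(cm: List[List[int]]) -> List[float]:
    """IoU_c = TP / (TP + FP + FN); NaN if class c is absent everywhere."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                        # reference c, predicted other
        fp = sum(cm[r][c] for r in range(n)) - tp   # predicted c, reference other
        denom = tp + fp + fn
        ious.append(tp / denom if denom else float("nan"))
    return ious


def mean_iou(ious: Sequence[float]) -> float:
    """Average IoU over classes, skipping NaN entries."""
    valid = [v for v in ious if v == v]  # NaN is the only value unequal to itself
    return sum(valid) / len(valid) if valid else float("nan")


# Toy flattened label maps (eight pixels), purely illustrative.
y_true = [0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 1, 1, 0, 2, 2]

cm = confusion_matrix(y_true, y_pred, len(CLASSES))
ious = per_class_iou(cm)
miou = mean_iou(ious)

for name, value in zip(CLASSES, ious):
    print(f"{name} IoU: {value:.3f}")
print(f"mIoU: {miou:.3f}")
```

In this toy case one crop pixel is predicted as weed and one weed pixel as crop, so weed IoU drops to 1/3 while mIoU stays at about 0.644. This is why reporting the weed-class IoU alongside mIoU is informative when weeds are sparse.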