The paper investigates how annotation errors propagate in video segmentation with Segment Anything Model 2 (SAM2) under different prompt types (masks, boxes, points), focusing on dysplasia in Barrett's esophagus. It finds that small errors accumulate during propagation, so expert review and correction remain necessary. To mitigate this, the authors introduce Learning-to-Re-Prompt (L2RP), a cost-aware framework that learns when and where to seek expert input; a tunable human-cost parameter balances annotation effort against segmentation accuracy.
Stop blindly propagating annotations in video segmentation: L2RP learns when and where to ask experts for help, trading annotation effort against segmentation accuracy through a tunable human-cost parameter.
Accurate annotation of endoscopic videos is essential yet time-consuming, particularly for challenging datasets such as dysplasia in Barrett's esophagus, where the affected regions are irregular and lack clear boundaries. Semi-automatic tools like Segment Anything Model 2 (SAM2) can ease this process by propagating annotations across frames, but small errors often accumulate and reduce accuracy, requiring expert review and correction. To address this, we systematically study how annotation errors propagate across different prompt types, namely masks, boxes, and points, and propose Learning-to-Re-Prompt (L2RP), a cost-aware framework that learns when and where to seek expert input. By tuning a human-cost parameter, our method balances annotation effort and segmentation accuracy. Experiments on a private Barrett's dysplasia dataset and the public SUN-SEG benchmark demonstrate improved temporal consistency and superior performance over baseline strategies.
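The abstract does not describe how L2RP decides when to query an expert, so the following is only a toy illustration of the effort-versus-accuracy trade-off controlled by a human-cost parameter. It is not the authors' learned policy: it swaps in a fixed threshold rule, and the names `plan_reprompts`, `drift_per_frame`, and `human_cost` are hypothetical. The sketch assumes each propagated frame adds some estimated error and that an expert re-prompt resets the accumulated error to zero.

```python
from typing import List, Tuple


def plan_reprompts(
    drift_per_frame: List[float], human_cost: float
) -> Tuple[List[int], float]:
    """Toy threshold rule (an assumption, not L2RP itself).

    Ask the expert on a frame once the accumulated estimated error
    exceeds `human_cost`; a query resets the accumulated error to zero.
    Returns the queried frame indices and the summed residual error.
    """
    accumulated = 0.0   # estimated error carried by propagation
    total_error = 0.0   # residual error summed over all frames
    queries: List[int] = []

    for frame_idx, drift in enumerate(drift_per_frame):
        accumulated += drift
        if accumulated > human_cost:
            # Expert correction on this frame: error is assumed to vanish.
            queries.append(frame_idx)
            accumulated = 0.0
        total_error += accumulated

    return queries, total_error


if __name__ == "__main__":
    drift = [0.125] * 10  # hypothetical constant per-frame drift
    for cost in (0.25, 10.0):
        frames, err = plan_reprompts(drift, cost)
        print(f"human_cost={cost}: queries at {frames}, residual error {err}")
```

In this sketch, raising `human_cost` makes expert queries rarer at the price of more residual error, which mirrors the trade-off the abstract describes; a learned policy would replace the fixed threshold with a decision conditioned on the video content.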