Search papers, labs, and topics across Lattice.
WildLIFT is introduced, a framework that lifts monocular drone video into 3D scene geometry and integrates it with 2D instance segmentation for species-agnostic wildlife monitoring. This enables the generation of oriented 3D bounding boxes with semantic face information, allowing for quantitative assessment of viewpoint coverage and inter-animal occlusion. Validated on a dataset of 6,700 3D detections across four mammal species, WildLIFT demonstrates high identity consistency and reduces manual annotation effort.
Unlock species-agnostic 3D tracking from standard drone footage with WildLIFT, turning 2D video into structured, viewpoint-aware representations for richer wildlife analysis.
Monocular RGB cameras mounted on drones are widely used for wildlife monitoring, yet most analytical pipelines remain confined to two-dimensional image space, leaving geometric information in video underexploited. We present WildLIFT, a computational framework that integrates three-dimensional scene geometry from monocular drone video with open-vocabulary 2D instance segmentation to enable species-agnostic 3D detection and tracking. Oriented 3D bounding box labels with semantic face information enable quantitative assessment of viewpoint coverage and inter-animal occlusion, producing structured metadata for downstream ecological analyses. We validate the framework on 2,581 manually curated frames comprising over 6,700 3D detections across four large mammal species. WildLIFT maintains high identity consistency in multi-animal scenes and substantially reduces manual 3D annotation effort through keyframe-based refinement. By transforming standard drone footage into structured 3D and viewpoint-aware representations, WildLIFT extends the analytical utility of aerial wildlife datasets for behavioural research and population monitoring.