Mar 4, 2026 · arXiv:2603.04091

CLIP-Guided Multi-Task Regression for Multi-View Plant Phenotyping

Simon Warmers, Muhammad Zawish, Fayaz Ali Dharejo, Steven Davy, Radu Timofte

AI Summary

This paper introduces a CLIP-guided multi-task regression framework that predicts plant age and leaf count from multi-view plant imagery. The method aggregates rotational views into angle-invariant representations and conditions visual features on text priors encoding viewpoint level, which keeps predictions stable even with incomplete or unordered inputs. On the GroMo25 benchmark, it reduces mean age MAE from 7.74 to 3.91 and mean leaf-count MAE from 5.52 to 3.08 relative to the GroMo baseline.
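To make the idea of an order-independent view aggregate concrete, here is a minimal sketch. It assumes plain mean pooling over per-view embedding vectors, which is a stand-in chosen for illustration and not necessarily the aggregation the authors use; the function name `mean_pool_views` and the toy two-dimensional embeddings are hypothetical.

```python
def mean_pool_views(view_embeddings):
    """Average per-view embeddings into a single vector.

    Assumption for illustration: mean pooling stands in for the paper's
    aggregation step. The mean does not depend on the order of the views
    and is defined for any non-empty subset of them.
    """
    if not view_embeddings:
        raise ValueError("at least one view embedding is required")
    dim = len(view_embeddings[0])
    pooled = [0.0] * dim
    for emb in view_embeddings:
        if len(emb) != dim:
            raise ValueError("all view embeddings must share one dimension")
        for i, value in enumerate(emb):
            pooled[i] += value
    count = len(view_embeddings)
    return [total / count for total in pooled]


# Toy embeddings for three rotational views (hypothetical values).
views = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

full = mean_pool_views(views)                      # all views, original order
shuffled = mean_pool_views(list(reversed(views)))  # same views, reordered
partial = mean_pool_views(views[:2])               # one view missing
```

Here `full` and `shuffled` are the same vector, and `partial` is still a valid vector of the same dimension, which is the property the summary describes as robustness to unordered or incomplete inputs.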

Key Contribution

Reduces plant age MAE by 49.5% and leaf-count MAE by 44.2% relative to the GroMo baseline by combining CLIP embeddings with multi-view imagery, even when views are missing or unordered.

Abstract

Modeling plant growth dynamics plays a central role in modern agricultural research. However, learning robust predictors from multi-view plant imagery remains challenging due to strong viewpoint redundancy and viewpoint-dependent appearance changes. We propose a level-aware vision-language framework that jointly predicts plant age and leaf count using a single multi-task model built on CLIP embeddings. Our method aggregates rotational views into angle-invariant representations and conditions visual features on lightweight text priors encoding viewpoint level for stable prediction under incomplete or unordered inputs. On the GroMo25 benchmark, our approach reduces mean age MAE from 7.74 to 3.91 and mean leaf-count MAE from 5.52 to 3.08 compared to the GroMo baseline, corresponding to improvements of 49.5% and 44.2%, respectively. The unified formulation simplifies the pipeline by replacing the conventional dual-model setup while improving robustness to missing views. The models and code are available at: https://github.com/SimonWarmers/CLIP-MVP
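The reported improvements follow directly from the MAE values in the abstract. The short check below computes the relative reduction as (baseline - new) / baseline; the helper name `relative_reduction` is introduced here for illustration only.

```python
def relative_reduction(baseline, improved):
    """Percentage by which `improved` is lower than `baseline`."""
    return 100.0 * (baseline - improved) / baseline


# MAE values as stated in the abstract (GroMo baseline vs. proposed model).
age_gain = relative_reduction(7.74, 3.91)
leaf_gain = relative_reduction(5.52, 3.08)

print(f"age MAE reduction:        {age_gain:.1f}%")   # 49.5%
print(f"leaf-count MAE reduction: {leaf_gain:.1f}%")  # 44.2%
```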

Computer Vision · Multimodal Models · Scientific Discovery & Drug Design

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
