This paper introduces a multi-dimensional evaluation framework for assessing EEG foundation models under realistic low-resource conditions, moving beyond full fine-tuning on curated datasets. The authors evaluate several EEG foundation models (LaBraM, CSBrain, and CBraMod) alongside supervised models across six datasets, varying labeled data availability, sensor coverage, and adaptation strategy. Results show that foundation models deliver consistent gains on long-context tasks, while on short-window tasks supervised models with far fewer parameters perform comparably, and foundation models show limited robustness in channel-constrained settings.
EEG foundation models may not be the automatic win you think they are: they shine on long-context tasks but falter in short-window and channel-constrained scenarios, where smaller supervised models can compete.
Evaluating foundation models under appropriate adaptation settings is essential for understanding the quality and transferability of the learned representations. Recent EEG foundation models have demonstrated promising transfer capabilities across tasks and datasets, motivating their growing use in neurotechnology and clinical applications. However, these models are typically evaluated under full fine-tuning on well-curated downstream datasets, a setting that does not reflect biomedical domain constraints such as limited labeled data, reduced sensor coverage, or parameter-efficient adaptation. In this work, we propose a multi-dimensional evaluation framework for assessing EEG models under realistic low-resource conditions. Using this framework, we empirically analyze both supervised EEG models and recent EEG foundation models, including LaBraM, CSBrain, and CBraMod, across six datasets. We find that EEG foundation models consistently provide performance gains on long-context tasks such as sleep stage prediction and mental health state classification. In contrast, for short-window brain-computer interface (BCI) style tasks, supervised models achieve comparable performance despite having substantially fewer parameters. Additional analyses demonstrate that current foundation models provide limited robustness on short-window tasks and in channel-constrained settings. Together, these findings motivate the use of multi-dimensional evaluation protocols that characterize model behavior under realistic use constraints.
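To make the idea of a multi-dimensional evaluation concrete, here is a minimal sketch of how the three axes named in the abstract (labeled data availability, sensor coverage, adaptation strategy) could be crossed into a grid of evaluation conditions. This is not the authors' protocol: the axis values, the strategy names, and the helpers `EvalCondition`, `build_grid`, and `subsample_indices` are all hypothetical and chosen only for illustration.

```python
import random
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class EvalCondition:
    """One cell of the evaluation grid (all fields are illustrative)."""
    label_fraction: float  # share of the labeled training set that is kept
    n_channels: int        # number of EEG channels retained
    adaptation: str        # how the model is adapted to the downstream task


# Hypothetical axis values; the abstract does not specify the actual settings.
LABEL_FRACTIONS = (0.01, 0.1, 1.0)
CHANNEL_COUNTS = (4, 16, 64)
ADAPTATIONS = ("linear_probe", "parameter_efficient", "full_fine_tune")


def build_grid(fractions=LABEL_FRACTIONS, channels=CHANNEL_COUNTS,
               adaptations=ADAPTATIONS):
    """Cross the three axes so every combination is evaluated once."""
    return [EvalCondition(f, c, a)
            for f, c, a in product(fractions, channels, adaptations)]


def subsample_indices(n_samples, fraction, seed=0):
    """Pick a reproducible subset of training indices for a low-label run."""
    rng = random.Random(seed)
    k = max(1, round(n_samples * fraction))
    return sorted(rng.sample(range(n_samples), k))


if __name__ == "__main__":
    grid = build_grid()
    print(len(grid), "conditions; first:", grid[0])
    print(len(subsample_indices(1000, 0.1)), "labeled samples at 10%")
```

Each model (foundation or supervised) would then be trained and scored once per condition, so that results can be compared along one axis while the other two are held fixed.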