The paper introduces Semantic Tube Prediction (STP), a JEPA-style regularizer that confines LLM hidden-state trajectories to a tubular neighborhood of a geodesic on a semantic manifold. It rests on the proposed Geodesic Hypothesis, which holds that token sequences trace geodesics on a smooth semantic manifold and are therefore locally linear. The authors argue that the constraint improves the signal-to-noise ratio and, as a consequence, preserves diversity by preventing trajectory collisions during inference. On the NL-RX-SYNTH dataset, LLMs trained with STP match baseline accuracy with 16x less training data, which challenges the data term of Chinchilla-style scaling laws.
LLMs can achieve the same accuracy with 16x less data by constraining their hidden-state trajectories to follow geodesics on a semantic manifold.
Large Language Models (LLMs) obey consistent scaling laws -- empirical power-law fits that predict how loss decreases with compute, data, and parameters. While predictive, these laws are descriptive rather than prescriptive: they characterize typical training, not optimal training. Surprisingly few works have successfully challenged the data-efficiency bounds implied by these laws -- which is our primary focus. To that end, we introduce the Geodesic Hypothesis, positing that token sequences trace geodesics on a smooth semantic manifold and are therefore locally linear. Building on this principle, we propose a novel Semantic Tube Prediction (STP) task, a JEPA-style regularizer that confines hidden-state trajectories to a tubular neighborhood of the geodesic. STP generalizes JEPA to language without requiring explicit multi-view augmentations. We show this constraint improves the signal-to-noise ratio, and consequently preserves diversity by preventing trajectory collisions during inference. Empirically, STP allows LLMs to match baseline accuracy with 16x less training data on the NL-RX-SYNTH dataset, directly violating the data term of Chinchilla-style scaling laws and demonstrating that principled geometric priors can surpass brute-force scaling. Code is available at https://github.com/galilai-group/llm-jepa#stp.
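To make the "tubular neighborhood" idea concrete, here is a minimal NumPy sketch of one way such a penalty could be computed. It is an illustration under stated assumptions, not the authors' STP loss: the function name `tube_penalty`, the `radius` parameter, and the use of a straight chord between the first and last hidden states as a stand-in for the geodesic (motivated by the local-linearity claim) are all hypothetical choices made here.

```python
import numpy as np

def tube_penalty(hidden, radius=0.0):
    """Mean squared distance by which interior hidden states leave a tube
    of the given radius around the chord from the first to the last state.

    Assumption: the chord is a straight-line stand-in for the geodesic.
    This is a hypothetical sketch, not the loss used in the paper.
    """
    hidden = np.asarray(hidden, dtype=float)  # shape (T, d)
    if len(hidden) < 3:
        return 0.0  # no interior states to constrain
    start, end = hidden[0], hidden[-1]
    chord = end - start
    length_sq = float(chord @ chord)
    if length_sq == 0.0:
        return 0.0  # degenerate chord: endpoints coincide
    interior = hidden[1:-1] - start  # shape (T-2, d)
    # Position of each interior state along the chord, clipped to the segment.
    coeff = np.clip(interior @ chord / length_sq, 0.0, 1.0)
    # Component of each interior state that points away from the chord.
    residual = interior - coeff[:, None] * chord
    dist = np.linalg.norm(residual, axis=1)
    # Only the distance in excess of the tube radius is penalized.
    excess = np.maximum(dist - radius, 0.0)
    return float(np.mean(excess ** 2))

straight = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]  # lies on the chord
bent = [[0.0, 0.0], [1.0, 0.0], [2.0, 2.0]]      # middle state is off the chord
print(tube_penalty(straight))           # 0.0
print(tube_penalty(bent))               # about 0.5
print(tube_penalty(bent, radius=1.0))   # 0.0, the deviation fits inside the tube
```

In a training loop, a term like this would presumably be added to the next-token loss with a small weight; how the actual STP objective selects states and measures deviation is defined in the linked repository, not here.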