This paper presents a mathematical formulation of large language models (LLMs) as high-dimensional nonlinear autoregressive models with attention mechanisms. It provides an equation-level description encompassing pretraining, alignment methods (RLHF, DPO, RSFT, RLVR), and autoregressive generation. The formulation facilitates analysis of alignment-induced behaviors, inference-time phenomena, and potential extensions like continual learning.
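The equation-level view summarized above rests on the standard autoregressive factorization of sequence probability. A sketch of the factorization and the next-token pretraining objective, with notation assumed for illustration rather than taken from the paper:

```latex
p_\theta(x_{1:T}) = \prod_{t=1}^{T} p_\theta(x_t \mid x_{<t}), \qquad
\mathcal{L}_{\text{pretrain}}(\theta) = -\,\mathbb{E}_{x \sim \mathcal{D}} \sum_{t=1}^{T} \log p_\theta(x_t \mid x_{<t})
```

Here $\theta$ denotes the model parameters and $\mathcal{D}$ the pretraining corpus; alignment methods such as RLHF and DPO then modify $\theta$ with objectives defined on top of this same conditional distribution.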
LLMs can be precisely understood as high-dimensional nonlinear autoregressive models, offering a new lens for analyzing behaviors like sycophancy and hallucination.
Large language models (LLMs) based on transformer architectures are typically described through collections of architectural components and training procedures, obscuring their underlying computational structure. This review article provides a concise mathematical reference for researchers seeking an explicit, equation-level description of LLM training, alignment, and generation. We formulate LLMs as high-dimensional nonlinear autoregressive models with attention-based dependencies. The framework encompasses pretraining via next-token prediction, alignment methods such as reinforcement learning from human feedback (RLHF), direct preference optimization (DPO), rejection sampling fine-tuning (RSFT), and reinforcement learning from verifiable rewards (RLVR), as well as autoregressive generation during inference. Self-attention emerges naturally as a repeated bilinear-softmax-linear composition, yielding highly expressive sequence models. This formulation enables principled analysis of alignment-induced behaviors (including sycophancy), inference-time phenomena (such as hallucination, in-context learning, chain-of-thought prompting, and retrieval-augmented generation), and extensions like continual learning, while serving as a concise reference for interpretation and further theoretical development.
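The bilinear-softmax-linear composition named in the abstract can be made concrete with a minimal single-head causal self-attention sketch. This is an illustrative implementation of the standard construction, not code from the paper; the function name and weight shapes are assumptions.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv, Wo):
    """Single-head causal self-attention as a bilinear-softmax-linear composition.

    X: (T, d) sequence of token embeddings; Wq, Wk, Wv, Wo: (d, d) weights
    (shapes are illustrative assumptions).
    """
    # Bilinear stage: pairwise scores s_ij = (x_i Wq)(x_j Wk)^T / sqrt(d_k)
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    # Causal mask enforces the autoregressive dependency structure:
    # position t may attend only to positions <= t
    mask = np.triu(np.ones_like(scores, dtype=bool), k=1)
    scores = np.where(mask, -np.inf, scores)
    # Softmax stage: each row becomes a distribution over visible positions
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Linear stage: convex combination of values, then output projection
    return (weights @ V) @ Wo
```

Stacking this block with feed-forward layers and repeating it across depth yields the "repeated" composition the abstract refers to; the causal mask is what makes the resulting sequence model autoregressive.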