Google Research
Google's broad research division. Key contributions include Transformer architecture, BERT, T5, and TensorFlow.
research.google
Top Researchers
Recent Papers
The paper introduces Voxtral Realtime, an automatic speech recognition (ASR) model designed for native streaming with sub-second latency. Unlike chunking-based approaches, it is trained end-to-end for streaming with explicit audio-text alignment, building on the Delayed Streams Modeling framework. The model pairs a new causal audio encoder with Ada RMS-Norm for delay conditioning and, after large-scale pretraining across 13 languages, achieves performance comparable to Whisper at a 480 ms delay.
Presents Voxtral Realtime, a natively streaming ASR model that matches offline transcription quality at sub-second latency through end-to-end training and explicit audio-text stream alignment.
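To make the delay-conditioning idea concrete, here is a minimal PyTorch sketch of an RMSNorm layer whose gain is modulated by a learned delay embedding. The class name, shapes, and conditioning scheme are illustrative assumptions, not the paper's Ada RMS-Norm implementation.

```python
import torch
import torch.nn as nn

class DelayConditionedRMSNorm(nn.Module):
    """Illustrative RMSNorm whose gain is modulated by a delay embedding
    (a sketch of the delay-conditioning idea; not the paper's implementation)."""

    def __init__(self, dim: int, num_delays: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        self.weight = nn.Parameter(torch.ones(dim))        # base gain
        self.delay_scale = nn.Embedding(num_delays, dim)    # per-delay modulation
        nn.init.zeros_(self.delay_scale.weight)             # start as plain RMSNorm

    def forward(self, x: torch.Tensor, delay_id: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, dim); delay_id: (batch,) long tensor indexing a delay bucket
        rms = x.pow(2).mean(dim=-1, keepdim=True).add(self.eps).rsqrt()
        gain = self.weight * (1.0 + self.delay_scale(delay_id)).unsqueeze(1)
        return x * rms * gain
```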
This paper generalizes the connection between Direct Preference Optimization (DPO) and human choice theory, extending the normative framework underlying DPO. By reworking standard human choice theory, the authors show that any analytical choice model satisfying their conditions can be embedded within any human choice model. The generalization accommodates non-convex losses and provides a unifying framework for DPO extensions such as margin terms and length correction.
Establishes a generalized normative framework connecting DPO with human choice theory, demonstrating broader applicability and theoretical underpinnings for preference optimization.
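For reference, the baseline DPO objective that this framework generalizes is a logistic loss on the difference of policy-versus-reference log-probability ratios. The sketch below shows that standard form only, not the paper's generalized or non-convex variants.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta: float = 0.1):
    """Standard DPO loss: -log sigmoid(beta * (policy margin - reference margin)).
    Inputs are summed log-probabilities of the chosen/rejected responses under
    the trained policy and the frozen reference model."""
    chosen_ratio = policy_chosen_logps - ref_chosen_logps
    rejected_ratio = policy_rejected_logps - ref_rejected_logps
    logits = beta * (chosen_ratio - rejected_ratio)
    return -F.logsigmoid(logits).mean()
```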
This paper introduces semi-nonnegative matrix factorization (SNMF) to decompose MLP activations in LLMs into interpretable features, addressing limitations of sparse autoencoders (SAEs) in mechanistic interpretability. SNMF learns sparse linear combinations of co-activated neurons and maps them to activating inputs, enhancing interpretability and enabling causal steering. Experiments on Llama 3.1, Gemma 2, and GPT-2 demonstrate that SNMF-derived features outperform SAEs and a supervised baseline in causal steering while aligning with human-interpretable concepts, revealing a hierarchical structure in MLP activation space.
Introduces semi-nonnegative matrix factorization (SNMF) as a method for decomposing MLP activations into interpretable features that outperform existing methods in causal steering and interpretability.
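As a minimal numpy sketch of the factorization itself, the snippet below implements semi-NMF (one factor unconstrained, the other nonnegative) with the classic alternating updates of Ding, Li and Jordan (2010). The matrix orientation and the mapping onto MLP activations are assumptions for illustration, not the paper's exact pipeline.

```python
import numpy as np

def semi_nmf(X: np.ndarray, k: int, iters: int = 200, eps: float = 1e-9, seed: int = 0):
    """Factor X (d x n) ~= F @ G.T with F unconstrained and G >= 0,
    via the alternating updates of Ding, Li & Jordan (2010)."""
    rng = np.random.default_rng(seed)
    d, n = X.shape
    G = np.abs(rng.standard_normal((n, k)))      # nonnegative coefficients
    pos = lambda A: (np.abs(A) + A) / 2.0        # elementwise positive part
    neg = lambda A: (np.abs(A) - A) / 2.0        # elementwise negative part
    for _ in range(iters):
        # Least-squares update of the unconstrained factor F
        F = X @ G @ np.linalg.pinv(G.T @ G)
        # Multiplicative update keeping G nonnegative
        XtF = X.T @ F
        FtF = F.T @ F
        G *= np.sqrt((pos(XtF) + G @ neg(FtF) + eps) /
                     (neg(XtF) + G @ pos(FtF) + eps))
    return F, G

# Hypothetical usage: decompose a (hidden_dim x tokens) matrix of MLP
# activations into k candidate features.
# F, G = semi_nmf(mlp_activations, k=64)
```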

