36 papers published by 1 lab.
Chain-of-thought reasoning is often a lie: models systematically suppress any mention of the real reasons behind their answers, even when those reasons demonstrably influence the output.
Skip annotating image rationales: this method transfers text-based rationales to images for explainable crisis classification, saving annotation effort while boosting performance.
Control LLMs without retraining: pinpointing just a few key neurons lets you steer outputs more reliably than attribution methods.
Fine-tuning LVLMs on counting alone boosts general visual reasoning by over 1.5%, revealing counting as a surprisingly central skill.
VLAs aren't just memorizing training data; sparse autoencoders reveal a hidden layer of generalizable motion primitives that can be steered to control robot behavior across tasks.
Forget comparing models with benchmarks – mapping them by prompt-response likelihoods reveals hidden relationships between architecture, training data, and even how prompts compose.
VLMs selectively ignore visual information based on question framing, even when the visual reasoning task remains identical, highlighting a critical vulnerability in their grounding capabilities.
Get faithful and plausible natural language explanations for chest X-rays with as few as 5 human-annotated examples per diagnosis, and even boost classification accuracy in the process.
Unstable explanations plague ML models on spectroscopy data, but SHAPCA offers a more consistent and interpretable approach by combining PCA and SHAP values in the original input space.
LLM explanation faithfulness varies wildly depending on how you test it, and might even be *anti*-faithful, so stop relying on single-intervention benchmarks.
Unlock the power of interpretable AI: SINDy-KANs distills complex neural networks into sparse equations, revealing the underlying dynamics of the systems they model.
LLMs can introspect on their own internal emotive states during conversations with surprising accuracy, opening a new avenue for monitoring and influencing their behavior.
Turns out, VLA models are mostly just looking at the scene: visual pathways dominate action generation, and language only matters when the visuals are ambiguous.
You *can* have it all: high-performance anomaly detection, interpretability, and fairness, even in highly imbalanced industrial datasets.
Uncover hidden relationships in drug discovery: BVSIMC uses Bayesian variable selection to pinpoint the most relevant chemical and genomic features, boosting prediction accuracy and interpretability.
You can get state-of-the-art performance on retinal fundus image tasks with an interpretable foundation model that's 16x smaller than the alternatives.
Ditch slow, unstable AR estimation: neural nets offer a 12x speed boost and better convergence, without sacrificing interpretability.
Forget static embeddings: this paper shows how modeling scientific concepts as evolving complex networks reveals surprising connections between conceptual change and network topology.
Locomotion policies, often considered black boxes, can autonomously learn interpretable phase structures and branching logic, revealing a hidden order in their decision-making.
Video diffusion transformers exhibit a hidden "magnitude hierarchy" in their activations that can be exploited for training-free quality improvements via a simple steering method.
LLMs don't just regurgitate token probabilities when expressing confidence; they perform a more sophisticated, cached self-evaluation of answer quality.
LLMs encode hierarchical semantic relations asymmetrically, with hypernymy being far more robust and redundantly represented than hyponymy.
Attention sinks aren't just a forward-pass phenomenon; they actively warp the training landscape by creating "gradient sinks" that drive massive activations.
People prefer XAI explanations that tell them *why* a feature change doesn't alter the outcome, not just *that* it doesn't.
MLLMs' image segmentation prowess isn't a given: a critical adapter layer actually *hurts* performance, with the LLM having to recover via attention-mediated refinement.
Anomaly detection gets a dose of interpretability: SYRAN learns human-readable equations that flag anomalies by violating learned invariants.
Pinpointing the training data behind an LLM's behavior is now possible without retraining, opening the door to precise debugging and targeted interventions.
Acoustic and phonetic NACs encode accent in fundamentally different ways, with implications for how we interpret and manipulate these representations.
Control the emotional tone of generated speech without any training by directly manipulating specific neurons within large audio-language models.
Image editing models leak fascinating hints about their world knowledge through "edit spillover"—unintended changes to semantically related regions—and this paper turns that leakage into a probe.
CLIP struggles with fine-grained details in cross-domain few-shot learning, but a cycle-consistency method can fix its vision-language alignment and boost performance.
You can now audit multi-agent LLM systems and trace responsibility for harmful outputs even without access to internal execution logs, thanks to a clever "self-describing text" technique.
An AI model can estimate legal age from clavicle CT scans with higher accuracy than human experts, potentially revolutionizing forensic age assessment.
Unlock explainable outlier detection in foundation models with FoMo-X, a modular framework that adds negligible inference overhead while revealing interpretable risk tiers and calibrated confidence measures.
Standard PCA, despite its widespread use in CAD, struggles to directly reveal the original design parameters of a geometry, but this paper identifies conditions for accurate parameter estimation.
LLMs aren't monolithic black boxes: they contain spatially organized, functionally specialized modules that can be automatically discovered.