LLM-based narrative evaluation reveals that the *way* people tell their stories is a stronger predictor of mental health than the specific words they use.
Dependency-controlled context and explicit evidence sufficiency criteria are key to preventing premature stopping and improving the consistency of enterprise research outputs.
Safety benchmarks for agent systems can be rapidly adapted to new execution environments by customizing a three-dimensional safety taxonomy, enabling continuous safety evaluation as agent capabilities evolve.
GenRec demonstrates that generative recommendation can beat existing pipelines in a large-scale industrial setting, achieving nearly 10% gains in key metrics by focusing on preference alignment and efficient sequence encoding.
Seedance 2.0 leapfrogs existing models by unifying multi-modal inputs (text, image, audio, video) into a single architecture for generating high-quality, longer-duration audio-video content.
LLMs can generate clinical summaries that not only improve the accuracy of multimodal depression detection but also provide transparent rationales for those predictions.
LLM datasets aren't independent islands: tracing their lineage reveals hidden redundancy, benchmark contamination, and opportunities for more diverse training data.
By distilling successful and failed reasoning paths into a "Cognitive Tree," T-STAR pinpoints and corrects critical errors in multi-turn reasoning, leading to significant performance gains.
Forget tedious hyperparameter sweeps; AutoSOTA automates the *entire* research pipeline, discovering 105 new SOTA models across diverse AI tasks in just five hours per paper.
Current LLM safety evaluations miss the mark: ATBench reveals how risks in realistic, multi-step agent interactions emerge over time, challenging even the strongest models.
Row/column normalization *before* orthogonalization can significantly boost convergence and reduce validation perplexity in LLaMA2 pretraining, outperforming the base Muon optimizer.
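The idea behind this result can be sketched in a few lines. Muon orthogonalizes each 2D weight update with a quintic Newton-Schulz iteration; the claim above is that normalizing the rows (or columns) of the update matrix *first* improves convergence. The snippet below is a minimal NumPy sketch under assumptions: the standard published Muon Newton-Schulz coefficients, row-wise L2 normalization as the preprocessing step, and a hypothetical `grad` matrix. The paper's exact normalization scheme may differ.

```python
import numpy as np

def row_normalize(G, eps=1e-8):
    # Assumed preprocessing: scale each row to unit L2 norm
    # before the orthogonalization step.
    norms = np.linalg.norm(G, axis=1, keepdims=True)
    return G / (norms + eps)

def newton_schulz_orthogonalize(G, steps=5):
    # Quintic Newton-Schulz iteration (Muon's published coefficients),
    # driving all singular values of G toward 1.
    a, b, c = 3.4445, -4.7750, 2.0315
    X = G / (np.linalg.norm(G) + 1e-8)  # scale so spectral norm <= 1
    transposed = X.shape[0] > X.shape[1]
    if transposed:                       # iterate on the "wide" orientation
        X = X.T
    for _ in range(steps):
        A = X @ X.T
        X = a * X + (b * A + c * A @ A) @ X
    return X.T if transposed else X

# Hypothetical update for one weight matrix: normalize rows, then orthogonalize.
rng = np.random.default_rng(0)
grad = rng.standard_normal((64, 128))
update = newton_schulz_orthogonalize(row_normalize(grad))
```

In this sketch the normalization only changes the matrix fed into the iteration; the Newton-Schulz step itself is unmodified Muon, which is what makes the reported gain a drop-in preprocessing tweak.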
LLMs can now write better quantitative trading algorithms than humans, thanks to a new framework that turns unstructured financial reports into executable code.
ARISE lets language models solve math problems better by learning and reusing successful solution strategies, outperforming existing RL methods, especially on harder, out-of-distribution problems.
LLMs can now scale depth more effectively: a new attention mechanism recovers diluted features in deeper layers, boosting performance with negligible overhead.
Self-wrapping cables aren't just a nuisance in robotic manipulation; they're a feature that can be exploited for redirected torque and more efficient object control.