22 papers published.
Forget comparing models with benchmarks – mapping them by prompt-response likelihoods reveals hidden relationships between architecture, training data, and even how prompts compose.
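A minimal sketch of the mapping idea, with made-up numbers (this is not the paper's code): represent each model by the vector of log-likelihoods it assigns to a shared set of prompt-response pairs, then embed those vectors in 2D to get a "model map".

```python
# Toy likelihood-based model map (illustrative only; not the paper's code).
# Each model = the vector of log-likelihoods it assigns to a fixed set of
# prompt-response pairs; models are compared in that space, not by benchmarks.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical log-likelihoods: rows = 6 models, columns = 40 prompt-response pairs.
loglik = rng.normal(loc=-50.0, scale=5.0, size=(6, 40))

# Center per pair so the map reflects relative preferences, not pair difficulty.
centered = loglik - loglik.mean(axis=0, keepdims=True)

# Project to 2D with an SVD (a hand-rolled PCA) to visualize the model map.
U, S, Vt = np.linalg.svd(centered, full_matrices=False)
coords = U[:, :2] * S[:2]

for i, (x, y) in enumerate(coords):
    print(f"model_{i}: ({x:+.2f}, {y:+.2f})")
```

Models trained on similar data or sharing an architecture land near each other in this space, which is what makes the hidden relationships visible.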
Open-source LLMs, when carefully prompted with representative examples, can rival or even surpass smaller commercial models like GPT-3.5-nano in resume screening tasks, offering a privacy-preserving alternative.
Multilingual embeddings just got a whole lot smaller and faster, with F2LLM-v2 models outperforming larger counterparts while supporting over 200 languages.
Democratizing social robotics research, M offers a low-cost, open-source platform that's easy to reproduce, modify, and deploy in real-world settings.
Navigating the maze of differentially private graph release methods just got easier: a new framework helps practitioners choose the right approach, avoid common pitfalls, and make sound evaluations.
Stealing just the right neurons from another LLM lets you patch safety holes or remove biases in your own, with almost no performance hit.
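A hedged sketch of what transplanting neurons between two same-architecture layers can look like; the paper's criterion for picking which neurons to copy is not shown, and all names here are placeholders.

```python
# Neuron transplantation sketch (illustrative; the selection method that
# identifies which neurons encode the target behaviour is assumed given).
import torch
import torch.nn as nn

donor = nn.Linear(16, 32)      # layer holding the behaviour we want
recipient = nn.Linear(16, 32)  # same-shape layer in the model being patched

# Hypothetical neuron indices identified as encoding the target behaviour.
neuron_ids = torch.tensor([3, 7, 19])

with torch.no_grad():
    # Each output neuron is one row of the weight matrix plus its bias entry,
    # so patching a handful of rows leaves the rest of the model untouched.
    recipient.weight[neuron_ids] = donor.weight[neuron_ids]
    recipient.bias[neuron_ids] = donor.bias[neuron_ids]
```

Because only a few rows change, the bulk of the recipient's weights (and hence its performance) is left intact, which matches the "almost no performance hit" claim.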
Training speculative decoding models just got an order of magnitude faster, unlocking real-world deployment with a new open-source framework and a suite of production-ready draft models.
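For readers new to the underlying technique, here is a minimal greedy speculative-decoding loop. This is the general textbook idea, not the framework's API: a cheap draft model proposes k tokens, the target verifies them, and decoding falls back at the first mismatch.

```python
# Greedy speculative decoding sketch (general technique, not the framework).
def speculative_step(draft_next, target_next, context, k=4):
    """draft_next/target_next: callables mapping a token list to the next token."""
    proposal, ctx = [], list(context)
    for _ in range(k):
        tok = draft_next(ctx)        # cheap draft model proposes ahead
        proposal.append(tok)
        ctx.append(tok)

    accepted, ctx = [], list(context)
    for tok in proposal:
        if target_next(ctx) == tok:  # in practice: one batched target forward pass
            accepted.append(tok)
            ctx.append(tok)
        else:
            break                    # reject the rest of the draft
    # Always emit at least one target token so decoding makes progress.
    accepted.append(target_next(ctx))
    return accepted

# Toy demo with a perfect draft: all k proposals accepted plus one bonus token.
target = lambda ctx: sum(ctx) % 7
print(speculative_step(target, target, context=[1, 2]))
```

The better the draft model, the more tokens survive each verification pass, which is why cheap, fast draft-model training matters for deployment.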
LLM watermarks can now survive fine-tuning, quantization, and distillation thanks to a new method that embeds them in a stable functional subspace.
Achieve controllable and scalable speech generation with MOSS-TTS, enabling zero-shot voice cloning and long-form synthesis.
Unlock the power of your favorite classifier for ordinal data: Classifier Pooling consistently beats standard methods, especially when data is scarce or categories are numerous.
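The teaser doesn't spell out the method, so here is a sketch of a classic related idea, the Frank and Hall cumulative-binary decomposition: reduce a K-category ordinal problem to K-1 binary "greater than threshold" classifiers and pool their outputs into per-class probabilities. This is illustrative only and may differ from the paper's Classifier Pooling.

```python
# Frank-Hall style ordinal decomposition (a classic related technique, not
# necessarily the paper's Classifier Pooling). Any base classifier plugs in.
import numpy as np
from sklearn.linear_model import LogisticRegression

def fit_ordinal(X, y, n_classes):
    """Fit one P(y > k) classifier per threshold k = 0..n_classes-2."""
    return [LogisticRegression().fit(X, (y > k).astype(int))
            for k in range(n_classes - 1)]

def predict_proba(models, X, n_classes):
    # Stack P(y > k) per threshold, pad with P(y > -1)=1 and P(y > K-1)=0,
    # then differences give P(y = k). (Monotonicity is not enforced here.)
    gt = np.column_stack([m.predict_proba(X)[:, 1] for m in models])
    gt = np.hstack([np.ones((len(X), 1)), gt, np.zeros((len(X), 1))])
    return gt[:, :-1] - gt[:, 1:]

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = np.clip((X @ np.array([1.0, -0.5, 0.3]) + 1.5).round(), 0, 3).astype(int)

models = fit_ordinal(X, y, n_classes=4)
print(predict_proba(models, X[:2], n_classes=4))
```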
YouTube's platform defenses are a house of cards: circumventing one control often triggers a cascade of failures, demanding constant architectural adaptation for large-scale content replication.
LLMs can get a massive multilingual boost, especially in low-resource languages, by offloading translation to specialized models and carefully aligning their representations.
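A hedged sketch of the pipeline pattern the teaser describes: route low-resource input through a dedicated MT model, let the LLM reason in a high-resource pivot language, and translate the answer back. The `translate` and `generate` helpers are placeholders, and the paper's representation-alignment step is not shown.

```python
# Translation-offloading pipeline sketch (helper names are placeholders).
def answer_low_resource(question_lr, translate, generate, src="sw", pivot="en"):
    """translate(text, src, tgt) wraps a specialised MT model;
    generate(prompt) wraps a general-purpose LLM."""
    question_en = translate(question_lr, src=src, tgt=pivot)  # offload to MT model
    answer_en = generate(question_en)                         # reason in pivot language
    return translate(answer_en, src=pivot, tgt=src)           # translate back

# Toy usage with stub models:
echo_mt = lambda text, src, tgt: f"[{src}->{tgt}] {text}"
stub_llm = lambda prompt: f"answer({prompt})"
print(answer_low_resource("Habari gani?", echo_mt, stub_llm))
```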
LLMs encode hierarchical semantic relations asymmetrically, with hypernymy being far more robust and redundantly represented than hyponymy.
Ruyi2.5 achieves comparable performance to Qwen3-VL on general multimodal benchmarks while significantly outperforming it in privacy-constrained surveillance, demonstrating the effectiveness of its edge-cloud architecture.
Current CRL benchmarks often fail to provide a holistic view of model performance, hindering progress, but a new aggregate metric could change that.
ManiDreams lets robots handle real-world uncertainty in manipulation tasks without retraining, outperforming standard RL baselines under various perturbations.
Tackle previously intractable open quantum systems simulations with TENSO, a new open-source package that efficiently handles complex environments via tree tensor networks.
LLMs can be drastically compressed without retraining because the relative ordering of weights matters far more than their exact values, opening the door to efficient, training-free compression techniques.
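A toy illustration of the ordering claim, not the paper's algorithm: quantize a weight matrix by rank, replacing each weight with the mean of its quantile bucket. The relative ordering of weights is preserved exactly while their precise values are discarded.

```python
# Rank-preserving quantization toy (illustrative, not the paper's method):
# each weight stores only its quantile bucket, i.e. log2(n_buckets) bits.
import numpy as np

def rank_quantize(w, n_buckets=16):
    flat = w.ravel()
    order = np.argsort(flat)                    # the ranks carry the information
    buckets = np.array_split(order, n_buckets)  # equal-size quantile buckets
    out = np.empty_like(flat)
    for idx in buckets:
        out[idx] = flat[idx].mean()             # one shared value per bucket
    return out.reshape(w.shape)

rng = np.random.default_rng(0)
W = rng.normal(size=(64, 64))
Wq = rank_quantize(W)

print("distinct values:", np.unique(Wq).size)   # 16 instead of 4096
order = np.argsort(W.ravel())
print("ordering preserved:", bool(np.all(np.diff(Wq.ravel()[order]) >= 0)))
```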
LLMs can mimic human lexical patterns, but larger models act like stereotypical humans, sacrificing diversity for typicality in word associations, a trade-off tunable by temperature.
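The knob mentioned here is ordinary sampling temperature; a minimal sketch of its effect on a toy next-word distribution follows (vocabulary and scores are made up). Low temperature concentrates mass on the stereotypical association, high temperature restores diversity.

```python
# Temperature reshaping a toy next-word distribution (made-up numbers).
import numpy as np

words = ["dog", "cat", "leash", "loyal", "quantum"]
logits = np.array([3.0, 2.5, 1.0, 0.5, -2.0])  # hypothetical association scores

def softmax_t(logits, T):
    z = logits / T
    z -= z.max()                                # numerical stability
    p = np.exp(z)
    return p / p.sum()

for T in (0.3, 1.0, 2.0):
    probs = softmax_t(logits, T)
    print(f"T={T}: " + ", ".join(f"{w}={p:.2f}" for w, p in zip(words, probs)))
```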
A 4B parameter model can nearly match the privilege escalation performance of a state-of-the-art closed LLM like Claude Opus, while being fully local and 100x cheaper to run.
Standardized, modular GenAI teaching units in GUIDE offer a practical path to integrating cutting-edge AI tools into digital design education.
Security patch detectors trained on standard vulnerability databases are practically useless in the real world, losing up to 90% of their F1-score when deployed on in-the-wild data.