Open-weight model releases, reproducibility, model licensing, and community-driven AI development.
Tabular foundation model performance hinges on the evaluation metric, revealing that no single pretraining objective is universally optimal across different risk profiles.
Bilingual language models can achieve performance comparable to monolingual models in both languages, challenging the assumption that bilingual input poses significant learning obstacles.
Forget expensive finetuning: DUME dynamically combines existing expert LLMs into a powerful MoE *without* additional training, unlocking multi-domain performance at minimal cost.
Unlock knowledge equity for underserved languages: L-ReLF offers a reproducible recipe for creating high-quality lexical datasets where they're needed most.
Despite its simple grammar, Esperanto translation still poses challenges for LLMs, with NLLB models only preferred in about half of human evaluations.
Proprietary language models outperform open-source alternatives by 3-6x on a new, large-scale corpus of Sinhala and Pali Buddhist texts.
LLMs can mimic legislative reasoning, but their performance hinges on the proposal's idiosyncrasy, revealing a susceptibility to plausible-sounding confabulation that could mislead policymakers.
Real-time, uncertainty-aware signed distance functions are now possible without sacrificing accuracy, thanks to a novel kernel regression and GP regression hybrid.
Unlock new insights into rapid software development and collaboration with a massive dataset of over 100,000 hackathon projects.
Open-source projects are quietly integrating ML models in ways that may violate terms of service and regulations, raising concerns about unchecked ML automation.
VLMs can appear to gain up to 58% F1 on clinical tasks simply by *mentioning* MRI data in the prompt, even when the data is uninformative, revealing a "scaffold effect" that inflates performance metrics.
Random weight initialization is a major source of instability in deep learning, especially for rare classes, but this work shows how to eliminate it entirely with structured orthogonal initialization.
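The paper's exact construction isn't reproduced here, but the core idea of orthogonal initialization can be sketched in a few lines: QR-decompose a Gaussian matrix and take the orthogonal factor, with a sign correction so the result is deterministic given the seed (a standard technique, not necessarily the paper's specific variant):

```python
import numpy as np

def orthogonal_init(rows, cols, seed=0):
    """Return a (rows, cols) matrix whose shorter dimension is orthonormal.

    QR decomposition of a Gaussian matrix yields an orthogonal factor;
    multiplying by the signs of R's diagonal makes the factorization
    unique, so the init depends only on the seed, not on LAPACK details.
    """
    rng = np.random.default_rng(seed)
    a = rng.standard_normal((max(rows, cols), min(rows, cols)))
    q, r = np.linalg.qr(a)
    q *= np.sign(np.diag(r))  # fix signs for a unique, reproducible factor
    return q[:rows, :cols] if rows >= cols else q.T[:rows, :cols]

W = orthogonal_init(4, 3)
# Columns are orthonormal: W.T @ W equals the 3x3 identity.
```

Because the factor is fully determined by the seed, two training runs start from literally identical weights, removing initialization as a source of run-to-run variance.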
Forget pruning or quantization: MPO decomposition lets you compress a transformer by 13x while retaining 97% accuracy.
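The trade-off behind tensor-network compression can be illustrated with its simplest special case, a truncated SVD: keep only the top singular triplets of a weight matrix and store two thin factors instead of the full matrix. An MPO decomposition generalizes this to a chain of small tensor cores, but the storage-vs-accuracy logic is the same in spirit (this is an illustrative sketch, not the paper's method):

```python
import numpy as np

def low_rank_compress(W, rank):
    """Truncated SVD: keep the top-`rank` singular triplets of W.

    Returns factors A (m x r) and B (r x n) with A @ B ~= W.
    Storage drops from m*n to r*(m+n) parameters.
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return U[:, :rank] * s[:rank], Vt[:rank]

rng = np.random.default_rng(0)
# A 64x64 matrix that is exactly rank 8, so rank-8 truncation is lossless.
W = rng.standard_normal((64, 8)) @ rng.standard_normal((8, 64))
A, B = low_rank_compress(W, 8)
# A @ B reconstructs W; storage falls from 64*64 to 2*64*8 parameters.
```

Real transformer weights are only approximately low-rank, so in practice the rank (or MPO bond dimension) is chosen to balance compression ratio against reconstruction error.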
LLMs can now reliably transform messy app store reviews into well-formatted user stories, but still fall short of creating truly independent and unique requirements for agile development.
Quantum-proofing your 5G core doesn't have to break the bank: a sidecar proxy can add post-quantum cryptography with a predictable 50ms latency hit.
A task-specific, lightweight transformer can outperform state-of-the-art reasoning LLMs and commercial tools in C code vulnerability detection, at a fraction of the inference cost.
Forget fine-tuning: merging language-specific weights into instruction-tuned LLMs unlocks surprisingly effective instruction following in low-resource languages.
Blockchain-based federated learning can be made practical by using multi-task peer prediction to overcome the computational bottleneck of contribution measurement.
Synergy's architecture lets agents evolve through experience by proactively recalling rewarded trajectories, hinting at a new way to build agents that learn and adapt in open, collaborative environments.
Securing LLM supply chains requires cryptographically binding training and release claims to artifacts, enabling verifiable enforcement of security policies across teams and stages.
Bitcoin can be more than just digital gold: BitSov proposes a composable architecture for a censorship-resistant internet, anchored to Bitcoin's blockchain, that could reshape how we build decentralized applications.
Ditch the command line: these open-source Shiny apps make introductory statistics concepts like hypothesis testing and regression intuitively accessible to students without any programming experience.
Open-source RISC-V microcontrollers are now easier to build, thanks to a streamlined design and fully open RTL-to-GDS flow.
LLMs exhibit polarity illusions without rational inference, suggesting that "good enough" processing and partial grammaticalization may suffice to explain these phenomena in both machines and humans.
Adapting LLMs to low-resource languages might be as simple as teaching them to "speak" bytes, sidestepping the tokenization bottleneck.
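The appeal of byte-level modeling is that the "vocabulary" is just the 256 possible byte values, so no tokenizer ever needs to be trained or extended for a new language. A minimal illustration (real byte-level models such as ByT5 add special tokens and offsets, omitted here):

```python
def bytes_to_ids(text: str) -> list[int]:
    """Encode text as UTF-8 byte IDs: a fixed 256-symbol vocabulary
    that covers every script with no tokenizer training."""
    return list(text.encode("utf-8"))

def ids_to_text(ids: list[int]) -> str:
    """Decode byte IDs back to text."""
    return bytes(ids).decode("utf-8")

ids = bytes_to_ids("saluton")  # works identically for any language or script
assert ids_to_text(ids) == "saluton"
```

The cost of this simplicity is longer sequences, since non-Latin scripts expand to several bytes per character, which is part of why tokenization-free modeling is a trade-off rather than a free win.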
Despite increased discussion around open science, replication studies in computing education research have only seen marginal growth, suggesting a disconnect between espoused values and actual research practices.
AI coding agents are less likely to break your code *except* when they're confidently "maintaining" it, where they're actually twice as risky as humans.
Forget comparing models with benchmarks – mapping them by prompt-response likelihoods reveals hidden relationships between architecture, training data, and even how prompts compose.
Open-source LLMs, when carefully prompted with representative examples, can rival or even surpass smaller commercial models like GPT-3.5-nano in resume screening tasks, offering a privacy-preserving alternative.
Multilingual embeddings just got a whole lot smaller and faster, with F2LLM-v2 models outperforming larger counterparts while supporting over 200 languages.
Democratizing social robotics research, M offers a low-cost, open-source platform that's easy to reproduce, modify, and deploy in real-world settings.
Navigating the maze of differentially private graph release methods just got easier: a new framework helps practitioners choose the right approach, avoid common pitfalls, and make sound evaluations.
Stealing just the right neurons from another LLM lets you patch safety holes or remove biases in your own, with almost no performance hit.
Training speculative decoding models just got an order of magnitude faster, unlocking real-world deployment with a new open-source framework and a suite of production-ready draft models.
LLM watermarks can now survive fine-tuning, quantization, and distillation thanks to a new method that embeds them in a stable functional subspace.
Achieve controllable and scalable speech generation with MOSS-TTS, enabling zero-shot voice cloning and long-form synthesis.
Unlock the power of your favorite classifier for ordinal data: Classifier Pooling consistently beats standard methods, especially when data is scarce or categories are numerous.
YouTube's platform defenses are a house of cards: circumventing one control often triggers a cascade of failures, demanding constant architectural adaptation for large-scale content replication.
LLMs can get a massive multilingual boost, especially in low-resource languages, by offloading translation to specialized models and carefully aligning their representations.
LLMs encode hierarchical semantic relations asymmetrically, with hypernymy being far more robust and redundantly represented than hyponymy.
Ruyi2.5 achieves comparable performance to Qwen3-VL on general multimodal benchmarks while significantly outperforming it in privacy-constrained surveillance, demonstrating the effectiveness of its edge-cloud architecture.
Current CRL benchmarks often fail to provide a holistic view of model performance, hindering progress, but a new aggregate metric could change that.
ManiDreams lets robots handle real-world uncertainty in manipulation tasks without retraining, outperforming standard RL baselines under various perturbations.
Tackle previously intractable open quantum systems simulations with TENSO, a new open-source package that efficiently handles complex environments via tree tensor networks.
LLMs can be drastically compressed without retraining because the relative ordering of weights matters far more than their exact values, opening the door to efficient, training-free compression techniques.
LLMs can mimic human lexical patterns, but larger models act like stereotypical humans, sacrificing diversity for typicality in word associations, a trade-off tunable by temperature.
A 4B parameter model can nearly match the privilege escalation performance of a state-of-the-art closed LLM like Claude Opus, while being fully local and 100x cheaper to run.
Standardized, modular GenAI teaching units in GUIDE offer a practical path to integrating cutting-edge AI tools into digital design education.
Security patch detectors trained on standard vulnerability databases are practically useless in the real world, losing up to 90% F1-score when deployed on in-the-wild data.
This Italian LLM punches way above its weight, matching the performance of models trained on 6-10x more data while using only 3B active parameters during inference.