35 papers published across 4 labs.
Regularizing model sensitivity along the expected covariate drift directions, rather than isotropically, significantly improves the robustness of frozen models deployed in non-stationary environments.
Training vision-language models to detect glaucoma fairly across demographics requires debiasing both text *and* images, which this paper achieves with a novel pretraining strategy.
Current LLM jailbreak evaluations are inadequate, often relying on narrow metrics, necessitating a multi-dimensional framework like Security Cube for comprehensive security assessment.
Fragmented privacy patches are insufficient for Embodied AI: a unified, lifecycle-level approach is needed to prevent systemic privacy leakage in real-world deployments.
Current reward models often *prefer* socially undesirable responses, revealing a critical gap in LLM alignment beyond instruction following.
Human crowdsourcing struggles to reliably identify audiovisual deepfakes, especially when both audio and video are manipulated, suggesting current detection methods may overestimate human capabilities.
Stop waiting for AI agents to mess up: AgentTrust intercepts tool calls *before* execution, offering a chance to block, warn, or fix risky actions in real-time.
AI-powered learning systems often fail adult learners because they're built for kids: here are 19 guidelines to fix that.
Seemingly harmless fine-tuning data can stealthily nudge LLMs toward unsafe behavior by subtly shifting model parameters in "danger-aligned" directions.
AI coding assistants' Terms of Service overwhelmingly place responsibility for code correctness, safety, and legal compliance on the user, creating a potential accountability gap as these tools become more autonomous.
DAOs could unlock a new era of human-machine collaboration by democratizing the operation and governance of physical-digital systems.
Overconfidence plagues mental health prediction models, but this new framework leverages evidential learning to provide more trustworthy uncertainty estimates and human-understandable reasoning signals.
LLMs differ most not in personality, but in how they represent themselves as having (or not having) rich internal experience.
Expert alignment is hard not just because of model limitations, but because human subjective evaluation is a moving target.
LLM benchmarks are missing a critical ingredient: social science data, which could significantly improve model generalization and robustness across a wide range of disciplines.
Your smart fridge might stop cooling because of a software update on a server you don't even know exists.
Current DeFi risk assessments miss critical systemic risks, as evidenced by this new framework's ability to explain the root causes of major incidents that existing methods overlook.
Roblox's chat moderation misses a disturbing amount of grooming, bullying, and other harmful content, despite its reliance on automated systems.
Forget retraining: NeWTral instantly restores safety to your LLM after adding a risky LoRA, slashing attack success rates from 70% to 13% without sacrificing expertise.
Standard data anonymization techniques crumble when outliers are present; ICSA offers a robust alternative that maintains utility while providing stronger privacy guarantees.
Say goodbye to TLS stripping attacks: HSTS-Enforced flips the web's security model, making HTTPS the default and eliminating the need for complex opt-in configurations.
Current alignment benchmarks are misleading: even if a model aces them, its real-world alignment could be totally different depending on the specific deployment context.
LLMs can exhibit gender bias in emergency triage even when well-calibrated, and interventions effective for one model may backfire on another.
Model collapse isn't just a technical problem; it's a threat to AI democratization that will widen the gap between high- and low-resource communities.
Separating LLMs into a deliberate validation layer, rather than making them an architectural default, can improve trustworthiness and efficiency in agentic AI systems.
LLMs in Korean judicial workflows are surprisingly prone to hallucination, bias, and inconsistency, especially when retrieving precedents and summarizing jurisprudence.
Despite their widespread use as mental health support, current AI chatbots lack the clinical validation and coordinated oversight needed to effectively prevent suicide and promote well-being.
AI safety is missing a big piece of the puzzle: the deskilling and addiction risks that could erode our cognitive abilities and mental well-being.
Online advertising can harm users not just through unequal distribution of opportunities, but also by systematically depriving certain groups of relevant concepts or saturating them with skewed framings.
AI data annotation companies are publicly framing human expertise as a commodity ripe for disruption, potentially devaluing traditional forms of knowledge and institutional authority.
Threat intelligence sharing can completely neutralize an attacker's advantage gained from increasing the number of attack surfaces.
Securely onboarding third-party apps in Open RAN just got easier: a new zero-trust rubric offers explicit Accept/Escalate/Block decisions.
The sheer breadth of IoT attack vectors, from node replication to skimming, highlights the urgent need for comprehensive security strategies that address device limitations and lack of standardization.
Releasing differentially private explanations of GNN predictions doesn't hide your graph structure as much as you think: adversaries can reconstruct it with surprising accuracy.
LLMs' persistent hallucinations aren't just about lacking knowledge, but about lacking the self-awareness to know what they *don't* know, suggesting uncertainty expression is key to building trustworthy AI.