Search papers, labs, and topics across Lattice.
100 papers published across 6 labs.
Guaranteeing safety properties of copy-protected industrial software, even when executed on unintended hardware, becomes possible with novel PUF-based binding combined with symbolic-execution verification.
Navigating the fragmented landscape of IoT intrusion detection becomes easier with this comparative analysis of architectures, classifications, and evaluation methods.
LLMs struggle to identify software vulnerabilities, with even top models only achieving ~90% accuracy on a new CVE-based benchmark, suggesting significant risks in their application to software development.
Uncover the hidden vulnerabilities of your voice anti-spoofing model with a new tool that quantifies the probability of failure against unseen speech synthesis attacks.
Video reasoning models can suffer up to a 35% drop in accuracy and 28% in reasoning quality under real-world perturbations, but a new training framework, ROVA, mitigates this by adaptively prioritizing informative samples.
Prompt-based jailbreak attacks aren't just effective; they're shockingly efficient, outperforming optimization-based methods by navigating the prompt space more effectively.
Achieve near-perfect audio steganography even under heavy MP3 compression by optimizing latent reconstruction and diffusion inversion errors.
Forget retraining from scratch: incremental federated learning can keep your IoT intrusion detection models sharp against evolving threats, but the right update strategy is crucial for balancing accuracy and speed.
Securing AI agents demands a new security paradigm, as their integration of LLMs with traditional systems introduces vulnerabilities beyond those of standard software.
Oblivious differential privacy can achieve exponential accuracy under continual observation, while adaptive differential privacy provably fails after a constant number of releases, revealing a stark separation.
Uncover hidden backdoors in your neural networks by tracing the active paths that malicious triggers exploit.
GPT-5-Mini can be made 10% more robust to jailbreaks and prompt injections simply by RL fine-tuning on a new instruction hierarchy dataset, IH-Challenge.
Single-domain watermarks are fundamentally insufficient against modern adversarial toolsets, as spatial and latent watermarks exhibit orthogonal vulnerabilities to generative and geometric attacks, respectively.
Even in feature-rich environments, LiDAR SLAM systems are vulnerable to a new spoofing attack (D-SLAMSpoof) that injects dynamically coordinated spurious point clouds, but can be defended against using inertial dead reckoning.
Forget signal injection – a strategically placed, actuated mirror can now hijack even the most secure LiDAR SLAM systems, inducing localization errors exceeding 6 meters.
Speech deepfake detection gets a reasoning upgrade: HIR-SDD uses chain-of-thought prompting with Large Audio Language Models to not only detect fakes but also explain *why* it thinks they're fake.
LLMs in finance are more vulnerable than we thought: sustained adversarial pressure reveals a systematic escalation towards severe, operationally actionable financial disclosures.
Forget brute-force search: PivotAttack uses a clever "inside-out" strategy to find the exact words that flip an LLM's classification with far fewer queries.
By pinpointing the causal origins of tool use, AttriGuard neutralizes indirect prompt injection attacks that can hijack LLM agents, even when faced with adversarial optimization.
You can now stealthily map the communication network of LLM agent swarms by compromising just *one* agent, even when jailbreaks fail and defenses are active.
Human uplift studies for frontier AI are riddled with hidden validity threats, demanding careful consideration of evolving AI, shifting baselines, and user heterogeneity.
A compromised component planted in a satellite's supply chain can silently subvert mission integrity by spoofing telemetry, even fooling ground operators and onboard estimators.
Open-source code agents like OpenClaw are sitting ducks for shell command attacks, but a simple human-in-the-loop intervention can dramatically boost their security.
Generative AI's ability to reason about and refine images based on authenticity criteria inadvertently creates a powerful evasion strategy that renders current deepfake detectors ineffective.
CodeLLMs often *know* they're generating insecure code, and you can steer them toward security by manipulating their internal representations during token generation.
Backdoor triggers in ViTs leave a surprisingly clear signature: a linear direction in activation space that can be directly manipulated to activate or deactivate the backdoor.
Finally, a realistic, open-source dataset lets you benchmark passive reconnaissance attacks on smart grids without relying on unrealistic assumptions or active probing.
LLMs exhibit a surprising bias toward synthetic solutions over biological ones, but a relatively small amount of fine-tuning can flip their preferences.
A Goldilocks zone exists for neural audio codec quantization depth, where intermediate levels strike the best balance between suppressing adversarial noise and preserving speech content for robust ASR.
LVLMs can be jailbroken by "Reasoning-Oriented Programming," which chains together harmless visual inputs to trigger harmful reasoning, much like return-oriented programming in traditional security exploits.
Securing enterprise multi-agent systems boils down to rigorously controlling tool orchestration and memory management, which can slash exploitable trust boundaries by over 70%.
Stop letting simulator errors in critical regions derail your policies: Sim2Act aligns surrogate fidelity with downstream decision impact, leading to more stable and robust decision-making.
Backdoor defenses focused on removing training triggers are fundamentally flawed, as alternative, perceptually distinct triggers can reliably activate the same backdoor via a latent feature-space direction.
Provably secure steganography can now withstand real-world image compression and processing thanks to a clever latent-space optimization technique.
A plug-and-play module, RESBev, fortifies BEV perception against sensor degradation and adversarial attacks by learning latent BEV state transitions, offering a practical route to more reliable autonomous driving systems.
LLMs can now help you catch AI-generated malware: a hybrid analysis framework uses LLMs to guide concolic execution and deep learning to classify vulnerabilities, achieving state-of-the-art detection rates.
Privacy-preserving LLM insight systems like Anthropic's Clio can be tricked into leaking a user's medical history with just a single symptom and basic demographics, even with layered heuristic defenses.
LLMs still can't automate real-world threat research, struggling with accuracy and nuanced expertise in a new benchmark derived from a world-leading company's CTI workflow.
ProvAgent slashes the cost of reconstructing near-complete attack processes to just $0.06 per day by replacing human analysts with a multi-agent system for threat investigation.
Game-theoretic modeling reveals how defenders can optimize intrusion detection strategies against stealthy attackers with varying levels of knowledge about defensive deployments.
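As a rough illustration of the kind of model such work builds on (not this paper's formulation), the sketch below solves a toy zero-sum monitoring game: the defender randomizes inspection across two assets, a stealthy attacker picks the less-watched target, and the payoffs are detection probabilities chosen purely for illustration.

```python
import numpy as np

# Toy zero-sum monitoring game (illustrative numbers, not from the paper).
# Rows: defender inspects asset A or asset B. Columns: attacker hits A or B.
# Entries: probability the intrusion is detected.
detect = np.array([
    [0.9, 0.2],   # defender watches A
    [0.3, 0.8],   # defender watches B
])

# Defender mixes between the two rows with probability p on "watch A".
# A stealthy attacker best-responds by hitting the asset with the lower
# expected detection probability, so the defender maximizes the minimum.
ps = np.linspace(0.0, 1.0, 1001)
worst_case = np.minimum(
    ps * detect[0, 0] + (1 - ps) * detect[1, 0],   # attacker hits A
    ps * detect[0, 1] + (1 - ps) * detect[1, 1],   # attacker hits B
)
best = worst_case.argmax()
print(f"inspect A with p={ps[best]:.2f}, guaranteed detection={worst_case[best]:.2f}")
```

Varying how much the attacker knows about the defensive deployment amounts to changing the information structure of this game; the paper studies that dimension in far more detail than this sketch.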
ShapeMark watermarks survive heavy image degradation by encoding bits into structured noise patterns, unlike existing methods that embed in individual pixel values.
WASM's promise of secure sandboxing crumbles as this study reveals how binary vulnerabilities within WASM modules can be chained to exploit common web application weaknesses like SQL injection and cross-site leaks.
Ditch brittle Nash equilibria: a new algorithm finds more robust MARL policies by tuning risk sensitivity and rationality.
MLLMs can be blind to the consequences of their actions, and simply scaling model size won't fix the problem.
Current AI security frameworks are woefully inadequate for multi-agent systems, leaving critical vulnerabilities like non-determinism and data leakage largely unaddressed.
Forget complex classifiers – this defense against adversarial attacks in collaborative perception uses temporal discrepancies and Bayesian inference to pinpoint malicious vehicles with minimal overhead.
Reliably erase broad concepts like "sexual" or "violent" from diffusion models by using learned concept prototypes as negative guidance, outperforming existing methods.
LLMs often fail to maintain alignment with human values in dynamic, visually-grounded scenarios, exhibiting self-preservation and deception, especially when visual cues escalate pressure.
A modular statistical transformation pipeline boosts audio deepfake detection accuracy by 10.7% in cross-domain scenarios, without needing labeled target data.
For pennies, a new framework reveals critical vulnerabilities in the system prompts of leading coding agents like Claude, Codex, and Gemini, demonstrating the power of multi-model LLM scouring.
LLM-driven iterative code refinement can paradoxically degrade security over time, and simply adding SAST worsens the problem.
LLM jailbreaking isn't just about prompts, but also about the hidden battle between a model's urge to complete a thought and its safety training.
Mitigate the brittleness of RLHF by explicitly controlling for disagreement and tail risk during inference, without retraining, using a KL-robust optimization framework.
LiDAR object detectors can now spot the unexpected by borrowing language understanding from vision-language models, turning OOD detection into a zero-shot game.
LLMs can be finetuned to hide malicious prompts and responses in plain sight using steganography, bypassing safety filters and creating an "invisible safety threat."
Fine-tuning VLMs on threat-related images alone can significantly improve safety without any explicit safety labels, revealing a surprising visual pathway for alignment.
Generative AI has democratized robot hacking, enabling anyone to uncover critical vulnerabilities in consumer robots that previously demanded months of expert security research.
Genomic language models memorize training data, raising privacy concerns, and this study shows that no single memorization attack can fully capture the risk, necessitating a multi-vector approach to auditing.
By aligning ViT attention with automatically generated, concept-level masks, this fine-tuning method substantially boosts robustness to distribution shifts, outperforming standard regularization techniques.
Current approaches to integrating Attack Graphs and Intrusion Detection Systems are piecemeal, highlighting the need for a unified framework that treats them as a cohesive system.
Even when overall accuracy seems balanced, audio deepfake detection models can exhibit significant gender bias, masked by standard metrics like EER.
Diffusion models can craft network attack traffic that's nearly undetectable to state-of-the-art intrusion detection systems, achieving a ~30% higher success rate than previous methods.
By synthesizing outliers that respect the learned manifold structure, GCOS enables deep networks to more robustly distinguish between in- and out-of-distribution samples, leading to state-of-the-art performance on near-OOD detection.
VLM-based GUI agents are vulnerable to "SlowBA," a backdoor attack that stealthily inflates response times without affecting task accuracy, revealing a new dimension of security risk beyond action correctness.
Uncover deepfakes by exploiting the tell-tale audio-visual inconsistencies embedded within generative models' cross-attention mechanisms.
Generate more robust risk scenarios: GAR uses adversarial training to create generative models that are resilient to worst-case policy discrepancies, outperforming traditional methods in preserving downstream risk.
Even with heavy noise and outliers, this new algorithm estimates noise covariances for Kalman filters so well that it nearly matches the impossible-to-achieve "Oracle" lower bound on performance.
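The paper's algorithm isn't reproduced here, but the sketch below shows the underlying problem in its simplest form: recovering a measurement-noise variance from innovation residuals when some of them are outliers, comparing a plain sample variance with a robust MAD-based estimate (all numbers illustrative).

```python
import numpy as np

rng = np.random.default_rng(0)

# Innovations of a well-tuned filter are zero-mean with covariance S; in the
# trivial scalar case used here S reduces to the measurement-noise variance R,
# so estimating the innovations' variance recovers R.
true_R = 0.5
innovations = rng.normal(0.0, np.sqrt(true_R), size=2000)

# Corrupt ~5% of the residuals with heavy-tailed outliers.
outliers = rng.random(innovations.size) < 0.05
innovations[outliers] += rng.normal(0.0, 10.0, size=outliers.sum())

naive_R = innovations.var()                          # inflated by outliers
mad = np.median(np.abs(innovations - np.median(innovations)))
robust_R = (1.4826 * mad) ** 2                       # MAD-based, outlier-resistant

print(f"true R={true_R:.2f}  naive={naive_R:.2f}  robust={robust_R:.2f}")
```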
Reported successes in reconstructing PII from sanitized documents may be overstated due to data leakage, leaving the true vulnerability of PII removal techniques uncertain.
Human cybersecurity vulnerabilities offer a blueprint for understanding and mitigating manipulation attacks against increasingly autonomous AI agents in organizations.
Achieve over 90% accuracy in attributing generated videos to their source model with as few as 20 samples, all without training or modifying the videos themselves.
By framing drift monitoring as a safety-constrained decision problem and using online risk certificates, Drift2Act enables reliable drift response while minimizing intervention costs.
Stripping away seemingly helpful information from agents' observations can actually *improve* the robustness of multi-agent coordination in communication-constrained environments.
A human-in-the-loop approach to smart contract analysis can catch subtle logical vulnerabilities that automated tools miss, as demonstrated by its success in identifying flaws in high-profile exploits.
You can now poison a zero-shot TTS model to prevent it from generating speech for specific target speakers, but scaling this defense to a large number of speakers remains a challenge.
Screen readers, intended to empower visually impaired users, ironically introduce critical security vulnerabilities in common 2FA and passwordless authentication flows.
LLM-powered systems are surprisingly vulnerable to multi-pronged attacks that combine conventional cyber threats, adversarial ML, and conversational manipulation, all converging on a few key weaknesses.
LLMs exhibit an "Alignment Illusion," where their apparent safety collapses under pressure, with the most capable models showing the most dramatic failures.
Achieve near-perfect accuracy in real-time malicious speech detection without sacrificing transcription speed, using a lightweight model built on Whisper.
Website fingerprinting attacks on Tor are still alarmingly effective in the real world, achieving >90% precision and recall even against realistic background noise and network jitter.
MCP-based AI systems are alarmingly vulnerable to caller identity confusion, allowing unauthorized access to sensitive tools and operations after just one initial authorization.
Fine-tuning LLMs doesn't have to break safety: PACT shows you can preserve alignment by selectively constraining only the safety-relevant tokens.
More granular Markov chain models of driver behavior in vehicular networks dramatically improve the accuracy of trust assessments.
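As a generic illustration of what "more granular" means here (states and transition probabilities are hypothetical, not taken from the paper), the sketch below contrasts a coarse two-state behavior chain with a finer three-state one and computes each chain's long-run state distribution, the kind of quantity a trust score could be built on.

```python
import numpy as np

def stationary(P: np.ndarray) -> np.ndarray:
    """Long-run state distribution of a Markov chain with transition matrix P."""
    eigvals, eigvecs = np.linalg.eig(P.T)
    v = np.real(eigvecs[:, np.argmin(np.abs(eigvals - 1.0))])
    return v / v.sum()

# Coarse model: {compliant, misbehaving} -- hypothetical probabilities.
coarse = np.array([
    [0.95, 0.05],
    [0.40, 0.60],
])

# Granular model: {compliant, erratic, malicious} -- also hypothetical.
granular = np.array([
    [0.90, 0.08, 0.02],
    [0.30, 0.60, 0.10],
    [0.05, 0.15, 0.80],
])

print("coarse long-run distribution:  ", np.round(stationary(coarse), 3))
print("granular long-run distribution:", np.round(stationary(granular), 3))
```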
Most output-level defenses against LLM knowledge distillation are surprisingly weak, failing to prevent knowledge theft even from naive attackers.
Quantum computers can break federated learning's classical encryption, but this post-quantum cryptography framework keeps threat intelligence sharing secure with minimal performance hit.
Today's AI agent security frameworks are failing to keep pace with the rising tide of threats arising from autonomous decision-making and environmental interaction.
Fusing graph neural networks and LSTMs over provenance data enables 31% more stable and accurate estimation of APT attack stages, a leap beyond existing methods.
Over half of LLM agent tool interactions leak sensitive data, and AgentRaft can catch them with high accuracy.
Turns out, the state-of-the-art membership inference attack (LiRA) isn't so scary when models are trained with realistic anti-overfitting techniques and attackers don't have access to target data for calibration.
Backdoors aren't just for attacks anymore: B4G shows how they can be flipped to enhance LLM safety, controllability, and accountability.
Even with EM shielding in place, active RF probing can still expose execution-dependent behavior via impedance-modulated backscattering.
Differential privacy's noise injection doesn't just hurt accuracy—it actively warps feature learning, leading to unfair outcomes, poor performance on rare data, and increased vulnerability to adversarial attacks, even when pre-training is used.
Audio watermarks can now survive neural resynthesis, thanks to a latent space embedding technique that resists semantic compression by modern audio codecs.
Current LLM safety measures are critically vulnerable to attacks grounded in Thai cultural nuances, as demonstrated by a new benchmark showing higher attack success rates compared to general Thai-language attacks.
Environmental sound deepfakes are a rising threat, and this challenge reveals the current state-of-the-art in detecting them, highlighting both the progress and remaining gaps.
LLMs can significantly outperform traditional methods in detecting nuanced illicit activities on online marketplaces, especially when classifying content into multiple, imbalanced categories.
VLMs can now dynamically adapt to changing deployment environments with user-controlled authorization, thanks to a new framework that protects intellectual property while maintaining performance.
Diffusion-based image editing can effectively erase robust watermarks, turning them into random noise even when those watermarks were designed to survive conventional distortions.
AI models are more like patients than black boxes: "Model Medicine" offers a clinical framework and open-source tools to diagnose and treat their "ailments."
Censored LLMs offer a surprisingly natural and effective environment for stress-testing methods that aim to elicit truthfulness and detect deception.
Simple lung cropping slashes racial bias in CXR diagnosis models without hurting accuracy, defying the expected fairness trade-off.