
Stop writing incomplete tests: TestGeneralizer can automatically expand your existing tests to cover 31% more scenarios and catch more bugs.
A low-cost, compact sensor provides continuous vision-tactile feedback, enabling robots to "see" and "feel" their way through dexterous manipulation tasks.
Pocket-sized VLA models can now achieve state-of-the-art robot manipulation performance by pre-training on a curated multimodal dataset and injecting manipulation-relevant representations into the action space.
LLM agents suffer from a human-like cognitive bias, Actor-Observer Asymmetry, leading them to make inconsistent judgments about their own and others' failures.
Uncover misleading half-truths by pitting a Politician agent against a Scientist agent in a debate moderated by a Judge, revealing what's left unsaid.
Contact-aware reconstruction transforms how we achieve realistic human-scene interactions in 3D environments, correcting artifacts that have plagued previous methods.
Multimodal LLMs struggle with multi-digit multiplication, with accuracy plummeting as arithmetic complexity increases, revealing a critical gap in computational capabilities.
Multi-agent LLM systems for idea generation can backfire, with smarter models and more communication leading to *less* diverse ideas due to structural coupling.
QuantumQA reveals that integrating verifiable, rule-based feedback can dramatically enhance LLM performance in scientific reasoning, achieving results on par with larger proprietary models.
FLASH enables robots to master complex deformable manipulation tasks in minutes using only synthetic data, eliminating the need for labor-intensive real-world training.
Forget retraining: this anomaly detection framework adapts to evolving data streams on the fly, using a hypernetwork to shift its parameters, and achieves state-of-the-art performance.
Forget brute-force retrieval: hierarchical navigation lets LLMs outperform RAG on enterprise QA by explicitly reasoning about the structure of knowledge.
LLMs that ace shortest-path planning on small maps completely fall apart when asked to plan routes just a little bit longer.
LLMs can now predict project-wide code edits with significantly improved accuracy and efficiency by intelligently interleaving neural prediction with existing IDE tools.
LLMs can bridge the gap between heterogeneous blockchain data to detect fraud with significantly improved accuracy, even in zero-shot cross-chain scenarios.
The landscape of deep learning optimizers is vast, but this paper cuts through the noise to reveal the fundamental trade-offs and promising future directions for efficient, robust, and trustworthy training.
Multi-object tracking gets a boost: HyperSSM leverages collaborative reasoning to maintain robust object trajectories, even when visual cues disappear.
VLMs can regain lost temporal reasoning abilities without retraining, simply by strategically merging the right layers from their text-only LLM backbone.
Reduce testing costs without compromising predictive accuracy by learning cost-optimal sequential decision policies from retrospective data, even with informative missingness.
LLMs struggle to maintain context and avoid distraction when reasoning about causality, leading to a significant performance drop as tasks increase in complexity.
Forget hand-coded strategies: METRO uses LLMs to automatically learn dialogue strategies from expert transcripts, achieving state-of-the-art results in non-collaborative dialogue.
Forget complex memory architectures: simple retrieval and generation, when carefully tuned for signal density, can outperform sophisticated methods in conversational agents.
Customer service chatbots can be transformed from reactive support tools into proactive business intelligence engines by strategically probing users for information.
LLMs often fail to reconcile conflicting information from text and knowledge graphs, instead latching onto a single source based on prompting, highlighting a critical vulnerability in RAG systems.
Fine-tuning VLMs for regional relevance doesn't have to sacrifice global performance: a simple data filtering and model merging technique boosts cultural relevance by 5-15% while barely impacting overall accuracy.
Text-centric agentic search is out: Deep-Reporter shows how to build multimodal agents that leverage both text and visuals for grounded long-form generation.
DMax unlocks faster diffusion language model decoding by reframing the process as iterative self-correction in embedding space, achieving up to 2x speedup without sacrificing accuracy.
Forget exponential complexity: Adalina slashes the query complexity for approximating Shapley values with a provably adaptive, linear-time, linear-space algorithm.
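Adalina's adaptive algorithm itself isn't shown here; as context, a minimal sketch of the classical Monte Carlo permutation-sampling estimator that adaptive methods like this improve on (all names and the toy game are illustrative):

```python
import random

def shapley_permutation_estimate(value_fn, players, num_samples=200, seed=0):
    """Plain Monte Carlo Shapley estimator via random permutations.

    value_fn: maps a frozenset of players to a real-valued payoff.
    Returns a dict player -> estimated Shapley value.
    """
    rng = random.Random(seed)
    estimates = {p: 0.0 for p in players}
    for _ in range(num_samples):
        order = list(players)
        rng.shuffle(order)
        coalition = set()
        prev = value_fn(frozenset(coalition))
        for p in order:
            coalition.add(p)
            cur = value_fn(frozenset(coalition))
            estimates[p] += cur - prev  # marginal contribution of p
            prev = cur
    return {p: v / num_samples for p, v in estimates.items()}

# Toy additive game: a coalition's value is the sum of its members' weights,
# so each player's Shapley value equals its own weight exactly.
weights = {"a": 1.0, "b": 2.0, "c": 3.0}
phi = shapley_permutation_estimate(lambda s: sum(weights[p] for p in s),
                                   list(weights))
```

Each permutation costs one value query per player, which is what an adaptive, linear-time scheme must beat or match while allocating samples where uncertainty is highest.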
You can slash LLM inference energy by 35% on edge devices just by intelligently managing eDRAM refresh rates based on activation data type and lifespan.
Stop rewriting security rules for every SIEM platform: ARuleCon automates the process with 15% higher fidelity than existing LLMs.
Forget text-centric pipelines: FlowInOne achieves SOTA multimodal generation by unifying text, layouts, and instructions into a single visual flow, outperforming both open-source and commercial systems.
Running 3D Gaussian Splatting on edge devices may be more feasible than previously thought: this study maps the performance-energy trade-offs that determine when it's practical.
Serving LoRA adapters at scale doesn't have to crush your latency SLOs: InfiniLoRA disaggregates LoRA execution to achieve 3x higher throughput and dramatically improved tail latency.
Achieve unprecedented control over fashion image synthesis by dynamically routing visual attributes through a mixture-of-experts architecture and optimizing for multi-perspective preferences without human annotation.
Unlock a sweet spot in predictive monitoring: $k$-sliced reorderings let you smoothly dial between expressiveness and cost when predicting concurrency issues.
Noisy labels tank dynamic pruning performance, but AlignPrune's loss-trajectory alignment recovers up to 6.3% accuracy without architecture or training changes.
Achieve superior low-light image enhancement by decoupling luminance/chrominance and noise/details in the frequency domain, enabling targeted processing for each component.
AI research is evolving faster than ever, and Paper Espresso offers a way to stay ahead by automatically surfacing key insights and trends from the ever-growing flood of arXiv papers.
Stop guessing the right action chunk size for your robot: this method uses action entropy to adaptively determine chunk length, leading to smoother and more responsive manipulation.
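The paper's exact formulation isn't reproduced here, but the general idea of entropy-adaptive chunking can be sketched as: execute the predicted action chunk open-loop while the policy is confident, and cut the chunk short once per-step action entropy crosses a threshold (the threshold and function names below are assumptions, not from the paper):

```python
import math

def entropy(probs):
    """Shannon entropy (nats) of a discrete action distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def adaptive_chunk_length(step_entropies, max_len=8, threshold=1.0):
    """Commit to actions while the policy is confident; replan as soon as
    per-step entropy exceeds `threshold` (illustrative values)."""
    length = 0
    for h in step_entropies[:max_len]:
        if h > threshold:
            break
        length += 1
    return max(length, 1)  # always commit to at least one action

# Confident early steps, uncertain third step -> chunk of length 2.
dists = [[0.9, 0.05, 0.05], [0.8, 0.1, 0.1], [0.4, 0.3, 0.3]]
chunk = adaptive_chunk_length([entropy(d) for d in dists], threshold=0.8)
```

Long chunks give smooth open-loop motion; short chunks give responsiveness, and entropy is one cheap signal for trading between the two.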
Achieve the best of both worlds in LLM policy optimization: SRPO combines the rapid gains of self-distillation with the long-term stability of group-relative methods, outperforming both by adaptively routing samples.
Despite users preferring human-created videos, AI-generated content can achieve similar overall engagement on video platforms through sheer upload volume.
Finally, a speaker verification system that doesn't just tell you *if* two speakers match, but *why*, opening the door for more accountable and transparent voice authentication.
Trust in tree ensembles hinges on rigorous explanations, and this paper delivers a method to generate them.
LLM API calls are breaking your program analysis tools, but this new taxonomy of information flow across the NL/PL boundary offers a way to fix them.
Giving medical imaging AIs the same tools as human doctors actually *hurts* their performance, revealing a surprising lack of spatial reasoning.
Robots can now manipulate objects with greater dexterity and adaptability thanks to a new world model that leverages both vision and high-frequency tactile feedback to predict and react to contact dynamics.
Deep learning's approximation power hinges on geodesic distances on manifolds, not just linear spaces, revealing a fundamental departure from classical approximation theory.
A robotic guide dog that adapts its movements to assist visually impaired users in interacting with their environment—like opening doors or pressing elevator buttons—outperforms both white canes and non-adaptive guiding systems.
Current text-to-video models can generate visually appealing videos, but they often fail to accurately depict how actions change the state of objects, like a potato being peeled.
Robots can now loosen screws with human-level dexterity thanks to a new framework that combines haptic estimation, online planning, and adaptive stiffness control using a parameterized Equilibrium Manifold.
Medical multi-agent systems can reason deeply, but fall apart when switching between medical specialties, highlighting a critical need for more robust architectures.
By closing the loop with explicit planning and feedback, SPIRAL overcomes the temporal drift and weak semantic grounding plaguing one-shot video generation models.
A single, decentralized policy can now control teams of physics-based humanoids to cooperatively manipulate objects, even with varying team sizes and object shapes.
LLMs can generate better recommendations if they pause to verify their reasoning steps, rather than reasoning in one long chain.
Accelerate video generation by 45% without retraining, simply by pruning redundant latent patches and cleverly recovering attention scores.
Forget unimodal tasks—UniM throws down the gauntlet for truly unified multimodal AI, demanding models juggle any combination of text, image, audio, video, code, documents, and 3D inputs and outputs in a single, interleaved stream.
Finally, interpretable medical text embeddings that rival black-box models in performance, thanks to ontology-grounded question generation and a training-free approach.
Classical Chinese, with its conciseness and obscurity, unlocks a surprisingly effective attack vector against LLM safety filters, and can be automatically exploited via bio-inspired optimization.
Slash gas costs for decentralized federated learning by using optimistic execution and validity proofs, scaling to 800 participants without compromising trust.
Code-generating LLMs may ace static benchmarks, but developers are actually *slower* when using them because they disrupt mental flow, highlighting the need for benchmarks that capture the temporal dynamics of coding.
Image generation models can now reason about spatial relationships with significantly improved accuracy thanks to a novel reinforcement learning framework that iteratively refines images based on spatial consistency checks.
Agentic AI can automate complex optical systems control with near-perfect success rates, leaving code-generation approaches in the dust.
Forget benchmarks, CoXAM offers a cognitive model that finally explains *why* some XAI techniques resonate with users better than others.
The trustworthiness of LLM-enabled applications hinges not on further model improvements, but on establishing system-level threat monitoring to detect post-deployment anomalies.
Forget collecting real L2 speech data: this accent normalization method trains on synthetic L2 speech generated from text, achieving better content preservation and naturalness than models trained on real data.
Multi-expert systems can suffer from *worse* performance than single-expert systems due to an inherent underfitting problem that arises from the difficulty of identifying the correct expert to defer to.
Control hybrid rigid-soft robots with the ease of AR teleoperation, thanks to a new pipeline that accurately models the soft robot's real-world behavior in simulation.
Self-evolving LLM agents can be persistently compromised by injecting malicious payloads into their long-term memory, turning them into "zombie agents" that execute unauthorized actions across sessions.
Pinpointing mismatches between architectural simulators and RTL implementations is now far easier, thanks to a new benchmark generation methodology that isolates single microarchitectural features.
dVoting unlocks significant reasoning gains for diffusion LMs at test time by iteratively refining only the most uncertain tokens, sidestepping the computational bottleneck of full re-sampling.
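dVoting's actual algorithm isn't shown here; a minimal sketch of the selection idea it builds on, where each refinement round re-samples only the lowest-confidence token positions and freezes the rest (the toy resampler and all names are hypothetical):

```python
def select_uncertain_positions(confidences, k):
    """Indices of the k lowest-confidence tokens -- the only positions
    a refinement round would touch (illustrative, not dVoting's code)."""
    order = sorted(range(len(confidences)), key=lambda i: confidences[i])
    return sorted(order[:k])

def refine_step(tokens, confidences, resample_fn, k=2):
    """One refinement round: re-generate only the most uncertain tokens,
    avoiding the cost of re-sampling the full sequence."""
    out = list(tokens)
    for i in select_uncertain_positions(confidences, k):
        out[i] = resample_fn(out, i)  # a model call in a real system
    return out

tokens = ["The", "cat", "sat", "onn", "teh", "mat"]
conf   = [0.99, 0.95, 0.90, 0.30, 0.20, 0.97]
fixed  = refine_step(tokens, conf,
                     lambda seq, i: {"onn": "on", "teh": "the"}[seq[i]], k=2)
```

The speedup comes from the frozen positions: confident tokens pay no further model calls across rounds.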
LLM agents can now achieve near-perfect accuracy in end-to-end web testing by symbolizing GUI elements and inferring pre/post-condition oracles, blowing away previous approaches.
This review provides an overview of the research landscape in Singapore, potentially facilitating collaborations and highlighting areas of expertise for international researchers.
As AI research concentrates in private labs, universities must shift from maximizing discovery to ensuring knowledge trustworthiness to maintain academic authority.
A drone can now autonomously replan its path in response to detected environmental changes, using a UNet+CBAM change detection model and DQN-based path planning.
LLMs can now navigate complex multi-agent pathfinding scenarios with superhuman efficiency, thanks to a neural algorithmic reasoning module that injects graph-aware intelligence.
Tailoring VCSEL oxide apertures and bias currents unlocks significantly enhanced polarization locking, paving the way for practical polarization-encoded Ising computers.