National University of Singapore

A low-cost, compact sensor provides continuous vision-tactile feedback, enabling robots to "see" and "feel" their way through dexterous manipulation tasks.

Xuanye Wu, Tianyu Qiu

Computer Vision Robotics & Embodied AI

3d ago·also NUS, Tsinghua AI, CAS +2

PokeVLA: Empowering Pocket-Sized Vision-Language-Action Model with Comprehensive World Knowledge Guidance

Pocket-sized VLA models can now achieve state-of-the-art robot manipulation performance by pre-training on a curated multimodal dataset and injecting manipulation-relevant representations into the action space.

Yupeng Zheng, Xiang Li, Songen Gu +11

Multimodal Models Robotics & Embodied AI

Apr 21, 2026

NUS·4d ago·also HIT, SCU, UMN

Taming Actor-Observer Asymmetry in Agents via Dialectical Alignment

LLM agents suffer from a human-like cognitive bias, Actor-Observer Asymmetry, leading them to make inconsistent judgments about their own and others' failures.

Rui Wu, Mong-Li Lee

Constitutional AI & AI Ethics Reasoning & Chain-of-Thought Tool Use & Agents

4d ago·also NUS

Debating the Unspoken: Role-Anchored Multi-Agent Reasoning for Half-Truth Detection

Uncover misleading half-truths by pitting a Politician agent against a Scientist agent in a debate moderated by a Judge, revealing what's left unsaid.

Yixuan Tang, Hang Feng, Anthony K. H. Tung

Natural Language Processing Reasoning & Chain-of-Thought Recommendation & Information Retrieval

NUS·4d ago·also University of Nottingham

UniCon3R: Contact-aware 3D Human-Scene Reconstruction from Monocular Video

Contact-aware reconstruction transforms how we achieve realistic human-scene interactions in 3D environments, correcting artifacts that have plagued previous methods.

Shashank Tripathi, Nikos Athanasiou, Kai Xu +1

Computer Vision Robotics & Embodied AI World Models & Planning

Apr 20, 2026

NUS·5d ago·also UT Austin

Multiplication in Multimodal LLMs: Computation with Text, Image, and Audio Inputs

Multimodal LLMs struggle with multi-digit multiplication, with accuracy plummeting as arithmetic complexity increases, revealing a critical gap in computational capabilities.

Samuel G. Balter, Ethan Jerzak, Connor T. Jerzak

Eval Frameworks & Benchmarks Multimodal Models Reasoning & Chain-of-Thought

NUS·5d ago·also CUHK

Diversity Collapse in Multi-Agent LLM Systems: Structural Coupling and Collective Failure in Open-Ended Idea Generation

Multi-agent LLM systems for idea generation can backfire, with smarter models and more communication leading to *less* diverse ideas due to structural coupling.

Yicheng Tong, Yuzhe Yang, Yufei He +4

Scalable Oversight & Alignment Theory Training Efficiency & Optimization

5d ago·also NUS, Institute of Artificial Intelligence

QuantumQA: Enhancing Scientific Reasoning via Physics-Consistent Dataset and Verification-Aware Reinforcement Learning

QuantumQA reveals that integrating verifiable, rule-based feedback can dramatically enhance LLM performance in scientific reasoning, achieving results on par with larger proprietary models.

Songxin Qu, Tai-Ping Sun, Yun-Jie Wang +7

Data Curation & Synthetic Data Reasoning & Chain-of-Thought Scientific Discovery & Drug Design

Apr 19, 2026

NUS·6d ago·also ShanghaiTech

FLASH: Fast Learning via GPU-Accelerated Simulation for High-Fidelity Deformable Manipulation in Minutes

FLASH enables robots to master complex deformable manipulation tasks in minutes using only synthetic data, eliminating the need for labor-intensive real-world training.

Siyuan Luo, Bingyang Zhou, Chong Zhang +8

Robotics & Embodied AI World Models & Planning

Apr 16, 2026

NUS·1w ago·also HIT, ZJU

Catching every Ripple: Enhanced Anomaly Awareness via Dynamic Concept Adaptation

Forget retraining: this anomaly detection framework adapts to evolving data streams on-the-fly using a hypernetwork to shift parameters, achieving state-of-the-art performance.

Shaofeng Cai, Beng Chin Ooi, Wenqiao Zhang

Natural Language Processing Training Efficiency & Optimization

NUS·1w ago

Don't Retrieve, Navigate: Distilling Enterprise Knowledge into Navigable Agent Skills for QA and RAG

Forget brute-force retrieval: hierarchical navigation lets LLMs outperform RAG on enterprise QA by explicitly reasoning about the structure of knowledge.

Yiqun Sun, Peng Wei, Lawrence B. Hsieh

Reasoning & Chain-of-Thought Recommendation & Information Retrieval Tool Use & Agents

NUS·1w ago

Generalization in LLM Problem Solving: The Case of the Shortest Path

LLMs that ace shortest-path planning on small maps completely fall apart when asked to plan routes just a little bit longer.

Jiayuan Ye

Eval Frameworks & Benchmarks Reasoning & Chain-of-Thought World Models & Planning

Apr 14, 2026

NUS·1w ago·also ByteDance, SJTU

Learning Project-wise Subsequent Code Edits via Interleaving Neural-based Induction and Tool-based Deduction

LLMs can now predict project-wide code edits with significantly improved accuracy and efficiency by intelligently interleaving neural prediction with existing IDE tools.

Zhiyong Huang, Jin Song Dong

Code Generation & Program Synthesis Tool Use & Agents

1w ago·also NUS, Beihang, CUHK

UniDetect: LLM-Driven Universal Fraud Detection across Heterogeneous Blockchains

LLMs can bridge the gap between heterogeneous blockchain data to detect fraud with significantly improved accuracy, even in zero-shot cross-chain scenarios.

Dan Lin, Xingtong Yu, Zhiming Zheng

Natural Language Processing

NUS·1w ago·also Fudan, HKU, Huawei, ZJU

Evolution of Optimization Methods: Algorithms, Scenarios, and Evaluations

The landscape of deep learning optimizers is vast, but this paper cuts through the noise to reveal the fundamental trade-offs and promising future directions for efficient, robust, and trustworthy training.

Zhucun Xue, Juntao Jiang, Yicheng Xu +7

Distributed Systems & Hardware Training Efficiency & Optimization

1w ago·also NUS, La Trobe University

Hypergraph-State Collaborative Reasoning for Multi-Object Tracking

Multi-object tracking gets a boost: HyperSSM leverages collaborative reasoning to maintain robust object trajectories, even when visual cues disappear.

Zikai Song, Yi-Ping Phoebe Chen, Xinchao Wang

Computer Vision Reasoning & Chain-of-Thought Robotics & Embodied AI

Apr 13, 2026

NUS·1w ago

Reasoning Resides in Layers: Restoring Temporal Reasoning in Video-Language Models with Layer-Selective Merging

VLMs can regain lost temporal reasoning abilities without retraining, simply by strategically merging the right layers from their text-only LLM backbone.

Zihang Fu, Haonan Wang, Jian Kang +2

Architecture Design (Transformers, SSMs, MoE) Multimodal Models Reasoning & Chain-of-Thought

NUS·1w ago·also Stanford HAI, Harvard

Cost-optimal Sequential Testing via Doubly Robust Q-learning

Reduce testing costs without compromising predictive accuracy by learning cost-optimal sequential decision policies from retrospective data, even with informative missingness.

Doudou Zhou, Dian Jin, Yingye Zheng +2

Recommendation & Information Retrieval Scientific Discovery & Drug Design

NUS·1w ago·also SCU

METER: Evaluating Multi-Level Contextual Causal Reasoning in Large Language Models

LLMs struggle to maintain context and avoid distraction when reasoning about causality, leading to a significant performance drop as tasks increase in complexity.

Pengfeng Li, Chen Huang, Chaoqun Hao +4

Eval Frameworks & Benchmarks Reasoning & Chain-of-Thought

NUS·1w ago·also Beihang, SCU

METRO: Towards Strategy Induction from Expert Dialogue Transcripts for Non-collaborative Dialogues

Forget hand-coded strategies: METRO uses LLMs to automatically learn dialogue strategies from expert transcripts, achieving state-of-the-art results in non-collaborative dialogue.

Haofu Yang, Jiaji Liu, Chen Huang +3

Natural Language Processing Tool Use & Agents World Models & Planning

NUS·1w ago·also HKUST, SJTU

Back to Basics: Let Conversational Agents Remember with Just Retrieval and Generation

Forget complex memory architectures: simple retrieval and generation, when carefully tuned for signal density, can outperform sophisticated methods in conversational agents.

Yuqian Wu, Zhengjun Huang, Junle Chen +4

Architecture Design (Transformers, SSMs, MoE) Natural Language Processing Recommendation & Information Retrieval

NUS·1w ago·also SCU

Towards Proactive Information Probing: Customer Service Chatbots Harvesting Value from Conversation

Customer service chatbots can be transformed from reactive support tools into proactive business intelligence engines by strategically probing users for information.

Chen Huang, Zitan Jiang, Changyi Zou +2

Natural Language Processing Recommendation & Information Retrieval Tool Use & Agents

1w ago·also NUS, Hunan, Manchester

Exploring Knowledge Conflicts for Faithful LLM Reasoning: Benchmark and Method

LLMs often fail to reconcile conflicting information from text and knowledge graphs, instead latching onto a single source based on prompting, highlighting a critical vulnerability in RAG systems.

Tianzhe Zhao, Jiaoyan Chen, Shuxiu Zhang +2

Eval Frameworks & Benchmarks Reasoning & Chain-of-Thought Recommendation & Information Retrieval

NUS·1w ago·also Brown, De La Salle University, MBZUAI, National University +1

Anthropogenic Regional Adaptation in Multimodal Vision-Language Model

Fine-tuning VLMs for regional relevance doesn't have to sacrifice global performance: a simple data filtering and model merging technique boosts cultural relevance by 5-15% while barely impacting overall accuracy.

Samuel Cahyawijaya, Peerat Limkonchotiwat, Tack Hwa Wong +51

Constitutional AI & AI Ethics Eval Frameworks & Benchmarks Multimodal Models

Apr 12, 2026

NUS·1w ago·also BIT, Edinburgh, NTU

Deep-Reporter: Deep Research for Grounded Multimodal Long-Form Generation

Text-centric agentic search is out: Deep-Reporter shows how to build multimodal agents that leverage both text and visuals for grounded long-form generation.

Fangda Ye, Zhifei Xie, Yuxin Hu +5

Multimodal Models Recommendation & Information Retrieval Tool Use & Agents

Apr 9, 2026

2w ago·also NUS

DMax: Aggressive Parallel Decoding for dLLMs

DMax unlocks faster diffusion language model decoding by reframing the process as iterative self-correction in embedding space, achieving up to 2x speedup without sacrificing accuracy.

Zigeng Chen, Gongfan Fang, Xinyin Ma +2

Architecture Design (Transformers, SSMs, MoE) Distributed Systems & Hardware Inference & Quantization

NUS·2w ago

Provably Adaptive Linear Approximation for the Shapley Value and Beyond

Forget exponential complexity: Adalina slashes the query complexity for approximating Shapley values with a provably adaptive, linear-time, linear-space algorithm.

Weida Li, Yaoliang Yu +3

Interpretability & Mechanistic Interp Reasoning & Chain-of-Thought Scalable Oversight & Alignment Theory

Apr 8, 2026

Tsinghua AI·2w ago·also NUS

SHIELD: A Segmented Hierarchical Memory Architecture for Energy-Efficient LLM Inference on Edge NPUs

You can slash LLM inference energy by 35% on edge devices just by intelligently managing eDRAM refresh rates based on activation data type and lifespan.

Jintao Zhang, Xuanyao Fong

Architecture Design (Transformers, SSMs, MoE) Distributed Systems & Hardware Inference & Quantization

NUS·2w ago·also BAIR, Cyber Special Ops-R&D, Fudan

ARuleCon: Agentic Security Rule Conversion

Stop rewriting security rules for every SIEM platform: ARuleCon automates the process with 15% higher fidelity than existing LLMs.

Ming Xu, Yanpei Guo, Zhengmin Yu +4

Code Generation & Program Synthesis Natural Language Processing Tool Use & Agents

NUS·2w ago·also BUPT, Tencent AI

FlowInOne: Unifying Multimodal Generation as Image-in, Image-out Flow Matching

Forget text-centric pipelines: FlowInOne achieves SOTA multimodal generation by unifying text, layouts, and instructions into a single visual flow, outperforming both open-source and commercial systems.

Junchao Yi, Rui Zhao, Jiahao Tang +7

Computer Vision Multimodal Models

NUS·2w ago

Splats under Pressure: Exploring Performance-Energy Trade-offs in Real-Time 3D Gaussian Splatting under Constrained GPU Budgets

Running 3D Gaussian Splatting on edge devices may be more feasible than previously thought, with this study revealing the performance-energy trade-offs needed to make it happen.

Bhojan Anand

Computer Vision Distributed Systems & Hardware Inference & Quantization

2w ago·also NUS, ByteDance, HKUST, SMU

InfiniLoRA: Disaggregated Multi-LoRA Serving for Large Language Models

Serving LoRA adapters at scale doesn't have to crush your latency SLOs: InfiniLoRA disaggregates LoRA execution to achieve 3x higher throughput and dramatically improved tail latency.

Hongyu Chen, Letian Ruan, Zilin Xu +5

Architecture Design (Transformers, SSMs, MoE) Distributed Systems & Hardware Inference & Quantization

2w ago·also NUS

VersaVogue: Visual Expert Orchestration and Preference Alignment for Unified Fashion Synthesis

Achieve unprecedented control over fashion image synthesis by dynamically routing visual attributes through a mixture-of-experts architecture and optimizing for multi-perspective preferences without human annotation.

Jian Yu, Si Shen, Xiaoyu Du

Computer Vision Multimodal Models

2w ago·also NUS

Parametrizing Reads-From Equivalence for Predictive Monitoring

Unlock a sweet spot in predictive monitoring: $k$-sliced reorderings let you smoothly dial between expressiveness and cost when predicting concurrency issues.

Azadeh Farzan

Code Generation & Program Synthesis Distributed Systems & Hardware

Institute for Infocomm Research·2w ago·also NUS, SCU, UDelaware +2

Beyond Loss Values: Robust Dynamic Pruning via Loss Trajectory Alignment

Noisy labels tank dynamic pruning performance, but AlignPrune's loss-trajectory alignment recovers up to 6.3% accuracy without architecture or training changes.

Huaiyuan Qin, Gabriel James Goenawan, Zheng Wang +3

Data Curation & Synthetic Data Inference & Quantization Training Efficiency & Optimization

Apr 7, 2026

2w ago·also NUS, DUT

RHVI-FDD: A Hierarchical Decoupling Framework for Low-Light Image Enhancement

Achieve superior low-light image enhancement by decoupling luminance/chrominance and noise/details in the frequency domain, enabling targeted processing for each component.

Junhao Yang, Bo Yang, Hongwei Ge +2

Computer Vision

Apr 6, 2026

2w ago·also NUS

Paper Espresso: From Paper Overload to Research Insight

AI research is evolving faster than ever, and Paper Espresso offers a way to stay ahead by automatically surfacing key insights and trends from the ever-growing flood of arXiv papers.

Luu Anh Tuan

Natural Language Processing Open-Source Models & Weights Recommendation & Information Retrieval +1

Apr 5, 2026

Shenzhen University of Advanced·2w ago·also NUS, Sangfor Technologies Inc., Shenzhen Technology University

Adaptive Action Chunking at Inference-time for Vision-Language-Action Models

Stop guessing the right action chunk size for your robot: this method uses action entropy to adaptively determine chunk length, leading to smoother and more responsive manipulation.

Xiaobo Wang, Xiaojiang Peng, Haoyu Chen

Multimodal Models Robotics & Embodied AI Tool Use & Agents

Apr 2, 2026

NUS·3w ago·also CAS, Tencent AI

Unifying Group-Relative and Self-Distillation Policy Optimization via Sample Routing

Achieve the best of both worlds in LLM policy optimization: SRPO combines the rapid gains of self-distillation with the long-term stability of group-relative methods, outperforming both by adaptively routing samples.

Mingyang Song, Mao Zheng

Inference & Quantization RLHF & Preference Learning Training Efficiency & Optimization

NUS·3w ago

Scale over Preference: The Impact of AI-Generated Content on Online Content Ecology

Despite users preferring human-created videos, AI-generated content can achieve similar overall engagement on video platforms by flooding the system with sheer volume.

Tianhao Shi, Xiaoyan Zhao, Fengbin Zhu +6

Natural Language Processing Recommendation & Information Retrieval

NUS·3w ago·also NJU

PhiNet: Speaker Verification With Phonetic Interpretability

Finally, a speaker verification system that doesn't just tell you *if* two speakers match, but *why*, opening the door for more accountable and transparent voice authentication.

Shuai Wang

Interpretability & Mechanistic Interp Speech & Audio

Mar 31, 2026

3w ago·also NUS, ICREA & Univ. Lleida, NTU, Toulouse

Rigorous Explanations for Tree Ensembles

Trust in tree ensembles hinges on rigorous explanations, and this paper delivers a method to generate them.

Alexey Ignatiev, Xuanxiang Huang, Peter J. Stuckey +1

Interpretability & Mechanistic Interp

Mar 30, 2026

3w ago·also NUS, UNSW

Crossing the NL/PL Divide: Information Flow Analysis Across the NL/PL Boundary in LLM-Integrated Code

LLM API calls are breaking your program analysis tools, but this new taxonomy of information flow across the NL/PL boundary offers a way to fix them.

Xiao Cheng, Yuekang Li

Code Generation & Program Synthesis Natural Language Processing Tool Use & Agents

Mar 25, 2026

CMU ML·Mar 25, 2026·also NUS, Imperial, Oxford, TU Munich

MedOpenClaw: Auditable Medical Imaging Agents Reasoning over Uncurated Full Studies

Giving medical imaging AIs the same tools as human doctors actually *hurts* their performance, revealing a surprising lack of spatial reasoning.

Eval Frameworks & Benchmarks Multimodal Models Tool Use & Agents

Mar 19, 2026

Mar 19, 2026·also NUS, CAS +3

OmniVTA: Visuo-Tactile World Modeling for Contact-Rich Robotic Manipulation

Robots can now manipulate objects with greater dexterity and adaptability thanks to a new world model that leverages both vision and high-frequency tactile feedback to predict and react to contact dynamics.

Yuhang Zheng, Songen Gu, Weize Li +11

Multimodal Models Robotics & Embodied AI World Models & Planning

Mar 16, 2026

NUS·Mar 16, 2026

Deep learning and the rate of approximation by flows

Deep learning's approximation power hinges on geodesic distances on manifolds, not just linear spaces, revealing a fundamental departure from classical approximation theory.

Architecture Design (Transformers, SSMs, MoE) Training Efficiency & Optimization

Mar 15, 2026

NUS·Mar 15, 2026·also Georgia Tech

Navigation beyond Wayfinding: Robots Collaborating with Visually Impaired Users for Environmental Interactions

A robotic guide dog that adapts its movements to assist visually impaired users in interacting with their environment—like opening doors or pressing elevator buttons—outperforms both white canes and non-adaptive guiding systems.

Computer Vision Robotics & Embodied AI Tool Use & Agents

Mar 12, 2026

NUS·Mar 12, 2026·also PKU, USC

OSCBench: Benchmarking Object State Change in Text-to-Video Generation

Current text-to-video models can generate visually appealing videos, but they often fail to accurately depict how actions change the state of objects, like a potato being peeled.

Xianjing Han, Shiqi Hu, Patrick Carrington +1

Computer Vision Eval Frameworks & Benchmarks Multimodal Models

Mar 11, 2026

NUS·Mar 11, 2026·also Imperial

Adaptive Manipulation Potential and Haptic Estimation for Tool-Mediated Interaction

Robots can now loosen screws with human-level dexterity thanks to a new framework that combines haptic estimation, online planning, and adaptive stiffness control using a parameterized Equilibrium Manifold.

Robotics & Embodied AI Tool Use & Agents World Models & Planning

Mar 10, 2026

NUS·Mar 10, 2026

MedMASLab: A Unified Orchestration Framework for Benchmarking Multimodal Medical Multi-Agent Systems

Medical multi-agent systems can reason deeply, but fall apart when switching between medical specialties, highlighting a critical need for more robust architectures.

Eval Frameworks & Benchmarks Multimodal Models Tool Use & Agents

Mar 9, 2026

NUS·Mar 9, 2026·also Horizon Robotics

SPIRAL: A Closed-Loop Framework for Self-Improving Action World Models via Reflective Planning Agents

By closing the loop with explicit planning and feedback, SPIRAL overcomes the temporal drift and weak semantic grounding plaguing one-shot video generation models.

Multimodal Models Tool Use & Agents World Models & Planning

NUS·Mar 9, 2026

TeamHOI: Learning a Unified Policy for Cooperative Human-Object Interactions with Any Team Size

A single, decentralized policy can now control teams of physics-based humanoids to cooperatively manipulate objects, even with varying team sizes and object shapes.

Stefan Lionar, Gim Hee Lee

Architecture Design (Transformers, SSMs, MoE) Robotics & Embodied AI Tool Use & Agents

Mar 8, 2026

NUS·Mar 8, 2026·also DAMO

Verifiable Reasoning for LLM-based Generative Recommendation

LLMs can generate better recommendations if they pause to verify their reasoning steps, rather than reasoning in one long chain.

Natural Language Processing Reasoning & Chain-of-Thought Recommendation & Information Retrieval

Mar 6, 2026

CMU ML·Mar 6, 2026·also NUS

Training-free Latent Inter-Frame Pruning with Attention Recovery

Accelerate video generation by 45% without retraining, simply by pruning redundant latent patches and cleverly recovering attention scores.

Dennis Menn, Yuedong Yang +14

Architecture Design (Transformers, SSMs, MoE) Computer Vision Inference & Quantization

Mar 5, 2026

Microsoft Research·Mar 5, 2026·also NUS

UniM: A Unified Any-to-Any Interleaved Multimodal Benchmark

Forget unimodal tasks—UniM throws down the gauntlet for truly unified multimodal AI, demanding models juggle any combination of text, image, audio, video, code, documents, and 3D inputs and outputs in a single, interleaved stream.

Yanling Li, Minghui Guo, Kaiwen Zhang +13

Eval Frameworks & Benchmarks Multimodal Models Natural Language Processing

Mar 2, 2026

NUS·Mar 2, 2026

QIME: Constructing Interpretable Medical Text Embeddings via Ontology-Grounded Questions

Finally, interpretable medical text embeddings that rival black-box models in performance, thanks to ontology-grounded question generation and a training-free approach.

Yixuan Tang, Zheng-Lin Lin, Yandong Sun +3

Interpretability & Mechanistic Interp Natural Language Processing Scientific Discovery & Drug Design

Feb 26, 2026

Feb 26, 2026·also DAMO, NUS, Tsinghua AI, Beihang +6

Obscure but Effective: Classical Chinese Jailbreak Prompt Optimization via Bio-Inspired Search

Classical Chinese, with its conciseness and obscurity, unlocks a surprisingly effective attack vector against LLM safety filters, and can be automatically exploited via bio-inspired optimization.

Xun Huang, Simeng Qin +10

Natural Language Processing Red-Teaming & Adversarial Robustness

Feb 26, 2026·also NUS, Purdue, SJTU

SettleFL: Trustless and Scalable Reward Settlement Protocol for Federated Learning on Permissionless Blockchains (Extended version)

Slash gas costs for decentralized federated learning by using optimistic execution and validity proofs, scaling to 800 participants without compromising trust.

Yang Hua, Linshan Jiang +8

Distributed Systems & Hardware Training Efficiency & Optimization

Feb 25, 2026

Feb 25, 2026·also NUS

EditFlow: Benchmarking and Optimizing Code Edit Recommendation Systems via Reconstruction of Developer Flows

Code-generating LLMs may ace static benchmarks, but developers are actually *slower* when using them because they disrupt mental flow, highlighting the need for benchmarks that capture the temporal dynamics of coding.

Chenyan Liu, Yun Lin, Jiaxin Chang +5

Code Generation & Program Synthesis Eval Frameworks & Benchmarks Recommendation & Information Retrieval

Feb 23, 2026

Feb 23, 2026·also NUS, Tsinghua AI, HUST

RL-RIG: A Generative Spatial Reasoner via Intrinsic Reflection

Image generation models can now reason about spatial relationships with significantly improved accuracy thanks to a novel reinforcement learning framework that iteratively refines images based on spatial consistency checks.

Tianyu Wang, Xinyi Zhang, Xinwei Long

Computer Vision Multimodal Models RLHF & Preference Learning

MIT CSAIL·Feb 23, 2026·also NUS, Purdue

Agentic AI for Scalable and Robust Optical Systems Control

Agentic AI can automate complex optical systems control with near-perfect success rates, leaving code-generation approaches in the dust.

Yue-Kai Huang, Philip Ji, Denton Wu +6

Eval Frameworks & Benchmarks Robotics & Embodied AI Tool Use & Agents

Department of Computer Science·Feb 23, 2026·also NUS

Rules or Weights? Comparing User Understanding of Explainable AI Techniques with the Cognitive XAI-Adaptive Model

Forget benchmarks, CoXAM offers a cognitive model that finally explains *why* some XAI techniques resonate with users better than others.

Louth Bin Rawshan, Zhuoyu Wang

Interpretability & Mechanistic Interp Reasoning & Chain-of-Thought

NUS·Feb 23, 2026

LLM-enabled Applications Require System-Level Threat Monitoring

The trustworthiness of LLM-enabled applications hinges not on further model improvements, but on establishing system-level threat monitoring to detect post-deployment anomalies.

Yedi Zhang, Haoyu Wang +4

Constitutional AI & AI Ethics Eval Frameworks & Benchmarks Red-Teaming & Adversarial Robustness

Feb 22, 2026

Feb 22, 2026·also NUS, Tencent AI

CosyAccent: Duration-Controllable Accent Normalization Using Source-Synthesis Training Data

Forget collecting real L2 speech data: this accent normalization method trains on synthetic L2 speech generated from text, achieving better content preservation and naturalness than models trained on real data.

Qibing Bai, Shuhao Shi, Yukai Ju +1

Data Curation & Synthetic Data Natural Language Processing Speech & Audio

Feb 19, 2026

Feb 19, 2026·also NUS, Oxford

When More Experts Hurt: Underfitting in Multi-Expert Learning to Defer

Multi-expert systems can suffer from *worse* performance than single-expert systems due to an inherent underfitting problem that arises from the difficulty of identifying the correct expert to defer to.

Shuqi Liu, Yuzhou Cao, Bo An +1

Architecture Design (Transformers, SSMs, MoE) Tool Use & Agents

MIT CSAIL·Feb 19, 2026·also NUS, Scuola Superiore Sant'Anna, TU Munich

Physical Human-Robot Interaction for Grasping in Augmented Reality via Rigid-Soft Robot Synergy

Control hybrid rigid-soft robots with the ease of AR teleoperation, thanks to a new pipeline that accurately models the soft robot's real-world behavior in simulation.

Huishi Huang, Jack Klusmann, Shuchen Ji +4

Computer Vision Robotics & Embodied AI Tool Use & Agents

Feb 17, 2026

NUS·Feb 17, 2026

Zombie Agents: Persistent Control of Self-Evolving LLM Agents via Self-Reinforcing Injections

Self-evolving LLM agents can be persistently compromised by injecting malicious payloads into their long-term memory, turning them into "zombie agents" that execute unauthorized actions across sessions.

Xianglin Yang, Yufei He, Shuo Ji +2

Constitutional AI & AI Ethics Red-Teaming & Adversarial Robustness Tool Use & Agents

Feb 12, 2026

Feb 12, 2026·also NUS, University of Georgia

Benchmarking for Single Feature Attribution with Microarchitecture Cliffs

Pinpointing mismatches between architectural simulators and RTL implementations is now far easier, thanks to a new benchmark generation methodology that isolates single microarchitectural features.

Hao Zhen, Qingxuan Kang, Yungang Bao +1

Architecture Design (Transformers, SSMs, MoE) Eval Frameworks & Benchmarks

NUS·Feb 12, 2026

dVoting: Fast Voting for dLLMs

dVoting unlocks significant reasoning gains for diffusion LMs at test time by iteratively refining only the most uncertain tokens, sidestepping the computational bottleneck of full re-sampling.

Sicheng Feng, Zigeng Chen +3

Inference & Quantization Reasoning & Chain-of-Thought

NUS·Feb 12, 2026·also SJTU

WebTestPilot: Agentic End-to-End Web Testing against Natural Language Specification by Inferring Oracles with Symbolized GUI Elements

LLM agents can now achieve near-perfect accuracy in end-to-end web testing by symbolizing GUI elements and inferring pre/post-condition oracles, blowing away previous approaches.

Xiwen Teoh, Yun Lin, Duc-Minh Nguyen +1

Eval Frameworks & Benchmarks Multimodal Models Tool Use & Agents

Jan 9, 2026

Department of Orthopedic Surgery·Jan 9, 2026·also NUS, Singapore General Hospital

Orthopaedic Research in Singapore: The Past, Present, and Future.

This review provides an overview of the research landscape in Singapore, potentially facilitating collaborations and highlighting areas of expertise for international researchers.

Bryan Yijia Tan, Julia Poh Hwee Ng, L. Liow +4

NUS·Jan 9, 2026

Research Integrity and Academic Authority in the Age of Artificial Intelligence: From Discovery to Curation?

As AI research concentrates in private labs, universities must shift from maximizing discovery to ensuring knowledge trustworthiness to maintain academic authority.

Simon Chesterman, Loy Hui Chieh

Nov 3, 2025

NUS·Nov 3, 2025

Visual Change Detection and Policy Learning for Adaptive Autonomous Navigation

A drone can now autonomously replan its path in response to detected environmental changes, using a UNet+CBAM change detection model and DQN-based path planning.

Zhunyi Feng, Akshay Narayan

Computer Vision Robotics & Embodied AI Tool Use & Agents

Jun 30, 2025

Jun 30, 2025·also NUS

Neural Algorithmic Reasoners informed Large Language Model for Multi-Agent Path Finding

LLMs can now navigate complex multi-agent pathfinding scenarios with superhuman efficiency, thanks to a neural algorithmic reasoning module that injects graph-aware intelligence.

Pu Feng, Size Wang, Yuhong Cao +3

Reasoning & Chain-of-Thought Tool Use & Agents World Models & Planning

Apr 1, 2025

NUS·Apr 1, 2025

Enhanced polarization locking in VCSELs

Tailoring VCSEL oxide apertures and bias currents unlocks significantly enhanced polarization locking, paving the way for practical polarization-encoded Ising computers.

Architecture Design (Transformers, SSMs, MoE) Scientific Discovery & Drug Design