Forget slow end-to-end models: building real-time voice agents hinges on a cascaded streaming pipeline, as a new tutorial demonstrates with sub-second latency.
Forget text prompts: vector prompt interfaces are the key to unlocking scalable and stable LLM customization.
Real-time voice agents can bypass slow vector-DB lookups with a dual-agent architecture in which a background agent pre-fetches relevant documents into a sub-millisecond semantic cache.
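The source doesn't show the cache implementation, but the core idea is simple enough to sketch: an in-memory store of pre-fetched documents keyed by embedding, where a lookup is one cosine-similarity matrix multiply instead of a network round-trip. A minimal sketch, assuming NumPy embeddings and a similarity threshold; `SemanticCache`, `prefetch`, and `lookup` are illustrative names, not the tutorial's API:

```python
import numpy as np

class SemanticCache:
    """In-memory semantic cache: a background agent calls prefetch() to
    stage likely-relevant documents; the voice agent calls lookup() at
    response time, avoiding a vector-DB round-trip on the hot path."""

    def __init__(self, threshold: float = 0.85):
        self.threshold = threshold     # minimum cosine similarity for a hit
        self.embeddings: list[np.ndarray] = []  # unit-normalized vectors
        self.documents: list[str] = []

    def prefetch(self, embedding, document: str) -> None:
        v = np.asarray(embedding, dtype=np.float32)
        self.embeddings.append(v / np.linalg.norm(v))
        self.documents.append(document)

    def lookup(self, query_embedding):
        if not self.embeddings:
            return None
        q = np.asarray(query_embedding, dtype=np.float32)
        q = q / np.linalg.norm(q)
        sims = np.stack(self.embeddings) @ q   # cosine similarities in one matmul
        best = int(np.argmax(sims))
        return self.documents[best] if sims[best] >= self.threshold else None
```

With a few thousand cached entries this lookup stays well under a millisecond on CPU; the latency win comes from keeping the hot path free of I/O, not from the similarity math itself.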
An 80B model that runs like a 3B? Qwen3-Coder-Next shows you can get competitive coding-agent performance with a fraction of the active parameters, thanks to smart training.