The Hong Kong University of Science and Technology
Smartly reusing shared prefixes speeds up RL training for LLMs by up to 4.395x, a gain that could change how we approach large-scale model training.
Forget static policies: Autopoiesis uses LLMs to continuously rewrite serving policy code, adapting to runtime dynamics in ways human-designed systems can't.
Squeezing up to 2x more throughput from multimodal data pipelines is now possible with Trident, a new adaptive scheduler that dynamically optimizes operator parallelism and placement while avoiding OOM errors.
Now you can audit black-box LLM APIs for cheating (model substitution, overbilling) with <1% overhead, using verifiable computation.