Jointly optimizing the high-level and low-level policies can substantially improve LLM performance on tool-use tasks by reducing planner-executor misalignment.
Source data that looks similar to the target domain can still hurt cross-domain RL: what matters for transfer is alignment with the target domain's Bellman targets.