1
0
3
Ditching the critic doesn't mean sacrificing fine-grained credit assignment: RTMC leverages overlapping states in rollout trees to estimate per-step Q-values, outperforming critic-free baselines on SWE-bench.
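
To illustrate the idea of estimating per-step Q-values from overlapping states, here is a minimal sketch. It assumes that each rollout is a list of hashable (state, action) pairs with a single terminal reward, and that a Q-value is the mean terminal reward over every rollout passing through that pair. The function name `estimate_q_values` and the toy data are hypothetical, and this is not necessarily how RTMC itself computes its estimates.

```python
from collections import defaultdict


def estimate_q_values(rollouts):
    """Monte Carlo Q estimates from rollouts that share states.

    rollouts: list of (steps, reward), where steps is a list of
    hashable (state, action) pairs and reward is the terminal reward.
    Assumption (not from the source): Q(s, a) is the mean terminal
    reward of all rollouts that visit (s, a).
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for steps, reward in rollouts:
        # Count each rollout at most once per (state, action) pair.
        for key in set(steps):
            totals[key] += reward
            counts[key] += 1
    return {key: totals[key] / counts[key] for key in totals}


# Toy rollouts: the first two overlap at ("s0", "a") and then diverge.
rollouts = [
    ([("s0", "a"), ("s1", "b")], 1.0),
    ([("s0", "a"), ("s1", "c")], 0.0),
    ([("s0", "d"), ("s2", "e")], 0.0),
]
q = estimate_q_values(rollouts)
```

In this toy example the shared prefix `("s0", "a")` receives the average of two outcomes, while the diverging steps `("s1", "b")` and `("s1", "c")` get distinct estimates, which is the kind of per-step signal a single trajectory-level reward cannot provide.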