Independent Researcher
Self-conditioning on verified trajectories boosts reinforcement learning performance by over 8%, revealing the power of internal feedback in credit assignment.