Search papers, labs, and topics across Lattice.
2
0
6
Natural language database interfaces can be made far more robust by explicitly surfacing the system's interpretation of user intent and allowing for interactive disambiguation.
Stop hand-tuning reward learning losses: MAVRL learns a shared reward function from diverse feedback signals by treating each as a likelihood within a Bayesian inference framework.