Open-ended reinforcement learning with LLM-based rewards unlocks surprisingly strong performance in medical reasoning for multimodal models, even with limited training data.