Pokemon, with its blend of partial observability, game-theoretic reasoning, and long-horizon planning, emerges as a surprisingly effective benchmark, exposing critical gaps in LLM and RL capabilities that existing suites miss.