The Hong Kong University of Science and Technology (Guangzhou)
Forget reward engineering: this work shows LLMs can self-evolve and outperform larger models by learning to explore and summarize new environments autonomously.
LLMs can be fine-tuned more efficiently by adapting experts in the frequency domain, leading to better performance with fewer parameters.