CocoaBench Team · Apr 13, 2026 · Hubei University, Institute of Artificial Intelligence and Future Networks, Institute of Foundation Models (IFM)
Today's best AI agents still fail more than half the time on real-world tasks combining vision, search, and coding, revealing critical gaps in reasoning and tool use.