This paper introduces GameSight, a two-stage model for automatic soccer commentary generation that addresses limitations of previous end-to-end methods by incorporating knowledge and visual reasoning. GameSight first aligns anonymous entities (players, teams) with visual and contextual analysis, then refines the commentary using external historical statistics and internal game state. Experiments on the SN-Caption-test-align dataset demonstrate that GameSight improves player alignment accuracy by 18.5% compared to Gemini 2.5-pro, and also outperforms in segment-level accuracy, commentary quality, game-level contextual relevance, and structural composition.
Forget robotic summaries: GameSight brings the human touch to AI soccer commentary by weaving in player stats and game context, making it feel like a real broadcast.
Soccer commentary plays a crucial role in enhancing the viewing experience for audiences. Previous studies in automatic soccer commentary generation typically adopt an end-to-end method to generate anonymous live text commentary. Such commentary falls short of real-world live televised commentary, as it contains anonymous entities and context-dependent errors and lacks statistical insight into game events. To bridge the gap, we propose GameSight, a two-stage model that addresses soccer commentary generation as a knowledge-enhanced visual reasoning task, enabling knowledgeable, live-televised-like commentary with accurate references to entities (players and teams). GameSight starts by performing visual reasoning to align anonymous entities through fine-grained visual and contextual analysis. Subsequently, the entity-aligned commentary is refined with knowledge by incorporating external historical statistics and iteratively updated internal game state information. As a result, GameSight improves player alignment accuracy by 18.5% on the SN-Caption-test-align dataset compared to Gemini 2.5-pro. Combined with further knowledge enhancement, GameSight outperforms in segment-level accuracy and commentary quality, as well as game-level contextual relevance and structural composition. We believe that our work paves the way for a more informative and engaging human-centric experience with AI sports applications. Demo Page: https://gamesight2025.github.io/gamesight2025
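
To make the two-stage data flow described above concrete, here is a minimal toy sketch. It is not the authors' implementation. The placeholder tokens, the `GameState` fields, the function names, and the "career goals" statistic are all hypothetical and chosen only for illustration. In particular, stage 1 in the paper is visual reasoning over video, whereas this sketch simply receives the entity mapping as input.

```python
from dataclasses import dataclass, field


@dataclass
class GameState:
    """Hypothetical internal game state, updated after every segment."""
    score: dict = field(default_factory=dict)   # team -> goals so far
    events: list = field(default_factory=list)  # (player, event) history


def align_entities(anonymous_text, entity_map):
    """Stage 1 (toy stand-in): swap anonymous placeholders for real names.

    The mapping is assumed to be given; producing it from video is the
    hard part and is not modelled here.
    """
    aligned = anonymous_text
    for placeholder, name in entity_map.items():
        aligned = aligned.replace(placeholder, name)
    return aligned


def refine_with_knowledge(aligned_text, player, team, event, stats, state):
    """Stage 2 (toy stand-in): update the game state, then add knowledge."""
    # Internal knowledge: the state is updated segment by segment.
    state.events.append((player, event))
    if event == "goal":
        state.score[team] = state.score.get(team, 0) + 1

    notes = []
    # External knowledge: a made-up historical statistic per player.
    if player in stats:
        notes.append(f"{player} has {stats[player]} career goals.")
    if event == "goal":
        notes.append(f"{team} goals so far: {state.score[team]}.")
    return " ".join([aligned_text] + notes)


state = GameState()
stats = {"Player A": 12}  # hypothetical external statistics
aligned = align_entities(
    "[PLAYER] from [TEAM] scores!",
    {"[PLAYER]": "Player A", "[TEAM]": "Team X"},
)
commentary = refine_with_knowledge(
    aligned, "Player A", "Team X", "goal", stats, state
)
print(commentary)
```

Because `state` persists across calls, a second goal by the same team would report an updated tally, which is the sense in which the internal game state is "iteratively updated" in this sketch.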