UT Austin | Jun 11, 2026 | arXiv:2606.12790

GENIE: A Fine-Grained Measure for Novelty

Ramya Namuduri, Manya Wadhwa, A. Zheng, Anshun Asher Zheng, Greg Durrett, Junyi Jessy Li

AI Summary

This paper introduces GENIE, a fine-grained evaluation metric designed to assess the novelty of outputs generated by Large Language Models (LLMs) in a task-specific context. By addressing the limitations of holistic metrics, GENIE captures the high-dimensional nature of novelty and provides insights into the specific properties that contribute to creative outputs. The authors demonstrate that GENIE can effectively evaluate mitigation strategies aimed at enhancing creativity, revealing areas for improvement in model-generated content.

Key Contribution

GENIE shows that holistic metrics struggle to capture the high-dimensional nature of novelty, and it offers a fine-grained, task-specific view of which properties make an LLM response novel.

Abstract

Large Language Models have consistently demonstrated a lack of creativity and diversity across tasks. Prior work has focused on whether models are capable of generating creative outputs. Here, we consider novelty and investigate what makes model-generated content novel or not novel in a task-specific manner. We propose a fine-grained evaluation metric, GENIE, to measure the novelty of responses along task-specific features with respect to a population of responses. We show that, unlike GENIE, holistic metrics struggle to capture the high dimensionality of novelty and do not provide insight into which properties they target. Finally, we use GENIE to measure the effectiveness of mitigation methods that address creativity, to better understand where these methods can improve novelty.

Eval Frameworks & Benchmarks | Natural Language Processing

Citation Metrics

Citations: 0

Influential citations: 0

References: 48

Year: 2026

Venue: N/A
