Marburg, Ruhr University Bochum, University of Munich

Feb 22, 2026

arXiv:2602.19133

A Dataset for Named Entity Recognition and Relation Extraction from Art-historical Image Descriptions

Stefanie Schneider, Miriam Göldl, Julian Stalter, Ricarda Vollmer

AI Summary

The paper introduces FRAME, a new manually annotated dataset of art-historical image descriptions designed for Named Entity Recognition (NER) and Relation Extraction (RE). FRAME contains fine-grained annotations across three layers (metadata, content, and co-reference) with 37 entity types aligned with Wikidata. The dataset, released as UIMA XMI CAS files, enables benchmarking and fine-tuning of NER and RE systems, including LLMs, in the art-historical domain.

Key Contribution

FRAME is a manually annotated dataset of art-historical image descriptions, with 37 Wikidata-aligned entity types and typed relations between mentions, for benchmarking and fine-tuning NER and RE systems, including LLMs.

Abstract

This paper introduces FRAME (Fine-grained Recognition of Art-historical Metadata and Entities), a manually annotated dataset of art-historical image descriptions for Named Entity Recognition (NER) and Relation Extraction (RE). Descriptions were collected from museum catalogs, auction listings, open-access platforms, and scholarly databases, then filtered to ensure that each text focuses on a single artwork and contains explicit statements about its material, composition, or iconography. FRAME provides stand-off annotations in three layers: a metadata layer for object-level properties, a content layer for depicted subjects and motifs, and a co-reference layer linking repeated mentions. Across layers, entity spans are labeled with 37 types and connected by typed RE links between mentions. Entity types are aligned with Wikidata to support Named Entity Linking (NEL) and downstream knowledge-graph construction. The dataset is released as UIMA XMI Common Analysis Structure (CAS) files with accompanying images and bibliographic metadata, and can be used to benchmark and fine-tune NER and RE systems, including zero- and few-shot setups with Large Language Models (LLMs).
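To make the stand-off layout concrete, the following is a minimal sketch of how entity spans and typed relations can be read from an XMI-like file using only the Python standard library. The toy document, the element names (`Entity`, `Relation`), the feature names (`label`, `layer`, `governor`, `dependent`), and the label values are all assumptions made for illustration; they are not FRAME's actual type system, which should be taken from the released files.

```python
import xml.etree.ElementTree as ET

XMI_ID = "{http://www.omg.org/XMI}id"
CAS_SOFA = "{http:///uima/cas.ecore}Sofa"

# Toy XMI-like document. Element names, feature names, and labels are
# hypothetical and chosen for illustration only (not FRAME's type system).
TOY_XMI = """<xmi:XMI xmlns:xmi="http://www.omg.org/XMI"
         xmlns:cas="http:///uima/cas.ecore"
         xmlns:custom="http:///custom.ecore">
  <cas:Sofa xmi:id="1" sofaID="_InitialView"
            sofaString="Oil on canvas showing Saint Jerome with a lion."/>
  <custom:Entity xmi:id="2" sofa="1" begin="0" end="3"
                 label="material" layer="metadata"/>
  <custom:Entity xmi:id="3" sofa="1" begin="22" end="34"
                 label="person" layer="content"/>
  <custom:Entity xmi:id="4" sofa="1" begin="42" end="46"
                 label="animal" layer="content"/>
  <custom:Relation xmi:id="5" governor="3" dependent="4"
                   label="depicted_with"/>
</xmi:XMI>"""


def load_standoff(xmi_text):
    """Return (text, entities by xmi:id, relation triples of surface strings)."""
    root = ET.fromstring(xmi_text)
    # The description text lives once, in the Sofa; annotations only point into it.
    text = root.find(CAS_SOFA).get("sofaString")

    entities = {}
    raw_relations = []
    for el in root:
        local_name = el.tag.rsplit("}", 1)[-1]  # strip the XML namespace
        if local_name == "Entity":
            begin, end = int(el.get("begin")), int(el.get("end"))
            entities[el.get(XMI_ID)] = {
                "text": text[begin:end],
                "label": el.get("label"),
                "layer": el.get("layer"),
            }
        elif local_name == "Relation":
            raw_relations.append(
                (el.get("governor"), el.get("label"), el.get("dependent"))
            )

    # Resolve relation endpoints (xmi:id references) to the mention strings.
    triples = [
        (entities[gov]["text"], label, entities[dep]["text"])
        for gov, label, dep in raw_relations
    ]
    return text, entities, triples


if __name__ == "__main__":
    _, ents, rels = load_standoff(TOY_XMI)
    for ent in ents.values():
        print(ent["layer"], ent["label"], repr(ent["text"]))
    for triple in rels:
        print(triple)
```

The sketch keeps the text and the annotations separate, which is the point of a stand-off format: several layers can annotate the same description without modifying it. For the released CAS files, a dedicated UIMA reader (for example the dkpro-cassis package) together with the dataset's own type system is likely the more robust route.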

Data Curation & Synthetic Data, Multimodal Models, Natural Language Processing

Citation Metrics

Citations: 0

Influential citations: 0

References: 0

Year: 2026

Venue: N/A
