Avignon Université | LIA | Mar 9, 2026 | arXiv:2603.08282

Using Multimodal and Language-Agnostic Sentence Embeddings for Abstractive Summarization

Chaimae Chellaf, Salima Mdhaffar, Yannick Estève, Stéphane Huet

AI Summary

This paper introduces SBARThez, a modified BART-based French model for abstractive summarization that takes as input multimodal and multilingual sentence embeddings from LaBSE, SONAR, and BGE-M3. To mitigate hallucinations, the authors add a Named Entity Injection mechanism that appends tokenized named entities to the decoder input. Experiments show performance competitive with token-level baselines, particularly for low-resource languages, and the model generates more concise and more abstractive summaries than those baselines.

Key Contribution

Hallucination in abstractive summarization? Injecting named entities into the decoder input, along with multimodal embeddings, can keep your French BART model grounded.

Abstract

Abstractive summarization aims to generate concise summaries by creating new sentences, allowing for flexible rephrasing. However, this approach can be vulnerable to inaccuracies, particularly "hallucinations", where the model introduces non-existent information. In this paper, we leverage multimodal and multilingual sentence embeddings derived from pretrained models such as LaBSE, SONAR, and BGE-M3, and feed them into a modified BART-based French model. To improve the factual consistency of the generated summary, we introduce a Named Entity Injection mechanism that appends tokenized named entities to the decoder input. Our novel framework, SBARThez, is applicable to both text and speech inputs and supports cross-lingual summarization; it shows competitive performance relative to token-level baselines, especially for low-resource languages, while generating more concise and abstract summaries.
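The Named Entity Injection idea can be illustrated with a minimal sketch. The abstract only states that tokenized named entities are appended to the decoder input; everything else here is an assumption made for illustration: the whitespace tokenizer `toy_tokenize`, the `<sep>` separator token, the deduplication of repeated entities, and the function name `inject_named_entities` are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Sequence


def toy_tokenize(text: str, vocab: Dict[str, int]) -> List[int]:
    """Hypothetical whitespace tokenizer; unseen words are assigned new ids."""
    ids = []
    for word in text.lower().split():
        if word not in vocab:
            vocab[word] = len(vocab)
        ids.append(vocab[word])
    return ids


def inject_named_entities(
    decoder_input_ids: Sequence[int],
    entities: Sequence[str],
    vocab: Dict[str, int],
    sep_id: int,
) -> List[int]:
    """Append tokenized named entities to a copy of the decoder input.

    Assumptions (not from the paper): each entity is preceded by a
    separator token, and repeated entities are injected only once.
    """
    augmented = list(decoder_input_ids)  # leave the caller's list untouched
    seen = set()
    for entity in entities:
        key = entity.lower()
        if key in seen:
            continue
        seen.add(key)
        augmented.append(sep_id)
        augmented.extend(toy_tokenize(entity, vocab))
    return augmented


# Toy vocabulary with a start token and the assumed separator token.
vocab = {"<s>": 0, "<sep>": 1}
decoder_input = [vocab["<s>"]]
entities = ["Avignon", "Emmanuel Macron", "Avignon"]  # e.g. output of an NER tagger

augmented = inject_named_entities(decoder_input, entities, vocab, vocab["<sep>"])
print(augmented)
```

In a real system the toy tokenizer would be replaced by the summarization model's own tokenizer, and the entities would come from a named entity recognizer run on the source document; how SBARThez orders, separates, or filters the injected entities is not specified in the abstract.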

Topics: Eval Frameworks & Benchmarks | Multimodal Models | Natural Language Processing

Citation Metrics

Citations: 0

Influential citations: 0

References: 47

Year: 2026

Venue: N/A
