The paper introduces a system for predicting semantic relatedness between English sentences using a fine-tuned XLM-RoBERTa-large model combined with a CNN architecture and k-fold cross-validation. This approach was evaluated on the SemRel2024 dataset, a benchmark for semantic textual relatedness. The system achieved state-of-the-art performance, with a Spearman's correlation of 0.854 and a Pearson's correlation of 0.863, demonstrating significant improvements over other transformer-based models and traditional machine learning approaches.
Fine-tuning XLM-RoBERTa-large with a CNN and k-fold cross-validation achieves state-of-the-art semantic relatedness prediction, surpassing other transformers and traditional methods on the SemRel2024 benchmark.
Accurately measuring the semantic relatedness between sentences is crucial for various natural language processing (NLP) tasks, including question answering, text summarization, and information retrieval. This study introduces a system designed to precisely evaluate the relatedness of English sentences. Utilizing the SemRel2024 dataset, a comprehensive benchmark for semantic textual relatedness (STR), we conducted baseline experiments across multiple monolingual settings. Our proposed method integrates k-fold cross-validation with a fine-tuned XLM-RoBERTa-large model and a convolutional neural network (CNN) architecture, achieving the highest Spearman's correlation of 0.854 and Pearson's correlation of 0.863. We also explored several transformer-based models (RoBERTa-base, RoBERTa-large, BERT-large) and their architectural variations, as well as the effects of attention mechanisms combined with word embeddings such as Word2Vec, FastText, and global vectors (GloVe). Among traditional machine learning models, the TF-IDF + random forest (RF) model exhibited the best performance. Our findings demonstrate the potential for significant advancements in NLP applications through enhanced semantic relatedness prediction, thereby improving machine translation, information retrieval, and the development of sophisticated language models capable of nuanced understanding.
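The system is evaluated with Spearman's and Pearson's correlations between predicted and gold relatedness scores. As a minimal, self-contained sketch of these two metrics (not the paper's own evaluation code, which is not shown), Pearson's r measures linear correlation on the raw scores, while Spearman's rho is Pearson's r computed on the ranks of the scores, with ties assigned average ranks:

```python
from statistics import mean

def pearson(x, y):
    """Pearson's r: covariance of x and y divided by the product of their standard deviations."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

def average_ranks(values):
    """1-based ranks of values, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the tied 1-based positions
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(x, y):
    """Spearman's rho: Pearson's r applied to the rank-transformed scores."""
    return pearson(average_ranks(x), average_ranks(y))
```

For example, predictions that preserve the gold ordering exactly yield a Spearman's rho of 1.0 even when the raw values differ, which is why STR evaluations report both metrics.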