This paper introduces G-Loss, a loss function for fine-tuning language models that incorporates global semantic structure by building a document-similarity graph and applying semi-supervised label propagation over it. By using structural relationships within the embedding manifold, G-Loss guides the model toward more discriminative embeddings. In experiments on five classification datasets, G-Loss converges faster and reaches higher classification accuracy than models fine-tuned with traditional loss functions in the majority of setups.
Fine-tuning language models with a graph-guided loss that captures global semantic relationships can improve classification accuracy and convergence speed in most of the evaluated setups.
Traditional loss functions, including cross-entropy, contrastive, triplet, and supervised contrastive losses, used for fine-tuning pre-trained language models such as BERT, operate only within local neighborhoods and fail to account for the global semantic structure. We present G-Loss, a graph-guided loss function that incorporates semi-supervised label propagation to use structural relationships within the embedding manifold. G-Loss builds a document-similarity graph that captures global semantic relationships, thereby guiding the model to learn more discriminative and robust embeddings. We evaluate G-Loss on five benchmark datasets covering key downstream classification tasks: MR (sentiment analysis), R8 and R52 (topic categorization), Ohsumed (medical document classification), and 20NG (news categorization). In the majority of experimental setups, G-Loss converges faster and produces semantically coherent embedding spaces, resulting in higher classification accuracy than models fine-tuned with traditional loss functions.
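The general idea of a loss driven by label propagation over a document-similarity graph can be illustrated with a minimal NumPy sketch. It is not the paper's exact formulation: the Gaussian similarity kernel, the symmetric normalization, the closed-form propagation step, the cross-entropy on held-out labels, and the names `label_propagation_loss`, `sigma`, and `alpha` are all assumptions made for illustration. The sketch also computes only the forward value; in actual fine-tuning the same steps would be written in an autodiff framework so that gradients reach the encoder.

```python
import numpy as np


def label_propagation_loss(embeddings, labels, labeled_mask, num_classes,
                           sigma=1.0, alpha=0.9):
    """Illustrative graph-guided loss (hypothetical helper, not the paper's code).

    Builds a similarity graph over a batch of embeddings, propagates the labels
    of the "labeled" nodes through it, and scores the propagated predictions on
    the remaining held-out nodes with cross-entropy.
    """
    n = embeddings.shape[0]

    # Pairwise squared Euclidean distances between all embeddings in the batch.
    sq = np.sum(embeddings ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * embeddings @ embeddings.T, 0.0)

    # Assumed Gaussian kernel as the document-similarity graph; no self-loops.
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)

    # Symmetric normalization S = D^{-1/2} W D^{-1/2}.
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1) + 1e-12)
    S = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

    # One-hot seed matrix: only labeled nodes contribute label information.
    idx = np.flatnonzero(labeled_mask)
    Y = np.zeros((n, num_classes))
    Y[idx, labels[idx]] = 1.0

    # Closed-form label propagation: F = (I - alpha * S)^{-1} Y.
    F = np.linalg.solve(np.eye(n) - alpha * S, Y)

    # Turn propagated scores into per-node class probabilities.
    F = np.clip(F, 1e-12, None)
    P = F / F.sum(axis=1, keepdims=True)

    # Cross-entropy between propagated predictions and true labels of held-out nodes.
    held = np.flatnonzero(~labeled_mask)
    loss = -np.mean(np.log(P[held, labels[held]]))
    return loss, P


if __name__ == "__main__":
    # Toy batch: two tight clusters, one seed label per cluster.
    emb = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                    [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
    y = np.array([0, 0, 0, 1, 1, 1])
    mask = np.array([True, False, False, True, False, False])
    loss, probs = label_propagation_loss(emb, y, mask, num_classes=2)
    print("loss:", loss)
```

Because the propagated predictions depend on every pairwise similarity in the batch, the loss is small only when same-class documents sit close together across the whole graph, which is how a loss of this kind can reflect global rather than purely local structure.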