The paper introduces Variable Substitution, a novel graph augmentation technique for graph contrastive learning (GCL) tailored for mathematical formula retrieval. Unlike generic GCL augmentations that can disrupt the semantics of mathematical expressions, Variable Substitution preserves algebraic relationships and formula structure. Experiments using a GCL-based retrieval model demonstrate that Variable Substitution significantly boosts retrieval performance compared to standard augmentation methods.
Swapping variables in mathematical formulas during graph contrastive learning surprisingly improves retrieval accuracy by preserving crucial algebraic relationships.
This paper introduces Variable Substitution as a domain-specific graph augmentation technique for graph contrastive learning (GCL) in the context of searching for mathematical formulas. Standard GCL augmentation techniques often distort the semantic meaning of mathematical formulas, particularly for small and highly structured graphs. Variable Substitution, on the other hand, preserves the core algebraic relationships and formula structure. To demonstrate the effectiveness of our technique, we apply it to a classic GCL-based retrieval model. Experiments show that this straightforward approach significantly improves retrieval performance compared to generic augmentation strategies. We release the code on GitHub: https://github.com/lazywulf/formula_ret_aug
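
As a rough illustration of the idea, the sketch below renames the variables of a small formula graph consistently while leaving operators, constants, and edges untouched. It is not taken from the released code: the list-based graph representation, the `is_variable` mask, the replacement pool, and the function name `substitute_variables` are all assumptions made for this example.

```python
import random


def substitute_variables(labels, is_variable, pool, rng):
    """Return a copy of `labels` with every variable consistently renamed.

    Hypothetical helper: the same original variable always maps to the same
    replacement, and distinct variables map to distinct replacements.
    """
    # Distinct variable names, in order of first appearance.
    originals = list(dict.fromkeys(
        label for label, var in zip(labels, is_variable) if var
    ))
    # Sampling without replacement keeps the renaming one-to-one.
    replacements = rng.sample(pool, len(originals))
    mapping = dict(zip(originals, replacements))
    return [mapping[label] if var else label
            for label, var in zip(labels, is_variable)]


# x^2 + x*y as an operator tree (assumed representation).
labels = ["+", "^", "x", "2", "*", "x", "y"]
is_variable = [False, False, True, False, False, True, True]
edges = [(0, 1), (1, 2), (1, 3), (0, 4), (4, 5), (4, 6)]  # parent -> child

rng = random.Random(0)
view = substitute_variables(labels, is_variable, list("abcuvw"), rng)
# `edges` is reused unchanged, so the augmented view keeps the formula structure.
```

In a contrastive setup, `labels` and `view` (both with the same `edges`) would form a positive pair. Unlike node dropping or edge perturbation, the renaming keeps both occurrences of `x` tied to one symbol and keeps `x` distinct from `y`.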