The paper introduces Multi-CoLoR, a framework for code localization in multi-language codebases that integrates organizational context with graph-based reasoning. It first uses a similar issue context (SIC) module to retrieve related historical issues and prune the search space, then applies a code graph traversal agent (an extension of LocAgent) for structural reasoning over C++ and QML code. Experiments on a real-world enterprise dataset show that Multi-CoLoR improves localization accuracy (Acc@5) and reduces tool calls relative to lexical and graph-based baselines.
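The summary reports Acc@5 without defining it. As a rough illustration only, the sketch below assumes one common reading from localization work: an issue counts as a hit when every ground-truth file appears among the top k predictions. The function name `acc_at_k`, the issue IDs, and the file names are hypothetical, and the paper may use a different variant (for example, counting any single hit).

```python
def acc_at_k(predictions, gold, k=5):
    """Fraction of issues whose gold files ALL appear in the top-k predictions.

    predictions: dict mapping issue id -> ranked list of file paths
    gold:        dict mapping issue id -> set of ground-truth file paths
    This strict "all gold files" criterion is an assumption, not the paper's
    stated definition.
    """
    if not predictions:
        return 0.0
    hits = 0
    for issue_id, ranked in predictions.items():
        top_k = set(ranked[:k])
        expected = gold[issue_id]
        # An issue with no gold files is never counted as a hit.
        if expected and expected <= top_k:
            hits += 1
    return hits / len(predictions)


# Hypothetical toy data: two issues, one localized correctly within the top 5.
predictions = {
    "ISSUE-1": ["a.cpp", "b.qml", "c.cpp"],
    "ISSUE-2": ["x.cpp", "y.cpp"],
}
gold = {"ISSUE-1": {"b.qml"}, "ISSUE-2": {"z.qml"}}

score_at_5 = acc_at_k(predictions, gold, k=5)
score_at_1 = acc_at_k(predictions, gold, k=1)
```

With this toy data, one of the two issues is localized within the top 5, and none within the top 1, which shows how the metric tightens as k shrinks.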
Forget grepping through codebases: Multi-CoLoR leverages historical issue context and graph traversal to pinpoint relevant code across multiple languages with improved accuracy.
Large language models demonstrate strong capabilities in code generation but struggle to navigate complex, multi-language repositories to locate relevant code. Effective code localization requires understanding both organizational context (e.g., historical issue-fix patterns) and structural relationships within heterogeneous codebases. Existing methods either (i) focus narrowly on single-language benchmarks, (ii) retrieve code across languages via shallow textual similarity, or (iii) assume no prior context. We present Multi-CoLoR, a framework for Context-aware Localization and Reasoning across Multi-Language codebases, which integrates organizational knowledge retrieval with graph-based reasoning to traverse complex software ecosystems. Multi-CoLoR operates in two stages: (i) a similar issue context (SIC) module retrieves semantically and organizationally related historical issues to prune the search space, and (ii) a code graph traversal agent (an extended version of LocAgent, a state-of-the-art localization framework) performs structural reasoning within C++ and QML codebases. Evaluations on a real-world enterprise dataset show that incorporating SIC reduces the search space and improves localization accuracy, and graph-based reasoning generalizes effectively beyond Python-only repositories. Combined, Multi-CoLoR improves Acc@5 over both lexical and graph-based baselines while reducing tool calls on an AMD codebase.
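The two-stage flow described in the abstract can be sketched in a few lines. This is a toy illustration under stated assumptions, not the authors' implementation: token-overlap (Jaccard) similarity stands in for the SIC module's semantic and organizational retrieval, a bounded breadth-first walk stands in for the LLM-driven graph traversal agent, and every name (`HistoricalIssue`, `similar_issue_seeds`, `traverse`, the file paths) is hypothetical.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class HistoricalIssue:
    """A resolved issue and the files its fix touched (hypothetical schema)."""
    text: str
    fixed_files: tuple


def jaccard(a, b):
    """Token-overlap similarity; a stand-in for a real semantic retriever."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def similar_issue_seeds(query, history, top_n=2):
    """Stage 1: collect files fixed by the top_n most similar past issues."""
    ranked = sorted(history, key=lambda h: jaccard(query, h.text), reverse=True)
    seeds = []
    for issue in ranked[:top_n]:
        for path in issue.fixed_files:
            if path not in seeds:
                seeds.append(path)
    return seeds


def traverse(graph, seeds, max_hops=1):
    """Stage 2: breadth-first walk from the seeds, nearest files first."""
    order, seen = [], set(seeds)
    queue = deque((s, 0) for s in seeds)
    while queue:
        node, hops = queue.popleft()
        order.append(node)
        if hops == max_hops:
            continue  # do not expand beyond the hop budget
        for neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, hops + 1))
    return order


# Hypothetical history and cross-language dependency graph (QML -> C++).
history = [
    HistoricalIssue("login dialog crashes on submit", ("ui/Login.qml",)),
    HistoricalIssue("export to csv writes empty file", ("core/export.cpp",)),
]
graph = {
    "ui/Login.qml": ["core/auth.cpp"],
    "core/auth.cpp": ["core/session.cpp"],
    "core/export.cpp": [],
}

query = "crash when pressing submit in login dialog"
seeds = similar_issue_seeds(query, history, top_n=1)
candidates = traverse(graph, seeds, max_hops=1)
```

Seeding the traversal from files touched by similar past fixes is what shrinks the search space here: the walk never visits `core/export.cpp`, and the hop budget keeps it from drifting to `core/session.cpp`.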