This paper investigates zero-shot, inference-only methods for improving position encodings and optimizing attention mechanisms to enable context length extrapolation in LLMs for long code completion. The study focuses on evaluating existing techniques without training or fine-tuning, specifically targeting the challenge of fixed context lengths hindering generalization to long, domain-specific code sequences. The paper provides an analysis of methods that facilitate context length extrapolation in code.
Extending LLMs to handle longer code sequences may be possible without retraining, using clever modifications to positional embeddings and attention mechanisms.
The rapid advancement of large language models (LLMs) has led to a significant increase in automated tools in software engineering, capable of performing various code-related tasks such as code generation, completion, and translation. Despite these advancements, their effectiveness is constrained by fixed context lengths, limiting their ability to generalize across long, domain-specific code sequences. To address this challenge, we investigate zero-shot, inference-only methods aimed at improving position encodings and optimizing attention mechanisms. Our goal is to provide a thorough analysis of current approaches that facilitate context length extrapolation in code, particularly in the context of long code completion tasks.