Università della Svizzera italiana
Forget expensive LLMs: modern small language models can judge code correctness well enough to rival the code generation performance of models 5-25x larger.
Despite generating 2.4x more suggestions, ChatGPT-4 misses 90% of the quality issues spotted by human code reviewers, highlighting the limitations of current DL-based code review automation.