[Review] Automated Program Repair in the Era of Large Pre-trained Language Models
The paper presents the first extensive evaluation of recent LLMs for fixing real-world projects. It evaluates the effectiveness of the Automated Program Repair(ARP) in the era of LLMs.
Several conclusions were drawn:
- As we increase the size of the model, we also increase in the number of correct and plausible patches generated.
- Successfully utilizing the code after the buggy lines is important for fixing bugs.
- While LLMs have the capability to perform fault localization and repair in one shot, for real world software systems, it is still more cost-effective to first use traditional fault localization techniques to pinpoint the precise bug locations and then leverage LLMs for more targeted patch generation.
- By directly applying LLMs for APR without any specific change/finetuning, we can already achieve the highest number of correct fixes compared to existing baselines.
- Entropy computation via LLMs can help distinguish correct patches from plausible patches.
- Sum entropy performs slightly better compared to mean entropy.
![[Review] Automated Program Repair in the Era of Large Pre-trained Language Models](/blog/images/23/cover.jpg)
![[Review] Assisting Static Analysis with Large Language Models: A ChatGPT Experiment](/blog/images/42/cover.png)
![[Review] Detecting Missed Security Operations Through Differential Checking of Object-based Similar Paths](/blog/images/41/cover.png)
![[Review] GPTScan: Detecting Logic Vulnerabilities in Smart Contracts by Combining GPT with Program Analysis](/blog/images/40/cover.png)
![[Review] MoonShine: Optimizing OS Fuzzer Seed Selection with Trace Distillation](/blog/images/39/cover.png)
![[Review] One Simple API Can Cause Hundreds of Bugs: An Analysis of Refcounting Bugs in All Modern Linux Kernels](/blog/images/38/cover.png)