[Review] Automated Program Repair in the Era of Large Pre-trained Language Models
The paper presents the first extensive evaluation of recent LLMs for fixing bugs in real-world projects, studying how effective Automated Program Repair (APR) is in the era of LLMs.
Several conclusions were drawn:
- As model size increases, the number of correct and plausible patches generated also increases.
- Successfully utilizing the code after the buggy lines (the suffix context) is important for fixing bugs (see the infilling sketch after this list).
- While LLMs can perform fault localization and repair in one shot, for real-world software systems it is still more cost-effective to first use traditional fault localization techniques to pinpoint the precise bug locations, and then leverage LLMs for more targeted patch generation.
- By directly applying LLMs to APR without any task-specific modification or fine-tuning, we can already achieve more correct fixes than existing baselines.
- Entropy computed with LLMs can help distinguish correct patches from merely plausible ones (see the ranking sketch after this list).
- Sum entropy performs slightly better than mean entropy.
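
To make the suffix-context point concrete, below is a minimal sketch (not the paper's exact tooling) of how a cloze-style infilling prompt can keep the code *after* the buggy line as conditioning context. Here `generate_infill` is a hypothetical hook standing in for any fill-in-the-middle-capable code LLM, and the buggy line number is assumed to come from a traditional fault localizer.

```python
def build_infill_prompt(source_lines, buggy_line_no, context=10):
    """Split the file into prefix/suffix context around the buggy line."""
    prefix = "\n".join(source_lines[max(0, buggy_line_no - context):buggy_line_no])
    suffix = "\n".join(source_lines[buggy_line_no + 1:buggy_line_no + 1 + context])
    return prefix, suffix

def propose_patches(source_lines, buggy_line_no, generate_infill, n_samples=20):
    """Sample candidate replacement lines for the buggy location.

    `generate_infill(prefix=..., suffix=...)` is a hypothetical callable wrapping
    an infilling-capable code LLM. Each sampled candidate would later be
    validated against the project's test suite to keep only plausible patches.
    """
    prefix, suffix = build_infill_prompt(source_lines, buggy_line_no)
    return [generate_infill(prefix=prefix, suffix=suffix) for _ in range(n_samples)]
```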
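
The entropy-based ranking from the last two points can be sketched as follows, under the assumption that `score_tokens` is a hypothetical hook returning the per-token log-probabilities the LLM assigns to a patch. Sum entropy totals the per-token negative log-probabilities, while mean entropy length-normalizes them; lower entropy (more "natural" code) is ranked higher.

```python
def sum_entropy(token_logprobs):
    """Sum of per-token negative log-probabilities of the patch tokens."""
    return -sum(token_logprobs)

def mean_entropy(token_logprobs):
    """Average per-token negative log-probability (length-normalized)."""
    return -sum(token_logprobs) / len(token_logprobs)

def rank_plausible_patches(patches, score_tokens, use_sum=True):
    """Rank test-passing (plausible) patches so likely-correct ones come first.

    `score_tokens(patch)` is a hypothetical hook returning the per-token
    log-probabilities assigned to the patch by the LLM.
    """
    key = sum_entropy if use_sum else mean_entropy
    return sorted(patches, key=lambda p: key(score_tokens(p)))
```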
![[Review] Automated Program Repair in the Era of Large Pre-trained Language Models](/blog/images/23/cover.jpg)