[Review] Automated Program Repair in the Era of Large Pre-trained Language Models

The paper presents the first extensive evaluation of recent LLMs for fixing real-world projects. It evaluates the effectiveness of Automated Program Repair (APR) in the era of LLMs.

Several conclusions were drawn:

  • Increasing the model size also increases the number of correct and plausible patches generated.
  • Successfully utilizing the code after the buggy lines is important for fixing bugs.
  • While LLMs can perform fault localization and repair in one shot, for real-world software systems it is still more cost-effective to first use traditional fault localization techniques to pinpoint the precise bug locations and then leverage LLMs for more targeted patch generation.
  • By directly applying LLMs for APR without any specific change/finetuning, we can already achieve the highest number of correct fixes compared to existing baselines.
  • Entropy computation via LLMs can help distinguish correct patches from plausible patches.
  • Sum entropy performs slightly better than mean entropy.
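
To make the last two points concrete, here is a minimal sketch of how sum and mean entropy could be computed and used to rank candidate patches. It assumes that "entropy" means the negative log-probability the model assigns to each generated patch token, and that lower values indicate a more natural patch. The function names and the sample log-probabilities are hypothetical and are not taken from the paper's code.

```python
from typing import Dict, List, Tuple


def patch_entropies(token_logprobs: List[float]) -> Tuple[float, float]:
    """Return (sum_entropy, mean_entropy) for one candidate patch.

    Assumption: each entry is the natural-log probability the LLM assigned
    to one generated token of the patch, so its negation is that token's
    entropy contribution.
    """
    if not token_logprobs:
        raise ValueError("a patch needs at least one token")
    per_token = [-lp for lp in token_logprobs]
    total = sum(per_token)
    return total, total / len(per_token)


def rank_by_sum_entropy(candidates: Dict[str, List[float]]) -> List[str]:
    """Order patch names from lowest to highest sum entropy."""
    return sorted(candidates, key=lambda name: patch_entropies(candidates[name])[0])


# Hypothetical log-probabilities for two plausible patches.
candidates = {
    "patch_a": [-0.1, -0.2],
    "patch_b": [-1.0, -2.0, -0.5],
}
ranking = rank_by_sum_entropy(candidates)
```

Sum entropy grows with patch length while mean entropy normalizes it away, so the two can order the same candidates differently. That difference is what the last bullet compares.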