[Review] Examining Zero-Shot Vulnerability Repair with Large Language Models
The paper evaluates the performance of LLMs on program repair, the same topic as "Automated Program Repair in the Era of Large Pre-trained Language Models". Unlike that work, this paper goes deeper into the details, and its program repair setting is considerably more complex.
The main conclusions:
- LLMs can generate fixes for security bugs.
- However, in real-world settings their performance is not yet sufficient.
Background:
- Security bugs are significant.
- LLMs are popular and have shown outstanding performance.
Implementation:
RQ1: Can off-the-shelf LLMs generate safe and functional code to fix security vulnerabilities?
RQ2: Does varying the amount of context in the comments of a prompt affect the LLM’s ability to suggest fixes?
RQ3: What are the challenges when using LLMs to fix vulnerabilities in the real world?
RQ4: How reliable are LLMs at generating repairs?
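RQ2 concerns how much guidance the comments in a prompt give the model. A minimal sketch of how such prompts might be varied is below; the template text, the context levels, and the example function are hypothetical illustrations, not the paper's actual prompt templates:

```python
# Illustrative sketch (NOT the paper's exact templates): assembling repair
# prompts with increasing amounts of comment context for an LLM to complete.

# Hypothetical vulnerable C function used as the repair target.
VULNERABLE_FUNC = """int copy_input(char *dst, char *src) {
    strcpy(dst, src);  /* unbounded copy */
    return 0;
}"""

def build_prompt(context_level: int) -> str:
    """Build a repair prompt; context_level controls how much guidance
    appears in the comments (levels 0-2 are made up for illustration)."""
    parts = ["/* Fix the following function. */"]
    if context_level >= 1:
        # Name the weakness class in the prompt.
        parts.append("/* BUG: out-of-bounds write (CWE-787). */")
    if context_level >= 2:
        # Hint at the concrete repair strategy.
        parts.append("/* FIXED VERSION: use a bounded copy such as strncpy. */")
    parts.append(VULNERABLE_FUNC)
    return "\n".join(parts)

for level in range(3):
    print(f"--- context level {level} ---")
    print(build_prompt(level))
```

Each level adds one comment line, so an experiment can attribute changes in repair success to the extra context alone.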