[Review] On the Naturalness of Software
A classic paper showing that software, like natural language, has its own "naturalness", and laying the groundwork for statistical code prediction and completion.
- Natural languages are repetitive and predictable, which is why statistical approaches (NLP) can process them. Program code is also highly regular, even more so than natural language.
- Using standard cross-entropy and perplexity measures, the paper demonstrates that an n-gram language model (a probabilistic chain over tokens) does capture the high-level statistical regularity that exists in software.
- These regularities are specific to both individual projects and application domains.
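
To make the cross-entropy and perplexity measures above concrete, here is a minimal sketch of an n-gram model over code tokens. It is a simplification for illustration, not the paper's setup: it assumes a bigram model (n = 2) with add-one smoothing and whitespace tokenization, and the function names `train_bigram`, `cross_entropy`, and `perplexity` are hypothetical.

```python
import math
from collections import Counter


def train_bigram(tokens):
    """Count bigram contexts and bigrams in a token sequence."""
    contexts = Counter(tokens[:-1])             # how often each token is followed by another
    bigrams = Counter(zip(tokens, tokens[1:]))  # how often each adjacent pair occurs
    vocab = set(tokens)
    return contexts, bigrams, vocab


def cross_entropy(model, tokens):
    """Average bits per token that the model needs to predict `tokens`."""
    contexts, bigrams, vocab = model
    v = len(vocab) + 1  # assumption: one extra slot reserves mass for unseen tokens
    total_bits = 0.0
    pairs = list(zip(tokens, tokens[1:]))
    for prev, cur in pairs:
        # Add-one smoothed estimate of P(cur | prev).
        p = (bigrams[(prev, cur)] + 1) / (contexts[prev] + v)
        total_bits -= math.log2(p)
    return total_bits / len(pairs)


def perplexity(model, tokens):
    """Perplexity is 2 raised to the cross-entropy."""
    return 2 ** cross_entropy(model, tokens)


if __name__ == "__main__":
    loop = "for ( i = 0 ; i < n ; i ++ ) { x = x + i ; }".split()
    model = train_bigram(loop * 3)
    familiar = loop                    # same token order the model was trained on
    scrambled = list(reversed(loop))   # same tokens, unfamiliar order
    print("familiar :", cross_entropy(model, familiar))
    print("scrambled:", cross_entropy(model, scrambled))
```

Lower cross-entropy means the model finds the token sequence less "surprising". In this toy setup the familiar loop scores lower than the same tokens in reversed order, which is the kind of regularity the measures are meant to expose.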
![[Review] On the Naturalness of Software](/blog/images/20/cover.jpg)