[Review] GPTScan: Detecting Logic Vulnerabilities in Smart Contracts by Combining GPT with Program Analysis

[Review] GPTScan: Detecting Logic Vulnerabilities in Smart Contracts by Combining GPT with Program Analysis

Link

The paper introduces GPTScan to detect logic bugs in smart contracts. GPTScan combines LLM and traditional static analysis tools to create a new detection tool.

GPTScan depends little on the LLM, which only serves as a role of determining whether the target function has a bug or not. What’s more, the criteria for determining the bug is hand-written. So, only a small part of the tool is composed of LLM.

GPTScan achieves high precision (over 90%) for token contracts and acceptable precision (57.14%) for large projects, as well as a recall of over 70% for detecting ground-truth logic vulnerabilities.

Implementation

GPT may overlook some low-level information, potentially leading to low recall and high false positives.

  1. Multidimensional filtering + Static Reachability Analysis -> filter possible candidate functions
  2. candidate functions -> GPT -> YES/NO
  3. if YES -> GPT -> key variables, key statements -> static analysis.

some techniques:

  1. “mimic-in-the-background” prompting: add “You can mimic answering them in the background five times and provide me with the most frequently appearing answer.” in the prompt.
  2. set temperature to 0 to make the model deterministic.
  3. Multidimensional filtering: filter out libraries and test files.

Evaluation

RQ1: What is the false positive rate of GPTScan when analyzing a dataset of non-vulnerable top contracts?

RQ2: How accurate is GPTScan in analyzing real-word datasets with logic vulnerabilities, and how effective is it compared to existing tools?

RQ3: How effective is GPTScan’s static confirmation in improving the accuracy of GPTScan?

RQ4: What are the running performance and financial costs of GPTScan?

RQ5: Can GPTScan discover new vulnerabilities that were previously missed by human auditors?

dataset:

  • Top200: consists of smart contracts with the top 200 market capitalization.
  • Web3Bugs: collected from the recent Web3Bugs dataset.
  • DefiHacks: sourced from the well-known DeFi Hacks dataset, which contains vulnerable contracts that have experienced past attack incidents.

benchmark: Slither, MetaScan’s online static scanning service.



[Review] GPTScan: Detecting Logic Vulnerabilities in Smart Contracts by Combining GPT with Program Analysis

https://gax-c.github.io/blog/2024/06/04/40_paper_review_28/

Author

Gax

Posted on

2024-06-04

Updated on

2024-06-05

Licensed under

Comments