[Review] MINER: A Hybrid Data-Driven Approach for REST API Fuzzing

This paper proposes a new approach to REST API fuzzing, which:

  1. Focuses on generating long request sequences.
  2. Introduces a customized attention model to guide the fuzzing process.
  3. Implements a new data-driven security rule checker to capture a new kind of error caused by undefined parameters.

[1]: REST APIs are usually accessed through the HTTP methods GET, POST, PUT, and DELETE.

Motivation:

Testing cloud services is important, but earlier work (such as RESTler) fails to generate long request sequences, so it cannot reach the deep errors hidden in hard-to-reach states of cloud services. MINER applies length-oriented mechanisms to generate long request sequences and uses an attention model to help requests pass semantic checking. Furthermore, it applies a data-driven security rule checker to capture a new kind of error caused by undefined parameters.
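To make the "undefined parameters" idea concrete, here is a minimal sketch of how such a check could look. The function names, the request layout (a dict with `endpoint` and `params`), and the choice to flag only 50x responses are my own assumptions for illustration, not the paper's implementation.

```python
import random


def inject_undefined_param(request, defined_params, observed_pairs, rng):
    """Return a copy of `request` carrying one parameter its schema does not define.

    `observed_pairs` is a list of (name, value) pairs seen in earlier valid
    requests. Returns None when every observed name is already defined.
    All names here are hypothetical.
    """
    candidates = [(n, v) for n, v in observed_pairs if n not in defined_params]
    if not candidates:
        return None
    name, value = rng.choice(candidates)
    mutated = {"endpoint": request["endpoint"], "params": dict(request["params"])}
    mutated["params"][name] = value
    return mutated


def is_suspicious(status_code):
    """Assumed rule: a 50x answer to a request with an undefined parameter is an error."""
    return status_code // 10 == 50
```

The mutated request would then be sent to the service, and `is_suspicious` applied to the returned status code.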

Implementation:

  • 5 main components: Sequence Template Selection, Generation Module, Fuzzing Module, Collection Module, and Training Module.
  • Sequence Template Selection first generates the frameworks of the sequence.
  • Generation Module fills in the frameworks with parameters.
  • Fuzzing Module fuzzes the cloud service.
  • Collection Module collects the related data (valid request sequences and param-value pairs).
  • Training Module is periodically invoked to train an attention model, which helps the Generation Module produce requests that pass semantic checking.
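The interplay of the modules above can be sketched roughly as follows. This is a simplified illustration under my own assumptions: templates are weighted by their length, and the attention model is replaced by random reuse of previously collected values. All function names and data layouts are hypothetical.

```python
import random
from collections import defaultdict


def new_history():
    """Nested mapping: endpoint -> parameter name -> list of observed values."""
    return defaultdict(lambda: defaultdict(list))


def select_template(templates, rng):
    """Sequence Template Selection: favor longer templates.

    Weighting by length is an illustrative assumption, not the paper's
    exact probability assignment.
    """
    weights = [len(template) for template in templates]
    return rng.choices(templates, weights=weights, k=1)[0]


def fill_template(template, param_names, history, rng):
    """Generation Module: attach a value to every parameter of every request.

    Values seen in earlier valid requests are reused when available;
    this random reuse stands in for the attention model.
    """
    sequence = []
    for endpoint in template:
        params = {}
        for name in param_names.get(endpoint, []):
            seen = history[endpoint][name]
            params[name] = rng.choice(seen) if seen else "fuzz"
        sequence.append({"endpoint": endpoint, "params": params})
    return sequence


def collect(history, sequence, statuses):
    """Collection Module: keep param-value pairs from requests answered with 20x."""
    for request, status in zip(sequence, statuses):
        if status // 10 == 20:
            for name, value in request["params"].items():
                history[request["endpoint"]][name].append(value)
```

In a full loop, the Fuzzing Module would send each filled sequence to the service and pass the returned status codes to `collect`, so that later generations draw on values that are already known to be accepted.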

Response codes: a 50x response means a bug was found, a 20x response means the request is syntactically and semantically correct, and a 40x response means the request contains a syntactic or semantic error.
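This reading of the response codes can be written as a small helper. The sketch assumes that "20x" means the codes 200 to 209 (and likewise for 40x and 50x); the function name and the labels are my own.

```python
def classify_response(status_code):
    """Map an HTTP status code to the three outcomes described above.

    Assumes the 'x' in 20x/40x/50x stands for a single final digit.
    """
    prefix = status_code // 10
    if prefix == 20:
        return "valid"    # syntactically and semantically correct
    if prefix == 40:
        return "invalid"  # syntactic or semantic error in the request
    if prefix == 50:
        return "bug"      # server-side error, counted as a bug
    return "other"
```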

Evaluation:

  • Construct two prototypes, without and with the DataDriven Checker, to measure the checker's error discovery performance.
  • Comparison: the state-of-the-art open-source fuzzer RESTler.
  • Benchmarks: GitLab, Bugzilla and WordPress via 11 REST APIs.
  • Deploy an open-source version of each cloud service on the authors' own server.
  • Setup: each run lasts 48 hours in a Docker container configured with 8 CPU cores, 20 GB RAM, Python 3.8.2, and Ubuntu 16.04 LTS. The evaluations run on 3 servers, each with two E5-2680 CPUs, 256 GB RAM, and an Nvidia GTX 1080 Ti graphics card.
  • Metrics: 1) the pass rate of syntax and semantic checking, 2) the number of request types that receive responses in the 20x range, and 3) the number of unique errors, i.e. those that trigger responses in the 50x range or violate the defined security rules.
  • Apply the length-oriented sequence construction, the attention model, and other techniques to RESTler for further analysis.
  • Further analyses: performance on reproducing serious bugs, coverage performance, the schedule of the Training Module, and the execution distribution of requests.
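The three evaluation metrics (pass rate, request types answered in the 20x range, and unique errors) could be computed from a fuzzing log along these lines. The record layout and the way errors are de-duplicated (by request type, status code, and rule flag) are assumptions of this sketch, not taken from the paper.

```python
def summarize(records):
    """Compute the three metrics from (request_type, status_code, rule_violated) records."""
    records = list(records)
    total = len(records)
    # Metric 1: share of requests that pass syntax and semantic checking (20x).
    passed = sum(1 for _, status, _ in records if status // 10 == 20)
    pass_rate = passed / total if total else 0.0
    # Metric 2: distinct request types that received a 20x response.
    valid_types = {rtype for rtype, status, _ in records if status // 10 == 20}
    # Metric 3: distinct errors, either 50x responses or security rule violations.
    unique_errors = {
        (rtype, status, violated)
        for rtype, status, violated in records
        if status // 10 == 50 or violated
    }
    return {
        "pass_rate": pass_rate,
        "valid_request_types": len(valid_types),
        "unique_errors": len(unique_errors),
    }
```

For example, a log with three 20x responses out of six requests gives a pass rate of 0.5, and two identical 500 responses on the same request type count as a single unique error.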

Future work:

  • Apply the attention model to other areas.
  • Improve the reproducibility of the bugs found.
  • Keep improving the length-oriented request generation approach.

https://gax-c.github.io/blog/2023/10/25/12_paper_review_3/

Author: Gax

Posted on: 2023-10-25

Updated on: 2023-11-01