<Pinned> How to Read a Paper

<Pinned> How to Read a Paper

It’s a good question that “how to read a paper ?”. After reading some papers I suddenly find myself actually don’t know how to effectively and efficiently read a paper. So I’ll make some collections here on this topic.

You can refer to the links above to get some general ideas (these links were copied from Lingming Zhang’s slides of cs527-s23).

According to William Griswold, try to answer the problem below:

Read more
[Review] MoonShine: Optimizing OS Fuzzer Seed Selection with Trace Distillation

[Review] MoonShine: Optimizing OS Fuzzer Seed Selection with Trace Distillation

Link

The paper proposes the concept of Trace Distillation, that is, to distill or extract the key system calls from the original system call sequence without lowering the coverage, and these distilled sequences will be used as the seed for mutation during fuzzing.

From the distillation process, the dependencies between the system calls will be inferred to help distillation. So actually, the root cause of the speed-up is the dependency inference.

Use static analysis to achieve the seed distillation: inferring both explicit and implicit dependencies between system calls.

MoonShine improved Syzkaller’s test coverage for the Linux kernel by 13% and discovered 17 new previously-undisclosed vulnerabilities in the Linux kernel.

Introduction

Kernel fuzzing, an old topic.

Challenges: dependencies between system calls, kernel states for specific bug triggering.

Existing hand-coded rules are not scalable or effective.

Read more
[Review] One Simple API Can Cause Hundreds of Bugs: An Analysis of Refcounting Bugs in All Modern Linux Kernels

[Review] One Simple API Can Cause Hundreds of Bugs: An Analysis of Refcounting Bugs in All Modern Linux Kernels

Link

The paper mainly focuses on the reference counting(refcounting) bugs in Linux Kernel.

  1. Analyzes the history of 1,033 refcounting bugs in 753 versions of Linux Kernels from 2005 to 2022, and concludes 9 critical rules to check refcounting bugs.
  2. Designs a new tool applying these 9 rules, and detects 351 new bugs, of which 240 are confirmed.

Introduction

Reference counting bugs: the reference count is used to record the reference number of an object(similar to smart pointers in C++).

Potential risks: Memory leakage, UAF.

Read more
[Review] HEALER: Relation Learning Guided Kernel Fuzzing

[Review] HEALER: Relation Learning Guided Kernel Fuzzing

Link

The paper proposes a new technique called relation learning to help infer the relations between system calls when fuzzing the kernel.

Relation learning is achieved by constructing a relation graph, which is a two-dimensional graph with each cell representing the dependencies between two system calls.

The relation graph is built through static and dynamic learning. Static learning will infer the dependencies by analyzing the parameters and the return value of each system call. Dynamic learning will determine the dependencies by analyzing the generated minimized system call sequences.

Read more
[Review] Towards Precise Reporting of Cryptographic Misuses

[Review] Towards Precise Reporting of Cryptographic Misuses

Link here

The paper demonstrates an investigation into Java cryptographic misuse. To be brief, the paper does some research on current misuse detection techniques, analyzing the false positive cases and true positive cases they manifest. The paper discovers the root cause of high false positive rate and invalid true positive cases.

Introduction:

Many cryptographic misuse detection techniques have been proposed but with a high false positive rate. Additionally, many of the misuse alarms might not be very actionable to developers, and previous works might have overestimated the number of misuses and vulnerabilities.

Read more
[Review] How Good Are the Specs? A Study of the Bug-Finding Effectiveness of Existing Java API Specifications

[Review] How Good Are the Specs? A Study of the Bug-Finding Effectiveness of Existing Java API Specifications

Link here

The paper is a evaluation, which assesses the current runtime verification technology, and mainly the effectiveness of the existing API specifications.

Three conclusions:

  1. Current RV technology has matured enough with tolerable runtime overhead.
  2. Existing API specification can find many bugs that developers are willing to fix.
  3. The false alarm rates are quite high due to the ineffective specifications.
Read more
[LeetCode] 1531. String Compression II

[LeetCode] 1531. String Compression II

Link here

I thought it should be Greedy at first, but was stuck in Example 2. Then, I considered about DP, but found it hard to figure out how to build the larger solution from the sub-problem.

So, another time I reached out for solution.

We use dp[i][j] to represent the minimum length of the first i letters after compression and after j deletions.

It obvious that if we delete the i-th letter, and then dp[i][j] = dp[i-1][j-1].

If we don’t, we can scan the former string sequence to try to “combine” the letters.

Read more
[LeetCode] 4. Median of Two Sorted Arrays

[LeetCode] 4. Median of Two Sorted Arrays

Link here

An excellent binary search problem. It’s easy to use brute-force or mergesort to solve it, but the time complexity is specified as $O(log(m+n))$ in the problem.

It’s obvious to think of binary search when noticing the time complexity with “log”, but constructing such a algorithm remains to be a big problem.

Here are three ways to solve the problem:

Read more
[Review] Python Crypto Misuses in the Wild

[Review] Python Crypto Misuses in the Wild

Link here

The paper conducts a study on Python crypto API misuses. A tool called LICMA is implemented aiming at detecting crypto API misuses in python.

Several conclusions:

  1. 52.26 % of the Python projects using crypto APIs contain at least a potential misuse.
  2. Only 14.81 % of the projects directly contain a misuse of a crypto API. The rest is introduced through third-party code.
  3. Most Python applications are more secure compared with C or Java, and the distribution between the concrete types of misuses differ a lot.
Read more
[Review] Evaluation of Static Vulnerability Detection Tools with Java Cryptographic API Benchmarks

[Review] Evaluation of Static Vulnerability Detection Tools with Java Cryptographic API Benchmarks

Link here

The paper assesses the performance of the current static vulnerability detection tools in the era of Java cryptographic API misuse.

Main contributions:

  • provide two benchmarks: CryptoAPI-Bench, ApacheCryptoAPI-Bench.
    • CryptoAPI-Bench consists of 181 test cases covering 16 types of Cryptographic and SSL/TLS API misuse vulnerabilities, with basic level and advanced level.
    • ApacheCryptoAPI-Bench documents the API misuse vulnerabilities from 10 real-world Apache projects. This benchmark is for checking the scalability(the ability to induce low computational overhead to analyze large code-bases) of the detection tool.
  • evaluate four static analysis tools based on the two proposed benchmarks: specialized tools(CryptoGuard, CrySL), general purpose tools(SpotBugs, Coverity).
Read more