Background: A user program can invoke different syscalls in arbitrary order, and each syscall eventually executes a different kernel code path. If syscalls are invoked from two different user threads, their respective kernel code paths also execute in two different kernel threads. This characteristic raises interesting questions:
- Could a user program lead to a race condition in kernel code?
- Could such a race manifest as a data race on kernel memory?
For example, if syscall_A writes to memory address 0xaa from thread TH_A and syscall_B reads from address 0xaa in thread TH_B, then those two syscalls have a data race on 0xaa. Such a data race can cause system unresponsiveness and, in some cases, memory-safety vulnerabilities such as use-after-free or double-free. Existing kernel data race detection systems follow one of two approaches:
- A fuzzer generates a program with arbitrary syscalls (by mutation or from a grammar) and executes it as a single-threaded program. If two syscalls of that program access the same memory, the executor tags them as a data race pair (although in reality the race has not been proved).
- A hypervisor generates random interleavings between threads while executing selected multi-threaded programs. If it gets lucky, the hypervisor may observe the same memory being accessed from two different threads and tags the respective syscalls as a data race pair (this constitutes concrete proof).
Since the first approach lacks concrete proof while the second depends on how many multi-threaded input programs are available, a combination of the two approaches promises a significant improvement.
RAZZER: RAZZER combines the fuzzing and thread-interleaving approaches, and additionally integrates a static analysis that tags potential data race instruction pairs. A modified version of SVF (a static points-to analysis tool based on LLVM) identifies an over-approximate set of instruction pairs. A hypervisor loads the kernel into a guest VM and sets hardware breakpoints on those instruction pairs. The VM monitor maintains a bitmap in which a bit is set when a breakpoint is triggered. Next, a grammar-based fuzzer generates user programs with arbitrary syscalls, and a single-threaded executor runs them inside the guest VM, triggering the hardware breakpoints. By analyzing the bitmap, a set of user programs is selected for the next phase. Multiple user programs may trigger the same candidate pairs while covering different parts of the kernel code, so the system keeps all such programs under consideration. Finally, the pair of syscalls from each selected user program is executed from two different threads. The hypervisor is modified to perform custom kernel thread interleaving that targets the instruction pairs and tries both execution orders. The VM monitor verifies the concrete memory addresses being accessed and reports a true data race when the addresses match.
The original purpose of designing RAZZER was to reduce the false-positive alerts of existing systems while also accelerating the process of finding bugs. The authors evaluated their system on a very powerful machine (Intel Xeon E5-4655 with 512 GB memory) and fuzzed four kernel versions for seven weeks. They detected 30 bugs and confirmed 16 of them. Most (70%) of the bugs were reported within the first 100 hours, which is very encouraging.
- The system design is simple yet useful. It follows a current trend in which researchers combine different program analysis methods to fine-tune vulnerability detection.
- It leverages existing state-of-the-art tools to implement its design. The static analysis is based on SVF and the fuzzer's user program generator is based on Syzkaller. The kernel and user programs are hosted in a guest VM, and the hypervisor is based on QEMU and utilizes KVM. So the system is quite stable.
- Although they did not discuss their case studies in the paper, they do mention them in their GitHub repo. Those case studies give a better understanding of the tool's capabilities.
- Instead of randomly assigning syscalls to threads and trying interleavings in both orders, they could predict the vulnerable order from the static analysis. This could accelerate detection further.
- While the system can detect data races, it falls short in determining exploitability. I believe that is why only about half of their bugs were confirmed and patched.
- Effective Data-Race Detection for the Kernel. OSDI 2010.
- Kernel data race detection using debug register in Linux. IEEE COOL Chips.
- DRDDR: a lightweight method to detect data races in Linux kernel. Journal of Supercomputing.
- KRACE: Data Race Fuzzing for Kernel File Systems. IEEE S&P 2020.
- Concurrency bugs should fear the big bad data-race detector. LWN.net, 2020.
- Hybrid Dynamic Data Race Detection. PPoPP 2003.
- Dynamic Analyses for Data-Race Detection. International Conference on Runtime Verification.
- Fast and accurate static data-race detection for concurrent programs. CAV 2007.
- List of Linux kernel data races found in recent 5 years.