Enforcing Unique Code Target Property for Control-Flow Integrity

Link: https://dl.acm.org/citation.cfm?id=3243797
Source Code: https://github.com/uCFI-GATech

Summary: The project pursues an ambitious goal: enforce a CFI policy that, based on the program's execution history, allows exactly one valid target for each indirect jump/call. For over a decade, researchers have tried to design strict, strong CFI policies, but performance overhead and the complexity of real-world programs have defeated them every time, and this project is no exception. The evaluation omits SPEC benchmarks such as gcc, xalancbmk, and omnetpp, which are the most important ones for evaluating a CFI defense given their complexity (they also contain the largest numbers of indirect calls). The approach depends heavily on Intel PT hardware support, which tends to lose information when packets are generated too frequently. Intel PT was designed for offline debugging, so there is little hope of fixing this issue for an online analysis.

Design: The system has three parts: (1) constraining-data identification, (2) runtime data collection, and (3) online analysis. Constraining data is a program variable that is not itself a function pointer but determines where a function pointer will target. For example, if a program indexes an array of function pointers with a variable i to make an indirect call, then i is constraining data. They identify it with compile-time analysis: first they find the instructions involved in computing the function pointer, then they check whether the operands of those instructions are variables. Once the constraining data is known, it must be collected at runtime (2), which they do through compile-time instrumentation. Since they rely on Intel PT to record execution history, they need a way to force arbitrary values into the Intel PT trace. Their trick: the instrumentation calls a dedicated write function with the constraining-data value as the argument, and that function issues a series of indirect calls to return-only stubs, where each call target is selected by a chunk of the passed value. To reduce the size of the packet trace, they apply the same encoding to sensitive basic blocks, assigning each a unique ID. With all of this in place, they can ignore TNT packets (conditional control transfers) and check only TIP packets (indirect control transfers). The online analysis (3) runs in a separate process, in parallel with the original program; a read routine reconstructs the encoded data from the Intel PT packet trace and validates the indirect-call target against a mapping.

Advantage: If every piece of constraining data for every sensitive indirect control transfer could be identified perfectly, and the hardware traced execution history without loss, and the out-of-band validation did not weaken the security guarantee, then this would be a very good system.

Disadvantage: Unfortunately, none of these can work perfectly. Static analysis of complex real-world applications will never succeed with such a simple algorithm; there will always be special cases, and the analysis mechanism will need constant updating. The imperfect hardware is reported by the authors themselves, and everyone knows you need to know the target before you jump. Pausing the program at sensitive system-call points can enable a denial-of-service attack: frequently triggered indirect control transfers build up a long queue of pending validation checks. On performance, they claim a 10% average overhead, yet report 47% for perlbench and 24% for h264ref while not evaluating the similarly complex gcc, omnetpp, and xalancbmk. If we assume even 25% overhead for each of those, the overall average would climb to around 15%. They evaluated nginx and vsftpd but do not say which test suites they used. Moreover, their instrumentation adds more indirect calls, possibly more than exist in the target program itself.

Discussion: To emphasize the security improvement, they present a case study from the h264ref SPEC benchmark. A structure has a function-pointer member, and the program builds an array of 1200 of these structures. From this they assume there are 1200 valid targets for that member function pointer (a unique one per element) and claim to reduce the set to one. In truth, that member is only ever assigned three distinct functions in the entire source code. The project is too ambitious and hence not practical.
