Linux kernel and vulnerabilities
The number of devices running Linux each day is astounding. Android alone accounted for 2 billion devices as of 2017. Some devices which run Linux receive regular updates; however, many receive infrequent updates, if any. Kernel bugs themselves tend to be long-lived, with critical and high-severity bugs remaining undetected for 5-6 years on average. There will always be bugs, and they will continue to exist whether we are aware of them or not. To this end, preventing bugs from being exploitable is of utmost importance.
A number of options have been added to the Linux kernel to validate that various types of memory corruptions have not occurred. One example is Stack Smashing Protection, mentioned in this post. This post focuses on other options which are more narrowly focused. These are discussed below. Note that the discussion is limited to Kernel v4.8; there may be changes in later versions which improve or change the options.
Linked List Sanity Checks
The CONFIG_DEBUG_LIST option, when enabled, adds validation to linked list structures in the Kernel. An example of a vulnerability which relies on corrupting a linked list is [CVE-2017-10661](https://access.redhat.com/security/cve/CVE-2017-10661), which uses a race condition and simultaneous operations to corrupt a list. This is fixed here, for the curious.
The option was added in this commit, and results in sanity checks being added to linked list operations:
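To give a feel for what such checks look like, here is a minimal user-space sketch. It is not the kernel's code: the `sketch_` names and the poison addresses are hypothetical, and the checks report failure with a return value, whereas the kernel emits a warning. The idea is the same though: before touching the pointers, confirm that the neighbouring nodes still agree with each other, and poison the pointers of a deleted node so a second delete can be caught.

```c
#include <assert.h>
#include <stdbool.h>

/* Same shape as the kernel's struct list_head: a circular doubly linked list. */
struct list_head {
    struct list_head *next, *prev;
};

/* Hypothetical poison addresses for this sketch, written into deleted nodes. */
#define SKETCH_POISON_NEXT ((struct list_head *)0x100)
#define SKETCH_POISON_PREV ((struct list_head *)0x200)

static void sketch_list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/* Insert entry between prev and next, refusing if the two do not agree. */
static bool sketch_list_insert(struct list_head *entry,
                               struct list_head *prev,
                               struct list_head *next)
{
    if (next->prev != prev || prev->next != next)
        return false;               /* neighbours are inconsistent */
    if (entry == prev || entry == next)
        return false;               /* entry is already at this position */

    next->prev = entry;
    entry->next = next;
    entry->prev = prev;
    prev->next = entry;
    return true;
}

/* Unlink entry, refusing on a double delete or when neighbours disagree. */
static bool sketch_list_del(struct list_head *entry)
{
    if (entry->next == SKETCH_POISON_NEXT || entry->prev == SKETCH_POISON_PREV)
        return false;               /* node was already deleted */
    if (entry->prev->next != entry || entry->next->prev != entry)
        return false;               /* neighbours do not point back */

    entry->next->prev = entry->prev;
    entry->prev->next = entry->next;
    entry->next = SKETCH_POISON_NEXT;
    entry->prev = SKETCH_POISON_PREV;
    return true;
}

/* Demonstration: a valid insert, a double delete, and a corrupted list. */
static void sketch_list_demo(void)
{
    struct list_head head, a, b, rogue;

    sketch_list_init(&head);
    assert(sketch_list_insert(&a, &head, head.next));
    assert(sketch_list_del(&a));
    assert(!sketch_list_del(&a));       /* second delete is caught */

    head.prev = &rogue;                 /* simulate memory corruption */
    assert(!sketch_list_insert(&b, &head, head.next));
}
```

Calling `sketch_list_demo()` from a test program exercises all three cases. Note how cheap the checks are: a handful of pointer comparisons per operation.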
Credential Sanity Checks
The CONFIG_DEBUG_CREDENTIALS option was added here and adds sanity checks to credentials. This adds the following to a credential structure:
as well as adding the following validations:
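As a rough illustration of the idea, the sketch below adds a magic marker and a subscriber count to a toy credential structure and validates both before the credential is trusted. This is a user-space approximation under my own assumptions: `struct sketch_cred`, its fields, and the marker values are illustrative and are not a copy of the kernel's `struct cred`.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative marker values for a live and a released credential. */
#define SKETCH_CRED_MAGIC      0x43736564u
#define SKETCH_CRED_MAGIC_DEAD 0x44656144u

/* Toy credential; the last three fields exist only for debugging. */
struct sketch_cred {
    unsigned uid;
    int usage;          /* reference count */
    int subscribers;    /* number of tasks currently using this credential */
    unsigned magic;     /* marker that should survive untouched while live */
};

static void sketch_cred_init(struct sketch_cred *c, unsigned uid)
{
    c->uid = uid;
    c->usage = 1;
    c->subscribers = 0;
    c->magic = SKETCH_CRED_MAGIC;
}

/* A credential is trusted only if its marker is intact and counts add up. */
static bool sketch_cred_is_valid(const struct sketch_cred *c)
{
    if (c->magic != SKETCH_CRED_MAGIC)
        return false;               /* overwritten, or already released */
    if (c->usage < c->subscribers)
        return false;               /* more users than references */
    return true;
}

/* Mark the credential dead so any later use is detected. */
static void sketch_cred_release(struct sketch_cred *c)
{
    c->magic = SKETCH_CRED_MAGIC_DEAD;
}

/* Demonstration: a valid credential, a bad count, and a use after release. */
static void sketch_cred_demo(void)
{
    struct sketch_cred c;

    sketch_cred_init(&c, 1000);
    assert(sketch_cred_is_valid(&c));

    c.subscribers = 2;              /* simulate a corrupted count */
    assert(!sketch_cred_is_valid(&c));

    c.subscribers = 0;
    sketch_cred_release(&c);
    assert(!sketch_cred_is_valid(&c));
}
```

A marker like this will not stop a careful attacker who knows its value, but it does catch stray writes and use-after-free bugs that would otherwise go unnoticed.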
Notifiers Sanity Checks
Validation of notifier chains was added here to check that notifiers are from the kernel or a still-loaded module prior to being invoked. This is enabled with the CONFIG_DEBUG_NOTIFIERS option. The check is fairly light weight, and is as follows:
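The shape of that check can be sketched in user space. User-space code has no notion of kernel or module text, so the sketch below swaps in a table of handlers that are still considered valid; that table, and every `sketch_` name, is an assumption of mine for illustration only. What it does show is the control flow: each callback pointer is validated immediately before it is invoked, and an invalid one is skipped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef int (*sketch_notifier_fn)(int event);

/* One link in a singly linked notifier chain. */
struct sketch_notifier {
    sketch_notifier_fn call;
    struct sketch_notifier *next;
};

static int sketch_events_seen;

static int sketch_good_handler(int event)
{
    sketch_events_seen += event;
    return 0;
}

/* Pretend this handler lives in a module that has since been unloaded. */
static int sketch_stale_handler(int event)
{
    (void)event;
    return -1;
}

/* Stand-in for "is this address in kernel or loaded-module text?". */
static const sketch_notifier_fn sketch_valid_text[] = { sketch_good_handler };

static bool sketch_fn_is_valid_text(sketch_notifier_fn fn)
{
    size_t i;

    for (i = 0; i < sizeof(sketch_valid_text) / sizeof(sketch_valid_text[0]); i++)
        if (sketch_valid_text[i] == fn)
            return true;
    return false;
}

/* Walk the chain, skipping invalid callbacks; returns how many were skipped. */
static int sketch_call_chain(struct sketch_notifier *nb, int event)
{
    int skipped = 0;

    for (; nb != NULL; nb = nb->next) {
        if (!sketch_fn_is_valid_text(nb->call)) {
            skipped++;              /* never jump to an untrusted address */
            continue;
        }
        nb->call(event);
    }
    return skipped;
}

/* Demonstration: a chain of three notifiers, the middle one stale. */
static void sketch_notifier_demo(void)
{
    struct sketch_notifier third = { sketch_good_handler, NULL };
    struct sketch_notifier second = { sketch_stale_handler, &third };
    struct sketch_notifier first = { sketch_good_handler, &second };

    sketch_events_seen = 0;
    assert(sketch_call_chain(&first, 5) == 1);
    assert(sketch_events_seen == 10);
}
```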
Scatter/Gather Table Sanity Checks
Scatter/Gather tables are a mechanism used for high performance I/O on DMA devices. Details on this can be found in this article. Sanity checks for scatter/gather tables were added to the Kernel here and can be enabled with the CONFIG_DEBUG_SG option. This adds an extra field to each scatter/gather entry, as shown below:
and also adds sanity checks when modifying or inspecting entries, such as:
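The pattern is easy to imitate in user space. In the hedged sketch below, a toy entry carries a magic field that is set when the table is initialised and verified by every accessor. The `sketch_` names and the layout of `struct sketch_sg_entry` are mine, chosen for illustration, and failures are reported by return value rather than by halting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative marker written into every initialised entry. */
#define SKETCH_SG_MAGIC 0x87654321UL

/* Toy scatter/gather entry: one buffer segment plus a debug-only marker. */
struct sketch_sg_entry {
    unsigned long magic;
    void *buf;
    size_t length;
};

/* Initialise every entry in the table and stamp it with the marker. */
static void sketch_sg_init_table(struct sketch_sg_entry *table, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        table[i].magic = SKETCH_SG_MAGIC;
        table[i].buf = NULL;
        table[i].length = 0;
    }
}

/* Modify an entry only if it was initialised and has not been overwritten. */
static bool sketch_sg_set_buf(struct sketch_sg_entry *sg, void *buf, size_t length)
{
    if (sg->magic != SKETCH_SG_MAGIC)
        return false;
    sg->buf = buf;
    sg->length = length;
    return true;
}

/* Inspect an entry; returns NULL if the marker is missing. */
static void *sketch_sg_buf(const struct sketch_sg_entry *sg)
{
    if (sg->magic != SKETCH_SG_MAGIC)
        return NULL;
    return sg->buf;
}

/* Demonstration: an initialised entry works, an uninitialised one is refused. */
static void sketch_sg_demo(void)
{
    struct sketch_sg_entry table[2];
    struct sketch_sg_entry uninit = { 0, NULL, 0 };
    char data[16];

    sketch_sg_init_table(table, 2);
    assert(sketch_sg_set_buf(&table[0], data, sizeof(data)));
    assert(sketch_sg_buf(&table[0]) == data);
    assert(!sketch_sg_set_buf(&uninit, data, sizeof(data)));
}
```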
Stack End Sanity Checks
The CONFIG_SCHED_STACK_END_CHECK option was added here to check in schedule() whether a stack has been overrun. If it has, BUG() is invoked, which results in executing an undefined instruction, thus causing the currently running process to die. Enabling this option adds the following sanity check to ensure the end of the stack is not corrupted:
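The mechanism can be modelled in a few lines of user-space C. In this sketch, which is an approximation and not the kernel's implementation, a fixed array stands in for a task's stack, a magic word sits at its far end, and a check compares that word against the expected value. All `sketch_` names, the stack size, and the marker value are illustrative assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_STACK_WORDS     64
#define SKETCH_STACK_END_MAGIC 0x57AC6E9Du   /* illustrative marker value */

/* Toy stack that grows downward from the last word toward words[0]. */
struct sketch_stack {
    uint32_t words[SKETCH_STACK_WORDS];      /* words[0] is the stack end */
    size_t used;
};

static void sketch_stack_init(struct sketch_stack *s)
{
    s->words[0] = SKETCH_STACK_END_MAGIC;
    s->used = 0;
}

/*
 * Unchecked push, like real stack growth: it will overwrite the end marker.
 * It stops at the array bound only so that this sketch stays well-defined.
 */
static void sketch_stack_push(struct sketch_stack *s, uint32_t value)
{
    if (s->used >= SKETCH_STACK_WORDS)
        return;
    s->words[SKETCH_STACK_WORDS - 1 - s->used] = value;
    s->used++;
}

/* The sanity check itself: has the marker at the stack end been clobbered? */
static bool sketch_stack_end_corrupted(const struct sketch_stack *s)
{
    return s->words[0] != SKETCH_STACK_END_MAGIC;
}

/* Demonstration: fill the stack to one word short, then overrun it. */
static void sketch_stack_demo(void)
{
    struct sketch_stack s;
    uint32_t i;

    sketch_stack_init(&s);
    for (i = 0; i < SKETCH_STACK_WORDS - 1; i++)
        sketch_stack_push(&s, i);
    assert(!sketch_stack_end_corrupted(&s));

    sketch_stack_push(&s, 0xDEADBEEFu);      /* one push too many */
    assert(sketch_stack_end_corrupted(&s));
}
```

The check only detects an overrun after the fact, at the point where it is run, which is why the kernel performs it in a frequently executed path such as schedule().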
An example of a bug which this catches can be found here, which is a vulnerability that allows one to recurse arbitrarily until the stack is overrun.
Analyzing the sanity checks
To determine the performance implications of the aforementioned configurations, a custom Linux distribution was built twice using Yocto: once with the configurations enabled in the kernel and once without. Linux v4.8 was used in the comparison. Two metrics relevant to an embedded system are used for the analysis:
- Increased code size
- Performance cost
The build was run on QEMU and analyzed. See this post on how to create a custom QEMU image, in my case on macOS.
The Yocto build was a bare-bones build with one exception: FFmpeg was included, as it is used to compare performance. Adding FFmpeg was accomplished by adding the following to the conf/local.conf file in the build directory:
To enable the sanity checks the following kernel configuration was added:
Code size
The kernel images being compared are the bzImage files created by Yocto. Following are the sizes in KB of the kernel images from the two builds:
| Build | Size (KB) |
|---|---|
| No Flags | 6,881 |
| Sanitizers | 6,890 |

This shows that enabling the sanitizers adds 9 KB to the kernel image. This is rather small: an increase of roughly 0.1%.
Performance cost
Validating the kernel data structures mentioned above does result in additional instructions being executed, and is expected to incur some performance penalty. The question, however, is whether the performance cost can be measured or whether it is exceedingly small.
To quantify the performance impact, an experiment was conducted which encoded a small video 20 times in succession using FFmpeg, once with and once without the kernel debug configurations mentioned earlier. See this post for details on the experiment and the video file which was used.
The following two box plots show the results of the experiment (raw data here).
The results show that there is some performance cost, as the checks increase encoding time by an average of 1.3 seconds. This is a performance hit of 1.1%.
Conclusion
The validation checks do provide some protection against vulnerabilities resulting from corruption of various kernel data structures. The code size increase is rather small, which is intuitive given how little code is needed to perform the validations. The runtime overhead of the checks, however, is not small. For the CPU-bound workload of encoding videos with FFmpeg, an increase of 1.1% was observed.
There are several options available to harden a Linux system. If one has a given performance loss budget that must be adhered to, weigh the trade-offs of these kernel configurations against other opportunities. It depends on one's threat landscape, but there may be higher-impact hardening options for a similar or smaller performance hit.