Buffer overflow checking in glibc
The last few topics discussed “hardening” options in GCC that instrument code, checking at runtime for certain undefined behavior in memory reads and writes. This post looks at a different hardening option that does not instrument code and instead relies on bounds checking in glibc.
There are several functions in glibc which operate on strings or buffers. These include:
memcpy, mempcpy, memmove, memset, strcpy, stpcpy, strncpy, strcat, strncat, sprintf, vsprintf, snprintf, vsnprintf, gets
Although some of them accept an argument giving the length of the passed buffer, the functions must trust that the size is correct. Other functions, such as sprintf, do not accept a maximum size argument at all. In both cases, a buffer overflow is possible.
Glibc provides wrapper functions for these which do perform bounds checking at runtime against the actual size of the buffers being processed. For example, __memcpy_chk() takes the same dest, src, and len arguments as memcpy(), plus an extra argument, destlen, for the size of the destination buffer.
These wrappers behave the same way as the original functions, except that if the destination buffer cannot accommodate the data being written, the function aborts at runtime, flagging the buffer overflow before it occurs.
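The check itself is small. The following is a minimal sketch of the idea, not glibc's actual implementation: sketch_memcpy_chk is a hypothetical name, and it simply mirrors the argument order of __memcpy_chk() (dest, src, len, destlen).

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for a checked memcpy (not the glibc source).
 * destlen is the size of the destination buffer as the compiler sees it.
 */
void *sketch_memcpy_chk(void *dest, const void *src, size_t len, size_t destlen)
{
    /* Refuse to write more bytes than the destination can hold. */
    if (len > destlen)
        abort();

    /* The copy fits, so behave exactly like memcpy(). */
    return memcpy(dest, src, len);
}
```

Called as sketch_memcpy_chk(buf, "abc", 4, sizeof buf) with an 8-byte buf, this copies normally; with len larger than destlen it terminates the program instead of writing out of bounds.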
One is not expected to call these wrapper functions directly. Instead, if code is compiled with the _FORTIFY_SOURCE macro defined (and with optimization enabled, which glibc requires before it activates the checks), GCC will transparently replace the function calls with their wrappers if the compiler cannot prove that a buffer overflow is impossible. The size of the destination buffer is determined using the built-in function __builtin_object_size(). This returns the number of bytes remaining in the object, counted from the given pointer to the object's end; if the size is not known at compile time, (size_t) -1 is returned (and thus no additional protection is provided).
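To make “bytes remaining” concrete, here is a small sketch using __builtin_object_size() directly, assuming GCC or Clang. The buffer and function names are hypothetical and exist only for illustration.

```c
#include <stddef.h>

/* Hypothetical 16-byte buffer used only for this illustration. */
static char demo_buf[16];

/* Pointer to the start of the array: all 16 bytes remain. */
size_t remaining_from_start(void)
{
    return __builtin_object_size(demo_buf, 0);
}

/* Pointer 4 bytes into the array: 12 bytes remain. */
size_t remaining_from_offset(void)
{
    return __builtin_object_size(&demo_buf[4], 0);
}

/*
 * Here the compiler only sees a bare pointer. Unless it can trace the
 * pointer back to a known object (for example after inlining), the size
 * is unknown and the builtin yields (size_t) -1.
 */
size_t remaining_from_pointer(const char *p)
{
    return __builtin_object_size(p, 0);
}
```

The first two functions are resolved entirely at compile time. The third is the situation in which a fortified call has nothing to check against.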
The _FORTIFY_SOURCE macro can be set to one of two checking levels, because “bytes remaining in a structure” has two possible meanings. Consider a structure blob with two adjacent members, first and second, where blob.first is only 10 bytes long. If writing past blob.first and into blob.second should be disallowed, use _FORTIFY_SOURCE=2. Alternatively, if data overruns from blob.first into blob.second but does not overrun blob, the program may still work correctly (and this may be part of the program's expected operation). If this is acceptable, use _FORTIFY_SOURCE=1.
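The two meanings correspond to the second argument of __builtin_object_size(): type 0 measures to the end of the enclosing object, type 1 to the end of the member. As far as I know, glibc's string wrappers use type 0 at level 1 and type 1 at level 2. The sketch below assumes a struct with two 10-byte arrays; only the size of first comes from the text above, while the size of second and the function names are assumptions made for illustration.

```c
#include <stddef.h>

/* Assumed layout: the 10-byte size of second is an assumption. */
struct blob {
    char first[10];
    char second[10];
};

static struct blob blob;

/* Type 0: bytes from blob.first to the end of the whole blob. */
size_t bytes_to_end_of_blob(void)
{
    return __builtin_object_size(blob.first, 0);
}

/* Type 1: bytes from blob.first to the end of blob.first only. */
size_t bytes_to_end_of_first(void)
{
    return __builtin_object_size(blob.first, 1);
}
```

With this layout, a 15-byte write starting at blob.first stays within the type 0 limit (20 bytes) but exceeds the type 1 limit (10 bytes), which is exactly the case the two levels treat differently.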
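To summarize:
- _FORTIFY_SOURCE=1: a write may run past blob.first as long as it stays within blob.
- _FORTIFY_SOURCE=2: a write that runs past blob.first is treated as an overflow, even if it stays within blob.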
In addition to checking at runtime, if the compiler knows at compile time that a function call will cause a buffer overflow it will also emit a warning. For example:
Unless compiled with -Werror the program will still compile. However, when run it will fail:
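The run-time failure can be observed without bringing down the whole program by triggering it in a child process. The sketch below assumes Linux with glibc and GCC or Clang. It calls the checked builtin __builtin___strcpy_chk() directly so that it does not depend on how the file is compiled; the function names are hypothetical.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Copies src into an 8-byte buffer through the checked strcpy builtin. */
static char copy_into_small_buffer(const char *src)
{
    char buf[8];
    __builtin___strcpy_chk(buf, src, __builtin_object_size(buf, 1));
    return buf[0];  /* read the buffer so the copy is not optimized away */
}

/*
 * Runs the copy in a child process.
 * Returns 1 if the child was killed by SIGABRT (overflow detected),
 * 0 if the copy completed, and -1 if fork() or waitpid() failed.
 */
int copy_aborts(const char *src)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        char first = copy_into_small_buffer(src);
        _exit(first == src[0] ? 0 : 1);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;

    return WIFSIGNALED(status) && WTERMSIG(status) == SIGABRT;
}
```

A string that fits, such as "short", lets the child exit normally. A longer string makes glibc report a detected buffer overflow on stderr and abort the child before any out-of-bounds write happens.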
Using _FORTIFY_SOURCE is not a panacea, as not all calls to these functions can be checked for overflows. The examples below show the four possible cases [source]:
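The sketch below is a reconstruction of the four cases as they are usually described for GCC's object size checking, not necessarily the original listing. The function names are hypothetical, and the known-overflow case is left in a comment so that the file stays safe to compile and run.

```c
#include <stddef.h>
#include <string.h>

/* Case 1: buffer size and copy length are both known, and the copy fits.
 * No run-time check is needed. */
size_t case1_known_safe(void)
{
    char buf[5];
    strcpy(buf, "abcd");            /* 5 bytes into a 5-byte buffer */
    return strlen(buf);
}

/* Case 2: buffer size is known, length is only known at run time.
 * This is where a fortified build checks n against 5 while running.
 * The caller must pass 1 <= n <= 5 and a src of at least n bytes. */
int case2_checked_at_runtime(const char *src, size_t n)
{
    char buf[5];
    memcpy(buf, src, n);
    return buf[0];
}

/* Case 3: the overflow is known at compile time. The compiler warns,
 * and a fortified build aborts when the call is reached:
 *
 *     char buf[5];
 *     strcpy(buf, "abcde");        // 6 bytes into a 5-byte buffer
 */

/* Case 4: the destination size is not visible here, so nothing can be
 * checked (unless the call is inlined where the size is known). */
char *case4_unchecked(char *dest, const char *src)
{
    return strcpy(dest, src);
}
```

Only case 4 slips through entirely, which is why the option reduces the risk of overflows and does not eliminate it.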
Using _FORTIFY_SOURCE also adds compile-time warnings and checks for some best practices. Read here for further details.
Analyzing FORTIFY_SOURCE
The usage of _FORTIFY_SOURCE is analyzed below. The strictest option, _FORTIFY_SOURCE=2, is used. Two metrics relevant to an embedded system will be used for the analysis:
- Increased code size
- Performance cost
To facilitate the analysis, a custom Linux distribution was built using Yocto: one build with the define enabled and one without. Each build was run on QEMU and analyzed. See this post on how to create a custom QEMU image, in my case on macOS.
The Yocto build was a bare-bones build with one exception: FFmpeg was included which will be used to compare performance. Adding FFmpeg was accomplished by adding the following to the conf/local.conf file in the build directory:
To enable _FORTIFY_SOURCE, the following was added to the conf/local.conf file:
Code size
The Yocto builds were configured to produce an ext4 file system image. The following table shows the number of KB used on each file system:
Build | Size (KB) |
---|---|
No Flags | 34,844 |
FORTIFY_SOURCE | 34,868 |
This shows that compiling with _FORTIFY_SOURCE=2 adds 24 KB (about 0.07%), which is very modest. This may depend on the type of code being compiled, so your mileage may vary.
Performance cost
The wrappers for the fortified glibc functions do result in extra instructions being executed, since they check bounds conditions before continuing. The question is whether the performance cost can be measured or whether it is too small to notice.
To quantify the performance impact, an experiment was conducted which encoded a small video 20 times in succession using FFmpeg, once with and once without using the fortified functions. See this post for details on the experiment and the video file which was used.
The following two box plots show the results of the experiment (raw data here).
The results show no loss of performance when using _FORTIFY_SOURCE. Curiously, the performance appears to have improved, which is not expected. It is not known whether this is an artifact of the testing setup or whether additional optimizations became available when the option was used.
Conclusion
The _FORTIFY_SOURCE option does provide protection against some types of buffer overflows when using select glibc functions. The code size cost of enabling it is very low, and there is no indication of performance loss when encoding a sample video with FFmpeg. The option additionally emits compile-time warnings when the compiler can prove that code will overflow a buffer. Given these results, enabling this option should be a simple and cheap way to decrease the risk of buffer overflow related defects in software using glibc.