Separating debug symbols from executables

23 November 2023 — by Johan Herland

This article aims to introduce and explore the practice of splitting debug symbols away from C/C++ build artifacts to save space and time when building large codebases. Note that we want to retain access to the debug symbols if and when they are needed at a later date, hence we don’t want to merely remove (aka strip) the debug symbols.¹

This exploration is largely inspired by and based on what I have learned in various places around the web, along with various experiments of my own, outlined below.

This article will focus on ELF files on Linux. For other formats and platforms, things are likely to be quite different. The compiler/toolchain below is based on GCC, but the experiments are repeatable with minor changes on LLVM-based toolchains. Your mileage may vary.

What are debug symbols?

In short, debug symbols are extra “stuff” in your intermediate object files — and ultimately in your executables — that help your debugger map the machine code being executed back into higher-level source code concepts like variables and functions. This allows the debugger to then present a view of the execution that corresponds more directly to the source code that you’re used to reading.

Without debug symbols, the debugger can become almost useless as it is typically very hard to understand which part of the (often optimized) machine code execution (i.e. registers, memory addresses, etc.) corresponds to which part of the source code (variables, functions, etc.).

To illustrate with a toy example, hello.cpp:

#include <iostream>

int main() {
    std::cout << "Hello, world!" << std::endl;
    return 0;
}

Debug symbols are the difference between:

$ g++ hello.cpp -o hello.default
$ gdb ./hello.default
GNU gdb (GDB) 13.1
[...]
Reading symbols from ./hello.default...
(No debugging symbols found in ./hello.default)
(gdb) br main
Breakpoint 1 at 0x4010a0
(gdb) run
Starting program: /home/jherland/code/debug_fission_experiment/hello.default
[...]
Breakpoint 1, 0x00000000004010a0 in main ()
(gdb) list
No symbol table is loaded.  Use the "file" command.

and:

$ g++ -g hello.cpp -o hello.with-g
$ gdb ./hello.with-g
[...]
Reading symbols from ./hello.with-g...
(gdb) br main
Breakpoint 1 at 0x4010a0: file hello.cpp, line 4.
(gdb) run
Starting program: /home/jherland/code/debug_fission_experiment/hello.with-g
[...]
Breakpoint 1, main () at hello.cpp:4
4           std::cout << "Hello, world!" << std::endl;
(gdb) list
1       #include <iostream>
2
3       int main() {
4           std::cout << "Hello, world!" << std::endl;
5           return 0;
6       }

Preliminaries

Before we dive into the deeper analysis, let’s make sure that our toy example can stay relevant for the remainder of this exploration.

Compiling and linking as separate steps

Above, we used a single command (g++ <options> hello.cpp -o hello.<suffix>) to compile and link our application. These two steps are worth separating in our further analysis. For one, it allows us to examine the intermediate results (the object files). For another, most build systems for larger C/C++ codebases already compile and link in separate steps, so this is closer to what we’ll encounter when we apply our learnings to a larger build system towards the end of this article.

Here’s how we separate the two steps:

$ g++ -g -c hello.cpp
$ g++ hello.o -o hello.with-g.2

We can confirm that nothing has changed with our executable:

$ ls -l hello.*
-rw-r--r-- 1 jherland users    97 Jan  1 00:00 hello.cpp
-rwxr-xr-x 1 jherland users 16304 Jan  1 00:00 hello.default
-rw-r--r-- 1 jherland users 29744 Jan  1 00:00 hello.o
-rwxr-xr-x 1 jherland users 38784 Jan  1 00:00 hello.with-g
-rwxr-xr-x 1 jherland users 38784 Jan  1 00:00 hello.with-g.2
$ diff --report-identical-files hello.with-g hello.with-g.2
Files hello.with-g and hello.with-g.2 are identical

In addition, we have this intermediate hello.o file. More about that, soon.

At this point we can start to play around with different compiler and linker options.

Using the gold linker

In fact, let’s start with trying a different linker altogether. The default linker used by GCC is the BFD linker, and that is what we’ve used so far. Let’s try the more recent gold linker instead. There are multiple reasons for this switch:

gold is faster and generates smaller executables than BFD.
gold supports some options that we’re going to need later on.
gold is a popular choice in many build systems, including Bazel.

Let’s re-run the above commands, but using gold:

$ rm hello.with-g.2
$ g++ -fuse-ld=gold hello.cpp -o hello.default
$ g++ -g -c hello.cpp
$ g++ -fuse-ld=gold hello.o -o hello.with-g

What about other linkers?

In this experiment we could have opted for an even more modern linker, like lld from the LLVM project or the more recent mold linker. The choice, however, is often dictated by the context of your project. For example, if you’re working in an embedded setting, the more recent linkers are often not available for the cross-building toolchains that are used.

What are debug symbols, really?

OK, with that out of the way, let’s dive into what our executables look like with and without debug symbols. First let’s have a look at the relative sizes of our files:

$ ls -l hello.*
-rw-r--r-- 1 jherland users    97 Jan  1 00:00 hello.cpp
-rwxr-xr-x 1 jherland users  8280 Jan  1 00:00 hello.default
-rw-r--r-- 1 jherland users 29744 Jan  1 00:00 hello.o
-rwxr-xr-x 1 jherland users 31560 Jan  1 00:00 hello.with-g

We can see that the debug symbols add an extra (31560 - 8280 =) 23280 bytes (or almost 300%) to the final executable. Comparing the output of readelf --sections --wide between the two executables (to list the ELF sections inside), we can see some extra ELF sections in the latter:²

+  [28] .debug_info       PROGBITS        0000000000000000 001023 002c66 00      0   0  1
+  [29] .debug_abbrev     PROGBITS        0000000000000000 003c89 0007b9 00      0   0  1
+  [30] .debug_loclists   PROGBITS        0000000000000000 004442 00010a 00      0   0  1
+  [31] .debug_aranges    PROGBITS        0000000000000000 00454c 000050 00      0   0  1
+  [32] .debug_rnglists   PROGBITS        0000000000000000 00459c 00007f 00      0   0  1
+  [33] .debug_line       PROGBITS        0000000000000000 00461b 000242 00      0   0  1
+  [34] .debug_str        PROGBITS        0000000000000000 00485d 001b62 01  MS  0   0  1
+  [35] .debug_line_str   PROGBITS        0000000000000000 0063bf 0004e1 01  MS  0   0  1

The other sections in the executable appear to be unchanged.

So, the debug symbols are part of the executable file, but they are not part of the actual machine code that is being executed (as that resides in other ELF sections).

You can drill further into the contents of these sections with commands like readelf --debug-dump, but the above will suffice for our analysis here.

What is the problem with debug symbols?

They take up space. And therefore also time. Both build time and run time.

In the above toy example, the effects are also toy-sized, but as we’ll see later, the giant proportions of debug symbols relative to executable code remain when we scale up to real-world projects. Furthermore this extra space is consumed both in the executable and the intermediate object files. In other words, turning on debug symbols can make your build artifacts take up orders of magnitude more disk space compared to a build without debug symbols.

Since these debug symbols are embedded within the intermediate object files, the tools that interact with these files have to process the debug symbols too: the compiler has to generate them in the first place, and the linker has to copy these sections into the final executable. Then, typically, the final packaging steps of the build process need to package these larger executables. All of these steps take extra time, because everything is so much bigger.

This extra space and time is all wasted as long as the debug symbols are not actually used.

In embedded projects where the final executable often has to run on a hardware-constrained device, the extra space taken up by debug symbols can also be the difference between something that will fit on the device and run successfully, and something that simply won’t.

Stripped executables

Many projects strip their final executables before deploying them. Stripping removes the debug symbols from the executable, but it also removes more than that. Returning to our toy example:

$ strip hello.with-g -o hello.stripped
$ ls -l hello.*
-rw-r--r-- 1 jherland users    97 Jan  1 00:00 hello.cpp
-rwxr-xr-x 1 jherland users  8280 Jan  1 00:00 hello.default
-rw-r--r-- 1 jherland users 29744 Jan  1 00:00 hello.o
-rwxr-xr-x 1 jherland users  6368 Jan  1 00:00 hello.stripped
-rwxr-xr-x 1 jherland users 31560 Jan  1 00:00 hello.with-g

With strip we are able to remove not only the debug symbols, but also an additional 1912 bytes (or ~23%) from the original executable (i.e. compared to hello.default). What is removed? Again, comparing the output of readelf --sections --wide on hello.default vs hello.stripped, we see that the following sections disappear:

-  [29] .symtab           SYMTAB          0000000000000000 001040 000408 18     30  21  8
-  [30] .strtab           STRTAB          0000000000000000 001448 0002df 00      0   0  1

Similar to debug symbols these sections are not necessary for the execution of the program per se, but they do provide the most basic of symbol lookup (e.g. what function name is located at which address).³ Without these sections, any kind of debugging now becomes very hard indeed:

$ gdb ./hello.stripped
[...]
Reading symbols from ./hello.stripped...
(No debugging symbols found in ./hello.stripped)
(gdb) br main
Function "main" not defined.
Make breakpoint pending on future shared library load? (y or [n]) n
(gdb) br *0x4009e0
Breakpoint 1 at 0x4009e0
(gdb) run
Starting program: /home/jherland/code/debug_fission_experiment/hello.stripped
[...]
Breakpoint 1, 0x00000000004009e0 in ?? ()
(gdb) list
No symbol table is loaded.  Use the "file" command.

Where our initial executable (compiled with default options) was at least able to understand where the main function was located, this stripped executable offers absolutely no help whatsoever, and we’re forced to interact via raw memory addresses. 😱

In other words, running a stripped executable under a debugger is not a very friendly experience. Not only are we missing correspondences between the machine code and the source code, but we’re even missing the most fundamental of symbol information.

Together this provides the top two reasons for why someone would like to strip their executables:

To really make the executable as small and fast as possible.
To intentionally make it harder to run the executable in a debugger, especially if developed in a proprietary setting and/or deployed into a potentially hostile environment where one wants to make reverse-engineering more difficult.

Stripped and unstripped, the best of both worlds?

A common practice in many C/C++ projects is to build everything with debug symbols, and then add a final build step that strips all the executables. Thus we can provide two versions of all executables: one stripped that is small and fast, and an unstripped version that contains all the debugging comforts.

When you need to debug the stripped executable, you can have gdb look at the corresponding unstripped executable to find all the debug symbols. Here, the unstripped executable is not actually executed itself, instead we execute the stripped executable and merely use the unstripped executable as a source of debug symbols:

$ gdb ./hello.stripped
[...]
Reading symbols from ./hello.stripped...
(No debugging symbols found in ./hello.stripped)
(gdb) symbol-file ./hello.with-g
Reading symbols from ./hello.with-g...
(gdb) br main
Breakpoint 1 at 0x4009e0: file hello.cpp, line 4.
(gdb) run
Starting program: /home/jherland/code/debug_fission_experiment/hello.stripped
[...]
Breakpoint 1, main () at hello.cpp:4
4           std::cout << "Hello, world!" << std::endl;
(gdb) list
1       #include <iostream>
2
3       int main() {
4           std::cout << "Hello, world!" << std::endl;
5           return 0;
6       }

Success! 😎

However, your build process might not be so happy with this: Depending on the shape of the final build product (e.g. a single archive or image containing multiple executables), you might end up generating a giant archive full of unstripped executables that then needs to be unpacked, stripped, and finally re-packaged as a corresponding archive of stripped executables.

Worse, when all your executables go into this archive, these expensive pack/unpack/strip/repack steps end up depending on all the executables in your project! The result is that no matter how small the change you make to one of your executables, your build system ends up having to redo the pack/unpack/strip/repack dance for almost every build.

You could mitigate this by moving the stripping to an earlier stage of the build: for example, right after you generate an (unstripped) executable, you could then strip this executable and carry both the stripped and unstripped executables forward into the final build steps. This is certainly much better, but you still haven’t addressed the extra build time incurred by the linker having to process these debug symbols in the first place.

Approaching debug fission

After the previous section, there are two burning questions that should remain:

If we never actually execute the unstripped executable, does it actually need to contain any code at all? Can we remove the executable code from it and retain only the debug information?
Can we push the splitting of the executable to an even earlier step in the build graph?

In essence, can we have the compiler and/or linker generate two separate output files? One with already-stripped executable code, and another with the debug information only?

Let’s tackle the first question first:

Making a file containing only debug information

We can use the --only-keep-debug option to strip (or objcopy) to convert an unstripped executable into a (non-executable ELF) file that contains only the debug-related sections from the original ELF executable:

$ strip --only-keep-debug hello.with-g -o hello.debug
$ ls -l hello.*
-rw-r--r-- 1 jherland users    97 Jan  1 00:00 hello.cpp
-rwxr-xr-x 1 jherland users 28184 Jan  1 00:00 hello.debug
-rwxr-xr-x 1 jherland users  8280 Jan  1 00:00 hello.default
-rw-r--r-- 1 jherland users 29744 Jan  1 00:00 hello.o
-rwxr-xr-x 1 jherland users  6368 Jan  1 00:00 hello.stripped
-rwxr-xr-x 1 jherland users 31560 Jan  1 00:00 hello.with-g

The new file is 28184 bytes, and although it appears to be an executable ELF file, it surely cannot be executed:

$ ./hello.debug
bash: ./hello.debug: cannot execute binary file: Exec format error

If we compare the output of readelf --sections --wide from hello.with-g with that of hello.debug, we see that most of the ELF sections have been removed: even though the section headers are still there, their type has been changed into NOBITS, and the actual section data is gone.

The only non-empty ELF sections that remain are:

  [ 2] .note.gnu.property     NOTE            00000000004002c8 0002c8 000030 00   A  0   0  8
  [ 3] .note.ABI-tag          NOTE            00000000004002f8 0002f8 000020 00   A  0   0  4
[...]
  [27] .comment               PROGBITS        0000000000000000 000318 000013 01  MS  0   0  1
  [28] .debug_info            PROGBITS        0000000000000000 00032b 002c66 00      0   0  1
  [29] .debug_abbrev          PROGBITS        0000000000000000 002f91 0007b9 00      0   0  1
  [30] .debug_loclists        PROGBITS        0000000000000000 00374a 00010a 00      0   0  1
  [31] .debug_aranges         PROGBITS        0000000000000000 003854 000050 00      0   0  1
  [32] .debug_rnglists        PROGBITS        0000000000000000 0038a4 00007f 00      0   0  1
  [33] .debug_line            PROGBITS        0000000000000000 003923 000242 00      0   0  1
  [34] .debug_str             PROGBITS        0000000000000000 003b65 001b62 01  MS  0   0  1
  [35] .debug_line_str        PROGBITS        0000000000000000 0056c7 0004e1 01  MS  0   0  1
  [36] .note.gnu.gold-version NOTE            0000000000000000 005ba8 00001c 00      0   0  4
  [37] .symtab                SYMTAB          0000000000000000 005bc8 000408 18     38  21  8
  [38] .strtab                STRTAB          0000000000000000 005fd0 0002ab 00      0   0  1
  [39] .shstrtab              STRTAB          0000000000000000 00627b 00019a 00      0   0  1

These correspond almost directly to the sections that we stripped earlier.

Furthermore, this new, somewhat smaller file can be used directly with gdb:

$ gdb ./hello.stripped
[...]
Reading symbols from ./hello.stripped...
(No debugging symbols found in ./hello.stripped)
(gdb) symbol-file ./hello.debug
Reading symbols from ./hello.debug...
(gdb) br main
Breakpoint 1 at 0x4009e0: file hello.cpp, line 4.
(gdb) run
Starting program: /home/jherland/code/debug_fission_experiment/hello.stripped
[...]
Breakpoint 1, main () at hello.cpp:4
4           std::cout << "Hello, world!" << std::endl;
(gdb) list
1       #include <iostream>
2
3       int main() {
4           std::cout << "Hello, world!" << std::endl;
5           return 0;
6       }

Connecting the stripped executable to the debug file

It is still annoying to have to tell gdb exactly where to find the debug symbols with symbol-file ./hello.debug. Fortunately, there are a couple of tools available to help us connect an executable to its debug file.

Using `--add-gnu-debuglink` on the stripped executable

objcopy has a --add-gnu-debuglink option that allows us to reconnect the stripped executable to its debug symbols in that other file:

$ objcopy --add-gnu-debuglink=hello.debug hello.stripped hello.stripped.debuglink
$ ls -l hello.stripped*
-rwxr-xr-x 1 jherland users 6368 Jan  1 00:00 hello.stripped
-rwxr-xr-x 1 jherland users 6464 Jan  1 00:00 hello.stripped.debuglink

We can now debug the stripped executable directly!

$ gdb ./hello.stripped.debuglink
[...]
Reading symbols from ./hello.stripped.debuglink...
Reading symbols from /home/jherland/code/debug_fission_experiment/hello.debug...
(gdb) br main
Breakpoint 1 at 0x4009e0: file hello.cpp, line 4.
(gdb) run
Starting program: /home/jherland/code/debug_fission_experiment/hello.stripped.debuglink
[...]
Breakpoint 1, main () at hello.cpp:4
4           std::cout << "Hello, world!" << std::endl;
(gdb) list
1       #include <iostream>
2
3       int main() {
4           std::cout << "Hello, world!" << std::endl;
5           return 0;
6       }

As expected, comparing readelf --sections --wide on ./hello.stripped.debuglink and ./hello.stripped confirms that a single section (.gnu_debuglink) has been added, and comparing the file sizes we see that this costs a modest 96 bytes.

Linking with `--build-id`

The linker option --build-id (reachable with -Wl,--build-id from the GCC command line) will embed a Build ID (specifically, a .note.gnu.build-id ELF section) into the linked executable. This build ID is a unique identifier for the built files: “the ID remains the same across multiple builds of the same build tree”.⁴ This section will be copied with --only-keep-debug, and it will survive stripping, so that — following the steps outlined in the previous sections — you will end up with a stripped executable and the debug symbols in a separate file, but both files will share the same Build ID.

When it comes time to debug, the debug file should be placed in a special place (one that corresponds to the debug-file-directory setting in gdb, with a name that is derived from the Build ID itself), and gdb will then be able to automatically find and use it. Here is a (contrived) example of creating a stripped executable and its debug file, both with the same Build ID, and then putting the debug file where it will be automatically picked up by gdb (after setting debug-file-directory):

$ g++ -fuse-ld=gold -Wl,--build-id hello.o -o hello.buildid
$ strip --only-keep-debug hello.buildid -o hello.buildid.debug
$ strip hello.buildid -o hello.buildid.stripped
$ readelf -a  hello.buildid.stripped | grep "Build ID"
    Build ID: 8b7c6e6c56287a2959b0fe41996fcb7876c6bf98
$ mkdir -p .build-id/8b
$ cp hello.buildid.debug .build-id/8b/7c6e6c56287a2959b0fe41996fcb7876c6bf98.debug
$ gdb
[...]
(gdb) set debug-file-directory .
(gdb) file hello.buildid.stripped
Reading symbols from hello.buildid.stripped...
Reading symbols from /home/jherland/code/debug_fission_experiment/.build-id/8b/7c6e6c56287a2959b0fe41996fcb7876c6bf98.debug...
(gdb) br main
Breakpoint 1 at 0x400a00: file hello.cpp, line 4.
(gdb) run
Starting program: /home/jherland/code/debug_fission_experiment/hello.buildid.stripped
[...]
Breakpoint 1, main () at hello.cpp:4
4           std::cout << "Hello, world!" << std::endl;
(gdb) list
1       #include <iostream>
2
3       int main() {
4           std::cout << "Hello, world!" << std::endl;
5           return 0;
6       }

Since this scheme requires configuration of gdb and/or control over system-wide paths like /usr/lib/debug, we’ll leave it alone for the rest of this exploration. Still, depending on your development/debugging scenario, this might be a better way to organize the lookup of debug symbols. For example, if your build was performed on some CI infrastructure, and you’re now trying to remotely debug a hardware device running a stripped executable, it would be awfully nice if the CI had already arranged for the debug symbols to be placed somewhere your gdb instance could find them. For situations like these, it’s worth looking into debuginfod for serving debug symbols to gdb. That, however, is outside the scope of what we’re looking at here.
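
The naming rule used in the example above (first two hex digits as a directory, the rest as the file name) is easy to script. Here is a minimal sketch; the function name build_id_debug_path is my own, and the Build ID is the one from the readelf output above.

```shell
#!/bin/sh
# Sketch: derive the debug file path for a given Build ID, relative to
# a debug-file-directory. The function name is illustrative.
set -eu

build_id_debug_path() {
    id=$1
    prefix=$(printf '%s' "$id" | cut -c1-2)   # first two hex digits
    rest=$(printf '%s' "$id" | cut -c3-)      # remaining hex digits
    printf '.build-id/%s/%s.debug\n' "$prefix" "$rest"
}

# Build ID taken from the readelf output in the example above.
build_id_debug_path 8b7c6e6c56287a2959b0fe41996fcb7876c6bf98
# prints: .build-id/8b/7c6e6c56287a2959b0fe41996fcb7876c6bf98.debug
```

For a real executable, the ID itself can be pulled out with something like readelf --notes FILE | awk '/Build ID:/ { print $3 }'.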

Summary so far

So at this point we have:

A stripped executable of 6464 bytes with a debuglink to:
A separate 28184 byte file with debug information

Together, these two files replace the 31560 byte unstripped executable.

Compared to keeping both the stripped and unstripped executables, this provides a space saving of 3376 bytes (or 9%). And compared to the original unstripped executable, the size increase is modest, at 3088 bytes (or 10%).
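
For the record, here is the arithmetic behind those figures as a small shell sketch. I am assuming that the "stripped" size in the first comparison is the 6464-byte executable with the debuglink, since that is what reproduces the numbers; the variable names and the rounding helper pct are illustrative.

```shell
#!/bin/sh
# Worked arithmetic for the sizes listed above (bytes, from the ls output).
set -eu

stripped_with_link=6464
debug_file=28184
unstripped=31560

# Integer percentage of $1 relative to $2, rounded to nearest.
pct() { echo $(( ($1 * 100 + $2 / 2) / $2 )); }

split_total=$((stripped_with_link + debug_file))   # 34648
both_total=$((stripped_with_link + unstripped))    # 38024

saving=$((both_total - split_total))
increase=$((split_total - unstripped))

echo "saving vs. stripped + unstripped: $saving bytes ($(pct "$saving" "$both_total")%)"
echo "increase vs. unstripped alone: $increase bytes ($(pct "$increase" "$unstripped")%)"
# prints:
#   saving vs. stripped + unstripped: 3376 bytes (9%)
#   increase vs. unstripped alone: 3088 bytes (10%)
```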

In larger projects, the numbers vary of course, but these are savings that are often worth pursuing.

Furthermore, the only tools we have used so far are g++, objcopy, and strip, with which we have been able to achieve what we want without relying on any “modern” toolchain features! 😎

What remains at this point is to examine if we can make this scale to work with larger executables. That is, are we able to:

split off the debug symbols into a separate file already at the compilation stage?
link together the debug symbols for each object file into bigger “packages” of debug symbols for the entire executable?
(re)establish a link from the final executable to this debug package so that gdb is able to seamlessly debug a stripped executable when it is accompanied by its corresponding debug package?

Debug fission, for real

Splitting debug symbols into a `.dwo` file with `-gsplit-dwarf`

The -gsplit-dwarf option is at the core of the debug fission concept. It instructs the compiler to place debug symbols into a separate .dwo file. Let’s try it:

$ g++ -g -gsplit-dwarf -c hello.cpp -o hello.split.o
$ ls -l *o
-rw-r--r-- 1 jherland users 29744 Jan  1 00:00 hello.o
-rw-r--r-- 1 jherland users 20328 Jan  1 00:00 hello.split.dwo
-rw-r--r-- 1 jherland users 10640 Jan  1 00:00 hello.split.o

We have two new files: a new .o file and a corresponding .dwo file. Looking at the file sizes, it seems that around two thirds of the data in hello.o has been moved into hello.split.dwo, and one third remains in hello.split.o. The overhead introduced is a modest 1224 bytes (or 4% of the original object file).

If we try to look closer at the actual debug information, it is clear that the debug information now is split between hello.split.o and hello.split.dwo:

$ readelf --debug-dump hello.split.o
The .debug_info section contains link(s) to dwo file(s):

  Name:      hello.split.dwo
  Directory: /home/jherland/code/debug_fission_experiment


hello.split.o: Found separate debug object file: /home/jherland/code/debug_fission_experiment/hello.split.dwo

Contents of the .debug_addr section (loaded from hello.split.o):
[...]
Contents of the .debug_info section (loaded from hello.split.o):
[...]
[9 more sections loaded from hello.split.o...]
Contents of the .debug_info.dwo section (loaded from /home/jherland/code/debug_fission_experiment/hello.split.dwo):
[...]
Contents of the .debug_abbrev.dwo section (loaded from /home/jherland/code/debug_fission_experiment/hello.split.dwo):
[...]
[5 more sections loaded from hello.split.dwo...]

To summarize, -gsplit-dwarf has moved most (but not all) of the debug information from the .o file into a separate .dwo, and replaced it with a reference that links the .o file to its .dwo counterpart to allow tools to access the debug information there.

When it comes time to link the final executable, the linker now has to handle an object file that is only one third of the original size. This results in a smaller executable, and considerably faster link times. For our toy example, these things do not matter, but for a larger code base (although the exact numbers and relative sizes will surely vary) this can make a very significant difference in build times, especially for incremental builds that nonetheless depend on executables being linked from scratch.

Speaking of linking…

Linking an executable after compiling with `-gsplit-dwarf`

Now let’s link together an unstripped executable using the new hello.split.o file we created in the previous section:

$ g++ -fuse-ld=gold hello.split.o -o hello.split
$ ls -l hello.split hello.with-g
-rwxr-xr-x 1 jherland users 19984 Jan  1 00:00 hello.split
-rwxr-xr-x 1 jherland users 31560 Jan  1 00:00 hello.with-g

This new hello.split executable is equivalent to the previous hello.with-g, the only difference being that it is based on an object file that was built with -gsplit-dwarf. Indeed, running readelf --debug-dump on the split executable shows the same results as we got for the corresponding .o files above:

$ readelf --debug-dump hello.split
The .debug_info section contains link(s) to dwo file(s):

  Name:      hello.split.dwo
  Directory: /home/jherland/code/debug_fission_experiment


hello.split: Found separate debug object file: /home/jherland/code/debug_fission_experiment/hello.split.dwo
[...]

The important thing to note here is that the .dwo references are carried forward into the final executable by the linker.

At this point it’s also worthwhile to compare the ELF sections inside this new, split executable against the original executable (with debug symbols). Again, we pull out readelf --sections --wide and compare its output on hello.with-g to the corresponding output on hello.split. Here is a quick summary of the differences (section sizes are in hexadecimal, as reported by readelf):

ELF section name	`hello.with-g` size	`hello.split` size	diff
`.debug_addr`	not present	120	+120
`.debug_info`	2c66	31	-2c35
`.debug_abbrev`	7b9	15	-7a4
`.debug_loclists`	10a	not present	-10a
`.debug_gnu_pubnames`	not present	10e0	+10e0
`.debug_gnu_pubtypes`	not present	fdd	+fdd
`.debug_aranges`	50	50	0
`.debug_rnglists`	7f	17	-68
`.debug_line`	242	264	+22
`.debug_str`	1b62	3d	-1b25
`.debug_line_str`	4e1	577	+96
Sum section sizes	587d	2aa2	-2ddb

So even though most of the ELF sections are the same, we find that 0x2ddb (= 11739) bytes of debug information has moved into the .dwo file. (This corresponds roughly with the directory listing above which shows the split executable being 11576 bytes smaller than the original executable with debug symbols embedded).
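
The sums in the last row of the table can be double-checked with plain shell arithmetic, which accepts hexadecimal constants directly. This is only a sanity check of the numbers above; the variable names are illustrative.

```shell
#!/bin/sh
# Worked arithmetic: sum the hexadecimal section sizes from the table above.
set -eu

with_g=$(( 0x2c66 + 0x7b9 + 0x10a + 0x50 + 0x7f + 0x242 + 0x1b62 + 0x4e1 ))
split=$(( 0x120 + 0x31 + 0x15 + 0x10e0 + 0xfdd + 0x50 + 0x17 + 0x264 + 0x3d + 0x577 ))
moved=$(( with_g - split ))

printf 'hello.with-g debug sections: 0x%x (%d bytes)\n' "$with_g" "$with_g"
printf 'hello.split debug sections:  0x%x (%d bytes)\n' "$split" "$split"
printf 'difference:                  0x%x (%d bytes)\n' "$moved" "$moved"
# prints:
#   hello.with-g debug sections: 0x587d (22653 bytes)
#   hello.split debug sections:  0x2aa2 (10914 bytes)
#   difference:                  0x2ddb (11739 bytes)
```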

Another useful link option: `--gdb-index`

An option that is typically also mentioned when we talk about debug fission is the --gdb-index linker option (or -Wl,--gdb-index via the usual compiler wrapper). Although it’s not easy to find good documentation on this option,⁵ its main objective seems to be to speed up GDB when loading the executable and its symbols for debugging, in effect trading link time for debugging time, perhaps at a minor cost in executable size.

Let’s try it:

$ g++ -fuse-ld=gold -Wl,--gdb-index hello.split.o -o hello.split.gdbindex
$ ls -l hello.split hello.split.gdbindex hello.with-g hello.stripped
-rwxr-xr-x 1 jherland users 19984 Jan  1 00:00 hello.split
-rwxr-xr-x 1 jherland users 11377 Jan  1 00:00 hello.split.gdbindex
-rwxr-xr-x 1 jherland users  6368 Jan  1 00:00 hello.stripped
-rwxr-xr-x 1 jherland users 31560 Jan  1 00:00 hello.with-g

Wow, we saved 8607 bytes by adding this option. What happened to the debug sections? Three sections (.debug_gnu_pubnames, .debug_gnu_pubtypes, and .debug_aranges), totalling 8461 bytes, were replaced with one .gdb_index section of only 25 bytes.

Furthermore, if we compare the size of this new executable against the hello.stripped executable, we see that although we are still quite a bit off, we have taken off more than half of the overhead from our .split executable.

Can we further strip this executable?

As we did previously, we could try to strip this executable:

$ strip hello.split.gdbindex -o hello.stripped.again
$ ls -l hello.split.gdbindex hello.stripped hello.stripped.again
-rwxr-xr-x 1 jherland users 11377 Jan  1 00:00 hello.split.gdbindex
-rwxr-xr-x 1 jherland users  6368 Jan  1 00:00 hello.stripped
-rwxr-xr-x 1 jherland users  6368 Jan  1 00:00 hello.stripped.again
$ diff --report-identical-files hello.stripped hello.stripped.again
Files hello.stripped and hello.stripped.again are identical

However, we seem to have thrown the baby out with the bath water: there is no debug information at all in hello.stripped.again, not even any references to the .dwo file. It is in fact identical to the hello.stripped we created in an earlier section. This is further confirmed by gdb:

$ gdb hello.stripped.again
GNU gdb (GDB) 13.1
[...]
Reading symbols from hello.stripped.again...
(No debugging symbols found in hello.stripped.again)
(gdb) symbol-file hello.split.dwo
Reading symbols from hello.split.dwo...
(No debugging symbols found in hello.split.dwo)
(gdb) br main
No symbol table is loaded.  Use the "file" command.
Make breakpoint pending on future shared library load? (y or [n]) n

We can’t even use symbol-file to tell gdb to look up symbols in the .dwo file. Thus it seems that the -gsplit-dwarf mechanism relies on putting debug information into both files, and that the .dwo file is not at all useful if we strip the corresponding executable.

What about the `.dwo` files?

So we now have a linked executable with references to the .dwo file(s) that were created in the -gsplit-dwarf compilation step. For a bigger application spread across many source files, this will amount to a lot of .dwo files. If you want to debug the application at a later date, you must make sure to have all these .dwo files available at that point in time. So, do you need to painstakingly collect .dwo files into a tar archive that accompanies your executable, and that needs to be unpacked before each debugging session?

Fear not: In the same way that the linker takes a collection of .o files and produces an executable, you can regard the dwp tool as a linker for debug information: It takes a collection of .dwo files and produces a single .dwp (short for “DWARF package”) file that contains all the debug symbols needed to debug the final executable.

Compiling .dwo files into .dwp packages can be done directly, by passing each .dwo on the dwp command line:

$ dwp -o hello.split.dwp hello.split.dwo
$ ls -l hello.split.dw*
-rw-r--r-- 1 jherland users 20328 Jan  1 00:00 hello.split.dwo
-rw-r--r-- 1 jherland users 57416 Jan  1 00:00 hello.split.dwp

The easier option, however, is probably to use the --exec option to have dwp look at the executable itself, to automatically find all the .dwo files referenced and then “link” them into a .dwp package that corresponds to the name of the executable:

$ dwp --exec hello.split.gdbindex
$ ls -l hello.split*dw*
-rw-r--r-- 1 jherland users 20328 Jan  1 00:00 hello.split.dwo
-rw-r--r-- 1 jherland users 57416 Jan  1 00:00 hello.split.dwp
-rw-r--r-- 1 jherland users 57416 Jan  1 00:00 hello.split.gdbindex.dwp
$ diff --report-identical-files hello.split.dwp hello.split.gdbindex.dwp
Files hello.split.dwp and hello.split.gdbindex.dwp are identical

(Note that I’m not sure why the .dwp file here ends up so much larger than the .dwo file it is based on. On a different (older) toolchain version the size difference is much smaller, almost negligible. I suspect this is due to some constant-size overhead, and that it will disappear in the noise when scaled up to much larger executables.)

In any case, with the .dwp file created, we can remove the .dwo file(s) and still successfully access the debug symbols via the .dwp file:

$ rm *.dwo
$ gdb hello.split.gdbindex
[...]
Reading symbols from hello.split.gdbindex...
(gdb) br main
Breakpoint 1 at 0x400a00: file hello.cpp, line 4.
(gdb) run
Starting program: /home/jherland/code/debug_fission_experiment/hello.split.gdbindex
[...]
Breakpoint 1, main () at hello.cpp:4
4           std::cout << "Hello, World!" << std::endl;
(gdb) list
1       #include <iostream>
2
3       int main() {
4           std::cout << "Hello, World!" << std::endl;
5           return 0;
6       }

Summary of debug fission

So, at last, we have arrived at debug fission: we now have a debuggable executable that is only slightly larger than a stripped executable, and an accompanying .dwp package of debug symbols. The executable can be distributed/deployed on its own, and as long as the corresponding .dwp file is available when you need to debug, all the debug symbols will automatically be available to you. Let’s recap:

The compiler produces two output files from each compilation step: a .o object file without debug symbols, and a .dwo file containing the debug information.
The .o file carries a reference to the corresponding .dwo file. This reference is carried forward by the linker into the final executable.
The .dwo files can also be “linked” together into a .dwp “package” that carries all the debug symbols for the associated executable.
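GDB knows how to look up debug symbols in both .dwp and .dwo files, so in the end we need to make either available to GDB, along with the final executable. Note that this executable must not be stripped: as we saw above, GDB relies on the debug information that remains in the executable to make use of the .dwo/.dwp files.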
Linking with the -Wl,--gdb-index option allows further debugging optimizations to be precomputed into the final executable, and will make debugging considerably faster.
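
To tie the recap together, here is a rough shell sketch of the three steps for the toy example. The `run` helper, the `DRY_RUN` variable and the `build_with_fission` function are hypothetical names introduced for illustration, and `-g` is passed explicitly on the assumption that the compiler in use does not enable debug information from `-gsplit-dwarf` alone:

```shell
# Print each command; execute it unless DRY_RUN is set (illustrative helper).
run() {
    printf '+ %s\n' "$*"
    [ -n "${DRY_RUN:-}" ] || "$@"
}

# build_with_fission SOURCE OUTPUT: compile, link and package with debug fission.
build_with_fission() {
    src=$1
    exe=$2
    # 1. Compile: emits $exe.o plus a matching .dwo file with the debug info.
    run g++ -g -gsplit-dwarf -c "$src" -o "$exe.o" || return 1
    # 2. Link with gold and precompute the .gdb_index section.
    run g++ -fuse-ld=gold -Wl,--gdb-index "$exe.o" -o "$exe" || return 1
    # 3. "Link" the referenced .dwo files into a single .dwp package.
    run dwp --exec "$exe" -o "$exe.dwp"
}
```

Setting `DRY_RUN=1` before calling `build_with_fission hello.cpp hello` only prints the three command lines, which is a cheap way to review the flags before running them against a real toolchain.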

Integration into larger build systems

So far, we’ve looked at the basics, invoking the compiler/linker manually at each step, but this is not how most software is built. Let’s look at enabling debug fission in a couple of popular build systems.

Case study: CMake

CMake currently does not support debug fission (aka. “split dwarf”) natively. Still, we’re not going to give up that easily.⁶

First steps

First, let’s try to wrap our toy example in a simple CMake project:

$ cat CMakeLists.txt
cmake_minimum_required(VERSION 3.25)
project(debug_fission_experiment)
add_executable(hello hello.cpp)
set(CMAKE_VERBOSE_MAKEFILE on)
$ cmake -S . -B cmake_default
[...]
$ cmake --build cmake_default
[...]
[ 50%] Building CXX object CMakeFiles/hello.dir/hello.cpp.o
[...]/g++ [...] -o CMakeFiles/hello.dir/hello.cpp.o -c /home/jherland/code/debug_fission_experiment/hello.cpp
[100%] Linking CXX executable hello
[...]g++ CMakeFiles/hello.dir/hello.cpp.o -o hello
[100%] Built target hello
[...]
$ ls -l cmake_default/hello
-rwxr-xr-x 1 jherland users 16304 Jan  1 00:00 cmake_default/hello

We see that CMake by default creates an executable without debug symbols, and one that is indeed identical to the very first build we did in this entire saga:

$ g++ hello.cpp -o cmake_default/hello.compare
$ diff --report-identical-files cmake_default/hello cmake_default/hello.compare
Files cmake_default/hello and cmake_default/hello.compare are identical

From this we can deduce that:

CMake defaults to building executables without debug symbols
CMake uses the toolchain’s default linker (BFD, not gold)

Let’s fix both of those.

Debug builds with the `gold` linker

As far as I can see, there’s no built-in mechanism in CMake to choose the linker to be used, but all we really need to do is to add -fuse-ld=gold to the command line used when linking the executable.

Conversely, although we could “simply” add -g to the compiler command line, CMake instead provides the CMAKE_BUILD_TYPE variable to specify what kind of build we want. The following alternatives are available by default: Debug, Release, RelWithDebInfo and MinSizeRel, and you can see what they entail in terms of compiler flags here. We’ll stick with Debug for this exploration.

We encode these choices into our CMakeLists.txt by adding these lines:

add_link_options(-fuse-ld=gold)
set(CMAKE_BUILD_TYPE Debug)

Setting compiler/linker flags to achieve debug fission

We can use the various CMAKE_*_FLAGS variables to directly pass the debug fission options to the compiler and linker command lines run by CMake:

set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -gsplit-dwarf")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -gsplit-dwarf")
set(CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -Wl,--gdb-index")
set(CMAKE_SHARED_LINKER_FLAGS "${CMAKE_SHARED_LINKER_FLAGS} -Wl,--gdb-index")

Producing the `.dwp` debug package

However, the above compiler/linker flags do not get us all the way there: We still need to tell CMake how to assemble the .dwo files into a .dwp package. That is, for our toy example, we need to run:

$ dwp -o cmake_split/hello.dwp cmake_split/CMakeFiles/hello.dir/hello.cpp.dwo

# or, the more indirect option that goes via .dwo references in the executable:
$ dwp --exec cmake_split/hello -o cmake_split/hello.dwp

(Note that the latter, more indirect, invocation currently fails with a segmentation fault. The instructions below will nonetheless assume that this dwp bug is fixed, and that the indirect invocation will work as advertised.)
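
Until the indirect invocation works, one possible workaround is to hand the .dwo files to dwp explicitly, as in the first command above. The following is only a sketch: `package_dwo` and the `DWP` override variable are hypothetical names, and the function packs every .dwo file under the build directory, which is only appropriate when that directory holds a single executable (as in our toy project). For a very large number of files, `find` may also split the work into several dwp invocations, which would not produce a complete package.

```shell
# package_dwo BUILD_DIR OUT_DWP: pack every .dwo file found under BUILD_DIR.
# The dwp binary can be overridden through the (hypothetical) DWP variable.
package_dwo() {
    build_dir=$1
    out_dwp=$2
    # "{} +" appends all matching files to a single dwp command line.
    find "$build_dir" -type f -name '*.dwo' \
        -exec "${DWP:-dwp}" -o "$out_dwp" {} +
}
```

For the toy project this would be invoked as `package_dwo cmake_split cmake_split/hello.dwp`.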

Furthermore, we want to make a generic CMake rule so that this is done automatically for all executables. Here is a CMake fragment to do just that:

find_program(DWP_TOOL dwp)
function(add_executable target_name)
    # Call the original function
    _add_executable(${target_name} ${ARGN})
    set(out_dwp "${target_name}.dwp")
    add_custom_command(TARGET ${target_name}
        POST_BUILD
        COMMAND ${DWP_TOOL} --exec ${target_name} -o ${out_dwp}
        WORKING_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR}
        COMMENT "Linking debug package ${out_dwp}"
        VERBATIM
        )
endfunction()

This redefines the add_executable() CMake function to also attach the appropriate dwp command as an extra command run immediately after the creation of every executable.

In the end, this is our final CMakeLists.txt for our toy project:

cmake_minimum_required(VERSION 3.25)
project(debug_fission_experiment)
add_link_options(-fuse-ld=gold)
set(CMAKE_BUILD_TYPE Debug)
set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -gsplit-dwarf")
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -gsplit-dwarf")
set(CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS} -Wl,--gdb-index")
set(CMAKE_SHARED_LINKER_FLAGS "${CMAKE_SHARED_LINKER_FLAGS} -Wl,--gdb-index")

find_program(DWP_TOOL dwp)
function(add_executable target_name)
    # Call the original function
    _add_executable(${target_name} ${ARGN})
    set(out_dwp "${target_name}.dwp")
    add_custom_command(TARGET ${target_name}
        POST_BUILD
        COMMAND ${DWP_TOOL} --exec ${target_name} -o ${out_dwp}
        WORKING_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR}
        COMMENT "Linking debug package ${out_dwp}"
        VERBATIM
        )
endfunction()

add_executable(hello hello.cpp)
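
After building with a setup like this, it can be worth checking that every executable actually received its debug package. Here is a minimal sketch of such a check, assuming the executables end up directly in the build directory (as `hello` does here); `missing_dwp` is a hypothetical helper name, not something CMake provides:

```shell
# missing_dwp DIR: list executables directly inside DIR that lack a sibling .dwp.
missing_dwp() {
    dir=$1
    for f in "$dir"/*; do
        # Only regular files with the executable bit set are candidates.
        [ -f "$f" ] && [ -x "$f" ] || continue
        # Skip the debug files themselves.
        case "$f" in *.dwp|*.dwo) continue ;; esac
        [ -f "$f.dwp" ] || echo "$f"
    done
}
```

Running `missing_dwp cmake_split` should print nothing when every executable has a matching .dwp file next to it.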

Case study: Bazel

As opposed to CMake, modern versions of Bazel (since v6) do support debug fission with the --fission option. However, this is conditional on the underlying toolchain configuration advertising the per_object_debug_info toolchain feature. If we assume that is in place, using debug fission is fairly straightforward:

Build an executable target with --fission=yes, either passed via the command line, or suitably encoded in a .bazelrc file: bazel build //path/to:executable --fission=yes
To get the .dwp file corresponding to an executable, append .dwp to the executable target, to trigger its creation: bazel build //path/to:executable.dwp --fission=yes
(Along the same lines, you can instruct Bazel to build a stripped executable by appending .stripped to the executable target name.)
As always when debugging compilation with Bazel: the --subcommands option is very useful to see exactly how Bazel ends up invoking the compiler/linker.
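
One way to encode the option in a .bazelrc file is as a named configuration, so that it can be switched on per invocation. The sketch below appends such an entry; the configuration name `fission` and the helper `add_fission_config` are my own choices for illustration:

```shell
# add_fission_config WORKSPACE_DIR: append a named "fission" config to .bazelrc.
add_fission_config() {
    workspace=$1
    # "build:fission" makes the option apply only when --config=fission is given.
    printf '%s\n' 'build:fission --fission=yes' >> "$workspace/.bazelrc"
}
```

After `add_fission_config .`, the builds above can be written as `bazel build --config=fission //path/to:executable.dwp`.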

That’s it, really, from a naive point of view. Depending on the complexity of your project, you might run into other complications. For example, if your Bazel project uses rules_foreign_cc to drive some other build system for a subset of your build products, then you might have to communicate the debug fission settings to that other build system and, crucially, extract the separate debug symbols out of that build system and back into Bazel.

An example at scale: Building LLVM

Here, we’re going to leave our small experiments behind, and rather look at the potential wins of using debug fission in a larger project. We’ll look at the cost that debug symbols add to a release build, and see how debug fission can mitigate these costs.

For this, we need a larger project where we can compare release builds to builds with debug symbols, with or without debug fission enabled. One such project is LLVM. In addition to building with CMake, LLVM also already provides an LLVM_USE_SPLIT_DWARF option to enable debug fission.⁷

Setting up the LLVM builds

To get numbers suitable for comparison, we’ll choose the Release build type for the build without debug symbols, and RelWithDebInfo (rather than Debug) for the builds with debug symbols.

These are the three separate builds of the LLVM project that we will run:

A build with debug symbols, but no fission enabled. We name this build “debug” in the discussion below, and it is configured like this:

cmake -S llvm -B debug -G Ninja \
      -DLLVM_USE_LINKER=lld \
      -DCMAKE_BUILD_TYPE=RelWithDebInfo

A build with debug symbols, and debug fission enabled. We name this build “fission”, and it is configured like this:

cmake -S llvm -B fission -G Ninja \
      -DLLVM_USE_LINKER=lld \
      -DCMAKE_BUILD_TYPE=RelWithDebInfo \
      -DLLVM_USE_SPLIT_DWARF=ON

A release build with no debug symbols, named “release”. It is configured like this:

cmake -S llvm -B release -G Ninja \
      -DLLVM_USE_LINKER=lld \
      -DCMAKE_BUILD_TYPE=Release

After configuration, each build is performed by running cmake --build $build_dir, and the following installation is done with cmake --install $build_dir.

The numbers

Here are the relevant statistics from running these builds on my laptop. The absolute numbers in this table are not too interesting, but we’ll examine the relative differences below:

Build phase	Debug	Fission	Release
Wall clock time spent building	71m58s	64m36s	51m53s
Total size of `$build_dir`	46.94GB	14.91GB	3.06GB
Total size of `$build_dir/bin`	30.90GB	7.25GB	1.97GB
Total size of `$build_dir/lib`	15.19GB	7.03GB	1.02GB
Number of files in `$build_dir`	4348	7275	4348
Number of `*.o` files in `$build_dir`	2931	2931	2931
Number of `*.a` files in `$build_dir`	192	192	192
Number of `*.dwo` files in `$build_dir`	0	2927	0
Total size of `*.dwo` files	N/A	4.02GB	N/A

Install phase	Debug	Fission	Release
Wall clock time spent installing	5m44s	1m08s	0m02s
Total size of `$install_dir`	35.69GB	8.16GB	2.16GB
Total size of `$install_dir/bin`	27.45GB	6.48GB	1.78GB
Total size of `$install_dir/lib`	8.22GB	1.65GB	0.36GB
Number of files in `$install_dir`	2201	2201	2201
Number of `*.a` files in `$install_dir`	189	189	189

Focus on a single executable: `llvm-ar`	Debug	Fission	Release
Size of `llvm-ar` executable	259MB	84MB	32MB
Number of `*.dwo` referenced by `llvm-ar`	0	530	0
Size of `*.dwo` referenced by `llvm-ar`	N/A	363MB	N/A
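
File counts and sizes like the ones in these tables can be gathered with standard tools. The helpers below are one possible sketch and not necessarily how the numbers above were produced; `count_matching` and `total_bytes` are hypothetical names, and file names containing newlines would throw off the count:

```shell
# count_matching DIR PATTERN: number of regular files under DIR matching PATTERN.
count_matching() {
    find "$1" -type f -name "$2" | wc -l | tr -d ' '
}

# total_bytes DIR PATTERN: combined size in bytes of those files.
total_bytes() {
    find "$1" -type f -name "$2" -exec cat {} + | wc -c | tr -d ' '
}
```

For example, `count_matching "$build_dir" '*.dwo'` counts the .dwo files in a build tree, while `du -sh "$build_dir"` reports the total size of the directory.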

Using the release build as our baseline, the build time increases by 39% for a debug build, but only by 25% when debug fission is enabled. Looking at the size of the build, the differences are much bigger: The debug build needs 15 times the disk space of the release build, but with debug fission, only 5 times the disk space is needed.

Next, let’s add the installation phase into the mix, which in LLVM’s case consists almost exclusively of copying files from the $build_dir. This highlights how the sheer size of build outputs with debug symbols contributes to slowing everything down: The debug build + install is 50% slower than the release build + install, and for debug fission the corresponding slowdown is only 27%.

The same trend is reflected if we focus on a single executable from the many built by LLVM: llvm-ar is 8.1 times bigger in the debug build than in the release build, but with debug fission, this is reduced to 2.6 times. In the debug fission case, the debug information has been moved into a large number of .dwo files that all together (84MB + 363MB) take up more space than the debug executable (259MB), but most of these .dwo files are referenced from (i.e. shared between) several LLVM executables, so the full cost of this debug information is amortized.
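
The relative figures quoted above follow directly from the tables. As a sanity check, here is a small sketch that recomputes some of them with awk; the helper names `to_seconds`, `percent_increase` and `ratio` are made up for this example:

```shell
# to_seconds DURATION: convert a duration written as <min>m<sec>s into seconds.
to_seconds() {
    echo "$1" | awk -F'[ms]' '{ print $1 * 60 + $2 }'
}

# percent_increase BASE VALUE: how much larger VALUE is than BASE, in whole percent.
percent_increase() {
    awk -v base="$1" -v val="$2" \
        'BEGIN { printf "%.0f\n", (val / base - 1) * 100 }'
}

# ratio BASE VALUE: VALUE as a multiple of BASE, with one decimal.
ratio() {
    awk -v base="$1" -v val="$2" 'BEGIN { printf "%.1f\n", val / base }'
}

# Build times from the first table, with the release build as baseline.
release=$(to_seconds 51m53s)
debug=$(to_seconds 71m58s)
fission=$(to_seconds 64m36s)

echo "debug build time:   +$(percent_increase "$release" "$debug")%"    # +39%
echo "fission build time: +$(percent_increase "$release" "$fission")%"  # +25%
echo "debug build size:   $(ratio 3.06 46.94)x"                         # 15.3x
echo "fission build size: $(ratio 3.06 14.91)x"                         # 4.9x
```

The same helpers reproduce the llvm-ar comparison: `ratio 32 259` and `ratio 32 84` give the 8.1 and 2.6 quoted above.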

Summary of our LLVM build comparison

What can we learn from these three LLVM builds?

First, debug symbols take up a lot of space: When you turn on debug symbols (whether you enable debug fission or not), you should expect your build artifacts to become several times larger compared to a stripped release build.

Without debug fission, the large sections of debug symbols are copied from object files into intermediate archives/libraries and then again into the final executables. This duplication wastes a lot of space, hence it also significantly impacts the overall build time.

With debug fission enabled, however, much of this duplication is eliminated, and the reduced waste helps improve build times as well.

Conclusion

This concludes our exploration of how to separate debug symbols from executables in a way that saves both build time and space. Hopefully I have shown that this is possible to achieve in real-world projects, albeit maybe at the cost of some added complexity, especially if your build system is not already equipped with support for debug fission.

So, is debug fission worth it? The answer, of course, depends on your point of view:

If your baseline is a stripped release executable without any debug information, and you “just” wanted to add the ability to debug — but with minimal overhead in the executable itself — then I’m afraid debug fission is no silver bullet.

In terms of final executable size, the absolute least overhead you can achieve is with the --add-gnu-debuglink or --build-id techniques discussed previously. However, this incurs the most overhead in terms of build time: You first need to do a full debug build, then strip the resulting executable to create the release executable.

On the other hand, if you already have a debug build, but struggle with its space/time requirements, then — depending on how well your build system supports it — debug fission could be a very valuable investment.

It all comes down to knowing your own project/codebase and the context in which it is built. Hopefully, in this article, you have at least found some hints on where and how to look for potential savings.

In terms of toolchain requirements for the techniques described here, you need a halfway modern C/C++ toolchain with support for -gsplit-dwarf and a corresponding debugger capable of reading .dwo/.dwp files. You will also need a linker that is more modern than the default BFD linker. In fact, if your project has access to a linker like gold (or even better: lld or mold), then these are a sure win over the BFD linker in any case, both in terms of improving build times, and the size of the final executables. And that is probably true even before you factor in the techniques described above!

Further discussion/links:

Generally, it seems to me that GCC’s wiki pages on debug fission and the DWARF package format should be considered the authoritative documentation on debug fission. Other than those, I consider this to be an under-documented feature in the world of C/C++ compilers/linkers. Hopefully this article can help remedy that.

GDB’s documentation details the use of --only-keep-debug and --add-gnu-debuglink.

At the start, I mentioned two articles that turned me onto this topic in the first place. In addition to those, here are some other interesting resources that I came across while working on this:

Thanks to Christopher Harrison, Mark Karpov, Cheng Shao, and Arnaud Spiwack for their reviews of this article.

¹ A related topic would be that of compressing the debug symbols. This can be done instead of (or in some cases, in addition to) the separation of debug symbols that we discuss here. Debug symbol compression is a topic worthy of its own exploration, and I won’t tackle it here, except for pointing to resources like this, and this, compiler options like GCC’s -gz, linker options like --compress-debug-sections, or the dwz tool.
² According to the ELF specifications: “All section names with the prefix .debug hold information for symbolic debugging. The contents of these sections are unspecified.”
³ Again, the ELF specifications have this to say about these sections: “.symtab holds a symbol table”, and “.strtab holds strings, most commonly the strings that represent the names associated with symbol table entries.”
⁴ According to the GDB documentation.
⁵ GCC’s wiki states: “Use the gold linker’s --gdb-index option (-Wl,--gdb-index when linking with gcc or g++) at link time to create the .gdb_index section that allows GDB to locate and read the .dwo files as it needs them.”
⁶ Note that yours truly does not have much experience with CMake, so please don’t regard the following instructions as authoritative in any way. 😉 This section is inspired by some projects that use CMake, and that also do support debug fission, e.g. the WebKit project has this resolved bug, along with this code in a current version.
⁷ The LLVM_USE_SPLIT_DWARF option does not give us the “full” debug fission experience, as outlined in previous sections: Notably, there is no “linking” of .dwo files into .dwp debug packages, and the .dwo files are also not part of the final installation. That is, you will need access to the original build tree in order to access the debug information. This aspect of debug fission is therefore missing from the numbers presented below.

About the authors

Johan Herland

Johan is a Developer Productivity Engineer at Tweag. Originally from Western Norway, he is currently based in Delft, NL, and enjoys this opportunity to discover the Netherlands and the rest of continental Europe. Johan has almost twenty years of industry experience, mostly working with Linux and open source software within the embedded realm. He has a passion for designing and implementing elegant and useful solutions to challenging problems, and is always looking for underlying root causes to the problems that face software developers today. Outside of work, he enjoys playing jazz piano and cycling.


This article is licensed under a Creative Commons Attribution 4.0 International license.
