Reducing CTF overhead


CTF (Compact C Type Format) encapsulates a reduced form of debugging information similar to DWARF and the venerable stabs. It describes types (structures, unions, typedefs etc.) and function prototypes, and is carefully designed to take a minimum of space in the ELF binaries. The kernel binaries that Sun ship have this data embedded as an ELF section (.SUNW_ctf) so that tools like mdb and dtrace can understand types. Of course, it would have been possible to use existing formats such as DWARF, but they typically have a large space overhead and are more difficult to process.

The CTF data is built from the existing stabs/DWARF data generated by the compiler's -g option, and replaces this existing debugging information in the output binary (ctfconvert performs this job).

For the sake of kmdb and crash dumps, the CTF data for each kernel binary is present in the memory image of a booted kernel. This means it's paramount that the amount of CTF data is minimised. Since each kernel module will have references to common types such as cpu_t, there's a lot of duplicated type data across the CTF sections. To help avoid this duplication, the kernel build uses a process known rather fancifully as 'uniquification'.

Uniquification

Each type in the CTF data has an integer ID associated with it. Observe that the main genunix kernel module has a large number of the common types mentioned above in its CTF data. We can remove the duplicate data found in other modules by replacing their copies of the type data with references to the type data in genunix. This process is uniquification. Consider the bmc driver. After building and linking the bmc object, we want to add CTF for its types, but we also uniquify against the genunix binary, like so:

ctfmerge -L VERSION -d ../../intel/genunix/debug64/genunix -o debug64/bmc debug64/bmc_fe.o debug64/bmc_kcs.o

This command takes the CTF data in the objects comprising bmc (previously converted from stabs/DWARF by ctfconvert) and merges them together (removing any shared duplicates between the two different objects). Then it passes through this CTF data, and looks for any types that match ones in the uniqfile (which we specified with the -d option). For each matching type (for example, cpu_t), we replace any references to the local type definition with a reference to genunix's copy of the type data. Remember that type references are simply integer IDs, so this is just a matter of changing the type ID to the one found in genunix's CTF. Let's use ctfdump to look at the results:

$ ctfdump $SRC/uts/i86pc/bmc/debug64/bmc >bmc.ctf
$ ggrep -C2 bmc_kcs_send bmc.ctf
- Types ----------------------------------------------------------------------

  <32769> STRUCT bmc_kcs_send (3 bytes)
        fnlun type=113 off=0
        cmd type=113 off=8
        data type=5287 off=16
...

Here we see the first member of the struct bmc_kcs_send has a type ID of 113. Since this type ID isn't defined in bmc's own CTF, it must belong to our parent. We look up which module is our parent, then find the type ID we're looking for in its CTF:

$ grep cth_parname bmc.ctf
  cth_parname  = genunix
$ ctfdump $SRC/uts/intel/genunix/debug64/genunix >genunix.ctf
$ grep '<113>' genunix.ctf
  <113> TYPEDEF uint8_t refers to 86

This manual process is similar to how the CTF lookup actually happens. This uniquification process saves us a significant amount of CTF data, although it causes us some problems, which we'll discuss next.
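To make the idea concrete, here is a minimal sketch of uniquification and the parent lookup in Python. It is a toy model, not how ctfmerge is implemented: types are matched by kind and name only (the real tool compares the type data itself), the function names are hypothetical, the 0x8000 boundary for child IDs is an assumption consistent with the <32769> seen in the dump above, and the definition of type 86 is invented for illustration.

```python
# Toy model: a CTF container is {type_id: (kind, name, [referenced type IDs])}.
# All names here are hypothetical; real CTF is a packed binary format.

CHILD_BASE = 0x8000  # assumption: IDs above this belong to the child container


def uniquify(local, parent):
    """Drop local types the parent already has and point references at
    the parent's IDs. Surviving types are renumbered into the child range."""
    parent_ids = {(kind, name): tid for tid, (kind, name, _) in parent.items()}
    remap = {}
    next_id = CHILD_BASE + 1
    for tid, (kind, name, _) in sorted(local.items()):
        if (kind, name) in parent_ids:
            remap[tid] = parent_ids[(kind, name)]  # use the parent's copy
        else:
            remap[tid] = next_id                   # keep it, renumbered
            next_id += 1
    return {
        remap[tid]: (kind, name, [remap[r] for r in refs])
        for tid, (kind, name, refs) in local.items()
        if remap[tid] > CHILD_BASE
    }


def lookup(type_id, child, parent):
    """Resolve an ID as the manual grep did: own CTF first, then the parent."""
    return child[type_id] if type_id in child else parent[type_id]


# Parent data: ID 113 comes from the dump; what ID 86 is has been assumed.
genunix = {
    86: ("INTEGER", "unsigned char", []),
    113: ("TYPEDEF", "uint8_t", [86]),
}

# bmc's CTF before uniquification, with its own private type IDs.
bmc_local = {
    1: ("INTEGER", "unsigned char", []),
    2: ("TYPEDEF", "uint8_t", [1]),
    3: ("STRUCT", "bmc_kcs_send", [2, 2]),  # fnlun and cmd members only
}

bmc = uniquify(bmc_local, genunix)
for tid, definition in bmc.items():
    print(tid, definition)
print(lookup(113, bmc, genunix))
```

Only the struct survives in the child, and both of its members now carry the parent's type ID, which is why resolving them needs the parent's CTF to hand.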

CTF labels and additive merges

As noted above, all our uniquified modules will have type IDs that refer to the genunix shipped along with them. This means, of course, that if any of the types in genunix itself change without these modules changing too, all the type references to genunix types will be wrong, since the references work purely by type ID. So, what happens when we need to release kernel changes?

Since we obviously don't want to ship all these modules every time genunix needs to change, we have to keep the existing type IDs in the new genunix binary. But also, we want to have any new or changed types present and correct too. So, instead of doing a full merge and rewriting the existing CTF data in genunix, we perform an "additive merge". This retains the existing CTF types (and IDs) so that references from unchanged modules still point to the right types, and adds on new types.

To do an additive merge, we need to pass a 'withfile' to ctfmerge via its -w option. This first takes all the CTF in the withfile and adds it into the output CTF. Then the CTF from the objects passed to ctfmerge are uniquified against this data. Any remaining types after uniquification are then added on top of the withfile data. This preserves the existing type IDs for any older modules that uniquified against this genunix, whilst also adding the new types.
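The three steps can be sketched roughly as follows. This is an illustrative model only, assuming a type can be represented as a (kind, name, size) tuple so that a changed type compares unequal to its old version; the function name, type names and sizes are all made up.

```python
def additive_merge(withfile, new_types):
    """withfile: {type_id: (kind, name, size)} from the previous genunix.
    new_types: list of (kind, name, size) from the freshly built objects.
    Existing IDs are never renumbered; unseen types are appended."""
    merged = dict(withfile)               # 1. take the withfile's CTF as-is
    known = set(withfile.values())
    next_id = max(merged, default=0) + 1
    for t in new_types:                   # 2. uniquify against that data
        if t not in known:                # 3. append whatever remains
            merged[next_id] = t
            known.add(t)
            next_id += 1
    return merged


# Hypothetical previous genunix and a rebuilt one where foo grew and bar is new.
old_genunix = {
    1: ("TYPEDEF", "uint8_t", 1),
    2: ("STRUCT", "foo", 16),
}
rebuilt = [
    ("TYPEDEF", "uint8_t", 1),
    ("STRUCT", "foo", 24),
    ("STRUCT", "bar", 8),
]

merged = additive_merge(old_genunix, rebuilt)
for tid in sorted(merged):
    print(tid, merged[tid])
```

In this model the old definition of foo keeps ID 2, so a module built against the previous genunix still resolves it as before, while the changed foo and the new bar get fresh IDs at the end.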

This 'withfile' is the previous version of genunix. When it was built the first time, we passed -L VERSION to ctfmerge. This adds a label with the value of the environment variable $VERSION. Typically this is something like Generic. When we do the additive merge, we pass in a different label equal to the patch ID of the build, and the additional types are marked with this label. For example, on a Solaris 9 system's genunix:

- Label Table ----------------------------------------------------------------

   5001 Generic
   5981 112233-12
...

Labels are nothing but a mapping from a string to a particular type ID. So here we see that the original types are numbered from 1 to 5001, and we've done an additive merge on top with the label "112233-12", which added the types from 5002 up to 5981.
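Given a label table like the one above, working out which build introduced a type is a simple range search. The sketch below assumes each label records the highest type ID it covers, which is how the table is read in the text; the function name is hypothetical.

```python
import bisect

# Label table as printed by ctfdump above: (last type ID covered, label).
LABELS = [(5001, "Generic"), (5981, "112233-12")]


def label_for(type_id, labels=LABELS):
    """Return the label whose range contains type_id, or None if no label does."""
    if type_id < 1:
        return None
    bounds = [last for last, _ in labels]
    i = bisect.bisect_left(bounds, type_id)  # first label with last ID >= type_id
    return labels[i][1] if i < len(labels) else None


for tid in (113, 5001, 5002, 5981, 6000):
    print(tid, label_for(tid))
```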

CTF from the ip module

The genunix module contains many common types, but the ip module also contains a lot of types that are used by many kernel modules and are not found in genunix. To further reduce the amount of CTF in these modules, we merge the CTF data found in ip into the genunix CTF. The modules can then uniquify against this combined data, removing many more duplicate types. Note that we don't do this for patch builds, as the ip module might not ship in a patch. Unfortunately this can cause problems (notably bug 6347000, though this isn't yet accessible from opensolaris.org).
