How I Code Review

Feb 5, 2022

I thought it might be interesting, at least to myself, to write up how I approach code reviews. My history in tech is one where close code review was emphasized and well-respected: most appreciated that a detailed review was not only worth the reviewer’s time, but mutually beneficial, and often a great learning experience for everyone. So my tiny mind can’t process ideas like post-commit reviews, which seem to be predicated on the idea that testing is some kind of replacement for code review. To me, the kinds of things tests can cover are only a very small part of what code review is useful for.

I’ve code-reviewed entire projects totalling many thousands of lines, and single-character changes, but I’m usually following at least some of the below patterns either way:

Understand the context

First thing I read is the commit message. Of course, this should be in normal form, but I’m really making sure I can understand what the change is, based solely upon the commit message. Without looking at the code, can I, as a casual observer, understand what’s changed? Is the commit title specific enough? Does the commit message describe not just what changed, but why (as usual, the GNU standard is an exemplar of what not to do here)? Is it clear? Does the message needlessly include things that belong in another tracking system (target gate)?
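As a made-up example of the shape I’m looking for (the component and details here are invented, purely for illustration):

```
scheduler: avoid needless wakeups on idle CPUs

Previously we kicked every CPU on each timer tick, even idle ones,
which wasted power on mostly-idle machines. Only kick CPUs that
actually have queued work.
```

The title says specifically what changed; the body carries the “why”, which is the part a future reader can’t recover from the diff.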

I will read any associated ticket for its context - especially keeping an eye out for anything that doesn’t seem to fit with the PR’s changes. This could be a missing case, or a fundamental misunderstanding of what the real underlying problem is. If there is any design doc mentioned (and they should be mentioned!) I’ll also read that and diff its contents against what actually got implemented.

I’m looking mainly for disparities between what everyone agreed we should do, and what is actually happening in the code, but especially for missing things; it’s very easy to accidentally skip somebody’s drive-by comment, but that could turn out to be crucial to the implementation.

I also consider if this change makes sense on its own, and if it could be split up: this is often a matter of appetite (and personally I find the Linux kernel approach often goes a little too far), but a patch series with one logical change per commit is often much easier to review. It should hopefully go without saying that each individual commit in the series should still pass CI, but unfortunately that’s painful to do with at least github PRs.

Get an overview

Next I start looking at the actual code changes: often with one tab per file, I’m trying to understand how the changes fit together: who calls what, what new intra-code dependencies there are, what the possible impact of the changes could be.

I might well look back in git history for each of these files to understand why the old code is like it is: this is also often very useful in identifying potential issues with the new changes.

Depending on the change, I will often checkout a local copy, and use git grep, ctags, etc. to help me understand how everything fits together.

My focus at this level is often on interfaces: does a new method have a suitable name? Is it at the right level of abstraction? What is the ownership of the relevant objects? Are there any layering violations?

Are there any external dependencies we need to worry about? Equally if anyone is depending on us, are we providing well-written interfaces? Are they designed with care and attention to versioning, information hiding, and all the usual API concerns? Is this going to wear well after it’s been in production for years?

I’m also bearing in mind other ongoing work: if there’s a project underway that is directly relevant to this specific change, I might ask for some accommodation that will make the eventual merge of both easier. Equally if there’s a general desire to take a particular technical direction, I might complain if something is taking a different tack.

It’s a rare code review where I don’t have to pause to go research something: systemd service definition semantics, syscall error modes, how selinux roles work etc. As I said above, great learning experience!

Are there potential performance concerns with the change: lots of unnecessary I/O, potential for big-O issues, needless overhead etc? What are the expected limits of the objects being handled?

What if there are bugs with this change: is the error handling suitable? Is there a sufficient level of logging, exception details, etc. to identify in the field what went wrong? Is there unnecessary noise? How would a stressed-out SRE deal with this in production?

Have any necessary unit/component tests been updated or added? Do they actually test something useful?

I almost never build or test changes I’m reviewing: that’s a job for the submitter and your CI infrastructure. The only exception is if I’m struggling to understand something, and running the tests would help.

Detailed review

I’m now going to go line-by-line through the whole patch, leaving comments where necessary. Sometimes I’ll reach something I don’t understand, and leave a “FIXME” for myself: if, after reading the whole change, I still don’t understand what’s happening, this will often re-formulate itself into a question for the submitter, along with a request for expanded code comments, but usually I can just delete these notes later.

If I find major - architectural level - issues with what I’m looking at, that’s often a good prompt to take the discussion elsewhere, perhaps to a Zoom call or design document discussion. Doing design review inside a PR is not fun for anyone.

I’ve noticed a tendency to “review the diffs”: the idea that only the changed lines are relevant to the review - the fact that tools expand only 10 lines at a time is a symptom of this. This is very wrong-headed in my opinion, and I often find myself reading around in the rest of the code to make sure I can properly review what has changed.

Comb for nits

Everyone has a different appetite for code review nits: generally, I will always point out actual typos (often just once, if it’s repeated, expecting the submitter to apply my comment to all instances). If I have a substantive comment, I might also suggest some style-level improvements. I never expect someone to make this sort of change to existing code: the general idea is to leave the code in a slightly better place than it was, not re-write whole files for cosmetic nits.

Often these stylistic nits are marked “optional”: if the submitter feels like it, they could change it, but it’s no big deal if not.

I’ll very often have style comments on things like:

  • unnecessary comments that just describe the code
  • missing comments
  • variable, function naming
  • function size and decomposition
  • local customs

Many of these things can be a matter of opinion, so I try to bear in mind other ways of thinking, up to a point. I’m never going to be happy seeing a ton of CamelCase and Hungarian notation in code that doesn’t have it already.

Iterate

I haven’t yet found a code review tool that’s ideal at iteration: gerrit is still pretty hopeless at tracking outstanding comment state. PRs in github are better at displaying this, but have the fatal flaw that any history rewrite means all context is lost.

Regardless, when I get a new version of the changes, I’ll often review both the incremental diff and the whole change, checking that:

  • my review comments have been acted upon and the fixes look good
  • the change as a whole still makes sense:
    • is the commit message still correct?
    • are there now unnecessary changes, due to follow-on fixes?
    • do the updates go far enough?

Long review cycles for a PR can be grueling, both for reviewers and the PR owner. But in my opinion it’s almost always worth the effort, especially for complex changes: this code is probably going to live a lot longer than you’d think, and be maintained by people other than you.

Even worse, it’s often work not given the respect it’s due: PR owners can see it as a combative process, and management can see it as overhead. I don’t really know what to do about that.

Pay it forward!

Testing a C Library With Python

Dec 22, 2021

It’s still common for a systems library to be written in the default lingua franca, C, although Rust is encroaching, for good reasons.

However, when it comes to testing, things get tedious quickly: writing unit or component tests in C is a slow, bug-prone exercise. With libvfio-user, after fixing too many bugs that were due to the test rather than the test subject, I decided it would be worth looking at alternative approaches. The aim was to reduce the time it takes to develop unit/component tests.

Up until this point, we’d been using ctest, along with cmocka when we needed to mock out certain functions (such as socket handling). Leaving aside my strong feelings on these tools, this was rather unsatisfactory: libvfio-user effectively implements a (UNIX) socket server, but we weren’t actually testing round-trip interactions for the most part. In terms of code coverage, very little useful could be done via this unit testing approach, but the “sample” client/server was tedious to work with for testing purposes.

Python-based testing

After a quick proof of concept, it became clear that using Python would be a great choice to cover most of our testing needs. libvfio-user doesn’t ship with any client bindings, and, given that the main clients are qemu, cloud-hypervisor and SPDK, Python bindings would be of dubious utility.

As a result, we decided against “proper” Python bindings, auto-generated or otherwise, in favour of a small and simple approach. In particular, by using the terrible magic of ctypes, we could easily set up both client and server test cases that fully represent how the library works in real life.

So, instead of auto-generated bindings, we write - by hand - simple, thin layers of type wrappers:

class vfio_irq_info(Structure):
    _pack_ = 1
    _fields_ = [
        ("argsz", c.c_uint32),
        ("flags", c.c_uint32),
        ("index", c.c_uint32),
        ("count", c.c_uint32),
    ]

… small harness routines for socket handling:

def connect_client(ctx):
    sock = connect_sock()

    json = b'{ "capabilities": { "max_msg_fds": 8 } }'
    # struct vfio_user_version
    payload = struct.pack("HH%dsc" % len(json), LIBVFIO_USER_MAJOR,
                          LIBVFIO_USER_MINOR, json, b'\0')
    hdr = vfio_user_header(VFIO_USER_VERSION, size=len(payload))
    sock.send(hdr + payload)

… interacting with the library on the server side:

def get_pci_header(ctx):
    ptr = lib.vfu_pci_get_config_space(ctx)
    return c.cast(ptr, c.POINTER(vfu_pci_hdr_t)).contents

… and so on. Writing this by hand might seem immensely tedious, but in practice, as it’s pretty much all boilerplate, it’s very quick to write and modify, and easily understandable; something that can rarely be said for any kind of auto-generated code.
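To see why this is so little work, here’s a standalone sketch of how such a wrapper behaves (re-defining vfio_irq_info outside the real test harness, with ctypes imported directly):

```python
import ctypes as c

class vfio_irq_info(c.Structure):
    _pack_ = 1
    _fields_ = [
        ("argsz", c.c_uint32),
        ("flags", c.c_uint32),
        ("index", c.c_uint32),
        ("count", c.c_uint32),
    ]

# The struct maps directly onto the wire format: four packed u32s.
info = vfio_irq_info(argsz=16, flags=0, index=2, count=1)
assert len(bytes(info)) == 16

# Raw bytes received from the socket can be cast straight back.
back = vfio_irq_info.from_buffer_copy(bytes(info))
assert back.index == 2
```

That round trip between Python attributes and wire bytes is all these wrappers need to do.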

Client/server interactions

Another observation was that, for the purposes of these tests, we really didn’t need a client process and a server process: in fact, we don’t even need more than one thread of execution. If we make each test round-robin between acting as the client, then acting as the server, it becomes trivial to follow the control flow, and understanding logs, debugging, etc. is much easier. This is illustrated by the msg() helper:

def msg(ctx, sock, cmd, payload=bytearray(), expect_reply_errno=0, fds=None,
        rsp=True, expect_run_ctx_errno=None):
    """
    Round trip a request and reply to the server. vfu_run_ctx will be
    called once for the server to process the incoming message,
    @expect_run_ctx_errno checks the return value of vfu_run_ctx. If a
    response is not expected then @rsp must be set to False, otherwise this
    function will block indefinitely.
    """
    # FIXME if expect_run_ctx_errno == errno.EBUSY then shouldn't it be
    # implied that rsp == False?
    hdr = vfio_user_header(cmd, size=len(payload))

    if fds:
        sock.sendmsg([hdr + payload], [(socket.SOL_SOCKET, socket.SCM_RIGHTS,
                                        struct.pack("I" * len(fds), *fds))])
    else:
        sock.send(hdr + payload)

    ret = vfu_run_ctx(ctx, expect_errno=expect_run_ctx_errno)
    if expect_run_ctx_errno is None:
        assert ret >= 0, os.strerror(c.get_errno())

    if not rsp:
        return None

    return get_reply(sock, expect=expect_reply_errno)

We are operating as the client when we do the sendmsg(); the server then processes that message via vfu_run_ctx(), before we “become” the client again and receive the response via get_reply().
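The pattern itself is easy to demonstrate without any of the libvfio-user machinery: a minimal sketch using a socketpair, where a single thread alternates between the client and server roles (names and protocol here are invented for illustration):

```python
import socket

# One thread, two roles: a connected socketpair stands in for the
# client/server UNIX socket.
client, server = socket.socketpair()

# Act as the client: send a request.
client.send(b"ping")

# Switch hats: act as the server and process the incoming message
# (this is the moral equivalent of vfu_run_ctx()).
request = server.recv(4)
server.send(b"pong" if request == b"ping" else b"err!")

# Become the client again and receive the response.
reply = client.recv(4)
assert reply == b"pong"

client.close()
server.close()
```

Because everything happens in one thread, a test failure gives you a single linear control flow to read, rather than two interleaved processes.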

We can then implement an individual test like this:

def test_dma_region_too_big():
    global ctx, sock

    payload = vfio_user_dma_map(argsz=len(vfio_user_dma_map()),
        flags=(VFIO_USER_F_DMA_REGION_READ |
               VFIO_USER_F_DMA_REGION_WRITE),
        offset=0, addr=0x10000, size=MAX_DMA_SIZE + 4096)

    msg(ctx, sock, VFIO_USER_DMA_MAP, payload, expect_reply_errno=errno.ENOSPC)

which we can run via make pytest:

___________________________ test_dma_region_too_big ____________________________
----------------------------- Captured stdout call -----------------------------
DEBUG: quiescing device
DEBUG: device quiesced immediately
DEBUG: adding DMA region [0x10000, 0x80000011000) offset=0 flags=0x3
ERROR: DMA region size 8796093026304 > max 8796093022208
ERROR: failed to add DMA region [0x10000, 0x80000011000) offset=0 flags=0x3: No space left on device
ERROR: msg0x62: cmd 2 failed: No space left on device

This is many times easier to write and test than trying to do this in C, whether as a client/server, or attempting to use mocking. And we can be reasonably confident that the test is meaningful, as we are really executing all of the library’s message handling.

Debugging/testing tools

With a little bit of tweaking, we can also use standard C-based tools like valgrind and gcov. Code coverage is simple: after defeating the mini-boss of cmake, we can run make gcov and get code-coverage results for all C code invoked via the Python tests - it just works!

Running Python tests with valgrind was a little harder: for leak detection, we need to make sure the tests clean up after themselves explicitly. But Python itself also has a lot of valgrind noise. Eventually we found that this valgrind invocation worked well:

	valgrind \
	--suppressions=$(CURDIR)/test/py/valgrind.supp \
	--quiet \
	--track-origins=yes \
	--errors-for-leak-kinds=definite \
	--show-leak-kinds=definite \
	--leak-check=full \
	--error-exitcode=1 \

We need to force Python to use the system allocator, and add a number of suppressions for internal Python valgrind complaints - I was unable to find a working standard suppression file for Python, so had to construct this myself based upon the Python versions in our CI infrastructure.
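Forcing the system allocator is just an environment variable (PYTHONMALLOC=malloc, on Python 3.6 and later) in front of the valgrind command. The hand-built suppressions are ordinary valgrind suppression entries; this one is illustrative rather than copied from our file:

```
{
   python-internal-possible-leak
   Memcheck:Leak
   match-leak-kinds: possible
   fun:malloc
   ...
   obj:*/libpython3*
}
```

Each entry names a frame pattern to ignore; the tedious part is iterating until the noise from the interpreter itself is gone.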

Unfortunately, at least on our test systems, ASAN was completely incompatible, so we couldn’t directly run that for the Python tests.


The approach I’ve described here has worked really well for us: it no longer feels immensely tedious to add tests along with library changes, which can only help improve overall code quality. The tests are quick to run and modify, and for the most part it’s easy to understand what they are actually doing.

There’s been a few occasions where ctypes has been difficult to work with - for me the documentation is particularly sparse, and callbacks from the C library into Python are distinctly non-obvious - but we’ve so far always managed to battle through, and twist it to our needs.

Doing things this way has a few other drawbacks: it’s not clear, for example, how we might test intermittent allocation failures, or other failure injection scenarios. It’s also not really suitable for any kind of performance or scalability testing.

I’m curious if others have taken a similar approach, and what their experiences might be.

Github Pull Requests

Feb 6, 2021

Stefan Hajnoczi recently posted about clean commit history.

It’s a controversial viewpoint that not everyone agrees with - there is a sizable population in favour of “never rewrite history”. For me, though, the points he makes there are totally correct: each commit should be a logical change, main (née master) should stay green, and CI should pass at every single point in main’s history. More than just CI though: regardless of whether it passes CI, the main branch should be of good quality at all times, if you want to avoid the Quality Death Spiral.

Unfortunately, Github pull requests make this model a little difficult for a few reasons:

You can’t ever rebase a PR undergoing review

It’s important that a non-draft PR is never rebased, or re-written in any way. Why? Well, aside from making it difficult for a reviewer to see what’s changed since last looking, if you rebase, the commits previously on the PR disappear off into reflog hyperspace.

The View Changes button on review comments is attached to that particular commit hash, which is no longer in the history for that branch, and you get the dreaded:

We went looking everywhere, but couldn’t find those commits.

Note that if your PR is still a draft, you’re fine to edit the history whichever way you like: in fact, it’s often useful for review purposes to have multiple commits even at the start of a PR review before you move it from draft. Up to you.

The only other safe time to rebase is on final approach. At that point, presuming you are keeping to the “single main commit per PR” approach (see below), you’ll be wanting to squash the entire branch history into a single commit to main. For this, I usually use prr: it’s handy for picking up Reviewed-by automatically, and merging commit comments together for final editing.

Github CI only runs on branch tips

You probably don’t want to have a PR where you’re going to merge more than one commit into main. This is because CI only runs on the top-level commit: if an ancestor commit breaks the build, you’ll never know. Stefan mentions using git rebase --exec for checking commits in a stack, which indeed works great, but unless you’re running exactly the same CI that’s running under Github Actions, you can’t rely on it.

If that’s the case, what if you have one or more changes that depend on another? This is where “stacked PRs” come in, and they’re a bit of a pain…

Stacked PRs are cumbersome

Gerrit has a really useful model for reviewing stacks of changes: instead of the full history, each “patchset” corresponds to the single logical change Stefan talks about above. Every time you push to Gerrit, you’re supposed to have collapsed and rebased additional changes into single commits corresponding to each Gerrit CR. The model has some disadvantages as well (in particular, it’s a bit of a pain to keep a full history locally), but the Gerrit review UI doesn’t suffer from the rebasing issues Github does1.

Presuming - as there is no CI available - gerrithub is a non-starter, the only option available on Github is to use multiple PRs. This is better than it used to be, but is still a little painful.

Essentially, a stacked PR is one that’s opened not against the main branch, but against another PR branch. Say we have changes A and B, where B is dependent on A. You’d create a local branch with A, then push it to Github and open a PR. You’d have another local branch with A and B, then push that branch to Github and open a separate PR.

Now we need to make the B PR be based against the A PR. You can do this via the web UI by clicking Edit, though there is an annoying bug here: it doesn’t reset the title and description. You can use gh pr create --base ... to avoid this problem.

Now, in the second PR, you’ll just see the commit for B. Each PR can be reviewed separately, and each PR gets its own CI run.

You also might want to merge additional changes up the stack. Let’s say that you have commit A2 on the A PR, which you also want in PR B and a further stacked PR C: the best - if rather tedious - way to do this is to merge A into B, then B into C. That’s a lot of merge commits, but remember we’re squashing a PR every time before merging a PR to main.

You’ll see on the web recommendations to “merge downwards”: you wait for commit approval for the whole stack, then merge the top PR (B) into the PR underneath it (A), and so on, until you merge to main.

I don’t think that’s necessary these days2. Instead, when you have approval for the base PR - and logically, it will make sense that it is reviewed first - you can merge it to main. Github will then offer to delete the PR branch. If you do this, the stacked PR gets automatically reset such that its merge base is now main!

There is an annoying thing here though: because of that squash during the merge to main, git, and Github, needs you to merge main back into the commit history of the PR that just changed bases. If you already merged the parent PR, you can always do git merge -Xours master to fix this, since there shouldn’t be any actual difference between the PR branch’s diff as a whole and what was merged to master. Or, if you didn’t merge in the parent PR, you’ll need a normal git merge master.

Another bug (as far as I’m concerned) is that if you ask for review on a stacked PR, it doesn’t get tagged with “Review required”, since, technically, you could merge the PR into its parent without approval. And there is no “Review requested” tag.

I would love all this to have some tooling: something that lets me do everything on my local stacked branches, automate merges up, keep track of dependencies, and update the branches in Github. But I haven’t been able to find anything that can do it.

[2022-05-12 update]: I just came across spr which is so far proving excellent in solving some of these problems. I love it!

  1. Gerrit uses Change-ID embedded in the commit message to map commits onto CRs. It’s clumsy but effective. ↩︎

  2. I think it dates from before Github automatically reset a PR when its merge base was deleted ↩︎

Mutt and Office365

Nov 6, 2020

For reasons, I now need to interact with Office365 mail and calendar. It should go without saying that the Outlook webapp is almost painfully unusable (there really is no key binding for “next unread email”). Thus began the quest to get mutt interacting with the O365 server. This was a rather painful process: so much of the information on the web refers to earlier authentication schemes, Microsoft-special protocols, things that don’t support 2FA, dead Linux software, useless Linux software, etc.

After many false starts, I eventually found a working solution that allows mutt usage (and my treasured key binding for “mark current thread as read and move to the next unread email”). That solution is davmail. Yes, it’s on sourceforge, and yes, it’s Java, but it works perfectly.

It’s not very well-documented, but you can run it in headless mode and still do the interactive OAuth2 flow needed with modern O365. Your settings should include:
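Something like the following davmail.properties fragment (the property names are davmail’s own; O365Manual is the mode that drives the URL-paste flow described below, and the ports match the fetchmail and mutt configuration later in this post - adjust to taste):

```
davmail.server=true
davmail.mode=O365Manual
davmail.imapPort=1143
davmail.smtpPort=1025
davmail.caldavPort=1080
```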


When davmail starts, it will ask you to visit a URL and paste the resulting URL back - this contains the necessary OAuth2 tokens it needs. No need for any GUI!

Once davmail is running, your .fetchmailrc can be:

poll localhost protocol IMAP port 1143
 auth password username "[email protected]"
 is localuser here
 sslmode none
 mda "/usr/bin/procmail -d %T"
 folders INBOX,etc,etc

Note that since davmail is running locally, there’s not really any need for SSL, though you can set that up if you like.

When you start fetchmail, enter your password, and that will initiate the auth flow against the davmail instance. Note that you’re not storing passwords anywhere, unlike the old-style app password approach you might have used previously on gmail and the like.

I don’t need to send mail often, so I have mutt set up like this:

set smtp_url="smtp://[email protected]@localhost:1025/"
unset smtp_pass
set ssl_starttls=no
set ssl_force_tls=no

Having to enter my password each time is not a big deal for me.

Equally I have my calendar app set up to pull over caldav from davmail. Works great. I’d love to be able to pull my O365 calendar into Google Calendar, but apparently Google and Microsoft are unable - or more likely unwilling - to make this work in any meaningful way.

I’m pretty sure it’s possible to adapt Google’s OAuth2 scripts to directly use fetchmail with O365’s modern auth stuff, but I’m not sure I have the energy to figure it out - and as far as I can find, nobody else has?

procmail and threads

Sep 14, 2020

I’m apparently old-school enough to find gmail and co painfully inefficient for handling significant amounts of mail. I still find procmail+mutt hard to beat. One thing mutt can’t do, however, is filter threads automatically - there’s no “mute” facility like gmail has; threads have to be processed manually.

Equally, procmail itself has no threading facilities or understanding of Message-Id or References.

Matching email threads

It can be done, though, with some cheesy awk:


#!/bin/bash

# If a mail message has a References: value found in the refs file, then
# add the requested header.
# Usage:
# cat mail_msgs | ~/.mail.refs.muted "Muted: true"

ref_file="$1"
header="$2"

mail=$(mktemp)

cat - >$mail

newrefs="$(cat $mail | formail -x references -x message-id | tr -d '\n')"

touch $ref_file

cat $ref_file | awk -v newrefs="$newrefs" '
	BEGIN {
		found = 0;
		split(newrefs, tmp);
		for (i in tmp) {
			refs[tmp[i]] = 1;
		}
	}

	# Each thread will have one line in the ref file, with
	# space-separated references. So we just need to look for any
	# reference from the mail.
	{
		for (ref in refs) {
			if (index($0, ref) != 0) {
				found = 1;
			}
		}
	}

	END {
		exit(found ? 0 : 1);
	}
'

if [[ $? = 0 ]]; then
	cat $mail | formail -i "$header"
else
	cat $mail
fi

rm $mail
Essentially, we record all the References in the thread we’re trying to act on. Then we can trigger the above to see if the new mail is part of the thread of interest.
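The matching logic itself is trivial; here’s a hypothetical sketch of the same idea in Python, using only the stdlib (function names are mine, not the script’s):

```python
import email

def message_refs(raw):
    """Collect every reference identifier from a raw mail message."""
    msg = email.message_from_bytes(raw)
    hdrs = msg.get("References", "") + " " + msg.get("Message-Id", "")
    return set(hdrs.split())

def in_thread(thread_line, new_refs):
    # Each muted/watched thread is one line of space-separated
    # references: any overlap means the new mail is part of the thread.
    return any(ref in thread_line.split() for ref in new_refs)

raw = (b"Message-Id: <aaa@host1>\r\n"
       b"References: <bbb@host1>\r\n"
       b"\r\nbody\r\n")
refs = message_refs(raw)
assert in_thread("<bbb@host1> <ccc@host1>", refs)
assert not in_thread("<zzz@host2>", refs)
```

The shell version does the same set-intersection, just with formail extracting the headers and awk doing the matching.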

(This seems like the sort of thing formail could do, given its -D option has a message ID cache, but I haven’t even bothered to take a look at how hard that would be…)

procmail usage

In .procmailrc, we’d use this like so:

:0 Wfh: formail.lock
| $HOME/src/procmail-thread/ $HOME/.refs.muted "Procmail-Muted: true"

:0 Wfh: formail.lock
| $HOME/src/procmail-thread/ $HOME/.refs.watched "Procmail-Watched: true"

This will add the given header if we find any of the email’s References values in our “database”.

Then, we can do what we like with the mails, like deliver them as already-read, carbon copy them to the inbox, etc.:

:0:
* Procmail-Muted: true
muted/

:0 c:
* Procmail-Watched: true
inbox/

mutt usage

To actually watch or mute a thread, we add a couple of mutt macros:

macro index,pager "M" "|~/src/procmail-thread/ ~/.refs.muted<return>"
macro index,pager "W" "|~/src/procmail-thread/ ~/.refs.watched<return>"

The script is similar to the above, but populates the refs file with all message IDs found in the given email.

I put all this in a git repo.

Arrow Keys in Firefox

Sep 2, 2020

I’m not the only one disappointed in the way the web has worked out in lots of ways. From <blink> onwards, so much semantic information is missing from the average website, sometimes wilfully it appears. Why is there so little structural data on what the components of a page are?

One particular peccadillo I dislike is “Previous/Next Page” elements on a list page. Nobody ever uses <a rel="next" ...>. If you’re lucky, there’s an aria-label attribute for accessibility purposes, but as it’s free-form text, and there isn’t even a convention, it could be pretty much anything.

For reasons unclear to me, almost no sites make use of the left/right arrow keys for navigation. So if I want to map those keys to prev/next, instead of a nice little bit of configuration, I have to resort to this user script:

(function() {
    'use strict';

    /* NB: we already tested for prefix/suffix, so this RE is OK. */
    function wholeWordMatch(haystack, needle) {
        let r = new RegExp("\\s" + needle + "\\s");
        return r.test(haystack);
    }

    const LEFT_KEY_CODE = 37;
    const RIGHT_KEY_CODE = 39;

    const prevStrings = [
        "previous page",
    ];

    const nextStrings = [
        "next page",
    ];

    document.addEventListener("keyup", function(e) {

        if (!e) {
            e = window.event;
        }

        if (e.isComposing) {
            return;
        }

        /* Don't hijack arrow keys while typing into a field. */
        switch ( {
            case "TEXTAREA":
            case "INPUT":
                return;
        }

        const key = e.keyCode ? e.keyCode : e.which;

        var matches = undefined;

        if (key == LEFT_KEY_CODE) {
            matches = prevStrings;
        } else if (key == RIGHT_KEY_CODE) {
            matches = nextStrings;
        } else {
            return;
        }

        let found = undefined;
        let score = 0;

        document.querySelectorAll("a").forEach((link) => {
            let strs = [ link.textContent ];

            if (!link.href) {
                return;
            }

            /* This is often a good match if the text itself isn't. */
            if (link.attributes["aria-label"]) {
                strs.push(link.attributes["aria-label"].value);
            }

            for (let str of strs) {
                if (typeof str === "undefined") {
                    continue;
                }

                str = str.toLowerCase();

                /*
                 * There's no perfect way to find the "best" link, but in
                 * practice this works on a reasonable number of sites: an exact
                 * match, or exact prefix or suffix, always wins; otherwise, we
                 * match a whole-word sub-string: "Go to prev <<" will match,
                 * but not "".
                 */
                for (let match of matches) {
                    if (str === match) {
                        found = link;
                        score = 2;
                        break;
                    }

                    if (str.startsWith(match) || str.endsWith(match)) {
                        found = link;
                        score = 2;
                        break;
                    }

                    if (score < 1 && wholeWordMatch(str, match)) {
                        found = link;
                        score = 1;
                    }
                }
            }
        });

        if (found) {
        }
    }, true);
})();

Yet again, hacky, but it mostly works. It’s pretty cool that this is even possible though.

Docker, Go and USDT

Aug 6, 2020

We have what should be a simple task: we’re on CentOS 7, and we want to deploy a Go binary that will have USDT tracepoints. USDT is an attractive option for a few debugging purposes. It allows applications to define tracepoints with higher levels of stability and semantic meaning than more ad-hoc methods like dynamic uprobes.

Usage of USDT tracepoints tends to have a different focus from other monitoring techniques like logging, Prometheus, OpenTracing etc. These might identify a general issue such as a poor latency metric: you’d then use USDT probes to dig further into the problems in a production system, to identify precisely what’s happening at a particular endpoint or whatever.

USDT in Go

The normal model for USDT involves placing the trace points at specific places in the binary: they are statically defined and built, but dynamically enabled. This is typically done via the DTRACE_PROBE() family of macros.

The only (?) USDT facility for Go is salp. This uses libstapsdt under the hood. This library dynamically creates probes at runtime, even though Go is a compiled language. Yes, this is dynamic static dynamic tracing.

We’re going to use salpdemo in our experiment. This has two USDT probes, p1 and p2 that we’d like to be able to dynamically trace, using bcc-tools' handy trace wrapper. CentOS 7 doesn’t appear to have support for the later USDT support in perf probe.

Setting up a Docker container for dynamic tracing

For a few different reasons, we’d like to be able to trace from inside the container itself. This has security implications, given what’s implemented today, but bear in mind we’re on CentOS 7, so even if there’s a finer-grained current solution, there’s a good chance it wouldn’t work here. In reality, we would probably use an ad-hoc debugging sidecar container, but we’re going to just use the one container here.

First, we’re going to deploy the container with ansible for convenience:

$ cat hosts
localhost ansible_connection=local
$ cat playbook.yml
---
- hosts: localhost
  become: yes
  tasks:
    - docker_container:
        name: usdt_test
        image: centos:7
        state: started
        command: sleep infinity
        network_mode: bridge
        ulimits:
          - memlock:8192000:8192000
        capabilities:
          - sys_admin
        volumes:
          - /sys/kernel/debug:/sys/kernel/debug
$ ansible-playbook -i hosts ./playbook.yml

Note that we’re using sleep infinity here to keep our container running so we can play around.

We need the sys_admin capability to be able to program the probes, and the BPF compiler needs the locked memory limit bumping. We also need to mount /sys/kernel/debug read-write (!) in order to be able to write to /sys/kernel/debug/tracing/uprobe_events.
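As a quick sanity check, the locked-memory limit the container ends up with can be inspected from Python's resource module. This is only an illustrative sketch: the helper names are mine, and 8192000 is simply the value used in the playbook above, not a hard requirement.

```python
import resource

def memlock_limit():
    """Return the current (soft, hard) RLIMIT_MEMLOCK values in bytes.

    resource.RLIM_INFINITY means unlimited."""
    return resource.getrlimit(resource.RLIMIT_MEMLOCK)

def memlock_ok(needed_bytes):
    """Check whether the soft locked-memory limit covers what the
    BPF compiler is going to need."""
    soft, _hard = memlock_limit()
    return soft == resource.RLIM_INFINITY or soft >= needed_bytes
```

If memlock_ok(8192000) is False inside the container, the ulimits entry above didn't take effect.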

Now let’s install everything we need to be able to trace these probes:

$ docker exec -it usdt_test yum -y install \
    kernel-devel-$(uname -r) kernel-$(uname -r) bcc-tools

Yes, it’s a lot, but unavoidable. You can, in theory, use mounted volumes for the kernel sources, as described here; however, the read-only mounts break packaging inside the container, so we’re not doing that here.

Tracing the probes in the container

The above was a big hammer, but we should be good to go, right? Let’s start up the demo binary:

$ docker cp ~/salpdemo usdt_test:/root/
$ docker exec -it usdt_test bash
[[email protected] /]# ~/salpdemo &
[1] 18166
 List the go probes in this demo with
	sudo tplist -vp "$(pgrep salpdemo)" "salp-demo*"
Trace this process with
	sudo trace -p "$(pgrep salpdemo | head -n1)" 'u::p1 "i=%d err=`%s` date=`%s`", arg1, arg2, arg3' 'u::p2 "j=%d flag=%d", arg1, arg2'
	sudo trace -p "$(pgrep salpdemo | head -n1)" 'u::p1 (arg1 % 2 == 0) "i=%d err='%s'", arg1, arg2'

We can indeed list the probes:

[[email protected] /]# /usr/share/bcc/tools/tplist -vp $(pgrep salpdemo) | head
salp-demo:p1 [sema 0x0]
  1 location(s)
  3 argument(s)
salp-demo:p2 [sema 0x0]
  1 location(s)
  2 argument(s)
libc:setjmp [sema 0x0]

So let’s try the suggested trace invocation:

# /usr/share/bcc/tools/trace -p "$(pgrep salpdemo | head -n1)" 'u::p1 (arg1 % 2 == 0) "i=%d err='%s'", arg1, arg2'

perf_event_open(/sys/kernel/debug/tracing/events/uprobes/p__tmp_salp_demo_I8qitQ_so_0x270_18166_bcc_18175/id): Invalid argument
Failed to attach BPF to uprobe

Huh. This doesn’t seem to be a permissions issue, since we got EINVAL. In addition, running from the host has the same problem.

I haven’t proved it, but I think our basic issue here is that CentOS 7 is missing this kernel fix:

tracing/uprobe: Add support for overlayfs

I spent way too long trying to work around this by placing the binary somewhere other than overlayfs, before I finally dug a little bit more into how libstapsdt actually works, and figured out the problem.

Working around overlayfs and libstapsdt

To build probes dynamically at runtime, libstapsdt does something slightly crazy: it generates a temporary ELF shared library at runtime that contains the USDT probes, and uses dlopen() to bring it into the running binary. Let’s have a look:

[[email protected] /]# grep salp-demo /proc/$(pgrep salpdemo)/maps
7fa9373b5000-7fa9373b6000 r-xp 00000000 fd:10 1506373                    /tmp/
7fa9373b6000-7fa9375b5000 ---p 00001000 fd:10 1506373                    /tmp/
7fa9375b5000-7fa9375b6000 rwxp 00000000 fd:10 1506373                    /tmp/

The process has mapped in this temporary file, named after the provider. It lives under /tmp, which is on the overlay2 filesystem, explaining why moving the salpdemo binary itself around made no difference.
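This discovery can be scripted: the generated library is the file-backed mapping under /tmp whose name contains the provider. A rough sketch (the function name and the usage example are hypothetical, not part of salp or libstapsdt):

```python
def find_stapsdt_lib(maps_text, provider):
    """Scan the contents of /proc/<pid>/maps for the temporary shared
    library libstapsdt generates (named after the provider, under /tmp),
    and return its path, or None if it isn't mapped."""
    for line in maps_text.splitlines():
        fields = line.split()
        # the path is the 6th field, only present for file-backed mappings
        if len(fields) >= 6 and fields[5].startswith('/tmp/') \
                and provider in fields[5]:
            return fields[5]
    return None

# typical use (hypothetical pid):
# with open('/proc/%d/maps' % pid) as f:
#     print(find_stapsdt_lib(f.read(), 'salp-demo'))
```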

So maybe we can be more specific?

[[email protected] /]# /usr/share/bcc/tools/trace -p "$(pgrep salpdemo | head -n1)" 'u:/tmp/ (arg1 % 2 == 0) "i=%d err='%s'", arg1, arg2'
perf_event_open(/sys/kernel/debug/tracing/events/uprobes/p__tmp_salp_demo_I8qitQ_so_0x270_18166_bcc_18188/id): Invalid argument
Failed to attach BPF to uprobe

Still not there yet. The above bug means that it still can’t find the uprobe given the binary image path. What we really need is the host path of this file. We can get this from Docker:

$ docker inspect usdt_test | json -a GraphDriver.Data.MergedDir

This is not good, as obviously we can’t reach this path from inside the container. Hey, at least we can run it on the host though.
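Since MergedDir is the root of the container's unified filesystem view, translating an in-container path to its host-side equivalent is just a join. A tiny hypothetical helper:

```python
import os.path

def host_path(merged_dir, container_path):
    """Translate an absolute in-container path to the host path under
    the container's overlay2 MergedDir (as reported by docker inspect)."""
    return os.path.join(merged_dir, container_path.lstrip('/'))

# e.g. host_path('/data/docker/overlay2/<hash>/merged', '/tmp/foo.so')
```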

$ sudo /usr/share/bcc/tools/trace 'u:/data/docker/overlay2/77c1397db72a7f3c7ba3f8af6c5b3824dc9c2ace9432be0b0431a2032ea93bce/merged/tmp/ (arg1 % 2 == 0) "i=%d err='%s'", arg1, arg2'
Event name (p__data_docker_overlay2_77c1397db72a7f3c7ba3f8af6c5b3824dc9c2ace9432be0b0431a2032ea93bce_merged_tmp_salp_demo_I8qitQ_so_0x270) is too long for buffer
Failed to attach BPF to uprobe

SIGH. Luckily, though:

$ sudo /usr/share/bcc/tools/trace 'u:/data/docker/overlay2/77c1397db72a7f3c7ba3f8af6c5b3824dc9c2ace9432be0b0431a2032ea93bce/diff/tmp/ (arg1 % 2 == 0) "i=%d err='%s'", arg1, arg2'
PID     TID     COMM            FUNC             -
19862   19864   salpdemo        p1               i=64 err=An error: 64
19862   19864   salpdemo        p1               i=66 err=An error: 66

It worked! But it’s not so great: we wanted to be able to trace inside a container. If we mounted /data/docker itself inside the container, we could do that, but it’s still incredibly awkward.

Using tmpfs?

Instead, can we get the generated file onto a different filesystem type? libstapsdt hard-codes /tmp which limits our options.

Let’s start again with /tmp inside the container on tmpfs:

$ tail -1 playbook.yml
        tmpfs: /tmp:exec

We need to force the exec mount flag on here: otherwise, we can’t dlopen() the generated file. Yes, not great for security again.

$ docker exec -it usdt_test bash
# ~/salpdemo &
[[email protected] /]# /usr/share/bcc/tools/trace -p "$(pgrep salpdemo | head -n1)" 'u::p1 "i=%d err=`%s` date=`%s`", arg1, arg2, arg3' 'u::p2 "j=%d flag=%d", arg1, arg2'
PID     TID     COMM            FUNC             -

Well, we’re sort of there. It started up, but we never get any output. Worse, we get the same if we try this on the host now! I don’t know what the issue is here.

Using a volume?

Let’s try a volume mount instead:

$ tail -3 playbook.yml
          - /sys/kernel/debug:/sys/kernel/debug
          - /tmp/tmp.usdt_test:/tmp

If we run trace in the host now, we can just use u::p1:

$ sudo /usr/share/bcc/tools/trace -p "$(pgrep salpdemo | head -n1)" 'u::p1 "i=%d err=`%s` date=`%s`", arg1, arg2, arg3' 'u::p2 "j=%d flag=%d", arg1, arg2'
PID     TID     COMM            FUNC             -
6864    6866    salpdemo        p2               j=120 flag=1

But we still need a bit of a tweak inside our container:

# /usr/share/bcc/tools/trace -p "$(pgrep salpdemo | head -n1)" 'u::p1 "i=%d err=`%s` date=`%s`", arg1, arg2, arg3'
PID     TID     COMM            FUNC             -
<no output>
[[email protected] /]# cat /proc/$(pgrep salpdemo | head -n1)/maps | grep /tmp/salp-demo*.so | awk '{print $6}' | head -n1
[[email protected] /]# /usr/share/bcc/tools/trace -p  "$(pgrep salpdemo | head -n1)" 'u:/tmp/ "i=%d err=`%s` date=`%s`", arg1, arg2, arg3'
PID     TID     COMM            FUNC             -
11593   11595   salpdemo        p1               i=-17 err=`An error: -17` date=`Thu, 06 Aug 2020 13:12:57 +0000`

I don’t have any clear idea why the name is required inside the container context, but at least, finally, we managed to trace those USDT probes!

ctags, vim and C

Jun 19, 2020

Going to the first matching tag in vim with Control-] can be rather annoying. The exuberant-ctags secondary sort key is the filename, not the tag kind. If you have a struct type that’s also a common member name, you’re forced into using :tselect to find the struct instead of all the members. Most of the time, the struct definition is what you want.

To avoid this issue, I sort the tags file such that any kind == "s" entries come first for that tag. It’s a little annoying due to the format of the file, but it does work:


# ctags, but sub-sorted such that "struct request" comes first, rather than
# members with the same name.

# we can't use "-f -", as that elides the TAG_FILE_SORTED preamble
ctags -R -f tags.$$

awk '
$1 != entry {
	if (entry != "") {
		printf("%s%s", struct, buf);
	}
	entry=$1
	struct=""
	buf=""
}

/^.*"\ts/ {
	struct=struct $0 "\n"
	next
}

$1 == entry {
	buf=buf $0 "\n"
}

END {
	printf("%s%s", struct, buf);
}' <tags.$$ >tags

rm tags.$$

A Simple Pibell

May 5, 2020

With all this free time I finally got around to installing a doorbell at home. I had no interest in Ring or the like: what I really wanted was a simple push doorbell that fit the (Victorian) house but would also somehow notify me if I was downstairs…

There are several documented projects on splicing in a Raspberry Pi into existing powered doorbell systems, but that wasn’t what I wanted either.

Instead, the doorbell is a simple contact switch feeding into the Pi’s GPIO pins. It’s extremely simple, but I didn’t find a step-by-step guide anywhere, so this post is the one I could have done with reading.

I bought the Pi, a case, a power supply, an SD card, and a USB speaker:

Raspberry Pi 3 A+
Pibow Coupé case
Pi power supply
NOOBS pre-installed SD card
USB speaker

And the doorbell itself plus wiring:

Brass push doorbell
Bell wire
Crimping pins
Crimp housing

I bought a pre-installed Raspbian SD card as I don’t have an SD card caddy. After some basic configuration (which required HDMI over to a monitor) I started playing with how to set up the Pi.

Of course the Pi is absurdly over-powered for this purpose, but I wanted something simple to play with. And anyway, it’s running Pihole too.

The wiring itself is simple: bell wire runs through a hole in the door frame to the back of the doorbell (which is a simple contact push). The other ends of the wires are connected to the Pi’s GPIO pin 18 and ground. The pin is pulled up, and we trigger the event when we see a falling edge.

Actually connecting the wires was a bit fiddly: the bell wire is too thin for the 0.1" connector, and lacking a proper crimping tool I had to bodge it with needle-nose pliers. But once in the pins the housing connection is solid enough.
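The edge-plus-settle logic can be illustrated in isolation. This is a pure-Python simulation over a list of sampled pin states, not code from the real doorbell (the names are mine): a falling edge only counts as a ring if the pin is still low once the settle window has passed, which filters out phantom blips.

```python
def ring_events(samples, settle_samples=2):
    """Given a list of sampled pin states (True == pulled up / idle,
    False == button pressed), return the indices where a falling edge
    survives the settle window, i.e. the pin is still low
    settle_samples later. Shorter phantom blips are ignored."""
    events = []
    for i in range(1, len(samples)):
        if samples[i - 1] and not samples[i]:          # falling edge
            j = min(i + settle_samples, len(samples) - 1)
            if not samples[j]:                         # still pressed
                events.append(i)
    return events
```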

At first I tried to connect it to Alexa but soon gave up on that idea. There’s no way to “announce” via any API, and it kept disconnecting when used as a Bluetooth speaker. And Alexa has that infuriating “Now playing from…” thing you can’t turn off as well.

During fiddling with this I removed PulseAudio from the Pi as a dead loss.

Nor could I use an Anker Soundcore as a Bluetooth speaker: the stupid thing has some sleep mode that means it misses off the first 3 seconds or so of whatever’s playing.

Instead I have the USB speaker above. It’s not great but is enough to be heard from outside and inside.

Aside from playing whatever through the speaker, the bell notifies my desktop as well as sending an email. Here’s the somewhat crappy script it’s running:

#!/usr/bin/python3 -u

# Not going to win any awards this one, is it?
# The Pi is wired up such that pin 18 goes through the switch to ground.
# The on-pin pull-up resistor is enabled (so .input() is normally True).
# When the circuit completes, it goes to ground and hence we get a
# falling edge and .input() becomes False.
# I get the occasional phantom still so we wait for settle_time before
# thinking it's real.

from email.mime.text import MIMEText
from subprocess import Popen, PIPE
from datetime import datetime

import RPi.GPIO as GPIO
import subprocess
import alsaaudio
import threading
import signal
import wave
import time
import sys
import os

samplefile = sys.argv[1]

# the ALSA output device (the USB speaker)
device = 'default'

# in seconds
settle_time = 0.1
bounce_time = 1

active = False

def notify():
    subprocess.call(['/home/pi/notify-sent'])

    msg = MIMEText('At %s' % datetime.now().strftime('%Y-%m-%d %H:%M:%S'))
    msg['From'] = 'doorbell <[email protected]>'
    msg['To'] = 'John Levon <[email protected]>'
    msg['Subject'] = 'Someone is ringing the doorbell'

    p = Popen(['/usr/sbin/sendmail', '-f', '[email protected]', '-t', '-oi'], stdin=PIPE)
    p.communicate(msg.as_bytes())

def play():
    global samplefile
    global active

    active = True
    count = 0

    with wave.open(samplefile) as f:

        format = None

        # 8bit is unsigned in wav files
        if f.getsampwidth() == 1:
            format = alsaaudio.PCM_FORMAT_U8
        # Otherwise we assume signed data, little endian
        elif f.getsampwidth() == 2:
            format = alsaaudio.PCM_FORMAT_S16_LE
        elif f.getsampwidth() == 3:
            format = alsaaudio.PCM_FORMAT_S24_3LE
        elif f.getsampwidth() == 4:
            format = alsaaudio.PCM_FORMAT_S32_LE
        else:
            raise ValueError('Unsupported format')

        rate = f.getframerate()

        periodsize = rate // 8

        out = alsaaudio.PCM(alsaaudio.PCM_PLAYBACK, device=device)
        out.setchannels(f.getnchannels())
        out.setrate(rate)
        out.setformat(format)
        out.setperiodsize(periodsize)

        # We always play at least one time round...
        while active or count < 1:
            data = f.readframes(periodsize)

            if data:
                out.write(data)
            else:
                print('looping after %d plays, active %s' % (count, active))
                f.rewind()
                count += 1

        print('pausing audio')
        out.pause(1)

    print('stopped after %d plays' % count)

def wait():
    global active

    while True:
        input_state = GPIO.input(18)
        if input_state:
            print('got input_state %s, active -> False' % input_state)
            active = False
            return
        time.sleep(0.1)

def trigger():
    print('triggering at %s' % time.time())

    tn = threading.Thread(target=notify)
    tn.start()

    tp = threading.Thread(target=play)
    tp.start()

    tw = threading.Thread(target=wait)
    tw.start()

def settle():
    global settle_time
    time.sleep(settle_time)
    input_state = GPIO.input(18)
    print('input state now %s' % input_state)
    return not input_state

def falling_edge(channel):
    input_state = GPIO.input(18)
    print('got falling edge, input_state %s' % input_state)
    if settle():
        trigger()

with wave.open(samplefile) as f:
    # things go horrible if the rate isn't 48000 for some reason
    if f.getframerate() != 48000:
        raise ValueError('file must be 48000 rate')
    if f.getsampwidth() not in [1, 2, 3, 4]:
        raise ValueError('Unsupported format')

GPIO.setmode(GPIO.BCM)
GPIO.setup(18, GPIO.IN, pull_up_down=GPIO.PUD_UP)
GPIO.add_event_detect(18, GPIO.FALLING, callback=falling_edge, bouncetime=(bounce_time * 1000))

signal.pause()



URLs in gnome-terminal and mutt

Apr 9, 2020

For some time now, gnome-terminal, amongst others, has had a heuristic that guesses at URLs and allows you to control-click to open them directly. However, this was easily foxed by applications doing their own line-wrapping instead of letting the terminal do so.

A few years ago, gnome-terminal gained ANSI escape sequences for URL highlighting. It requires applications to output the necessary escape codes, but works far more reliably.
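The sequences in question are the OSC 8 hyperlinks: ESC ] 8 ; ; URL ESC \ opens a link, and the same sequence with an empty URL closes it. A small hypothetical helper shows the shape (this matches the sed substitution in the mutt script later on):

```python
def osc8_link(url, text=None):
    """Wrap text in the OSC 8 hyperlink escape sequence understood by
    VTE-based terminals such as gnome-terminal."""
    if text is None:
        text = url
    return '\x1b]8;;%s\x1b\\%s\x1b]8;;\x1b\\' % (url, text)

# e.g. print(osc8_link('https://example.com', 'click me'))
```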

Annoyingly, you still need to control-click, but that is easily fixed. I rebuilt Ubuntu’s build with this change like so:

sudo apt build-dep gnome-terminal
apt source gnome-terminal
cd gnome-terminal-3.28.2
dpkg-buildpackage --no-sign -b
sudo dpkg -i ../gnome-terminal_3.28.2-1ubuntu1~18.04.1_amd64.deb

This would be most useful if mutt supported the sequences, but unfortunately its built-in pager is stuck behind libncurses and can’t easily get out from under it. Using an external pager with mutt is not great either, as you lose all the integration.

There’s also no support in w3m. Even though it thankfully avoids libncurses, it’s a bit of a pain to implement, as instead of just needing to track individual bits for bold on/off or whatever, there’s a whole URL target that needs mapping onto the (re)drawn screen lines.

So instead there’s the somewhat ersatz:

$ grep email-html ~/.muttrc
macro pager,index,attach k "<pipe-message>email-html<Enter>"


$ cat email-html

dir=$(mktemp -d -p /tmp)

ripmime -i - -d $dir --name-by-type

cat $dir/text-html* | w3m -no-mouse -o display_link \
    -o display_link_number -T text/html | \
    sed 's!https*://.*!\x1B]8;;&\x1B\\&\x1B]8;;\x1B\\!g' | less -rX

rm -rf $dir

It’ll have to do.