As I mentioned in my last
post, I’ve been
experimenting with replacing davmail
with Simon Robinson’s super-cool
email-oauth2-proxy, and
hooking fetchmail
and mutt
up to it. As before, here’s a specific rundown of
how I configured O365 access using this.
Configuration
We need some small tweaks to the shipped configuration file. It’s used for both
permanent configuration and acquired tokens, but the static part looks something
like this:
[[email protected]]
permission_url = https://login.microsoftonline.com/common/oauth2/v2.0/authorize
token_url = https://login.microsoftonline.com/common/oauth2/v2.0/token
oauth2_scope = https://outlook.office365.com/IMAP.AccessAsUser.All https://outlook.office365.com/POP.AccessAsUser.All https://outlook.office365.com/SMTP.Send offline_access
redirect_uri = https://login.microsoftonline.com/common/oauth2/nativeclient
client_id = facd6cff-a294-4415-b59f-c5b01937d7bd
client_secret =
We’re re-using davmail’s client_id again.
Updated 2022-11-22: you also want to set delete_account_token_on_password_error to False: I’m a bit bemused as to why anyone would ever want this set to True, and unfortunately, that’s the default.
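For reference, that’s a one-line addition to the same config file; in the versions I’ve used, this global option lives in the [emailproxy] section (check the sample configuration shipped with the proxy if yours differs):
[emailproxy]
delete_account_token_on_password_error = False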
We’ll configure fetchmail
as follows:
poll localhost protocol IMAP port 1993
auth password username "[email protected]"
is localuser here
keep
sslmode none
mda "/usr/bin/procmail -d %T"
folders INBOX
and mutt
like this:
set smtp_url = "smtp://[email protected]@localhost:1587/"
unset smtp_pass
set ssl_starttls=no
set ssl_force_tls=no
When you first connect, you will get a GUI pop-up and you need to interact with
the tray menu to follow the authorization flow. After that, the proxy will
refresh tokens as necessary.
Running in systemd
Here’s the service file I use, slightly modified from the upstream’s README:
$ cat /etc/systemd/system/emailproxy.service
[Unit]
Description=Email OAuth 2.0 Proxy
[Service]
ExecStart=/home/localuser/src/email-oauth2-proxy/emailproxy.py --external-auth --no-gui --config-file /home/localuser/src/email-oauth2-proxy/my.config
Restart=always
[Install]
WantedBy=multi-user.target
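With the unit file in place, the usual systemd dance applies - something like:
$ sudo systemctl daemon-reload
$ sudo systemctl enable --now emailproxy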
Headless operation
In the upstream project, only initial authorizations require the GUI.
Unfortunately, for truly headless operation, things are a bit more complicated.
In theory, you can use --local-server-auth with a localhost redirect_uri, but this is awkward enough to use that it seems useless: the README talks vaguely about log monitoring, and this hack isn’t permitted by the davmail client_id anyway.
The maintainer isn’t interested in supporting headless in any other way, so I
have a fork with this in my no-gui-external
branch.
This does what davmail
does when an authorization is needed, like this:
$ sudo systemctl stop emailproxy
$ ./emailproxy.py --no-gui --config-file /home/localuser/src/email-oauth2-proxy/my.config --external-auth
# Now connect from mutt or fetchmail
2022-11-09 23:44:25: Authorisation request received for [email protected] (interactive mode)
Please visit the following URL to authenticate account [email protected]: https://login.microsoftonline.com/common/oauth2/v2.0/authorize?client_id=...
Please paste the full resulting redirection URL:
# ...
2022-11-09 23:44:42: SMTP (localhost:1587; [email protected]) [ Successfully authenticated SMTP connection - releasing session ]
^C
$ sudo systemctl start emailproxy
Obviously, you’ll need to do this interactively from the terminal, then restart
in daemon mode.
So far this is working well for me, but it’s certainly ugly. I wish there were a better way to do this…
I previously described accessing Office365 email (and in particular its oauth2 flow) via davmail, allowing me to continue using fetchmail, procmail and mutt. As davmail is Java, it’s a pain to have around, so I thought I’d give some details on how to do this more directly in fetchmail, as all the available docs I found were a little vague, and it’s quite easy to screw up.
As it happens, I came across a generally better solution shortly after writing this post, on which more later.
Fetchmail 7
Unfortunately there is little interest in releasing a Fetchmail version with
oauth2
support - the maintainer is taking a political
stance on not
integrating it - so you’ll need to check out the next
branch from git:
cd ~/src/
git clone -b next [email protected]:fetchmail/fetchmail.git fetchmail-next
cd fetchmail-next
./autogen.sh && ./configure --prefix=/opt/fetchmail7 && make && sudo make install
I used the branch as of 43c18a54 Merge branch 'legacy_6x' into next. Given that the maintainer warns us they might remove oauth2 support, you might need this exact hash…
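If you do need to pin to that exact commit, something like this works (the branch name is just an example):
cd ~/src/fetchmail-next
git checkout -b next-oauth2 43c18a54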
Generate a token
We need to go through the usual flow for getting an initial token. There’s a
helper script for this, but first we need a config file:
[email protected]
client_id=facd6cff-a294-4415-b59f-c5b01937d7bd
client_secret=
refresh_token_file=/home/localuser/.fetchmail-refresh
access_token_file=/home/localuser/.fetchmail-token
imap_server=outlook.office365.com
smtp_server=outlook.office365.com
scope=https://outlook.office365.com/IMAP.AccessAsUser.All https://outlook.office365.com/POP.AccessAsUser.All https://outlook.office365.com/SMTP.Send offline_access
auth_url=https://login.microsoftonline.com/common/oauth2/v2.0/authorize
token_url=https://login.microsoftonline.com/common/oauth2/v2.0/token
redirect_uri=https://login.microsoftonline.com/common/oauth2/nativeclient
Replace [email protected] and localuser in the above, and put it at ~/.fetchmail.oauth2.cfg. It’s rare to find somebody mention this, but O365 does not need a client_secret, and we’re just going to borrow davmail’s client_id - it’s not a secret in any way, and trying to get your own is a royal pain. Also, if you see a reference to tenant_id anywhere, ignore it - common is what we need here.
Run the flow:
$ # This doesn't get installed...
$ chmod +x ~/src/fetchmail-next/contrib/fetchmail-oauth2.py
$ # Sigh.
$ sed -i 's+/usr/bin/python+/usr/bin/python3+' ~/src/fetchmail-next/contrib/fetchmail-oauth2.py
$ ~/src/fetchmail-next/contrib/fetchmail-oauth2.py -c ~/.fetchmail.oauth2.cfg --obtain_refresh_token_file
To authorize token, visit this url and follow the directions:
https://login.microsoftonline.com/common/oauth2/v2.0/authorize?...
Enter verification code:
Unlike davmail, this needs just the code, not the full returned URL, so you’ll need to be careful to dig out just the code from the response URL (watch out for any session_state parameter at the end!).
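In other words, the browser ends up on a URL roughly like the following (values here are purely illustrative), and what you paste is just the value of the code parameter:
https://login.microsoftonline.com/common/oauth2/nativeclient?code=0.AXoAbc...&session_state=1234...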
This will give you an access token that will last for around an hour.
Fetchmail configuration
Now we need an oauthbearer .fetchmailrc like this:
set daemon 60
set no bouncemail
poll outlook.office365.com protocol IMAP port 993
auth oauthbearer username "[email protected]"
passwordfile "/home/localuser/.fetchmail-token"
is localuser here
keep
sslmode wrapped sslcertck
folders INBOX
mda "/usr/bin/procmail -d %T"
Replace [email protected] and localuser.
At this point, hopefully starting /opt/fetchmail7/bin/fetchmail
will work!
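For a first run, it’s worth doing a one-shot foreground poll with verbose output so you can see what’s going on - something like:
$ /opt/fetchmail7/bin/fetchmail -d0 -v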
Refresh tokens
As per the OAUTH2 README, fetchmail itself does not take care of refreshing the token, so you need something like this in your crontab:
*/2 * * * * $HOME/src/fetchmail-next/contrib/fetchmail-oauth2.py -c $HOME/.fetchmail.oauth2.cfg --auto_refresh
I have the misfortune of maintaining some things using CMake. One major
annoyance is that __FILE__
is an absolute path, and that can’t be changed in
CMake itself. Like most CMake annoyances, you can find a discussion online from
about 15 years ago, but no sign of an actual fix.
Instead, you need a hack: this - I think - is the simplest one.
First, in our top-level CMakeLists.txt, we’ll define this helper function:
function(add_srcs NAME)
  set(${NAME} ${ARGN} PARENT_SCOPE)
  foreach(f IN LISTS ARGN)
    file(RELATIVE_PATH b ${CMAKE_SOURCE_DIR}
         ${CMAKE_CURRENT_SOURCE_DIR}/${f})
    set_source_files_properties(${f} PROPERTIES COMPILE_DEFINITIONS
                                "__FILE__=\"${b}\"")
  endforeach()
endfunction()
This will take each of the arguments, convert each file into a path relative to the top-level directory, then re-define __FILE__ on a per-source-file basis. We also set() a variable for our parent scope to use.
We’ll also need -Wno-builtin-macro-redefined.
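One way to do that globally (a sketch; per-target compile options would also work) is to add, in the top-level CMakeLists.txt:
add_compile_options(-Wno-builtin-macro-redefined)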
Then, in each child CMakeLists.txt, we will do something like:
add_srcs(MYCODE_SRCS mycode.c mycode.h)
add_library(mycode ${MYCODE_SRCS})
add_srcs(CODE2_SRCS code2.c code2.h)
add_library(code2 ${CODE2_SRCS})
I thought it might be interesting, at least to myself, to write up how I
approach code reviews. My history in tech is one where close code review was
emphasized and well-respected: most appreciated that a detailed review was not
only worth the reviewer’s time, but mutually beneficial, and often a great
learning experience for everyone. So my tiny mind can’t process ideas like post-commit reviews, which seem to be predicated on the idea that testing is some kind of replacement for code review. To me, the kinds of things that tests can cover are only a very small part of what code review is useful for.
I’ve code-reviewed entire projects totalling many thousands of lines, and
single-character changes, but I’m usually following at least some of the below
patterns either way:
Understand the context
First thing I read is the commit message. Of course, this should be in normal
form, but I’m really making sure I can understand
what the change is, based solely upon the commit message. Without looking at the
code, can I, as a casual observer, understand what’s changed? Is the commit
title specific enough? Does the commit message’s contents describe not just
what changed, but why (as usual, the GNU standard is an exemplar of what not
to do here)? Is it clear? Does the message needlessly include things that belong in
another tracking system (target gate)?
I will read any associated ticket for its context - especially keeping an eye
out for anything that doesn’t seem to fit with the PR’s changes. This could be a
missing case, or a fundamental misunderstanding of what the real underlying
problem is. If there is any design doc mentioned (and they should be mentioned!)
I’ll also read that and diff
its contents against what actually got
implemented.
I’m looking mainly for disparities between what everyone agreed we should do,
and what is actually happening in the code, but especially for missing things;
it’s very easy to accidentally skip somebody’s drive-by comment, but that could
turn out to be crucial to the implementation.
I also consider if this change makes sense on its own, and if it could be split up: this is often a matter of appetite (and personally I find the Linux kernel approach often goes a little too far), but a patch series with one logical change per commit is often much easier to review. It should hopefully go without saying that each individual commit in the series should still pass CI, but unfortunately that’s painful to do with github PRs, at least.
Get an overview
Next I start looking at the actual code changes: often with one tab per file, I’m trying to understand how the changes fit together: who calls what, what new intra-code dependencies there are, what the possible impact of the changes could be.
I might well look back in git history for each of these files to understand why
the old code is like it is: this is also often very useful in identifying
potential issues with the new changes.
Depending on the change, I will often checkout a local copy, and use git grep, ctags, etc. to help me understand how everything fits together.
My focus at this level is often on interfaces: does a new method have a suitable
name? Is it at the right level of abstraction? What is the ownership of the
relevant objects? Are there any layering violations?
Are there any external dependencies we need to worry about? Equally if anyone is
depending on us, are we providing well-written interfaces? Are they designed
with care and attention to versioning, information hiding, and all the usual API
concerns? Is this going to wear well after it’s been in production for years?
I’m also bearing in mind other ongoing work: if there’s a project underway that
is directly relevant to this specific change, I might ask for some accommodation
that will make the eventual merge of both easier. Equally if there’s a general
desire to take a particular technical direction, I might complain if something
is taking a different tack.
It’s a rare code review where I don’t have to pause to go research something: systemd service definition semantics, syscall error modes, how selinux roles work, etc. As I said above, great learning experience!
Are there potential performance concerns with the change: lots of unnecessary
I/O, potential for big-O issues, needless overhead etc? What are the expected
limits of the objects being handled?
What if there are bugs with this change: is the error handling suitable? Is
there a sufficient level of logging, exception details, etc. to identify in the
field what went wrong? Is there unnecessary noise? How would a stressed-out SRE
deal with this in production?
Have any necessary unit/component tests been updated or added? Do they actually
test something useful?
I almost never build or test changes I’m reviewing: that’s a job for the
submitter and your CI infrastructure. The only exception is if I’m struggling to
understand something, and running the tests would help.
Detailed review
I’m now going to go line-by-line through the whole patch, leaving comments where necessary. Sometimes I’ll reach something I don’t understand, and leave a “FIXME” for myself: if, after reading the whole change, I still don’t understand what’s happening, this will often re-formulate itself into a question for the submitter, along with a request for expanded code comments, but usually I can just delete these later.
If I find major - architectural level - issues with what I’m looking at, that’s
often a good prompt to take the discussion elsewhere, perhaps to a Zoom call or
design document discussion. Doing design review inside a PR is not fun for
anyone.
I’ve noticed a tendency to “review the diffs”: the idea that only the changed lines are relevant to the review - the fact that tools expand only 10 lines at a time is a symptom of this. This is very wrong-headed in my opinion, and I often find myself reading the rest of the code to make sure I can properly review what has changed.
Comb for nits
Everyone has a different appetite for code review nits: generally, I will always
point out actual typos (often just once, if it’s repeated, expecting the
submitter to apply my comment to all instances). If I have a substantive
comment, I might also suggest some style-level improvements. I never expect
someone to make these sorts of changes for existing code: the general idea is to
leave the code in a slightly better place than it was, not re-write whole files
for cosmetic nits.
Often these stylistic nits are marked “optional”: if the submitter feels like
it, they could change it, but it’s no big deal if not.
I’ll very often have style comments on things like:
- unnecessary comments that just describe the code
- missing comments
- variable, function naming
- function size and decomposition
- local customs
Many of these things can be a matter of opinion, so I try to bear in mind other
ways of thinking, up to a point. I’m never going to be happy seeing a ton of
CamelCase and Hungarian notation in code that doesn’t have it already.
Iterate
I haven’t yet found a code review tool that’s ideal at iteration: gerrit is
still pretty hopeless at tracking outstanding comment state. PRs in github are
better at displaying this, but have the fatal flaw that any history rewrite
means all context is lost.
Regardless, when I get a new version of the changes, I’ll often review both the
incremental diff and the whole change, checking that:
- my review comments have been acted upon and the fixes look good
- the change as a whole still makes sense:
- is the commit message still correct?
- are there now unnecessary changes, due to follow-on fixes?
- do the updates go far enough?
Long review cycles for a PR can be grueling, both for reviewers and the PR
owner. But in my opinion it’s almost always worth the effort, especially for
complex changes: this code is
probably going to live a lot longer than you’d think, and be maintained by
people other than you.
Even worse, it’s often work not given the respect it’s due: PR owners can see it
as a combative process, and management can see it as overhead. I don’t really
know what to do about that.
Pay it forward!
It’s still common for a systems library to be written in the default lingua
franca, C, although Rust is encroaching, for good reasons.
However, when it comes to testing, things get tedious quickly: writing unit or
component tests in C is a slow, bug-prone exercise. With
libvfio-user, after fixing too many
bugs that were due to the test rather than the test subject, I decided it would
be worth looking at alternative approaches. The aim was to reduce the time it
takes to develop unit/component tests.
Up until this point, we’d been using
ctest, along with
cmocka when we needed to mock out certain functions (such
as socket handling). Leaving aside my strong feelings on these tools, this was
rather unsatisfactory: libvfio-user effectively implements a (UNIX) socket server, but we weren’t actually testing round-trip interactions for the most part. In terms of code coverage, very little of use could be done via this unit-testing approach, while the “sample” client/server was tedious to work with for testing purposes.
Python-based testing
After a quick proof of concept, it became clear that using Python would be a
great choice to cover most of our testing needs. libvfio-user
doesn’t ship
with any client bindings, and, given that the main clients are
qemu,
cloud-hypervisor and
SPDK, Python bindings would be of dubious utility.
As a result, we decided against “proper” Python bindings, auto-generated or
otherwise, in favour of a small and simple approach. In particular, by using the terrible magic of ctypes, we could easily set up both client and server test cases that fully represent how the library works in real life.
So, instead of auto-generated bindings, we write - by hand - simple, thin layers of type wrappers:
import ctypes as c
from ctypes import Structure

class vfio_irq_info(Structure):
    _pack_ = 1
    _fields_ = [
        ("argsz", c.c_uint32),
        ("flags", c.c_uint32),
        ("index", c.c_uint32),
        ("count", c.c_uint32),
    ]
small harness routines for socket handling
…
def connect_client(ctx):
    sock = connect_sock()
    json = b'{ "capabilities": { "max_msg_fds": 8 } }'
    # struct vfio_user_version
    payload = struct.pack("HH%dsc" % len(json), LIBVFIO_USER_MAJOR,
                          LIBVFIO_USER_MINOR, json, b'\0')
    hdr = vfio_user_header(VFIO_USER_VERSION, size=len(payload))
    sock.send(hdr + payload)
    ...
… interacting with the library on the server side …
def get_pci_header(ctx):
    ptr = lib.vfu_pci_get_config_space(ctx)
    return c.cast(ptr, c.POINTER(vfu_pci_hdr_t)).contents
… and so on. Writing this by hand might seem immensely tedious, but in practice,
as it’s pretty much all boilerplate, it’s very quick to write and modify, and
easily understandable; something that can rarely be said for any kind of
auto-generated code.
Client/server interactions
Another observation was that, for the purposes of these tests, we really didn’t need a client process and a server process: in fact, we don’t even need more than one thread of execution. If we make each test round-robin between acting as the client, then acting as the server, it becomes trivial to follow the control flow, and understanding logs, debugging, etc. is much easier. This is illustrated by the msg() helper:
def msg(ctx, sock, cmd, payload=bytearray(), expect_reply_errno=0, fds=None,
        rsp=True, expect_run_ctx_errno=None):
    """
    Round trip a request and reply to the server. vfu_run_ctx will be
    called once for the server to process the incoming message,
    @expect_run_ctx_errno checks the return value of vfu_run_ctx. If a
    response is not expected then @rsp must be set to False, otherwise this
    function will block indefinitely.
    """
    # FIXME: if expect_run_ctx_errno == errno.EBUSY, shouldn't it be implied
    # that rsp == False?
    hdr = vfio_user_header(cmd, size=len(payload))

    if fds:
        sock.sendmsg([hdr + payload], [(socket.SOL_SOCKET, socket.SCM_RIGHTS,
                                        struct.pack("I" * len(fds), *fds))])
    else:
        sock.send(hdr + payload)

    ret = vfu_run_ctx(ctx, expect_errno=expect_run_ctx_errno)
    if expect_run_ctx_errno is None:
        assert ret >= 0, os.strerror(c.get_errno())

    if not rsp:
        return

    return get_reply(sock, expect=expect_reply_errno)
We are operating as the client when we do the sendmsg(); the server then processes that message via vfu_run_ctx(), before we “become” the client again and receive the response via get_reply().
We can then implement an individual test like this:
def test_dma_region_too_big():
    global ctx, sock

    payload = vfio_user_dma_map(argsz=len(vfio_user_dma_map()),
                                flags=(VFIO_USER_F_DMA_REGION_READ |
                                       VFIO_USER_F_DMA_REGION_WRITE),
                                offset=0, addr=0x10000,
                                size=MAX_DMA_SIZE + 4096)

    msg(ctx, sock, VFIO_USER_DMA_MAP, payload, expect_reply_errno=errno.ENOSPC)
which we can run via make pytest:
...
___________________________ test_dma_region_too_big ____________________________
----------------------------- Captured stdout call -----------------------------
DEBUG: quiescing device
DEBUG: device quiesced immediately
DEBUG: adding DMA region [0x10000, 0x80000011000) offset=0 flags=0x3
ERROR: DMA region size 8796093026304 > max 8796093022208
ERROR: failed to add DMA region [0x10000, 0x80000011000) offset=0 flags=0x3: No space left on device
ERROR: msg0x62: cmd 2 failed: No space left on device
...
This is many times easier to write and test than trying to do this in C, whether
as a client/server, or attempting to use mocking. And we can be reasonably
confident that the test is meaningful, as we are really executing all of the
library’s message handling.
With a little bit of tweaking, we can also use standard C-based tools like valgrind and gcov. Code coverage is simple: after defeating the mini-boss of cmake, we can run make gcov and get code-coverage results for all C code invoked via the Python tests - it just works!
Running Python tests with valgrind
was a little harder: for leak detection, we
need to make sure the tests clean up after themselves explicitly. But Python
itself also has a lot of valgrind
noise. Eventually we found that this valgrind
invocation worked well:
PYTHONMALLOC=malloc \
valgrind \
--suppressions=$(CURDIR)/test/py/valgrind.supp \
--quiet \
--track-origins=yes \
--errors-for-leak-kinds=definite \
--show-leak-kinds=definite \
--leak-check=full \
--error-exitcode=1 \
$(PYTESTCMD)
We need to force Python to use the system allocator, and add a number of
suppressions for internal Python valgrind
complaints - I was unable to find a
working standard suppression file for Python, so had to construct this myself
based upon the Python versions in our CI infrastructure.
Unfortunately, at least on our test systems, ASAN was completely incompatible,
so we couldn’t directly run that for the Python tests.
Summary
The approach I’ve described here has worked really well for us: it no longer feels immensely tedious to add tests along with library changes, which can only help improve overall code quality. The tests are quick to run and modify, and for the most part it’s easy to understand what they are actually doing.
There have been a few occasions where ctypes has been difficult to work with - for me the documentation is particularly sparse, and callbacks from the C library into Python are distinctly non-obvious - but we’ve so far always managed to battle through, and twist it to our needs.
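For what it’s worth, the basic ctypes callback pattern looks something like this - a generic sketch rather than code lifted from the test suite, and the exact callback signature and the use of vfu_setup_log() here are assumptions on my part:
import ctypes as c

# Callback type: void (*)(vfu_ctx_t *ctx, int level, const char *msg)
# (an assumed match for libvfio-user's log callback).
vfu_log_fn_t = c.CFUNCTYPE(None, c.c_void_p, c.c_int, c.c_char_p)

@vfu_log_fn_t
def log_cb(ctx, level, msg):
    print(level, msg.decode())

# 'lib' and 'ctx' are the harness handles used elsewhere in the tests. Keep
# log_cb referenced for as long as the library might call it, otherwise the
# ctypes trampoline can be garbage-collected underneath it.
lib.vfu_setup_log(ctx, log_cb, 7)  # 7 == LOG_DEBUG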
Doing things this way has a few other drawbacks: it’s not clear, for example, how we might test
intermittent allocation failures, or other failure injection scenarios. It’s
also not really suitable for any kind of performance or scalability testing.
I’m curious if others have taken a similar approach, and what their experiences
might be.
Stefan Hajnoczi recently posted about clean commit history. It’s a controversial viewpoint that not everyone agrees with - there is a sizable population in favour of “never rewrite history”. For me, though, the points he makes there are totally correct: each commit should be a logical change, main (née master) should stay green, and CI should pass at every single point in main’s history. More than just CI though: regardless of whether it passes CI, the main branch should be of good quality at all times, if you want to avoid the Quality Death Spiral.
Unfortunately, Github pull requests make this model a little difficult for a few
reasons:
You can’t ever rebase a PR undergoing review
It’s important that a non-draft PR is never rebased, or re-written in any way.
Why? Well, aside from making it difficult for a reviewer to see what’s changed
since last looking, if you rebase, the commits previously on the PR disappear
off into reflog
hyperspace.
The View Changes
button on review comments is attached to that particular
commit hash, which is no longer in the history for that branch, and you get the
dreaded:
We went looking everywhere, but couldn’t find those commits.
Note that if your PR is still a draft, you’re fine to edit the history whichever
way you like: in fact, it’s often useful for review purposes to have multiple
commits even at the start of a PR review before you move it from draft. Up to you.
The only other safe time to rebase is on final approach. At that point, presuming you are keeping to the “single main commit per PR” approach (see below), you’ll be wanting to squash the entire branch history into a single commit to main. For this, I usually use prr: it’s handy for picking up Reviewed-by automatically, and merging commit comments together for final editing.
Github CI only runs on branch tips
You probably don’t want to have a PR where you’re going to merge more than
one commit into main. This is because CI only runs on the top-level commit: if
an ancestor commit breaks the build, you’ll never know. Stefan mentions using
git rebase --exec
for checking commits in a stack, which indeed works great,
but unless you’re running exactly the same CI that’s running under Github
Actions, you can’t rely on it.
If that’s the case, what if you have one or more changes that depend on another?
This is where “stacked PRs” come in, and they’re a bit of a pain…
Stacked PRs are cumbersome
Gerrit has a really useful model for
reviewing stacks of changes: instead of the full history, each “patchset”
corresponds to the single logical change Stefan talks about above. Every time
you push to Gerrit, you’re supposed to have collapsed and rebased additional
changes into single commits corresponding to each Gerrit CR. The model has some
disadvantages as well (in particular, it’s a bit of a pain to keep a full
history locally), but the Gerrit review UI doesn’t suffer from the rebasing
issues Github does.
Presuming - as there is no CI available - gerrithub is a
non-starter, the only option available on Github is to use multiple PRs. This is
better than it used to be, but is still a little painful.
Essentially, a stacked PR is one that’s opened not against the main branch, but against another PR branch. Say we have changes A and B, where B is dependent on A. You’d create a local branch with A, then push it to Github and open a PR. You’d have another local branch with A and B, then push that branch to Github and open a separate PR.
Now we need to make the B PR be based against the A PR. You can do this via the web UI by clicking Edit, though there is an annoying bug here: it doesn’t reset the title and description. You can use gh pr create --base ... to avoid this problem.
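As a sketch (the branch names here are just examples), setting the whole stack up from the command line looks something like:
$ git push origin feature-a feature-b
$ gh pr create --head feature-a --base main
$ gh pr create --head feature-b --base feature-a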
Now, in the second PR, you’ll just see the commit for B. Each PR can be reviewed separately, and each PR gets its own CI run.
You also might want to merge additional changes up the stack. Let’s say that you have commit A2 on the A PR that you want in PRs B and C: the best - if rather tedious - way to do this is to merge A into B, then B into C. That’s a lot of merge commits, but remember we’re squashing a PR every time before merging a PR to main.
You’ll see recommendations on the web to “merge downwards”: you wait for commit approval for the whole stack, then merge the top PR (B) into the PR underneath it (A), and so on, until you merge to main.
I don’t think that’s necessary these days. Instead, when you have approval for the base PR - and logically, it makes sense that it is reviewed first - you can merge it to main. Github will then offer to delete the PR branch. If you do this, the stacked PR gets automatically reset such that its merge base is now main!
There is an annoying thing here though: because of that squash during the merge to main, git - and Github - need you to merge main back into the commit history of the PR that just changed bases. If you already merged the parent PR, you can always do git merge -Xours master to fix this, since there shouldn’t be any actual diff difference between the PR branch diffs as a whole, and what was merged to master. Or, if you didn’t merge in the parent PR, you’ll need a normal git merge master.
Another bug (as far as I’m concerned) is that if you ask for review on a stacked
PR, it doesn’t get tagged with “Review required”, since, technically, you could
merge the PR into its parent without approval. And there is no “Review
requested” tag.
I would love all this to have some tooling: something that lets me do everything on my local stacked branches, automate merges up, keep track of dependencies, and update the branches in Github. But I haven’t been able to find anything that can do it.
[2022-05-12 update]: I just came across spr
which is so far proving excellent in solving some of these problems. I love it!
For reasons, I now need to interact with Office365 mail and calendar. It should
go without saying that the Outlook webapp is almost painfully unusable (there
really is no key binding for “next unread email”). Thus began the quest to get
mutt interacting with the O365 server. This was a rather painful process: so
much of the information on the web refers to earlier authentication schemes,
Microsoft-special protocols, things that don’t support 2FA, dead Linux software,
useless Linux software, etc.
After many false starts, I eventually found a working solution that allows mutt
usage (and my treasured key binding for “mark current thread as read and move to
the next unread email”). That solution is
davmail. Yes, it’s on sourceforge, and yes,
it’s Java, but it works perfectly.
It’s not very well-documented, but you can run it in headless mode and still
do the interactive OAuth2 flow needed with modern O365. Your settings should
include:
davmail.mode=O365Manual
davmail.url=https://outlook.office365.com/EWS/Exchange.asmx
When davmail
starts, it will ask you to visit a URL and paste the resulting URL
back - this contains the necessary OAuth2 tokens it needs. No need for any GUI!
Once davmail
is running, your .fetchmailrc
can be:
poll localhost protocol IMAP port 1143
auth password username "[email protected]"
is localuser here
sslmode none
keep
mda "/usr/bin/procmail -d %T"
folders INBOX,etc,etc
Note that since davmail
is running locally, there’s not really any need for SSL,
though you can set that up if you like.
When you start fetchmail, enter your password, and that will initiate the auth flow against the davmail instance. Note that you’re not storing passwords anywhere, unlike the old-style app password approach you might have used previously on gmail and the like.
I don’t need to send mail often, so I have mutt
set up like this:
set smtp_url= "smtp://[email protected]@localhost:1025/"
unset smtp_pass
set ssl_starttls=no
set ssl_force_tls=no
Having to enter my password each time is not a big deal for me.
Equally I have my calendar app set up to pull over caldav from davmail. Works great. I’d love to be able to pull my O365 calendar into Google Calendar, but apparently Google and Microsoft are unable - or more likely unwilling - to make this work in any meaningful way.
I’m pretty sure it’s possible to adapt Google’s OAuth2 scripts to directly use
fetchmail
with O365’s modern auth stuff, but I’m not sure I have the energy to
figure it out - and as far as I can find, nobody else has?
I’m apparently old-school enough to find gmail and co painfully inefficient for handling significant amounts of mail. I still find procmail+mutt hard to beat. One thing mutt can’t do, however, is filter threads automatically - there’s no “mute” facility like gmail has; threads have to be processed manually.
Equally, procmail itself has no threading facilities or understanding of Message-Id or References.
Matching email threads
It can be done, though, with some cheesy awk:
#!/bin/bash
#
# If a mail message has a References: value found in the refs file, then
# add the requested header.
#
# Usage:
#
# cat mail_msgs | match-thread.sh ~/.mail.refs.muted "Muted: true"
#
ref_file="$1"
header="$2"
mail=/tmp/match-thread.mail.$$
cat - >$mail
newrefs="$(cat $mail | formail -x references -x message-id | tr -d '\n')"
touch $ref_file
cat $ref_file | awk -v newrefs="$newrefs" '
    BEGIN {
        found = 0;
        split(newrefs, tmp);
        for (i in tmp) {
            refs[tmp[i]]++;
        }
    }
    # Each thread will have one line in the ref file, with
    # space-separated references. So we just need to look for any
    # reference from the mail.
    {
        for (ref in refs) {
            if (index($0, ref) != 0) {
                found = 1;
                exit(0);
            }
        }
    }
    END {
        exit(found ? 0 : 1);
    }
'
if [[ $? = 0 ]]; then
cat $mail | formail -i "$header"
else
cat $mail
fi
rm $mail
Essentially, we record all the References
in the thread we’re trying
to act on. Then we can trigger the above to see if the new mail is part
of the thread of interest.
(This seems like the sort of thing formail
could do, given its -D
option has a message ID cache, but I haven’t even bothered to take a
look at how hard that would be…)
procmail usage
In .procmailrc, we’d use this like so:
:0 Wfh: formail.lock
| $HOME/src/procmail-thread/match-thread.sh $HOME/.refs.muted "Procmail-Muted: true"
:0 Wfh: formail.lock
| $HOME/src/procmail-thread/match-thread.sh $HOME/.refs.watched "Procmail-Watched: true"
This will add the given header if we find any of the email’s
References
values in our “database”.
Then, we can do what we like with the mails, like deliver them as
already-read, carbon copy them to the inbox, etc.:
:0
* Procmail-Muted: true
{
SWITCHRC=$HOME/.procmailrc.markread
}
:0
* Procmail-Watched: true
{
:0 c:
$DEFAULT
SWITCHRC=$HOME/.procmailrc.markread
}
:0
$DEST/
mutt usage
To actually watch or mute a thread, we add a couple of mutt macros:
macro index,pager "M" "|~/src/procmail-thread/add-thread.sh ~/.refs.muted<return>"
macro index,pager "W" "|~/src/procmail-thread/add-thread.sh ~/.refs.watched<return>"
The add-thread.sh
script is similar to the above, but populates the
refs file with all message IDs found in the given email.
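That script isn’t reproduced here, but a minimal sketch of the idea looks something like this (the real version lives in the repo below):
#!/bin/bash
#
# Sketch of add-thread.sh: append this message's Message-Id and References
# values to the given refs file as a single line (one line per thread).
ref_file="$1"
refs="$(formail -x references -x message-id | tr -d '\n')"
echo "$refs" >>"$ref_file"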
I put all this in a git repo.
I’m not the only one disappointed in the way the web has worked out
in lots of ways. From <blink>
onwards, so much semantic information is
missing from the average website, sometimes wilfully it appears. Why is
there so little structural data on what the components of a page are?
One particular peccadillo I dislike is “Previous/Next Page” elements on a list page. Nobody ever uses <a rel="next" ...>. If you’re lucky, there’s an aria-label attribute for accessibility purposes, but as it’s free-form text, and there isn’t even a convention, it could be pretty much anything.
For reasons unclear to me, almost no sites make use of the left/right
arrow keys for navigation. So if I want to map those keys to prev/next,
instead of a nice little bit of configuration, I have to resort to this user script:
(function() {
    'use strict';

    /* NB: we already tested for prefix/suffix, so this RE is OK. */
    function wholeWordMatch(haystack, needle) {
        let r = new RegExp("\\s" + needle + "\\s");
        return r.test(haystack);
    };

    const LEFT_KEY_CODE = 37;
    const RIGHT_KEY_CODE = 39;

    const prevStrings = [
        "previous page",
        "previous",
        "prev"
    ];

    const nextStrings = [
        "next page",
        "next"
    ];

    document.addEventListener("keyup", function(e) {
        if (!e) {
            e = window.event;
        }

        if (e.isComposing) {
            return;
        }

        switch (e.target.tagName) {
        case "TEXTAREA":
        case "INPUT":
            return;
        }

        const key = e.keyCode ? e.keyCode : e.which;
        var matches = undefined;

        if (key == LEFT_KEY_CODE) {
            matches = prevStrings;
        } else if (key == RIGHT_KEY_CODE) {
            matches = nextStrings;
        } else {
            return;
        }

        let found = undefined;
        let score = 0;

        document.querySelectorAll("a").forEach((link) => {
            let strs = [ link.textContent ];

            if (!link.href) {
                return;
            }

            /* This is often a good match if the text itself isn't. */
            if (link.attributes["aria-label"]) {
                strs.push(link.attributes["aria-label"].nodeValue);
            }

            for (let str of strs) {
                if (typeof str === "undefined") {
                    return;
                }

                str = str.toLowerCase();

                /*
                 * There's no perfect way to find the "best" link, but in
                 * practice this works on a reasonable number of sites: an exact
                 * match, or exact prefix or suffix, always wins; otherwise, we
                 * match a whole-word sub-string: "Go to prev <<" will match,
                 * but not "dpreview.com".
                 */
                for (let match of matches) {
                    if (str === match) {
                        found = link;
                        break;
                    }

                    if (str.startsWith(match) || str.endsWith(match)) {
                        found = link;
                        break;
                    }

                    if (score < 1 && wholeWordMatch(str, match)) {
                        found = link;
                        score = 1;
                    }
                }
            }
        });

        if (found) {
            found.click();
        }
    }, true);
})();
Yet again, hacky, but it mostly works. It’s pretty cool that this is
even possible though.
We have what should be a simple task: we’re on CentOS 7, and we want to
deploy a Go binary that will have USDT
tracepoints. USDT is an attractive
option for a few debugging purposes. It allows applications to define
tracepoints with higher levels of stability and semantic meaning than
more ad-hoc methods like dynamic uprobes.
Usage of USDT tracepoints tends to have a different focus from other
monitoring techniques like logging, Prometheus, OpenTracing etc. These
might identify a general issue such as a poor latency metric: you’d then
use USDT probes to dig further into the problems in a production system,
to identify precisely what’s happening at a particular endpoint or
whatever.
USDT in Go
The normal model for USDT involves placing the trace points at specific
places in the binary: they are statically defined and built, but
dynamically enabled. This is typically done via the DTRACE_PROBE()
family of macros.
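In C, for example, that looks roughly like the following minimal sketch using the sys/sdt.h macros (the provider and probe names are made up):
#include <sys/sdt.h>

int handle_request(int id, const char *path)
{
    /* Statically compiled-in probe site; effectively a no-op until enabled. */
    DTRACE_PROBE2(myprovider, request__start, id, path);
    /* ... */
    return 0;
}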
The only (?) USDT facility for Go is
salp. This uses
libstapsdt under the hood. This
library dynamically creates probes at runtime, even though Go is a
compiled language. Yes, this is dynamic static dynamic tracing.
We’re going to use salpdemo in our experiment. This has two USDT probes, p1 and p2, that we’d like to be able to dynamically trace, using bcc-tools’ handy trace wrapper. CentOS 7 doesn’t appear to have support for the later USDT support in perf probe.
Setting up a Docker container for dynamic tracing
For a few different reasons, we’d like to be able to trace from inside
the container itself. This has security implications, given what’s
implemented today, but bear in mind we’re on CentOS 7, so even if
there’s a finer-grained current solution, there’s a good chance it
wouldn’t work here. In reality, we would probably use an ad-hoc
debugging sidecar container, but we’re going to just use the one
container here.
First, we’re going to deploy the container with ansible for
convenience:
$ cat hosts
localhost ansible_connection=local
$ cat playbook.yml
---
- hosts: localhost
  become: yes
  tasks:
    - docker_container:
        name: usdt_test
        image: centos:7
        state: started
        command: sleep infinity
        network_mode: bridge
        ulimits:
          - memlock:8192000:8192000
        capabilities:
          - sys_admin
        volumes:
          - /sys/kernel/debug:/sys/kernel/debug
$ ansible-playbook -i hosts ./playbook.yml
Note that we’re using sleep infinity
here to keep our container
running so we can play around.
We need the sys_admin capability to be able to program the probes, and the BPF compiler needs the locked memory limit bumping. We also need to mount /sys/kernel/debug read-write (!) in order to be able to write to /sys/kernel/debug/tracing/uprobe_events.
Now let’s install everything we need to be able to trace these probes:
$ docker exec -it usdt_test yum -y install \
kernel-devel-$(uname -r) kernel-$(uname -r) bcc-tools
Yes, it’s a lot, but unavoidable. You can, in theory, use mounted
volumes for the kernel sources, as described
here;
however, the read-only mounts break packaging inside the container, so
we’re not doing that here.
Tracing the probes in the container
The above was a big hammer, but we should be good to go right? Let’s
start up the demo binary:
$ docker cp ~/salpdemo usdt_test:/root/
$ docker exec -it usdt_test bash
[root@8ccf34663dd2 /]# ~/salpdemo &
[1] 18166
List the go probes in this demo with
sudo tplist -vp "$(pgrep salpdemo)" "salp-demo*"
Trace this process with
sudo trace -p "$(pgrep salpdemo | head -n1)" 'u::p1 "i=%d err=`%s` date=`%s`", arg1, arg2, arg3' 'u::p2 "j=%d flag=%d", arg1, arg2'
or
sudo trace -p "$(pgrep salpdemo | head -n1)" 'u::p1 (arg1 % 2 == 0) "i=%d err='%s'", arg1, arg2'
We can indeed list the probes:
[root@8ccf34663dd2 /]# /usr/share/bcc/tools/tplist -vp $(pgrep salpdemo) | head
salp-demo:p1 [sema 0x0]
1 location(s)
3 argument(s)
salp-demo:p2 [sema 0x0]
1 location(s)
2 argument(s)
libc:setjmp [sema 0x0]
...
So let’s try the suggested trace
invocation:
# /usr/share/bcc/tools/trace -p "$(pgrep salpdemo | head -n1)" 'u::p1 (arg1 % 2 == 0) "i=%d err='%s'", arg1, arg2'
perf_event_open(/sys/kernel/debug/tracing/events/uprobes/p__tmp_salp_demo_I8qitQ_so_0x270_18166_bcc_18175/id): Invalid argument
Failed to attach BPF to uprobe
Huh. This doesn’t seem to be a permissions issue, since we got EINVAL. In addition, running from the host has the same problem.
I haven’t proved it, but I think our basic issue here is that CentOS 7 is missing this kernel fix:
tracing/uprobe: Add support for overlayfs
I spent way too long trying to work around this by placing the binary
somewhere other than overlayfs, before I finally dug a little bit more
into how libstapsdt
actually works, and figured out the problem.
Working around overlayfs and libstapsdt
To build probes dynamically at runtime, libstapsdt does something slightly crazy: it generates a temporary ELF shared library at runtime that contains the USDT probes and uses dlopen() to bring it into the running binary. Let’s have a look:
[root@8ccf34663dd2 /]# grep salp-demo /proc/$(pgrep salpdemo)/maps
7fa9373b5000-7fa9373b6000 r-xp 00000000 fd:10 1506373 /tmp/salp-demo-I8qitQ.so
7fa9373b6000-7fa9375b5000 ---p 00001000 fd:10 1506373 /tmp/salp-demo-I8qitQ.so
7fa9375b5000-7fa9375b6000 rwxp 00000000 fd:10 1506373 /tmp/salp-demo-I8qitQ.so
The process has mapped in this temporary file, named after the provider. It’s on /tmp, hence on the overlay2 filesystem, explaining why moving the salpdemo binary itself around made no difference.
So maybe we can be more specific?
[root@8ccf34663dd2 /]# /usr/share/bcc/tools/trace -p "$(pgrep salpdemo | head -n1)" 'u:/tmp/salp-demo-I8qitQ.so:p1 (arg1 % 2 == 0) "i=%d err='%s'", arg1, arg2'
perf_event_open(/sys/kernel/debug/tracing/events/uprobes/p__tmp_salp_demo_I8qitQ_so_0x270_18166_bcc_18188/id): Invalid argument
Failed to attach BPF to uprobe
Still not there yet. The above bug means that it still can’t find the
uprobe given the binary image path. What we really need is the host
path of this file. We can get this from Docker:
$ docker inspect usdt_test | json -a GraphDriver.Data.MergedDir
/data/docker/overlay2/77c1397db72a7f3c7ba3f8af6c5b3824dc9c2ace9432be0b0431a2032ea93bce/merged
This is not good, as obviously we can’t reach this path from inside the
container. Hey, at least we can run it on the host though.
$ sudo /usr/share/bcc/tools/trace 'u:/data/docker/overlay2/77c1397db72a7f3c7ba3f8af6c5b3824dc9c2ace9432be0b0431a2032ea93bce/merged/tmp/salp-demo-I8qitQ.so:p1 (arg1 % 2 == 0) "i=%d err='%s'", arg1, arg2'
Event name (p__data_docker_overlay2_77c1397db72a7f3c7ba3f8af6c5b3824dc9c2ace9432be0b0431a2032ea93bce_merged_tmp_salp_demo_I8qitQ_so_0x270) is too long for buffer
Failed to attach BPF to uprobe
SIGH. Luckily, though:
$ sudo /usr/share/bcc/tools/trace 'u:/data/docker/overlay2/77c1397db72a7f3c7ba3f8af6c5b3824dc9c2ace9432be0b0431a2032ea93bce/diff/tmp/salp-demo-I8qitQ.so:p1 (arg1 % 2 == 0) "i=%d err='%s'", arg1, arg2'
PID TID COMM FUNC -
19862 19864 salpdemo p1 i=64 err=An error: 64
19862 19864 salpdemo p1 i=66 err=An error: 66
It worked! But it’s not so great: we wanted to be able to trace inside a
container. If we mounted /data/docker
itself inside the container, we
could do that, but it’s still incredibly awkward.
Using tmpfs?
Instead, can we get the generated file onto a different filesystem type? libstapsdt hard-codes /tmp, which limits our options.
Let’s start again with /tmp inside the container on tmpfs:
$ tail -1 playbook.yml
tmpfs: /tmp:exec
We need to force the exec mount flag on here: otherwise, we can’t dlopen() the generated file. Yes, not great for security again.
$ docker exec -it usdt_test bash
# ~/salpdemo &
...
[root@1f56af6e7bee /]# /usr/share/bcc/tools/trace -p "$(pgrep salpdemo | head -n1)" 'u::p1 "i=%d err=`%s` date=`%s`", arg1, arg2, arg3' 'u::p2 "j=%d flag=%d", arg1, arg2'
PID TID COMM FUNC -
Well, we’re sort of there. It started up, but we never get any output.
Worse, we get the same if we try this in the host now! I don’t know
what the issue here is.
Using a volume?
Let’s try a volume mount instead:
$ tail -3 playbook.yml
        volumes:
          - /sys/kernel/debug:/sys/kernel/debug
          - /tmp/tmp.usdt_test:/tmp
If we run trace in the host now, we can just use u::p1:
$ sudo /usr/share/bcc/tools/trace -p "$(pgrep salpdemo | head -n1)" 'u::p1 "i=%d err=`%s` date=`%s`", arg1, arg2, arg3' 'u::p2 "j=%d flag=%d", arg1, arg2'
PID TID COMM FUNC -
6864 6866 salpdemo p2 j=120 flag=1
...
But we still need a bit of a tweak inside our container:
# /usr/share/bcc/tools/trace -p "$(pgrep salpdemo | head -n1)" 'u::p1 "i=%d err=`%s` date=`%s`", arg1, arg2, arg3'
PID TID COMM FUNC -
<no output>
[root@d72b822cab0f /]# cat /proc/$(pgrep salpdemo | head -n1)/maps | grep /tmp/salp-demo*.so | awk '{print $6}' | head -n1
/tmp/salp-demo-6kcugm.so
[root@d72b822cab0f /]# /usr/share/bcc/tools/trace -p "$(pgrep salpdemo | head -n1)" 'u:/tmp/salp-demo-6kcugm.so:p1 "i=%d err=`%s` date=`%s`", arg1, arg2, arg3'
PID TID COMM FUNC -
11593 11595 salpdemo p1 i=-17 err=`An error: -17` date=`Thu, 06 Aug 2020 13:12:57 +0000`
...
I don’t have any clear idea why the name is required inside the
container context, but at least, finally, we managed to trace those USDT
probes!