Some prompting on IRC led me to do this write-up on how to configure PCI
passthrough for a bhyve instance running on SmartOS. Please be aware
this isn’t necessarily fully supported or tested; it may work for you,
it also may not.
Some of this is covered under RFD 114; what follows is more of a HOWTO.
Global zone configuration
To allow a bhyve zone to access a PCI device, we need to prevent the
global zone from accessing it, and make it available to bhyve zones. To do
this, we need to make two overlay files available to the system.
Remember that the SmartOS root is an ephemeral ramdisk: as we need to
change two files in /etc, we’ll have to modify our grub configuration:
Modify grub to include PPT config files
# mount our USB key (modify as needed, see diskinfo output)
mount -F pcfs -o foldcase /dev/dsk/c1t0d0p1 /mnt
We want to modify the menu entry we’re booting to be something like:
title my entry
kernel$ /os/20181023T131405Z/platform/i86pc/kernel/amd64/unix ...
module /os/20181023T131405Z/platform/i86pc/amd64/boot_archive type=rootfs name=ramdisk
module /os/20181023T131405Z/platform/i86pc/amd64/boot_archive.hash type=hash name=ramdisk
module /overlay/etc/ppt_aliases type=file name=etc/ppt_aliases
module /overlay/etc/ppt_matches type=file name=etc/ppt_matches
Make sure to add the type entry on all module lines! Before we
reboot, though, we need to actually populate these two files.
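Since a missing type entry is easy to overlook, a quick sanity check before rebooting can help. The following is a minimal Python sketch, not part of SmartOS: the function name modules_missing_type is made up for illustration, and it only looks at lines beginning with module (or module$) in the menu entry text you pass in.

```python
def modules_missing_type(menu_entry):
    """Return the module lines in a grub menu entry that lack a type= word."""
    missing = []
    for line in menu_entry.splitlines():
        words = line.split()
        # Only module lines matter here; skip title, kernel$ and blank lines.
        if not words or words[0] not in ("module", "module$"):
            continue
        if not any(word.startswith("type=") for word in words[1:]):
            missing.append(line.strip())
    return missing


if __name__ == "__main__":
    import sys

    # Usage: python check_menu.py < /mnt/boot/grub/menu.lst
    # (the menu.lst location is an assumption; adjust for your USB key)
    for bad_line in modules_missing_type(sys.stdin.read()):
        print("missing type=:", bad_line)
```

An empty result means every module line carries a type entry; it says nothing about whether the type values themselves are right.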
The first file, ppt_matches, is a list of *all* devices that we might
want to pass through, in PCI ID form:
# cat /mnt/overlay/etc/ppt_matches
This file should contain the PCI ID of each type of device you want to
pass through. (Please ignore all PCI specifics here; this is just for
illustration.) Every device on the system that has these IDs will be
listed (after a reboot) in pptadm list -a.
The second file is used to actually reserve specific devices for
pass-through, based on physical path. For example:
# cat /mnt/overlay/etc/ppt_aliases
ppt "/[email protected],0/pci8086,[email protected]/[email protected]"
ppt "/[email protected],0/pci8086,[email protected]/pci1462,2291"
This binds the “ppt” driver to the given paths under /devices. On a
reboot, the kernel will process this and attach ppt as needed. This
driver stub makes sure that the host kernel won’t try to process the
device.
Reboot the host
After we reboot, we should find our files are processed. They are
visible under the path /system/boot, and the existing copies in /etc
will be overridden. The pptadm(1m) tool is a handy way of listing the
devices:
# pptadm list -a -o dev,vendor,device,path
DEV VENDOR DEVICE PATH
/dev/ppt0 10de a65 /[email protected],0/pci8086,[email protected]/[email protected]
/dev/ppt1 10de be3 /[email protected],0/pci8086,[email protected]/pci1462,2291
We can see that two specific devices are now available for pass-through.
Now we need to configure our VM to actually use this device. In the JSON
for the VM, this looks something like this:
"path": "/devices/[email protected],0/pci8086,[email protected]/[email protected]",
"path": "/devices/[email protected],0/pci8086,[email protected]/pci1462,2291",
Here path is the physical path, and the PCI slot is what the guest
will see (the usual bus,device,function triple). Passing the new JSON
to vmadm update should allow the VM to boot with the new
configuration. You can check /zones/$uuid/logs/platform.log for any
problems.
I joined Joyent at the start of the year while Meltdown was breaking
news; it was certainly an “interesting” time to start a new job. Luckily
by my first week, Alex and Robert had pretty much figured out how the
changes should look and made good inroads on the implementation. So I
began working with Alex on his KPTI trampoline code (mainly involving
breaking it with my old friend KMDB). I also picked up the PCID work
which I describe here.
As you can probably tell from Alex’s blog
post, Meltdown is unusual
for a security issue: aside from the usual operational pains of any
security patch, the fix itself involved some pretty significant code
changes to the low-level core of the kernel.
There’s also another potential impact, and that’s performance. While the
actual overhead is heavily workload-dependent - and some of the reports
out there seem pretty alarmist - having to switch page tables (i.e.
%cr3) on every kernel entry and exit has a non-trivial
impact on system call cost. Nor can we keep the kernel state in the TLB.
Previously, we would set
PT_GLOBAL on kernel mappings so they’re not
flushed across a
%cr3 reload, but as the CPU would happily use these
TLB entries to speculate into the kernel, we must flush them.
The good news is that there’s a CPU feature on reasonably recent Intel
CPUs called Process Context IDs. This lets you load the lower bits of
%cr3 with a small integer value. This ID is used as a tag in any TLB
lookups or fills. This feature is somewhat similar to ASIDs seen on
other architectures, with one notable difference. The PCID applies to
TLB state implicitly, that is, there’s no way to say “load from memory
using this ID” in
ddi_copyin() and the like.
One way of using PCIDs is to associate an ID with a
struct as: that
is, each time we load a process’s address space into the HAT, we will
use a specific PCID for it, and avoid having to flush the mappings for
the previous processes. This isn’t really a viable option for Illumos,
though: if nothing else we suspect that the additional shootdown flushes
needed (since we’d maintain TLB entries even after switching away from a
struct as) would counteract any performance gain.
Instead we define two fixed PCID values.
PCID_KERNEL, defined as
mainly to keep the boot process simple, is used for the kernel.
Thus, all TLB loads while in the kernel will be tagged with this value.
PCID_USER is used when in userspace. Now, when we switch %cr3 on
kernel entry or exit, we can do a non-flushing load. This lets us keep
both the kernel and the userspace mappings around across kernel/user
transitions.
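To make the tagging concrete, here is a minimal Python sketch of how a %cr3 value could be composed once PCIDs are enabled. It relies on two facts from the Intel manuals: the PCID occupies the low 12 bits of %cr3, and setting bit 63 of the value being loaded asks the CPU not to invalidate the TLB entries for that PCID. The helper name make_cr3 and the numeric PCID values are assumptions for illustration, not taken from the illumos source.

```python
PCID_BITS = 12
PCID_MASK = (1 << PCID_BITS) - 1   # low 12 bits of %cr3 hold the PCID
CR3_NOINVL_BIT = 1 << 63           # "do not invalidate" hint on a %cr3 load

PCID_KERNEL = 0  # assumed value, for illustration only
PCID_USER = 1    # assumed value, for illustration only


def make_cr3(top_table_paddr, pcid, preserve_tlb=True):
    """Compose a %cr3 value from a page-aligned table address and a PCID."""
    if top_table_paddr & PCID_MASK:
        raise ValueError("top-level page table must be 4K-aligned")
    if not 0 <= pcid <= PCID_MASK:
        raise ValueError("PCID must fit in 12 bits")
    value = top_table_paddr | pcid
    if preserve_tlb:
        # Non-flushing load: entries tagged with this PCID stay in the TLB.
        value |= CR3_NOINVL_BIT
    return value


# Kernel exit: switch to the user page tables without flushing anything.
user_cr3 = make_cr3(0x1234000, PCID_USER)
# Invalidating load: bit 63 is left clear, so this PCID's entries are flushed.
flushing_cr3 = make_cr3(0x1234000, PCID_USER, preserve_tlb=False)
```

The second form is the one that matters later on, when a PCID has to be flushed on a CPU that lacks a dedicated instruction for it.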
When we do need to invalidate TLB entries, though, things are now
slightly more complicated. We are by definition in the kernel (and hence
using PCID_KERNEL), but we have to account for memory addresses below
USERLIMIT. In this case, we have to flush both PCID_USER (for
anything that ran in user mode) and PCID_KERNEL (for any accesses the
kernel may have made, such as with ddi_copyin()).
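The invalidation rule can be sketched as a small decision function. This is only an illustration in Python: the PCID values and the USERLIMIT constant are placeholders, and the assumption that addresses at or above USERLIMIT only need the kernel PCID flushed is mine rather than something stated here.

```python
PCID_KERNEL = 0  # assumed value, for illustration only
PCID_USER = 1    # assumed value, for illustration only

# Hypothetical boundary between user and kernel virtual addresses.
USERLIMIT = 0x0000_8000_0000_0000


def pcids_to_flush(vaddr):
    """Return the PCIDs whose TLB entries for vaddr need invalidating."""
    if vaddr < USERLIMIT:
        # User address: user mode may have cached it under PCID_USER, and the
        # kernel may have touched it (e.g. a copy-in) under PCID_KERNEL.
        return [PCID_USER, PCID_KERNEL]
    # Assumption: kernel-only addresses are never cached under PCID_USER.
    return [PCID_KERNEL]
```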
Switching HATs is also a little more complicated. As the %cr3 load
there is non-invalidating, we have to explicitly flush everything if
we’re switching away from a non-kas HAT, to clear out now-stale user-space
mappings. (Note that this has always been done eagerly on Illumos, even
when switching to a
The INVPCID instruction is what enables us to flush the user PCID’s
entries while in the kernel. Unfortunately, support for INVPCID came
quite some time after PCID itself. On such systems, we have to emulate,
and the only way Intel gives us to do this is to load the ID into %cr3,
invalidating the TLB entries. We don’t want to “pollute” PCID_USER
with any extraneous kernel mappings, so this means we need to switch to
the user page tables when loading PCID_USER. But, remember, KPTI
requires us not to have kernel text (or stack!) mapped into these page
tables. So we have to first make sure we’re in the trampoline text
before doing the invalidations: see
For those interested, Alex posted a draft
of the PCID changes.
All the procmail recipes I found on a quick search failed to handle
quoted-printable HTML encodings, regularly used everywhere. And those
that had quoted-printable examples used tools no longer maintained -
such as mimencode.
The solution is to use Perl directly:
:0 fbw
* ^Content-Type: text/html;
* ^Content-Transfer-Encoding: *quoted-printable
| perl -pe 'use MIME::QuotedPrint; $_=MIME::QuotedPrint::decode($_);'
:0 fbw
* ^Content-Type: text/html;
| lynx -dump -force_html -stdin
:0 Afhw
| formail -i "Content-Type: text/plain; charset=us-ascii"
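For anyone unsure what the decoding step actually does to a message body, here is a small Python sketch of the same transformation using the standard quopri module. It is a swapped-in illustration rather than part of the recipe: the function name decode_qp_body and the default charset are assumptions.

```python
import quopri


def decode_qp_body(body, charset="utf-8"):
    """Decode a quoted-printable body (bytes) into readable text."""
    # =XX escapes become raw bytes; a trailing "=" marks a soft line break
    # and is removed together with the newline that follows it.
    raw = quopri.decodestring(body)
    return raw.decode(charset, errors="replace")


sample = b"<p>caf=C3=A9 =\nau lait</p>"
decoded = decode_qp_body(sample)
```

Here the two escaped bytes turn into a single accented character and the soft line break disappears, which is what lynx needs to see before it can render the HTML sensibly.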
I’ve been ripping a lot of stuff from vinyl to FLAC recently. Here’s
how I do it.
I have an Alesis I/O 2, which works well and seems fairly decent.
The first, and most important, step is to stop trying to use Audacity. It’s
incredibly broken and unreliable. Go get ocenaudio instead. It’s fairly
new, but it works reliably.
After monitoring your levels, record the whole thing into ocenaudio.
First trim any obviously loud clicks such as when landing the needle.
ocenaudio doesn’t seem to have a “draw sample” function yet (the only
thing I miss from Audacity), but deleting just a few samples is usually
enough.
Then select a whole track using Shift-arrows (and Control to go faster).
Press Control-K to convert it into a region, and name it if you like.
You’ll see references to using zero-crossing finders to split tracks.
This is always a bad idea - it’s simply not reliable enough, especially
with an old crackly record, isopropyl’d or not.
Zoom all the way out again, and make sure the number of tracks is right.
Then File->Export Audio From Regions, making sure that the “separate
files” checkbox is set.
Now it’s tagging time: run “kid3 yourdirwithflacs”. First import from
Discogs, presuming it has the release (it usually will), via File->Import
From Discogs. Then click ‘Tag 2’ in the Format Up part, along with the
format you need. Save all those, then use Tools->Rename Directory to
rename the containing directory. You’re done.
A little note for myself: to get low-latency monitoring, and more
importantly, record at the right rate, you need to set the
Configuration-Profile to “Digital Stereo Input” in
Update: you also need this in ~/.pulse/daemon.conf :
Another update: PA/ALSA often seems to forget the sensible default
devices, and ocenaudio starts
trying to record from the monitor devices. Solution seems to be to run
pavucontrol, start ocenaudio recording, and change the drop down box to
select io|2 Digital Stereo.
I think it's important that everyone should endeavour to maintain existing web content, even if it's
not currently relevant.
This is unbelievably stupid of PayPal. I just got this email from them:
vinyl tap records would like you to use PayPal - the safer, easier way to pay and get paid online.
To send vinyl tap records your payment and see the details of this invoice, copy and paste this link into your web browser:
So much for “never click a URL in email”. Even worse, if you log in
separately, the request is not visible anywhere. Morons.
I got some NatWest phishing spam the other day and was amused to notice
<title>NatWest - Security Information</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
<link rel="stylesheet" type="text/css" href="http://www.natwest.com/microsites/global/phishing_demo/includes/css/generic.css" media="all" />
<a href="http://www.natwest.com/"><img src="http://www.natwest.com/microsites/global/phishing_demo/images/h_logo.gif" alt="NatWest - Load home page" /></a>
Enterprising of them to actually use NatWest’s own explanation of phishing
to … phish.
To quote 123-reg customer support:
> When will you be supporting AAAA records?
There are no current plans to implement this but notifications will be
sent out if this takes place.
Late last year, I was forced to find a new host for
movementarian.org, as my previous hosting
provider (Blue Room Hosting, who were really great) were shutting down.
I went with VPS247, as they were local to
Manchester and seemed reasonable.
Unfortunately my experience has been terrible. They’ve failed to keep
the machines on the net, regularly causing ssh sessions to die. The
dmesg is full of warnings about the block drivers failing to write for
more than two minutes: evidently the SAN setup they have is totally
My VM went down for a significant amount of time and support were very
slow to respond. During the total outage, there were no status updates,
and no response on the support tickets or the forums. The penultimate
straw was when my filesystem was massively corrupted. Even though my VM
is hardly critical, I can’t be doing with unreliability like this,
especially when they’re not reachable when problems occur.
My final straw, though, was when I discovered they’d deleted all the
negative comments from the Client Comments section of their
That’s really, really, not on.
I’m now with linode and happy (so far).