xs.so`xspy_introduce_domain: pushl %ebp xs.so`xspy_introduce_domain+1: movl %esp,%ebp ... xs.so`xspy_introduce_domain+0x57: subl $0xc,%esp xs.so`xspy_introduce_domain+0x5a: leal -0xc(%ebp),%eax xs.so`xspy_introduce_domain+0x5d: pushl %eax xs.so`xspy_introduce_domain+0x5e: leal -0x8(%ebp),%eax xs.so`xspy_introduce_domain+0x61: pushl %eax xs.so`xspy_introduce_domain+0x62: leal -0x2(%ebp),%eax xs.so`xspy_introduce_domain+0x65: pushl %eax xs.so`xspy_introduce_domain+0x66: pushl $0xc4b13114 xs.so`xspy_introduce_domain+0x6b: pushl 0xc(%ebp) xs.so`xspy_introduce_domain+0x6e: call +0x43595220 ... xs.so`xspy_introduce_domain+0x11f: leave xs.so`xspy_introduce_domain+0x120: ret
Seems OK - we're pushing three pointers onto the stack (+0x5a-0x65) and two other arguments. Let's look at the sources:
static PyObject *xspy_introduce_domain(XsHandle *self, PyObject *args) { domid_t dom; unsigned long page; unsigned int port; struct xs_handle *xh = xshandle(self); bool result = 0; if (!xh) return NULL; if (!PyArg_ParseTuple(args, "ili", &dom, &page, &port)) return NULL;
Looking up the definition of PyArg_ParseTuple() gave me the clue as to the problem. The format string specifies that we're giving the addresses of an int, long, and int. Yet in the assembly, the offsets of the leal instructions indicate we're pushing addresses to two 32-bit storage slots, and one 16-bit slot. So when PyArg_ParseTuple() writes its 32-bit quantity, it's going to overwrite two more bytes than it should.
As it happens, we're at the very top of the local stack storage space (-0x2(%ebp)). So those two bytes actually end up over-writing the bottom two bytes of the old %ebp we pushed at the start of the function. Then we pop that corrupted value back into the %ebp register via the leave. This has no effect until our caller calls leave itself. We move %ebp into %esp, then attempt to pop from the top of this stack into %ebp again. As it happens, the memory pointed to by the corrupt %ebp is zeroed; thus, we end up setting %ebp to zero. Finally, our caller does a ret, which pops another zero, but this time into %eip. Naturally this isn't a happy state of affairs, and we find ourselves with a core dump as described earlier.
Presumably this bug happened in the first place because someone didn't notice that domid_t was a 16-bit quantity. What's amazing is that nobody else has been hitting this problem!