QEMU 3.0.0 can boot IBM’s AIX to a shell prompt. AIX is IBM’s version of Unix for their Power Systems line of PowerPC servers. I’ve been researching emulation, so I wrote a tutorial for running AIX on your computer.
Tutorial
Trying out AIX takes about 20 minutes. I tested these instructions on macOS 10.13.6.
Step 1: Download the AIX 7.2 Standalone Diagnostics disk from IBM. You want CD72220.ISO
.
Step 2: (optional) If you want the Diagnostics disk to give you an AIX shell prompt instead of running diagnostics, apply this patch to your ISO file:
python patch_cd72220.py CD72220.ISO ModdedCD.ISO
(this replaces usr/lpp/diagnostics/bin/Dctrl
on the ISO with a shell script that launches /bin/sh
)
Step 3: Download and install QEMU 3.0.0.
brew install qemu
Step 4: Launch QEMU with this command line:
qemu-system-ppc64 -cpu POWER9 -machine pseries -m 2G -serial mon:stdio \
-cdrom ModdedCD.iso \
-d guest_errors \
-prom-env "input-device=/vdevice/vty@71000000" \
-prom-env "output-device=/vdevice/vty@71000000" \
-prom-env "boot-command=dev / 0 0 s\" ibm,aix-diagnostics\" property boot cdrom:\ppc\chrp\bootfile.exe -s verbose"
Step 5: Wait. (It’ll get stuck on define_rspc
for a long time.)
Step 6: After nine very, very long minutes, you’ll be greeted with this notice:
******* Please define the System Console. *******
Type a 1 and press Enter to use this terminal as the
system console.
Pour definir ce terminal comme console systeme, appuyez
sur 1 puis sur Entree.
Taste 1 und anschliessend die Eingabetaste druecken, um
diese Datenstation als Systemkonsole zu verwenden.
Premere il tasto 1 ed Invio per usare questo terminal
come console.
Escriba 1 y pulse Intro para utilizar esta terminal como
consola del sistema.
Escriviu 1 1 i premeu Intro per utilitzar aquest
terminal com a consola del sistema.
Digite um 1 e pressione Enter para utilizar este terminal
como console do sistema.
Type 1 in your terminal, then press enter.
Step 7: After another two minutes, you’ll be dropped to an AIX shell prompt (or, if you didn’t patch the ISO, the IBM Diagnostics tool):
exec(/usr/lpp/diagnostics/bin/Dctrl,/usr/lpp/diagnostics/bin/Dctrl){983324,917792}
exec(/bin/sh){1638718,983324}
#
At this point, you can try some Unix commands. The Diagnostics CD doesn’t have many tools, but you can try uname -a
, mount
, or lscfg
.
(If you’re using the Diagnostics tool, you’ll be stuck on the step where it asks you to choose a terminal. Anyone knows how to get past this step?)
Step 8: once you’re done, close the QEMU window.
Alternatively, press Ctrl-A, then C to drop into the QEMU monitor, then type q
to quit.
The full boot log can be found here.
Edit (Sep 7, 2018)
Since writing this article, I’ve also found Liang Guo’s tutorial on setting up AIX, so if you have an AIX setup disc instead of the Live CD, you can install a proper AIX virtual machine.
Also, Nutanix’s AIX VMs are hosted with QEMU. That’s cool! I didn’t know there’s a production deployment of AIX on QEMU.
Introduction
It’s a Unix system! I know this!
– Jurassic Park (1993)
IBM AIX is a proprietary Unix variant developed by IBM for their Power Systems line of POWER9 servers. In the 1990s, every workstation or server manufacturer had their own version of Unix. IBM’s AIX is one of the few still being updated today.
Importantly for me, QEMU 3.0.0 can emulate AIX without any modifications or workarounds, allowing me to try an Enterprise operating system without access to expensive hardware.
In fact, AIX even detects if it’s being run in QEMU:
$ strings - bootfile.exe |grep qemu
patch_qemu
qemu,pseries
qemu,pseries
---> qemu,pseries detected <---
suggesting that IBM themselves have experimented with QEMU and AIX.
From the few minutes I spent playing with AIX, I find it fascinating that AIX is close enough to macOS (BSD) and Linux (GNU), yet subtlely different. In some places, the old Unix preference for terseness is visible:
AIX:
# halt --help
halt: illegal option -- -
usage: halt [-y][-q][-l][-n][-p]
vs Linux:
$ halt --help
halt [OPTIONS...]
Halt the system.
--help Show this help
--halt Halt the machine
-p --poweroff Switch off the machine
--reboot Reboot the machine
-f --force Force immediate halt/power-off/reboot
-w --wtmp-only Don't halt/power-off/reboot, just write wtmp record
-d --no-wtmp Don't write wtmp record
--no-wall Don't send wall message before halt/power-off/reboot
yet other parts gained richer interfaces as befit an Enterprise operating system:
AIX:
# mount
node mounted mounted over vfs date options
-------- --------------- --------------- ------ ------------ ---------------
/dev/ram0 / jfs Jan 01 00:00 rw
/dev/ramdisk0 /mnt jfs Sep 06 03:12 rw,nointegrity
/mnt /tmp namefs Sep 06 03:12 rw
macOS:
$ mount
/dev/disk1s1 on / (apfs, local, journaled)
devfs on /dev (devfs, local, nobrowse)
/dev/disk1s4 on /private/var/vm (apfs, local, noexec, journaled, noatime, nobrowse)
map -hosts on /net (autofs, nosuid, automounted, nobrowse)
map auto_home on /home (autofs, automounted, nobrowse)
It keeps me just off balance enough to be interesting.
Let’s just try it
I chose to try AIX in QEMU mostly because, unlike IBM’s other Enterprise operating systems (IBM i and z/OS), IBM provides a free Diagnostics disc for Power Systems - which contains a live version of AIX.
Others have tried running AIX without a physical IBM Power Server. One person got AIX 4.2 and 5.1 to boot on QEMU. However, modern AIX (7.1) seemed impossible to run in QEMU, according to mailing list posts from 2012, like this thread, this thread, or this thread.
Well, that was six years ago, right? Let’s see what happens if I boot the disc today.
qemu-system-ppc64 -cpu POWER9 -machine pseries -m 2G -serial mon:stdio -cdrom CD72220.iso
Trying to load: from: disk ...
E3405: No such device
Trying to load: from: /vdevice/v-scsi@71000003/disk@8200000000000000 ... Successfully loaded
AIX
Star
And it’s stuck there. Did the boot process hang? Or is it just slow?
OK, I need to get more debugging output. To do so, I need to set boot parameters.
Setting Open Firmware variables
According to the QEMU mailing list, I need to use this command to boot AIX with verbose messages:
boot cdrom:\ppc\chrp\bootfile.exe -s verbose
This guide showed how to change the default boot command, by passing environment variables to SLOF, QEMU’s bootloader:
-prom-env "boot-command=boot cdrom:\ppc\chrp\bootfile.exe -s verbose" \
The same article also mentions other SLOF environment variables; two important ones are input-device
and output-device
, which sets the default input and output devices of the firmware. By default, this is the emulated graphical screen and keyboard. Following the commit message on this patch, I switched it to the virtual serial port to make interacting with the system easier:
-prom-env "input-device=/vdevice/vty@71000000" \
-prom-env "output-device=/vdevice/vty@71000000" \
I also added the QEMU logging option -d unimp,guest_errors
to show me when AIX invokes some unimplemented behaviour in QEMU.
The new command line is:
qemu-system-ppc64 -cpu POWER9 -machine pseries -m 2G -serial mon:stdio \
-cdrom CD72220.iso \
-d unimp,guest_errors \
-prom-env "input-device=/vdevice/vty@71000000" \
-prom-env "output-device=/vdevice/vty@71000000" \
-prom-env "boot-command=boot cdrom:\ppc\chrp\bootfile.exe -s verbose" \
Why did it eject the CD?
Verbose boot and the -d unimp
option definitely helped: with Verbose boot, I can see that /sbin/init
starts executing, indicating that the kernel actually booted fine.
StarLED{814}
AIX Version 7.2
exec(/etc/init){1,0}
INIT: EXECUTING /sbin/rc.boot 1
In addition, this message popped up once every second - which reassures me that AIX hasn’t frozen yet.
Unimplemented SPAPR hcall 0x00000000000002b8
So I waited out the several minute boot process, only to be greeted with:
exec(/usr/sbin/bootinfo,-M){983360,917792}
exec(/usr/lib/boot/bin/bootinfo_chrp,-M){983360,917792}
exec(/etc/eject,/dev/cd0){983362,917792}
exec(/usr/lib/methods/showled,0xA71){983364,917792}
Why did it eject the CD? A search for LED code “0xA71” gave no answers. The -M
option for bootinfo is undocumented, so I don’t know what value would continue the boot process.
I needed to examine the CD’s boot script to find which value bootinfo_chrp -M
needs to return.
To get the init script, I first tried extracting the ram filesystem from the bootfile.exe. Unfortunately, the first bytes of the extracted filesystem were “01 02 03 04
”, which didn’t match any file systems I could find. In addition, there are no visible strings in the filesystem, suggesting that it’s compressed.
I resorted to plan B: dumping the entire RAM of the virtual machine to a giant 2GB file using QEMU’s Monitor:
(qemu) pmemsave 0 0x80000000 look2g.bin
and running strings
on this document gave me:
##########################################################################
# Check to see if the system has diagnostics support
# if not eject cd
bootinfo -M >/dev/null
if [ "$?" -ne 101 -a "$?" -ne 103 ] ; then
eject_cd
So bootinfo_chrp -M
must return 101 or 103.
The device tree
Thankfully, the CD includes a copy of bootinfo_chrp
in usr/lib/boot/bin
, and it included symbols; thus, it was simple to reverse engineer the logic.
The program checks for options with getopt
, and handles -M
using a method named is_eserver
.
The is_eserver
method can return 4 values, depending on the device tree’s root node:
- If the
ibm,model-class
property on the root node is “C5”, “D5”, or “E5”;- if the root node has the
ibm,aix-diagnostics
node, return 101. - otherwise return 100.
- if the root node has the
- Otherwise:
- if the root node has the
ibm,aix-diagnostics
node, return 103. - otherwise return 102.
- if the root node has the
The diagnostic CD checks for 101 or 103, so to return 103, all we need is to add the ibm,aix-diagnostics
node to QEMU’s device tree.
I searched “set property Open Firmware” on the Internet, and found some tutorials, like this one, this one, and this one. I also checked SLOF’s source code. From these, I found out how to add a blank node from the Open Firmware bootloader.
- set the current node to the root node:
dev /
- Call the
property
method to add a new node. The property method takes four arguments,data dlen name nlen
. This is Forth, so the arguments come before the method (since they’re pushed onto the stack.) In addition, thename
andnlen
arguments can be given using a string literal. So the call to add the property is:
0 0 s" ibm,aix-diagnostics" property
This adds a property with data and dlen being 0 (no data), and name, nlen taken from the string literal ibm,aix-diagnostics
.
Finally, check if the property is successfully added:
.properties
This prints out all the properties set on the root node.
Once I confirmed these commands added the proper node, I included them in my boot-command environmental variable on the QEMU command line:
qemu-system-ppc64 -cpu POWER9 -machine pseries -m 2G -serial mon:stdio \
-cdrom CD72220.iso \
-d unimp,guest_errors \
-prom-env "input-device=/vdevice/vty@71000000" \
-prom-env "output-device=/vdevice/vty@71000000" \
-prom-env "boot-command=dev / 0 0 s\" ibm,aix-diagnostics\" property boot cdrom:\ppc\chrp\bootfile.exe -s verbose"
With this property set, the diagnostics CD boots to the diagnostics application. But I wanted a shell prompt.
The disc patch
From the verbose boot log, I knew which executable starts the diagnostics:
exec(/usr/lpp/diagnostics/bin/Dctrl,/usr/lpp/diagnostics/bin/Dctrl){983324,917792}
This executable is loaded from the CD, so I can just replace it with a nice shell script by mounting the ISO, right?
Haha, nope. macOS doesn’t allow writing to ISO files.
Instead, I installed xorriso, which does allow modifying ISO files in place. But that’s still too mainstream for me: I wanted a simple binary patch I can apply to the ISO.
I ended up with the most convoluted method I can think of. I used xorriso to dump the location of this file inside the ISO:
$ xorriso -indev CD72220.iso -find . -exec report_lba|grep Dctrl
GNU xorriso 1.4.8 : RockRidge filesystem manipulator, libburnia project.
xorriso : NOTE : Loading ISO image tree from LBA 0
xorriso : UPDATE : 1027 nodes read in 1 seconds
Drive current: -indev 'CD72220.iso'
Media current: stdio file, overwriteable
Media status : is written , is appendable
Boot record : (system area only) , MBR CHRP cyl-align-off
Media summary: 1 session, 71328 data blocks, 139m data, 92.2g free
Volume id : 'CDROM'
Report layout: xt , Startlba , Blocks , Filesize , ISO image path
File data lba: 0 , 63111 , 498 , 1018740 , './usr/lpp/diagnostics/bin/Dctrl'
I found an ISO’s sector size is 2KB, so the file offset is at hex(2048 * 63111) = 0x7b43800
bytes. I can then write a Python script that patches out the file at that location:
import sys
patch_off = 0x7b43800
patch_data = b"#!/bin/sh\n/bin/sh\nexit\n"
with open(sys.argv[1], "rb") as infile:
indata = bytearray(infile.read())
indata[patch_off:patch_off+len(patch_data)] = patch_data
with open(sys.argv[2], "wb") as outfile:
outfile.write(indata)
With this change, the diagnostics CD boots to a shell.
The kernel debugger
AIX includes full kernel symbols and a full kernel debugger, KDB, which can be entered at boot by passing “-s trap” at the boot command line. (Or “-s debug” to enable the debugger without entering it)
Unfortunately, QEMU shows an error when entering the debugger:
************* Welcome to KDB *************
Call gimmeabreak...
invalid bits: 00010000 for opcode: 1f - 13 - 1a - 01 (7e6186a6) 00000000006771c4
invalid bits: 00010000 for opcode: 1f - 13 - 1a - 01 (7e6186a6) 00000000006771c4
That opcode decodes to slbmfev:
GEN_HANDLER2(slbmfev, "slbmfev", 0x1F, 0x13, 0x1A, 0x001F0001, PPC_SEGMENT_64B),
Indeed, the binary encoding of this instruction has one extra 1
bit, where the third argument for the instruction would be. However, this instruction only has two arguments, according to the disassemblers I found.
Expected: in 011111ddddd00000bbbbb11010100110 slbmfev $reg($d),$reg($b)
Actual: 01111110011000011000011010100110
I decided to work around this issue by commenting out the QEMU code that generates invalid instruction exceptions. This allowed me to enter the kernel debugger.
************* Welcome to KDB *************
Call gimmeabreak...
invalid bits: 00010000 for opcode: 1f - 13 - 1a - 01 (7e6186a6) 00000000006771c4
invalid bits: 00010000 for opcode: 1f - 13 - 1c - 01 (7e418726) 00000000006771c8
Static breakpoint:
.gimmeabreak+000000 tweq r8,r8 r8=0
.gimmeabreak+000004 blr <.kdb_init@AF69_59+0001F0> r3=0
KDB(0)> dc 00000000006771c4
.kdb_state_save_sslb_vsx_tm+00012C slbmfev r19,r16,1
KDB(0)> dc 00000000006771c8
.kdb_state_save_sslb_vsx_tm+000130 slbmfee r18,r16,1
First of all, this shows that both slbmfev
and slbmfee
are affected by the Invalid Bits error. Secondly, AIX’s own disassembler considers these instructions valid. Could this be an instruction set extension?
This GNU binutils patch showed me what happened. POWER9 processors added an extra argument to slbmfev
and slbmfee
to allow these Segment Lookaside Buffer instructions to operate in the presence of a Radix page table (similar to an x86 page table) instead of the traditional PowerPC-styled page table.
Yeah, I don’t know what that means either: you can read “4.10.7.2 SLB Management Instructions” in the POWER9 User’s Manual for a better explanation.
I created a kludge to allow QEMU to ignore the extra third argument. With the patch, the kernel debugger works fine. (This patch isn’t going upstream, since I expect QEMU is still working on Power9 compliance and would probably make their own proper solution)
What I learned
- QEMU’s PowerPC emulation has improved significantly in the last six years - so thank you and congratulations to the developers!
- QEMU’s PowerPC emulation is astonishingly slow since it can’t take any shortcuts like SheepShaver, Dolphin (both of which ignore the MMU), or PearPC (which just emulates one operating system). However, this accuracy pays off as it’s able to emulate more OSes than these other emulators.
- Edit (Sep 7, 2018): before trying something hard, search on Twitter to see if someone already made a tutorial ;)
- If you have a Raptor Talos II POWER9 workstation, can you please try these instructions on your machine? I’m curious how fast AIX would boot with hardware KVM acceleration.