The xv6 Operating System

Setup

If you experience any bugs, try to clean the directory with make clean.

If you'd like to exit out from QEMU's emulation of xv6, press ⌃A X

If you'd like to install the RISC-V edition on macOS follow along below:

sudo apt-get install \
    git \
    build-essential \
    gdb-multiarch \
    qemu-system-misc \
    gcc-riscv64-linux-gnu \
    binutils-riscv64-linux-gnu

The Makefile provided with xv6 has several phony targets for running the system:

make qemu
Build everything and run xv6 with QEMU, with a VGA console in a new window and the serial console in the terminal where you typed this command. Close the VGA window or press Ctrl-C or Ctrl-A X to stop.
make qemu-nox
Run xv6 without the VGA console.
make qemu-gdb
Run xv6 with GDB port open. Refer to the GDB section.
make qemu-nox-gdb
Run xv6 with GDB port open, without the VGA console.

If you get stuck in a boot-loop, you can Remove BYTE directives from kernel linker script to fix triple fault on boot


Remote Debugging xv6 under QEMU

The easiest way to debug xv6 under QEMU is to use GDB's remote debugging feature and QEMU's remote GDB debugging stub.

Remote debugging is a very important technique for kernel development in general: the basic idea is that the main debugger (GDB in this case) runs separately from the program being debugged (the xv6 kernel atop QEMU) - they could be on completely separate machines, in fact. The debugger and the target environment communicate over some simple communication medium, such as a network socket or a serial cable, and a small remote debugging stub handles the "immediate supervision" of the program being debugged in the target environment.

This way, the main debugger can be a large, full-featured program running in a convenient environment for the developer atop a stable existing operating system, even if the kernel to be debugged is running directly on the bare hardware of some other physical machine and may not be capable of running a full-featured debugger itself.

In this case, a small remote debugging stub is typically embedded into the kernel being debugged; the remote debugging stub implements a simple command language that the main debugger uses to inspect and modify the target program's memory, set breakpoints, start and stop execution, etc. Compared with the size of the main debugger, the remote debugging stub is typically miniscule, since it doesn't need to understand any details of the program being debugged such as high-level language source files, line numbers, or C types, variables, and expressions: it merely executes very low-level operations on behalf of the much smarter main debugger.

You will notice that while a window appears representing the virtual machine's display, nothing appears on that display: that is because QEMU initialized the virtual machine but stopped it before executing the first instruction, and is now waiting for an instance of GDB to connect to its remote debugging stub and supervise the virtual machine's execution. In particular, QEMU is listening for connections on a TCP network socket, at the port whose number is set according to the value of the environment variable GDBPORT, which is declared in the Makefile in this example, because of the -p 26000 in the qemu command line above.

To start the debugger and connect it to QEMU's waiting remote debugging stub, open a new, separate terminal window, change to the same xv6 directory, and type:

$ gdb kernel
GNU gdb (GDB) 6.8
Copyright (C) 2009 Free Software Foundation, Inc.
...
Reading symbols from /Users/ford/cs422/xv6/kernel...done.
+ target remote localhost:26000
The target architecture is assumed to be i8086
[f000:fff0] 0xffff0:	ljmp   $0xf000,$0xe05b
0x0000fff0 in ?? ()
(gdb)

Several things are going on here. Note that we entered 'gdb kernel' just as if we were going to debug a program named kernel directly under this GDB instance - but actually trying to execute the xv6 kernel under GDB in this way wouldn't work at all, because GDB would provide an execution environment corresponding to a user-mode Linux process (or a process on whatever operating system you are running GDB on), whereas the kernel expects to be running in privileged mode on a "raw" x86 hardware environment. But even though we're not going to run the kernel locally under GDB, we still need to have GDB load the kernel's ELF program image so that it can extract the debugging information it will need, such as the addresses of C functions and other symbols in the kernel, and the correspondence between line numbers in xv6's C source code and the memory locations in the kernel image at which the corresponding compiled assembly language code resides. That is what GDB is doing when it reports Reading symbols from ....

Important: When remote debugging, always make sure that the program image you give to GDB is exactly the same as the program image running on the debugging target: if they get out of sync for any reason (e.g., because you changed and recompiled the kernel and restarted QEMU without also restarting GDB with the new image), then symbol addresses, line numbers, and other information GDB gives you will not make any sense. Fortunately keeping the target program and the debugger in sync is not too difficult when they are both loaded from the same directory in the same host machine as they are in this case, but synchronization can be a bit more of a challenge with "true" remote debugging, where one machine runs GDB and another machine runs the target kernel loaded from a separate media such as a local hard disk or USB stick.

The GDB command 'target remote' connects to a remote debugging stub, given the waiting stub's TCP host name and port number. In our case, the xv6 directory contains a small GDB script residing in the file .gdbinit, which gets run by GDB automatically when it starts from this directory. This script automatically tries to connect to the remote debugging stub on the same machine (localhost)using the appropriate port number: hence the "+ target remote localhost:26000" line output by GDB. If something goes wrong with the xv6 Makefile's port number selection (e.g., it accidentally picks a port number already in use by some other process on the machine), or if you wish to run GDB on a different machine from QEMU (try it!), you can comment out the 'target remote' command in .gdbinit and enter the appropriate command manually once GDB starts.

Once GDB has connected successfully to QEMU's remote debugging stub, it retrieves and displays information about where the remote program has stopped:

The target architecture is assumed to be i8086
[f000:fff0] 0xffff0:    ljmp   $0xf000,$0xe05b
0x0000fff0 in ?? ()

As mentioned earlier, QEMU's remote debugging stub stops the virtual machine before it executes the first instruction: i.e., at the very first instruction a real x86 PC would start executing after a power on or reset, even before any BIOS code has started executing. For backward compatibility, PCs today still start executing after reset in exactly the same way the very first 8086 processors did: namely in 16-bit, "real mode", starting at address 0xffff0 - 16 bytes short of the end of the BIOS and the top of the 1MB of total addressable memory in the original PC architecture.

We'll leave further exploration of the boot process for later; for now just type in the GDB window:

(gdb) b exec
Breakpoint 1 at 0x100800: file exec.c, line 11.
(gdb) c

These commands set a breakpoint at the entrypoint to the exec function in the xv6 kernel, and then continue the virtual machine's execution until it hits that breakpoint. You should now see QEMU's BIOS go through its startup process, after which GDB will stop again with output like this:

The target architecture is assumed to be i386
0x100800 :	push   %ebp

Breakpoint 1, exec (path=0x20b01c "/init", argv=0x20cf14) at exec.c:11
11	{
(gdb)

At this point, the machine is running in 32-bit mode, the xv6 kernel has initialized itself, and it is just about to load and execute its first user-mode process, the /init program. You will learn more about exec and the init program later; for now, just continue execution:

(gdb) c
Continuing.
0x100800 :	push   %ebp

Breakpoint 1, exec (path=0x2056c8 "sh", argv=0x207f14) at exec.c:11
11	{
(gdb)

The second time the exec function gets called is when the /init program launches the first interactive shell, sh.

Now if you c ontinue again, you should see GDB appear to "hang": this is because xv6 is waiting for a command (you should see a '$' prompt in the virtual machine's display). It won't hit the exec function again until you enter a command, which will cause the shell to run it.

GDB has now trapped the exec system call the shell invoked to execute the requested command.

Now let's inspect the state of the kernel a bit at the point of this exec command.

Further exploration: Look through the online GDB manual to learn about more of GDB's debugging features,

Note that you don't have to settle for the "vanilla" command-line GDB;

you should be able to use one of the many more "user-friendly" variants, such as the GNU Emacs graphical interface to GDB, or the graphical DDD frontend.

But you may have to figure out for yourself how to connect with QEMU's remote debugging stub under your preferred GDB variant or front-end.