The xv6 Operating System
Setup
Install QEMU
Install the GCC compiler toolchain for x86 ELF
brew install x86_64-elf-gcc
Clone the mit-pdos/xv6-public repository (the x86 edition of xv6)
git clone https://github.com/mit-pdos/xv6-public.git
Enter the cloned xv6-public directory
Set the TOOLPREFIX variable to x86_64-elf- in
the Makefile (near line 30)
Set the QEMU variable to qemu-system-x86_64 in
the Makefile
QEMU = qemu-system-x86_64
Build xv6 and launch it under QEMU with make qemu
If the build misbehaves, try cleaning the directory with make clean.
To exit QEMU's emulation of xv6, press ⌃A X (Ctrl-A, then X).
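As an alternative to editing the Makefile, make accepts variable assignments on its command line, and those take precedence over ordinary assignments inside the file. The sketch below shows the mechanism on a tiny stand-in Makefile (hypothetical, not the real xv6 one); it assumes the xv6 Makefile does not pin these variables with an override directive.

```shell
# Stand-in Makefile (hypothetical) showing that command-line variables
# take precedence over assignments inside the Makefile.
tmp=$(mktemp -d)
printf 'TOOLPREFIX = \nQEMU = qemu-system-i386\nshow:\n\t@echo "$(TOOLPREFIX)gcc $(QEMU)"\n' > "$tmp/Makefile"

# The values given here replace the ones written in the file.
out=$(cd "$tmp" && make -s show TOOLPREFIX=x86_64-elf- QEMU=qemu-system-x86_64)
echo "$out"
rm -rf "$tmp"
```

In the xv6 directory, the equivalent invocation would be make TOOLPREFIX=x86_64-elf- QEMU=qemu-system-x86_64 qemu, which leaves the Makefile itself untouched.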
To install the tools for the RISC-V edition instead, follow the steps for your platform:
On macOS
brew tap riscv/riscv
brew install riscv-tools
path=(/usr/local/opt/riscv-gnu-toolchain/bin ${path})
brew install qemu
On Debian/Ubuntu Linux
sudo apt-get install \
git \
build-essential \
gdb-multiarch \
qemu-system-misc \
gcc-riscv64-linux-gnu \
binutils-riscv64-linux-gnu
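After installing, it can help to confirm that the tools are actually on your PATH before running make. The helper below is a minimal sketch; missing_tools is a hypothetical name, and the tool names passed to it are assumptions for a Debian/Ubuntu install (the macOS packages may install differently named binaries).

```shell
# Print every tool from the argument list that is not on PATH.
# Returns non-zero if at least one tool is missing.
missing_tools() {
    rc=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            rc=1
        fi
    done
    return $rc
}

# Assumed binary names for Debian/Ubuntu; adjust for your platform.
missing_tools git make gdb-multiarch qemu-system-riscv64 riscv64-linux-gnu-gcc \
    || echo "install the missing tools before building"
```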
The Makefile provided with xv6 has several phony targets for running the
system:
make qemu: Build everything and run xv6 with QEMU, with a VGA console in a new window
and the serial console in the terminal where you typed this command. Close
the VGA window or press Ctrl-C or Ctrl-A X to stop.
make qemu-nox: Run xv6 without the VGA console.
make qemu-gdb: Run xv6 with GDB port open. Refer to the GDB section.
make qemu-nox-gdb: Run xv6 with GDB port open, without the VGA console.
If you get stuck in a boot loop, removing the BYTE directives from the kernel
linker script may fix the triple fault on boot.
Remote Debugging xv6 under QEMU
The easiest way to debug xv6 under QEMU is to use GDB's remote debugging feature and QEMU's remote GDB debugging stub.
Remote debugging is a very important technique for kernel development in
general: the basic idea is that the main debugger (GDB in this case) runs
separately from the program being debugged (the xv6 kernel atop QEMU) - they
could be on completely separate machines, in fact. The debugger and the target
environment communicate over some simple communication medium, such as a network
socket or a serial cable, and a small remote debugging stub handles the
"immediate supervision" of the program being debugged in the target environment.
This way, the main debugger can be a large, full-featured program running in a
convenient environment for the developer atop a stable existing operating
system, even if the kernel to be debugged is running directly on the bare
hardware of some other physical machine and may not be capable of running a
full-featured debugger itself.
In this case, a small remote debugging stub is typically embedded into the
kernel being debugged; the remote debugging stub implements a simple command
language that the main debugger uses to inspect and modify the target program's
memory, set breakpoints, start and stop execution, etc. Compared with the size
of the main debugger, the remote debugging stub is typically minuscule, since it
doesn't need to understand any details of the program being debugged such as
high-level language source files, line numbers, or C types, variables, and
expressions: it merely executes very low-level operations on behalf of the much
smarter main debugger.
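To give a feel for how simple that command language is: GDB's Remote Serial Protocol, which QEMU's stub speaks, frames each command as $payload#checksum, where the checksum is the sum of the payload bytes modulo 256, written as two hex digits. The function below is an illustrative sketch of that framing only (rsp_packet is a hypothetical helper name, and the sketch ignores escaping of special characters).

```shell
# Frame a payload as a GDB Remote Serial Protocol packet: $payload#xx,
# where xx is the sum of the payload bytes modulo 256 in hex.
rsp_packet() {
    payload=$1
    sum=0
    rest=$payload
    while [ -n "$rest" ]; do
        ch=${rest%"${rest#?}"}          # first character of what is left
        rest=${rest#?}                  # drop that character
        code=$(printf '%d' "'$ch")      # numeric value of the character
        sum=$(( (sum + code) % 256 ))
    done
    printf '$%s#%02x\n' "$payload" "$sum"
}

rsp_packet g        # read registers:            $g#67
rsp_packet m0,4     # read 4 bytes at address 0: $m0,4#fd
rsp_packet c        # continue execution:        $c#63
```

Everything GDB does at a higher level, such as resolving a function name to an address, happens on the debugger side before a packet like these is sent.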
To begin, start xv6 with make qemu-gdb. You will notice that while a window
appears representing the virtual machine's display, nothing appears on that
display: that is because QEMU initialized the virtual machine but stopped it
before executing the first instruction, and is now waiting for an instance of
GDB to connect to its remote debugging stub and supervise the virtual machine's
execution. In particular, QEMU is listening for connections on a TCP network
socket, at the port given by the GDBPORT variable declared in the Makefile
(26000 in this example).
To start the debugger and connect it to QEMU's waiting remote debugging stub,
open a new, separate terminal window,
change to the same xv6 directory, and type:
$ gdb kernel
GNU gdb (GDB) 6.8
Copyright (C) 2009 Free Software Foundation, Inc.
...
Reading symbols from /Users/ford/cs422/xv6/kernel...done.
+ target remote localhost:26000
The target architecture is assumed to be i8086
[f000:fff0] 0xffff0: ljmp $0xf000,$0xe05b
0x0000fff0 in ?? ()
(gdb)
Several things are going on here. Note that we entered 'gdb kernel' just as if
we were going to debug a program named kernel directly under this GDB instance -
but actually trying to execute the xv6 kernel under GDB in this way wouldn't
work at all, because GDB would provide an execution environment corresponding to
a user-mode Linux process (or a process on whatever operating system you are
running GDB on), whereas the kernel expects to be running in privileged mode on
a "raw" x86 hardware environment. But even though we're not going to run the
kernel locally under GDB, we still need to have GDB load the kernel's ELF
program image so that it can extract the debugging information it will need,
such as the addresses of C functions and other symbols in the kernel, and the
correspondence between line numbers in xv6's C source code and the memory
locations in the kernel image at which the corresponding compiled assembly
language code resides. That is what GDB is doing when it reports "Reading symbols from ...".
Important: When remote debugging, always make sure that the program image
you give to GDB is exactly the same as the program image running on the
debugging target: if they get out of sync for any reason (e.g., because you
changed and recompiled the kernel and restarted QEMU without also restarting GDB
with the new image), then symbol addresses, line numbers, and other information
GDB gives you will not make any sense. Fortunately keeping the target program
and the debugger in sync is not too difficult when they are both loaded from the
same directory in the same host machine as they are in this case, but
synchronization can be a bit more of a challenge with "true" remote debugging,
where one machine runs GDB and another machine runs the target kernel loaded
from separate media such as a local hard disk or USB stick.
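One simple way to check for drift in the "true" remote case is a byte-for-byte comparison between the image you hand to GDB and the image copied to the target's boot media. The sketch below uses stand-in files with hypothetical names; same_image is also a hypothetical helper, not part of xv6.

```shell
# Succeeds only if the two files are byte-for-byte identical.
same_image() {
    cmp -s "$1" "$2"
}

# Demonstration on stand-in files (hypothetical names and contents).
tmp=$(mktemp -d)
printf 'kernel build 1' > "$tmp/kernel.for-gdb"
printf 'kernel build 1' > "$tmp/kernel.on-target"
same_image "$tmp/kernel.for-gdb" "$tmp/kernel.on-target" && echo "in sync"

# Simulate rebuilding the kernel without refreshing GDB's copy.
printf 'kernel build 2' > "$tmp/kernel.on-target"
same_image "$tmp/kernel.for-gdb" "$tmp/kernel.on-target" \
    || echo "out of sync: restart GDB with the new image"
rm -rf "$tmp"
```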
The GDB command 'target remote' connects to a remote debugging stub, given the
waiting stub's TCP host name and port number. In our case, the xv6 directory
contains a small GDB script residing in the file .gdbinit, which gets run by GDB
automatically when it starts from this directory. This script automatically
tries to connect to the remote debugging stub on the same machine
(localhost) using the appropriate port number: hence the "+ target remote
localhost:26000" line output by GDB. If something goes wrong with the xv6
Makefile's port number selection (e.g., it accidentally picks a port number
already in use by some other process on the machine), or if you wish to run GDB
on a different machine from QEMU (try it!), you can comment out the 'target
remote' command in .gdbinit and enter the appropriate command manually once GDB
starts.
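If you need to work out the port yourself, the sketch below assumes the Makefile derives GDBPORT from your numeric user ID as uid % 5000 + 25000. That formula is an assumption about the stock xv6 Makefile, so check the GDBPORT line in your copy; it is at least consistent with the port 26000 shown above for a user ID of 1000. The helper name gdb_port_for_uid is hypothetical.

```shell
# Assumed port formula: uid % 5000 + 25000 (verify against your Makefile).
gdb_port_for_uid() {
    echo $(( $1 % 5000 + 25000 ))
}

port=$(gdb_port_for_uid "$(id -u)")
echo "expected GDB port: $port"

# Command to enter at the (gdb) prompt if .gdbinit does not connect for you.
echo "target remote localhost:$port"
```

When GDB runs on a different machine from QEMU, replace localhost with the host name of the machine running QEMU.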
Once GDB has connected successfully to QEMU's remote debugging stub,
it retrieves and displays information
about where the remote program has stopped:
The target architecture is assumed to be i8086
[f000:fff0] 0xffff0: ljmp $0xf000,$0xe05b
0x0000fff0 in ?? ()
As mentioned earlier, QEMU's remote debugging stub stops the virtual machine
before it executes the first instruction: i.e., at the very first instruction a
real x86 PC would start executing after a power on or reset, even before any
BIOS code has started executing. For backward compatibility, PCs today still
start executing after reset in exactly the same way the very first 8086
processors did: namely in 16-bit, "real mode", starting at address 0xffff0 - 16
bytes short of the end of the BIOS and the top of the 1MB of total addressable
memory in the original PC architecture.
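The [f000:fff0] notation in GDB's output is a real-mode segment:offset pair, and the physical address is segment * 16 + offset. The arithmetic can be checked with a short sketch (real_mode_addr is a hypothetical helper name):

```shell
# Real-mode physical address: segment * 16 + offset.
real_mode_addr() {
    printf '0x%x\n' $(( ($1 << 4) + $2 ))
}

real_mode_addr 0xf000 0xfff0      # reset vector:        0xffff0
real_mode_addr 0xf000 0xe05b      # target of the ljmp:  0xfe05b

# Distance from the reset vector to the 1MB boundary, in bytes.
echo $(( 0x100000 - 0xffff0 ))    # 16
```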
We'll leave further exploration of the boot process for later; for now just type
in the GDB window:
(gdb) b exec
Breakpoint 1 at 0x100800: file exec.c, line 11.
(gdb) c
These commands set a breakpoint at the entrypoint to the exec function in the
xv6 kernel, and then continue the virtual machine's execution until it hits that
breakpoint. You should now see QEMU's BIOS go through its startup process, after
which GDB will stop again with output like this:
The target architecture is assumed to be i386
0x100800 <exec>: push %ebp
Breakpoint 1, exec (path=0x20b01c "/init", argv=0x20cf14) at exec.c:11
11 {
(gdb)
At this point, the machine is running in 32-bit mode, the xv6 kernel has
initialized itself, and it is just about to load and execute its first user-mode
process, the /init program. You will learn more about exec and the init
program later; for now, just continue execution:
(gdb) c
Continuing.
0x100800 <exec>: push %ebp
Breakpoint 1, exec (path=0x2056c8 "sh", argv=0x207f14) at exec.c:11
11 {
(gdb)
The second time the exec function gets called is when the /init program
launches the first interactive shell, sh.
Now if you continue (c) again, you should see GDB appear to "hang": this is
because xv6 is waiting for a command (you should see a '$' prompt in the virtual
machine's display). It won't hit the exec function again until you enter a
command, which will cause the shell to run it.
Enter a command at the xv6 prompt, and GDB stops at the breakpoint again: it has trapped the exec system call the shell invoked to execute the requested command.
Now let's inspect the state of the kernel a bit at the point of this exec command.
Further exploration: Look through the online GDB manual to learn about
more of GDB's debugging features.
Note that you don't have to settle for the "vanilla" command-line GDB;
you should be able to use one of the many more "user-friendly" variants,
such as the GNU Emacs graphical interface to GDB,
or the graphical DDD frontend.
But you may have to figure out for yourself
how to connect with QEMU's remote debugging stub
under your preferred GDB variant or front-end.