System-on-Chip (SoC) Design
ECE382M.20, Fall 2023
QEMU/SystemC Tutorial
Notes:
• This is a tutorial related to the class project.
• Please use the discussion board on Piazza for Q&A.
• Please check relevant web pages.
The goal of this tutorial is to:
• Give an introduction to the QEMU/SystemC virtual platform simulation environment.
This tutorial includes the following:
• A simulation model of the Zynq UltraScale+ development/prototyping platform.
• A SystemC model of a simple hardware module mapped to the Zynq's FPGA fabric.
• An application example running on the ARM that calls and interfaces with the hardware.
We use the free QEMU simulation model provided by Xilinx for their Zynq platform for virtual prototyping of our SoC. More information about Xilinx's QEMU setup is available through their wiki page and support forums.
We will be using a special setup of QEMU provided by Xilinx that
integrates with a SystemC co-simulation environment. For QEMU/SystemC
co-simulation, a separate SystemC simulator instance is launched in parallel to
the QEMU simulation, where the two simulators communicate through special
co-simulation ports that model the available bus and other interfaces between
the Zynq’s programmable ARM subsystem and the FPGA fabric. On the SystemC
side, these interfaces are represented as standard TLM2.0 bus models and
sockets that other SystemC modules representing blocks in the programmable
logic can be connected and interfaced to in order to construct a complete SoC
virtual platform simulation. More information about the QEMU/SystemC
integration from Xilinx is available in their wiki page.
The following example demonstrates a QEMU/SystemC simulation of a Zynq UltraScale+ platform that includes a simple hardware module implemented in the FPGA fabric, where the application running on the ARM accesses the external hardware through memory-mapped I/O or a Linux kernel module.
(a) Environment Setup
We will be using the QEMU simulator that comes with the PetaLinux tools provided by Xilinx. We provide a pre-built image of a minimal Linux installation to boot in QEMU running on the emulated ARM platform. To set up the simulation environment, log into the ECE-LRC servers (see instructions) and set up the PetaLinux environment:
% source /usr/local/packages/Xilinx_2022.2/petalinux/2022.2/settings.sh
% umask 022
You should work in the scratch space (/misc/scratch) on
the LRC machines. PetaLinux projects and images take up quite a bit of disk
space, and space in the LRC home directories is limited. Note, however, that
the scratch space on LRC is not backed up and is wiped at the end of every
semester:
% mkdir -p /misc/scratch/$USER
% cd /misc/scratch/$USER
Alternatively, you can work on your own machine using a
PetaLinux Docker image available here: https://hub.docker.com/r/gerstla/petalinux-systemc.
Important: to create a kernel and root filesystem that is compatible with the
setup on the board, make sure to pull and work with the 2022.2 version/tag of
the Docker image.
For simulation purposes, we will be using a default board
support package (BSP) provided by Xilinx that emulates their basic ZCU102
UltraScale+ evaluation board, a copy of which is available on the LRC machines
under /home/projects/gerstl/ece382m. If you are using
Docker, mount or copy the BSP into your Docker container as described in the
image README. If you are working on LRC, instead make a link to the BSP in your
local directory:
% ln -s /home/projects/gerstl/ece382m/xilinx-zcu102-v2022.2-10141622.bsp
In either case, create a new PetaLinux project using the BSP
and copy the pre-built Linux image into the project directory (if using Docker,
you need to either mount or copy the image files into the Docker container):
% petalinux-create -t project -n Lab3 -s xilinx-zcu102-v2022.2-10141622.bsp --template zynqMP
% cd Lab3
% cp -a /home/projects/gerstl/ece382m/images .
Next, configure the PetaLinux settings. In particular, the
default PetaLinux setup boots into Linux using a RAM disk as root filesystem,
where any modifications will not be permanent and space is limited. Instead, we
need to configure PetaLinux to boot from an SD card image. Also, PetaLinux
needs to be configured to use its own built-in compiler since the one on the
LRC machines is too old:
% petalinux-config
Select Image Packaging Configuration->Root filesystem type->EXT4 (SD/eMMC/SATA/USB)
Select Yocto Settings->Enable Buildtools Extended
At this point, you should be able to boot the pre-built image
using a provided script:
% petalinux-boot --qemu --kernel
This will launch two QEMU instances simulating the ARM subsystem and the PMU boot firmware/boot loader running on a MicroBlaze processor. The two simulators communicate via a local ./tmp directory that is created if it doesn’t already exist. If successful, you should see Linux booting on the ARM subsystem in the terminal. At the prompt, log in as 'petalinux', set a password, and you will have access to a bare-bones, minimal Linux installation (busybox) running on the emulated ARM platform. Note that to end the simulation, you should always cleanly shut down the simulated Linux (sudo halt) and then terminate the QEMU simulation using Ctrl-A X (Ctrl-C will not work).
(b) Custom Linux Image (optional, skip this step unless you want to customize the image)
If you want to modify the image that is booted in QEMU, you can create a custom kernel and root file system using Xilinx’s PetaLinux tools. The PetaLinux build process requires a lot of temporary disk space that is not available on all LRC machines. By default, PetaLinux will create a temporary directory under /tmp, which will fill up fast and crash the machine. In order to build a PetaLinux project for custom image creation, you need to log into and use only the yoshi machine. On yoshi, there is a large partition of local disk space mounted under /homework that must be used for temporary PetaLinux files. Also, PetaLinux projects for image building must be created in the scratch space (/misc/scratch) on yoshi. As described under (a), as an alternative to working on yoshi, you can work on your own machine using the PetaLinux Docker image linked above (https://hub.docker.com/r/gerstla/petalinux-systemc).
If you are working on yoshi, set up the environment and either create a new PetaLinux project as described in (a) or change into your existing PetaLinux project directory:
yoshi% cd /misc/scratch/$USER/Lab3
Then, importantly, make sure to configure PetaLinux to use a
temporary directory under /homework:
yoshi% petalinux-config
Select Yocto Settings->TMPDIR Location->/homework/<unique directory name>
Also make sure that PetaLinux
is configured to use an SD card image for the root filesystem (see (a)).
To create a custom kernel, run the kernel configuration and compile the kernel:
yoshi% petalinux-config -c kernel
Select Kernel hacking->Generic Kernel Debugging Instruments->KGDB: kernel debugger
Select or modify any other kernel options as you see fit.
yoshi% petalinux-build -c kernel
You can also customize the Linux root filesystem. If you do, make sure to include the necessary libraries (libstdc++) needed to run YOLO/Darknet:
yoshi% petalinux-config -c rootfs
Select Filesystem Packages -> misc -> gcc-runtime -> libstdc++
Select Filesystem Packages -> misc -> gdb -> gdbserver
Select User Packages -> peekpoke
Select other optional packages as desired…
yoshi% petalinux-build -c rootfs
Finally, finish building the remaining collateral:
yoshi% petalinux-build
Then package everything into a bootable disk image, making sure not to compile the device tree blob (DTB) into the boot loader but instead load it from disk so we can more easily update it later:
yoshi% petalinux-package --force --boot --format BIN --pmufw --fsbl --u-boot --dtb no
yoshi% petalinux-package --wic -e system.dtb
You can then boot the QEMU simulation as described in part (a).
(c) SystemC Co-Simulation
As described by Xilinx for their QEMU/SystemC co-simulation setup, to make the external SystemC interfaces accessible in the QEMU simulation, we need to launch the QEMU simulation with extra co-simulation arguments:
% petalinux-boot --qemu --kernel --qemu-args "-hw-dtb images/linux/zynqmp-qemu-multiarch-arm.cosim.dtb -machine-path ./tmp -sync-quantum 1000000"
The co-simulation accepts optional -icount and -sync-quantum parameters that are passed to QEMU to allow you to play with the tradeoff between simulation speed and accuracy (see the QEMU/SystemC co-simulation documentation).
QEMU will now start but immediately hang waiting for a connection from the SystemC simulator. To set up the SystemC simulation, we need to include a special co-simulation library provided by Xilinx. For this example and tutorial, we will be using a small SystemC demo platform provided by Xilinx that is built on top of this library. If you are using the provided Docker image, a copy of that demo is already included. If you are working on LRC, open a new terminal, log into the same LRC machine, and download and compile the Xilinx SystemC co-simulation demo:
% module load systemc/2.3.4
% git clone https://github.com/Xilinx/systemctlm-cosim-demo.git
% cd systemctlm-cosim-demo
% git submodule update --init libsystemctlm-soc
% vi Makefile
Change SYSTEMC ?= /usr/local/systemc-2.3.2/
to SYSTEMC ?= /usr/local/packages/systemc-2.3.4/
and comment out the following two lines (since they require a newer compiler than available):
#SC_OBJS += $(LIBSOC_PATH)/soc/dma/xilinx/mcdma/mcdma.o
#SC_OBJS += $(LIBSOC_PATH)/soc/net/ethernet/xilinx/mrmac/mrmac.o
The SystemC demo provided by Xilinx has VCD tracing enabled by default. VCD trace files can get very large, fill up disk space, and are not very useful for debugging at the SystemC and virtual platform level. To disable tracing, comment out the following two lines in zynqmp_demo.cc:
// trace_fp = sc_create_vcd_trace_file("trace");
// trace(trace_fp, *top, top->name());
and compile the demo:
% make zynqmp_demo
Then, launch the SystemC side of the co-simulation:
% ./zynqmp_demo unix:../tmp/qemu-rport-_amba@0_cosim@0 1000000
The SystemC simulator will start and in turn wait to establish the connection to the QEMU simulator. Once the connection is made, both simulators will proceed, and the QEMU side should continue to boot Linux as normal (but slower). Note that the last argument is the sync quantum, which should match the value passed to QEMU.
This SystemC demo example includes a simple debug device (see debugdev.h/.cc) attached to the main system bus (see the top-level zynqmp_demo.cc SystemC module, which stitches everything together). The debug device is accessible at the following bus addresses:
0xa0000000 – Read SystemC time (in seconds) / Write debug message and measure time
0xa0000004 – Write output ASCII character to terminal
0xa0000008 – Write to stop the simulation via an exit(1) call
0xa000000c – Read/Write status of interrupt line (high/low, i.e. 1/0)
0xa0000010 – Read SystemC clock() count
In addition, the platform demo includes 64kB of local scratchpad memory that sits next to the debug device at the following address:
0xa0800000-0xa080FFFF – Address range of scratchpad memory
You can use the peek and poke commands from within the busybox Linux shell to test reading/writing from/to these addresses.
(d) Application software
We demonstrate application development and HW/SW interfacing using a simple software application running on the virtual platform that interfaces with the debug hardware device modeled in SystemC. To compile and link applications for the board, we need to use the ARM cross-compiler tool chain that matches the one in the virtual co-simulation platform:
% module load xilinx/2022
% source /usr/local/packages/Xilinx_2022.2/Vivado/2022.2/settings64.sh
Alternatively, you can also use the cross-compilers that are included with PetaLinux (source /usr/local/packages/Xilinx_2022.2/petalinux/2022.2/settings.sh on LRC, or use the 2022.2 PetaLinux Docker image from https://hub.docker.com/r/gerstla/petalinux-systemc).
Then download the application example (from https://github.com/gerstl/QEMU_SystemC_app) and cross-compile the executable (example) for the ARM using the provided Makefile:
% git clone https://github.com/gerstl/QEMU_SystemC_app
% cd QEMU_SystemC_app/application
% make
Next, we need to copy the cross-compiled application into the simulated platform. The virtual platform simulator includes emulated network access, so the easiest way to do this is by copying the executable into the simulated platform over the network. From within the simulation execute (you can use either scp or wget):
# scp <user>@<server>.ece.utexas.edu:<path>/application/example .
Finally, run the application example from there:
# sudo ./example [<val>]
The example (see example.c) demonstrates use of memory-mapped I/O under Linux to access the debug device registers and either read the SystemC time or write to the debug port, similar to the peek and poke commands above. See the included README.application for more details.
(e) Interrupt-driven application using a Linux kernel module
The application package also includes a version of the example code that uses a kernel module (fpga_drv.c) to implement an interrupt-based device driver for all accesses to the FPGA hardware. For the kernel module to work, we first need to tell the Linux kernel about the existence of the external hardware device. The application example provides a modified kernel device tree in the ./boot directory that includes an additional ‘ece382m,fpga’ device for this example. You can optionally inspect and potentially modify the device tree and then compile it into a device tree blob (DTB) for the kernel:
% cd ../boot
% dtc -I dtb -O dts -o system.dts system.dtb
% vi system.dts (search for ‘ece382m,fpga’)
% dtc -I dts -O dtb -o system.dtb system.dts
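For orientation while inspecting system.dts, a device node with the ‘ece382m,fpga’ compatible string might look roughly like the sketch below. This is a hypothetical illustration only: the node name, reg size, and interrupt specifier are assumptions, and the authoritative version is the one in the provided system.dtb. (On Zynq UltraScale+, a GIC interrupt number of 121 as reported in /proc/interrupts corresponds to SPI 89, i.e. 121 - 32, in the device tree.)

```dts
/* Hypothetical sketch -- inspect the decompiled system.dts for the real node. */
fpga@a0000000 {
    compatible = "ece382m,fpga";
    reg = <0x0 0xa0000000 0x0 0x10000>;  /* debug device register window */
    interrupt-parent = <&gic>;
    interrupts = <0 89 4>;               /* SPI 89 = GIC irq 121, level high */
};
```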
If you compiled your own kernel in step (b), you need to generate a matching device tree through PetaLinux instead. Go into your PetaLinux project directory and overwrite the device tree build configuration in the PetaLinux project tree with the updated one provided in the application repo:
yoshi% cd Lab3
yoshi% cp -a <app_path>/QEMU_SystemC_app/project-spec .
Alternatively, copy just the ‘project-spec/meta-user/recipes-bsp/device-tree/files/zynq-fpga.dtsi’ file included with the application package into your project directory, and manually update the PetaLinux recipes to include the additional ‘fpga’ device:
yoshi% vi project-spec/meta-user/recipes-bsp/device-tree/files/system-user.dtsi
Add /include/ "zynq-fpga.dtsi"
yoshi% vi project-spec/meta-user/recipes-bsp/device-tree/device-tree.bbappend
Append SRC_URI:append = " <…> file://zynqmp-fpga.dtsi"
In either case, configure PetaLinux to not include its own IP components in the programmable logic (PL) and then generate the device tree:
yoshi% petalinux-config
Select DTG Settings->Remove PL from devicetree
yoshi% petalinux-build -c device-tree
Finally, copy the provided or generated device tree blob into the simulated platform and then reboot the simulation:
# scp <user>@<server>.ece.utexas.edu:<path>/boot/system.dtb .
# sudo cp system.dtb /boot
# sudo reboot
Then cross-compile the executable (example) and the kernel module (fpga_drv.ko) of the interrupt-driven application for the ARM:
% cd ../application.irq
% make
Copy both files into the virtual platform using scp or wget as shown above. From within the simulation, load the kernel module, check that the device driver is properly installed and registered for GIC interrupt 121, and look at the output reported by the driver under its /proc/fpga entry:
# sudo insmod fpga_drv.ko
# lsmod
# cat /proc/interrupts
# cat /proc/fpga
Finally, run the application example:
# sudo ./example [<val>]
In addition to the output from before, you should see messages from the 'fpga_drv' about I/O accesses, including handling of incoming interrupts for synchronization with the hardware.
In the following, we describe various options that are available for debugging both the SystemC side of the virtual platform as well as the application running on the simulated ARM.
(a) SystemC debugging
By default, the SystemC/C++ model of the virtual platform is already compiled with debug information (-g option) enabled. However, you may want to also disable compiler optimizations (remove the -O2 switch in the Makefile and recompile). The ‘zynqmp_demo’ executable can then be run in your favorite debugger, such as GDB:
% gdb zynqmp_demo
Or, if you prefer a graphical debugging environment (using the DDD graphical GDB frontend):
% ddd zynqmp_demo
Finally, inside the debugger, launch the ‘zynqmp_demo’ program with the correct command line arguments:
(gdb) run unix:../tmp/qemu-rport-_amba@0_cosim@0 1000000
From there on, you can debug the virtual platform’s SystemC/C++ executable using standard means, e.g., Ctrl-C will interrupt the running program and bring you back to the (gdb) prompt.
(b) Application debugging
The Linux installation running on our platform includes a GDB server,
which can be used to attach an external GDB instance and remotely debug an
application running on top of it. In order to do that, first make sure that
your application is compiled with debug information (-g switch enabled). In
case of the given application example using the provided Makefiles:
% cd application[.irq]
% make debug
Then, (re)start the virtual platform and forward the debug port in the platform to any available port on the local simulation host:
% cd ..
% petalinux-boot --qemu --kernel --qemu-args "<…> -netdev user,id=eth0,tftp=/tftpboot,hostfwd=tcp::<port>-:1234"
Copy your application into the platform and start it there under the control of the GDB server:
# gdbserver HOST:1234 example
Finally, open another shell on the host and launch an ARM cross-debugger instance, pointing it to the same application binary (with compiled-in debug information):
% aarch64-linux-gnu-gdb application[.irq]/example
Inside GDB, connect to the (forwarded) remote server and run the application under control of the cross-debugger:
(gdb) target remote localhost:<port>
(gdb) b main
(gdb) c
(c) Kernel debugging
If you want to venture into debugging the Linux kernel, including any applications it launches, QEMU supports the capability to attach a remote debugger to the simulator itself:
% petalinux-boot --qemu --kernel --qemu-args "<…> -gdb tcp::<port> -S"
Note that with the -S option, QEMU will not boot any code until a remote debugger is attached. If you want to be able to use source information, make sure whatever code you want to debug is compiled with -g and point the cross-debugger to it, e.g.:
% aarch64-linux-gnu-gdb application[.irq]/example
Finally, connect the cross-debugger to QEMU and start execution:
(gdb) target remote localhost:<port>
(gdb) b main
(gdb) c
You can then use the virtual platform normally, while being able to control the emulated code from the remote debugger (e.g. Ctrl-C will stop the simulated ARM and bring it back under GDB control). Setting a breakpoint as shown above will trigger when the application you pointed the debugger to is launched inside the platform (# ./example) and it hits the given source line.