System-on-Chip (SoC) Design
ECE382M.20, Fall 2023
Lab #2
Due: 11:59pm, October 13, 2023
Instructions:
• This lab is a team exercise.
• Please use the discussion board on ED for Q&A.
• Submit the report on Canvas and code on Github Classroom.
The goals of this lab are to:
• Use Xilinx's Vitis high-level synthesis (HLS) tool to synthesize the GEMM accelerator and generate Verilog or VHDL code at the register transfer level (RTL).
• Validate the generated RTL code and compare the results with the reference C model.
• Explore various architectural alternatives.
Please refer to the following materials for a tutorial by Xilinx that you can follow at your own pace:
• Vitis High-Level Synthesis User Guide (UG1399)
Starting from your isolated GEMM function in Lab #1, we will now create a standalone GEMM function that can be fed into Vitis HLS for synthesis:
a) Take the single, isolated GEMM function that you created in Lab #1 and modify the code, if necessary, such that Vitis HLS can synthesize this top-level function into an RTL description. If you didn't do so already in Lab #1, modify its function prototype such that it now receives int-type inputs and generates int-type outputs, i.e., the fixed-point conversion does not happen inside the GEMM function. Make sure the GEMM is a single, standalone C/C++ function that is side-effect free, i.e., any and all required inputs and outputs are passed as function parameters or as the return value.
b) Develop a C/C++ testbench to test your standalone, fixed-point GEMM function.
c) Compile and run the code to check if the GEMM is functioning correctly.
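As an illustration of steps a) through c), the following is a minimal sketch of a standalone integer GEMM with a self-checking testbench routine. The matrix size DIM, the fixed-point format FRAC_BITS, the helper to_fixed, and the names gemm and run_testbench are assumptions made for this example only; your Lab #1 code will have its own dimensions, data layout, and scaling. Note that the fixed-point conversion happens in the testbench, not inside the GEMM.

```cpp
#include <cmath>
#include <cstdio>

// Hypothetical size and fixed-point format, chosen only for illustration.
constexpr int DIM = 8;        // square DIM x DIM matrices, row-major
constexpr int FRAC_BITS = 8;  // number of fractional bits

// Float-to-fixed conversion lives OUTSIDE the GEMM (caller / testbench side).
static int to_fixed(float x) {
    return static_cast<int>(std::lround(x * (1 << FRAC_BITS)));
}

// Standalone, side-effect-free GEMM: C = A * B on int data.
// All inputs and outputs are passed as parameters; no globals, no I/O.
void gemm(const int A[DIM * DIM], const int B[DIM * DIM], int C[DIM * DIM]) {
    for (int i = 0; i < DIM; ++i) {
        for (int j = 0; j < DIM; ++j) {
            int acc = 0;
            for (int k = 0; k < DIM; ++k) {
                acc += A[i * DIM + k] * B[k * DIM + j];
            }
            C[i * DIM + j] = acc;  // product keeps 2*FRAC_BITS fractional bits
        }
    }
}

// Testbench body: returns 0 on success and 1 on any mismatch.
// In an actual testbench file this logic would sit in main().
int run_testbench() {
    static int A[DIM * DIM], B[DIM * DIM], C[DIM * DIM];

    // Deterministic inputs, converted to fixed point before calling the GEMM.
    for (int n = 0; n < DIM * DIM; ++n) {
        A[n] = to_fixed(0.25f * (n % 7) - 0.5f);
        B[n] = to_fixed(0.125f * (n % 5) + 0.25f);
    }

    gemm(A, B, C);

    // Independent reference computed with 64-bit accumulation.
    int mismatches = 0;
    for (int i = 0; i < DIM; ++i) {
        for (int j = 0; j < DIM; ++j) {
            long long ref = 0;
            for (int k = 0; k < DIM; ++k) {
                ref += static_cast<long long>(A[i * DIM + k]) * B[k * DIM + j];
            }
            if (C[i * DIM + j] != ref) {
                ++mismatches;
            }
        }
    }

    std::printf("GEMM test: %d mismatches\n", mismatches);
    return mismatches == 0 ? 0 : 1;
}
```

Returning a nonzero value on failure is a deliberate choice in this sketch: a testbench that reports pass/fail through its return code can be checked automatically instead of by reading its printed output.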
Deliverables:
• A directory named part3 in the lab-2 repository in Github Classroom that includes the following:
– A README file including how to compile, run and verify your design.
– A Makefile or script to compile and/or run your code, if necessary.
– All required C/C++ code (source file and testbench file).
– Any golden input/output test files. Please keep the total size of these files to less than 50MB.
• A writeup in your lab report (submitted on Canvas) documenting your isolated GEMM and testbench.
• Note: the TA should be able to run your main program and compare the results using the testbench you provide.
We will now synthesize the standalone GEMM functionality down to a cycle-by-cycle RTL description:
a) On an ECE-LRC machine, set up the Xilinx environment and launch Vitis HLS:
% module load xilinx/2022
% vitis_hls
b) Create a new Vitis HLS project for your design with the following settings:
• Project name: hls_gemm (or whatever you want)
• Location: wherever you want, but it is recommended that you create your projects in your local scratch space under /misc/scratch/<your_username>/
• Top Function: the name of your top-level GEMM function
• Design Files: add the design files that are to be synthesized (.c or .cpp files only, no header files).
• TestBench Files: add your testbench files and input data files for testing. These files are not synthesized.
• Solution Name: solution1 (or whatever you want). In a Vitis HLS project, you can create multiple solutions, each with different synthesis options, and compare them within the Vitis HLS environment.
• Clock Period: There is no requirement for this lab. A good starting point is 10 ns, but the target clock should eventually be one of the parameters driving optimizations.
• Part Selection:
– Select: Parts -> Browse and select 'xczu3eg-sbva484-1-e'
c) In this lab, we will synthesize the core computational GEMM kernel of our accelerator, which will operate out of local accelerator SRAM and will later be combined (connected) with SRAMs and external bus interfaces to integrate it with the rest of the system. By default, Vitis HLS will synthesize any of your C function's parameters that are specified as scalars or fixed-size arrays (e.g. int A[1024]) into ports of ap_none (register) and ap_memory (SRAM) interface type, respectively. However, if your GEMM C function uses arguments of pointer type (int *A) or arrays of undefined size (e.g. int A[]), you will need to provide directives (either through pragmas in the code or via the Vitis HLS GUI) to synthesize them into the correct ap_memory port interfaces in the generated RTL. In the process, make sure to set the depth parameter equal to the size of the corresponding array, or the co-simulation will fail, e.g.:
#pragma HLS INTERFACE mode=ap_memory port=A depth=1024
d) Start from a default architectural constraint, i.e., don't specify any other architectural constraints (= synthesis directives) at this point. You will be exploring different architectural alternatives in the next part of this lab.
e) Click Project -> Run C simulation. Check the simulation log. A successful simulation will have a “*****CSIM finish*******” message at the end.
f) Select the top-level function in Project -> Project Settings -> Synthesis and run C synthesis. Discuss the results of the synthesis report that is automatically shown in Vitis HLS after synthesis.
g) Validate your RTL code by running C/RTL co-simulation. Check the digital waveform and confirm the correct GEMM operation.
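To illustrate the interface directives discussed in step c), here is a minimal sketch of a GEMM whose arguments are plain pointers, with one INTERFACE pragma per argument placed inside the function body. The function name gemm_ptr and the size N = 32 (so that each array holds 1024 elements, matching depth=1024) are assumptions for this example. A regular C++ compiler ignores the HLS pragmas, so the same file can still be compiled and run as software.

```cpp
// Hypothetical matrix size: N * N = 1024 elements per array.
constexpr int N = 32;

// GEMM with pointer-type arguments. Without directives, the tool cannot infer
// the array sizes, so each port is mapped explicitly to an ap_memory interface
// and given a depth equal to the number of elements behind the pointer.
void gemm_ptr(const int *A, const int *B, int *C) {
#pragma HLS INTERFACE mode=ap_memory port=A depth=1024
#pragma HLS INTERFACE mode=ap_memory port=B depth=1024
#pragma HLS INTERFACE mode=ap_memory port=C depth=1024
    for (int i = 0; i < N; ++i) {
        for (int j = 0; j < N; ++j) {
            int acc = 0;
            for (int k = 0; k < N; ++k) {
                acc += A[i * N + k] * B[k * N + j];
            }
            C[i * N + j] = acc;
        }
    }
}
```

If the arguments are instead declared as fixed-size arrays (e.g. int A[1024]), the default interface synthesis described in step c) applies and these pragmas should not be needed.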
Deliverables:
• A directory named part4 in the lab-2 repository in Github Classroom containing the following:
– A README file including information on how to compile and run your files in Vitis.
– All required C or C++ code (source code and testbench files, as well as any header files).
• A write-up in the report, briefly explaining how your RTL GEMM works and how you validated the RTL design.
Freely explore at least 3 different architectural alternatives using various features offered by Vitis HLS (e.g. unrolling, pipelining, memory optimization, etc.) to come up with an area-performance optimal design. Discuss your approaches to different solutions and compare them in terms of various design metrics, i.e. area, latency, throughput, and operating clock frequency.
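As a starting point for this exploration, the sketch below shows one possible alternative that combines partial unrolling, pipelining, and array partitioning through pragmas in the code. The function name gemm_opt, the size N = 16, and the factor of 4 are assumptions chosen for illustration, not recommended values. Whether the requested initiation interval is actually achieved depends on the tool, the target part, and the clock period, so check the synthesis report rather than assuming it.

```cpp
// Hypothetical matrix size; kept a multiple of the unroll factor below.
constexpr int N = 16;

// One candidate architecture: the inner dot-product loop is unrolled by 4 and
// pipelined, and the arrays are split into 4 cyclic banks along the k
// dimension so that the 4 parallel multiplies can read their operands in the
// same cycle.
void gemm_opt(int A[N][N], int B[N][N], int C[N][N]) {
#pragma HLS ARRAY_PARTITION variable=A type=cyclic factor=4 dim=2
#pragma HLS ARRAY_PARTITION variable=B type=cyclic factor=4 dim=1
    for (int i = 0; i < N; ++i) {
        for (int j = 0; j < N; ++j) {
            int acc = 0;
            for (int k = 0; k < N; ++k) {
#pragma HLS PIPELINE II=1
#pragma HLS UNROLL factor=4
                acc += A[i][k] * B[k][j];
            }
            C[i][j] = acc;
        }
    }
}
```

Other solutions can vary the unroll and partition factors, move the PIPELINE directive to an outer loop, or leave the code untouched and apply the equivalent directives per solution through the GUI. Each choice trades area (more multipliers and memory ports) against latency and throughput, which is what the comparison in your report should quantify.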
Deliverables:
• A directory named part5 in the lab-2 repository in Github Classroom. For each of your designs, create a subdirectory under part5 named design_<design_number> that includes the following files:
– A README file including how to run and what to compare.
– All required C or C++ code of the design.
– The directives.tcl files to synthesize/verify your designs. The directives.tcl file you are required to submit is under the Solution_# directory. The TA should be able to synthesize and verify all your designs. Place each .tcl file in its corresponding subdirectory.
– Generated Verilog code of the design.
• A write-up in your lab report (submitted on Canvas):
– Explain the approaches you have used for each solution.
– Comparison of synthesis results.
– Discussion of the results.
Submit the following deliverables:
• A write-up in PDF format (Parts 3, 4, and 5) via Canvas.
• Parts 3, 4, and 5 in Github Classroom, as described previously, including:
– Source code, scripts, generated Verilog and README files.