System-on-Chip (SoC) Design
ECE382M.20, Fall 2021
Lab #3
Due: 11:59pm, November 1, 2021
Instructions:
• This lab is a team exercise.
• Please use the discussion board on Piazza for Q&A.
• All reports and code MUST be submitted to the digital assignment on Canvas.
The goals of this lab are to:
• Use Xilinx's Vivado high-level synthesis (HLS) tool to synthesize the GEMM accelerator and generate Verilog or VHDL code at the register transfer level (RTL).
• Validate the generated RTL code and compare the results with the reference C model.
• Explore various architectural alternatives.
Please refer to the following materials for a tutorial by Xilinx that you can follow at your own pace:
• Xilinx's Vivado HLS Tutorial (ug871), Chapters 1 through 8
• Vivado HLS user's guide (ug902)
Starting from your isolated GEMM SystemC module in Lab #2, we will now create a standalone GEMM function that can be fed into Vivado HLS for synthesis:
a) Take the single, isolated GEMM function or method that you created in Lab #2, split it out into a standalone, global C/C++ function (if you have not done so already), and modify the code, if necessary, such that Vivado HLS can synthesize this top-level function into an RTL description. Note that Vivado HLS can synthesize both C/C++ functions and SystemC modules. However, in either case, the computation function to be synthesized by the HLS tool has to be isolated from any interfacing, memory, or other surrounding logic that will later be implemented manually or assembled from library components. It is usually simpler to isolate the GEMM in the form of a plain C/C++ function rather than a separate, standalone SystemC module.
b) Develop a C/C++ (or SystemC) testbench to test your standalone GEMM function.
c) Compile and run the code to check that the GEMM is functioning correctly.
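As a rough illustration of steps a) through c), the sketch below shows what a standalone GEMM function and a self-checking test might look like. The names `gemm`, `run_gemm_test`, `N`, and `data_t`, as well as the 8x8 integer matrices, are assumptions made for this example; your Lab #2 design will have its own dimensions, data types, and interface.

```cpp
// Hypothetical matrix size and element type; use the ones from your Lab #2 design.
const int N = 8;
typedef int data_t;

// Candidate top-level function for synthesis: fixed-size arrays only, with no
// dynamic memory, file I/O, or SystemC ports. Computes C = A * B.
void gemm(const data_t A[N][N], const data_t B[N][N], data_t C[N][N]) {
    for (int i = 0; i < N; i++) {
        for (int j = 0; j < N; j++) {
            data_t acc = 0;
            for (int k = 0; k < N; k++) {
                acc += A[i][k] * B[k][j];
            }
            C[i][j] = acc;
        }
    }
}

// Self-checking test: with B = 2 * identity, the expected result is C = 2 * A.
// Returns the number of mismatching elements (0 means the test passed).
int run_gemm_test() {
    static data_t A[N][N], B[N][N], C[N][N];
    for (int i = 0; i < N; i++) {
        for (int j = 0; j < N; j++) {
            A[i][j] = i * N + j;          // deterministic input pattern
            B[i][j] = (i == j) ? 2 : 0;   // scaled identity matrix
            C[i][j] = 0;
        }
    }

    gemm(A, B, C);

    int mismatches = 0;
    for (int i = 0; i < N; i++) {
        for (int j = 0; j < N; j++) {
            if (C[i][j] != 2 * A[i][j]) {
                mismatches++;
            }
        }
    }
    return mismatches;
}
```

A testbench `main` could simply `return run_gemm_test();`. Vivado HLS treats a nonzero return value from the testbench as a failure, so the same test can later be reused for C simulation and C/RTL co-simulation. A scaled identity matrix is only a minimal check; comparing against golden output files exercises the function more thoroughly.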
Deliverables:
• A README file explaining how to compile, run, and verify your design
• A Makefile or script to compile and/or run your code
• All required C/C++ (or SystemC) code
• Any golden input/output test files
Note: the TA should be able to run your main program and compare the results using the testbench you provide.
We will now synthesize the standalone GEMM functionality down to a cycle-by-cycle RTL description:
a) On an ECE-LRC machine, set up the Xilinx environment and launch Vivado HLS:
% module load xilinx/2018.3
% vivado_hls
b) Create a new Vivado HLS project for your design with the following settings:
• Project name: hls_gemm (or whatever you want)
• Location: wherever you want
• Top Function: the name of your top-level GEMM function
• Design Files: add the design files that are to be synthesized (.c or .cpp files only, no header files).
• TestBench Files: add your testbench files and input data files for testing. These files are not synthesized.
• Solution Name: solution1 (or whatever you want). In a Vivado HLS project, you can create multiple solutions, each with different synthesis options, and compare them within the Vivado HLS environment.
• Clock Period: There is no requirement for this lab. A good starting point is 10 ns, but the target clock should eventually be one of the parameters driving optimizations.
• Part Selection:
   – RTL tool: Auto
   – Specify: Parts -> Select 'xczu3eg-sbva484-1-e'
c) Start from the default architecture, i.e. do not specify any architectural constraints (synthesis directives) at this point. You will explore different architectural alternatives in the next part of this lab.
d) Click Project -> Run C simulation. Check the simulation log. A successful simulation will have a “*****CSIM finish*******” message at the end.
e) Run C synthesis. Discuss the results of the synthesis report that is automatically shown in Vivado HLS after synthesis.
f) Validate your RTL code by running C/RTL co-simulation. Check the digital waveform and confirm correct GEMM operation.
Deliverables:
• A write-up briefly explaining how your RTL GEMM works and how you validated the RTL design.
Freely explore at least 3 different architectural alternatives using various features offered by Vivado HLS (e.g. unrolling, pipelining, memory optimization, etc.) to come up with an optimal design. Discuss your approaches to different solutions and compare them in terms of various design metrics, i.e. area, latency, throughput, and operating clock frequency.
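As one possible starting point for such an exploration, the sketch below applies pipelining and array partitioning directives as source-level pragmas. `PIPELINE` and `ARRAY_PARTITION` are Vivado HLS directives; the function name `gemm_opt`, the 8x8 integer matrices, and the choice of which loop to pipeline and which dimensions to partition are assumptions for illustration only, not a recommended or optimal design.

```cpp
// Hypothetical matrix size and element type; use the ones from your own design.
const int N = 8;
typedef int data_t;

// Same computation as a plain triple-loop GEMM, C = A * B, with directives added.
// A standard C++ compiler ignores the HLS pragmas, so the function still runs
// unchanged in a software testbench.
void gemm_opt(const data_t A[N][N], const data_t B[N][N], data_t C[N][N]) {
    // Split A along its columns and B along its rows so that all N operand
    // pairs of one dot product can be read in the same clock cycle.
#pragma HLS ARRAY_PARTITION variable=A complete dim=2
#pragma HLS ARRAY_PARTITION variable=B complete dim=1

    for (int i = 0; i < N; i++) {
        for (int j = 0; j < N; j++) {
            // Request a new (i, j) iteration every cycle. Pipelining this loop
            // causes the tool to fully unroll the inner k loop.
#pragma HLS PIPELINE II=1
            data_t acc = 0;
            for (int k = 0; k < N; k++) {
                acc += A[i][k] * B[k][j];
            }
            C[i][j] = acc;
        }
    }
}
```

This variant trades area (N parallel multipliers and partitioned memories) for lower latency, which is the kind of trade-off the comparison should quantify. The same directives can instead be kept out of the source and placed in each solution's directives.tcl file, which makes it easier to compare several solutions built from one code base.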
Deliverables:
• A README file explaining how to run your designs and what to compare
• All required C or C++ code
• The directives.tcl files to synthesize/verify your design
You must submit N .tcl scripts, one for each of the N solutions you have come up with. The directives.tcl file you are required to submit is located under the Solution_# directory. The TA should be able to synthesize and verify all your designs.
• Generated RTL code for each of the design alternatives
• A write-up in your lab report:
   – An explanation of the approach you used for each solution
   – A comparison of the synthesis results
   – A discussion of the results
Submit the following deliverables via Canvas:
• A write-up in PDF format (Parts 3, 4, and 5)
• An archive in compressed file format (Parts 3 and 5)
   – Source code, scripts, and README files