Electrical and Computer Engineering
University of Texas at Austin
EE382M-8: VLSI-2
Course Goals
This course is intended to provide the student with two basic capabilities:
1) doing the early design planning for an embedded SOC using a high-level RTL
model, and 2) doing the circuit feasibility analysis of the critical components
of the SOC. Current VLSI issues such as noise analysis, power delivery, power
management, timing analysis, clocking, floor-planning/integration, and
transistor/wire scaling will be covered. Circuit designers from IBM, Centaur,
Intrinsity, Intel, Texas Instruments, and Freescale will co-teach the course.
The material presented will be as close to the state of the art as possible.
There will be four major homework assignments and a class project.
Course outline and Lecture Notes
| Lecture Topic | Lecture Notes |
| Introduction | |
| Early Design Planning (EDP): Front End | |
| Early Design Planning: | |
| Basic Timing Analysis | |
| Transistor and Process Technology | |
| Flip-Flop Design | |
| Memory Array Design for EDP | |
| Static & Statistical Timing Analysis | |
| Array Circuit Design | |
| DSM Interconnect | |
| Dynamic Circuit Design | |
| Noise Analysis | |
| Power Delivery | |
| Deep Pipelined Design | |
| DFT, DFD, DFX | |
| Arch Design for Low Power | |
| Circuit Design for Low Power | |
| Asynchronous Design | |
| I/O, ESD | |
Chandrakasan, Bowhill, Fox, Design of High-Performance Microprocessor Circuits, IEEE Press, 2000.
Bernstein, et al., High Speed CMOS Design Styles, Kluwer Academic.
Harris, Skew Tolerant Circuit Design, Morgan Kaufmann Publishers.
Weste & Eshraghian, Principles of CMOS VLSI Design: A Systems Perspective (second edition), Addison Wesley.
V. G. Oklobdzija, The Computer Engineering Handbook, CRC Press, Boca Raton, Florida, 2002.
The project is to do a top-down design of an ultra-low-power processor core. The SPARC-T1 core from Sun will be used as the processor core for this project. We will design it in a sub-threshold 65nm technology.
The project activities will include:
Creating a detailed floorplan of the cluster-level components.
Creating a detailed top-level floorplan using the cluster abstracts.
Determining the critical timing paths and setting the component constraints at the top level and the component level. If a critical path exceeds the timing budget, the logic will have to be redesigned. Timing will be negotiated among all clusters and the top-level integration team. NOTE: We will NOT re-pipeline the SPARC-T1 Core.
Doing a detailed power estimation and determining the power grid requirements.
Determining the clocking requirements and designing the clock distribution and regeneration components.
Determining the standard cell and custom library elements needed to complete the design with APR tools.
The clusters of the SPARC-T1 core are:
Instruction Fetch Unit
Execution Unit
Load/Store Unit
Trap Logic Unit
Memory Management Unit
Floating Point Front-End Unit
Stream Processing Unit
The specification for the SPARC-T1 Core is located in: OpenSPARCT1_DataSheet.
The architecture overview of the SPARC-T1 Core is located at: SPARC-T1 Architecture Overview
Technology specifications
| Feature size | 65nm |
| Supply Voltage | 0.50 V |
| Transistor models | |
| Temperature | 25 degrees C |

| Transistor densities 65nm | |
| Standard cells | 6.6 transistors/µm² |
| Datapath | 7.8 transistors/µm² |
| ROM | 14 transistors/µm² |
| RAM | 10.2 transistors/µm² |

| Layer assignment | Used for: |
| M1, M2 | Local routing |
| M3(H), M4(V) | Global signal routing |
| M5(H), M6(V) | Global signal routing |
| M7(H), M8(V) | Power, Ground, Clock & Reset |
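As a rough illustration of how the density figures above could feed an early area estimate, here is a minimal Python sketch. The transistor counts in the example are hypothetical and are not taken from the SPARC-T1 design. The sketch also assumes the listed densities can be applied directly; whether they already account for wiring and white space is not stated here.

```python
# Transistor densities from the technology table, in transistors per square micron.
DENSITY_PER_UM2 = {
    "standard_cells": 6.6,
    "datapath": 7.8,
    "rom": 14.0,
    "ram": 10.2,
}


def block_area_um2(transistor_counts):
    """Estimate area in um^2 from a dict of {design style: transistor count}."""
    return sum(
        count / DENSITY_PER_UM2[style]
        for style, count in transistor_counts.items()
    )


# Hypothetical transistor counts for one cluster (illustration only).
example_cluster = {"standard_cells": 66_000, "datapath": 78_000, "ram": 102_000}

area_um2 = block_area_um2(example_cluster)
area_mm2 = area_um2 / 1e6  # 1 mm^2 = 1e6 um^2
print(f"Estimated area: {area_um2:.0f} um^2 = {area_mm2:.3f} mm^2")
```

Summing the per-cluster estimates gives a first number to compare against the chip-level area target before any detailed floorplanning.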
Analysis and Estimation Tools
Power and Area Estimation Spreadsheet
Interconnect Resistance and Capacitance Calculator
Design Metrics
There are four areas that we will be optimizing in this design: cycle time, power, area, and TTFG (time to final grade). Naturally, TTFG is the top priority. The remaining targets, in order of importance, are:
| Energy | 0.02 mW/MHz |
| Frequency | 500 kHz |
| Area | 0.5 mm² |
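To make these targets concrete, the short Python sketch below converts them into a power budget, an energy per cycle, a cycle time, and an average supply current. It assumes the energy target is meant as average total power per MHz of clock frequency, and it uses the supply voltage from the technology specifications; both readings are assumptions, not statements from the course material.

```python
ENERGY_MW_PER_MHZ = 0.02  # energy target from the design metrics
FREQ_KHZ = 500.0          # frequency target from the design metrics
SUPPLY_V = 0.50           # supply voltage from the technology specifications

freq_mhz = FREQ_KHZ / 1000.0

# Average power at the target frequency, converted from mW to uW.
power_uw = ENERGY_MW_PER_MHZ * freq_mhz * 1000.0

# mW/MHz is energy per cycle: (1e-3 W) / (1e6 cycles/s), converted from J to pJ.
energy_per_cycle_pj = ENERGY_MW_PER_MHZ * 1e-3 / 1e6 * 1e12

# Cycle time in microseconds is the reciprocal of the frequency in MHz.
cycle_time_us = 1.0 / freq_mhz

# Average supply current implied by the power budget (uW / V = uA).
avg_current_ua = power_uw / SUPPLY_V

print(f"Power budget:     {power_uw:.1f} uW")
print(f"Energy per cycle: {energy_per_cycle_pj:.1f} pJ")
print(f"Cycle time:       {cycle_time_us:.1f} us")
print(f"Average current:  {avg_current_ua:.1f} uA")
```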
Detailed activities:
1. Each cluster team will determine the library elements needed to implement its cluster by reviewing the Verilog code for that cluster. It may be necessary to create some custom cells for the datapath blocks. You will need to approximate the timing for each element of the library. Validation of the timing will be based on HSPICE characterization. A spreadsheet of the elements will be posted on the web once the characterization of the new cells is complete.
2. Each team will analyze the Verilog code and draw a preliminary block diagram of the design at the lowest level that makes sense. Determine:
   a. The size of the blocks.
   b. The approximate RC that each critical signal will have.
   c. The initial approximation of the clock requirements.
   d. The initial approximation of the power requirements.
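One common way to approximate the RC of a critical signal is the Elmore delay of a driver, a distributed wire, and a lumped load. The sketch below is a first-order estimate under that assumption. The function name and every numeric value in the example (driver resistance, per-micron wire resistance and capacitance, load) are hypothetical placeholders, not values from the course's interconnect calculator.

```python
def elmore_delay_s(r_drv_ohm, r_per_um, c_per_um_f, length_um, c_load_f):
    """Elmore delay (seconds) of a driver, a distributed RC wire, and a load.

    The driver resistance sees all of the wire and load capacitance.
    The distributed wire resistance sees half of its own capacitance
    plus the full load capacitance.
    """
    r_wire = r_per_um * length_um
    c_wire = c_per_um_f * length_um
    return r_drv_ohm * (c_wire + c_load_f) + r_wire * (c_wire / 2.0 + c_load_f)


# Hypothetical example: a 1000 um wire driven by a weak sub-threshold driver.
delay = elmore_delay_s(
    r_drv_ohm=1e5,       # assumed driver resistance
    r_per_um=0.5,        # assumed wire resistance per micron
    c_per_um_f=0.2e-15,  # assumed wire capacitance per micron
    length_um=1000.0,
    c_load_f=10e-15,     # assumed receiver input capacitance
)
print(f"Elmore delay: {delay * 1e9:.3f} ns")
```

With a high-resistance driver, as in this example, the driver term dominates the wire term, so the total capacitance on the net matters more than the wire resistance.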
3. Each team will analyze the Verilog code and then build a spreadsheet that shows the critical timing paths through its cluster. The team will work with the integration team to optimize the timing constraints. The team will also be responsible for redesigning the logic or re-pipelining the design to meet the timing requirements. Determine:
   a. Timing diagrams for each block (think of pipeline stages).
   b. The number of gate delays per stage and the fanout of the gates, to get the delays (hence, timing).
   c. Approximate routing delays (in terms of gate delays, i.e. a standardized model).
   d. Hand-synthesized "critical paths" (worked out through logistics), with their arrival and departure times.
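The kind of critical-path bookkeeping described in this step can be sketched in a few lines of Python. The segment names, gate-delay counts, the 20 ns standardized gate delay, and the skew and jitter allowances are all hypothetical. Only the 2000 ns cycle time follows from the 500 kHz frequency target.

```python
from dataclasses import dataclass


@dataclass
class PathSegment:
    name: str
    gate_delays: float     # logic depth, in standardized gate delays
    routing_delays: float  # wire delay, expressed in the same gate-delay unit


def path_slack(segments, cycle_ns, gate_delay_ns, skew_ns, jitter_ns):
    """Return the slack of a path in gate delays (negative means it fails)."""
    used = sum(s.gate_delays + s.routing_delays for s in segments)
    budget = (cycle_ns - skew_ns - jitter_ns) / gate_delay_ns
    return budget - used


# Hypothetical critical path through one pipeline stage.
critical_path = [
    PathSegment("decode", gate_delays=20, routing_delays=2),
    PathSegment("alu", gate_delays=45, routing_delays=5),
    PathSegment("bypass", gate_delays=10, routing_delays=8),
]

slack = path_slack(
    critical_path,
    cycle_ns=2000.0,     # 1 / 500 kHz
    gate_delay_ns=20.0,  # assumed standardized gate delay
    skew_ns=60.0,        # assumed skew allowance
    jitter_ns=40.0,      # assumed jitter allowance
)
print(f"Slack: {slack:.1f} gate delays")
```

A negative slack marks a path whose logic has to be redesigned or whose constraints have to be renegotiated with the integration team.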
4. The integration team will be responsible for:
   a. Doing a floor plan of the top level of the chip.
   b. Characterizing the top-level routing delays and determining the assertions and constraints for each cluster. The integration team will work with each cluster to optimize the constraints.
   c. Designing the clock routing structure:
      1. Spine vs. H-tree vs. serpentine, etc.
      2. Determine how many global and local clock buffers are needed.
      3. Determine the skew allowance.
   d. Determining the clock generation implementation (block diagrams):
      1. Determine the jitter allowance for timing.
   e. Determining the clock regeneration circuitry (block diagrams):
      1. Determine the global and local clock buffer design.
   f. Determining the reset logic:
      1. Limit power spikes: do you need a staggered reset?
   g. Designing the power grid:
      1. What is the peak power, etc.?
   h. Determining the power estimation for the global clock and signal routing.
   i. Determining the activity factor for each cluster.
   j. Determining the leakage power guidelines.
   k. Generating the power budget for each cluster.
   l. Generating the area budget for each cluster.
5. Each team will do a power estimation of its cluster. The team will work with the integration team to optimize the design to meet the power budget.
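A first-pass cluster power estimate is often built from the standard dynamic power relation, P = activity × C × V² × f, plus a leakage term. The sketch below assumes that model. The activity factor, switched capacitance, and leakage current are hypothetical inputs. The supply voltage and frequency come from the technology specifications and design metrics.

```python
SUPPLY_V = 0.50    # supply voltage from the technology specifications
FREQ_HZ = 500e3    # frequency target from the design metrics


def dynamic_power_w(activity, switched_cap_f, vdd=SUPPLY_V, freq_hz=FREQ_HZ):
    """Dynamic switching power: activity * C * Vdd^2 * f."""
    return activity * switched_cap_f * vdd ** 2 * freq_hz


def leakage_power_w(leak_current_a, vdd=SUPPLY_V):
    """Static power drawn by leakage current at the supply voltage."""
    return leak_current_a * vdd


# Hypothetical cluster: 100 pF of total capacitance, 10% activity, 2 uA leakage.
p_dyn = dynamic_power_w(activity=0.1, switched_cap_f=100e-12)
p_leak = leakage_power_w(leak_current_a=2e-6)
p_total = p_dyn + p_leak

print(f"Dynamic: {p_dyn * 1e6:.2f} uW")
print(f"Leakage: {p_leak * 1e6:.2f} uW")
print(f"Total:   {p_total * 1e6:.2f} uW")
```

At a low supply voltage and a low clock frequency, leakage can be a large share of the total, as it is in this made-up example, so both terms should be tracked against the cluster budget.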
6. Each team will do a detailed floorplan of its blocks. The output will be a spreadsheet analysis showing the contribution from each of the following:
   a. Power grid
   b. Clocking
   c. Signal routing
   d. Datapath area
   e. Random logic area
   f. White space
7. Each team will write a detailed design report that includes all of the results from the above activities. The integration team will be responsible for coalescing the individual reports into a final document.
8. Each team will be required to do a detailed design review of its blocks. Each member of the team will have to present their portion of the design at the design review.