#### 14. Memories

- Last module:
  - Synthesis and Verilog
- This module
  - Memory arrays
  - SRAMs
  - Serial Memories
  - Dynamic memories

D. Z. Pan 14. Memories 1



# **Array Architecture**

- 2<sup>n</sup> words of 2<sup>m</sup> bits each
- If n >> m, fold by 2<sup>k</sup> into fewer rows of more columns
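The folding arithmetic can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical helper `fold_array`; the example numbers reuse the 2 kword x 16 memory from the Column Multiplexing slide.

```python
def fold_array(n, m, k):
    """Return (rows, columns) for 2**n words of 2**m bits folded by 2**k.

    Hypothetical helper for illustration only.
    """
    if not 0 <= k <= n:
        raise ValueError("k must satisfy 0 <= k <= n")
    rows = 2 ** (n - k)   # each physical row now holds 2**k words
    cols = 2 ** (m + k)   # 2**k words of 2**m bits sit side by side
    return rows, cols


# 2 kword x 16-bit memory (n = 11, m = 4) folded by 2**3 = 8
print(fold_array(11, 4, 3))  # (256, 128)
```

Without folding (k = 0) the same memory would be 2048 rows by 16 columns, a very tall and narrow array.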



- Good regularity: easy to design
- Very high density if good cells are used


# 12T SRAM Cell

- Basic building block: SRAM Cell
  - Holds one bit of information, like a latch
  - Must be read and written
- 12-transistor (12T) SRAM cell
  - Use a simple latch connected to bitline

  - $46 \times 75\,\lambda$ unit cell






#### **6T SRAM Cell**

- Cell size accounts for most of array size
  - Reduce cell size at expense of complexity
- 6T SRAM Cell
  - Used in most commercial chips
  - Data stored in cross-coupled inverters
- Read:
  - Precharge bit, bit\_b
  - Raise wordline
- Write:
  - Drive data onto bit, bit\_b
  - Raise wordline




#### **SRAM Read**

- Goal: improve performance when bitline capacitance is high
- Precharge both bitlines high
- Then turn on wordline
- One of the two bitlines will be pulled down by the cell
- Ex: A = 0, A\_b = 1
  - bit discharges, bit\_b stays high
  - But A bumps up slightly
- Read stability
  - A must not flip
  - N1 >> N2





# **Large Decoders**

- For n > 4, NAND gates become slow
  - Break large gates into multiple smaller gates
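One way to picture the decomposition: a 6-input word-line AND can be built from two NAND3 gates feeding a NOR2 instead of one wide NAND6 plus inverter. The sketch below checks exhaustively that both forms compute the same function. The gate helpers and the 6-input size are assumptions chosen for illustration, not a circuit taken from the slides.

```python
from itertools import product


def nand(*xs):
    return int(not all(xs))


def nor(*xs):
    return int(not any(xs))


def inv(x):
    return int(not x)


def and6_wide(a, b, c, d, e, f):
    # One slow 6-input NAND followed by an inverter
    return inv(nand(a, b, c, d, e, f))


def and6_split(a, b, c, d, e, f):
    # Two NAND3 gates combined by a NOR2: same logic, smaller gates
    return nor(nand(a, b, c), nand(d, e, f))


equivalent = all(
    and6_wide(*v) == and6_split(*v) for v in product((0, 1), repeat=6)
)
print(equivalent)  # True
```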





# Column Circuitry

- Some circuitry is required for each column
  - Bitline conditioning
  - Sense amplifiers
  - Column multiplexing


# **Bitline Conditioning**

- Precharge bitlines high before reads
- Equalize bitlines to minimize voltage difference when using sense amplifiers




# **Sense Amplifiers**

- Bitlines have many cells attached
  - Ex: 32-kbit SRAM has 128 rows x 256 cols
  - 128 cells on each bitline
- t<sub>pd</sub> ∝ (C/I) ∆V
  - Even with shared diffusion contacts, 64C of diffusion capacitance (big C)
  - Discharged slowly through small transistors (small I)
- Sense amplifiers are triggered on small voltage swing (reduce ∆V)
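A rough numeric sketch of the delay relation follows, treating the proportionality as t = (C/I)·∆V. All component values (1 fF per unit of diffusion capacitance, 50 µA cell current, 1.0 V supply, 100 mV sense swing) are assumptions chosen only to show the scaling, not figures from the slides.

```python
def bitline_delay(c_farads, i_amps, delta_v):
    """Estimate t = (C / I) * dV in seconds, a first-order model only."""
    return c_farads / i_amps * delta_v


C_BITLINE = 64 * 1e-15  # assumed: 64 units of diffusion cap at 1 fF each
I_CELL = 50e-6          # assumed cell read current (small I)
VDD = 1.0               # assumed supply: full-swing dV
SENSE_SWING = 0.1       # assumed swing needed by the sense amplifier

full_swing = bitline_delay(C_BITLINE, I_CELL, VDD)
sensed = bitline_delay(C_BITLINE, I_CELL, SENSE_SWING)
print(f"full swing: {full_swing * 1e12:.0f} ps")
print(f"with sense amp: {sensed * 1e12:.0f} ps")
```

Under this model, cutting the required swing by 10x cuts the bitline delay by the same factor, because C and I are unchanged.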


# **Differential Pair Amp**

- Differential pair requires no clock
- But always dissipates static power




# **Clocked Sense Amp**

- Clocked sense amp saves power
- Requires sense\_clk after enough bitline swing
- Isolation transistors cut off large bitline capacitance





# Column Multiplexing

- Recall that array may be folded for good aspect ratio
- Ex: 2 kword x 16 folded into 256 rows x 128 columns
  - Must select 16 output bits from the 128 columns
  - Requires 16 8:1 column multiplexers
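The selection can be sketched behaviorally. This assumes a hypothetical layout in which the 8 columns feeding each output bit are adjacent, so output bit b reads column 8·b + col_addr; a real array may interleave columns differently.

```python
def column_mux(columns, word_bits, col_addr):
    """Pick word_bits outputs from a row of column values.

    Assumes the columns for each output bit are adjacent (illustrative only).
    """
    ways = len(columns) // word_bits  # mux ratio, e.g. 128 // 16 = 8
    if not 0 <= col_addr < ways:
        raise ValueError("column address out of range")
    return [columns[b * ways + col_addr] for b in range(word_bits)]


row = list(range(128))        # stand-in values: column i holds i
word = column_mux(row, 16, 3)
print(len(word), word[:4])    # 16 [3, 11, 19, 27]
```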


#### Tree Decoder Mux

- Column mux can use pass transistors
  - Use nMOS only, precharge outputs
- One design is to use k series transistors for 2<sup>k</sup>:1 mux
  - No external decoder logic needed
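A behavioral sketch of the tree structure: each address bit controls one level of pass transistors and halves the number of candidate columns, so k levels select one of 2<sup>k</sup> inputs without a separate decoder. The function name and the LSB-first bit ordering are assumptions for illustration.

```python
def tree_mux(inputs, addr_bits):
    """Select one input using one tree level per address bit (LSB first)."""
    if len(inputs) != 2 ** len(addr_bits):
        raise ValueError("need 2**k inputs for k address bits")
    level = list(inputs)
    for bit in addr_bits:
        # Within each pair, the address bit (or its complement) turns on
        # one pass transistor, keeping the even or the odd member.
        level = [level[i + bit] for i in range(0, len(level), 2)]
    return level[0]


# 8:1 mux, address 5 = binary 101, given LSB first
print(tree_mux(list("abcdefgh"), [1, 0, 1]))  # f
```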



# Single Pass-Gate Mux

- Or eliminate series transistors with separate decoder






# **Multiple Ports**

- We have considered single-ported SRAM
  - One read or one write on each cycle
- Multiported SRAMs are needed for register files
- Examples:
  - Multicycle MIPS must read two sources or write a result on some cycles
  - Pipelined MIPS must read two sources and write a third result each cycle
  - Superscalar MIPS must read and write many sources and results each cycle


#### **Dual-Ported SRAM**

- Simple dual-ported SRAM
  - Two independent single-ended reads
  - Or one differential write



- Do two reads and one write by time multiplexing
  - Read during ph1, write during ph2


#### Multi-Ported SRAM

- Adding more access transistors hurts read stability
- Multi-ported SRAM isolates reads from state node
- Single-ended design minimizes number of bitlines



#### **Serial Access Memories**

- Serial access memories do not use an address
  - Shift Registers
  - Tapped Delay Lines
  - Serial In Parallel Out (SIPO)
  - Parallel In Serial Out (PISO)
  - Queues (FIFO, LIFO)


# **Shift Register**

- Shift registers store and delay data
- Simple design: cascade of registers
  - Watch your hold times!




# **Denser Shift Registers**

- Flip-flops aren't very area-efficient
- For large shift registers, keep data in SRAM instead
- Move the read/write pointers through the RAM rather than moving the data
  - Initialize read address to first entry, write to last
  - Increment address on each cycle
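The pointer scheme can be modeled in software as follows. This is a behavioral sketch with a hypothetical class name, assuming the read happens before the write within a cycle and that the RAM starts out holding zeros.

```python
class RamShiftRegister:
    """Shift register kept in a RAM: the pointers move, the data stays put."""

    def __init__(self, depth):
        if depth < 2:
            raise ValueError("depth must be at least 2")
        self.ram = [0] * depth
        self.rd = 0              # read address starts at the first entry
        self.wr = depth - 1      # write address starts at the last entry

    def clock(self, din):
        dout = self.ram[self.rd]         # read the oldest entry
        self.ram[self.wr] = din          # store the new value
        self.rd = (self.rd + 1) % len(self.ram)  # increment both addresses
        self.wr = (self.wr + 1) % len(self.ram)
        return dout


sr = RamShiftRegister(4)
outputs = [sr.clock(x) for x in range(1, 9)]
print(outputs)  # [0, 0, 0, 1, 2, 3, 4, 5]
```

In this model a value written in one cycle is read depth - 1 cycles later, since the read pointer trails the write pointer by that many entries.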




# **Tapped Delay Line**

- A tapped delay line is a shift register with a programmable number of stages
- Set number of stages with delay controls to mux
  - Ex: 0 to 63 stages of delay
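One common way to get 0 to 63 stages is a chain of shift-register blocks of 32, 16, 8, 4, 2, and 1 stages, each followed by a mux that either takes the block output or bypasses it. The sketch below models that arrangement; the block sizes and class name are assumptions, since the figure is not reproduced here.

```python
from collections import deque


class TappedDelayLine:
    """Chain of bypassable shift-register blocks (power-of-two sizes)."""

    def __init__(self, block_sizes=(32, 16, 8, 4, 2, 1)):
        self.sizes = block_sizes
        # Each block holds its last `size` inputs, initialized to 0.
        self.blocks = [deque([0] * n, maxlen=n) for n in block_sizes]

    def step(self, din, delay):
        if not 0 <= delay <= sum(self.sizes):
            raise ValueError("delay out of range")
        x = din
        for size, block in zip(self.sizes, self.blocks):
            delayed = block[0]   # value that entered this block `size` steps ago
            block.append(x)      # every block always shifts in its input
            if delay & size:     # delay control bit selects the delayed path
                x = delayed
        return x


line = TappedDelayLine()
out = [line.step(v, delay=5) for v in range(1, 11)]
print(out)  # [0, 0, 0, 0, 0, 1, 2, 3, 4, 5]
```

The result above holds only while the delay setting is constant; changing it mid-stream mixes samples from different paths.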



# Serial In Parallel Out

- 1-bit shift register reads in serial data
  - After N steps, presents N-bit parallel output

(Figure: shift register with inputs clk and Sin and parallel outputs P0 to P3.)
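A behavioral sketch of a 4-bit SIPO register follows. It assumes Sin enters the flop that drives P0, so P0 holds the most recent bit and P3 the oldest; the class name is hypothetical.

```python
class SIPO:
    """Serial-in parallel-out shift register (behavioral model)."""

    def __init__(self, n=4):
        self.bits = [0] * n      # bits[i] models output Pi

    def clock(self, sin):
        # On each clock, Sin moves into P0 and every bit moves one stage on.
        self.bits = [sin] + self.bits[:-1]
        return list(self.bits)


sipo = SIPO(4)
for bit in (1, 0, 1, 1):         # serial stream, first bit sent first
    parallel = sipo.clock(bit)
print(parallel)  # [1, 1, 0, 1]  (P0..P3; P3 is the first bit received)
```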

#### Parallel In Serial Out

- Load all N bits in parallel when shift = 0
  - Then shift one bit out per cycle
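A matching behavioral sketch for PISO is below. It assumes Sout is taken from the last stage (P3) and that zeros are shifted into the first stage; both are illustrative choices, and the class name is hypothetical.

```python
class PISO:
    """Parallel-in serial-out shift register (behavioral model)."""

    def __init__(self, n=4):
        self.bits = [0] * n

    def clock(self, shift, parallel_in=None):
        if shift == 0:
            self.bits = list(parallel_in)        # load all N bits at once
        else:
            self.bits = [0] + self.bits[:-1]     # shift one stage toward Sout
        return self.bits[-1]                     # Sout after this clock


piso = PISO(4)
stream = [piso.clock(0, [1, 0, 1, 1])]           # load P0..P3
stream += [piso.clock(1) for _ in range(3)]      # then shift three times
print(stream)  # [1, 1, 0, 1]  (P3 first, P0 last)
```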



#### Queues

- Queues allow data to be read and written at different rates.
- Read, Write each use their own clock, data
- Queue indicates whether it is full or empty
- Build with SRAM and read/write counters (pointers)



# FIFO, LIFO Queues

- First In First Out (FIFO)
  - Initialize read and write pointers to first element
  - Queue is EMPTY
  - On write, increment write pointer
  - If write almost catches read, Queue is FULL
  - On read, increment read pointer
- Last In First Out (LIFO)
  - Also called a stack
  - Use a single stack pointer for read and write
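The FIFO pointer rules can be sketched as a circular buffer. This minimal model assumes "write almost catches read" means the write pointer is one slot behind the read pointer, so one entry is always left unused; the class name and error handling are illustrative.

```python
class Fifo:
    """FIFO built from a RAM plus read and write pointers."""

    def __init__(self, depth):
        self.ram = [None] * depth
        self.rd = 0              # both pointers start at the first element
        self.wr = 0

    @property
    def empty(self):
        return self.rd == self.wr

    @property
    def full(self):
        # Write pointer has almost caught the read pointer
        return (self.wr + 1) % len(self.ram) == self.rd

    def write(self, value):
        if self.full:
            raise OverflowError("queue is FULL")
        self.ram[self.wr] = value
        self.wr = (self.wr + 1) % len(self.ram)

    def read(self):
        if self.empty:
            raise IndexError("queue is EMPTY")
        value = self.ram[self.rd]
        self.rd = (self.rd + 1) % len(self.ram)
        return value


q = Fifo(4)
for item in ("a", "b", "c"):
    q.write(item)
print(q.full, q.read(), q.read(), q.empty)  # True a b False
```

With this convention a depth-4 RAM holds at most three entries, which keeps the EMPTY and FULL conditions distinguishable using the pointers alone.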


# 4-Transistor Dynamic RAM Cell

- Remove the two p-channel transistors from the static RAM cell to get a four-transistor dynamic RAM cell
- Data stored as charge on gate capacitors (complementary nodes)
- Data must be refreshed regularly
- Dynamic cells must be designed very carefully





