Department of Electrical and Computer Engineering
The University of Texas at Austin
EE 360N, Fall 2004
Problem Set 4
Due: 18 October 2004, before class
Yale N. Patt, Instructor
Aater Suleman, Huzefa Sanjeliwala, Dam Sunwoo, TAs
You are encouraged to work on the problem set in groups and turn in
one problem set for the entire group. Remember to put all your names on
the solution sheet. Also remember to put the name of the TA in whose discussion
section you would like the problem set returned to you.
The virtual address of variable x is x3456789A. Using the VAX's
virtual memory architecture, find the physical address of x.
Remember that in VAX each Vitual Adress consists of
2 bits to specify the Address Space
21 bits to specify Virtual Page Number
9 bits to spcify the byte on the page
You will need to know the contents of P0BR: x8AC40000 and SBR: x000C8000.
You will also need to know the contents of the following physical memory locations:
Some intermediate questions to help you:
- What virtual page of P0 Space is x on?
- What is VA of the PTE of the page containing x?
- What virtual page of System Space is this PTE on?
- What is the PA of the PTE of this page of System Space?
- What is the PA of the PTE of the page containing x?
(Hamacher, pg.255, question 5.13) A byte-addressable
computer has a small data cache capable of holding eight 32-bit words.
Each cache block consists of one 32-bit word. When a given program is executed,
the processor reads data from the following sequence of hex addresses:
200, 204, 208, 20C, 2F4, 2F0, 200, 204, 218, 21C, 24C, 2F4
This pattern is repeated four times.
a. Show the contents of the cache at the end of each pass throughout
this loop if a direct-mapped cache is used. Compute the hit rate for this
example. Assume that the cache is initially empty.
b. Repeat part (a) for a fully-associative cache that uses the LRU-replacement
c. Repeat part (a) for a four-way set-associative cache that uses
the LRU replacement algorithm.
A computer has an 8KB write-through cache. Each cache block is 64
bits, the cache is 4-way set associative and uses a victim/next-victim
pair of bits in each block for its replacement policy. Assume a
24-bit address space and byte-addresable memory. How big (in bits) is
the tag store?
- An LC-3b system ships with a two-way set associative, write back cache
with perfect LRU replacement. The tag store requires a total of 4352
bits of storage. What is the block size of the cache?
Please show all your work.
Hint: 4352 = 2^12 + 2^8
(Hamacher et al., p. 255, question 5.18) You are working with a
computer that has a primary and secondary cache. The cache block
consists of 8 words. Make the following assumptions:
- The hit rate is the same for both the caches: 0.95 for instructions and 0.90 for data.
- The times needed to access an 8-word block in these caches are C1 = 1 cycle and C2 = 10 cycles.
- It takes one clock cycle to send an address to main memory.
- It takes one clock cycle to send one word from the memory to the cache.
- 30 percent of all instructions are loads and stores.
- It takes the time needed to access the cache to know if an access
was a hit or miss. Also, start the memory access once you know that you miss in
the two caches.
- The accesses to the caches and memory are done serially. So if
you miss in the L1, and hit in the L2, it takes 11 cycles.
- When you receive data from memory, you can access the data as the
writes to the caches are done in parallel.
1. What is the average access time experienced by the CPU if the
main memory uses 4 way interleaving? Assume that the memory is built
with DRAM chips that allow the first word to be accessed in 8 cycles,
but subsequent words of the block are in 4 clock cycles per
word. Also, one clock cycle is required to send one word to the cache.
2. What is the average access time if the main memory is not
3. What is the improvement obtained with interleaving?
Below, we have given you four different sequences of addresses
generated by a program running on a processor with a data cache. Cache
hit ratio for each sequence is also shown below. Assuming that the
cache is initially empty at the beginning of each sequence, find out
the following parameters of the processor's data cache:
* Associativity (1, 2, or 4 ways)
Assumptions:All memory accesses are one byte accessses. All addresses are byte addresses.
* Block size (1, 2, 4, 8, 16, or 32 bytes)
* Total cache size (256B, or 512B)
* Replacement policy (LRU or FIFO)
sequence 1: 0, 2, 4, 8, 16, 32 hit ratio: 0.33
sequence 2: 0, 512, 1024, 1536, 2048, 1536, 1024, 512, 0 hit ratio: 0.33
sequence 3: 0, 64, 128, 256, 512, 256, 128, 64, 0 hit ratio: 0.33
sequence 4: 0, 512, 1024, 0, 1536, 0, 2048, 512 hit ratio: 0.25
AND, for those who cannot get enough of this stuff, the following problem:
(Note: You can get full credit without doing this problem)
You will be given a cache simulator (just the executable) with a
hard-coded configuration. Your job is to determine the configuration
of the cache. The simulator takes a trace of memory addresses as input
and provides a hit ratio as output. Find the following:
* Associativity (1, 2, 4, or 8 ways)
Show the traces you used to determine each parameter of the cache.
Assumptions:All memory accesses are one byte accessses. All addresses
are byte addresses.
* Block size (1, 2, 4, 8, 16, or 32 bytes)
* Total cache size (256B, 512B, or 1024B)
* Replacement policy (LRU or Pseudo-LRU)
The syntax for running
the program is:
The traces are just text files with one integer memory address per
line. For example, the following trace would cause conflict misses in
a direct-mapped, 256B cache:
Simulator for Linux
Simulator for Solaris
After downloading the file, please do "chmod 700 cachesim.linux" or "chmod 700 cachesim.solaris".