Department of Electrical and Computer Engineering
The University of Texas at Austin
EE 360N, Spring 2001
Yale N. Patt, Instructor
Kameswar Subramaniam, Onur Mutlu, TAs
Problem Set 2, due February 26, 5:30 pm.

1. State 20 in the state machine for the LC-2 microarchitecture does not change the state of any registers in the processor. The only thing it does is to decide if the next state is state 1 or state 35. Can we do anything useful in state 20 so that the cycle is not wasted and the microarchitecture still emulates the LC-2 instruction set correctly? What would be the corresponding changes to the microinstruction for state 20?

2.

a) In which states in the LC-2 state diagram should the LD_BEN signal be asserted? Is there a way for the LC-2 to work correctly without the LD_BEN signal? Explain.
b) Suppose we want to get rid of the BEN register altogether. Can this be done? If so, explain how. If not, why not? Is it a good idea? Explain.
c) Suppose we took this further and got rid of state 20. The figure below shows a modified microsequencer. What are the signals denoted as A and B in the figure?

3. (Hamacher, pg.255, question 5.13)  A byte-addressable computer has a small data cache capable of holding eight 32-bit words. Each cache block consists of one 32-bit word. When a given program is executed, the processor reads data from the following sequence of hex addresses:

`     200, 204, 208, 20C, 2F4, 2F0, 200, 204, 218, 21C, 24C, 2F4`
This pattern is repeated four times.
(a) Show the contents of the cache at the end of each pass through this loop if a direct-mapped cache is used. Compute the hit rate for this example. Assume that the cache is initially empty.
(b) Repeat part (a) for a fully-associative cache that uses the LRU replacement algorithm.
(c) Repeat part (a) for a four-way set-associative cache that uses a perfect LRU scheme.
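
To check hand-worked answers for this kind of question, a small simulator can help. The following is a minimal sketch, not part of the original problem: the function name `simulate_cache` and its parameters are illustrative assumptions, and it models LRU replacement within each set (with `ways=1` giving a direct-mapped cache and `ways=num_blocks` a fully-associative one).

```python
from collections import OrderedDict


def simulate_cache(addresses, num_blocks=8, block_bytes=4, ways=1):
    """Simulate a set-associative cache with LRU replacement (hypothetical helper).

    Returns (hits, sets), where sets[i] is an OrderedDict whose keys are the
    tags resident in set i, ordered from least to most recently used.
    """
    num_sets = num_blocks // ways
    sets = [OrderedDict() for _ in range(num_sets)]
    hits = 0
    for addr in addresses:
        block = addr // block_bytes   # block number in memory
        index = block % num_sets      # which set the block maps to
        tag = block // num_sets       # remaining high-order bits
        lines = sets[index]
        if tag in lines:
            hits += 1
            lines.move_to_end(tag)    # mark as most recently used
        else:
            if len(lines) == ways:
                lines.popitem(last=False)  # evict the least recently used tag
            lines[tag] = True
    return hits, sets
```

For the problem above, one would pass the twelve-address sequence repeated four times (for example `trace * 4`) and divide the returned hit count by the total number of accesses; calling the function on successively longer prefixes shows the contents after each pass.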

4. A computer with a 64-bit wide data bus uses 1Mbit (2^20 locations, 1-bit addressability = 2^20 * 1 bit) DRAM memory chips. What is the smallest memory in bytes that this computer can have?

5. A processor supports byte-addressable memory with a 30-bit address space. The processor is connected to memory via a 64-bit data bus. Design an eight-way-interleaved memory that supports the full address space of the processor. Use only 512Kbit (2^19 * 1 bit) memory chips. Draw a diagram of your memory system, with chip enables, write enables, data bus, and address bus. On your diagram, label memory locations 0 through 31. How big is this memory? Give a breakdown of each field in a memory address. (Don't worry about the logic for unaligned accesses.)

6. Suppose the processor in problem 5 is updated to support a 32-bit address space with byte-addressable memory. Using the physical memory developed above, we would like to support virtual memory. The virtual memory will support the following features:

```
8KB page size
4 levels of access:
    none
    execute
user and kernel privilege levels
page replacement using a reference bit
```
How many pages are there in virtual memory? How many frames are there in physical memory? Design an access protection scheme that supports all of the above access and privilege levels using a minimum number of bits. Show a PTE for this memory, and specify the length of each field in the PTE. Finally, show how a 32-bit virtual address is translated into a physical address.
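
The mechanics of splitting a virtual address into a page number and an offset, and of rebuilding a physical address from a frame number, can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the page size is a parameter rather than the value from this problem, and the page-table lookup that maps a page number to a frame number is left out.

```python
def split_virtual_address(va, page_bytes, va_bits=32):
    """Return (virtual page number, offset) for a power-of-two page size."""
    offset_bits = page_bytes.bit_length() - 1
    if 1 << offset_bits != page_bytes:
        raise ValueError("page size must be a power of two")
    va &= (1 << va_bits) - 1          # keep only the bits of the address space
    return va >> offset_bits, va & (page_bytes - 1)


def join_physical_address(pfn, offset, page_bytes):
    """Concatenate a page frame number with the unchanged page offset."""
    return pfn * page_bytes + offset
```

For instance, with 4KB pages `split_virtual_address(0x12345, 4096)` yields page number `0x12` and offset `0x345`; the offset passes through translation unchanged, and only the page number is replaced by the frame number found in the PTE.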

7. The virtual address of variable x is x3456789A. Using the VAX's virtual memory architecture, find the physical address of x.

You will need to know the contents of P0BR: x8AC40000 and SBR: x000C8000.

You will also need to know the contents of physical memory locations:

x1EBA6EF0:    x80000A72
x0022D958:    x800F5D37
a. What virtual page of P0 Space is x on?

b. What is the VA of the PTE of the page containing x?

c. What virtual page of System Space is this PTE on?

d. What is the PA of the PTE of this page of System Space?

e. What is the PA of the PTE of the page containing x?
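
The chain of lookups in parts a through e can be sketched in code. This is a minimal sketch under assumptions that should be checked against the VAX reference material used in class: 512-byte pages, 4-byte PTEs, the top two address bits selecting the region, and the PFN held in the low 21 bits of a PTE. Length-register checks, the valid bit, and protection are ignored, and `read_phys` is a hypothetical callback that returns the 32-bit contents of a physical address.

```python
PAGE_SHIFT = 9                        # assumed 512-byte VAX pages
OFFSET_MASK = (1 << PAGE_SHIFT) - 1   # low 9 bits: byte within page
PFN_MASK = (1 << 21) - 1              # assumed PFN field: low 21 bits of a PTE
REGION_MASK = 0x3FFFFFFF              # strips the two region-select bits


def system_va_to_pa(va, sbr, read_phys):
    """Translate a System Space virtual address using the System Page Table."""
    vpn = (va & REGION_MASK) >> PAGE_SHIFT
    pte = read_phys(sbr + 4 * vpn)    # SBR holds a physical address
    return ((pte & PFN_MASK) << PAGE_SHIFT) | (va & OFFSET_MASK)


def p0_va_to_pa(va, p0br, sbr, read_phys):
    """Translate a P0 Space virtual address; its page table lives in System Space."""
    vpn = (va & REGION_MASK) >> PAGE_SHIFT
    pte_va = p0br + 4 * vpn           # virtual address of the P0 PTE
    pte_pa = system_va_to_pa(pte_va, sbr, read_phys)
    pte = read_phys(pte_pa)
    return ((pte & PFN_MASK) << PAGE_SHIFT) | (va & OFFSET_MASK)
```

To try it, the known physical memory contents can be placed in a dictionary and its `__getitem__` method passed as `read_phys`; each intermediate value in the functions corresponds to one of the lettered parts above.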

8. (Hamacher, pg.255, question 5.20) A 1024x1024 array of 32-bit numbers is to be normalized as follows. For each column, the largest element is found and all elements of the column are divided by this maximum value. Assume that each page in the virtual memory consists of 4Kbytes and that 1Mbyte of the main memory is allocated for storing data during this computation. Suppose that it takes 40 ms to load a page from the disk to the main memory when a page fault occurs (assume that when we start, the main memory is empty).
a. How many page faults would occur if the elements of the array are stored in column order in the virtual memory?

b. How many page faults would occur if the elements are stored in row order?

c. Estimate the total time needed to perform this normalization for both arrangements a & b. Assume that it takes 2 ns to do a comparison, 20 ns to do a divide and 100 ns to do a load/store to memory.

9. We have been referring to the LC-2 memory as 64 Kwords of memory, word-addressable. This is the memory that the user sees, and may bear no relationship to the actual physical memory. Suppose the actual physical address space is 8Kwords, and we keep the notion of 512-word pages. What is the size of the PFN? Suppose we use the VAX convention of partitioning the virtual address space into User Space (P0) and System Space, with 48 Kwords of user space and 16 Kwords of system space. Suppose we further insist, like the VAX, that the System Page Table remain resident in physical memory. If each PTE contained, in addition to the PFN, a Valid bit, a modify bit, and two bits of access control, how many bits of physical memory would be required to store the System Page Table?