# Department of Electrical and Computer Engineering The University of Texas at Austin

EE 360N, Spring 2007
Yale Patt, Instructor
Chang Joo Lee, Rustam Miftakhutdinov, Poorna Samanta, TAs
Exam 1, March 7, 2007


| Name:                  | SOLUTIONS |
|------------------------|-----------|
| Problem 1 (25 points): |           |
| Problem 2 (10 points): |           |
| Problem 3 (20 points): |           |
| Problem 4 (20 points): |           |
| Problem 5 (25 points): |           |
| Total (100 points):    |           |

Note: Please be sure that your answers to all questions (and all supporting work that is required) are contained in the space provided.

Note: Please be sure your name is recorded on each sheet of the exam.

GOOD LUCK!

### Problem 1 (25 points)

Part a (5 points): A 1GB physical memory system is byte addressable and 16-way interleaved. The memory is made out of 8MB chips, each with 8 data pins. Identify which bits in the address are row address bits, chip address bits, byte on bus bits and interleave bits.

Address bits 29 (most significant) down to 0, with fields labeled from high-order to low-order: chip address, interleaving, byte on bus.

No row bits (there's only one row)
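The field widths in Part a can be checked with a short calculation. The sketch below is illustrative only: the constant names are made up for this example, the single row (rank) is taken from the answer above, and the placement of byte on bus in the lowest bits, followed by interleave and then chip address, is an assumption consistent with the label order in the answer.

```python
# Hypothetical names; values come from the problem statement.
TOTAL_BYTES = 1 << 30        # 1GB physical memory
CHIP_BYTES = 8 << 20         # 8MB per chip
CHIP_DATA_PINS = 8           # each chip supplies 8 bits
BANKS = 16                   # 16-way interleaved
ROWS = 1                     # from the answer: only one row


def log2_exact(n):
    """Return log2(n) for a power of two n."""
    assert n > 0 and n & (n - 1) == 0, "expected a power of two"
    return n.bit_length() - 1


chips = TOTAL_BYTES // CHIP_BYTES                   # 128 chips in total
chips_per_bank = chips // (BANKS * ROWS)            # 8 chips side by side
bus_bytes = chips_per_bank * CHIP_DATA_PINS // 8    # 8-byte (64-bit) bus

widths = {
    "byte_on_bus": log2_exact(bus_bytes),                          # 3
    "interleave": log2_exact(BANKS),                               # 4
    "chip_address": log2_exact(CHIP_BYTES * 8 // CHIP_DATA_PINS),  # 23
    "row": log2_exact(ROWS),                                       # 0
}

# Assumed ordering, low bits first: byte on bus, interleave, chip address.
ranges = {}
low = 0
for name in ("byte_on_bus", "interleave", "chip_address"):
    ranges[name] = (low + widths[name] - 1, low)
    low += widths[name]

print(widths)
print(ranges)
```

Under these assumptions the widths add up to the 30 address bits of a 1GB memory: chip address in bits [29:7], interleave in bits [6:3], and byte on bus in bits [2:0].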

Part b (5 points): If we add ECC protection to the LC-3b, then each 16 bit word would require another 5 bits to be able to correct a one bit error. Suppose we did, and we got back the following 21 bit pattern.



Which bit was in error?

19
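The 21-bit pattern from the exam is not reproduced here, so the following sketch uses a made-up data word to show how a single-bit error is located. It assumes the usual Hamming arrangement: bit positions numbered from 1, parity bits at positions 1, 2, 4, 8 and 16, and even parity. All function names are hypothetical.

```python
def parity_bits_needed(data_bits):
    """Smallest p with 2**p >= data_bits + p + 1 (single error correction)."""
    p = 0
    while (1 << p) < data_bits + p + 1:
        p += 1
    return p


def encode(data, data_bits=16):
    """Return the codeword as {position: bit}, positions starting at 1."""
    p = parity_bits_needed(data_bits)
    n = data_bits + p
    word = {}
    next_data_bit = 0
    for pos in range(1, n + 1):
        if pos & (pos - 1) == 0:          # power of two: parity slot
            word[pos] = 0
        else:                             # data slot, filled LSB first
            word[pos] = (data >> next_data_bit) & 1
            next_data_bit += 1
    for k in range(p):                    # set each parity bit (even parity)
        mask = 1 << k
        parity = 0
        for pos in range(1, n + 1):
            if pos & mask and pos != mask:
                parity ^= word[pos]
        word[mask] = parity
    return word


def syndrome(word):
    """XOR of the positions holding a 1; 0 means no single-bit error."""
    s = 0
    for pos, bit in word.items():
        if bit:
            s ^= pos
    return s


codeword = encode(0xBEEF)     # arbitrary example data, not the exam pattern
codeword[19] ^= 1             # inject a one-bit error at position 19
print(parity_bits_needed(16), syndrome(codeword))
```

With a single flipped bit, the syndrome equals the position of that bit, which is how an answer such as 19 is read off the received pattern.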

Part c (5 points): The atomic unit of processing is:

instruction


### Problem 1 continued

Part d (5 points): Some of the following are part of the ISA, the rest are part of the microarchitecture. Put a check mark next to each that is part of the ISA.

page size:

MDR:

condition codes:

memory ready bit (R):

trap vector:

Part e (5 points): The xyz machine, which is big-endian, executes LD32 R1,A. Relevant memory locations before the instruction executes are as shown below:

A: 11110000

A+1: 11111111

A+2: 10101010

A+3: 00000000

After execution, R1 contains:

| Bits | 31:24    | 23:16    | 15:8     | 7:0      |
|------|----------|----------|----------|----------|
| R1   | 11110000 | 11111111 | 10101010 | 00000000 |
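As a quick check of the byte ordering, the sketch below assembles the four bytes at A through A+3 into a 32-bit value. The variable names are illustrative; the only assumption is that LD32 loads four consecutive bytes starting at A.

```python
# Bytes at A, A+1, A+2, A+3 as given in the problem.
memory_bytes = [0b11110000, 0b11111111, 0b10101010, 0b00000000]

# Big-endian: the byte at the lowest address is the most significant byte.
r1_big = int.from_bytes(bytes(memory_bytes), byteorder="big")

# For contrast, a little-endian machine would treat A as the least significant byte.
r1_little = int.from_bytes(bytes(memory_bytes), byteorder="little")

print(format(r1_big, "032b"), hex(r1_big))
print(format(r1_little, "032b"), hex(r1_little))
```

The big-endian result is xF0FFAA00, matching the contents of R1 above.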


### Problem 2 (10 points)

We have not talked about pipelining yet in class. When we do, you will see a pipelined machine can easily issue a memory request every cycle. At the memory controller side, however, we may need a queue to buffer the memory addresses if there are memory bank conflicts. In this example, eight successive memory accesses have arrived at the memory controller, and are buffered until their banks are free. Accesses must be done in the order in which they arrived. The figure below shows which memory accesses are active during each cycle.

[Figure: timing diagram with clock cycles 1 through 16 on the horizontal axis, showing the cycles during which each of memory accesses 1 through 8 is active.]

Note that memory access 1 is initiated in cycle 1 and returns data at the end of cycle 5. Memory has an access time of five cycles and is four-way interleaved.

Your job: Identify which bank each memory access goes to and fill in the table below accordingly. Memory access 1 has already been entered. (Note: there are several correct solutions: any one of them will receive full credit.)

| BANK 3 | BANK 2 | BANK 1 | BANK 0 |
|--------|--------|--------|--------|
| 7      | 3      | 2      | 1      |
|        | 6      | 5      | 4      |
|        |        | 8      |        |
|        |        |        |        |

Note: This is one of many possible correct solutions. A correct solution would satisfy the following constraints:

a] Accesses 1 & 4 go to the same bank.

b] Accesses 5 & 8 go to the same bank.

c] Accesses 1, 2, 3 go to different banks.

d] Accesses 4, 5, 6, 7 go to different banks.
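The constraints can be explored with a small in-order scheduling model. This is a sketch under stated assumptions, not the controller from the exam: it assumes at most one access is started per cycle, accesses start strictly in arrival order, and a bank is busy for five cycles once an access starts. The function and variable names are hypothetical.

```python
ACCESS_TIME = 5  # cycles a bank stays busy per access


def start_cycles(banks):
    """Return the start cycle of each access, issued in order."""
    bank_free_at = {}      # bank -> first cycle in which it is free again
    previous_start = 0
    starts = []
    for bank in banks:
        # Start no earlier than one cycle after the previous access,
        # and no earlier than the cycle the target bank becomes free.
        start = max(previous_start + 1, bank_free_at.get(bank, 1))
        starts.append(start)
        bank_free_at[bank] = start + ACCESS_TIME
        previous_start = start
    return starts


# Bank assignment from the solution table: accesses 1..8.
solution_banks = [0, 1, 2, 0, 1, 2, 3, 1]
starts = start_cycles(solution_banks)
print(starts)                            # [1, 2, 3, 6, 7, 8, 9, 12]
print(starts[-1] + ACCESS_TIME - 1)      # last access completes in cycle 16
```

Under this model, access 4 waits until cycle 6 because it shares a bank with access 1, and access 8 waits until cycle 12 because it shares a bank with access 5, which is what constraints a] and b] express.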


### Problem 3 (20 points)

An x86 assembly language programmer complained that the LC-3b did not have what to her was the most valuable addressing mode which is available in the x86 ISA. Recall that the x86 instruction is variable length. One of the optional bytes in that instruction is called SIB (for Scale/Index/Base). It allows one to construct an address by scaling (multiplying) the contents of one register (the Index) and adding the result to the contents of another register (the Base). That is, Address = Base + Scale\*Index.

NO PROBLEM, we say. We will use an unused opcode to provide the same capability with the LC-3b ISA. We will call the new opcode SIB:

SIB DR, BaseR, Scale, IndexR

which will load DR with the address computed by multiplying the IndexR register by $2^{\text{Scale}}$ and adding the result to the contents of the BaseR register.

We thus get the same effect as the x86 SIB byte, only it takes two LC-3b instructions. That is,

SIB R5, R3, #3, R2

LDW R1, R5, #0

will load R1 with the contents of memory whose address is obtained by adding R3 to the product of R2 and $2^3$.
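The address arithmetic of the proposed instruction can be sketched in a few lines. The function name and the register values below are made up for illustration; the 16-bit wraparound is an assumption based on the LC-3b having 16-bit registers.

```python
def sib_address(base, scale, index):
    """Compute Base + Index * 2**Scale, truncated to 16 bits."""
    return (base + (index << scale)) & 0xFFFF


# Example values (hypothetical): R3 = x3000, R2 = x0002, Scale = 3.
r3, r2 = 0x3000, 0x0002
r5 = sib_address(r3, 3, r2)
print(hex(r5))   # 0x3010, the address the following LDW would read
```

Multiplying by a power of two is a left shift, which is why the instruction can be built from a shifter followed by an adder.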

The next page shows the data sheet for the SIB instruction in the style of Appendix A.

Part a (5 points): We can implement the SIB instruction with either one extra state or two extra states in the state

Two states are better. To do a shift followed by an add in one cycle would increase the cycle time substantially. Saving one cycle to execute SIB at the expense of lengthening the cycle time of everything else is not a good design decision.

Part b (15 points): Your job here is to implement the SIB instruction with two extra states (state 10, and state 26). Using the notation of the LC-3b State Diagram, describe what happens in these states inside their corresponding bubbles. Show all output arc(s) to indicate the next state after state 10 and state 26.



Using the data path diagram labeled "SIB with two extra states" on the next page, add any additional structures and any control signals needed to implement SIB as specified by the two states shown in the bubbles. Label any additional control signals "ECS 1" (for "extra control signal 1"), "ECS 2" etc.

Show the values in the figure below for each control signal corresponding to states 10 and 26.



|          | IRD | COND1 | COND0 | J5 | J4 | J3 | J2 | J1 | J0 |
|----------|-----|-------|-------|----|----|----|----|----|----|
| State 10 | 0   | 0     | 0     | 0  | 1  | 1  | 0  | 1  | 0  |
| State 26 | 0   | 0     | 0     | 0  | 1  | 0  | 0  | 1  | 0  |

## SIB with two extra states



- SHF CONTROL MUX
- SHFINPUTMUX
- SR2TEMPSELMUX


### Problem 4 (20 points)

Consider the following two level virtual memory system for the LC-3b:

- Virtual Address Space: 64KB
- Physical Memory Size: 4KB
- Page Size: 256 bytes
- Page Table Entry Size: 2 bytes
- User Space Range: x0000 to x7FFF
- System Space Range: x8000 to xFFFF

The system does not include a Translation Lookaside Buffer. The Page Table Entry format is as follows:

| Bit   | 15 | 14 | 13 | 12 | 11 | ...     | ... 0 |
|-------|----|----|----|----|----|---------|-------|
| Field | V  | 0  | 0  | 0  | M  | 0 ... 0 | PFN   |

Part a (2 points): How many bits are allocated for the Page Frame Number (PFN) in the PTE? Show the computation.

$$\frac{4\text{ KB}}{256\text{ B}} = \frac{2^{12}}{2^{8}} = 2^{4}\text{ frames}$$

Answer: 4 bits

Part b (18 points): The machine stopped at a breakpoint and the following state information was observed:

Note: SBR is the System Page Table Base Register and UBR is the User Page Table Base Register. Each points to the first entry of the corresponding page table.

After execution resumed, the machine issued the following successive six physical memory requests, uninterrupted by any page faults, access control violations, or anything else. Note that each entry is incomplete. Your job: complete the six entries.

| Access # | PA   | Data  | Identity of Item Being Read       |
|----------|------|-------|-----------------------------------|
| 1        | x020 | x8004 | The PTE for System Virt. Page x10 |
| 2        | x462 | x8001 | The PTE for User Virt. Page x31   |
| 3        | x100 | x6200 | The instruction LDW R1, R0, #0    |
| 4        | x020 | x8004 | The PTE for System Virt. Page x10 |
| 5        | x460 | x8007 | The PTE for User Virt. Page x30   |
| 6        | x7F0 | x8004 | Data due to the LDW instr.        |

Note: The last column should identify what is being read specifically. For example: "The instruction JSR HELP," "The PTE for User Virtual Page 0," "PTE for System Virtual Page 0," "Data due to load instruction," etc.
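The six accesses can be reproduced with a small model of the two-level translation. The breakpoint state is not reproduced in this document, so the values SBR = x000, UBR = x9000, PC = x3100 and R0 = x30F0 below are inferred from the table and should be treated as assumptions. The function names are hypothetical, and system virtual pages are assumed to be numbered from x8000.

```python
PAGE_BITS = 8        # 256-byte pages
PTE_SIZE = 2         # bytes per page table entry
PFN_MASK = 0xF       # 4-bit PFN, from Part a

SBR = 0x000          # assumed: physical address of the system page table
UBR = 0x9000         # assumed: system virtual address of the user page table

# PTE contents taken from the Data column of the table.
physical_memory = {0x020: 0x8004, 0x462: 0x8001, 0x460: 0x8007}


def translate_system(va, trace):
    """Translate a system-space VA; the system page table is in physical memory."""
    vpn = (va - 0x8000) >> PAGE_BITS
    pte_pa = SBR + PTE_SIZE * vpn
    trace.append(pte_pa)                       # read of the system PTE
    pfn = physical_memory[pte_pa] & PFN_MASK
    return (pfn << PAGE_BITS) | (va & 0xFF)


def translate_user(va, trace):
    """Translate a user-space VA; the user page table is in system virtual space."""
    vpn = va >> PAGE_BITS
    pte_va = UBR + PTE_SIZE * vpn
    pte_pa = translate_system(pte_va, trace)
    trace.append(pte_pa)                       # read of the user PTE
    pfn = physical_memory[pte_pa] & PFN_MASK
    pa = (pfn << PAGE_BITS) | (va & 0xFF)
    trace.append(pa)                           # read of the instruction or data
    return pa


trace = []
translate_user(0x3100, trace)   # instruction fetch at the assumed PC
translate_user(0x30F0, trace)   # LDW data access at the assumed R0 + 0
print([hex(pa) for pa in trace])
```

With these inferred values the trace is x020, x462, x100, x020, x460, x7F0, which matches the PA column in the table.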


### Problem 5 continued:

The control bits are encoded as follows:

| Signal       | 0             | 1                |
|--------------|---------------|------------------|
| DATASIZE     | Byte          | Word             |
| MDRHIGH.LD   | No Load       | Load High Byte   |
| MDRLOW.LD    | No Load       | Load Low Byte    |
| ROTATE       | No Rotation   | 8-bit Rotation   |
| FIRST/SECOND | First Access  | Second Access    |
| MDR.READY    | MDR Not Ready | MDR Ready        |

Part a (5 points): Identify the 1-bit signal X and the 15-bit signal Y shown on the diagram. (Note: they are generated by the processor.)



Part b (5 points): The state machine for the Unaligned Load Controller is shown below. State 0 is an idle state, where no memory access is occurring. In state 3, the state machine sets MDR.READY which the processor interprets as the old R bit from memory, so it can move on and read the value in the MDR.



Briefly explain what is accomplished in States 1 and 2.




### Problem 5 continued:

Part c (10 points): The Unaligned Load Controller has signals C1, C2, C3, C4, and C5, which are used to transition the controller through its states. Complete the logic equations for those control signals. (Note: No control signals are required to transition from state 3 to state 0.)

| Signal | Logic equation                                  |
|--------|-------------------------------------------------|
| C1 =   | MEM.READY                                       |
| C2 =   | MEM.READY · DATASIZE · MAR[0]                   |
| C3 =   | MEM.READY · $\overline{\text{DATASIZE} \cdot \text{MAR[0]}}$ |
| C4 =   | MEM.READY                                       |
| C5 =   | MEM.READY                                       |

Part d (5 points): The Unaligned Load Controller is best thought of as a Moore Machine. That is, its outputs are associated with the state. Complete the table below, identifying the value of each output signal for each state.

|         | MEM.CE | FIRST/SECOND | MDRHIGH.LD | MDRLOW.LD | MDR.READY |
|---------|--------|--------------|------------|-----------|-----------|
| State 0 | 0      | X            | 0          | 0         | 0         |
| State 1 | 1      | 0            | 1          | 1         | 0         |
| State 2 | 1      | 1            | 1          | 0         | 0         |
| State 3 | 0      | X            | 0          | 0         | 1         |