Department of Electrical and Computer Engineering

The University of Texas at Austin

EE 360N, Spring 2009
Problem Set 1 Solutions
Yale N. Patt, Instructor
Ramapriyan Chakravarthy, Khubaib, Vivekanand Venugopal, TAs
  1. The ISA is the contract between the hardware and the software. The microarchitecture is a particular implementation of an ISA. Compiler does not need to know any information about the microarchitecture.

    1. ISA
    2. Microarchitecture
    3. ISA
    4. Microarchitecture
    5. ISA
    6. Microarchitecture/ISA
    7. ISA
    8. Microarchitecture
    9. Microarchitecture
    10. Microarchitecture
  2. The first program causes x0004 to be stored in location x3000 when the assembled code is loaded into the memory. The second program causes x0004 to be stored in x3000 during the execution of the program.

  3. The following table shows the classification of the LC-3b instructions into Operate, Data Movement and Control instructions.

    InstructionOperateData MovementControl
    ADD X
    AND X
    BR X
    JMP/RET X
    JSR/JSRR X
    LDB X
    LDW X
    LEA X
    RTI X
    SHF X
    STB X
    STW X
    TRAP X
    XOR/NOT X
  4. Only the third instruction (never branch) can be used as a NOP. The second instruction always skips the next instruction after it, so it is not a NOP. The first instruction (ADD) sets the condition codes based on the value in R1, therefore it is also not a NOP.

    1. The first option does not support either subroutine nesting or subroutine recursion.
    2. The second option supports subroutine nesting, but does not support recursion.
    3. The third option supports both subroutine nesting and subroutine recursion.
    1. Little endian: MEM[x1000] + MEM[x1002] = x1A0E + x0C11 = x261F
    2. Big endian: MEM[x1000] + MEM[x1002] = x0E1A + x110C = x1F26
  5. 32 Megabytes = 228 bits
    1. The memory consists of 228 addressable locations. 28 bits are required to uniquely address each location.
    2. The memory consists of 228 ÷ 23 = 225 addressable locations. 25 bits are required to address a location.
    3. The memory consists of 228 ÷ 27 = 221 addressable locations. 21 bits are required to address a location.
    1. Zero-address machine:
      PUSH A
      PUSH B
      PUSH C
      MUL
      ADD
      PUSH D
      PUSH E
      PUSH D
      PUSH C
      MUL
      ADD
      SUB
      MUL
      POP X

      Advantages: good for fast arithmetic; operate instructions only require an opcode so they can be encoded very densely.

      Disadvantages: not flexible in terms of manipulating operands, more instructions required to write programs.

    2. One-address machine:
      LOAD B
      MUL C
      ADD A
      STORE A
      LOAD C
      MUL D
      ADD E
      STORE E
      LOAD D
      SUB E
      MUL A
      STORE X

      Advantages: only a single register required, fewer instructions compared to the stack machine.

      Disadvantage: not as flexible as 2 or 3-address machines. We need instructions to move data to and from the accumulator (We don't need these in a 3-address machine that supports memory-to-memory operations).

    3. Two-address machine:
      MUL B, C
      ADD A, B
      MUL C, D
      ADD C, E
      SUB D, C
      MUL A, D
      SUB X, X   ; these two instructions emulate a MOV X, A
      ADD X, A
    4. Three-address machine:
      MUL X, B, C
      ADD A, X, A
      MUL X, D, C
      ADD X, E, X
      SUB X, D, X
      MUL X, X, A
    1. 2 address machine.
    2. 4 registers
    3. Lets say you want MOV R2, R1 (R2 = R1):
      AND R2, #0
      ADD R2, R1
      or
      LEA  TEMP   ; Absolute address of TEMP in R0
      ST   R1, R0 
      LD   R2, R0
      ....
      ....
      TEMP  .FILL  xDEAD  ; any garbage value
    4. Use the LEA instruction:
      	LEA LABEL
              ADD R2, R3 ; Set the Condition Codes overwritten by LEA
      	BR  R0
      	....
      LABEL	....	; Branch to this line


    1. SymbolAddress
      AGAIN x300E
      NO x3022
      B x3024
      A x3026

    2. If the high and low byte of the word stored at location x4000 are equal, R5 is set to 1, else to 0.

    3. There are several possible answers to this code optimization question. Some of these are:

      • The programmer used a loop to left shift a value in R2 by 8 bits. He/she could have done this using a single LC-3b instruction, LSHF. He/she should have replaced the loop
        AGAIN   ADD R2, R2, R2
                ADD R3, R3, #-1
                BRp AGAIN
        with
                LSHF R2, R2, #8

      • Instead of using "subtraction" to compare the high and low byte, the programmer could have used a single XOR instruction to check the equality of the two bytes. He/she could have replaced the following instructions
                NOT R1, R1
        	     ADD R1, R1, #1
        	     ADD R2, R2, R1
        with
                XOR R2, R1, R2
      • The program is comparing the high byte and low byte of the word in memory location x4000. The programmer could have utilized the LDB instruction to load the high byte and low byte into two separate registers and compare them. This way there would be no need for shifting and masking. The optimized program could look like this:
                .ORIG   x3000
        	AND     R5, R5, #0
        	LEA	R0, A
        	LDW	R0, R0, #0
        	LDB	R1, R0, #0
        	LDB	R2, R0, #1
        	XOR	R2, R1, R2
        	BRnp    NO
        	ADD     R5, R5, #1
        NO	HALT
        A       .FILL   x4000
                .END	 

    1. 	.ORIG	x4000
      MAIN LEA R2,L0
      JSRR R2
      JSR L1
      HALT
      ;
      L0 ADD R0,R0,#5
      RET
      ;
      L1 ADD R1,R1,#5
      RET
            

      x4000 xE403
      x4002 x4080
      x4004 x4803
      x4006 xF025
       
      x4008 x1025
      x400A xC1C0
       
      x400C x1265
      x400E xC1C0
    2. As long as the subroutine we are calling is located at most 1023 instructions before the JSR instruction or at most 1024 instructions after the JSR instruction, there is no need to use the first way, which requires at least two instructions. This is due to the fact that JSR instruction can change the PC to an address within the range PC + 2 - 2048 and PC + 2 + 2046, because it uses a limited offset of 11 bits.

      The method that requires two instructions is necessary if the subroutine we are calling is not within the range of the JSR instruction. Note that the JSRR instruction can change the PC to any address residing in its base register. The address in the base register can be set to any address in memory.

      For example if we have a program starting at memory location x3000 and we would like to call a subroutine starting at memory location 0xF000, there is no way to do this by just using a JSR instruction at location x3000. However, we can call the subroutine at 0xF000 using JSRR with the following sequence of instructions:

                 .ORIG      x3000
                 LEA        R0, SUBADDR
                 LDW        R1, R0, #0
                 JSRR	      R1
                 HALT
      SUBADDR    .FILL      xF000
                 .END