Department of Electrical and Computer Engineering

The University of Texas at Austin

EE 306, Fall 2017
Problem Set 5
Due: before class, November 13.
Yale N. Patt, Instructor
TAs: Stephen Pruett, Siavash Zangeneh, Aniket Deshmukh, Zachary Susskind, Meiling Tang, Jiahan Liu

Instructions:
You are encouraged to work on the problem set in groups and turn in one problem set for the entire group. Remember to put all your names on the solution sheet. Also, remember to put the name of the TA and the time for the discussion section you would like the problem set turned back to you. Show your work.


  1. The LC-3 has just finished executing a large program. A careful examination of each clock cycle reveals that the number of executed store instructions (ST, STR, and STI) is greater than the number of executed load instructions (LD, LDR, and LDI). However, the number of memory write accesses is less than the number of memory read accesses, excluding instruction fetches. How can that be? Be sure to specify which instructions may account for the discrepancy.


  2. (Adapted from 7.18) The following LC-3 program compares two character strings of the same length. The source strings are in the .STRINGZ form. The first string starts at memory location x4000, and the second string starts at memory location x4100. If the strings are the same, the program terminates with the value 1 in R5; otherwise the program terminates with the value 0 in R5. Insert one instruction each at (a), (b), and (c) that will complete the program. Note: The memory location immediately following each string contains x0000.

    .ORIG x3000
    LD R1, FIRST
    LD R2, SECOND
    AND R0, R0, #0
    LOOP____________________; (a)
    LDR R4, R2, #0
    BRz NEXT
    ADD R1, R1, #1
    ADD R2, R2, #1
    ____________________; (b)
    ____________________; (c)
    ADD R3, R3, R4
    BRz LOOP
    AND R5, R5, #0
    BRnzp DONE
    NEXTAND R5, R5, #0
    ADD R5, R5, #1
    DONETRAP x25
    FIRST   .FILL x4000
    SECOND   .FILL x4100
    .END



    1. Bob Computer just bought a fancy new graphics display for his LC-3. In order to test out how fast it is, he rewrote the OUT trap handler so it would not check the DSR before outputting. Sadly he discovered that his display was not fast enough to keep up with the speed at which the LC-3 was writing to the DDR. How was he able to tell?

    2. Bob also rewrote the handler for GETC, but when he typed ABCD into the keyboard, the following values were input:

      
      AAAAAAAAAAAAAAAAABBBBBBBBBBBBBBBBBBBBCCCCCCCCCCCCCCCCCCCDDDDDDDDDDDDDDDDDDDD
      

      What did Bob do wrong?



  3. (Adapted from 6.16) Shown below are the partial contents of memory locations x3000 to x3006.

      15 0
    x3000 0 0 1 0 0 0 0                  
    x3001 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 1
    x3002 1 0 1 1 0 0 0                  
    x3003                                
    x3004 1 1 1 1 0 0 0 0 0 0 1 0 0 1 0 1
    x3005 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0
    x3006                                


    The PC contains the value x3000, and the RUN button is pushed.

    As the program executes, we keep track of all values loaded into the MAR. Such a record is often referred to as an address trace. It is shown below.

    MAR Trace
    x3000
    x3005
    x3001
    x3002
    x3006
    x4001
    x3003
    x0021

    Your job: Fill in the missing bits in memory locations x3000 to x3006.


  4. (Adapted from 10.1)
    What are the defining characteristics of a stack? Give two implementations of a stack and describe their differences.

  5. (Adapted from 10.9) The input stream of a stack is a list of all the elements we pushed onto the stack, in the order that we pushed them. The input stream from Exercise 10.8 on page 284 of the book for example is ABCDEFGHIJKLM
    The output stream is a list of all the elements that are popped off the stack in the order that they are popped off.
    a. If the input stream is ZYXWVUTSR, create a sequence of pushes and pops such that the output stream is YXVUWZSRT.
    b. If the input stream is ZYXW, how many different output streams can be created?

  6. (Adapted from 10.6) Rewrite the PUSH and POP routines such that the stack on which they operate holds elements that take up two memory locations each. Assume we are writing a program to simulate a stack machine that manipulates 32-bit integers with the LC-3. We would need PUSH and POP routines that operate with a stack that holds elements which take up two memory locations each. Rewrite the PUSH and POP routines for this to be possible.

  7. A zero-address machine is a stack-based machine where all operations are done by using values stored on the operand stack. For this problem, you may assume that the ISA allows the following operations:

    PUSH M - pushes the value stored at memory location M onto the operand stack.

    POP M - pops the operand stack and stores the value into memory location M.

    OP - Pops two values off the operand stack and performs the binary operation OP on the two values. The result is pushed back onto the operand stack.

    Note 1: OP can be ADD, SUB, MUL, or DIV for parts a and b of this problem.
    Note 2: To perform DIV and SUB operations, the top element of the stack is considered as the second operand. i.e. If we first push "A" and then push "B" followed by a "SUB" operation, "A" and "B" will be popped from stack and "A-B" will be pushed into stack.

    1. Draw a picture of the stack after each of the instructions below are executed. What is the minimum number of memory locations that have to be used on the stack for the purposes of this program? Also write an arithmetic equation expressing u in terms of v, w, x, y, and z. The values u, v, w, x, y, and z are stored in memory locations U, V, W, X, Y, and Z.
                  PUSH V
                  PUSH W
                  PUSH X
                  PUSH Y
                  MUL
                  ADD
                  PUSH Z
                  SUB
                  DIV
                  POP U
      	    
    2. Write the assembly language code for a zero-address machine (using the same type of instructions from part a) for calculating the expression below. The values a, b, c, d, and e are stored in memory locations A, B, C, D, and E.

      e = ((a * ((b - c) + d))/(a + c))


  8. Jane Computer (Bob's adoring wife), not to be outdone by her husband, decided to rewrite the TRAP x22 handler at a different place in memory. Consider her implementation below. If a user writes a program that uses this TRAP handler to output an array of characters, how many times is the ADD instruction at the location with label A executed? Assume that the user only calls this "new" TRAP x22 once. What is wrong with this TRAP handler? Now add the necessary instructions so the TRAP handler executes properly.

    Hint: RET uses R7 as linkage back to the caller (RET is equivalent to JMP R7).

    
    ; TRAP handler
    ; Outputs ASCII characters stored in consecutive memory locations.
    ; R0 points to the first ASCII character before the new TRAP x22 is called.
    ; The null character (x00) provides a sentinel that terminates the output sequence.
    
            .ORIG x020F
    START   LDR  R1, R0, #0
            BRz  DONE
            ST   R0, SAVER0
            ADD  R0, R1, #0
            TRAP x21
            LD   R0, SAVER0
    A       ADD  R0, R0, #1
            BRnzp START
    DONE    RET
     
    SAVER0  .BLKW #1
            .END
    


  9. (Adapted from 9.2)
    1. How many TRAP service routines can be implemented in the LC-3? Why?

    2. Why must a RET instruction be used to return from a TRAP routine? Why won't a BRnzp (unconditional BR) instruction work instead?

    3. How many accesses to memory are made during the processing of a TRAP instruction?


  10. Assume that you have the following table in your program:

    
    MASKS   .FILL x0001
            .FILL x0002
            .FILL x0004
            .FILL x0008
            .FILL x0010
            .FILL x0020
            .FILL x0040
            .FILL x0080
            .FILL x0100
            .FILL x0200
            .FILL x0400
            .FILL x0800
            .FILL x1000
            .FILL x2000
            .FILL x4000
            .FILL x8000
    
    1. Write a subroutine CLEAR in LC-3 assembly language that clears a bit in R0 using the table above. The index of the bit to clear is specified in R1. R0 and R1 are inputs to the subroutine.

    2. Write a similar subroutine SET that sets the specified bit instead of clearing it.

    3. Hint: You should remember to save and restore any registers your subroutine uses (the "callee save" convention). Use the RET instruction as the last instruction in your subroutine (R7 contains the address of where in the caller to return to.)


  11. Suppose we are writing an algorithm to multiply the elements of an array (unpacked, 16-bit 2's complement numbers), and we are told that a subroutine "mult_all" exists which multiplies four values, and returns the product. The mult_all subroutine assumes the source operands are in R1, R2, R3, R4, and returns the product in R0. For purposes of this assignment, let us assume that the individual values are small enough that the result will always fit in a 16-bit 2's complement register.

    Your job: Using this subroutine, write a program to multiply the set of values contained in consecutive locations starting at location x6001. The number of such values is contained in x6000. Store your result at location x7000. Assume there is at least one value in the array(i.e., M[x6000] is greater than 0).

      Hint: Feel free to include in your program

      
      PTR	.FILL x6001
      CNT	.FILL x6000
      


  12. (9.13) The following program is supposed to print the number 5 on the screen. It does not work. Why? Answer in no more than ten words, please.
    
    	.ORIG 	x3000
    	JSR	A
    	OUT			;TRAP  x21
    	BRnzp	DONE
    A 	AND	R0,R0,#0
    	ADD	R0,R0,#5
    	JSR	B
    	RET	
    DONE	HALT
    ASCII	.FILL	x0030
    B	LD	R1,ASCII
    	ADD	R0,R0,R1
    	RET
    	.END
    

  13. (Question Moved to Problem Set 6 - updated 11/10/17)

  14. (8.16) What does the following LC-3 program do?
    
            .ORIG x3000
            LD  R0, ASCII
            LD  R1, NEG
    AGAIN   LDI  R2, DSR
            BRzp AGAIN
            STI  R0, DDR
            ADD  R0, R0, #1
            ADD  R2, R0, R1
            BRnp AGAIN
            HALT
    ASCII   .FILL x0041
    NEG     .FILL xFFB6
    DSR     .FILL xFE04
    DDR     .FILL xFE06
            .END
    

  15. (9.5) The following LC-3 program is assembled and then executed. There are no assemble time or run-time errors. What is the output of this program? Assume all registers are initialized to 0 before the program executes. (Updated 11/09/17)
    
            .ORIG x3000
            ST R0, #6 ; x3007
            LEA R0, LABEL
            TRAP x22
            TRAP x25
    LABEL   .STRINGZ "FUNKY"
    LABEL2  .STRINGZ "HELLO WORLD"
            .END
    

  16. The memory locations given below store students' exam scores in form of a linked list. Each node of the linked list uses three memory locations to store

    1. Address of the next node
    2. Starting address of the memory locations where name of the student is stored
    3. Starting address of the memory locations where the his/her exam score is stored

    in the given order. The first node is stored in locations x4000 ~ x4002. The ASCII code x0000 is used as a sentinel to indicate the end of the string. Both the name and exam score are stored as strings.
    Write down the students' names and scores in the order that they appear in the list.
    
        Address         Contents
        x4000           x4016
        x4001           x4003
        x4002           x4008
        x4003           x004D
        x4004           x0061
        x4005           x0072
        x4006           x0063
        x4007           x0000
        x4008           x0039
        x4009           x0030
        x400A           x0000
        x400B           x0000
        x400C           x4019
        x400D           x401E
        x400E           x004A
        x400F           x0061
        x4010           x0063
        x4011           x006B
        x4012           x0000
        x4013           x0031
        x4014           x0038
        x4015           x0000
        x4016           x400B
        x4017           x400E
        X4018           x4013
        x4019           x004D
        x401A           x0069
        x401B           x006B
        x401C           x0065
        x401D           x0000
        x401E           x0037
        x401F           x0036
        x4020           x0000
    

    1. The program below counts the number of zeros in a 16-bit word. Fill in the missing blanks below to make it work.
                  .ORIG x3000
                  AND   R0, R0, #0
                  LD    R1, SIXTEEN
                  LD    R2, WORD
      A           BRn   B
                  ________________		
      B           ________________
                  BRz   C
                  ________________	
                  BR    A	; note: BR = BRnzp
      C           ST    R0, RESULT
                  HALT
      
      SIXTEEN     .FILL #16
      WORD        .BLKW #1
      RESULT      .BLKW #1
                  .END
    2. After you have the correct answer above, what one instruction can you change (without adding any instructions) that will make the program count the number of ones instead?

  17. The main program below calls a subroutine, F. The F subroutine uses R3 and R4 as input, and produces an output which is placed in R0. The subroutine modifies registers R0, R3, R4, R5, and R6 in order to complete its task. F calls two other subroutines, SaveRegisters and RestoreRegisters, that are intended handle the saving and restoring of the modified registers (although we will see in part b that this may not be the best idea!).

    ; Main Program
    ;
          .ORIG x3000
          .....
          .....
          JSR F
          .....
          .....
          HALT
    
    
    ; R3 and R4 are input.
    ; Modifies R0, R3, R4, R5, and R6
    ; R0 is the output
    ;
    F    JSR SaveRegisters
          ....
          ....
          ....
         JSR RestoreRegisters
         RET
         .END
    

    Part a) Write the two subroutines SaveRegisters and RestoreRegisters.

    Part b) When we run the code we notice there is an infinite loop. Why? What small change can we make to our program to correct this error. Please specify both the correction and the subroutine that is being corrected.

  18. Suppose we want to make a 10 item queue starting from location x4000. In class, we discussed using a HEAD and a TAIL pointer to keep track of the beginning and end of the queue. In fact, we suggested that the HEAD pointer could point to the first element that we would remove from the queue and the TAIL pointer could point the last element that we have added the queue. It turns out that our suggestion does not work.

    Part a) What is wrong with our suggestion? (Hint: how do we check if the queue is full? How do we check if it is empty?)

    Part b) What simple change could be made to our queue to resolve this problem?

    Part c) Using your correction, write a few instructions that check if the queue is full. Use R3 for the HEAD pointer and R4 for the TAIL pointer.

    Part d) Using your correction, write a few instructions that check if the queue is empty. Again, using R3 for the HEAD pointer and R4 for the TAIL pointer.

  19. The following nonsense program is assembled and executed. (Problem added 11/07/17)
            .ORIG x4000
            LD    R2,BOBO
            LD    R3,SAM
    AGAIN   ADD   R3,R3,R2
            ADD   R2,R2,#-1
            BRnzp SAM
    BOBO    .STRINGZ "Why are you asking me this?"
    SAM     BRnp  AGAIN
            TRAP  x25
            .BLKW 5
    JOE     .FILL x7777
            .END
    

    How many times is the loop executed? When the program halts, what is the value in R3? (If you do not want to the arithmetic, it is okay to answer this with a mathematical expression.)