EE 306, Fall 2009
Problem Set 6 Solutions
Due:
Not to be turned in
Yale N. Patt, Instructor
TAs: Faruk Guvenilir, Milad Hashemi, Jennifer Davis, Garret Galow, Ben Lin, Taylor Morrow, Stephen Pruett, Jee Ho Ryoo
Note: This problem set is unusually long, and
is not to be turned in. We have put it together and handed it out to
give you some challenging examples to help you prepare for the final
exam.
Assume that you have the following table in your program:
MASKS .FILL x0001
.FILL x0002
.FILL x0004
.FILL x0008
.FILL x0010
.FILL x0020
.FILL x0040
.FILL x0080
.FILL x0100
.FILL x0200
.FILL x0400
.FILL x0800
.FILL x1000
.FILL x2000
.FILL x4000
.FILL x8000
Write a subroutine CLEAR
in LC-3 assembly language that clears a bit in R0
using the table above. The index of the bit to clear is specified in R1
. R0
and R1
are inputs to the subroutine.
CLEAR ST R2, TEMP
LEA R2, MASKS
ADD R2,R1,R2
LDR R2,R2,#0
NOT R2,R2
AND R0,R2,R0
LD R2, TEMP
RET
TEMP .BLKW #1
Write a similar subroutine SET
that sets the specified bit instead of clearing it.
SET ST R2, TEMP
LEA R2, MASKS
ADD R2,R1,R2
LDR R2,R2,#0
NOT R2,R2
NOT R0,R0
AND R0,R2,R0
NOT R0,R0
LD R2, TEMP
RET
TEMP .BLKW #1
Suppose we are writing an algorithm to multiply the elements of an array (unpacked, 16-bit 2's complement numbers), and we are told that a subroutine "mult_all" exists which multiplies four values, and returns the product. The mult_all subroutine assumes the source operands are in R1, R2, R3, R4, and returns the product in R0. For purposes of this assignment, let us assume that the individual values are small enough that the result will always fit in a 16-bit 2's complement register.
Your job: Using this subroutine, write a program to multiply the set of values contained in consecutive locations starting at location x6001. The number of such values is contained in x6000. Store your result at location x7000.
Hint: Feel free to include in your program
PTR .FILL x6001
CNT .FILL x6000
.ORIG x3000
LD R5, PTR
LDI R6, CNT
BRz DONEz ;checks if more numbers to multiply(CNT=0)
MORE LDR R1,R5,#0
ADD R5,R5,#1
ADD R6,R6,#-1
BRz DONE1 ;continues if more numbers to multiply
LDR R2,R5,#0 ;
ADD R5,R5,#1
ADD R6,R6,#-1
BRz DONE2 ;continues if more numbers to multiply
LDR R3,R5,#0
ADD R5,R5,#1
ADD R6,R6,#-1
BRz DONE3 ;continues if more numbers to multiply
LDR R4,R5,#0
ADD R5,R5,#1
ADD R6,R6,#-1
BRnzp READY ;CNT is multiple of 4
DONEz AND R0,R0,#0
ADD R0,R0,#1
BRnzp END
;(CNT = 4x+1) multiplies R1 by three 1's
DONE1 AND R2,R2,#0
ADD R2,R2,#1 ;R2 = 1
ADD R3,R2,#0 ;R3 = 1
ADD R4,R2,#0 ;R4 = 1
BRnzp READY
;(CNT = 4x+2) multiplies R1,R2 by two 1's
DONE2 AND R3,R3,#0
ADD R3,R3,#1 ;R3 = 1
ADD R4,R4,#0 ;R4 = 1
BRnzp READY
;
;(CNT = 4x+3) multiplies R1,R2,R3 by 1
DONE3 AND R4,R4,#0
ADD R4,R4,#1
READY JSR mult_all
ADD R6,R6,#0
BRz END ;checks CNT
;
;if CNT is not zero takes R0 from subroutine and puts back into
memory to multiply more numbers
ADD R5,R5,#-1
STR R0,R5,#0
;add one back to CNT because R0 is back into memory
ADD R6,R6,#1
BRnzp MORE
;
;store result of multiplication in memory location RESULT
END ST R0,RESULT
HALT
RESULT .BLKW 1
mult_all ... ;multiples R1,R2,R3,R4 and stores result in R0
...
...
RET
PTR .FILL x6001
CNT .FILL x6000
.END
.ORIG x3000
JSR A
OUT ;TRAP x21
BRnzp DONE
A AND R0,R0,#0
ADD R0,R0,#5
JSR B
RET
DONE HALT
ASCII .FILL x0030
B LD R1,ASCII
ADD R0,R0,R1
RET
.END
Need to save R7 so 1st service routine can return. Second RET overwrites the first RET value.PUSH V PUSH W PUSH X PUSH Y MUL ADD PUSH Z SUB DIV POP U
PUSH A
PUSH B
PUSH C
SUB
PUSH D
ADD
MUL
PUSH A
PUSH C
ADD
DIV
POP E
Address Contents
x4000 x4016
x4001 x4003
x4002 x4008
x4003 x004D
x4004 x0061
x4005 x0072
x4006 x0063
x4007 x0000
x4008 x0039
x4009 x0030
x400A x0000
x400B x0000
x400C x4019
x400D x401E
x400E x004A
x400F x0061
x4010 x0063
x4011 x006B
x4012 x0000
x4013 x0031
x4014 x0038
x4015 x0000
x4016 x400B
x4017 x400E
X4018 x4013
x4019 x004D
x401A x0069
x401B x006B
x401C x0065
x401D x0000
x401E x0037
x401F x0036
x4020 x0000
Solution:(Adapted from 10.9) The input stream of a stack is a list of
all the elements we pushed onto the stack, in the order that we
pushed them. The input stream from Excercise 10.8 from page 284 of
the book for example is ABCDEFGHIJKLM
The output stream is a list
of all the elements that are popped off the stack in the order that
they are popped off.
a. If the input stream is ZYXWVUTSR, create
a sequence of pushes and pops such that the output stream is
YXVUWZSRT.
Push Z
Push Y
Pop Y
Push
X
Pop X
Push W
Push V
Pop V
Push U
Pop U
Pop
W
Pop Z
Push T
Push S
Pop S
Push R
Pop R
Pop
T
b. If the input stream is ZYXW, how many different
output streams can be created.
14 different
output streams.
Do Problem 6.16 on page 175 in the textbook.
(Adapted from 10.6) Rewrite the PUSH and POP routines such
that the stack on which they operate holds elements that take up two
memory locations each. Assume we are writing a program to simulate a stack machine that manipulates 32-bit integers with the LC-3. We would need PUSH and POP routines that operate with a stack that holds elements which take up two memory locations each. Rewrite the PUSH and POP routines for this to be possible.
The problem assumes
that each element of the value being pushed on the stack is
32-bits.
For the PUSH, assume bits [15:0] of that value to be
pushed are in R0 and bits [31:16] are in R1.
For the POP, bits
[15:0] will be popped into R0 and bits [31:16] will be popped into
R1.
Also assume the lower order bits of the number being pused or
popped are located in the smaller address in memroy. For example if
the two memory locations to be used to store the number are x2FFF
and x2FFE, bits [15:0] will be stored in x2FFE and [31:16] will be
stored in x2FFF.
PUSH:
ADD R6, R6, #-2
STR R0, R6,
#0
STR R1, R6, #1
POP:
LDR R0, R6, #0
LDR R1, R6,
#1
ADD R6, R6, #2
Consider the following LC-3 assembly language program. Assumming that the memory locations DATA get filled before the program executes, what is the relationship between the final values at DATA and the initial values at DATA?
.ORIG x3000 LEA R0, DATA AND R1, R1, #0 ADD R1, R1, #9 LOOP1 ADD R2, R0, #0 ADD R3, R1, #0 LOOP2 JSR SUB1 ADD R4, R4, #0 BRzp LABEL JSR SUB2 LABEL ADD R2, R2, #1 ADD R3, R3, #-1 BRp LOOP2 ADD R1, R1, #-1 BRp LOOP1 HALT DATA .BLKW #10 SUB1 LDR R5, R2, #0 NOT R5, R5 ADD R5, R5, #1 LDR R6, R2, #1 ADD R4, R5, R6 RET SUB2 LDR R4, R2, #0 LDR R5, R2, #1 STR R4, R2, #1 STR R5, R2, #0 RET .END
The final values at DATA will be sorted in ascending order.
During the initiation of the interrupt service routine, the
N, Z, and P condition codes are saved on the stack. By means of a
simple example show how incorrect results would be generated if the
condition codes were not saved. Also, clearly describe the steps
required for properly handling an interrupt.
Lets
take the following program which adds 10 numbers starting at memory
location x4000 and stores the result at x5000.
.ORIG x3000 LD R1, PTR AND R0, R0, #0 LD R2, COUNT LOOP LDR R3, R1, #0 ADD R0, R0, R3 ADD R1, R1, #1 ADD R2, R2, #-1 BRp LOOP STI R0, RESULT HALT PTR .FILL x4000 RESULT .FILL x5000 COUNT .FILL #10
If the condition codes were not saved as part
of initiation of the interrupt service routine, we could end up with
incorrect results. In this program, take the case when an interrupt
occurred during the processing of the instruction at location x3006
and the condition codes were not saved. Let R2 = 5 and hence the
condition codes would be N=0, Z=0, P=1, before servicing the
interrupt. When control is returned to the instruction at location
x3007, the BRp instruction, the condition codes depend on the
processing within the interrupt service routine. If they are N=0,
Z=1, P=0, then the BRp is not taken. This means that the result
stored is just the sum of the first five values and not all
ten.
Steps for handling interrupts:
Saving the State of the machine
Loading the state of the interrupt
Service the Interrupt
RTI
Note: In-depth explanation of interrupt
handling on pages 259-261 of the texbook.
The program below counts the number of zeros in a 16-bit word.
.ORIG x3000 AND R0, R0, #0 LD R1, SIXTEEN LD R2, WORD A BRn B ADD R0, R0, #1 B ADD R1, R1, #-1 BRz C ADD R2, R2, R2 BR A C ST R0, RESULT HALT SIXTEEN .FILL #16 WORD .BLKW #1 RESULT .BLKW #1 .END
Fill in the missing blanks so that the subroutine below
implements a stack multiply. That is it pops the top two elements
off the stack, multiplies them, and pushes the result back on the
stack. You can assume that the two numbers will be non-negative
integers (greater than or equal to zero) and that their product will
not produce an overflow. Also assume that the stack has been
properly initialized, the PUSH and POP subroutines have been written
for you and work just as described in class, and that the stack will
not overflow or underflow.
Note: All blanks must be filled
for the program to operate correctly.
MUL ST R7, SAVER7 ST R0, SAVER0 ST R1, SAVER1 ST R2, SAVER2 ST R5, SAVER5 AND R2, R2, #0 JSR POP ADD R1, R0, #0 JSR POP ADD R1, R1, #0 BRz DONE AGAIN ADD R2, R2, R0 ADD R1, R1, #-1 BRp AGAIN DONE ADD R0, R2, #0 JSR PUSH LD R7, SAVER7 LD R0, SAVER0 LD R1, SAVER1 LD R2, SAVER2 LD R5, SAVER5 RET
The program below calculates the closest integer greater than
or equal to the square root of the number stored in NUM, and prints
it to the screen. That is, if the number stored in NUM is 25, "5"
will be printed to the screen. If the number stored in NUM is 26,
"6" will be printed to the screen. Fill in the blanks
below to make the program work.
Note: Assume that the value
stored at NUM will be between 0 an 81.
.ORIG x3000 AND R2, R2, #0 LD R3, NUM BRz OUTPUT NOT R3, R3 ADD R3, R3, #1 OUTLOOP ADD R2, R2, #1 ADD R0, R2, #0 AND R1, R1, #0 INLOOP ADD R1, R1, R2 ADD R0, R0, #-1 BRp INLOOP ADD R1, R1, R3 BRn OUTLOOP OUTPUT LD R0, ZERO ADD R0, R0, R2 TRAP x21 HALT NUM .BLKW 1 ZERO .FILL x30 .END
The figure below shows the part of the LC-3 data path that
deals with memory and I/O. Note the signals labeled A through F. A
is the memory enable signal, if it is 1 memory is enabled, if it is
0, memory is disabled. B, C, and D are the load enable signals for
the Device Registers. If the load enable signal is 1, the register
is loaded with a value, otherwise it is not. E is the 16-bit output
of INMUX, and F is the 2-bit select line for INMUX.
The
initial values of some of the processor registers and the I/O
registers, and some memory locations are as follows:
R0 = x0000 |
KBSR = x8000 |
M[x3009] = xFE00 |
During the entire instruction cycle, memory is accessed
between one and three times (why?). The following table lists two
consecutive instructions to be executed on the LC-3. Complete the
table with the values that each signal or register takes right after
each of the memory accesses performed by the instruction. If an
instruction does not require three memory accesses, draw a line
accross the unused accesses. To help you get started, we have filled
some of the values for you.
PC |
Instruction |
Access |
MAR |
A |
B |
C |
D |
E[15:0] |
F[1] |
F[0] |
MDR |
x3000 |
LD R0, x9 |
1 |
x3000 |
1 |
0 |
0 |
0 |
x2009 |
1 |
1 |
x2009 |
2 |
x300A |
1 |
0 |
0 |
0 |
xFE02 |
1 |
1 |
xFE02 |
||
3 |
--------- |
--- |
--- |
--- |
--- |
--------- |
------ |
------ |
--------- |
||
x3001 |
LDR R0, R0, #0 |
1 |
x3001 |
1 |
0 |
0 |
0 |
x6000 |
1 |
1 |
x6000 |
2 |
xFE02 |
0 |
0 |
0 |
0 |
x0061 |
0 |
0 |
x0061 |
||
3 |
--------- |
--- |
--- |
--- |
--- |
--------- |
------ |
------ |
--------- |
Note: This problem is NOT easy. In fact, it took me a while
to solve it, and I am supposed to be an expert on 306 material. So,
if you are struggling to pass this course, I suggest you ignore it.
On the other hand, if you are a hot shot and think no problem is
beyond you, then by all means go for it. We put it on the problem
set to keep some of the hot shots out of mischief. We would not put
it on the final, because we think it is too difficult to put on the
exam.
A programmer wrote this program to do something useful.
He, however, forgot to comment his code, and now can't remember what
the program is supposed to do. Your job is to save him the trouble
and figure it out for him. In 20 words or fewer tell us what
valuable information the program below provides about the value
stored in memory location INPUT. Assume that there is a non-zero
value at location INPUT before the program is executed.
HINT:
When testing different values of INPUT pay attention to their bit
patterns. How does the bit pattern correspond to the RESULT?
.ORIG x3000 LD R0, INPUT AND R3, R3, #0 LEA R6, MASKS LD R1, COUNT LOOP LDR R2, R6, #0 ADD R3, R3, R3 AND R5, R0, R2 BRz SKIP ADD R3, R3, #1 ADD R0, R5, #0 SKIP ADD R6, R6, #1 ADD R1, R1, #-1 BRp LOOP ST R3, RESULT HALT COUNT .FILL #4 MASKS .FILL 0xFF00 .FILL 0xF0F0 .FILL 0xCCCC .FILL 0xAAAA INPUT .BLKW 1 RESULT .BLKW 1 .END
This program identifies the most significant bit position that is set in the value stored at INPUT and stores that bit position in RESULT. For example, if INPUT contained the value 0010 0100 0101 0110, RESULT would contain the value 13 since bit 13 is the most significant bit postition that is a 1.
Figure out what the following program does.
.ORIG X3000 LEA R2, C LDR R1, R2, #0 LDI R6, C LDR R5, R1, #-3 ST R5, C LDR R5, R1, #-4 LDR R0, R2, #1 JSRR R5 AND R3, R3, #0 ADD R3, R3, #7 LEA R4, B A STR R4, R1, #0 ADD R4, R4, #2 ADD R1, R1, #1 ADD R3, R3, #-1 BRP A HALT B ADD R2, R2, #1 LDR R0, R2, #0 JSRR R5 TRAP X29 ADD R2, R2, #15 ADD R0, R2, #3 LD R5, C TRAP X2B ADD R2, R2, #5 LDR R0, R2, #0 JSRR R5 TRAP X27 JSRR R5 JSRR R6 C .FILL X25 .STRINGZ "EE306 and tests are awesome" .END
The short answer is that the program outputs “EE some” this is because we over write the trap vector table. Below is a commented version of the program to help you see what is going on.
.ORIG X3000 LEA R2, C LDR R1, R2, #0 ; load x25 into R1 LDI R6, C ; loads the starting address of the HALT trap service routine into R6 LDR R5, R1, #-3 ; loads the starting address of (x25 – 3) trap x22 (puts) into R5 ST R5, C ; stores the starting address or puts into C LDR R5, R1, #-4 ; loads the starting address of (x25 – 4) trap x21 (out) into r5 LDR R0, R2, #1 ; loads R0 with the first charater of the stringz “E” JSRR R5 ; does the out routine (outputs “E” to the display) AND R3, R3, #0 ; clears r3 ADD R3, R3, #7 ; makes r3 7 LEA R4, B ; loads the address of B into r4 ;NOTE Loop A overwrites the trap vector table, x25 to x2b ; This makes trap x25 – trap x2b point to this program, see label B and below A STR R4, R1, #0 ; overwrites the trap vector with the address in R4 ADD R4, R4, #2 ADD R1, R1, #1 ADD R3, R3, #-1 BRP A HALT ; What does this do? Trap x25, what is now at memory location x25? ;In the following section <- trap xY indicates what address is in memory location Y B ADD R2, R2, #1 ; <- trap x25 (makes R2 point to the first character in the stringz “E”) LDR R0, R2, #0 ; (loads r0 with the ascci code for “E“) JSRR R5 ; <- trap x26 (what is in r5? The starting address of out,outputs “E” on the screen) TRAP X29 ADD R2, R2, #15; <- trap x27 (makes r2 points to the (6 + 15) 21th character of the .stringz ADD R0, R2, #3 ; (makes to point to the (21+3) 24th character of the stringz the s in awesome) LD R5, C ; <- trap x28 (LD R5, C loads r5 with the starting address of puts) TRAP X2B ADD R2, R2, #5 ; <- trap x29 (“makes R2 point to the 6th character in the .stringz “ “) LDR R0, R2, #0 ; (loads r0 with the ascci code for “ “) JSRR R5 ; <- trap x2a (outputs a space on the screen) TRAP X27 JSRR R5 ; <- trap x2b (jsrr to puts outputs “some” to the screen) JSRR R6 ; remember r6 contains the starting address of trap x25 (halt) so this halts C .FILL X25 .STRINGZ "EE306 and tests are awesome" .END