Developing Software in Assembly Language
Local Variables
By Jonathan W. Valvano
This article, which discusses assembly language programming,
accompanies the book Embedded Microcomputer Systems: Real Time Interfacing published by Brooks-Cole 1999. This document has four overall
parts
Overview
Syntax (fields, pseudo ops)
Local variables (this document)
Examples
Local variables
Introduction
Memory Allocation of Local and Global Variables
Implementation of Local Variables on the Stack
Implementation of Local Variables using a Stack Frame
C Compiler Implementation of Local and Global Variables
Stack animation of a C function
Introduction to Local Variables
Because their contents are allowed to change, all variables
must be allocated in RAM and not ROM. A local variable is temporary information used only by one software module. Local
variables are typically allocated, used, then deallocated. The
information stored in a local variable is not permanent. This
means if we store a value into a local variable during one execution
of the module, the next time that module is executed the previous
value is not available. Examples include loop counters and temporary
sums. We use a local variable to store data that is temporary
in nature. We can implement a local variable using the stack or
registers. Reasons why we place local variables on the stack include:
dynamic allocation/release allows for reuse of memory
limited scope of access provides for data protection
only the program that created the local variable can access it
since an interrupt will save registers and create its own stack frame, the code can be made reentrant
since absolute addressing is not used, the code is relocatable
the number of variables is limited only by the size of the stack allocation, which can be much larger than the number of registers available for local variables
A global variable is information shared by more than one program module. E.g., we use globals to pass data between the main (or foreground) process and an interrupt (or background) process. Global variables are not deallocated. The information they store is permanent. Examples include time of day, date, user name, temperature, pointers to shared data. On the 6811, we use absolute addressing (direct or extended) to access their information.
Observation: Sometimes we store temporary information in global variables
out of laziness. This practice is to be discouraged because it
wastes memory and may cause the module to not be reentrant.
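To make the reentrancy hazard concrete, here is a minimal C sketch. The names sum_local, sum_global and scratch are hypothetical, and recursion is used only as a stand-in for an interrupt that re-enters the module before the first activation has finished.

```c
static int scratch;   /* hypothetical global misused as a temporary */

/* Temporary kept in a local: every activation gets a private copy. */
int sum_local(int n) {
    int temp;                      /* allocated on entry, released on return */
    if (n == 0) return 0;
    temp = n;
    return sum_local(n - 1) + temp;
}

/* Temporary kept in a global: the nested activation overwrites it. */
int sum_global(int n) {
    int rest;
    if (n == 0) return 0;
    scratch = n;
    rest = sum_global(n - 1);      /* changes scratch before we read it back */
    return rest + scratch;
}
```

With n = 3 the local version returns 6, while the global version returns 3, because every nested call leaves scratch equal to 1.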
--------------------------------------------------------------------------------------
Memory Allocation of Local and Global Variables
For more information on the allocation into specific type of memory
within the embedded microcomputer see the section on Memory Allocation
in Chapter 2 of Embedded Microcomputer Systems: Real Time Interfacing by Jonathan W. Valvano. On the Intel-based computers, global
data, stack (hence local variables), and programs are divided
into three separate memory segments. This division of code stack
and data areas is an example of segmentation. Special segment
registers point to each segment. Segmentation provides both speed
and protection. A stack overflow can not destroy global data.
In particular:
Segment register DS points to the global data segment
Segment register SS points to the stack segment
Segment register CS points to the code segment
Segmentation is such a good idea that Apple implements it on the
Macintosh, even though the 68000 provides no hardware support
for segmentation:
A5 relative addressing is used to access global data
SP points to the stack and local variables
PC points to the current instruction.
Because the embedded system does not load programs off disk when started, segmentation is an extremely important issue for these systems as well. Typical software segments include global variables, local variables, fixed constants, and machine instructions. For single chip implementations, recall that there are three types of memory:
Memory | volatile? | Ability to Read/Write |
RAM | volatile | random and fast access |
EEPROM | nonvolatile | easily erased and reprogrammed |
ROM | nonvolatile | programmed once |
In an embedded application, we usually put structures that must be changed during execution in RAM. Examples include recorded data, parameters passed to subroutines, global and local variables. We place fixed constants in EEPROM because the information remains when the power is removed, but can be reprogrammed at a later time. Examples of fixed constants include translation tables, security codes, calibration data, and configuration parameters. We place machine instructions, interrupt vectors (address to go to when an interrupt occurs) and the reset vector (starting address) in ROM because this information is stored once and can not be reprogrammed. Because the machine instructions and vectors are in ROM, the software will begin execution when power is applied.
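As a rough illustration of how these categories look in a C program, consider the sketch below. The names CalTable, SampleCount and Convert are hypothetical, and where a particular compiler and linker actually place const data is toolchain dependent, so the placement comments are assumptions.

```c
#include <stdint.h>

/* Fixed constants: const data is typically placed in nonvolatile memory
   (assumption: the actual placement depends on the toolchain). */
static const uint8_t CalTable[4] = { 0, 10, 25, 45 };   /* hypothetical calibration data */

/* Global variable: permanent, must live in RAM because it changes. */
static uint16_t SampleCount;

/* Machine instructions: the function body itself goes in nonvolatile memory. */
uint8_t Convert(uint8_t raw) {
    uint8_t index = (uint8_t)(raw & 0x03);   /* local variable: stack or register */
    SampleCount++;
    return CalTable[index];
}
```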
--------------------------------------------------------------------------------------
Implementation of Local Variables on the Stack
Stack implementation of local variables has four stages: binding,
allocation, access, and deallocation.
1. Binding is the assignment of the address (not value) to a symbolic name. This address will be the actual memory location to store the local variable. The assembler binds the symbolic name to a stack index. The computer calculates the actual location during execution. For example:
; MC68HC708XL36
I set 1 8 bit number
PT set 2 16 bit address
Result set 4 8 bit number
; MC68HC11A8 or MC68HC812A4
I set 0 8 bit number
PT set 1 16 bit address
Result set 3 8 bit number
2. Allocation is the generation of memory storage for the local variable. The computer allocates space during execution by decrementing the SP. In this first example, the software allocates the local variable by pushing a register on the stack.
; MC68HC708XL36
psha allocate Result
pshx allocate PT (lsbyte)
pshh (msbyte)
psha allocate I
; MC68HC11A8 or MC68HC812A4
psha allocate Result
pshx allocate PT
psha allocate I
In this next example, the software allocates the local variable by decrementing the stack pointer.
; MC68HC11A8 or MC68HC812A4
des allocate Result
des allocate PT
des
des allocate I
In this last example, the technique provides a mechanism for allocating large amounts of stack space.
; MC68HC708XL36
ais #-4 alloc Result, PT, I
; MC68HC11A8
tsx allocate I,PT,Result
xgdx
subd #4
xgdx
txs
; MC68HC812A4
leas -4,sp alloc Result, PT, I
3. The access to a local variable is a read or write during execution. In the
next program, the local variable I is cleared, the local variable PT is copied into a register, and the local variable Result is written.
; MC68HC708XL36
clr I,sp Clear I
ldx PT,sp msbyte
pshx
pulh H is msbyte of PT
ldx PT+1,sp lsbyte
sta Result,sp store Result
; MC68HC11A8
tsx Reg X points to locals
clr I,x Clear I
ldy PT,x Reg Y is a copy of PT
staa Result,x store into Result
; MC68HC812A4
clr I,sp Clear I
ldy PT,sp Reg Y is a copy of PT
staa Result,sp store into Result
4. Deallocation is the release of memory storage for the local variable. The computer deallocates space during execution by incrementing SP. In this first example, the software deallocates the local variable by pulling a register from the stack.
; MC68HC708XL36
pula deallocate Result
pulx deallocate PT (lsbyte)
pulh (msbyte)
pula deallocate I
; MC68HC11A8 or MC68HC812A4
pula deallocate Result
pulx deallocate PT
pula deallocate I
Observation: When the software uses the "push-register" technique to allocate
and the "pull-register" technique to deallocate, it looks like
it is saving and restoring the register. Because most applications
of local variables involve storing into the local, the value pulled
will NOT match the value pushed.
In this next example, the software deallocates the local variable
by incrementing the stack pointer.
; MC68HC11A8 or MC68HC812A4
ins deallocate Result
ins deallocate PT
ins
ins deallocate I
In this last example, the technique provides a mechanism for deallocating large amounts of stack space.
; MC68HC708XL36
ais #4 dealloc Result, PT, I
; MC68HC11A8
tsx deallocate I,PT,Result
ldab #4
abx
txs
; MC68HC812A4
leas 4,sp deallocate Result,PT,I
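The four stages can also be modeled in C. The sketch below is a simulation, not code for any of these processors: Ram, SP, FourStages and StackPointer are hypothetical names, the simulated SP points at the top byte as on the 6812, and PT is written before it is read only so that there is a value to check.

```c
#include <stdint.h>

/* Binding: offsets from SP, matching the 6811/6812 listing above. */
enum { I = 0, PT = 1, RESULT = 3 };

static uint8_t Ram[32];          /* hypothetical stack area */
static uint16_t SP = 32;         /* simulated SP: index of the top byte, stack grows downward */

/* Returns PT read back from the stack; *result receives the Result byte. */
uint16_t FourStages(uint16_t pt, uint8_t a, uint8_t *result) {
    uint16_t copy;
    SP -= 4;                                   /* allocation, like leas -4,sp */
    Ram[SP + I] = 0;                           /* access: clr I,sp */
    Ram[SP + PT] = (uint8_t)(pt >> 8);         /* 16-bit PT, most significant byte first */
    Ram[SP + PT + 1] = (uint8_t)(pt & 0xFF);
    Ram[SP + RESULT] = a;                      /* access: staa Result,sp */
    copy = (uint16_t)((Ram[SP + PT] << 8) | Ram[SP + PT + 1]);   /* ldy PT,sp */
    *result = Ram[SP + RESULT];
    SP += 4;                                   /* deallocation, like leas 4,sp */
    return copy;
}

/* Lets a caller confirm that SP is back where it started. */
uint16_t StackPointer(void) { return SP; }
```

After the call, StackPointer() returns the same value as before the call, which is the property that allows the memory to be reused by the next subroutine.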
Implementation of Local Variables using a Stack Frame
The 6812 provides a negative offset index addressing mode. With
this addressing mode it is possible to establish a stack frame
pointer using either register X or Y. It is important in this
implementation that once the stack frame pointer is established
(e.g., the tsx instruction), that the stack frame register (X) not be modified. Because the stack frame pointer should not
be modified, every subroutine will save the old stack frame pointer
of the function that called the subroutine (e.g., pshx at the top) and restore it before returning (e.g., pulx at the bottom.) The tsx instruction will create the stack frame, the leas -4,sp will allocate 4 bytes of local storage and the txs will deallocate the local variables. Local variable access uses
indexed addressing mode with a negative offset. This example will
be extended to include parameters later in the chapter. In particular,
the ICC12 C compiler uses this method to access local variables
and stack. Notice that the subroutine deallocates its locals by moving the stack
frame pointer back into SP with the txs instruction.
; MC68HC812A4
; *****binding phase***************
I set -4
PT set -3
Ans set -1
; *******allocation phase *********
function pshx save old Reg X
tsx create stack frame pointer
leas -4,sp allocate four bytes for I,PT,Ans
; ********access phase ************
clr I,x Clear I
ldy PT,x Reg Y is a copy of PT
staa Ans,x store into Ans
; ********deallocation phase *****
txs deallocation
pulx restore old X
rts
Observation: One advantage of the stack frame method is the ability to push
and pull data from the stack within the function without changing
Reg X. In this way, the binding of the local variables remains
fixed.
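The observation can be checked with a small C simulation. This is a sketch under stated assumptions: Stack, Sp, Push, Pull, FrameDemo and StackDepth are hypothetical names, the variable x plays the role of Reg X, and the saving and restoring of the caller's Reg X (pshx and pulx) is left out for brevity.

```c
#include <stdint.h>

/* Binding: negative offsets from the frame pointer, as in the listing above. */
enum { I = -4, PT = -3, ANS = -1 };

static uint8_t Stack[32];        /* hypothetical stack area */
static int Sp = 32;              /* simulated SP: index of the top byte */

static void Push(uint8_t value) { Stack[--Sp] = value; }
static uint8_t Pull(void)       { return Stack[Sp++]; }

/* Stores a into Ans, pushes and pulls a scratch byte, then returns Ans. */
uint8_t FrameDemo(uint8_t a) {
    int x;                       /* plays the role of Reg X, the frame pointer */
    uint8_t ans;
    x = Sp;                      /* tsx: create stack frame pointer */
    Sp -= 4;                     /* leas -4,sp: allocate I, PT, Ans */
    Stack[x + I] = 0;            /* clr I,x */
    Stack[x + ANS] = a;          /* staa Ans,x */
    Push(0xEE);                  /* SP moves, the frame pointer does not */
    ans = Stack[x + ANS];        /* the same offset still reaches Ans */
    (void)Pull();
    Sp = x;                      /* txs: deallocate */
    return ans;
}

/* Number of bytes currently on the simulated stack. */
int StackDepth(void) { return 32 - Sp; }
```

Had the locals been bound relative to SP instead, the offset of Ans would have changed by one while the scratch byte was on the stack.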
--------------------------------------------------------------------------------------
Application of Local Variables
Although on most systems local variables are implemented on the
stack, we have adopted a more general definition of local variables
in this book. Recall, that a local variable is temporary storage
accessed by only one module. In order to compare and contrast
various implementations of local variables, consider a few subroutines
that implement 8 bit signed multiply. On the 6805 and 6808 we
will multiply register A and X and return the 16 bit result in
the register combination X:A. On the 6811 and 6812 we will multiply
register A times B and return the 16 bit result in register D.
In each case there is a local variable, SIGN, which when true
(nonzero), means the result will be negative. In the first example,
the local variable is allocated in global memory and accessed
using direct or extended addressing. This technique is simple
and easy to debug, but is not relocatable or reentrant. On the
other hand, it is the only available method for the 6805. Notice
that the allocation occurs at assembly time, and is fixed. In
other words, it is never deallocated, and its memory is never reused.
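Before reading the listings, it may help to see the algorithm they share written in C. This is a sketch with a hypothetical function name, Smul8, and it uses an ordinary C negation where the assembly performs a 1's complement followed by adding one.

```c
#include <stdint.h>

/* 8-bit signed multiply built from an unsigned multiply, mirroring SMUL. */
int16_t Smul8(int8_t a, int8_t b) {
    uint8_t sign = 0;                 /* local SIGN: nonzero means negate the product */
    uint8_t ua = (uint8_t)a;
    uint8_t ub = (uint8_t)b;
    uint16_t product;
    if (a < 0) {
        sign = (uint8_t)~sign;        /* com SIGN: flip sign */
        ua = (uint8_t)(0u - ua);      /* nega: now 0 <= ua <= 128 */
    }
    if (b < 0) {
        sign = (uint8_t)~sign;        /* com SIGN: flip sign again */
        ub = (uint8_t)(0u - ub);      /* negate the second operand */
    }
    product = (uint16_t)(ua * ub);    /* mul: unsigned 8 x 8 -> 16 */
    if (sign) {
        return (int16_t)(-(int32_t)product);   /* assembly: coma, comb, addd #1 */
    }
    return (int16_t)product;
}
```

Two negative inputs flip SIGN twice, so it ends at zero and the unsigned product is returned unchanged.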
The following is a 6805 implementation of local variables using
direct addressing.
; MC68HC705J1A
org $0000 RAM
SIGN rmb 1 Allocation
org $0300 EPROM
SMUL clr SIGN direct addr
tsta Is A negative?
bpl APOS Skip if positive
com SIGN Flip sign
nega 0 <= A <= 128
APOS tstx Is X negative?
bpl XPOS Skip if positive
com SIGN Flip sign
negx 0 <= X <= 128
XPOS mul Unsigned A•X
tst SIGN negate result?
beq POS Skip if positive
coma 1's complement
comx
add #1 plus one
bcc POS
incx carry
POS rts Never dealloc
The following is a 6808 implementation of local variables using
direct addressing.
; MC68HC708XL36
org $0050 RAM
SIGN rmb 1 Allocation
org $6E00 EEPROM
SMUL clr SIGN direct addr
tsta Is A negative?
bpl APOS Skip if positive
com SIGN Flip sign
nega 0 <= A <= 128
APOS tstx Is X negative?
bpl XPOS Skip if positive
com SIGN Flip sign
negx 0 <= X <= 128
XPOS mul Unsigned A•X
tst SIGN negate result?
beq POS Skip if positive
coma 1's complement
comx
add #1 plus one
bcc POS
incx carry
POS rts Never dealloc
The following is a 6811 implementation of local variables using
direct addressing.
; MC68HC11A8
org 0 RAM
SIGN rmb 1 Allocation of local
org $E000 ROM
SMUL clr SIGN direct addressing
tsta Is A negative?
bpl APOS Skip if A positive
com SIGN Flip sign
nega 0 <= A <= 128
APOS tstb Is B negative?
bpl BPOS Skip if B positive
com SIGN Flip sign
negb 0 <= B <= 128
BPOS mul Unsigned A•B
tst SIGN Need to negate D?
beq DPOS Skip if positive
coma 1's complement
comb
addd #1 plus one
DPOS rts Never deallocated
The following is a 6812 implementation of local variables using
extended addressing.
; MC68HC812A4
org $0800 RAM
SIGN rmb 1 Allocation of local
org $F000 EEPROM
SMUL clr SIGN extended addressing
tsta Is A negative?
bpl APOS Skip if A positive
com SIGN Flip sign
nega 0 <= A <= 128
APOS tstb Is B negative?
bpl BPOS Skip if B positive
com SIGN Flip sign
negb 0 <= B <= 128
BPOS mul Unsigned A•B
tst SIGN Need to negate D?
beq DPOS Skip if positive
coma 1's complement
comb
addd #1 plus one
DPOS rts Never deallocated
In the second example, the local variable is implemented using a register. This technique is simple and very fast. It is appropriate for small amounts of data. Because no absolute memory addresses are required to access the registers, the subroutine will be relocatable. If the function saves and restores registers it will be reentrant. All of the Motorola 8 bit microcomputers have very few registers, so this technique will have limited application. There is no formal allocation or deallocation of the register, but the register can certainly be reused after the information is no longer needed. A 6805 implementation is not given because it doesn’t have enough registers.
6808 implementation of local variables using registers.
; MC68HC708XL36
SMUL clrh H=sign
tsta Is A negative?
bpl APOS Skip if positive
aix #$60 set H=1
aix #$60
aix #$40
nega 0 <= A <= 128
APOS tstx Is X negative?
bpl XPOS Skip if positive
aix #$80 H=H-1
aix #$80 Flip sign
negx 0 <= X <= 128
XPOS mul Unsigned A•X
psha
pshh
pula
tsta negate result?
pula
beq POS Skip if positive
coma 1's complement
comx
add #1 plus one
bcc POS
incx carry
POS rts Never dealloc
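The aix sequences in the 6808 listing rely on a little arithmetic: aix adds a sign-extended 8-bit immediate to the 16-bit register pair H:X, so adding a total of 256 changes only H. The C sketch below models that arithmetic; Aix, RegH, IncH and DecH are hypothetical names.

```c
#include <stdint.h>

/* Model of aix: add a signed 8-bit immediate to the 16-bit H:X. */
uint16_t Aix(uint16_t hx, int8_t imm) {
    return (uint16_t)(hx + imm);
}

/* High byte of H:X, that is, register H. */
uint8_t RegH(uint16_t hx) { return (uint8_t)(hx >> 8); }

/* aix #$60, aix #$60, aix #$40: adds 96+96+64 = 256, so H goes up by one. */
uint16_t IncH(uint16_t hx) {
    return Aix(Aix(Aix(hx, 0x60), 0x60), 0x40);
}

/* aix #$80 twice: $80 is -128 as a signed byte, so H goes down by one. */
uint16_t DecH(uint16_t hx) {
    return Aix(Aix(hx, -128), -128);
}
```

In both cases the low byte, register X, ends up unchanged, which is why the operand in X survives the sign bookkeeping.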
6811/6812 implementation of local variables using registers.
; MC68HC11A8 or MC68HC812A4
org $E000 ROM
SMUL ldy #0 Y=sign
tsta Is A negative?
bpl APOS Skip if A positive
iny Flip sign
nega 0 <= A <= 128
APOS tstb Is B negative?
bpl BPOS Skip if B positive
dey Flip sign
negb 0 <= B <= 128
BPOS mul Unsigned A•B
cpy #0 Need to negate D?
beq DPOS Skip if positive
coma 1's complement
comb
addd #1 plus one
DPOS rts Never deallocated
In the last example, the local variable is allocated on the stack and accessed using indexed addressing. Allocation occurs dynamically (i.e., at run time). The deallocation step allows the memory to be reused for other purposes. Because the 6805 does not have stack relative addressing, a 6805 implementation is not shown. Because a new "private" copy is allocated when the subroutine is entered, the subroutine is re-entrant. The indexed addressing mode does not depend on the PC, so the subroutine is also relocatable.
6808 implementation of local variables using index addressing
; MC68HC708XL36
SIGN equ 1 binding
SMUL ais #-1 allocation
clr SIGN,SP Stack access
tsta Is A negative?
bpl APOS Skip if positive
com SIGN,SP Flip sign
nega 0 <= A <= 128
APOS tstx Is X negative?
bpl XPOS Skip if positive
com SIGN,SP Flip sign
negx 0 <= X <= 128
XPOS mul Unsigned A*X
tst SIGN,SP negate?
beq POS Skip if OK
coma 1's complement
comx
add #1 plus one
bcc POS
incx carry
POS ais #1 deallocation
rts
6811/6812 implementation of local variables using index addressing
; MC68HC11A8 or MC68HC812A4
SIGN equ 0 binding
SMUL des allocation
tsx X -> SIGN
clr SIGN,X Stack access
tsta Is A negative?
bpl APOS Skip if positive
com SIGN,X Flip sign
nega 0 <= A <= 128
APOS tstb Is B negative?
bpl BPOS Skip if positive
com SIGN,X Flip sign
negb 0 <= B <= 128
BPOS mul Unsigned A*B
tst SIGN,X Need to negate?
beq DPOS Skip if OK
coma 1's complement
comb
addd #1 plus one
DPOS ins deallocation
rts
6812 implementation of local variables using SP index addressing
; MC68HC812A4
SIGN equ 0 binding
SMUL des allocation
clr SIGN,SP Stack access
tsta Is A negative?
bpl APOS Skip if positive
com SIGN,SP Flip sign
nega 0 <= A <= 128
APOS tstb Is B negative?
bpl BPOS Skip if positive
com SIGN,SP Flip sign
negb 0 <= B <= 128
BPOS mul Unsigned A•B
tst SIGN,SP Need to negate?
beq DPOS Skip if OK
coma 1's complement
comb
addd #1 plus one
DPOS ins deallocation
rts
In the above example, the line SIGN equ 0 is an assembly-time pseudo-instruction used only to make the source code more readable. Unfortunately with most assemblers, the label can only be defined once with an equ. Thus, we can not reuse local variable names. Some assemblers, like TExaS, provide the set pseudo-instruction, which works like equ, but allows you to redefine its value. Thus using set, you can reuse a local variable name.
--------------------------------------------------------------------------------------
C Compiler Implementation of Local and Global Variables
In order to understand both the machine architecture and the C
compiler, we can look at the assembly code generated. This first
example shows a simple C program with a global variable x, two local variables both called y, and a function parameter z.
int x; /* definition of a global variable */
main(){
int y; /* definition of a local variable */
x=5; /* access global variable */
y=6; /* access local variable */
x=sub(y); /* call function, pass parameter */
return(0);}
int sub(int z){ int y;
y=z+1;
return(y);}
The first compiler we will study is Symantec Think C version 7
for the 68K Macintosh. The disassembled output has been edited
to clarify its operation. The loader will allocate 3 segmented
memory areas: code pointed to by the PC; global pointed to by A5; and local pointed to by the stack pointer A7. The global symbol, x, will be assigned or bound by the loader. "Binding" means establishing
its address. The compiler can bind the local variables and subroutine
parameters. The link instruction establishes a stack frame pointer, A6, and allocates local variables. The actual Think C compiler optimized
the subroutine by placing the local variable, y, in register D7, but in this example, I "unoptimized" it to illustrate the use
of local variables.
y      equ      -2             local variable binding A6 relative
main:  LINK     A6,#-2         allocate y for main
       MOVE.W   #5,x(A5)       x=5;
       MOVE.W   #6,y(A6)       y=6;
       MOVE.W   y(A6),-(A7)    call by value
       JSR      sub
       MOVE.W   D0,x(A5)       x=result of sub
       MOVEQ    #0,D0
       UNLK     A6
       RTS
y      equ      -2             local variable binding A6 relative
z      equ      8              Parameter binding A6 relative
sub:   LINK     A6,#-2         allocate y for sub
       MOVE.W   #1,y(A6)
       ADD.W    z(A6),y(A6)    y=z+1;
       MOVE.W   y(A6),D0       D0 is the return parameter
       UNLK     A6             deallocate y
       RTS
The stack frame at the time of the ADD.W z(A6),y(A6) instruction is shown. Within the subroutine the local variables
of main are not accessible.
The next compiler we will study is ImageCraft ICC11 version 4.0
for the Motorola 6811. Again, the disassembled output has been
edited to clarify its operation. The linker/loader also allocates
3 segmented memory areas: code pointed to by the PC; global accessed with absolute addressing; and locals pointed
to by the stack pointer SP. The global symbol, _x, will be assigned or bound by the linker/loader. The pshx instruction allocates the local variable, and the TSX instruction establishes a stack frame pointer, X. This compiler passes the first input parameter into the subroutine
by placing it in register D. The remaining parameters (none in this example) would have been
pushed on the stack.
y      equ      0        local variable binding X relative
main:  PSHX              allocate y for main
       TSX               establish stack frame pointer
       LDD      #5
       STD      _x       x=5;
       LDD      #6
       STD      y,X      y=6;
       JSR      sub
       STD      _x       x=result of sub
       CLRA
       CLRB
       PULX              deallocate y
       RTS
y      equ      0        local variable binding X relative
z      equ      2        Parameter binding X relative
sub:   PSHB              put parameter on stack
       PSHA
       PSHX              allocate y for sub
       TSX               establish stack frame pointer
       LDD      z,X
       ADDD     #1
       STD      y,X      y=z+1;
       PULX              deallocate y
       PULX              discard z
       RTS               RegD is the return parameter
The stack frame at the time of the ADDD #1 instruction is shown. Within the subroutine the local variables
of main are not accessible.
The third compiler we will study is ImageCraft ICC12 version 4.0
for the Motorola 6812. Again, the disassembled output has been
edited to clarify its operation. Like the 6811, the linker/loader
also allocates 3 segmented memory areas: code pointed to by the
PC; global accessed with absolute addressing; and locals pointed
to by the stack pointer SP. The leas -2,sp instruction allocates the local variable, and the tfr s,x instruction establishes a stack frame pointer, X. ImageCraft ICC12 compiler passes the first input parameter into
the subroutine by placing it in register D. The remaining parameters (none in this example) would have been
pushed on the stack.
y      equ      -2       ; local variable binding X relative
main:  pshx              ; main()
       tfr      s,x      ; X is the stack frame pointer
       leas     -2,sp    ; allocate y    int y;
       movw     #5,_x    ; x=5;
       movw     #6,-2,x  ; y=6;
       ldd      -2,x     ; parameter in RegD
       jsr      sub      ; x=sub(y);
       std      _x       ; store return in global x
       ldd      #0       ; return(0);}
       tfr      x,s
       pulx
       rts
Y      equ      -4       ; local variable binding X relative
Z      equ      -2       ; Parameter binding X relative
sub:   pshx              ; int sub(int z){
       tfr      s,x
       pshd
       leas     -2,sp    ; allocate y    int y;
       ldd      Z,x      ; y=z+1;
       addd     #1
       std      Y,x
       ldd      Y,x      ; return(y);}
       tfr      x,s
       pulx
       rts
The stack frame at the time of the ADDD #1 instruction is shown. Within the subroutine the local variables
of main are not accessible.
The fourth compiler we will study is Borland C version 3 for the
Intel 386 IBM-PC. The disassembled output has been edited to clarify
its operation. The small memory model was used. With the large
memory models, the short call would be replaced with a far call (callf). Similar to the Macintosh, the loader will allocate 3 segmented
memory areas: code pointed to by CS:IP; global accessed with data segment addressing DS:offset; and
local pointed to by the stack pointer SP. The offset of global symbol, x, can be bound by the compiler. The loader will establish the
position in memory and set the segment register, DS. The compiler can bind the local variables and subroutine parameters.
The SUB SP,2 instruction allocates the local variable, and the PUSH BP and MOV BP,SP instructions establish a stack frame pointer, SS:BP. In later versions of the compiler, these three instructions
are replaced by the single instruction ENTER. The MOV SP,BP and POP BP instructions are replaced by LEAVE.
y      equ      -2                   local binding BP relative
main:  PUSH     BP
       MOV      BP,SP                establish stack frame
       SUB      SP,2                 allocate y for main
       MOV      word ptr[x],5        x=5;
       MOV      word ptr[BP+y],6     y=6;
       PUSH     word ptr[BP+y]       call by value
       CALL     sub
       POP      CX                   discard parameter
       MOV      [x],AX               x=result of sub
       XOR      AX,AX
       MOV      SP,BP
       POP      BP
       RET
y      equ      -2                   local binding BP relative
z      equ      4                    Parameter binding BP relative
sub:   PUSH     BP
       MOV      BP,SP                establish stack frame
       SUB      SP,2                 allocate y for sub
       MOV      AX,[BP+z]
       INC      AX
       MOV      [BP+y],AX            y=z+1;
       MOV      AX,[BP+y]            AX is the return parameter
       MOV      SP,BP
       POP      BP
       RET
The stack frame at the time of the INC AX instruction is shown. Within the subroutine the local variables
of main are not accessible.
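The BP-relative bindings in the Borland listing (y at -2, z at +4) can be traced with a small C simulation of a 16-bit stack. This is a sketch under stated assumptions: Mem, Sp16, Push16, Pop16, CallSub and FrameSp are hypothetical names, and the return offset and the caller's BP are arbitrary placeholder values.

```c
#include <stdint.h>

enum { Y = -2, Z = 4 };              /* bindings relative to BP, in bytes */

static uint16_t Mem[32];             /* hypothetical stack segment, one element per 16-bit word */
static int Sp16 = 64;                /* simulated SP as a byte offset, stack grows downward */

static void Push16(uint16_t v) { Sp16 -= 2; Mem[Sp16 / 2] = v; }
static uint16_t Pop16(void) { uint16_t v = Mem[Sp16 / 2]; Sp16 += 2; return v; }

/* Walks through the call sequence for sub(arg) and returns what AX would hold. */
uint16_t CallSub(uint16_t arg) {
    int bp = 60;                      /* caller's BP, an arbitrary placeholder */
    uint16_t ax;
    Push16(arg);                      /* PUSH word ptr[BP+y]: call by value */
    Push16(0x1234);                   /* CALL sub: return offset (placeholder) */
    Push16((uint16_t)bp);             /* PUSH BP */
    bp = Sp16;                        /* MOV BP,SP: establish stack frame */
    Sp16 -= 2;                        /* SUB SP,2: allocate y */
    ax = Mem[(bp + Z) / 2];           /* MOV AX,[BP+z] */
    ax = (uint16_t)(ax + 1);          /* INC AX */
    Mem[(bp + Y) / 2] = ax;           /* MOV [BP+y],AX */
    ax = Mem[(bp + Y) / 2];           /* MOV AX,[BP+y] */
    Sp16 = bp;                        /* MOV SP,BP: deallocate y */
    bp = Pop16();                     /* POP BP */
    (void)Pop16();                    /* RET: pop the return offset */
    (void)Pop16();                    /* POP CX: caller discards the parameter */
    (void)bp;
    return ax;
}

/* Lets a caller confirm that the stack is balanced after the call. */
int FrameSp(void) { return Sp16; }
```

The parameter sits at BP+4 because two 16-bit words, the return offset and the saved BP, lie between the frame pointer and the value the caller pushed.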
--------------------------------------------------------------------------------------
Stack animation of a C function
In order to understand both the machine architecture and the C compiler, we can look at the assembly code generated. This example shows a simple C program with three local variables. Although the function doesn't do much, it will serve to illustrate how local variables are created (allocation), accessed (read and write), and destroyed (deallocation).
void sub(void){ short y1,y2,y3; /* 3 local variables*/
y1=1000;
y2=2000;
y3=y1+y2;
}
The compiler we will study is the ImageCraft ICC12 version 5.0 for the Motorola 6812. The disassembled output has been edited to clarify its operation (although the compiler does create the "; y3 -> -6,x" comment). The linker/loader allocates 3 segmented memory areas: code pointed to by the PC; global accessed with absolute addressing; and locals pointed to by the stack pointer SP. The leas -6,sp instruction allocates the local variables, and the tfr s,x instruction establishes a stack frame pointer, X. Within the subroutine the local variables of other functions are not accessible.