Wed, 1st Nov 2017, 01:34 A handout on the data structures we have been discussing



My students,

As you know, I have changed the order of discussion of topics in Chapters
8, 9, and 10 from what you have in your textbook to what it will look like
in the 3rd edition (which hopefully will get finished this year!).
Unfortunately, by then EE306 will be just a pleasant memory.  So, although
I cannot give you the chapter as it will exist when the book gets published,
I can at least give you a better sense of what we are covering in class right
now than is in the book.

Hence this handout, which contains some new material, but also points
you to the sections in your book (2nd edition) where some of these topics are
discussed.

I have added some figures to this handout that are currently not in the book.
I will refer to them as New.1, New.2, etc. in the text below.  My hope is to
have them scanned in tomorrow, and put them on the website.

So, onward:

Chapter 8.  Data Structures

Up to now, each item of information we have processed with the computer
has been a single value -- either an integer, a floating point number, or
an ASCII character.  The real world is filled with items of information
far more complex than simple single numbers.  A company's organization
chart and a list of items arranged in alphabetical order are two examples.
We call these complex items of information abstract data types, or more
colloquially: data structures.  In this chapter, we will study three
abstract data types: stacks, queues, and trees.  We will examine alternative
implementations, and write programs to solve problems that require
expressing information according to its structure.

Before we get to the data structures, however, we should introduce a new
concept that will prove very useful in manipulating data structures:
subroutines, also known as functions.

The material on Subroutines is in your book in Section 9.2.  You can forget
about sections 9.2.3 and 9.2.4 right now.  They deal with some I/O stuff that
we will talk about next week and after the second midterm.  There is also
some stuff on saving and restoring registers (caller save and callee save)
in Section 9.1.7 that you will want to read.


8.1.  9.2. Subroutines
      9.2.1. The Call/Return Mechanism
    fig 9.7 Instruction execution with/without subroutines
      9.2.2. The JSR(R) instruction
      9.2.5. Library Routines
    The example c^2 = a^2 + b^2
    fig 9.10 the right triangle, solving for the hypotenuse
    fig 9.11 code for finding the hypotenuse
    fig 9.12 code using a library routine
    fig 9.13 the executable image, constructed from multiple files (some library
                routines also -- an opportunity, preview of "include" in C)
      9.1.7. Saving and Restoring Registers

8.1. An example: Getting out of a maze.

>>I will add this example when I get a chance<<

Now we are ready for the first and most important data structure, the stack.
The material on stacks is in your book, Section 10.1.  (I have also included
the new section headings that will be used in the 3rd edition.  Just ignore
them.  They are there to help me when I write the chapter for the 3rd edition.)


8.2.   10.1. The Stack: its basic structure
8.2.1  10.1.1. The stack: an abstract data type
    fig 10.1 coin holder
8.2.2  10.1.2. Two example implementations
    fig 10.2 a stack implemented in hardware -- data entries move
8.2.3  10.1.3. Implementation in memory
    fig 10.3 a stack implemented in memory -- data entries do not move
    PUSH
    Overflow, test for overflow, and Success/Fail code
    fig NEW.1 PUSH code, including test for overflow
    POP
    Underflow
    fig 10.4 POP code, including test for underflow
8.2.4   10.1.4. The complete picture
    fig 10.5 The stack protocol
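
To make the protocol concrete, here is a rough C sketch of a stack kept in
memory, with the overflow test of fig NEW.1 and the underflow test of
fig 10.4.  The array, its size, and the SUCCESS/FAILURE codes are my
stand-ins for the block of memory and for R5; they are not from the figures.

```c
#include <stdint.h>

#define STACK_SIZE 6
#define SUCCESS 0
#define FAILURE 1

/* A stack kept in an array, as in fig 10.3: the data entries do not
   move; only the stack pointer changes.  sp indexes the next free
   slot, and the return code plays the role of R5. */
static int16_t stack[STACK_SIZE];
static int sp = 0;            /* empty stack */

int push(int16_t value)
{
    if (sp == STACK_SIZE)     /* test for overflow (fig NEW.1) */
        return FAILURE;
    stack[sp++] = value;
    return SUCCESS;
}

int pop(int16_t *value)
{
    if (sp == 0)              /* test for underflow (fig 10.4) */
        return FAILURE;
    *value = stack[--sp];
    return SUCCESS;
}
```

Notice the LIFO behavior: the last value pushed is the first value popped.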

Please skip Section 10.2.  It deals with "interrupts," a very important
topic, which we will get to before we part company in December.

Section 10.3 discusses the use of a stack for performing arithmetic, which
we discussed in class.


8.2.5   10.3. Arithmetic using a Stack
8.2.5.1 10.3.1. The stack as temporary storage
        10.3.2. An example (A+B)x(C+D)
        10.3.3. OpAdd, OpMult, OpNeg
    fig 10.8 Stack usage during computation (25+17)x(3+2)
    OpAdd
    fig 10.9 Flow chart for OpAdd
    fig 10.10 code for OpAdd
    OpMult
    fig 10.13 flowchart OpMult
    fig 10.14 OpMult code
    OpNeg
    fig 10.15 OpNeg code
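
Here is the same computation, (25+17)x(3+2), sketched in C, with OpAdd,
OpMult, and OpNeg as little functions that pop their operands and push
their result.  The function names mirror the figures; everything else (the
array, its size) is my invention.

```c
/* Stack-based arithmetic as in Section 10.3: each operation pops
   its operands off the stack and pushes its result back on. */
static int stk[16];
static int top = 0;

static void push(int v) { stk[top++] = v; }
static int  pop(void)   { return stk[--top]; }

static void op_add(void)    /* pop two, push their sum (OpAdd) */
{
    int b = pop(), a = pop();
    push(a + b);
}

static void op_mult(void)   /* pop two, push their product (OpMult) */
{
    int b = pop(), a = pop();
    push(a * b);
}

static void op_neg(void)    /* pop one, push its negation (OpNeg) */
{
    push(-pop());
}

int evaluate_example(void)  /* (25+17)x(3+2), as traced in fig 10.8 */
{
    push(25); push(17); op_add();   /* stack now holds: 42    */
    push(3);  push(2);  op_add();   /* stack now holds: 42 5  */
    op_mult();                      /* stack now holds: 210   */
    return pop();
}
```

Follow evaluate_example against fig 10.8: the stack holds the intermediate
sums 42 and 5 until OpMult consumes them.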

And, now, some new stuff (preview of the 3rd edition!):

8.3  The queue, its basic structure.

Recall that the defining property of the *stack* was LIFO: the last
thing we pushed onto the stack is the first thing we pop off the stack.
The *queue* has its own defining property: FIFO.  FIFO stands for "First In,
First Out."  It is a queue in the sense of a queue at a polite supermarket, or a
polite ticket counter.  That is, the first person in line is the first person
serviced.  In the context of the abstract data type, this means we need to
keep track of two ends of the storage structure: a FRONT pointer for servicing
(i.e., removing from) the queue and a REAR pointer for joining (i.e., inserting
into) the queue.

Figure New.2 shows a sequential block of six memory locations that have been
allocated for storing the queue.  The queue grows from xA000 to xA005.  We
arbitrarily assign the FRONT pointer to the location just before the first
element of the queue.  We assign the REAR pointer to the location containing
the most recent element that was added to the queue.

Thus Figure New.2 shows a queue into which five values have been inserted.
Two (x45 and x17) have been removed, and three (x23, x2, and x74)
remain.  Note that the values x45 and x17 are still contained in memory
locations xA000 and xA001.  That is the nature of load instructions.  They
do not erase the contents of memory, they simply copy the memory location's
contents into the destination register.  However, since FRONT contains the
address xA001, there is no way to load from those two locations.  The front
of the queue is the value x23, which is in memory location xA002.

At first blush, it looks like we will never be able to store more than six
values into this queue.  Not so!  When we remove a value from the queue,
that location becomes available for storing another entry.  We do
that by allowing the available storage locations to "wrap around."  For example,
suppose we want to add two more values, x10 and x20 to the queue.  We store the
value x10 in location xA005, and the value x20 in location xA000.  We can store
x20 in location xA000, since the earlier value that was in xA000 has been
removed from the queue.  The result of these two insertions is shown in Figure
New.3.  There are five values in the queue.  x23 is at the front, x20 is at
the rear.

"Wrap around" works by having our removal and insertion algorithms test the
contents of FRONT and REAR for the value xA005.  If we wish to insert, and
REAR contains xA005, we know we have reached the end of our available storage
and we must see if xA000 is available.  If we wish to remove, we must first
see if FRONT contains the address xA005.  If it does, the front of the queue
is in xA000.

Let's look again at Figure New.2.  There are three values in the queue.  If we
remove x23, FRONT will point to xA002.  If we then remove x2 and x74, FRONT
will point to xA004.  The queue will be empty, which we can tell because
FRONT=REAR.

Our subroutine for removing the element at the front of the queue is shown in
Figure New.4.  We will use R3 for the FRONT pointer and R4 for the REAR
pointer.  As with the stack, we will use R5 to report success or failure.

We first check to see if the queue is empty; i.e., if R3=R4.  If so, we
branch to UNDERFLOW, where we set R5 to failure, restore R1 (which the
subroutine saved on entry), and return.

If not, we check to see if R3 is pointing to xA005; if so we need to wrap
around.  Once we have R3 pointing to the front of the queue, we use LDR to
remove the front element of the queue, restore R1, and return.
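
Here is a rough C sketch of the removal subroutine of Figure New.4,
including the wrap-around test.  Array indices 0 through 5 stand in for
addresses xA000 through xA005, and the C variables and return code stand
in for R3, R4, and R5.  (Insertion is still your exercise -- write the
LC-3 version yourself!)

```c
#include <stdint.h>

#define QFIRST 0              /* stands in for address xA000 */
#define QLAST  5              /* stands in for address xA005 */
#define SUCCESS 0
#define FAILURE 1

static int16_t queue[QLAST + 1];
static int front = QLAST;     /* just before the first element  */
static int rear  = QLAST;     /* most recently inserted element */

/* Remove the element at the front of the queue, mirroring the
   subroutine of Figure New.4: front and rear play the roles of
   R3 and R4, and the return code plays the role of R5. */
int dequeue(int16_t *value)
{
    if (front == rear)            /* FRONT=REAR: queue is empty */
        return FAILURE;           /* UNDERFLOW */
    if (front == QLAST)           /* end of storage: wrap around */
        front = QFIRST;
    else
        front = front + 1;
    *value = queue[front];        /* the LDR in the LC-3 version */
    return SUCCESS;
}
```

If you set this sketch up in the state of Figure New.2 (front at index 1,
rear at index 4), three dequeues yield x23, x2, and x74, and a fourth
reports underflow, because front has caught up with rear.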

We leave it as an exercise for the reader to write a program for inserting
an element at the REAR of the queue.  Hint: we will say the queue is full
when all but one of the locations contains a value that has been inserted
but not removed.

BIG Question: Why do we not consider the queue full when ALL the locations
contain values that have been inserted but not removed?

8.4. The linked list

There is a common alternative to storing a data structure in sequential
locations in memory.  It is called a linked list.  Figure New.5 shows an
example of four values ABLE, BRAVO, CHARLIE, DELTA that are stored both in
sequential memory and as a linked list.  In both cases, the order ABLE before
BRAVO, BRAVO before CHARLIE, CHARLIE before DELTA is maintained.  Figure
New.5c shows the usual way we represent the elements of a linked list, with
arrows representing the pointers to the next element, rather than the explicit
addresses.  Note that the last element in the linked list shows a pointer to
"ground," which is the usual way we represent the null pointer x0000,
signifying the last element in the list.

Note further that the order of the elements in the list is determined by the
pointers, and NOT by the actual locations in memory.  Also, note that each
element in the linked list requires, in addition to its value, an extra word
of memory to store the pointer.

What about underflow and overflow?  Much easier for the linked list.
Underflow is the situation when there are no elements in the list, easily
identified by HEAD containing the value x0000.  Overflow is generally not a
problem, since if the element is contained somewhere in memory, it can be
linked to the linked list.

Suppose, as in the case of the example of Figure New.5, we wish to maintain
the linked list in alphabetical order.  It is much easier to add an element
and maintain alphabetical order than if we had stored the elements in
sequential memory.  Suppose, for example, we wished to insert AUSTIN into our
data structure, and maintain alphabetical order. If the data structure were
stored in sequential memory, we would have to move BRAVO, CHARLIE, and DELTA
in order to make room for AUSTIN in location x8053.  With a linked list, we
simply have to change the pointer from ABLE to AUSTIN and from AUSTIN to
BRAVO.  Figure New.5d illustrates the addition of AUSTIN to the linked list.
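
For the C-curious, here is a rough sketch of that insertion.  The struct
layout and names are my choices, not the book's, and NULL plays the role
of the "ground" pointer x0000.

```c
#include <stdlib.h>
#include <string.h>

/* A linked-list element as in Fig New.5: a value plus one extra
   word holding the pointer to the next element. */
struct node {
    const char  *name;
    struct node *next;
};

/* Insert a new name, keeping the list in alphabetical order.
   Only pointers change; no existing element moves in memory. */
struct node *insert_sorted(struct node *head, const char *name)
{
    struct node *elem = malloc(sizeof *elem);
    elem->name = name;

    if (head == NULL || strcmp(name, head->name) < 0) {
        elem->next = head;          /* new front of the list */
        return elem;
    }
    struct node *p = head;
    while (p->next != NULL && strcmp(p->next->name, name) < 0)
        p = p->next;                /* find the predecessor (ABLE)   */
    elem->next = p->next;           /* AUSTIN now points to BRAVO    */
    p->next = elem;                 /* ABLE now points to AUSTIN     */
    return head;
}
```

Only two pointers change, no matter how long the list is; contrast that
with moving BRAVO, CHARLIE, and DELTA in sequential memory.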

In fact, a useful rule of thumb as to whether sequential memory or linked list
is preferred is whether the data structure will be accessed or updated most of
the time.  If updated, as discussed above, a linked list is much faster.  If
accessed, sequential storage is better since we can locate an element in
minimal time by performing a binary search on the elements.

8.4.1. The doubly linked list

A simple variation of the linked list is the doubly-linked list, where each
element needs two extra locations of memory, one for a forward pointer and
one for a backward pointer.  Figure New.6 illustrates the doubly-linked list.
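
A rough C sketch of a doubly-linked element, and of splicing a new element
in after an existing one (the names are mine).  Note that four pointers
must now be fixed up instead of two.

```c
#include <stddef.h>

/* Each element of a doubly-linked list (Fig New.6) carries two
   pointer words instead of one. */
struct dnode {
    int value;
    struct dnode *next;   /* forward pointer  */
    struct dnode *prev;   /* backward pointer */
};

/* Splice b into the list immediately after a. */
void link_after(struct dnode *a, struct dnode *b)
{
    b->next = a->next;
    b->prev = a;
    if (a->next != NULL)
        a->next->prev = b;    /* old successor points back at b */
    a->next = b;
}
```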

8.4.2. A program that uses it (to be added later)
8.4.3. Another program that uses it (to be added later)


8.5. Character strings

A common data structure is a sequential set of locations used to store a
character string.  A person's name is a good example.  Figure New.7 shows
a linked list consisting of three elements, where each element represents
a person.  For each person, two pieces of information are stored, the person's
name and some piece of data.  Each name could contain a different number of
letters, as in the example.  An easy way to store this data structure is to
use a pointer for each name, and then store the name in sequential locations
of memory, each location containing the ASCII code for a letter in the name.

Programs searching this data structure for a particular name would compare
each letter in the character string with the corresponding letter in the name
being searched.  We often use the null character x0000 to represent the end of
a name.  Using the data structure of Figure New.7, the name NYE would produce
a match on the third element in the linked list.  The name NYEBOLD would not
produce a match.
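
Here is that letter-by-letter comparison sketched in C (the function name
is mine; in C, the null character is written '\0'):

```c
/* Compare two names stored as sequences of character codes ending
   in the null character: compare letter by letter until a mismatch
   or until both strings end. */
int names_match(const char *a, const char *b)
{
    while (*a != '\0' && *a == *b) {
        a++;
        b++;
    }
    return *a == '\0' && *b == '\0';   /* match only if both ended */
}
```

This is why NYE does not match NYEBOLD: the two strings agree for three
letters, but only one of them has reached its null character.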

8.6. Trees

Figure New.8 illustrates a *tree*.  A tree is a very useful data structure
for representing families and organizations, for example. Figure New.8 could
represent a family, where A has three children, B, C, and D.  B has one child,
E.  D has two children F and G.  We could store information about each person
in each element.  That is, for example, the locations A through G shown in
Figure New.8 could contain information about gender, date of birth, favorite
foods, etc.  One could then search the data structure, asking a question
like: who in the family likes cheesecake?

Note that each person in the tree has three pointers: a pointer to the parent,
a pointer to the next younger sibling, and a pointer to the oldest child.
These three pointers are sufficient to completely characterize any tree.
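
Here is a rough C sketch of such a tree element and of the cheesecake
search.  Everything here -- the struct layout, the field names, the
likes_cheesecake flag -- is my invention for illustration.

```c
#include <stddef.h>

/* A tree element with the three pointers described above. */
struct person {
    const char    *name;
    int            likes_cheesecake;  /* stand-in for per-person data */
    struct person *parent;
    struct person *sibling;           /* next younger sibling */
    struct person *child;             /* oldest child         */
};

/* Search the whole tree for someone who likes cheesecake:
   check this person, then the oldest child's subtree, then the
   next younger sibling's subtree. */
struct person *find_cheesecake_fan(struct person *p)
{
    if (p == NULL)
        return NULL;
    if (p->likes_cheesecake)
        return p;
    struct person *hit = find_cheesecake_fan(p->child);
    if (hit != NULL)
        return hit;
    return find_cheesecake_fan(p->sibling);
}
```

Notice that the child and sibling pointers alone are enough to reach every
element of the tree; the parent pointer is there for walking back up.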

Figure New.8 could also represent the organizational structure of a
company.  A is the Big Boss, and has three people reporting directly to him:
B, C, and D.  E reports to B, F and G report to D, and no one reports to C.
Each element of the tree could contain information about the employee, such as
social security number, salary, years with the company, etc.  One could then
query this data structure to identify all employees who are celebrating their
10th year with the company.

>>end of chapter<<