Wed, 1st Nov 2017, 01:34 A handout on the data structures we have been discussing
My students,

As you know, I have changed the order of discussion of topics in Chapters 8, 9, and 10 from what you have in your textbook to what it will look like in the 3rd edition (which hopefully will get finished this year!). Unfortunately, by then EE306 will be just a pleasant memory. So, although I cannot give you the chapter as it will exist when the book gets published, I can at least give you a better sense of what we are covering in class right now than is in the book. Therefore this handout, which will have some new material, but will also point you to the sections in your book (2nd edition) where some of these topics are discussed.

I have added some figures to this handout that are currently not in the book. I will refer to them as New.1, New.2, etc. in the text below. My hope is to have them scanned in tomorrow, and put them on the website.

So, onward:

Chapter 8. Data Structures

Up to now, each item of information we have processed with the computer has been a single value -- either an integer, a floating point number, or an ASCII character. The real world is filled with items of information far more complex than simple single numbers. A company's organization chart and a list of items arranged in alphabetical order are two examples. We call these complex items of information abstract data types, or more colloquially, data structures. In this chapter, we will study three abstract data types: stacks, queues, and trees. We will examine alternative implementations, and write programs to solve problems that require expressing information according to its structure.

Before we get to the data structures, however, we should introduce a new concept that will prove very useful in manipulating data structures: the subroutine, also called the function. The material on subroutines is in your book in Section 9.2. You can forget about Sections 9.2.3 and 9.2.4 right now.
They deal with some I/O stuff that we will talk about next week and after the second midterm. There is also some stuff on saving and restoring registers (caller save and callee save) in Section 9.1.7 that you will want to read.

8.1. 9.2. Subroutines
9.2.1. The Call/Return Mechanism
    fig 9.7 Instruction execution with/without subroutines
9.2.2. The JSR(R) instruction
9.2.5. Library Routines
    The example c^2 = a^2 + b^2
    fig 9.10 the right triangle, solving for the hypotenuse
    fig 9.11 code for finding the hypotenuse
    fig 9.12 code using a library routine
    fig 9.13 the executable image, constructed from multiple files (some library routines also -- an opportunity, preview of "include" in C)
9.1.7. Saving and Restoring Registers
8.1. An example: Getting out of a maze
    >>I will add this example when I get a chance<<

Now we are ready for the first and most important data structure, the stack. The material on stacks is in your book, Section 10.1. (I have also included the new section headings that will be used in the 3rd edition. Just ignore them. They are there to help me when I write the chapter for the 3rd edition.)

8.2. 10.1. The Stack: its basic structure
8.2.1. 10.1.1. The stack: an abstract data type
    fig 10.1 coin holder
8.2.2. 10.1.2. Two example implementations
    fig 10.2 a stack implemented in hardware -- data entries move
8.2.3. 10.1.3. Implementation in memory
    fig 10.3 a stack implemented in memory -- data entries do not move
    PUSH
    Overflow, test for overflow, and Success/Fail code
    fig NEW.1 PUSH code, including test for overflow
    POP
    Underflow
    fig 10.4 POP code, including test for underflow
8.2.4. 10.1.4. The complete picture
    fig 10.5 The stack protocol

Please skip Section 10.2. It deals with "interrupts," a very important item, which we will deal with before we part company in December. Section 10.3 discusses the use of a stack for performing arithmetic, which we discussed in class.

8.2.5. 10.3. Arithmetic using a Stack
8.2.5.1. 10.3.1.
The stack as temporary storage
10.3.2. An example: (A+B)x(C+D)
10.3.3. OpAdd, OpMult, OpNeg
    fig 10.8 Stack usage during computation of (25+17)x(3+2)
    OpAdd
    fig 10.9 Flowchart for OpAdd
    fig 10.10 code for OpAdd
    OpMult
    fig 10.13 flowchart for OpMult
    fig 10.14 code for OpMult
    OpNeg
    fig 10.15 code for OpNeg

And, now, some new stuff (preview of the 3rd edition!):

8.3. The queue: its basic structure

Recall that the property that defined the concept of a *stack* was LIFO: the last thing we pushed onto the stack is the first thing we pop off the stack. The *queue* has its own defining property: FIFO. FIFO stands for "First In, First Out." It is a queue in the sense of a queue at a polite supermarket, or a polite ticket counter. That is, the first person in line is the first person serviced. In the context of the abstract data type, this means we need to keep track of two ends of the storage structure: a FRONT pointer for servicing (i.e., removing from) the queue, and a REAR pointer for joining (i.e., inserting into) the queue.

Figure New.2 shows a sequential block of six memory locations that have been allocated for storing the queue. The queue grows from xA000 to xA005. We arbitrarily assign the FRONT pointer to the location just before the first element of the queue. We assign the REAR pointer to the location containing the most recent element that was added to the queue. Thus Figure New.2 shows a queue in which five values have been stored. Two (x45 and x17) have been removed, and three (x23, x2, and x74) remain. Note that the values x45 and x17 are still contained in memory locations xA000 and xA001. That is the nature of load instructions: they do not erase the contents of memory; they simply copy the memory location's contents into the destination register. However, since FRONT contains the address xA001, there is no way to load from those two locations. The front of the queue is the value x23, which is in memory location xA002.
At first blush, it looks like we will never be able to store more than six values into this queue. Not so! When we remove a value from the queue, that location becomes available for storing another entry. We do that by allowing the available storage locations to "wrap around." For example, suppose we want to add two more values, x10 and x20, to the queue. We store the value x10 in location xA005, and the value x20 in location xA000. We can store x20 in location xA000 since the earlier value that was in xA000 has been removed from the queue. The result of these two insertions is shown in Figure New.3. There are five values in the queue: x23 is at the front, x20 is at the rear.

"Wrap around" works by having our removal and insertion algorithms test the contents of FRONT and REAR for the value xA005. If we wish to insert and REAR contains xA005, we know we have reached the end of our available storage, and we must see if xA000 is available. If we wish to remove, we must first see if FRONT contains the address xA005. If it does, the front of the queue is in xA000.

Let's look again at Figure New.2. There are three values in the queue. If we remove x23, FRONT will point to xA002. If we then remove x2 and x74, FRONT will point to xA004. The queue will be empty, which we can tell because FRONT=REAR.

Our subroutine for removing the element at the front of the queue is shown in Figure New.4. We will use R3 for the FRONT pointer and R4 for the REAR pointer. As with the stack, we will use R5 to report success or failure. We first check to see if the queue is empty, i.e., if R3=R4. If so, we branch to UNDERFLOW, where we set R5 to failure, restore R1, and return. If not, we check to see if R3 is pointing to xA005; if so, we need to wrap around. Once we have R3 pointing to the front of the queue, we use LDR to remove the front element of the queue, restore R1, and return.
We leave it as an exercise for the reader to write a program for inserting an element at the REAR of the queue. Hint: we will say the queue is full when all but one of the locations contain a value that has been inserted but not removed. BIG question: Why do we not consider the queue full when ALL the locations contain values that have been inserted but not removed?

8.4. The linked list

There is a common alternative to storing a data structure in sequential locations in memory. It is called a linked list. Figure New.5 shows an example of four values, ABLE, BRAVO, CHARLIE, and DELTA, that are stored both in sequential memory and as a linked list. In both cases, the order ABLE before BRAVO, BRAVO before CHARLIE, CHARLIE before DELTA is maintained. Figure New.5c shows the usual way we represent the elements of a linked list, with arrows representing the pointers to the next element, rather than the explicit addresses. Note that the last element in the linked list shows a pointer to "ground," which is the usual way we represent the null pointer x0000, signifying the last element in the list. Note further that the order of the elements in the list is determined by the pointers, and NOT by their actual locations in memory. Also, note that each element in the linked list requires, in addition to its value, an extra word of memory to store the pointer.

What about underflow and overflow? Much easier for the linked list. Underflow is the situation where there are no elements in the list, easily identified by HEAD containing the value x0000. Overflow is generally not a problem, since if the element is contained somewhere in memory, it can be linked into the linked list.

Suppose, as in the case of the example of Figure New.5, we wish to maintain the linked list in alphabetical order. It is much easier to add an element and maintain alphabetical order than if we had stored the elements in sequential memory.
Suppose, for example, we wished to insert AUSTIN into our data structure and maintain alphabetical order. If the data structure were stored in sequential memory, we would have to move BRAVO, CHARLIE, and DELTA in order to make room for AUSTIN in location x8053. With a linked list, we simply have to change the pointer from ABLE to point to AUSTIN, and have AUSTIN's pointer point to BRAVO. Figure New.5d illustrates the addition of AUSTIN to the linked list.

In fact, a useful rule of thumb as to whether sequential memory or a linked list is preferred is whether the data structure will be accessed or updated most of the time. If updated, as discussed above, a linked list is much faster. If accessed, sequential storage is better, since we can locate an element in minimal time by performing a binary search on the elements.

8.4.1. The doubly linked list

A simple variation of the linked list is the doubly linked list, where two locations of memory are needed for the pointers of each element: a forward pointer and a backward pointer. Figure New.6 illustrates the doubly linked list.

8.4.2. A program that uses it (to be added later)
8.4.3. Another program that uses it (to be added later)

8.5. Character strings

A common data structure is a sequential set of locations used to store a character string. A person's name is a good example. Figure New.7 shows a linked list consisting of three elements, where each element represents a person. For each person, two pieces of information are stored: the person's name and some piece of data. Each name could contain a different number of letters, as in the example. An easy way to store this data structure is to use a pointer for each name, and then store the name in sequential locations of memory, each location containing the ASCII code for a letter in the name. Programs searching this data structure for a particular name would compare each letter in the character string with the corresponding letter in the name being searched.
We often use the null character x0000 to represent the end of a name. Using the data structure of Figure New.7, the name NYE would produce a match on the third element in the linked list. The name NYEBOLD would not produce a match.

8.6. Trees

Figure New.8 illustrates a *tree*. A tree is a very useful data structure for representing families and organizations, for example. Figure New.8 could represent a family, where A has three children, B, C, and D. B has one child, E. D has two children, F and G. We could store information about each person in each element. That is, for example, the locations A through G shown in Figure New.8 could contain information about gender, date of birth, favorite foods, etc. One could then search the data structure, asking a question like: who in the family likes cheesecake? Note that each person in the tree has three pointers: a pointer to the parent, a pointer to the next younger sibling, and a pointer to the oldest child. These three pointers are sufficient to completely characterize any tree.

Figure New.8 could also represent the organizational structure of a company. A is the Big Boss, and has three people reporting directly to him: B, C, and D. E reports to B; F and G report to D; no one reports to C. Each element of the tree could contain information about the employee, such as social security number, salary, years with the company, etc. One could then query this data structure to identify all employees who are celebrating their 10th year with the company.

>>end of chapter<<