hi everyone,

hw2 is due 11:59pm june 18 (sunday). i've gotten a blackboard setup for our class, will be configuring it on monday, and will send our instructions for uploading hws then.

chapter 2 is on algorithms and data structures.

most undergrad programming classes focus on algos/DS (IMO, overly so, at the expense of many pragmatic aspects of programming), so you should know most of what's there already. a review is always good though, and you may see things differently now that you are in the real world.

hw2 is relatively straightforward - there is some programming involved, but it's short snippets of code, and the point is more to get you to think about the different data structures and algos you can build on them.

there are already some very good libraries of data structures and associated algorithms (STL for simpler data structures, boost for more complicated ones), and the only situation where you might want to write own, e.g., a binary search tree routine, is when you need very high performance. in such situations, although you don't typically design new algorithms, you do organize the data and code to optimally use system resources (e.g., make things fit on a cache line).

another reason people give for writing your own data structures/algos is to save time reading the library documentation, but given how long it takes to test/debug such code, i've NEVER seen this reason to hold up.

bear in mind that the big-O notation hides constant factors, and that two functions can both be O(N), while still being very different in terms of performance. similarly, some data structure may be very friendly to the L1-L2-L3 caching schemes used in modern processors. similarly, often you can solve performance problems by looking at input statistics and caching common cases, etc.

IMO, the important reason to know these data structures and algorithms is so that you can recognize when you come across a probem what tools might be appropriate. for example, if you didn't know about the heap data structure (not to be confused with the memory space that's allocated dynamically!), you might use a binary search tree to implement the functionality of inserts and extract-min's. however, the heap is much more memory friendly (it's typically implemented as an array, so there's no pointer redirection).

similarly, when you need the functionality of a hash table, you still need to provide hash and compare functions. designing a good hash function can be quite tricky, and sometimes coming up with any hash function can be really hard (e.g., say your entries are Boolean logic functions - it turns out there is a way of hashing them, but it's not at all straightforward). usually though all you hash are numbers or strings (any object in memory can be viewed as a string of bytes, so string hash functions have pretty general use).

there are scores of books on advanced data structures and algorithms. the three that i like best are:

Introduction to Algorithms. 2nd edition, Cormen, Leiserson, Rivest, and Stein. (Quite mathematical, but lots of examples, and easy to read.)
Programming Pearls. 2nd edition, Bentley. (A bit too full of cutesy examples that you are unlikely to encounter outside of a Microsoft job interview, but still fun to read.)
Algorithms in C. 3rd edition, Sedgewick. (Somewhat terse, but includes code for all the routines - you can get versions of this book for Java and C++)

cheers,

adnan