hi everyone,

we've made it to the last chapter, and i thought i'd use this email to jot down some thoughts on things that are important to programmers but that we just didn't have time for.

- parsing/compilation: reading any nontrivial data (csv is trivial in my book) should be done using tools for generating parsers; i use yacc and lex (actually their gnu equivalents, bison and flex, both of which run on windows too). there's a LOT of theory on how to compile programming languages (the Chomsky hierarchy, Post Correspondence, etc.) but as a practicing programmer you can get by with much less. probably the best way to understand and start writing parsers is to scarf someone else's parser; at least that's how i did it. for example, the code under the ctlp directory in the vis release implements a fairly simple parser: http://embedded.eecs.berkeley.edu/research/vis/getting_VIS.html

if you want to go the traditional route, my friend and colleague stephen edwards teaches a compiler class with great notes and references at www1.cs.columbia.edu/~sedwards/classes/2003/w4115f

- operating system: again, there's tons of theory about OSs, both mathematical and philosophical. as a programmer i feel what you need to know can be summed up very succinctly: an OS is an abstraction of the hardware. in particular, an OS is a program (a big one, but a single program nonetheless); it supports all the other programs that are running (a running program is a process in unix speak).

one key thing an OS does is schedule processes, since a processor can only run one at a time. (thus it provides an abstraction of the processor to processes.) in addition it provides an abstraction of the disk (so you don't have to treat the disk as a raw array of bytes), of the memory (so you don't have to deal with memory protection), and of the network (so you don't have to write raw ethernet packets).
the OS also provides "system calls" to your programs that you invoke just like you invoke functions (e.g., read( 12, &foo, 100 );) except that the code runs inside the OS, not in your program. (these calls are the only way in which you can tell the OS to do stuff for you.) somewhat surprisingly, there are relatively few system calls (a few hundred in linux, <1000 in win32). they include read, write, timer functions, file/dir creation, etc. calls to printf, rand, etc. are NOT system calls - they are calls to C functions compiled into a library called libc. libc code itself makes system calls (write for printf, none for rand). the OS itself contains "driver" code that implements read, write, etc., based on the device type.

andrew tanenbaum's OS book (Operating Systems: Design and Implementation) is the clearest i've read: not too theoretical, not too implementation focused. (it also directly led to the invention of linux.)

- linker/loader/library technology: it used to be that executables were "complete" in that the loader pulled them into memory, and the OS just ran them (of course, the OS ran the system calls). in the late 80s/early 90s, executables became very big (even trivial windowing programs would be huge because they included all the windowing code). this led to the use of "dynamic linkage": only when your program made a call to a library function would that code get pulled into the process.

dynamic linkage is not trivial, since the runtime environment has to be able to figure out where to find the function, and the memory addresses in the function need to be set based on where in memory it's loaded. the solution is to use hash tables mapping function/variable names (just char arrays) to pointers. these are stored with the object code in the libraries. (if you think about it, there is significant overhead only the first time the function is called; after that, the runtime environment has set the pointer to the function's address and calls go straight through it.)
you can read about the linux implementation of dynamic linkage, and the elf format for storing the information needed in the object files, at www.linux.org. (i'm not as familiar with the win32 implementation, but the central idea of hash tables is the same.)

parsing is something that you use in practice; most people don't write OS code. however, i feel knowing about the OS and linker/libraries can only make you a better programmer: it facilitates debugging, identifying performance problems, etc.

cheers,
adnan