Sun, 27 Jan 2013, 04:40

A student writes:

> Dr. Patt,
> I am in the middle of writing the error detection part of the assembler and
> I thought of a somewhat contorted example of a line you could give us. A
> label is supposed to contain only alphanumeric characters, so an invalid
> label could technically contain white space in it. For example the line
> A B ADD R0, R0, R0
> could be thought of as either a label with a tab character in it (in which
> case it would exit with error 4 for an incorrect label) or it could be
> thought of as an invalid opcode (in which case it would exit with error 2
> for invalid opcode). I think that it should exit with error 4, but the
> parser you have written would not treat the label as having white space and
> so it would be much more logical to exit with error code 2. Which should we
> implement, or would a case like this not be tested?
> Thank you!
> <<name withheld to protect the student who found a case that requires we
> add clarification #18 to the list of clarifications>>

Congratulations!  You found a case that tells me we should have provided
clarification #18, instead of assuming it was clear without making it
explicit: A comma, space, or tab character always terminates the previous
token.

The student's example will be interpreted as:

The first token, A, must be a label.
The second token, B, must then be an opcode, and it is an illegal one.

So, although A B ADD r0, r1, r2 is probably a label error,
the assembler will report it as an invalid opcode error (error 2).
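
For those who like to see the rule as code, here is a minimal sketch in C
of how it plays out. It is not the parser handed out with the lab. The
names classify_line, is_opcode, same_word, LINE_OK, and ERR_INVALID_OPCODE
are made up for illustration, the opcode table is only a small subset, and
pseudo-ops, comments, operand checking, and label validity are all left out.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical result codes; 2 mirrors the invalid-opcode exit code
   mentioned in the student's question. */
enum { LINE_OK = 0, ERR_INVALID_OPCODE = 2 };

/* Illustrative subset of opcodes only, not the full instruction set. */
static const char *const OPCODES[] = {
    "add", "and", "not", "xor", "ldw", "stw", "lea", "jmp", "halt"
};

/* Case-insensitive comparison of two whole tokens. */
static int same_word(const char *a, const char *b)
{
    while (*a && *b) {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == '\0' && *b == '\0';
}

/* Returns 1 if tok appears in the (illustrative) opcode table. */
static int is_opcode(const char *tok)
{
    size_t n = sizeof OPCODES / sizeof OPCODES[0];
    for (size_t i = 0; i < n; i++)
        if (same_word(tok, OPCODES[i]))
            return 1;
    return 0;
}

/* Classify one source line. The line is copied first because strtok
   writes into the buffer it scans. */
int classify_line(const char *line)
{
    char buf[256];
    assert(line != NULL);
    strncpy(buf, line, sizeof buf - 1);
    buf[sizeof buf - 1] = '\0';

    /* A comma, space, or tab always ends the current token. */
    char *first = strtok(buf, ", \t\n");
    if (first == NULL)
        return LINE_OK;              /* blank line */
    if (is_opcode(first))
        return LINE_OK;              /* no label on this line */

    /* The first token is not an opcode, so it is taken as a label, and
       the next token must be an opcode. Treating a label with nothing
       after it as an error is an assumption of this sketch. */
    char *second = strtok(NULL, ", \t\n");
    if (second == NULL || !is_opcode(second))
        return ERR_INVALID_OPCODE;
    return LINE_OK;
}
```

With this sketch, classify_line("A B ADD R0, R0, R0") returns
ERR_INVALID_OPCODE whether A and B are separated by a space or a tab,
because either character ends the token A before B is ever examined.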

Lest you be too upset by our assembler not being intelligent enough
to know what the programmer had in mind, you can take solace in the
fact that many (all?) modern compilers get the actual error wrong 
some of the time.  What the compiler writers tell me is, "Yeah, yeah, 
yeah, but at least we found the line that had an error, and the 
programmer is probably able to use that information to identify the 
actual mistake and fix it."

Good luck completing the first programming lab on time.

Yale Patt