9/9/04

During lecture, I did something I usually don't do: time was running out, I wanted to get through the logical operators, and so I decided to leave out one piece of the floating point number story, saving it for when I see you again in 360N, since it is not crucial to your understanding in 306. In fact, one of my TAs congratulated me after class on having the self-control to not go over this piece. He said, "I bet you wanted to show them that real bad." And then, he added, "But if you had, you never would have gotten into the logical operators."

He was right. BUT, I just got email from a student who clearly wants to know the whole story. So, even though I will not force you all to know it, I will try to explain it here, since some of you don't want to wait for 360N. For those of you who can wait until 360N, enjoy the Arkansas game on Saturday.

       Dr. Patt,
       In the examples for floating point in the book, one has
       0 00000000 00001000000000000000000
       And states that the leading 0 makes the number positive, the 0's for
       the exponents make the exponent -126 and the last 23 bits gives
       0.00001000000000000000000

       Another contains
       1 00000000 00000000000000000000001
       And states that the leading 1 makes the number negative, the 0's make
       the exponent -126 and the last 23 bits yields
       0.00000000000000000000001

Yup, all of this is correct.

       I heard you state in class that we would not use floating
       point numbers which contain all 0's for the exponent in this course.

       However, does having a zero before the fraction bits
       (ex. 0.0000100...0) come from all eight exponent bits being zero? Are
       there any other cases in which you would have zero before the decimal
       point?

       Thank you,
       << name withheld to protect the person who can't wait until 360N >> 

So, let me try to say it again, since at least one student did not get it (or I misspoke) the first time. What I thought I said (certainly what I meant to say) was that we will only deal with "normalized numbers" in 306, and we will save other numbers for 360N and other courses.

The 32-bit patterns in the format we discussed in class can be lumped into 3 groups: those with an 8-bit exponent field containing
(a) 11111111
(b) 00000000
(c) all the rest

Normalized numbers use the values in group (c) above. In fact, the only normalized numbers that can be represented in this scheme are those of the form:
+/- 1.<<23 bits of significance>> * 2^actual_exponent

where the actual_exponent can be represented by an excess-127 code in the range of (c), that is, from 00000001 (+1) to 11111110 (+254). Thus, the actual_exponent must be between -126 [-126 + 127 = +1] and +127 [127 + 127 = +254]. An actual_exponent smaller than -126 would produce an excess code too small to be represented.
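
For anyone who wants to play with the three groups and the excess-127 arithmetic, here is a minimal sketch in Python. It is not something from class; the names exponent_group and actual_exponent are made up for illustration.

```python
BIAS = 127  # the "excess" in the excess-127 code


def exponent_group(field):
    """Return 'a', 'b', or 'c' for an 8-bit exponent field (0..255)."""
    if not 0 <= field <= 0b11111111:
        raise ValueError("exponent field must fit in 8 bits")
    if field == 0b11111111:
        return "a"
    if field == 0b00000000:
        return "b"
    return "c"


def actual_exponent(field):
    """Decode the excess-127 code of a group (c) exponent field."""
    if exponent_group(field) != "c":
        raise ValueError("only group (c) fields encode a normalized exponent")
    return field - BIAS


print(actual_exponent(0b00000001))  # smallest group (c) field: -126
print(actual_exponent(0b11111110))  # largest group (c) field: 127
```

Fields 00000000 and 11111111 are rejected by actual_exponent on purpose, since they do not belong to the normalized numbers.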

So that takes care of the normalized numbers.

BUT what about the two cases in which the exponent field contains (a) 11111111 and (b) 00000000? The case where the exponent field contains 11111111 is set aside to represent unusual values, like +/- infinity, sqrt(-1), 0/0, etc.
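
If you want to see group (a) on a real machine, the sketch below builds a few such bit patterns and asks Python what they mean. One detail goes beyond what is said above and comes from the IEEE 754 standard: a fraction of all 0's gives infinity, and a nonzero fraction gives "not a number" (the result of things like 0/0). The helpers make_bits and bits_to_float are hypothetical names, not anything from the course.

```python
import math
import struct


def make_bits(sign, exponent_field, fraction):
    """Pack the 1-bit sign, 8-bit exponent field, and 23-bit fraction."""
    return (sign << 31) | (exponent_field << 23) | fraction


def bits_to_float(bits):
    """Reinterpret a 32-bit integer as a single-precision value."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]


# Exponent field 11111111 with fraction 0: plus and minus infinity.
plus_inf = bits_to_float(make_bits(0, 0b11111111, 0))
minus_inf = bits_to_float(make_bits(1, 0b11111111, 0))

# Exponent field 11111111 with a nonzero fraction: not a number.
not_a_number = bits_to_float(make_bits(0, 0b11111111, 1 << 22))

print(plus_inf, minus_inf, math.isnan(not_a_number))
```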

The case where the exponent field contains 00000000 is set aside to represent very tiny numbers that are so small they cannot be normalized. Recall that the way you normalize a tiny number is to move the binary point to the right past all those 0's until you have a "1" to the left of it. If you have to move it n digits, you must compensate for that by subtracting n from the exponent of the value. If the number is tiny enough, then subtracting n will make the exponent less than -126, which will be too small to represent as an 8-bit excess-127 code.

Thus if the number is sufficiently small that we cannot normalize it, we call it a subnormal number. Its values are in the not-normalized range:
+/- 0.<<23 bits of fraction>> * 2^-126

Note the 0 to the left of the binary point means the number is not normalized.

Such a number is represented as
+/- 00000000 <<23 bits of fraction>>
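
To check the two examples from the student's email, here is a minimal sketch that evaluates +/- 0.fraction * 2^-126 by hand and compares it with what the machine's own single-precision decoding gives. The function names are invented for this note, and the comparison assumes your Python uses IEEE 754 single precision for the struct "f" format (which is the usual case).

```python
import math
import struct


def decode_subnormal(sign, fraction):
    """Value of the pattern sign / 00000000 / fraction: +/- 0.fraction * 2^-126."""
    if not 0 <= fraction < (1 << 23):
        raise ValueError("fraction must fit in 23 bits")
    # 0.fraction is fraction / 2^23, so the value is fraction * 2^(-23 - 126).
    magnitude = math.ldexp(fraction, -23 - 126)
    return -magnitude if sign else magnitude


def machine_value(sign, fraction):
    """Let the machine decode the same pattern (exponent field all 0's)."""
    bits = (sign << 31) | fraction
    return struct.unpack(">f", struct.pack(">I", bits))[0]


# 0 00000000 00001000000000000000000: the 1 is the 5th fraction bit,
# followed by 18 zeros, so the fraction field as an integer is 1 << 18.
first = decode_subnormal(0, 1 << 18)

# 1 00000000 00000000000000000000001: only the last fraction bit is set.
second = decode_subnormal(1, 1)

print(first == 2.0 ** -131, first == machine_value(0, 1 << 18))
print(second == -(2.0 ** -149), second == machine_value(1, 1))
```

The first example works out to 2^-5 * 2^-126 = 2^-131, and the second to -(2^-23 * 2^-126) = -2^-149.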

Note that if the 23 bits are 11111111111111111111111, then the number is
+/- 0.11111111111111111111111 * 2^-126

Which is a tad less than
+/- 1.00000000000000000000000 * 2^-126

which is the smallest normalized number.
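
How much is "a tad"? A quick sketch, again assuming IEEE 754 single precision behind the struct "f" format, and using a hypothetical helper bits_to_float:

```python
import struct


def bits_to_float(bits):
    """Reinterpret a 32-bit integer as a single-precision value."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]


# 0 00000000 11111111111111111111111  (largest subnormal)
largest_subnormal = bits_to_float(0x007FFFFF)

# 0 00000001 00000000000000000000000  (smallest normalized number)
smallest_normal = bits_to_float(0x00800000)

gap = smallest_normal - largest_subnormal

print(largest_subnormal < smallest_normal)
print(gap == 2.0 ** -149)
```

The two values are adjacent bit patterns, and the gap between them is 2^-23 * 2^-126 = 2^-149, which is exactly the value of the last fraction bit of a subnormal number.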

Note also that the number
+/- 0.00000000000000000000000 * 2^-126

is the number 0, and that it is represented by all 0's in both the exponent and fraction fields.
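
Since the sign bit is separate, the "+/-" means there are two patterns for 0. A minimal sketch (same assumption about IEEE 754 single precision, same hypothetical bits_to_float helper):

```python
import math
import struct


def bits_to_float(bits):
    """Reinterpret a 32-bit integer as a single-precision value."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]


plus_zero = bits_to_float(0x00000000)   # sign 0, exponent and fraction all 0's
minus_zero = bits_to_float(0x80000000)  # sign 1, exponent and fraction all 0's

# The two compare equal, but the sign bit is still there.
print(plus_zero == minus_zero)
print(math.copysign(1.0, minus_zero))
```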

I think that completes the story for now! ...unless someone wants more.
(Aren't you glad you asked!)

Yale Patt