9/9/04
During lecture, I did something I usually don't do: time was running out, I
wanted to get through the logical operators, and so I decided to leave out one
piece of the floating point number story, saving it for when I see you again in
360N, since it is not crucial to your understanding in 306. In fact, one of
my TAs congratulated me after class on having the self-control to not go over
this piece. He said, "I bet you wanted to show them that real bad." And then,
he added, "But if you had, you never would have gotten into the logical
operators."
He was right. BUT, I just got email from a student who clearly wants to know
the whole story. So, even though I will not force you all to know it, I will
try to explain it here, since some of you don't want to wait for 360N. For
those of you who can wait until 360N, enjoy the Arkansas game on Saturday.
Dr. Patt,

In the examples for floating point in the book, one has

    0 00000000 00001000000000000000000

and states that the leading 0 makes the number positive, the 0's for the
exponents make the exponent -126 and the last 23 bits gives
0.00001000000000000000000

Another contains

    1 00000000 00000000000000000000001

and states that the leading 1 makes the number negative, the 0's make the
exponent -126 and the last 23 bits yields 0.00000000000000000000001
Yup, all of this is correct.
I heard you state in class that we would not use floating point numbers which
contain all 0's for the exponent in this course. However, does having a zero
before the fraction bits (ex. 0.0000100...0) come from all eight exponent bits
being zero? Are there any other cases in which you would have zero before the
decimal point?

Thank you,
<< name withheld to protect the person who can't wait until 360N >>
So, let me try to say it again, since at least one student
did not get it (or I misspoke) the first time. What I thought I said (certainly
what I meant to say) was that we will only deal with "normalized numbers" in
306, and we will save other numbers for 360N and other courses.
The 32-bit patterns in the format we discussed in class can be lumped into
3 groups, according to what the 8-bit exponent field contains:

    (a) 11111111
    (b) 00000000
    (c) all the rest

Group (c) holds the normalized numbers, which have the form

    +/- 1.<<23 bits of fraction>> * 2^actual_exponent

where the actual_exponent can be represented by an excess-127 code in the
range of (c), that is, from 00000001 (+1) to 11111110 (+254). Thus, the
actual_exponent must be between -126 [-126 + 127 = +1] and +127 [127 + 127 = +254].
An actual_exponent smaller than -126 would produce an excess code too small to
be represented.
So that takes care of the normalized numbers.
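For anyone who wants to check the arithmetic, here is a minimal sketch of decoding a normalized number. It is written in Python only for illustration, and the function name decode_normalized is made up for this example. It assumes the bit layout described above: 1 sign bit, then the 8-bit exponent field, then the 23 fraction bits.

```python
def decode_normalized(bits):
    """Decode a 32-bit pattern (given as an int) that holds a normalized number."""
    sign = bits >> 31                 # leading bit: 0 is positive, 1 is negative
    exp_field = (bits >> 23) & 0xFF   # 8-bit excess-127 code
    fraction = bits & 0x7FFFFF        # 23 fraction bits
    if not 1 <= exp_field <= 254:
        # 00000000 and 11111111 are the two set-aside cases, not handled here
        raise ValueError("exponent field is not in the normalized range")
    actual_exponent = exp_field - 127          # undo the excess-127 code
    significand = 1 + fraction / 2**23         # the implied "1." plus the fraction
    return (-1) ** sign * significand * 2.0 ** actual_exponent


print(decode_normalized(0x3F800000))  # exponent field 01111111, so 1.0 * 2^0
print(decode_normalized(0xC0000000))  # sign 1, exponent field 10000000, so -1.0 * 2^1
print(decode_normalized(0x00800000) == 2.0 ** -126)  # exponent field 00000001
```

The last line uses the smallest exponent field in the normalized range, 00000001, which corresponds to an actual_exponent of -126.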
BUT what about the two cases in which the exponent field contains (a) 11111111
and (b) 00000000? The case where the exponent field contains 11111111 is set
aside to represent unusual values, like +/- infinity, sqrt(-1), 0/0, etc.
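The grouping by exponent field can be sketched in a few lines. This is only an illustration in Python; the helper names exponent_group and as_float are hypothetical. It assumes the 32-bit format is IEEE 754 single precision, which is what Python's struct module uses for the "f" code. The specific patterns chosen for infinity and for a not-a-number result come from that standard, not from the discussion above.

```python
import math
import struct


def exponent_group(bits):
    """Return which of the three groups a 32-bit pattern falls in."""
    exp_field = (bits >> 23) & 0xFF
    if exp_field == 0xFF:
        return "a"   # 11111111: set aside for unusual values
    if exp_field == 0x00:
        return "b"   # 00000000: the other set-aside case
    return "c"       # all the rest: normalized numbers


def as_float(bits):
    """Reinterpret a 32-bit pattern as a single-precision value."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]


# Assumed IEEE 754 patterns with exponent field 11111111:
print(exponent_group(0x7F800000), as_float(0x7F800000))  # fraction all 0's: +infinity
print(exponent_group(0xFF800000), as_float(0xFF800000))  # same with sign bit 1: -infinity
print(exponent_group(0x7FC00000), as_float(0x7FC00000))  # nonzero fraction: not a number
```

Results such as sqrt(-1) and 0/0 land in the last category, which Python prints as nan.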
The case where the exponent field contains 00000000 is set aside to
represent very tiny numbers that are so small they cannot be normalized. Recall
that the way you normalize a tiny number is to move the binary point to the
right past all those 0's until you have a "1" to the left of it. If you have to
move it n digits, you must compensate for that by subtracting n from the exponent
of the value. If the number is tiny enough, then subtracting n will make the
exponent less than -126, which will be too small to represent as an 8-bit
excess-127 code.
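To see this happen with a concrete value, here is a small sketch in Python (chosen only for illustration). The helper names normalized_exponent and fields are hypothetical, and the sketch assumes the 32-bit format is IEEE 754 single precision, which is what struct's "f" code produces. It takes the value 2^-130, works out the exponent it would need in normalized form, and then looks at the fields actually stored.

```python
import math
import struct


def normalized_exponent(x):
    """Exponent x would need when written in the form 1.xxx * 2^exponent."""
    # math.frexp returns (m, e) with x == m * 2**e and 0.5 <= |m| < 1,
    # so moving the binary point one more place gives exponent e - 1.
    _, e = math.frexp(x)
    return e - 1


def fields(x):
    """Return (sign bit, exponent field, fraction field) of x stored in 32 bits."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF


tiny = 2.0 ** -130
print(normalized_exponent(tiny))   # -130, which is less than -126
print(fields(tiny))                # (0, 0, 524288): exponent field is all 0's
print(fields(2.0 ** -126))         # (0, 1, 0): still normalized
```

The value 2^-126 still fits with exponent field 00000001, while 2^-130 comes back with the exponent field 00000000.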
Thus if the number is sufficiently small that we cannot normalize it, we
call it a subnormal number. Its value has the not-normalized form:

    +/- 0.<<23 bits of fraction>> * 2^-126

Note that the 0 to the left of the binary point (where a normalized number
has a 1) means the number is not normalized.

Such a number is represented as

    +/- 00000000 <<23 bits of fraction>>

that is, a sign bit, an exponent field of 00000000, and the 23 fraction bits.

Note that if the 23 bits are 11111111111111111111111, then the number is

    +/- 0.11111111111111111111111 * 2^-126

which is a tad less than

    +/- 1.00000000000000000000000 * 2^-126

the smallest normalized number.

Note also that the number

    +/- 0.00000000000000000000000 * 2^-126

is the number 0, and that it is represented by all 0's in both exponent and
fraction fields.

I think that completes the story for now! ...unless someone wants more.

Yale Patt
(Aren't you glad you asked!)