Wed, 16 November 2016, 02:14



A student writes:


> There seems to be a difference between what a normalized floating point
> number in the page 4 of the notes (using the 6 bit example) is and what a
> normalized floating point number in the IEEE.
>
> In the floating point notes, normalized starts with a 0, meaning we have
> 0.101 for fraction = 10
>
> but in IEEE 32 bit format we have that normalized starts with a 1, which
> in a 6 bit example means that we have 1.01 for fraction = 10.
>
> <<name withheld to protect the student who sees an apparent inconsistency>>


Actually, let me first apologize for not making clear that in class I only
dealt with the IEEE standard for floating point numbers, which starts on
page 8, so the representation on page 10 is what you are used to.

I decided to include pages 1 through 7 in the handout because they give some
general information on floating point.  Page 4 shows the typical
representation of numbers BEFORE the IEEE standard was adopted.  I did
mention one of the significant differences between the old way and the IEEE
standard in class, but only briefly.

The old way only allowed normalized numbers to be represented exactly.
Subnormal numbers were represented as 0.  That is, there was no such thing
as 0.xxxxxx...x * 2^(1-BIAS), so if a number was too small to be represented
as a normalized number, we just represented that number as 0.
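A minimal sketch of that difference, using Python floats (which are IEEE
64-bit numbers on essentially all current machines).  The function old_way
is a hypothetical stand-in for the pre-IEEE behavior, not anything taken from
the notes:

```python
import sys

# Smallest positive normalized 64-bit number, 1.0 * 2^(-1022).
smallest_normal = sys.float_info.min

def old_way(x):
    """Hypothetical pre-IEEE behavior: any value too small to be
    normalized is represented as 0."""
    return x if abs(x) >= smallest_normal else 0.0

# 2^(-1024) is too small to be normalized, so under IEEE it is stored
# as a subnormal number of the form 0.xxx... * 2^(-1022).
x = smallest_normal / 4

ieee_value = x           # IEEE keeps a nonzero subnormal value
old_value = old_way(x)   # the old way represents it as 0
```

Here ieee_value is still greater than 0, while old_value is exactly 0.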

There is a picture of the real line on page 5 that shows which numbers can be
represented exactly if we do not allow subnormal numbers to be represented
by anything other than 0.  You can see from that picture, and you may recall
that we agreed in class, that this makes the error due to underflow
(representing a number as 0) 2^52 times greater than the error due to
inexact (rounding to the nearest representable number) for 64-bit floating
point numbers.  For our 6-bit floating point numbers, it makes the error due
to underflow 4 times greater than the error due to inexact.  For that reason,
the IEEE floating point standard introduced the representation of subnormal
numbers, which for 64-bit numbers have the form 0.xxxxx... * 2^(-1022),
making the error due to underflow the same as the error due to inexact.  It
did not take too many years for most companies to move from the old way to
the IEEE standard.
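The factors 2^52 and 4 can be checked with a short sketch.  It compares the
gap between 0 and the smallest normalized number with the spacing of the
normalized numbers just above it.  The name gap_ratio and the value e_min = -2
for the 6-bit format are assumptions made only for illustration; the 2
fraction bits come from the "fraction = 10" in the question, and the ratio
does not depend on e_min at all.

```python
import sys
from fractions import Fraction

def gap_ratio(fraction_bits, e_min):
    """Without subnormals: (gap from 0 to the smallest normalized number)
    divided by (spacing between adjacent normalized numbers just above it)."""
    smallest_normal = Fraction(2) ** e_min             # 1.00...0 * 2^e_min
    spacing = smallest_normal / 2 ** fraction_bits     # one unit in the last place
    return smallest_normal / spacing

ratio_6bit = gap_ratio(2, -2)         # 2 fraction bits; e_min is an assumed value
ratio_64bit = gap_ratio(52, -1022)    # 52 fraction bits, as in IEEE 64-bit

# The same check against the machine's actual 64-bit numbers.
smallest_normal = sys.float_info.min                       # 2^(-1022)
spacing_above = smallest_normal * sys.float_info.epsilon   # 2^(-1074)
measured_ratio = smallest_normal / spacing_above

# With subnormals, that same spacing continues all the way down to 0:
# the smallest positive subnormal is exactly one spacing above 0.
smallest_subnormal = spacing_above
```

Under these assumptions ratio_6bit is 4, and both ratio_64bit and
measured_ratio are 2^52.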

Hope that helps.

Good luck on the exam.


Yale Patt