Thu, 1 Sep 2011, 04:23
To my students in EE306: I felt we were a little rushed at the end of class today, so I wanted to use this email to go over more slowly what we covered toward the end - the floating point data type.

First, recall that a data type is a representation of information such that we have instructions in the ISA that operate on that representation. We will see when we get to the LC-3 that there is an ADD instruction that adds two 16-bit 2's-complement integers, producing a 16-bit integer result. The integers we work with (16 bits each) range from -2^15 to +2^15 - 1.

Sometimes it is useful to be able to represent much larger numbers and much tinier numbers than we can represent with 2's-complement integers, AND we are willing to sacrifice some of the bits that specify the digits and use those bits to specify the size instead. Most of you are familiar with that representation from high school chemistry, for example, when you represented such values as 6.023 x 10^23. You were willing to express the value with only four significant digits in order to have some digits available for representing the exponent, in this case the exponent 23.

Most computers have in their ISA the floating point data type, which satisfies this need of sacrificing some bits of significance in order to express much larger and much tinier values. These ISAs have, for example, an FADD (floating point add) instruction which operates on values expressed according to this second data type.

In class, we discussed the simplest of these floating point data types, a 32-bit representation where the high order bit (bit 31) conveys the sign of the value (0 is positive, 1 is negative), the next eight bits (bits[30:23]) are used for the exponent, and the remaining 23 bits are used for the significant bits of the value. For example, the value +9 5/8, which in binary is +1001.101, can be written in normalized form as +1.001101 x 2^3.
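For those of you who like to check things by machine: the sketch below (Python, using only the standard struct module; the variable names are mine, chosen for illustration) packs +9 5/8 as a 32-bit floating point value, pulls out the three fields, and then reconstructs the value from them.

```python
import struct

# Pack +9.625 (i.e. +9 5/8) into the 32-bit floating point representation
# and view the result as a 32-bit unsigned integer.
bits = int.from_bytes(struct.pack('>f', 9.625), 'big')

sign     = bits >> 31            # bit 31: 0 means positive
exponent = (bits >> 23) & 0xFF   # bits[30:23]: the exponent plus the excess 127
fraction = bits & 0x7FFFFF       # bits[22:0]: significant bits after the leading 1

print(sign)                      # 0
print(exponent)                  # 130, i.e. the exponent 3 plus the excess 127
print(format(fraction, '023b'))  # 00110100000000000000000

# Reconstruct: put the stored bits after "1." and multiply by 2^(exponent - 127).
value = (-1) ** sign * (1 + fraction / 2**23) * 2 ** (exponent - 127)
print(value)                     # 9.625
```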
As a floating point data type it would be represented (ALWAYS normalized when POSSIBLE) as

0 10000010 00110100000000000000000

Note bit 31 is 0 because the value is positive. Note bits[30:23] have the value 130, which is the exponent 3 plus the excess 127. Note bits[22:0] have the significant binary digits after we remove the leading 1 (since we know the leading bit of a *normalized* number has to be a 1).

We say we represent values in normalized form whenever we can. In fact, we specify that if bits[30:23] contain any value between 00000001 and 11111110, the entire 32 bits represent a value in normalized form. That value can be reconstructed by noting whether bit[31] is a 0 or 1 and specifying the sign accordingly, then taking bits[22:0] and putting them after "1." to reflect the leading bit that was not stored, and finally multiplying by 2^k, where k is the value obtained by subtracting 127 from the unsigned value stored in bits[30:23].

We still have to deal with those representations that contain 11111111 or 00000000 in bits[30:23]. In the case of 11111111, we use this to designate infinity:

0 11111111 00000000000000000000000 = + infinity
1 11111111 00000000000000000000000 = - infinity

In the case of 00000000, it is a little trickier, and in fact, the more I think of it, the more I think it is beyond what should be expected of you in EE 306. So, in the interest of focusing on other more central issues in EE 306, I will not require you to understand in EE 306 how 00000000 in bits[30:23] should be interpreted. I will save that for EE 460N. However, because I fully realize that some of you will not be satisfied without the whole story, I will include it below, EVEN THOUGH you will not see it again on an exam or a program or a problem set.

XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

With that caveat, I continue: We first note that the smallest normalized positive value is 2^(-126), which is represented as

0 00000001 00000000000000000000000
How did we get that? The significant bits are obtained by starting with "1." and appending the 23 0s in bits[22:0]. We multiply that by 2^k, where k is -126 (obtained by subtracting the excess 127 from the value 1 stored in bits[30:23]). The result is 1.00000000000000000000000 times 2^(-126), which is 2^(-126).

Professor Kahan, the principal architect of this floating point data type, did not allow bits[30:23] = 00000000 to be interpreted the same way, because if he did, then 0 00000000 00000000000000000000000 would be interpreted as 2^(-127), which would have denied him the ability to represent the most important of all values - zero! So, he decided to use 00000000 in a different way.

Professor W. Kahan noted more than 30 years ago two things:

1. that the interval between 0 and 2^(-126) is equal to the interval between 2^(-126) and 2^(-125). Do you see why that is the case?

2. that there are 2^23 - 1 distinct normalized values that can be represented exactly between 2^(-126) and 2^(-125). Do you see why that is the case?

He then concluded that if he changed the "1." in front of the 23 bits taken from bits[22:0] to "0.", he would have 2^23 - 1 distinct values between 0 and 2^(-126), ALL OF WHICH are too tiny to be normalized. He called these the *subnormal* numbers and identified them with 00000000 in bits[30:23].

The result: If bits[30:23] contains 00000000, the value represented is formed as "0." followed by bits[22:0], times 2^(-126). For example, the number represented as

0 00000000 00010001000100010001001

is:

+ 0.00010001000100010001001 x 2^(-126)

Finally, how do we represent the most important number: zero?? Easy:

0 00000000 00000000000000000000000 (All zeros!)

And now (for those who would not have been able to sleep without it) you have the whole story! In EE 306, we will only expect you to understand up to the line XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX above.

Hope this helps. Feel free to ask the TAs or me anything here that is still confusing. Enjoy the discussion section on Friday.
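And for the truly curious, the subnormal rule can also be checked by machine. This sketch (Python; decode_subnormal is a hypothetical helper name of my own, not anything from class) decodes a pattern with 00000000 in bits[30:23] as "0." followed by bits[22:0], times 2^(-126), and confirms it agrees with how the hardware itself interprets the same 32 bits.

```python
import struct

def decode_subnormal(bits):
    # Hypothetical helper for illustration.  Assumes bits[30:23] == 00000000:
    # the value is "0." followed by bits[22:0], times 2^(-126).
    sign = -1 if (bits >> 31) else 1
    fraction = bits & 0x7FFFFF
    return sign * (fraction / 2**23) * 2 ** -126

# The example from above: 0 00000000 00010001000100010001001
pattern = 0b0_00000000_00010001000100010001001

# The hand computation agrees with the hardware's interpretation of the bits.
assert decode_subnormal(pattern) == struct.unpack('>f', pattern.to_bytes(4, 'big'))[0]

# And the all-zeros pattern gives the most important number of all: zero.
print(decode_subnormal(0))  # 0.0
```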
Have a great Labor Day. Don't forget to turn in your student information sheet with your first problem set next Wednesday. I will see you in class next Wednesday. Be safe this long weekend.

Yale Patt