Suppose a 32-bit unsigned int is given. How can I determine whether it can be represented exactly as a float? I have worked out the case of numbers up to 23 bits long: they fit entirely into the 23-bit mantissa. But some numbers longer than 23 bits can also be represented exactly. How can I identify them?
1 answer
The answer turned up on Wikipedia:
- Positive numbers up to 2²⁴ inclusive are represented exactly.
- Positive numbers from 2²⁴ + 1 to 2²⁵ are rounded to a multiple of 2
- Positive numbers from 2²⁵ + 1 to 2²⁶ are rounded to a multiple of 4
- ...
- Positive numbers from 2¹²⁶ + 1 to 2¹²⁷ are rounded to a multiple of 2¹⁰³
- Positive numbers from 2¹²⁷ + 1 to 2¹²⁸ - 2¹⁰⁴ are rounded to a multiple of 2¹²⁷⁻²³
- Positive numbers from 2¹²⁸ turn into infinity.
Similarly for negative numbers.
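For a 32-bit unsigned value, the list above boils down to one rule: a number in the range from 2ᵏ + 1 to 2ᵏ⁺¹ (with k ≥ 24) is rounded to a multiple of 2ᵏ⁻²³. A minimal sketch of that rule, assuming an IEEE 754 single-precision `float`; the helper name `float_spacing` is made up for illustration:

```c
#include <stdint.h>

/* Hypothetical helper: the multiple to which n is rounded when converted
   to float, following the ranges listed above. Assumes IEEE 754 binary32. */
uint32_t float_spacing(uint32_t n)
{
    if (n <= (UINT32_C(1) << 24))
        return 1;                       /* everything up to 2^24 is exact */

    /* For n in (2^k, 2^(k+1)], the highest set bit of n - 1 is bit k. */
    uint32_t m = n - 1;
    int k = 31;
    while (((m >> k) & 1u) == 0)
        k--;

    return UINT32_C(1) << (k - 23);     /* multiple of 2^(k-23) */
}
```

With this helper, `n` is exactly representable when `n % float_spacing(n) == 0`. For example, the largest 32-bit values are rounded to multiples of 256.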
Rounding is to the nearest multiple. For ties, banker's rounding (round half to even) is used: 2²⁴ + 1 is rounded down, 2²⁴ + 3 up, 2²⁴ + 5 down again, and so on.
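The tie cases can be checked directly. The sketch below assumes an implementation where integer-to-float conversion follows the default IEEE 754 round-to-nearest-even mode (the C standard itself leaves the direction implementation-defined); `through_float` is a hypothetical helper name:

```c
#include <stdint.h>

/* Hypothetical helper: convert n to float and report the value it became.
   The result is returned as a 64-bit integer because rounding can
   produce 2^32, which does not fit in uint32_t. */
uint64_t through_float(uint32_t n)
{
    /* volatile forces the value to be stored as a real 32-bit float,
       so extended-precision registers cannot hide the rounding. */
    volatile float f = (float)n;
    return (uint64_t)f;
}
```

Under that assumption, `through_float(16777217)` (2²⁴ + 1) gives 16777216, while `through_float(16777219)` and `through_float(16777221)` both give 16777220.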
Regarding the missing segment from 2¹²⁸ - 2¹⁰⁴ to 2¹²⁸ - 1: in my experiment, numbers from 2¹²⁸ - 2¹⁰⁴ to 2¹²⁸ - 2¹⁰³ - 1 gave the result 2¹²⁸ - 2¹⁰⁴ when converted to float (that is, they were rounded down to a multiple of 2¹²⁷⁻²³), and larger ones were rounded up (i.e. to infinity).
I do not know whether the IEEE 754 standard guarantees this behavior, but it seems logical, since it matches the behavior for smaller numbers.
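An experiment of this kind can be reproduced by narrowing a `double` to `float`, since both 2¹²⁸ - 2¹⁰³ and its neighbours are exactly representable as `double`. This is only a sketch: it assumes an implementation that follows IEEE 754 (C Annex F), because plain ISO C leaves out-of-range narrowing undefined, and `overflows_to_infinity` is a hypothetical name. The observed threshold agrees with the overflow rule IEEE 754 gives for round-to-nearest, where magnitudes of at least 2¹²⁸ - 2¹⁰³ round to infinity.

```c
#include <float.h>

/* Hypothetical helper: returns 1 if d becomes +infinity when narrowed
   to float, 0 otherwise. Assumes IEEE 754 conversion semantics. */
int overflows_to_infinity(double d)
{
    volatile float f = (float)d;    /* force a real 32-bit float */
    return f > FLT_MAX;             /* only +infinity exceeds FLT_MAX */
}

/* FLT_MAX is 2^128 - 2^104; the midpoint to 2^128 is 2^128 - 2^103. */
static const double midpoint       = 0x1p128 - 0x1p103;
static const double below_midpoint = 0x1p128 - 0x1p103 - 0x1p76;
```

On such an implementation, `overflows_to_infinity(below_midpoint)` is 0 (the value rounds down to `FLT_MAX`), while `overflows_to_infinity(midpoint)` is 1.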
- "Positive numbers from `2^128` turn into infinity." Shouldn't that be `2^128 - 2^104 + 1`? Otherwise it is not clear what happens to numbers in the range from `2^128 - 2^104 + 1` to `2^128 - 1`. - PetSerAl
- @PetSerAl: I copied the text from Wikipedia; I am now checking what happens to the other numbers. - VladD
- @PetSerAl: I added a supplement to the answer. - VladD
- `i == (unsigned int)(float)i` - PetSerAl
- `(unsigned int)f` if `f` is `2^32`. In other cases, the conversion to `unsigned int` should not have an error. - PetSerAl
- `(i / (i ^ (i & (i - 1)))) < (1 << 23)` @VladD Here it is without converting to `float`. - PetSerAl
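The two checks suggested in the comments can be sketched as follows. Both function names are hypothetical. The round-trip version adds a guard for the case where the float equals 2³², because casting that value back to a 32-bit unsigned type is not defined. The integer-only version compares against `1 << 24` rather than `1 << 23`, on the assumption that `float` is IEEE 754 binary32 with 24 significant bits (23 stored plus the implicit leading one), and it treats zero separately to avoid dividing by zero.

```c
#include <stdint.h>

/* Round-trip check: i is exact if converting to float and back
   returns the same value. Values near UINT32_MAX can round up to
   2^32, which must not be cast back to uint32_t. */
int exact_by_round_trip(uint32_t i)
{
    volatile float f = (float)i;            /* force a real 32-bit float */
    return f < 4294967296.0f && (uint32_t)f == i;
}

/* Integer-only check: strip the trailing zero bits; what remains
   must fit in the 24 significant bits of a float. */
int exact_by_bits(uint32_t i)
{
    if (i == 0)
        return 1;                           /* zero is always exact */
    uint32_t lowest = i ^ (i & (i - 1));    /* lowest set bit of i */
    return (i / lowest) < (UINT32_C(1) << 24);
}
```

Under these assumptions the two functions agree: for instance, both accept 16777218 and 0xFFFFFF00 and both reject 16777217 and 0xFFFFFFFF.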