## digitalmars.D.learn - float.max + 1.0 does not overflow

• rumbu (8/8) Dec 27 2017 Is that normal?
• Benjamin Thaut (23/31) Dec 27 2017 This is actually correct floating point behavior. Consider the
• Dave Jones (12/25) Dec 28 2017 The float with the lower exponent would have to be shifted to
```Is that normal?

import std.math;
float f = float.max;
f += 1.0;
assert(ieeeFlags.overflow); // fails
assert(f == float.infinity); // fails, f is in fact float.max

By contrast, float.max + float.max will overflow. The
behavior is the same for double and real.
```
Dec 27 2017
```On Wednesday, 27 December 2017 at 13:40:28 UTC, rumbu wrote:
Is that normal?

import std.math;
float f = float.max;
f += 1.0;
assert(ieeeFlags.overflow); // fails
assert(f == float.infinity); // fails, f is in fact float.max

By contrast, float.max + float.max will overflow. The
behavior is the same for double and real.

This is actually correct floating point behavior. Consider the
following program:

float nextRepresentableToMax = float.max;
// find the next smaller representable floating point number
// (decrementing the bit pattern works because the value is positive)
(*cast(int*)&nextRepresentableToMax)--;
writefln("%f", float.max - nextRepresentableToMax);

It computes the difference between float.max and the next smaller
representable number in floating point. The difference printed
by the program is:
20282409603651670423947251286016.0

As you might notice, this is significantly bigger than 1.0.
Floating point operations work like this: they perform the
operation and then round to the nearest representable number in
floating point. So adding 1.0 to float.max and then rounding to
the nearest representable number will just give you back
float.max. If, however, you add float.max and float.max, the
result rounds to float.infinity.

When trying to understand how floating point works I would highly
recommend that you read these articles (oldest first):
https://randomascii.wordpress.com/category/floating-point/

Kind Regards
Benjamin Thaut
```
Dec 27 2017
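The rounding argument in the reply above can be checked with a short D sketch. It assumes `std.math.nextDown` as a stand-in for the pointer cast, and it assumes that a sum stored in a `float` variable is rounded to float precision. The names `gapBelowMax` and `addAsFloat` are hypothetical helpers, not part of the original posts.

```d
import std.math : nextDown;
import std.stdio : writefln;

/// Hypothetical helper: distance from float.max to the next smaller float.
float gapBelowMax()
{
    return float.max - nextDown(float.max);
}

/// Hypothetical helper: adds two floats and stores the sum in a float,
/// so the result is rounded to float precision (assumption stated above).
float addAsFloat(float a, float b)
{
    float sum = a + b;
    return sum;
}

void main()
{
    // The gap is 2^104, far larger than 1.0.
    writefln("%f", gapBelowMax());

    // 1.0 is less than half the gap, so the sum rounds back to float.max.
    assert(addAsFloat(float.max, 1.0f) == float.max);

    // Doubling float.max exceeds the float range and rounds to infinity.
    assert(addAsFloat(float.max, float.max) == float.infinity);
}
```

Anything smaller than half the gap (2^103) should disappear in the same way when added to float.max.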
```On Wednesday, 27 December 2017 at 14:14:42 UTC, Benjamin Thaut
wrote:
On Wednesday, 27 December 2017 at 13:40:28 UTC, rumbu wrote:
Is that normal?

It computes the difference between float.max and the next
smaller representable number in floating point. The difference
printed by the program is:
20282409603651670423947251286016.0

As you might notice, this is significantly bigger than 1.0.
Floating point operations work like this: they perform the
operation and then round to the nearest representable number in
floating point. So adding 1.0 to float.max and then rounding to
the nearest representable number will just give you back
float.max. If, however, you add float.max and float.max, the
result rounds to float.infinity.

The float with the lower exponent has to be shifted to match
the higher one, which means 1.0 would be shifted about 127 bits
to the right before the addition can be done (float.max has a
binary exponent of 127, 1.0 has 0). If you shift right by more
bits than the mantissa holds, the value gets rounded to zero.
Hence once the two values are lined up to do the actual op it
becomes float.max + 0.0.

That said, I suspect the OP was expecting the FPU to catch
that, in theory, it should overflow: not that the actual op
would overflow, but that the FPU would be checking the values
on input. Maybe.
```
Dec 28 2017
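The alignment argument and the flag question can both be probed with a small D sketch. It assumes `std.math.ilogb` to read the binary exponents, and `resetIeeeFlags` / `ieeeFlags` to inspect the status flags on a platform where Phobos supports them. `exponentGap` is a hypothetical helper name. The compiler is free to fold the addition at compile time, in which case the printed flags may not reflect a run-time operation.

```d
import std.math : ieeeFlags, ilogb, resetIeeeFlags;
import std.stdio : writeln;

/// Hypothetical helper: how many bit positions the smaller operand is
/// shifted right when the two exponents are aligned.
int exponentGap(float big, float small)
{
    return ilogb(big) - ilogb(small);
}

void main()
{
    // float.max has binary exponent 127, 1.0 has binary exponent 0.
    assert(exponentGap(float.max, 1.0f) == 127);

    // A float has only 24 significand bits, so 1.0 falls off the end.
    assert(exponentGap(float.max, 1.0f) > float.mant_dig);

    float f = float.max;
    float one = 1.0f;

    resetIeeeFlags();
    f += one;
    auto flags = ieeeFlags; // snapshot taken right after the addition

    writeln("overflow: ", flags.overflow, ", inexact: ", flags.inexact);
}
```

On IEEE 754 hardware the expectation is that this addition raises only the inexact flag, since the result was rounded but stayed in range. Overflow is raised when the rounded result no longer fits, as with float.max + float.max.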