## digitalmars.D - integral to floating point conversion

• Andrei Alexandrescu (3/3) Jul 02 2016 So what's the fastest way to figure that an integral is convertible to a...
• Walter Bright (3/6) Jul 02 2016 Test that its absolute value is <= the largest unsigned value represente...
• Andrei Alexandrescu (3/10) Jul 02 2016 What is the largest unsigned value represented by the float's mantissa
• MakersF (3/17) Jul 02 2016 Isn't it 2^24-1?
• Walter Bright (3/14) Jul 02 2016 https://en.wikipedia.org/wiki/Single-precision_floating-point_format
• Simen Kjaeraas (7/21) Jul 02 2016 2uL^^float.mant_dig
• Matthias Bentrup (5/13) Jul 03 2016 That has to be '<' given the condition that no other integer
• ketmar (11/15) Jul 02 2016 bool isConvertible(T) (long n) if (is(T == float) || is(T ==
• Ola Fosheim Grøstad (11/15) Jul 03 2016 If it is within what the mantissa can represent then it is easy.
```So what's the fastest way to figure that an integral is convertible to a
floating point value precisely (i.e. no other integral converts to the
same floating point value)? Thanks! -- Andrei
```
Jul 02 2016
```On 7/2/2016 1:17 PM, Andrei Alexandrescu wrote:
So what's the fastest way to figure that an integral is convertible to a
floating point value precisely (i.e. no other integral converts to the same
floating point value)? Thanks! -- Andrei

Test that its absolute value is <= the largest unsigned value represented by
the float's mantissa bits.
```
Jul 02 2016
```On 7/2/16 4:30 PM, Walter Bright wrote:
On 7/2/2016 1:17 PM, Andrei Alexandrescu wrote:
So what's the fastest way to figure that an integral is convertible to a
floating point value precisely (i.e. no other integral converts to the
same floating point value)? Thanks! -- Andrei

Test that its absolute value is <= the largest unsigned value
represented by the float's mantissa bits.

What is the largest unsigned value represented by the float's mantissa
bits? I recall there is a constant somewhere for it. -- Andrei
```
Jul 02 2016
```On Saturday, 2 July 2016 at 20:49:27 UTC, Andrei Alexandrescu
wrote:
On 7/2/16 4:30 PM, Walter Bright wrote:
On 7/2/2016 1:17 PM, Andrei Alexandrescu wrote:
So what's the fastest way to figure that an integral is
convertible to a floating point value precisely (i.e. no other
integral converts to the same floating point value)? Thanks! -- Andrei

Test that its absolute value is <= the largest unsigned value
represented by the float's mantissa bits.

What is the largest unsigned value represented by the float's
mantissa bits? I recall there is a constant somewhere for it.
-- Andrei

Isn't it 2^24-1?
```
Jul 02 2016
```On 7/2/2016 1:49 PM, Andrei Alexandrescu wrote:
On 7/2/16 4:30 PM, Walter Bright wrote:
On 7/2/2016 1:17 PM, Andrei Alexandrescu wrote:
So what's the fastest way to figure that an integral is convertible to a
floating point value precisely (i.e. no other integral converts to the
same floating point value)? Thanks! -- Andrei

Test that its absolute value is <= the largest unsigned value
represented by the float's mantissa bits.

What is the largest unsigned value represented by the float's mantissa bits? I
recall there is a constant somewhere for it. -- Andrei

https://en.wikipedia.org/wiki/Single-precision_floating-point_format

24 bits. So it can store +-0x00FF_FFFF
```
Jul 02 2016    Simen Kjaeraas <simen.kjaras gmail.com> writes:
```On Saturday, 2 July 2016 at 20:49:27 UTC, Andrei Alexandrescu
wrote:
On 7/2/16 4:30 PM, Walter Bright wrote:
On 7/2/2016 1:17 PM, Andrei Alexandrescu wrote:
So what's the fastest way to figure that an integral is
convertible to a floating point value precisely (i.e. no other
integral converts to the same floating point value)? Thanks! -- Andrei

Test that its absolute value is <= the largest unsigned value
represented by the float's mantissa bits.

What is the largest unsigned value represented by the float's
mantissa bits? I recall there is a constant somewhere for it.
-- Andrei

2uL^^float.mant_dig

And the same for double. For real (on x86), mant_dig is 64, so it
can represent any ulong precisely.

--
Simen
```
Jul 02 2016    Matthias Bentrup <matthias.bentrup googlemail.com> writes:
```On Saturday, 2 July 2016 at 20:30:03 UTC, Walter Bright wrote:
On 7/2/2016 1:17 PM, Andrei Alexandrescu wrote:
So what's the fastest way to figure that an integral is
convertible to a floating point value precisely (i.e. no other
integral converts to the same floating point value)? Thanks! -- Andrei

Test that its absolute value is <= the largest unsigned value
represented by the float's mantissa bits.

That has to be '<' given the condition that no other integer
converts to the same value. Although 2^n can be represented exactly,
2^n+1 would be converted to the same float value.
```
Jul 03 2016
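The boundary case described in the post above can be checked directly. The following is a minimal sketch, assuming a target where converting to `float` really rounds to 24 bits of precision with round-to-nearest-even (for example x86-64 with SSE). The names `toFloat` and `collides` are hypothetical helpers introduced for illustration and are not part of Phobos.

```d
// Hypothetical helper: route the conversion through a float variable.
float toFloat(long n)
{
    float f = cast(float) n;
    return f;
}

// True if a and b are distinct integers that land on the same float value.
bool collides(long a, long b)
{
    return a != b && toFloat(a) == toFloat(b);
}

unittest
{
    enum long p = 1L << float.mant_dig;  // 2^24 = 16_777_216

    assert(collides(p, p + 1));          // 2^24 + 1 rounds to 2^24
    assert(!collides(p - 1, p));         // 2^24 - 1 and 2^24 stay distinct
    assert(!collides(p - 2, p - 1));     // neighbours below 2^24 stay distinct
}
```

Under these assumptions 2^24 is exactly representable but not unique, which is why the bound needs a strict '<'.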
```On Saturday, 2 July 2016 at 20:17:59 UTC, Andrei Alexandrescu
wrote:
So what's the fastest way to figure that an integral is
convertible to a floating point value precisely (i.e. no other
integral converts to the same floating point value)? Thanks! --
Andrei

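bool isConvertible(T) (long n) if (is(T == float) || is(T == double)) {
    pragma(inline, true);
    // (n+(n>>63))^(n>>63) is a branch-free abs(n); the mask then checks
    // that no bit at or above the mantissa width is set.
    static if (is(T == float)) {
        return (((n+(n>>63))^(n>>63))&0xffffffffff000000UL) == 0; // |n| < 2^24
    } else {
        return (((n+(n>>63))^(n>>63))&0xffe0000000000000UL) == 0; // |n| < 2^53
    }
}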
```
Jul 02 2016
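The hard-coded masks in the post above correspond to `float.mant_dig` (24) and `double.mant_dig` (53). A possible variant, sketched here under the assumption that only `float` and `double` are needed, derives the bound from `T.mant_dig` instead. The name `fitsMantissa` is hypothetical and does not come from the thread.

```d
// Sketch: same absolute-value trick as the post above, with the bound
// taken from T.mant_dig instead of a hard-coded mask.
bool fitsMantissa(T)(long n)
if (is(T == float) || is(T == double))
{
    immutable long sign = n >> 63;                        // 0 or -1 (arithmetic shift)
    immutable ulong mag = cast(ulong)((n + sign) ^ sign); // |n|; long.min maps to 2^63
    return (mag >> T.mant_dig) == 0;                      // true iff |n| < 2^mant_dig
}

unittest
{
    assert( fitsMantissa!float((1L << 24) - 1));
    assert(!fitsMantissa!float(1L << 24));
    assert( fitsMantissa!double(-((1L << 53) - 1)));
    assert(!fitsMantissa!double(long.min));
}
```

The strict bound (`|n| < 2^mant_dig`) matches the masks in the original code, so the two versions should accept the same inputs.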
```On Saturday, 2 July 2016 at 20:17:59 UTC, Andrei Alexandrescu
wrote:
So what's the fastest way to figure that an integral is
convertible to a floating point value precisely (i.e. no other
integral converts to the same floating point value)? Thanks! --
Andrei

If it is within what the mantissa can represent then it is easy.
But you also have to consider the cases where the mantissa is
shifted.

n is an unsigned 64 bit integer

mbits = representation bits for mantissa +1

tz = trailing_zero_bits(n)

lz = leading_zero_bits(n)

assert(mbits >= (64 - tz - lz))
```
Jul 03 2016
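The bit-counting idea in the post above can be sketched in D with `bsf` and `bsr` from `core.bitop`. This is an illustration only: the name `isExactlyRepresentable` is hypothetical, and the span is computed as `bsr(n) - bsf(n) + 1`, which equals `64 - tz - lz`. Note that this tests whether `n` converts without loss, which is a weaker property than "no other integral converts to the same value".

```d
import core.bitop : bsf, bsr;

// Sketch: n converts to T without loss if the span from its highest to
// its lowest set bit fits in T.mant_dig bits.
bool isExactlyRepresentable(T)(ulong n)
if (is(T == float) || is(T == double))
{
    if (n == 0)
        return true;                           // bsf/bsr are undefined for 0
    immutable int width = bsr(n) - bsf(n) + 1; // same as 64 - tz - lz
    return width <= T.mant_dig;                // mant_dig includes the implicit bit
}

unittest
{
    assert( isExactlyRepresentable!float(1UL << 40));        // one bit, shifted
    assert( isExactlyRepresentable!float(1UL << 24));        // exact, though not unique
    assert(!isExactlyRepresentable!float((1UL << 24) + 1));  // needs 25 bits
}
```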
```On Sunday, 3 July 2016 at 09:08:14 UTC, Ola Fosheim Grøstad wrote:
On Saturday, 2 July 2016 at 20:17:59 UTC, Andrei Alexandrescu
wrote:
So what's the fastest way to figure that an integral is
convertible to a floating point value precisely (i.e. no other
integral converts to the same floating point value)? Thanks!
-- Andrei

If it is within what the mantissa can represent then it is
easy. But you also have to consider the cases where the
mantissa is shifted.

n is an unsigned 64 bit integer

mbits = representation bits for mantissa +1

tz = trailing_zero_bits(n)

lz = leading_zero_bits(n)

assert(mbits >= (64 - tz - lz))

This is the correct answer for another definition of "precisely
convertible", not the one Andrei gave.
```
Jul 03 2016
```On Sunday, 3 July 2016 at 09:52:38 UTC, Guillaume Boucher wrote:
This is the correct answer for another definition of "precisely
convertible", not the one Andrei gave.

True, I see now that he actually asked for unique representation,
not precisely convertible.
```
Jul 03 2016
```On 07/03/2016 05:52 AM, Guillaume Boucher wrote:
On Sunday, 3 July 2016 at 09:08:14 UTC, Ola Fosheim Grøstad wrote:
On Saturday, 2 July 2016 at 20:17:59 UTC, Andrei Alexandrescu wrote:
So what's the fastest way to figure that an integral is convertible
to a floating point value precisely (i.e. no other integral converts
to the same floating point value)? Thanks! -- Andrei

If it is within what the mantissa can represent then it is easy. But
you also have to consider the cases where the mantissa is shifted.

n is an unsigned 64 bit integer

mbits = representation bits for mantissa +1

tz = trailing_zero_bits(n)

lz = leading_zero_bits(n)

assert(mbits >= (64 - tz - lz))

This is the correct answer for another definition of "precisely
convertible", not the one Andrei gave.

Well to be more precise here's what I'm looking for. When you compare an
integral with a floating point number, the integral is first converted
to floating point format. I.e. for long x and double y, x == y is the
same as double(x) == y.

Now, say we want to eliminate the "bad" cases of this comparison. Those
would make it non-transitive. Consider two distinct longs x1 and x2. If
they convert to the same double y, then x1 == y and x2 == y are true,
which is contradictory with x1 != x2.

Andrei
```
Jul 03 2016
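The non-transitive case described in the post above can be reproduced with two adjacent longs at 2^53. This is a minimal sketch, assuming IEEE 754 doubles, round-to-nearest-even, and a target where the conversion actually rounds to 53 bits (for example x86-64 with SSE). The helper name `sameDouble` is hypothetical.

```d
// Hypothetical helper: true if a and b convert to the same double.
bool sameDouble(long a, long b)
{
    double da = cast(double) a;
    double db = cast(double) b;
    return da == db;
}

unittest
{
    long x1 = 1L << double.mant_dig; // 2^53
    long x2 = x1 + 1;                // 2^53 + 1, not representable as double

    assert(x1 != x2);                // distinct as integers
    assert(sameDouble(x1, x2));      // yet both convert to the same double
    assert(!sameDouble(x1 - 1, x1)); // below 2^53 the conversion is one-to-one
}
```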
```On Sunday, 3 July 2016 at 11:49:15 UTC, Andrei Alexandrescu wrote:
Well to be more precise here's what I'm looking for. When you
compare an integral with a floating point number, the integral
is first converted to floating point format. I.e. for long x
and double y, x == y is the same as double(x) == y.

Now, say we want to eliminate the "bad" cases of this
comparison. Those would make it non-transitive. Consider two
distinct longs x1 and x2. If they convert to the same double y,
then x1 == y and x2 == y are true, which is contradictory with
x1 != x2.

If you assume round-to-even rounding mode then you get unique
representations for integers in the ranges as other people have
suggested:

int_to_float : [-((1<<24)-1) , (1<<24)-1]

int_to_double : [-((1<<53)-1) , (1<<53)-1]
```
Jul 03 2016
```On Sunday, 3 July 2016 at 13:41:27 UTC, Ola Fosheim Grøstad wrote:
On Sunday, 3 July 2016 at 11:49:15 UTC, Andrei Alexandrescu
wrote:
Well to be more precise here's what I'm looking for. When you
compare an integral with a floating point number, the integral
is first converted to floating point format. I.e. for long x
and double y, x == y is the same as double(x) == y.

Now, say we want to eliminate the "bad" cases of this
comparison. Those would make it non-transitive. Consider two
distinct longs x1 and x2. If they convert to the same double
y, then x1 == y and x2 == y are true, which is contradictory
with x1 != x2.

If you assume round-to-even rounding mode then you get unique
representations for integers in the ranges as other people have
suggested:

int_to_float : [-((1<<24)-1) , (1<<24)-1]

int_to_double : [-((1<<53)-1) , (1<<53)-1]

Seems to me there are some assumptions being made here that
only the permanent storage bits are in play.  You've covered