
digitalmars.D.learn - a way to specify floating-point numbers as bit patterns

ketmar <ketmar ketmar.no-ip.org> writes:
let's say that i have precomputed some `float`-typed tables, and now i want 
to use 'em in my code. for example, a table for Lagrange series. the table 
itself is like 10 numbers, but the code calculating it is rather big, and it 
depends on the architecture (so it can yield different results on different 
arch). yet i want the same IEEE numbers everywhere.

the core problem is that i really can't express *exactly* *the* *same* IEEE
float with D. let me illustrate this.

one of my calculated values is `-0.166667`, which has bit-pattern of 0xBE2AAAB7.
now, let's say i want to use this number in my code:

	float v = -0.166667f;
	writefln("%f 0x%08X", v, *cast(uint*)&v);

oooops. "-0.166667 0xBE2AAAC1". it's not the same! (and yes, it matters).

and this means that i can't inline my calculated values! each time i want 
to get a floating value with a known bit-pattern, i have to store it as uint, 
and resort to an ugly hack: `*cast(immutable(float)*)(&mytable[2])`.

and i can't do this trick in CTFE, as such pointer tricks aren't permitted.

i tried different workarounds, but they're all ugly. it would be very nice 
to have a way to define IEEE floating point numbers as bit-patterns in the 
language itself. what do you think?

yes, i know that floating numbers have to be loaded from memory anyway, so 
my trick is not really worse than specifying the constant directly. but it 
is still a dirty trick, and i cannot use it in @safe code either, so i have to 
mark my code as @trusted, which is not the same at all. there probably may 
be some other trick like this for @safe code, but... you got the idea, i think.
Jun 09
Honey <honey pot.com> writes:
On Friday, 9 June 2017 at 16:07:36 UTC, ketmar wrote:
 one of my calculated values is `-0.166667`, which has 
 bit-pattern of 0xBE2AAAB7.
 now, let's say i want to use this number in my code:

 	float v = -0.166667f;
 	writefln("%f 0x%08X", v, *cast(uint*)&v);

 oooops. "-0.166667 0xBE2AAAC1". it's not the same! (and yes, it 
 matters).
Try -0.16666685f.
Jun 09
ketmar <ketmar ketmar.no-ip.org> writes:
Honey wrote:

 On Friday, 9 June 2017 at 16:07:36 UTC, ketmar wrote:
 one of my calculated values is `-0.166667`, which has bit-pattern of 
 0xBE2AAAB7.
 now, let's say i want to use this number in my code:

 	float v = -0.166667f;
 	writefln("%f 0x%08X", v, *cast(uint*)&v);

 oooops. "-0.166667 0xBE2AAAC1". it's not the same! (and yes, it matters).
Try -0.16666685f.
it doesn't matter if i can find the decimal representation for the given bit pattern or not. the whole post is about removing the need to rely on lossy binary->decimal->binary conversions.
Jun 09
Honey <honey pot.com> writes:
On Friday, 9 June 2017 at 16:34:28 UTC, ketmar wrote:
 Try -0.16666685f.
it doesn't matter if i can find the decimal representation for the given bit pattern or not. the whole post is about removing the need to rely on lossy binary->decimal->binary conversions.
Lossless turn-around is guaranteed if you are using sufficiently many digits. In case of IEEE-754 single precision it's 8 significant decimal digits.
Jun 09
Honey <honey pot.com> writes:
On Friday, 9 June 2017 at 16:41:01 UTC, Honey wrote:
 Lossless turn-around is guaranteed if you are using 
 sufficiently many digits. In case of IEEE-754 single precision 
 it's 8 significant decimal digits.
s/turn-around/recovery/g; s/8/9/g :-P
Jun 09
ketmar <ketmar ketmar.no-ip.org> writes:
Honey wrote:

 On Friday, 9 June 2017 at 16:34:28 UTC, ketmar wrote:
 Try -0.16666685f.
it doesn't matter if i can find the decimal representation for the given bit pattern or not. the whole post is about removing the need to rely on lossy binary->decimal->binary conversions.
Lossless turn-around is guaranteed if you are using sufficiently many digits. In case of IEEE-754 single precision it's 8 significant decimal digits.
it is highly platform-dependent. and both bin->dec, and dec->bin conversion routines can contain errors, btw. so using decimal forms for exact bit-patterns is the last thing i want to do, as i know how fragile they are.
Jun 09
Honey <honey pot.com> writes:
On Friday, 9 June 2017 at 17:25:22 UTC, ketmar wrote:
 it is highly platform-dependent. and both bin->dec, and 
 dec->bin conversion routines can contain errors, btw. so using 
 decimal forms for exact bit-patterns is the last thing i want 
 to do, as i know how fragile they are.
Sure. Hex-format is the right choice if you want a fixed bit pattern.
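
A small sketch of the point being argued here: with enough decimal digits the literal does land on the intended bit pattern, but only as long as the decimal->binary conversion rounds correctly. The helper name `bitsOf` is made up for this illustration, and the expected patterns are the ones quoted in the thread.

```d
import std.stdio : writefln;

// Hypothetical helper (not from the thread): view a float's storage as a uint.
uint bitsOf(float f) @trusted pure nothrow @nogc
{
    return *cast(uint*)&f;
}

void main()
{
    float sixDigits   = -0.166667f;    // literal from the original post
    float eightDigits = -0.16666685f;  // literal suggested in the reply

    writefln("0x%08X", bitsOf(sixDigits));
    writefln("0x%08X", bitsOf(eightDigits));

    // These hold only if the compiler's decimal->binary conversion
    // rounds to the nearest float, which is the dependency ketmar wants to avoid.
    assert(bitsOf(sixDigits)   == 0xBE2AAAC1);
    assert(bitsOf(eightDigits) == 0xBE2AAAB7);
}
```

Both literals sit well away from a rounding midpoint, so a conforming conversion should agree on them; a hex-float literal removes the dependency entirely.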
Jun 09
Basile B. <b2.temp gmx.com> writes:
On Friday, 9 June 2017 at 16:07:36 UTC, ketmar wrote:
 let's say that i have precomputed some `float`-typed tables, 
 and now i want to use 'em in my code. for example, a table for 
 Lagrange series. the table itself is like 10 numbers, but the 
 code calculating it is rather big, and it depends on the architecture 
 (so it can yield different results on different arch). yet i 
 want the same IEEE numbers everywhere.

 the core problem is that i really can't express *exactly* *the* 
 *same* IEEE float with D. let me illustrate this.

 one of my calculated values is `-0.166667`, which has 
 bit-pattern of 0xBE2AAAB7.
 now, let's say i want to use this number in my code:

 	float v = -0.166667f;
 	writefln("%f 0x%08X", v, *cast(uint*)&v);

 oooops. "-0.166667 0xBE2AAAC1". it's not the same! (and yes, it 
 matters).
-0.166667f is not representable as a 32-bit float. The actual value that's stored is -0.16666699945926666259765625, hence the difference. See https://www.h-schmidt.net/FloatConverter/IEEE754.html and enter your value in the field labeled "You entered".
 and this means that i can't inline my calculated values! each 
 time i want to get a floating value with a known bit-pattern, i 
 have to store it as uint, and resort to an ugly hack: 
 `*cast(immutable(float)*)(&mytable[2])`.

 and i can't do this trick in CTFE, as such pointer tricks 
 aren't permitted.

 i tried different workarounds, but they're all ugly. it would 
 be very nice to have a way to define IEEE floating point 
 numbers as bit-patterns in the language itself. what do you 
 think?
Yes, easy to do, a template à la octal or hexString.
 yes, i know that floating numbers have to be loaded from memory 
 anyway, so my trick is not really worse than specifying the 
 constant directly. but it is still a dirty trick, and i cannot 
 use it in @safe code either, so i have to mark my code as 
 @trusted, which is not the same at all. there probably may be 
 some other trick like this for @safe code, but... you got the idea, 
 i think.
Jun 09
ketmar <ketmar ketmar.no-ip.org> writes:
Basile B. wrote:

 oooops. "-0.166667 0xBE2AAAC1". it's not the same! (and yes, it matters).
-0.166667f is not representable as a 32-bit float. The actual value that's stored is -0.16666699945926666259765625, hence the difference. See https://www.h-schmidt.net/FloatConverter/IEEE754.html and enter your value in the field labeled "You entered".
i'm completely aware of how floating point values are represented, and of the problems with bin->dec->bin conversions of floats. the post is about avoiding those conversions altogether.
 and this means that i can't inline my calculated values! each time i 
 want to get a floating value with a known bit-pattern, i have to store it 
 as uint, and resort to an ugly hack: 
 `*cast(immutable(float)*)(&mytable[2])`.

 and i can't do this trick in CTFE, as such pointer tricks aren't 
 permitted.

 i tried different workarounds, but they're all ugly. it would be very 
 nice to have a way to define IEEE floating point numbers as bit-patterns 
 in the language itself. what do you think?
Yes, easy to do, a template à la octal or hexString.
can you show it, please? remember, CTFE-able!
Jun 09
Basile B. <b2.temp gmx.com> writes:
On Friday, 9 June 2017 at 16:42:31 UTC, ketmar wrote:
 Basile B. wrote:
 Yes, easy to do, a template à la octal or hexString.
can you show it, please? remember, CTFE-able!
Sure, here's a dirty draft:

	template binFloat(string sign, string exp, string mant)
	{
	    enum s = sign == "+" ? "0" : "1";
	    const b = mixin("0b" ~ s ~ exp ~ mant);
	    pragma(msg,"0b" ~ s ~ exp ~ mant);
	    enum binFloat = *cast(float*) &b;
	}

	unittest
	{
	    static assert(binFloat!("+", "10000000", "10000000000000000000000") == 3.0f);
	    static assert(binFloat!("-", "10000000", "10000000000000000000000") == -3.0f);
	}
Jun 09
ketmar <ketmar ketmar.no-ip.org> writes:
Basile B. wrote:

      enum binFloat = *cast(float*) &b;
i was SO sure that this won't work in CTFE that i didn't even try to do it. "it will fail anyway, there is no reason in trying!" ;-)
Jun 09
Basile B. <b2.temp gmx.com> writes:
On Friday, 9 June 2017 at 17:18:43 UTC, ketmar wrote:
 Basile B. wrote:

      enum binFloat = *cast(float*) &b;
i was SO sure that this won't work in CTFE that i didn't even try to do it. "it will fail anyway, there is no reason in trying!" ;-)
You can do the arithmetic as well. I don't know why but i supposed that my static asserts were a proof of CTFE-ability.
Jun 09
ketmar <ketmar ketmar.no-ip.org> writes:
Basile B. wrote:

 On Friday, 9 June 2017 at 17:18:43 UTC, ketmar wrote:
 Basile B. wrote:

      enum binFloat = *cast(float*) &b;
i was SO sure that this won't work in CTFE that i didn't even try to do it. "it will fail anyway, there is no reason in trying!" ;-)
You can do the arithmetic as well. I don't know why but i supposed that my static asserts were a proof of CTFE-ability.
yeah, it is CTFEable. now i recall that such casting for floats was special-cased in the interpreter exactly to allow this kind of thing. my bad.
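
For a table stored as `uint` patterns, the same special-cased cast can be wrapped in a small function instead of a string-based template. This is only a sketch: `floatFromBits` is a hypothetical name (not a Phobos function), and it assumes the compiler accepts the int<->float pointer reinterpretation in CTFE, as reported above.

```d
// Hypothetical helper, not part of Phobos: build a float from its
// IEEE-754 single-precision bit pattern.
float floatFromBits(uint bits) @trusted pure nothrow @nogc
{
    return *cast(float*)&bits;
}

// Evaluated at compile time because the results initialize an enum
// and a static immutable array.
enum float c = floatFromBits(0xBE2AAAB7);
static immutable float[2] table = [
    floatFromBits(0xBE2AAAB7),
    floatFromBits(0x40400000),
];

// Same checks as the binFloat draft, expressed as hex bit patterns.
static assert(floatFromBits(0x40400000) == 3.0f);
static assert(floatFromBits(0xC0400000) == -3.0f);
static assert(c == -0x1.55556ep-3f);

void main()
{
    assert(table[0] == c);
    assert(table[1] == 3.0f);
}
```

The wrapper still has to be `@trusted` rather than `@safe`, so it narrows the unsafe part to one place without removing it.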
Jun 09
Petar Kirov [ZombineDev] <petar.p.kirov gmail.com> writes:
On Friday, 9 June 2017 at 16:07:36 UTC, ketmar wrote:
 let's say that i have precomputed some `float`-typed tables, 
 and now i want to use 'em in my code. for example, a table for 
 Lagrange series. the table itself is like 10 numbers, but the 
 code calculating it is rather big, and it depends on the architecture 
 (so it can yield different results on different arch). yet i 
 want the same IEEE numbers everywhere.

 the core problem is that i really can't express *exactly* *the* 
 *same* IEEE float with D. let me illustrate this.

 one of my calculated values is `-0.166667`, which has 
 bit-pattern of 0xBE2AAAB7.
 now, let's say i want to use this number in my code:

 	float v = -0.166667f;
 	writefln("%f 0x%08X", v, *cast(uint*)&v);

 oooops. "-0.166667 0xBE2AAAC1". it's not the same! (and yes, it 
 matters).

 and this means that i can't inline my calculated values! each 
 time i want to get a floating value with a known bit-pattern, i 
 have to store it as uint, and resort to an ugly hack: 
 `*cast(immutable(float)*)(&mytable[2])`.

 and i can't do this trick in CTFE, as such pointer tricks 
 aren't permitted.

 i tried different workarounds, but they're all ugly. it would 
 be very nice to have a way to define IEEE floating point 
 numbers as bit-patterns in the language itself. what do you 
 think?

 yes, i know that floating numbers have to be loaded from memory 
 anyway, so my trick is not really worse than specifying the 
 constant directly. but it is still a dirty trick, and i cannot 
 use it in @safe code either, so i have to mark my code as 
 @trusted, which is not the same at all. there probably may be 
 some other trick like this for @safe code, but... you got the idea, 
 i think.
Do HexFloats (http://dlang.org/spec/lex#HexFloat) help?
Jun 09
ketmar <ketmar ketmar.no-ip.org> writes:
Petar Kirov [ZombineDev] wrote:

 Do HexFloats (http://dlang.org/spec/lex#HexFloat) help?
hm. i somehow completely missed "%a" format specifier! yeah, "-0x1.55556ep-3" did the trick. tnx. i should do my homework *before* posting big rants, lol.
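
To make the accepted answer concrete, here is a minimal sketch using the literal from this post. A hex-float literal spells out the significand and binary exponent directly, so no decimal rounding is involved; the second table entry (`0x1.8p+1f`, which is 3.0) is added only for illustration.

```d
import std.stdio : writefln;

// A precomputed table written with hex-float literals.
static immutable float[2] table = [-0x1.55556ep-3f, 0x1.8p+1f];

void main()
{
    float v = -0x1.55556ep-3f;
    uint bits = *cast(uint*)&v;   // runtime check only; not needed to define the value

    // "%a" prints the hex-float form, which can be pasted back into source.
    writefln("%a 0x%08X", v, bits);

    assert(bits == 0xBE2AAAB7);
    assert(table[0] == v);
    assert(table[1] == 3.0f);
}
```

The literals themselves need no casts, so a table defined this way can live in `@safe` code and be used in CTFE.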
Jun 09
Nicholas Wilson <iamthewilsonator hotmail.com> writes:
On Friday, 9 June 2017 at 16:56:46 UTC, ketmar wrote:
 Petar Kirov [ZombineDev] wrote:

 Do HexFloats (http://dlang.org/spec/lex#HexFloat) help?
hm. i somehow completely missed "%a" format specifier! yeah, "-0x1.55556ep-3" did the trick. tnx. i should do my homework *before* posting big rants, lol.
There's also FloatRep in std.bitmanip.
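
A rough sketch of what that could look like, assuming a reasonably recent Phobos where `FloatRep` exposes `value`, `sign`, `exponent` and `fraction`. The function name `fromFields` is hypothetical. This works at run time; it is not claimed here to be usable in CTFE.

```d
import std.bitmanip : FloatRep;

// Hypothetical helper: assemble a float from its IEEE-754 fields.
// exponent is the raw (biased) 8-bit field, fraction the 23-bit field.
float fromFields(bool sign, ubyte exponent, uint fraction) @safe
{
    FloatRep rep;
    rep.sign = sign;
    rep.exponent = exponent;
    rep.fraction = fraction;
    return rep.value;
}

void main() @safe
{
    // 0xBE2AAAB7 = sign 1, exponent 0x7C, fraction 0x2AAAB7
    float v = fromFields(true, 0x7C, 0x2AAAB7);
    assert(v == -0x1.55556ep-3f);

    // And the other direction: split an existing float into fields.
    FloatRep rep;
    rep.value = v;
    assert(rep.sign);
    assert(rep.exponent == 0x7C);
    assert(rep.fraction == 0x2AAAB7);
}
```

Unlike the pointer cast, this needs no `@trusted` annotation, which addresses the `@safe` concern from the original post for the runtime case.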
Jun 09