digitalmars.D - Questions about IEEE754 floating point in D

Trip Volpe <mraccident gmail.com> writes:
I'm currently writing a compiler for my own language in D, and one of the
things I'm implementing at the moment is the processing of floating-point
literals. My primary reference is William Clinger's "How to read floating point
numbers accurately," which is available here:
ftp://ftp.ccs.neu.edu/pub/people/will/howtoread.pdf

Clinger describes a method for guaranteeing the selection of the binary
floating-point number most closely approximating the number input in decimal
(or any other base). So far so good, but along the way I've had occasion to
consider what D itself is doing, and I have a couple of questions:

1. Does D guarantee the closest approximation for decimal floating-point
literals? I ask mainly because for unit testing it would be convenient to be
able to write

    expect( 0.001 == nearestDouble( 1, -3, 10 ) );

as opposed to manually checking the mantissa and exponent. :-)


2. Minimum exponent. In D, double.min_exp is equal to -1021. However, the
Wikipedia article on IEEE754-2008 and appendix D in Sun's Numerical Computation
Guide ("What Every Computer Scientist Should Know About Floating-Point
Arithmetic", http://docs.sun.com/source/806-3568/ncg_goldberg.html) list Emin
for the IEEE754 double format as -1022. Is this an error?

As expected under the standard, D has no trouble producing a normalized double
with exponent less than -1021:

     DoubleRep dr;
     dr.value = 0x1p-1022;
     writefln("f = %d, e = %d", dr.fraction, dr.exponent);

This prints "f = 0, e = 1", which corresponds to a mantissa of 1.0 and an
exponent of -1022, as expected. If you try 0x1p-1023, you get a denormal, also
as expected, with an exponent field of 0. Subtract DoubleRep.bias and you get
-1023, which according to the standard must be Emin - 1. So why isn't
double.min_exp equal to -1022?
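
A minimal, self-contained sketch of the check described above, assuming Phobos' std.bitmanip.DoubleRep with its `exponent` field and `bias` constant. The helper name `unbiasedExponent` is made up for illustration, and the denormal is built by halving double.min_normal rather than from a hex literal:

```d
import std.bitmanip : DoubleRep;
import std.stdio : writefln;

/// Exponent field of `x` with the bias removed (hypothetical helper).
int unbiasedExponent(double x)
{
    DoubleRep dr;
    dr.value = x;
    return cast(int) dr.exponent - cast(int) DoubleRep.bias;
}

void main()
{
    immutable double smallestNormal = double.min_normal; // 2^^-1022
    immutable double denormal = smallestNormal / 2;      // 2^^-1023, a denormal

    // Smallest normalized double: exponent field 1, so 1 - bias == -1022.
    assert(unbiasedExponent(smallestNormal) == -1022);
    // Denormals use exponent field 0, so 0 - bias == -1023, i.e. Emin - 1.
    assert(unbiasedExponent(denormal) == -1023);

    writefln("double.min_exp = %d", double.min_exp);
}
```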
Feb 21 2010
Don <nospam nospam.com> writes:
Trip Volpe wrote:
 1. Does D guarantee the closest approximation for decimal floating-point
literals?
Not at present.
 
 2. Minimum exponent. In D, double.min_exp is equal to -1021. However, the
Wikipedia article on IEEE754-2008 and appendix D in Sun's Numerical Computation
Guide ("What Every Computer Scientist Should Know About Floating-Point
Arithmetic", http://docs.sun.com/source/806-3568/ncg_goldberg.html) list Emin
for the IEEE754 double format as -1022. Is this an error?
[snip]

There are no errors anywhere. However, double.min_exp is defined in the spec as
"the minimum value such that 2^^(min_exp-1) is representable as a normalized
value". This means that 2^^(min_exp-1) == 2^^Emin. And indeed
min_exp-1 == -1022 == Emin.

I have no idea why min_exp is defined in such a peculiar way. In particular, I
don't know why it's different from the definition of min_10_exp. It seems
bizarre and useless. I've had a sudden thought though -- DMC/DMD used to have
an out-by-1 bug in the %a format for denormals. Maybe this behaviour isn't
intentional, but was rather a mistake, caused by that?

Note to Walter: I changed min --> min_normal in the ddoc for my 'floatingpoint'
article long ago, but it hasn't been copied into the download.
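
The relationship between min_exp and Emin can be checked directly. This is only a sketch: it assumes std.math.ldexp for exact scaling by a power of two, and the function name `smallestNormalByMinExp` is invented for the example.

```d
import std.math : ldexp;
import std.stdio : writefln;

/// 2^^(T.min_exp - 1), computed exactly by scaling 1.0 (hypothetical helper).
T smallestNormalByMinExp(T)()
{
    return ldexp(cast(T) 1.0, T.min_exp - 1);
}

void main()
{
    // IEEE 754 Emin is min_exp - 1 for both binary formats.
    static assert(double.min_exp - 1 == -1022);
    static assert(float.min_exp - 1 == -126);

    // 2^^(min_exp - 1) is the smallest normalized value of the type.
    assert(smallestNormalByMinExp!double() == double.min_normal);
    assert(smallestNormalByMinExp!float() == float.min_normal);

    writefln("double: min_exp = %d, Emin = %d, min_normal = %a",
             double.min_exp, double.min_exp - 1, double.min_normal);
}
```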
Feb 22 2010
bearophile <bearophileHUGS lycos.com> writes:
Don:
 I have no idea why min_exp is defined in such a peculiar way. In 
 particular, I don't know why it's different from the definition of 
 min_10_exp. It seems bizarre and useless.
If you spot problems in such things it's MUCH better if you try to discuss and
fix them now than never :-)

Bye and thank you,
bearophile
Feb 22 2010
Don <nospam nospam.com> writes:
Don wrote:
 Trip Volpe wrote:
 1. Does D guarantee the closest approximation for decimal 
 floating-point literals?
Not at present.
 2. Minimum exponent. In D, double.min_exp is equal to -1021. However, 
 the Wikipedia article on IEEE754-2008 and appendix D in Sun's 
 Numerical Computation Guide ("What Every Computer Scientist Should 
 Know About Floating-Point Arithmetic", 
 http://docs.sun.com/source/806-3568/ncg_goldberg.html) list Emin for 
 the IEEE754 double format as -1022. Is this an error?
 [snip]
 There are no errors anywhere. However, double.min_exp is defined in the
 spec as "the minimum value such that 2^^(min_exp-1) is representable as a
 normalized value". This means that 2^^(min_exp-1) == 2^^Emin. And indeed
 min_exp-1 == -1022 == Emin.
 I have no idea why min_exp is defined in such a peculiar way. In
 particular, I don't know why it's different from the definition of
 min_10_exp. It seems bizarre and useless.
For C++, DBL_MIN_EXP = -1021.
http://www.qnx.com/developers/docs/6.4.1/dinkum_en/ecpp/float.html

And it is defined as the minimum integer such that FLT_RADIX^^(DBL_MIN_EXP - 1)
is a normalized, finite representable value of type double.

I have a vague recollection that this bizarre definition is for compatibility
with an ancient mistake in C. Some clown miscalculated it, and by the time
people realized, they felt it was too late to fix it.
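
The C definition can be exercised from D through the core.stdc.float_ bindings. The sketch below assumes that module exposes DBL_MIN_EXP and DBL_MIN, and that the radix is 2 (true for IEEE 754 binary64); `smallestNormalFromC` is an illustrative name only.

```d
import core.stdc.float_ : DBL_MIN, DBL_MIN_EXP;
import std.math : ldexp;
import std.stdio : writefln;

/// FLT_RADIX^^(DBL_MIN_EXP - 1) with radix 2 assumed (hypothetical helper).
double smallestNormalFromC()
{
    return ldexp(1.0, DBL_MIN_EXP - 1);
}

void main()
{
    // The C constant carries the same off-by-one convention as D's min_exp.
    static assert(DBL_MIN_EXP == -1021);
    static assert(DBL_MIN_EXP == double.min_exp);

    // Under the C definition, 2^^(DBL_MIN_EXP - 1) is the smallest normal.
    assert(smallestNormalFromC() == DBL_MIN);

    writefln("DBL_MIN_EXP = %d, DBL_MIN = %a", DBL_MIN_EXP, DBL_MIN);
}
```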
Feb 23 2010
"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
Don wrote:
 Don wrote:
 Trip Volpe wrote:
 1. Does D guarantee the closest approximation for decimal 
 floating-point literals?
Not at present.
 2. Minimum exponent. In D, double.min_exp is equal to -1021. However, 
 the Wikipedia article on IEEE754-2008 and appendix D in Sun's 
 Numerical Computation Guide ("What Every Computer Scientist Should 
 Know About Floating-Point Arithmetic", 
 http://docs.sun.com/source/806-3568/ncg_goldberg.html) list Emin for 
 the IEEE754 double format as -1022. Is this an error?
 [snip]
 There are no errors anywhere. However, double.min_exp is defined in the
 spec as "the minimum value such that 2^^(min_exp-1) is representable as a
 normalized value". This means that 2^^(min_exp-1) == 2^^Emin. And indeed
 min_exp-1 == -1022 == Emin.
 I have no idea why min_exp is defined in such a peculiar way. In
 particular, I don't know why it's different from the definition of
 min_10_exp. It seems bizarre and useless.
 For C++, DBL_MIN_EXP = -1021.
 http://www.qnx.com/developers/docs/6.4.1/dinkum_en/ecpp/float.html
 And it is defined as the minimum integer such that
 FLT_RADIX^^(DBL_MIN_EXP - 1) is a normalized, finite representable value
 of type double.
 I have a vague recollection that this bizarre definition is for
 compatibility with an ancient mistake in C. Some clown miscalculated
 it, and by the time people realized, they felt it was too late to fix it.
Then it sounds like something D should get right.

-Lars
Feb 23 2010
Walter Bright <newshound1 digitalmars.com> writes:
Lars T. Kyllingstad wrote:
 Don wrote:
 I have a vague recollection that this bizarre definition is for 
 compatibility with an ancient mistake in C. Some clown miscalculated 
 it, and by the time people realized, they felt it was too late to fix it.
Then it sounds like something D should get right.
There's a problem with that - porting working C numerics code to D.
Feb 23 2010
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Walter Bright wrote:
 Lars T. Kyllingstad wrote:
 Don wrote:
 I have a vague recollection that this bizarre definition is for 
 compatibility with an ancient mistake in C. Some clown miscalculated 
 it, and by the time people realized, they felt it was too late to fix 
 it.
Then it sounds like something D should get right.
There's a problem with that - porting working C numerics code to D.
Let's do what we know works - define the right thing with a different name, and
deprecate the existing name.

Andrei
Feb 23 2010
Don <nospam nospam.com> writes:
Walter Bright wrote:
 Lars T. Kyllingstad wrote:
 Don wrote:
 I have a vague recollection that this bizarre definition is for 
 compatibility with an ancient mistake in C. Some clown miscalculated 
 it, and by the time people realized, they felt it was too late to fix 
 it.
Then it sounds like something D should get right.
There's a problem with that - porting working C numerics code to D.
That's not a concern. The syntax is completely different, so it always requires
thought. We could perhaps define float.min_2_exp with the correct value (by
analogy with float.min_10_exp) and get rid of .min_exp.

If porting C code mechanically, you'll just

    import core.stdc.float_;

It contains the line:

    enum DBL_MIN_EXP = double.min_exp;

which could be changed to:

    enum DBL_MIN_EXP = double.min_2_exp + 1;

It's not a big deal, but it's certainly something we could fix.
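
To make the proposal concrete, here is a rough sketch of what it could look like in user code today. `min2Exp` is a hypothetical stand-in for the suggested `.min_2_exp` property, which does not exist in the language; the rest relies only on the existing `.min_exp` and on DBL_MIN_EXP from core.stdc.float_.

```d
import core.stdc.float_ : DBL_MIN_EXP;
import std.stdio : writefln;

/// Hypothetical stand-in for the proposed `T.min_2_exp`: the smallest e such
/// that 2^^e is a normalized value of T, i.e. IEEE 754 Emin.
enum int min2Exp(T) = T.min_exp - 1;

static assert(min2Exp!float == -126);
static assert(min2Exp!double == -1022);

// The C-compatible constant would then be derived as suggested above.
static assert(DBL_MIN_EXP == min2Exp!double + 1);

void main()
{
    writefln("min2Exp!double = %d, DBL_MIN_EXP = %d",
             min2Exp!double, DBL_MIN_EXP);
}
```

Keeping the C constant defined in terms of the corrected value means mechanically ported C code would see the same DBL_MIN_EXP as before.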
Feb 23 2010