
digitalmars.D.bugs - [Issue 360] New: Compile-time floating-point calculations are sometimes inconsistent

d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=360

           Summary: Compile-time floating-point calculations are sometimes
                    inconsistent
           Product: D
           Version: 0.167
          Platform: PC
        OS/Version: Windows
            Status: NEW
          Severity: normal
          Priority: P2
         Component: DMD
        AssignedTo: bugzilla digitalmars.com
        ReportedBy: digitalmars-com baysmith.com


The following code should print false before it exits.

import std.stdio;

void main() {
        const float STEP_SIZE = 0.2f;


        float j = 0.0f;
        while (j <= ( 1.0f / STEP_SIZE)) {
                j += 1.0f;
                writefln(j <= ( 1.0f / STEP_SIZE));
        }

}

This problem does not occur when:
1. the code is optimized
2. STEP_SIZE is not a const
3. STEP_SIZE is a real


-- 
Sep 21 2006
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=360


bugzilla digitalmars.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |INVALID




------- Comment #1 from bugzilla digitalmars.com  2006-09-21 18:31 -------
The example is mixing up 3 different precisions - 32, 64, and 80 bit. Each
involves different rounding of unrepresentable numbers like 0.2. In this case,
the 1.0f/STEP_SIZE is calculated at different precisions based on how things
are compiled. Constant folding, for example, is done at compile time and done
at max precision even if the variables involved are floats.

The D language allows this; the guiding principle is that algorithms should be
designed so that they do not fail if precision is increased.

Not a bug.
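
One way to follow that principle for the loop in the report is to turn the bound into an integer once and compare integers from then on. The following is only a sketch for a current D compiler; `stepCount` is a hypothetical helper, not something from the report or from Phobos.

```d
import std.math : round;
import std.stdio : writefln;

// Hypothetical helper: how many steps of size `step` fit into 1.0.
// Rounding to the nearest integer gives the same answer whether
// 1.0 / step is evaluated at float, double, or real precision.
int stepCount(float step)
{
    return cast(int) round(1.0 / step);
}

void main()
{
    immutable int n = stepCount(0.2f);   // 5 for a step of 0.2
    foreach (i; 0 .. n + 1)
    {
        // Integer comparison: no dependence on floating-point precision.
        // Prints true five times, then false on the last pass.
        writefln("%s %s", i, i + 1 <= n);
    }
}
```

Increasing the precision of the division cannot change the loop any more, because the only floating-point result is rounded to a whole number before it is used.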


-- 
Sep 21 2006
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=360





------- Comment #2 from digitalmars-com baysmith.com  2006-09-21 18:46 -------
*** Bug 361 has been marked as a duplicate of this bug. ***


-- 
Sep 21 2006
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=360





------- Comment #3 from digitalmars-com baysmith.com  2006-09-21 18:51 -------
Why are the expressions in the while and writefln statements calculated at
different precisions?

Wouldn't the constant folding be done the same for both?


-- 
Sep 21 2006
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=360





------- Comment #4 from bugzilla digitalmars.com  2006-09-21 20:17 -------
while (j <= (1.0f/STEP_SIZE)) is at double precision,
writefln((j += 1.0f) <= (1.0f/STEP_SIZE)) is at real precision.


-- 
Sep 21 2006
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=360





------- Comment #5 from clugdbug yahoo.com.au  2006-09-22 02:25 -------
(In reply to comment #4)
 while (j <= (1.0f/STEP_SIZE)) is at double precision,
 writefln((j += 1.0f) <= (1.0f/STEP_SIZE)) is at real precision.

I don't understand where the double precision comes from. Since all the values
are floats, the only precisions that make sense are float and reals.

Really, 0.2f should not be the same number as 0.2. When you put the 'f' suffix
on, surely you're asking the compiler to truncate the precision. It can be
expanded to real precision later without problems. Currently, there's no way to
get a low-precision constant at compile time.

(In fact, you should be able to write
    real a = 0.2 - 0.2f;
to get the truncation error).

Here's how I think it should work:

const float A = 0.2;  // infinitely accurate 0.2, but type inference on A should return a float.
const float B = 0.2f; // a 32-bit approximation to 0.2
const real C = 0.2;   // infinitely accurate 0.2
const real D = 0.2f;  // a 32-bit approximation to 0.2, but type inference will give an 80-bit quantity.


-- 
Sep 22 2006
Walter Bright <newshound digitalmars.com> writes:
d-bugmail puremagic.com wrote:
 ------- Comment #5 from clugdbug yahoo.com.au  2006-09-22 02:25 -------
 (In reply to comment #4)
 while (j <= (1.0f/STEP_SIZE)) is at double precision,
 writefln((j += 1.0f) <= (1.0f/STEP_SIZE)) is at real precision.

 I don't understand where the double precision comes from. Since all the values
 are floats, the only precisions that make sense are float and reals.

The compiler is allowed to evaluate intermediate results at a greater precision than that of the operands.
 Really, 0.2f should not be the same number as 0.2.

0.2 is not representable exactly, the only question is how much precision is there in the representation.
 When you put the 'f' suffix
 on, surely you're asking the compiler to truncate the precision.

Not in D. The 'f' suffix only indicates the type. The compiler may maintain internally as much precision as possible, for purposes of constant folding. Committing the actual precision of the result is done as late as possible.
 It can be
 expanded to real precision later without problems. Currently, there's no way to
 get a low-precision constant at compile time.

You can by putting the constant into a static, non-const variable. Then it cannot be constant folded.
 (In fact, you should be able to write real a = 0.2 - 0.2f; to get the
 truncation error).

Not in D, where the compiler is allowed to evaluate using as much precision as possible for purposes of constant folding. The vast majority of calculations benefit from delaying rounding as long as possible, hence D's bias towards using as much precision as possible.

The way to write robust floating point calculations in D is to ensure that increasing the precision of the calculations will not break the result.

Early versions of Java insisted that rounding to precision of floating point intermediate results always happened. While this ensured consistency of results, it mostly resulted in consistently getting inferior and wrong answers.
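
The suggestion of keeping the constant in a non-const variable could look roughly like this. It is a sketch assuming a current D compiler; `stepSize` and `stepsPerUnit` are illustrative names, and no output is shown because the printed digits depend on the precision the target uses for the division.

```d
import std.stdio : writefln;

// Mutable module-level variable: its value is not a compile-time
// constant, so expressions that read it are evaluated at run time.
float stepSize = 0.2f;

// Divides using whatever value the variable holds when called.
float stepsPerUnit()
{
    return 1.0f / stepSize;
}

void main()
{
    writefln("%.10f", stepsPerUnit());
}
```

The trade-off is the one raised later in the thread: a variable declared this way no longer takes part in constant folding at all.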
Sep 22 2006
Don Clugston <dac nospam.com.au> writes:
Walter Bright wrote:
 d-bugmail puremagic.com wrote:
 ------- Comment #5 from clugdbug yahoo.com.au  2006-09-22 02:25 -------
 (In reply to comment #4)
 while (j <= (1.0f/STEP_SIZE)) is at double precision,
 writefln((j += 1.0f) <= (1.0f/STEP_SIZE)) is at real precision.

 I don't understand where the double precision comes from. Since all
 the values are floats, the only precisions that make sense are float and reals.

The compiler is allowed to evaluate intermediate results at a greater precision than that of the operands.
 Really, 0.2f should not be the same number as 0.2.

0.2 is not representable exactly, the only question is how much precision is there in the representation.
 When you put the 'f' suffix
 on, surely you're asking the compiler to truncate the precision.

Not in D. The 'f' suffix only indicates the type.

And therefore, it only matters in implicit type deduction, and in function overloading. As I discuss below, I'm not sure that it's necessary even there. In many cases, it's clearly a programmer error. For example in
    real BAD = 0.2f;
where the f has absolutely no effect.

 The compiler may
 maintain internally as much precision as possible, for purposes of 
 constant folding. Committing the actual precision of the result is done 
 as late as possible.
 
 It can be
 expanded to real precision later without problems. Currently, there's 
 no way to
 get a low-precision constant at compile time.

You can by putting the constant into a static, non-const variable. Then it cannot be constant folded.

Actually, in this case you still want it to be constant folded.
 
 (In fact, you should be able to write real a = 0.2 - 0.2f; to get the
 truncation error).

Not in D, where the compiler is allowed to evaluate using as much precision as possible for purposes of constant folding. The vast majority of calculations benefit from delaying rounding as long as possible, hence D's bias towards using as much precision as possible. The way to write robust floating point calculations in D is to ensure that increasing the precision of the calculations will not break the result. Early versions of Java insisted that rounding to precision of floating point intermediate results always happened. While this ensured consistency of results, it mostly resulted in consistently getting inferior and wrong answers.

I agree. But it seems that D is currently in a halfway house on this issue. Somehow, 'double' is privileged, and I don't think it's got any right to be.

    const XXX = 0.123456789123456789123456789f;
    const YYY = 1 * XXX;
    const ZZZ = 1.0 * XXX;

    auto xxx = XXX;
    auto yyy = YYY;
    auto zzz = ZZZ;

// now xxx and yyy are floats, but zzz is a double.
Multiplying by '1.0' causes a float constant to be promoted to double.

    real a = xxx;
    real b = zzz;
    real c = XXX;

Now a, b, and c all have different values.

Whereas the same operation at runtime causes it to be promoted to real.

Is there any reason why implicit type deduction on a floating point constant doesn't always default to real? After all, you're saying "I don't particularly care what type this is" -- why not default to maximum accuracy?

Concrete example:

    real a = sqrt(1.1);

This only gives a double precision result. You have to write
    real a = sqrt(1.1L);
instead.
It's easier to do the wrong thing, than the right thing.

IMHO, unless you specifically take other steps, implicit type deduction should always default to the maximum accuracy the machine could do.
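
The type deduction described here can be checked at compile time. The sketch below assumes a current D compiler and uses `enum` manifest constants in place of the untyped `const` declarations; `WWW` is an extra constant added only for comparison.

```d
// Manifest constants with inferred types.
enum XXX = 0.123456789123456789123456789f;   // float literal
enum YYY = 1 * XXX;                          // int * float
enum ZZZ = 1.0 * XXX;                        // double * float
enum WWW = 1.0L * XXX;                       // real * float (added for comparison)

// The usual arithmetic conversions pick the wider operand type.
static assert(is(typeof(XXX) == float));
static assert(is(typeof(YYY) == float));
static assert(is(typeof(ZZZ) == double));
static assert(is(typeof(WWW) == real));

void main() {}
```

Only the deduced types are asserted; the values of the constants are left unchecked because they depend on how much precision the compiler keeps while folding.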
Sep 22 2006
Dave <Dave_member pathlink.com> writes:
Don Clugston wrote:
 I agree. But it seems that D is currently in a halfway house on this 
 issue. Somehow, 'double' is privileged, and don't think it's got any 
 right to be.
 
     const XXX = 0.123456789123456789123456789f;
     const YYY = 1 * XXX;
     const ZZZ = 1.0 * XXX;
 
    auto xxx = XXX;
    auto yyy = YYY;
    auto zzz = ZZZ;
 
 // now xxx and yyy are floats, but zzz is a double.
 Multiplying by '1.0' causes a float constant to be promoted to double.
 
    real a = xxx;
    real b = zzz;
    real c = XXX;
 
 Now a, b, and c all have different values.
 
 Whereas the same operation at runtime causes it to be promoted to real.
 
 Is there any reason why implicit type deduction on a floating point 
 constant doesn't always default to real? After all, you're saying "I 
 don't particularly care what type this is" -- why not default to maximum 
 accuracy?
 
 Concrete example:
 
 real a = sqrt(1.1);
 
 This only gives a double precision result. You have to write
 real a = sqrt(1.1L);
 instead.
 It's easier to do the wrong thing, than the right thing.
 
 IMHO, unless you specifically take other steps, implicit type deduction 
 should always default to the maximum accuracy the machine could do.

Great point.
Sep 22 2006
Walter Bright <newshound digitalmars.com> writes:
Don Clugston wrote:
 Walter Bright wrote:
 Not in D. The 'f' suffix only indicates the type.

And therefore, it only matters in implicit type deduction, and in function overloading. As I discuss below, I'm not sure that it's necessary even there. In many cases, it's clearly a programmer error. For example in real BAD = 0.2f; where the f has absolutely no effect.

It may come about as a result of source code generation, though, so I'd be reluctant to make it an error.
 You can by putting the constant into a static, non-const variable. 
 Then it cannot be constant folded.

Actually, in this case you still want it to be constant folded.

A static variable's value can change, so it can't be constant folded. To have it participate in constant folding, it needs to be declared as const.
 I agree. But it seems that D is currently in a halfway house on this 
 issue. Somehow, 'double' is privileged, and don't think it's got any 
 right to be.
 
     const XXX = 0.123456789123456789123456789f;
     const YYY = 1 * XXX;
     const ZZZ = 1.0 * XXX;
 
    auto xxx = XXX;
    auto yyy = YYY;
    auto zzz = ZZZ;
 
 // now xxx and yyy are floats, but zzz is a double.
 Multiplying by '1.0' causes a float constant to be promoted to double.

That's because 1.0 is a double. A double*float => double.
    real a = xxx;
    real b = zzz;
    real c = XXX;
 
 Now a, b, and c all have different values.
 
 Whereas the same operation at runtime causes it to be promoted to real.
 
 Is there any reason why implicit type deduction on a floating point 
 constant doesn't always default to real? After all, you're saying "I 
 don't particularly care what type this is" -- why not default to maximum 
 accuracy?
 
 Concrete example:
 
 real a = sqrt(1.1);
 
 This only gives a double precision result. You have to write
 real a = sqrt(1.1L);
 instead.
 It's easier to do the wrong thing, than the right thing.
 
 IMHO, unless you specifically take other steps, implicit type deduction 
 should always default to the maximum accuracy the machine could do.

It is a good idea, but it isn't that way, for these reasons:

1) It's the way C, C++, and Fortran work. Changing the promotion rules would mean that, when translating solid, reliable libraries from those languages to D, one would have to be very, very careful.

2) Float and double are expected to be implemented in hardware. Longer precisions are often not available. I wanted to make it practical for a D implementation on those machines to provide a software long precision floating point type, rather than just making real==double. Such a type would be very slow compared with double.

3) Real, even in hardware, is significantly slower than double. Doing constant folding at max precision at compile time won't affect runtime performance, so it is 'free'.
Sep 22 2006
Don Clugston <dac nospam.com.au> writes:
Walter Bright wrote:
 Don Clugston wrote:
 Walter Bright wrote:
 Not in D. The 'f' suffix only indicates the type.

And therefore, it only matters in implicit type deduction, and in function overloading. As I discuss below, I'm not sure that it's necessary even there. In many cases, it's clearly a programmer error. For example in real BAD = 0.2f; where the f has absolutely no effect.

It may come about as a result of source code generation, though, so I'd be reluctant to make it an error.
 You can by putting the constant into a static, non-const variable. 
 Then it cannot be constant folded.

Actually, in this case you still want it to be constant folded.

A static variable's value can change, so it can't be constant folded. To have it participate in constant folding, it needs to be declared as const.

But if it's const, then it's not float precision! I want both!
 I agree. But it seems that D is currently in a halfway house on this 
 issue. Somehow, 'double' is privileged, and don't think it's got any 
 right to be.

     const XXX = 0.123456789123456789123456789f;
     const YYY = 1 * XXX;
     const ZZZ = 1.0 * XXX;

    auto xxx = XXX;
    auto yyy = YYY;
    auto zzz = ZZZ;

 // now xxx and yyy are floats, but zzz is a double.
 Multiplying by '1.0' causes a float constant to be promoted to double.

That's because 1.0 is a double. A double*float => double.
    real a = xxx;
    real b = zzz;
    real c = XXX;

 Now a, b, and c all have different values.

 Whereas the same operation at runtime causes it to be promoted to real.

 Is there any reason why implicit type deduction on a floating point 
 constant doesn't always default to real? After all, you're saying "I 
 don't particularly care what type this is" -- why not default to 
 maximum accuracy?

 Concrete example:

 real a = sqrt(1.1);

 This only gives a double precision result. You have to write
 real a = sqrt(1.1L);
 instead.
 It's easier to do the wrong thing, than the right thing.

 IMHO, unless you specifically take other steps, implicit type 
 deduction should always default to the maximum accuracy the machine 
 could do.

It is a good idea, but isn't that way for the reasons: 1) It's the way C, C++, and Fortran work. Changing the promotion rules would mean that, when translating solid, reliable libraries from those languages to D, one would have to be very, very careful.

That's very important. Still, those languages don't have implicit type deduction. Also, none of those languages guarantee accuracy of decimal->binary conversions, so there's always some error in decimal constants. Incidentally, I recently read that GCC uses something like 160 bits for constant folding, so it's always going to give results that are different to those on other compilers.

Why doesn't D behave like C with respect to 'f' suffixes? (I.e., do the conversion, then truncate it to float precision.) Actually, I can't imagine many cases where you'd actually want a 'float' constant instead of a 'real' one.
 2) Float and double are expected to be implemented in hardware. Longer 
 precisions are often not available. I wanted to make it practical for a 
 D implementation on those machines to provide a software long precision 
 floating point type, rather than just making real==double. Such a type 
 would be very slow compared with double.

Interesting. I thought that 'real' was supposed to be the highest accuracy fast floating point type, and would therefore be either 64, 80, or 128 bits. So it could also be a double-double?

For me, the huge benefit of the 'real' type is that it guarantees that optimisation won't change the results. In C, using doubles, it's quite unpredictable when a temporary will be 80 bits, and when it will be 64 bits. In D, if you stick to real, you're guaranteed that nothing weird will happen. I'd hate to lose that.
 3) Real, even in hardware, is significantly slower than double. Doing 
 constant folding at max precision at compile time won't affect runtime 
 performance, so it is 'free'.

In this case, the initial issue remains: in order to write code which maintains accuracy regardless of machine precision, it is sometimes necessary to specify the precision that should be used for constants. The original code was an example where weird things happened because that wasn't respected.
Sep 23 2006
Walter Bright <newshound digitalmars.com> writes:
Don Clugston wrote:
 Walter Bright wrote:
 A static variable's value can change, so it can't be constant folded. 
 To have it participate in constant folding, it needs to be declared as 
 const.

 But if it's const, then it's not float precision! I want both!

You can always use hex float constants. I know they're not pretty, but the point of them is to be able to specify exact floating point bit patterns. There are no rounding errors with them.
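
As an illustration of the hex float suggestion, the 32-bit approximation of 0.2 can be written as an exact bit pattern. This is a sketch assuming a current D compiler; `stepF` and `truncErr` are illustrative names.

```d
import std.stdio : writefln;

// 0x1.99999Ap-3 is the 32-bit float nearest to 0.2, written as an exact
// bit pattern, so no decimal-to-binary rounding is involved.
enum float stepF = 0x1.99999Ap-3f;

// Difference between the best available 0.2 and the float approximation.
// The hex literal has the same value at every precision, so this is the
// float truncation error (about -2.98e-9) that `0.2 - 0.2f` was meant
// to produce.
enum real truncErr = 0.2L - stepF;

void main()
{
    writefln("%a", stepF);       // hex float form of the constant
    writefln("%.3e", truncErr);  // truncation error in scientific notation
}
```

Because the literal is exactly representable as a float, carrying it at double or real precision during constant folding does not change its value.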
 1) It's the way C, C++, and Fortran work. Changing the promotion rules 
 would mean that, when translating solid, reliable libraries from those 
 languages to D, one would have to be very, very careful.

That's very important. Still, those languages don't have implicit type deduction. Also, none of those languages guarantee accuracy of decimal->binary conversions, so there's always some error in decimal constants. Incidentally, I recently read that GCC uses something like 160 bits for constant folding, so it's always going to give results that are different to those on other compilers. Why doesn't D behave like C with respect to 'f' suffixes? (Ie, do the conversion, then truncate it to float precision). Actually, I can't imagine many cases where you'd actually want a 'float' constant instead of a 'real' one.

A float constant would be desirable to keep the calculation all floats for speed reasons. I can't think of many reasons one would want reduced precision.
 2) Float and double are expected to be implemented in hardware. Longer 
 precisions are often not available. I wanted to make it practical for 
 a D implementation on those machines to provide a software long 
 precision floating point type, rather than just making real==double. 
 Such a type would be very slow compared with double.

Interesting. I thought that 'real' was supposed to be the highest accuracy fast floating point type, and would therefore be either 64, 80, or 128 bits. So it could also be a double-double? For me, the huge benefit of the 'real' type is that it guarantees that optimisation won't change the results. In C, using doubles, it's quite unpredictable when a temporary will be 80 bits, and when it will be 64 bits. In D, if you stick to real, you're guaranteed that nothing weird will happen. I'd hate to lose that.

I don't see how one would lose that if real were done in software.
 3) Real, even in hardware, is significantly slower than double. Doing 
 constant folding at max precision at compile time won't affect runtime 
 performance, so it is 'free'.

In this case, the initial issue remains: in order to write code which maintains accuracy regardless of machine precision, it is sometimes necessary to specify the precision that should be used for constants. The original code was an example where weird things happened because that wasn't respected.

Weird things always happen with floating point. It's just a matter of where one chooses the seams to show (you pointed out where seams show in C with temporary precision). I've seen a lot of cases where people were surprised that 0.2f (or similar) was even rounded off, and got caught by the roundoff error.

I used to work in mechanical engineering where a lot of numerical calculations were done. Accumulating roundoff errors were a huge problem, and a lot of (most?) engineers didn't understand it. They were using calculators for long chains of calculation, and rounding off after each step instead of carrying the full calculator precision. They were mystified by getting answers at the end that were way off.

It's my experience with that (and also in college where we were taught to never round off anything but the final answer) that led to the D design decision to internally carry around consts in full precision, regardless of type.

Deliberately reduced precision is something that only experts would want, and only for special cases. So it's reasonable that that would be harder to do (i.e. using hex float constants).

P.S. I also did some digital electronic design work long ago. The cardinal rule there was that since TTL devices got faster all the time, and old slower TTL parts became unavailable, one designed so that swapping in a faster chip would not cause the failure of the system. Hence the rule that increasing the precision of a calculation should not cause the program to fail <g>.
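
The calculator anecdote can be reproduced in a few lines. This is a toy sketch assuming a current D compiler; the two-decimal rounding and the function names are invented for illustration.

```d
import std.math : round;
import std.stdio : writefln;

// Round x to two decimal places, like copying a calculator display by hand.
double round2(double x)
{
    return round(x * 100.0) / 100.0;
}

// Divide by three and multiply back, rounding after every step.
double roundEachStep()
{
    double third = round2(1.0 / 3.0);   // 0.33
    return round2(third * 3.0);         // 0.99
}

// Same chain, rounding only the final answer.
double roundAtEnd()
{
    return round2(1.0 / 3.0 * 3.0);     // 1.00
}

void main()
{
    writefln("%.2f %.2f", roundEachStep(), roundAtEnd());
}
```

Even in a two-step chain the early rounding costs a full unit in the last displayed digit, and longer chains lose correspondingly more.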
Sep 23 2006
Don Clugston <dac nospam.com.au> writes:
Walter Bright wrote:
 Don Clugston wrote:
 Walter Bright wrote:
 A static variable's value can change, so it can't be constant folded. 
 To have it participate in constant folding, it needs to be declared 
 as const.


You can always use hex float constants. I know they're not pretty, but the point of them is to be able to specify exact floating point bit patterns. There are no rounding errors with them.

 1) It's the way C, C++, and Fortran work. Changing the promotion 
 rules would mean that, when translating solid, reliable libraries 
 from those languages to D, one would have to be very, very careful.

That's very important. Still, those languages don't have implicit type deduction. Also, none of those languages guarantee accuracy of decimal->binary conversions, so there's always some error in decimal constants. Incidentally, I recently read that GCC uses something like 160 bits for constant folding, so it's always going to give results that are different to those on other compilers. Why doesn't D behave like C with respect to 'f' suffixes? (Ie, do the conversion, then truncate it to float precision). Actually, I can't imagine many cases where you'd actually want a 'float' constant instead of a 'real' one.

A float constant would be desirable to keep the calculation all floats for speed reasons. I can't think of many reasons one would want reduced precision.

Me, too. In fact I've seen a lot of code where ignorant programmers were adding 'f' to the end of every floating point constant. It could be that the number of cases where you actually care about the precision is so small that hex constants are adequate.
 2) Float and double are expected to be implemented in hardware. 
 Longer precisions are often not available. I wanted to make it 
 practical for a D implementation on those machines to provide a 
 software long precision floating point type, rather than just making 
 real==double. Such a type would be very slow compared with double.

Interesting. I thought that 'real' was supposed to be the highest accuracy fast floating point type, and would therefore be either 64, 80, or 128 bits. So it could also be a double-double? For me, the huge benefit of the 'real' type is that it guarantees that optimisation won't change the results. In C, using doubles, it's quite unpredictable when a temporary will be 80 bits, and when it will be 64 bits. In D, if you stick to real, you're guaranteed that nothing weird will happen. I'd hate to lose that.

I don't see how one would lose that if real were done in software.
 3) Real, even in hardware, is significantly slower than double. Doing 
 constant folding at max precision at compile time won't affect 
 runtime performance, so it is 'free'.

In this case, the initial issue remains: in order to write code which maintains accuracy regardless of machine precision, it is sometimes necessary to specify the precision that should be used for constants. The original code was an example where weird things happened because that wasn't respected.

Weird things always happen with floating point. It's just a matter of where one chooses the seams to show (you pointed out where seams show in C with temporary precision). I've seen a lot of cases where people were surprised that 0.2f (or similar) was even rounded off, and got caught by the roundoff error. I used to work in mechanical engineering where a lot of numerical calculations were done. Accumulating roundoff errors were a huge problem, and a lot (most?) engineers didn't understand it. They were using calculators for long chains of calculation, and rounding off after each step instead of carrying the full calculator precision. They were mystified by getting answers at the end that were way off. It's my experience with that (and also in college where we were taught to never round off anything but the final answer) that led to the D design decision to internally carry around consts in full precision, regardless of type. Deliberately reduced precision is something that only experts would want, and only for special cases. So it's reasonable that that would be harder to do (i.e. using hex float constants).

OK, you've convinced me. It needs to be better documented, though.
 P.S. I also did some digital electronic design work long ago. The 
 cardinal rule there was that since TTL devices got faster all the time, 
 and old slower TTL parts became unavailable, one designed so that 
 swapping in a faster chip would not cause the failure of the system. 
 Hence the rule that increasing the precision of a calculation should not 
 cause the program to fail <g>.

I think it would be useful to specify more precisely what happens in constant folding. E.g., mention that all constant folding will be done in IEEE round-to-nearest, ties-to-even.

In the longer term, I've been wondering if the precision for real constants even needs to be the same as for the 'real' type. I can see some distinct benefits that would come if the precision of literals was defined to always be IEEE quadruple precision. Of course they'd always be rounded to 64 or 80-bit reals when the time came for them to actually be used.

Looking at the spec for the forthcoming IEEE 754R standard, and the state of SSE3 on AMD-64, it seems that Intel/AMD could very easily add a quadruple precision type (they already have 16 128 bit registers, two 64 bit mantissa units, and the quadruple exponent is the same as for x87. So I don't think it would require much silicon, and it would mean they could emulate the x87 stuff entirely on SSE). Some forward-compatibility things to consider in DMD 2.0; ignore for now.
Sep 24 2006
Walter Bright <newshound digitalmars.com> writes:
Don Clugston wrote:
 Walter Bright wrote:
 OK, you've convinced me. It needs to be better documented, though.

I agree with you and Bradley Smith on that.
 P.S. I also did some digital electronic design work long ago. The 
 cardinal rule there was that since TTL devices got faster all the 
 time, and old slower TTL parts became unavailable, one designed so 
 that swapping in a faster chip would not cause the failure of the 
 system. Hence the rule that increasing the precision of a calculation 
 should not cause the program to fail <g>.

I think it would be useful to specify more precisely what happens in constant folding. Eg, mention that all constant folding will be done in IEEE round-to-nearest, ties-to-even.

Yes.
 In the longer term, I've been wondering if the precision for real 
 constants even needs to be the same as for the 'real' type. I can see 
 some distinct benefits that would come if the precision of literals was 
 defined to always be IEEE quadruple precision. Of course they'd always 
 be rounded to 64 or 80-bit reals when the time came for them to actually 
 be used.

I agree.
 Looking at the spec for the forthcoming IEEE 754R standard, and the 
 state of SSE3 on AMD-64, it seems that Intel/AMD could very easily add a 
 quadruple precision type (they already have 16 128 bit registers, two 64 
 bit mantissa units, and the quadruple exponent is the same as for x87. 
 So I don't think it would require much silicon, and it would mean they 
 could emulate the x87 stuff entirely on SSE). Some forward-compatibility 
 things to consider in DMD 2.0; ignore for now.

I was disappointed in the AMD-64 because it didn't do 128 bit floats; in fact, it relegated 80 bit floats to a backwater in the instruction set. Few computer people seem to understand the value in high precision floating point.
Sep 24 2006
parent reply Don Clugston <dac nospam.com.au> writes:
Walter Bright wrote:
 Don Clugston wrote:
 Walter Bright wrote:
 OK, you've convinced me. It needs to be better documented, though.

I agree with you and Bradley Smith on that.
 P.S. I also did some digital electronic design work long ago. The 
 cardinal rule there was that since TTL devices got faster all the 
 time, and old slower TTL parts became unavailable, one designed so 
 that swapping in a faster chip would not cause the failure of the 
 system. Hence the rule that increasing the precision of a calculation 
 should not cause the program to fail <g>.

I think it would be useful to specify more precisely what happens in constant folding. Eg, mention that all constant folding will be done in IEEE round-to-nearest, ties-to-even.

Yes.
 In the longer term, I've been wondering if the precision for real 
 constants even needs to be the same as for the 'real' type. I can see 
 some distinct benefits that would come if the precision of literals 
 was defined to always be IEEE quadruple precision. Of course they'd 
 always be rounded to 64 or 80-bit reals when the time came for them to 
 actually be used.

I agree.

One consequence of that would be in the name mangling for floating point constants in templates. Currently it's 20 hex characters, which only makes sense for a system with 80-bit reals; might be better to make it 32 hex characters, even if the extra 12 are all '0'.
 
 Looking at the spec for the forthcoming IEEE 754R standard, and the 
 state of SSE3 on AMD-64, it seems that Intel/AMD could very easily add 
 a quadruple precision type (they already have 16 128 bit registers, 
 two 64 bit mantissa units, and the quadruple exponent is the same as 
 for x87. So I don't think it would require much silicon, and it would 
 mean they could emulate the x87 stuff entirely on SSE). Some 
 forward-compatibility things to consider in DMD 2.0; ignore for now.

I was disappointed in the AMD-64 because it didn't do 128 bit floats, in fact, it relegated 80 bit floats to a backwater in the instruction set. Few computer people seem to understand the value in high precision floating point.

Intel seems to be better than AMD in this regard. Intel added an 82 bit floating point type to the Itanium so that it could do 80-bit hypot() without overflow (in fact, Itanium seems to have by far the best floating point support that I've seen); AMD's 3DNow! didn't even support subnormals, infinity, or NaN.
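
The overflow concern behind the hypot() remark can be illustrated with a minimal sketch. This is not code from the thread: `naiveHypot` is a hypothetical helper, and the operands are chosen only to make the intermediate squares exceed the range of `real`. It assumes a current Phobos, where `std.math.hypot` is documented as avoiding undue overflow.

```d
import std.math : hypot, isInfinity, sqrt;
import std.stdio;

/// Hypothetical helper: the textbook formula. The squares x*x and y*y
/// can exceed the exponent range even when the true result fits.
real naiveHypot(real x, real y)
{
    return sqrt(x * x + y * y);
}

void main()
{
    // Large but finite operands, picked for illustration only.
    real x = real.max / 2;
    real y = real.max / 2;

    // The true hypotenuse is about 0.707 * real.max, which is representable.
    writeln("naive formula overflowed: ", isInfinity(naiveHypot(x, y)));
    writeln("std.math.hypot overflowed: ", isInfinity(hypot(x, y)));
}
```

A format with extra exponent bits, such as the 82-bit Itanium type mentioned above, lets the naive formula work directly, because the intermediate squares no longer overflow.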
Sep 24 2006
next sibling parent reply Walter Bright <newshound digitalmars.com> writes:
Don Clugston wrote:
 One consequence of that would be in the name mangling for floating point 
  constants in templates. Currently it's 20 hex characters, which only 
 makes sense for a system with 80-bit reals; might be better to make it 
 32 hex characters, even if the extra 12 are all '0'.

I'm reluctant to do that because there are already problems with the mangled names getting too long.
Sep 25 2006
parent reply xs0 <xs0 xs0.com> writes:
Walter Bright wrote:
 Don Clugston wrote:
 One consequence of that would be in the name mangling for floating 
 point  constants in templates. Currently it's 20 hex characters, which 
 only makes sense for a system with 80-bit reals; might be better to 
 make it 32 hex characters, even if the extra 12 are all '0'.

I'm reluctant to do that because there are already problems with the mangled names getting too long.

What if you used characters other than A-F to compress the zeros?

G = 2 * '0'
H = 3 * '0'
...
Z = 21 * '0'

xs0
Sep 25 2006
parent Walter Bright <newshound digitalmars.com> writes:
xs0 wrote:
 Walter Bright wrote:
 Don Clugston wrote:
 One consequence of that would be in the name mangling for floating 
 point  constants in templates. Currently it's 20 hex characters, 
 which only makes sense for a system with 80-bit reals; might be 
 better to make it 32 hex characters, even if the extra 12 are all '0'.

I'm reluctant to do that because there are already problems with the mangled names getting too long.

What if you used characters other than A-F to compress the zeros?

G = 2 * '0'
H = 3 * '0'
...
Z = 21 * '0'

Compression is one solution.
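
The run-length scheme xs0 proposes could be sketched roughly as follows. This is only an illustration of the G..Z mapping from the post, not how DMD mangles names; `compressZeros` is a hypothetical helper, and the sample input is an arbitrary hex string padded with 12 zeros.

```d
import std.stdio;

/// Hypothetical helper: replaces each run of 2..21 '0' characters with a
/// single letter, G = 2 zeros, H = 3 zeros, ..., Z = 21 zeros.
/// A lone '0' is left as-is; longer runs are split into several letters.
string compressZeros(string hex)
{
    string result;
    size_t i = 0;
    while (i < hex.length)
    {
        if (hex[i] != '0')
        {
            result ~= hex[i];
            ++i;
            continue;
        }
        // Measure the length of this run of zeros.
        size_t run = 0;
        while (i + run < hex.length && hex[i + run] == '0')
            ++run;
        i += run;
        // Emit one letter per chunk of at most 21 zeros.
        while (run >= 2)
        {
            size_t n = run > 21 ? 21 : run;
            result ~= cast(char)('G' + (n - 2));
            run -= n;
        }
        // A single leftover zero has no letter in the scheme.
        if (run == 1)
            result ~= '0';
    }
    return result;
}

void main()
{
    // 8 hex digits followed by 12 padding zeros (illustrative input).
    writeln(compressZeros("CDCCFC3F000000000000"));
}
```

With this mapping, the 12 padding zeros of a 32-character constant would cost one extra character rather than twelve.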
Sep 25 2006
prev sibling parent Sean Kelly <sean f4.ca> writes:
Don Clugston wrote:
 Walter Bright wrote:
 I was disappointed in the AMD-64 because it didn't do 128 bit floats, 
 in fact, it relegated 80 bit floats to a backwater in the instruction 
 set. Few computer people seem to understand the value in high 
 precision floating point.

Intel seems to be better than AMD in this regard. Intel added an 82 bit floating point type to the Itanium so that it could do 80-bit hypot() without overflow (in fact, Itanium seems to have by far the best floating point support that I've seen); AMD's 3DNow! didn't even support subnormals, infinity, or NaN.

I think AMD simply set its sights on the game industry as the battleground, which seems to be supported by the presence of forums on LAN parties and system modding (http://forums.amd.com/). This stands in contrast with Intel, which has an entire set of forums for software development (http://softwareforums.intel.com/). I decided to ask whether AMD has another location for software development discussion. I have no idea whether science-minded software companies or developers communicate to AMD that they'd like improved floating-point support, but a bit more couldn't hurt.

Sean
Sep 25 2006
prev sibling parent Bradley Smith <digitalmars-com baysmith.com> writes:
To summarize:

---

The compiler is allowed to evaluate intermediate results at a greater 
precision than that of the operands. The literal type suffix (like 'f') 
only indicates the type. The compiler may maintain internally as much 
precision as possible, for purposes of constant folding. Committing the 
actual precision of the result is done as late as possible.

For a low-precision constant, put the value into a static, non-const 
variable. Since this is not really a constant, it cannot be constant 
folded and therefore cannot be affected by a possible compile-time 
increase in precision. However, if mixed with a higher precision at 
runtime, an increase in precision will still occur.

The way to write robust floating point calculations in D is to ensure
that increasing the precision of the calculations will not break the 
result.

--- end of summary
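
A minimal sketch of the distinction summarized above, using the STEP_SIZE constant from the original report. The names `stepSizeVar`, `foldedCompare` and `runtimeCompare` are hypothetical, and whether the two comparisons actually differ at j == 5 depends on the compiler, its flags and the target, so no particular output is claimed.

```d
import std.stdio;

// Compile-time constant, as in the original report: the division below
// may be constant folded at a precision higher than float.
const float STEP_SIZE = 0.2f;

// Static, non-const variable (hypothetical name): the value is loaded at
// runtime, so the division is not folded from a literal. An optimizer
// may still propagate the value; this is only a sketch.
static float stepSizeVar = 0.2f;

bool foldedCompare(float j)
{
    return j <= (1.0f / STEP_SIZE);
}

bool runtimeCompare(float j)
{
    return j <= (1.0f / stepSizeVar);
}

void main()
{
    // 1.0f / 0.2f lies very close to 5, so j == 5 is the sensitive case.
    float j = 5.0f;
    writeln("const (folded):   ", foldedCompare(j));
    writeln("static (runtime): ", runtimeCompare(j));
}
```

Values of j well away from 5 compare the same way under either form; only the boundary case is sensitive to the precision at which the quotient is evaluated.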

This is the explanation I was looking for. It was already clear that 
at runtime D evaluates intermediate results at high precision. However, 
the compile-time behavior (namely using a const) is different from the 
runtime behavior (using a static), and I don't think that is clearly 
explained in the documentation.

Would you please add this information to the D documentation? Perhaps an 
addition to the Floating Point page 
(http://www.digitalmars.com/d/float.html). Of course, if any of the 
above is incorrect, please change as necessary.

A follow-on question would be: How does one create a low-precision 
constant that is ensured to actually stay constant? A static won't do, 
since a static is really non-const, and a programming error could change 
the value.


Thanks,
   Bradley
Sep 22 2006
prev sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=360


smjg iname.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |smjg iname.com




------- Comment #7 from smjg iname.com  2006-09-22 15:06 -------
(In reply to comment #5)
 const float A = 0.2;  // infinitely accurate 0.2, but type inference on A
                       // should return a float.
 
 const float B = 0.2f; // a 32-bit approximation to 0.2
 const real C = 0.2;   // infinitely accurate 0.2
 const real D = 0.2f;  // a 32-bit approximation to 0.2, but type inference
                       // will give an 80-bit quantity.

I agree. Only I'm not sure about A. If you want it to be "infinitely accurate", then why would you declare it to be a float? It appears to me to be a means by which a float can hold more precision than it really can.

On the other hand, D should definitely generate a 32-bit approximation to 0.2. By using the 'f' suffix, this is exactly what the programmer asked for.


-- 
Sep 22 2006