www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Question/request/bug(?) re. floating-point in dmd

reply "Apollo Hogan" <apollo.hogan gmail.com> writes:
Hi all-

I'm a newcomer to the D language, but am quite impressed with it. 
  I've run into an issue, though, in a little learning project.  
I'm implementing a "double-double" class based on a well-known 
trick using standard floating-point arithmetic.  (See, for 
example, 
http://en.wikipedia.org/wiki/Quadruple-precision_floating-point_format#Doub
e-double_arithmetic 
and links).

The techniques used to get this kind of code to work correctly, 
rely on subtle properties of IEEE floating point numbers.  Mixing 
precisions or optimisations in floating point code can destroy 
the results.

For example, the appended code produces the following output when 
compiled (DMD32 D Compiler v2.063.2 under WinXP/cygwin) with no 
optimization:

immutable(pair)(1.1, -2.03288e-20)
pair(1, 0.1)
pair(1.1, -8.32667e-17)

and the following results when compiled with optimization (-O):

immutable(pair)(1.1, -2.03288e-20)
pair(1, 0.1)
pair(1.1, 0)

The desired result would be:

immutable(pair)(1.1, -8.32667e-17)
pair(1, 0.1)
pair(1.1, -8.32667e-17)

That is: without optimization the run-time "normalization" is 
correct.  With optimization it is broken.  That is pretty easy to 
work around by simply compiling the relevant library without 
optimization.  (Though it would be nice to have, for example, 
pragmas to mark some functions as "delicate" or 
"non-optimizable".)  A bigger issue is that the compile-time 
normalization call gives the 'wrong' answer consistently with or 
without optimization.  One would expect that evaluating a pure 
function at run-time or compile-time would give the same result...

Cheers,
--Apollo

import std.stdio;
struct pair { double hi, lo; }
pair normalize(pair q)
{
   double h = q.hi + q.lo;
   double l = q.lo + (q.hi - h);
   return pair(h,l);
}
void main()
{
   immutable static pair spn = normalize(pair(1.0,0.1));
   writeln(spn);
   writeln(pair(1.0,0.1));
   writeln(normalize(pair(1.0,0.1)));
}
Oct 23 2013
next sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Apollo Hogan:

import std.stdio;

struct Pair { double hi, lo; }

Pair normalize(Pair q) {
     immutable h = q.hi + q.lo;
     immutable l = q.lo + (q.hi - h);
     return Pair(h, l);
}
void main() {
     immutable static Pair spn = Pair(1.0, 0.1).normalize;
     spn.writeln;
     Pair(1.0, 0.1).writeln;
     Pair(1.0, 0.1).normalize.writeln;
}

...>ldmd2 -O -run temp.d
immutable(Pair)(1.1, -2.03288e-020)
Pair(1, 0.1)
Pair(1.1, -8.32667e-017)

...>ldmd2 -run temp.d
immutable(Pair)(1.1, -2.03288e-020)
Pair(1, 0.1)
Pair(1.1, -8.32667e-017)

ldc2 0.11.0

Bye,
bearophile
Oct 23 2013
parent "Apollo Hogan" <apollo.hogan gmail.com> writes:
Hi Bearophile-

Interesting.  Looks like the run-time calculation in ldmd2 works 
fine.

The compile-time computation in both my and your examples looks 
like it is being done in 80-bit arithmetic.

Thanks,
--Apollo
Oct 23 2013
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 10/23/2013 8:44 AM, Apollo Hogan wrote:
 That is: without optimization the run-time "normalization" is correct.  With
 optimization it is broken.  That is pretty easy to work around by simply
 compiling the relevant library without optimization.  (Though it would be nice
 to have, for example, pragmas to mark some functions as "delicate" or
 "non-optimizable".)  A bigger issue is that the compile-time normalization call
 gives the 'wrong' answer consistently with or without optimization.  One would
 expect that evaluating a pure function at run-time or compile-time would give
 the same result...
A D compiler is allowed to compute floating point results at arbitrarily large precision - the storage size (float, double, real) only specify the minimum precision. This behavior is fairly deeply embedded into the front end, optimizer, and various back ends. To precisely control maximum precision, I suggest using inline assembler to use the exact sequence of instructions needed for double-double operations.
Oct 23 2013
next sibling parent reply "David Nadlinger" <code klickverbot.at> writes:
On Wednesday, 23 October 2013 at 16:15:56 UTC, Walter Bright 
wrote:
 A D compiler is allowed to compute floating point results at 
 arbitrarily large precision - the storage size (float, double, 
 real) only specify the minimum precision.

 This behavior is fairly deeply embedded into the front end, 
 optimizer, and various back ends.
I know we've had this topic before, but just for the record, I'm still not sold on the idea of allowing CTFE to yield different results than runtime execution. David
Oct 23 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 10/23/2013 9:22 AM, David Nadlinger wrote:
 On Wednesday, 23 October 2013 at 16:15:56 UTC, Walter Bright wrote:
 A D compiler is allowed to compute floating point results at arbitrarily large
 precision - the storage size (float, double, real) only specify the minimum
 precision.

 This behavior is fairly deeply embedded into the front end, optimizer, and
 various back ends.
I know we've had this topic before, but just for the record, I'm still not sold on the idea of allowing CTFE to yield different results than runtime execution.
Java initially tried to enforce a maximum precision, and it was a major disaster for them. If I have been unable to convince you, I suggest reviewing that case history. Back when I designed and built digital electronics boards, it was beaten into my skull that chips always get faster, never slower, and the slower parts routinely became unavailable. This means that the circuits got designed with maximum propagation delays in mind, and with a minimum delay of 0. Then, when they work with a slow part, they'll still work if you swap in a faster one. FP precision is the same concept. Swap in more precision, and your correctly designed algorithm will still work.
Oct 23 2013
next sibling parent reply "Apollo Hogan" <apollo.hogan gmail.com> writes:
On Wednesday, 23 October 2013 at 16:50:52 UTC, Walter Bright 
wrote:
 On 10/23/2013 9:22 AM, David Nadlinger wrote:
 On Wednesday, 23 October 2013 at 16:15:56 UTC, Walter Bright 
 wrote:
 A D compiler is allowed to compute floating point results at 
 arbitrarily large
 precision - the storage size (float, double, real) only 
 specify the minimum
 precision.

 This behavior is fairly deeply embedded into the front end, 
 optimizer, and
 various back ends.
I know we've had this topic before, but just for the record, I'm still not sold on the idea of allowing CTFE to yield different results than runtime execution.
Java initially tried to enforce a maximum precision, and it was a major disaster for them. If I have been unable to convince you, I suggest reviewing that case history. Back when I designed and built digital electronics boards, it was beaten into my skull that chips always get faster, never slower, and the slower parts routinely became unavailable. This means that the circuits got designed with maximum propagation delays in mind, and with a minimum delay of 0. Then, when they work with a slow part, they'll still work if you swap in a faster one. FP precision is the same concept. Swap in more precision, and your correctly designed algorithm will still work.
There are a couple of points here: - it seems that whatever the semantics of floating-point arithmetic, they should be the same at compile-time as at run-time. - I agree that the majority of floating point code is only improved by increasing the working precision. (If we don't worry about reproducibility across compilers/machines/etc.) The "real" data-type seems to be designed exactly for this: use "real" in numerical code and the compiler will give you a good answer at the highest performant precision. However there _are_ cases where it can be very useful to have precise control of the precision that one is using. Implementing double-double or quadruple-double data types is an example here. Viewing D as a _systems_ language, I'd like to have the ability to just have it do what I ask (and being forced to go to assembler seems unreasonable...) Anyway, thanks for the replies. I guess I've got to go off and design the brand new D^^2 language to conform to my whims now. Cheers, --Apollo
Oct 23 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 10/23/2013 11:39 AM, Apollo Hogan wrote:
 There are a couple of points here:

 - it seems that whatever the semantics of floating-point arithmetic, they
should
 be the same at compile-time as at run-time.
It's not very practical, especially considering that the compile time environment may be not at all the same as the runtime one.
 - I agree that the majority of floating point code is only improved by
 increasing the working precision.  (If we don't worry about reproducibility
 across compilers/machines/etc.)
As I mentioned earlier, Java initially mandated that floating point results be exactly reproducible in multiple environments. This was a fine idea, and turned out to be completely unworkable in practice.
 The "real" data-type seems to be designed
 exactly for this: use "real" in numerical code and the compiler will give you a
 good answer at the highest performant precision.  However there _are_ cases
 where it can be very useful to have precise control of the precision that one
is
 using.  Implementing double-double or quadruple-double data types is an example
 here.
It's not that bad. You can also force a reduction in precision by calling a function like this: double identity(double d) { return d; } and ensuring (via separate compilation) that the compiler cannot inline calls to identity().
 Viewing D as a _systems_ language, I'd like to have the ability to just
 have it do what I ask (and being forced to go to assembler seems
 unreasonable...)
Perhaps, but I think you are treading on implementation defined behavior here for most languages, and will be hard pressed to find a language that *guarantees* the loss of precision you desire, even though it may deliver that behavior on various platforms with various compiler switches.
 Anyway, thanks for the replies.  I guess I've got to go off and design the
brand
 new D^^2 language to conform to my whims now.
Join the club! :-)
Oct 23 2013
parent reply "Apollo Hogan" <apollo.hogan gmail.com> writes:
On Wednesday, 23 October 2013 at 19:10:22 UTC, Walter Bright 
wrote:
 On 10/23/2013 11:39 AM, Apollo Hogan wrote:
 There are a couple of points here:

 - it seems that whatever the semantics of floating-point 
 arithmetic, they should
 be the same at compile-time as at run-time.
It's not very practical, especially considering that the compile time environment may be not at all the same as the runtime one.
Understood, but it certainly was a surprising result to me that compiling and running the program on the same platform I got different results for a static vs. non-static variable initialization... (My version of PI as 3.14159265358979311594779789241 was a bit confusing...)
 It's not that bad. You can also force a reduction in precision 
 by calling a function like this:

     double identity(double d) { return d; }

 and ensuring (via separate compilation) that the compiler 
 cannot inline calls to identity().
Thanks, a useful trick. It at least lets me confound the optimizer a bit. (Though doesn't help with the compile vs. run headache. Though this seems to be workaroundable by using a static constructor. Yeah, I'm a noob.) Thanks for the replies, --Apollo
Oct 23 2013
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 10/23/13 1:59 PM, Apollo Hogan wrote:
 Thanks, a useful trick.  It at least lets me confound the optimizer a
 bit.  (Though doesn't help with the compile vs. run headache.  Though
 this seems to be workaroundable by using a static constructor.  Yeah,
 I'm a noob.)
+1 for "workaroundable". Andrei
Oct 23 2013
prev sibling next sibling parent reply "qznc" <qznc web.de> writes:
On Wednesday, 23 October 2013 at 16:50:52 UTC, Walter Bright 
wrote:
 Java initially tried to enforce a maximum precision, and it was 
 a major disaster for them. If I have been unable to convince 
 you, I suggest reviewing that case history.
It was recently on Hacker News. Here is one of the relevant rants: http://www.cs.berkeley.edu/~wkahan/JAVAhurt.pdf Page 16 is probably the core argument "Linguistically legislated exact reproducibility is unenforceable", but everything Kahan says here is worth listening to (despite the ugly presentation).
Oct 23 2013
parent Walter Bright <newshound2 digitalmars.com> writes:
On 10/23/2013 10:45 PM, qznc wrote:
 Page 16 is probably the core argument "Linguistically legislated exact
 reproducibility is unenforceable", but everything Kahan says here is worth
 listening to (despite the ugly presentation).
I agree that Prof Kahan is well worth listening to for anyone with even a passing interest in floating point arithmetic.
Oct 23 2013
prev sibling parent reply "Don" <x nospam.com> writes:
On Wednesday, 23 October 2013 at 16:50:52 UTC, Walter Bright 
wrote:
 On 10/23/2013 9:22 AM, David Nadlinger wrote:
 On Wednesday, 23 October 2013 at 16:15:56 UTC, Walter Bright 
 wrote:
 A D compiler is allowed to compute floating point results at 
 arbitrarily large
 precision - the storage size (float, double, real) only 
 specify the minimum
 precision.

 This behavior is fairly deeply embedded into the front end, 
 optimizer, and
 various back ends.
I know we've had this topic before, but just for the record, I'm still not sold on the idea of allowing CTFE to yield different results than runtime execution.
Java initially tried to enforce a maximum precision, and it was a major disaster for them. If I have been unable to convince you, I suggest reviewing that case history. Back when I designed and built digital electronics boards, it was beaten into my skull that chips always get faster, never slower, and the slower parts routinely became unavailable. This means that the circuits got designed with maximum propagation delays in mind, and with a minimum delay of 0. Then, when they work with a slow part, they'll still work if you swap in a faster one. FP precision is the same concept. Swap in more precision, and your correctly designed algorithm will still work.
THIS IS COMPLETELY WRONG. You cannot write serious floating-point code under such circumstances. This takes things back to the bad old days before IEEE, where results were implementation-dependent. We have these wonderful properties, float.epsilon, etc, which allow code to adapt to machine differences. The correct approach is to write generic code which will give full machine precision and will work on any machine configuration. That's actually quite easy. But to write code which will function correctly when an unspecified and unpredictable error can be added to any calculation -- I believe that's impossible. I don't know how to write such code.
Oct 29 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 10/29/2013 5:08 AM, Don wrote:
 On Wednesday, 23 October 2013 at 16:50:52 UTC, Walter Bright wrote:
 On 10/23/2013 9:22 AM, David Nadlinger wrote:
 On Wednesday, 23 October 2013 at 16:15:56 UTC, Walter Bright wrote:
 A D compiler is allowed to compute floating point results at arbitrarily large
 precision - the storage size (float, double, real) only specify the minimum
 precision.

 This behavior is fairly deeply embedded into the front end, optimizer, and
 various back ends.
I know we've had this topic before, but just for the record, I'm still not sold on the idea of allowing CTFE to yield different results than runtime execution.
Java initially tried to enforce a maximum precision, and it was a major disaster for them. If I have been unable to convince you, I suggest reviewing that case history. Back when I designed and built digital electronics boards, it was beaten into my skull that chips always get faster, never slower, and the slower parts routinely became unavailable. This means that the circuits got designed with maximum propagation delays in mind, and with a minimum delay of 0. Then, when they work with a slow part, they'll still work if you swap in a faster one. FP precision is the same concept. Swap in more precision, and your correctly designed algorithm will still work.
THIS IS COMPLETELY WRONG. You cannot write serious floating-point code under such circumstances. This takes things back to the bad old days before IEEE, where results were implementation-dependent. We have these wonderful properties, float.epsilon, etc, which allow code to adapt to machine differences. The correct approach is to write generic code which will give full machine precision and will work on any machine configuration. That's actually quite easy. But to write code which will function correctly when an unspecified and unpredictable error can be added to any calculation -- I believe that's impossible. I don't know how to write such code.
Unpredictable, sure, but it is unpredictable in that the error is less than a guaranteed maximum error. The error falls in a range 0<=error<=epsilon. As an analogy, all engineering parts are designed with a maximum deviation from the ideal size, not a minimum deviation.
Oct 29 2013
parent reply "Don" <x nospam.com> writes:
On Tuesday, 29 October 2013 at 19:42:08 UTC, Walter Bright wrote:
 On 10/29/2013 5:08 AM, Don wrote:
 On Wednesday, 23 October 2013 at 16:50:52 UTC, Walter Bright 
 wrote:
 On 10/23/2013 9:22 AM, David Nadlinger wrote:
 On Wednesday, 23 October 2013 at 16:15:56 UTC, Walter Bright 
 wrote:
 A D compiler is allowed to compute floating point results 
 at arbitrarily large
 precision - the storage size (float, double, real) only 
 specify the minimum
 precision.

 This behavior is fairly deeply embedded into the front end, 
 optimizer, and
 various back ends.
I know we've had this topic before, but just for the record, I'm still not sold on the idea of allowing CTFE to yield different results than runtime execution.
Java initially tried to enforce a maximum precision, and it was a major disaster for them. If I have been unable to convince you, I suggest reviewing that case history. Back when I designed and built digital electronics boards, it was beaten into my skull that chips always get faster, never slower, and the slower parts routinely became unavailable. This means that the circuits got designed with maximum propagation delays in mind, and with a minimum delay of 0. Then, when they work with a slow part, they'll still work if you swap in a faster one. FP precision is the same concept. Swap in more precision, and your correctly designed algorithm will still work.
THIS IS COMPLETELY WRONG. You cannot write serious floating-point code under such circumstances. This takes things back to the bad old days before IEEE, where results were implementation-dependent. We have these wonderful properties, float.epsilon, etc, which allow code to adapt to machine differences. The correct approach is to write generic code which will give full machine precision and will work on any machine configuration. That's actually quite easy. But to write code which will function correctly when an unspecified and unpredictable error can be added to any calculation -- I believe that's impossible. I don't know how to write such code.
Unpredictable, sure, but it is unpredictable in that the error is less than a guaranteed maximum error. The error falls in a range 0<=error<=epsilon. As an analogy, all engineering parts are designed with a maximum deviation from the ideal size, not a minimum deviation.
I don't think the analagy is strong. There's no reason for there to be any error at all. Besides, in the x87 case, there are exponent errors as well precision. Eg, double.min * double.min can be zero on some systems, but non-zero on others. This causes a total loss of precision. If this is allowed to happen anywhere (and not even consistently) then it's back to the pre-IEEE 754 days: underflow and overflow lead to unspecified behaviour. The idea that extra precision is always a good thing, is simply incorrect. The problem is that, if calculations can carry extra precision, double rounding can occur. This is a form of error that doesn't otherwise exist. If all calculations are allowed to do it, there is absolutely nothing you can do to fix the problem. Thus we lose the other major improvement from IEEE 754: predictable rounding behaviour. Fundamentally, there is a primitive operation "discard extra precision" which is crucial to most mathematical algorithms but which is rarely explicit. In theory in C and C++ this is applied at each sequence point, but in practice that's not actually done (for x87 anyway) -- for performance, you want to be able to keep values in registers sometimes. So C didn't get this exactly right. I think we can do better. But the current behaviour is worse. This issue is becoming more obvious in CTFE because the extra precision is not merely theoretical, it actually happens.
Oct 30 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 10/30/2013 6:50 AM, Don wrote:
 Unpredictable, sure, but it is unpredictable in that the error is less than a
 guaranteed maximum error. The error falls in a range 0<=error<=epsilon. As an
 analogy, all engineering parts are designed with a maximum deviation from the
 ideal size, not a minimum deviation.
I don't think the analagy is strong. There's no reason for there to be any error at all. Besides, in the x87 case, there are exponent errors as well precision. Eg, double.min * double.min can be zero on some systems, but non-zero on others. This causes a total loss of precision. If this is allowed to happen anywhere (and not even consistently) then it's back to the pre-IEEE 754 days: underflow and overflow lead to unspecified behaviour. The idea that extra precision is always a good thing, is simply incorrect.
Not exactly what I meant - I mean the algorithm should be designed so that extra precision does not break it.
 The problem is that, if calculations can carry extra precision, double rounding
 can occur. This is a form of error that doesn't otherwise exist. If all
 calculations are allowed to do it, there is absolutely nothing you can do to
fix
 the problem.

 Thus we lose the other major improvement from IEEE 754: predictable rounding
 behaviour.

 Fundamentally, there is a primitive operation "discard extra precision" which
is
 crucial to most mathematical algorithms but which is rarely explicit.
 In theory in C and C++ this is applied at each sequence point, but in practice
 that's not actually done (for x87 anyway) -- for performance, you want to be
 able to keep values in registers sometimes. So C didn't get this exactly right.
 I think we can do better. But the current behaviour is worse.

 This issue is becoming more obvious in CTFE because the extra precision is not
 merely theoretical, it actually happens.
I think it's reasonable to add 3 functions (similar to peek and poke) that force rounding to float/double/real precision. By inserting that into the code where the algorithm requires it would make it far more clear than the C idea of "sequence points" and having no clue whether they matter or not.
Oct 30 2013
parent reply "Don" <x nospam.com> writes:
On Wednesday, 30 October 2013 at 18:28:14 UTC, Walter Bright 
wrote:
 On 10/30/2013 6:50 AM, Don wrote:
 Unpredictable, sure, but it is unpredictable in that the 
 error is less than a
 guaranteed maximum error. The error falls in a range 
 0<=error<=epsilon. As an
 analogy, all engineering parts are designed with a maximum 
 deviation from the
 ideal size, not a minimum deviation.
I don't think the analagy is strong. There's no reason for there to be any error at all. Besides, in the x87 case, there are exponent errors as well precision. Eg, double.min * double.min can be zero on some systems, but non-zero on others. This causes a total loss of precision. If this is allowed to happen anywhere (and not even consistently) then it's back to the pre-IEEE 754 days: underflow and overflow lead to unspecified behaviour. The idea that extra precision is always a good thing, is simply incorrect.
Not exactly what I meant - I mean the algorithm should be designed so that extra precision does not break it.
Unfortunately, that's considerably more difficult than writing an algorithm for a known precision. And it is impossible in any case where you need full machine precision (which applies to practically all library code, and most of my work).
 The problem is that, if calculations can carry extra 
 precision, double rounding
 can occur. This is a form of error that doesn't otherwise 
 exist. If all
 calculations are allowed to do it, there is absolutely nothing 
 you can do to fix
 the problem.

 Thus we lose the other major improvement from IEEE 754: 
 predictable rounding
 behaviour.

 Fundamentally, there is a primitive operation "discard extra 
 precision" which is
 crucial to most mathematical algorithms but which is rarely 
 explicit.
 In theory in C and C++ this is applied at each sequence point, 
 but in practice
 that's not actually done (for x87 anyway) -- for performance, 
 you want to be
 able to keep values in registers sometimes. So C didn't get 
 this exactly right.
 I think we can do better. But the current behaviour is worse.

 This issue is becoming more obvious in CTFE because the extra 
 precision is not
 merely theoretical, it actually happens.
I think it's reasonable to add 3 functions (similar to peek and poke) that force rounding to float/double/real precision. By inserting that into the code where the algorithm requires it would make it far more clear than the C idea of "sequence points" and having no clue whether they matter or not.
Yeah, the sequence points thing is a bit of a failure. It introduces such a performance penalty, that compilers don't actually respect it, and nobody would want them to. A compiler intrinsic, which generates no code (simply inserting a barrier for the optimiser) sounds like the correct approach. Coming up for a name for this operation is difficult.
Nov 05 2013
next sibling parent reply "bearophile" <bearophileHUGS lycos.com> writes:
Don:

 A compiler intrinsic, which generates no code (simply inserting 
 a barrier for the optimiser) sounds like the correct approach.

 Coming up for a name for this operation is difficult.
Something like this? noFPOpt naiveFP literalFP asisFP FPbarrier barrierFP Bye, bearophile
Nov 05 2013
parent "qznc" <qznc web.de> writes:
On Tuesday, 5 November 2013 at 16:31:23 UTC, bearophile wrote:
 Don:

 A compiler intrinsic, which generates no code (simply 
 inserting a barrier for the optimiser) sounds like the correct 
 approach.

 Coming up for a name for this operation is difficult.
Something like this? noFPOpt naiveFP literalFP asisFP FPbarrier barrierFP
The name should be about the semantic, not the intended behavior (optimization barrier). My ideas: x + precisely!float(y - z) x + exactly!float(y - z) x + exact!float(y - z) x + strictly!float(y - z) x + strict!float(y - z) x + strictfp!float(y - z) // familiar for Java programmers
Nov 05 2013
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/5/2013 8:19 AM, Don wrote:
 On Wednesday, 30 October 2013 at 18:28:14 UTC, Walter Bright wrote:
 Not exactly what I meant - I mean the algorithm should be designed so that
 extra precision does not break it.
Unfortunately, that's considerably more difficult than writing an algorithm for a known precision. And it is impossible in any case where you need full machine precision (which applies to practically all library code, and most of my work).
I have a hard time buying this. For example, when I wrote matrix inversion code, more precision was always gave more accurate results.
 A compiler intrinsic, which generates no code (simply inserting a barrier for
 the optimiser) sounds like the correct approach.

 Coming up for a name for this operation is difficult.
float toFloatPrecision(real arg) ?
Nov 05 2013
next sibling parent "John Colvin" <john.loughran.colvin gmail.com> writes:
On Wednesday, 6 November 2013 at 06:28:59 UTC, Walter Bright 
wrote:
 On 11/5/2013 8:19 AM, Don wrote:
 On Wednesday, 30 October 2013 at 18:28:14 UTC, Walter Bright 
 wrote:
 Not exactly what I meant - I mean the algorithm should be 
 designed so that
 extra precision does not break it.
Unfortunately, that's considerably more difficult than writing an algorithm for a known precision. And it is impossible in any case where you need full machine precision (which applies to practically all library code, and most of my work).
I have a hard time buying this. For example, when I wrote matrix inversion code, more precision was always gave more accurate results.
I had a chat with a fluid simulation expert (mostly plasma and microfluids) with a broad computing background and the only algorithms he could think of that are by necessity fussy about max precision are elliptical curve algorithms.
 A compiler intrinsic, which generates no code (simply 
 inserting a barrier for
 the optimiser) sounds like the correct approach.

 Coming up for a name for this operation is difficult.
float toFloatPrecision(real arg) ?
Nov 06 2013
prev sibling parent reply "Don" <x nospam.com> writes:
On Wednesday, 6 November 2013 at 06:28:59 UTC, Walter Bright 
wrote:
 On 11/5/2013 8:19 AM, Don wrote:
 On Wednesday, 30 October 2013 at 18:28:14 UTC, Walter Bright 
 wrote:
 Not exactly what I meant - I mean the algorithm should be 
 designed so that
 extra precision does not break it.
Unfortunately, that's considerably more difficult than writing an algorithm for a known precision. And it is impossible in any case where you need full machine precision (which applies to practically all library code, and most of my work).
I have a hard time buying this. For example, when I wrote matrix inversion code, more precision was always gave more accurate results.
With matrix inversion you're normally far from full machine precision. If half the bits are correct, you're doing very well. The situations I'm referring to, are the ones where the result is correctly rounded, when no extra precision is present. If you then go and add extra precision to some or all of the intermediate results, the results will no longer be correctly rounded. eg, the simplest case is rounding to integer: 3.499999999999999999999999999 must round to 3. If you round it twice, you'll get 4. But we can test this. I predict that adding some extra bits to the internal calculations in CTFE (to make it have eg 128 bit intermediate values instead of 80), will cause Phobos math unit tests to break. Perhaps this can already be done trivially in GCC.
 A compiler intrinsic, which generates no code (simply 
 inserting a barrier for
 the optimiser) sounds like the correct approach.

 Coming up for a name for this operation is difficult.
float toFloatPrecision(real arg) ?
Meh. That's wordy and looks like a rounding operation. I'm interested in the operation float -> float and double -> double (and perhaps real->real), where no conversion is happening, and on most architectures it will be a no-op. It should be a name that indicates that it's not generating any code, you're just forbidding the compiler from doing funky weird stuff. And for generic code, the name should be the same for float, double, and real. Perhaps an attribute rather than a function call. double x; double y = x.strictfloat; double y = x.strictprecision; ie, (expr).strictfloat would return expr, discarding any extra precision. That's the best I've come up with so far.
Nov 06 2013
next sibling parent reply Iain Buclaw <ibuclaw ubuntu.com> writes:
On 6 November 2013 09:09, Don <x nospam.com> wrote:

 On Wednesday, 6 November 2013 at 06:28:59 UTC, Walter Bright wrote:

 On 11/5/2013 8:19 AM, Don wrote:

 On Wednesday, 30 October 2013 at 18:28:14 UTC, Walter Bright wrote:

 Not exactly what I meant - I mean the algorithm should be designed so
 that
 extra precision does not break it.
Unfortunately, that's considerably more difficult than writing an algorithm for a known precision. And it is impossible in any case where you need full machine precision (which applies to practically all library code, and most of my work).
I have a hard time buying this. For example, when I wrote matrix inversion code, more precision was always gave more accurate results.
With matrix inversion you're normally far from full machine precision. If half the bits are correct, you're doing very well. The situations I'm referring to, are the ones where the result is correctly rounded, when no extra precision is present. If you then go and add extra precision to some or all of the intermediate results, the results will no longer be correctly rounded. eg, the simplest case is rounding to integer: 3.499999999999999999999999999 must round to 3. If you round it twice, you'll get 4. But we can test this. I predict that adding some extra bits to the internal calculations in CTFE (to make it have eg 128 bit intermediate values instead of 80), will cause Phobos math unit tests to break. Perhaps this can already be done trivially in GCC.
The only tests that break in GDC because GCC operates on 160 bit intermediate values are the 80-bit specific tests (the unittest in std.math with the comment "Note that these are only valid for 80-bit reals"). Saying that though, GCC isn't exactly IEEE 754 compliant either...
  A compiler intrinsic, which generates no code (simply inserting a barrier
 for
 the optimiser) sounds like the correct approach.

 Coming up for a name for this operation is difficult.
float toFloatPrecision(real arg) ?
Meh. That's wordy and looks like a rounding operation. I'm interested in the operation float -> float and double -> double (and perhaps real->real), where no conversion is happening, and on most architectures it will be a no-op. It should be a name that indicates that it's not generating any code, you're just forbidding the compiler from doing funky weird stuff. And for generic code, the name should be the same for float, double, and real. Perhaps an attribute rather than a function call. double x; double y = x.strictfloat; double y = x.strictprecision; ie, (expr).strictfloat would return expr, discarding any extra precision. That's the best I've come up with so far.
double y = cast(float) x; ? :o) -- Iain Buclaw *(p < e ? p++ : p) = (c & 0x0f) + '0';
Nov 06 2013
parent Walter Bright <newshound2 digitalmars.com> writes:
On 11/6/2013 7:07 AM, Iain Buclaw wrote:
 double y = cast(float) x;  ?  :o)
I don't like overlaying a new meaning onto the cast operation. For example, if one was using it for type coercion, that is different from wanting precision reduction. There'd be no way to separate the two effects if one desires only one. An intrinsic function solves the issue neatly and cleanly.
Nov 07 2013
prev sibling next sibling parent reply Jerry <jlquinn optonline.net> writes:
"Don" <x nospam.com> writes:

 On Wednesday, 6 November 2013 at 06:28:59 UTC, Walter Bright wrote:
 Perhaps an attribute rather than a function call.

 double x;
 double y = x.strictfloat;
 double y = x.strictprecision;

 ie, (expr).strictfloat  would return expr, discarding any extra precision.
 That's the best I've come up with so far.
What about something like the following? double x; double y; with (strictprecision) { y = x; } The idea being that you can create a scope within which operations are executed with no extra precision. Jerry
Nov 07 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 11/7/2013 8:55 AM, Jerry wrote:
 What about something like the following?

 double x;
 double y;
 with (strictprecision) {
    y = x;
 }
That has immediate problems with things like function calls that might or might not be inlined.
Nov 07 2013
parent reply "John Colvin" <john.loughran.colvin gmail.com> writes:
On Thursday, 7 November 2013 at 20:02:05 UTC, Walter Bright wrote:
 On 11/7/2013 8:55 AM, Jerry wrote:
 What about something like the following?

 double x;
 double y;
 with (strictprecision) {
   y = x;
 }
That has immediate problems with things like function calls that might or might not be inlined.
it could apply only to operations on fundamental types within the region and guarantee nothing for any called code. It could even guarantee to not apply to any called code even if inlined. I think in practice this wouldn't be particularly inconvenient.
Nov 07 2013
parent Walter Bright <newshound2 digitalmars.com> writes:
On 11/7/2013 12:09 PM, John Colvin wrote:
 On Thursday, 7 November 2013 at 20:02:05 UTC, Walter Bright wrote:
 On 11/7/2013 8:55 AM, Jerry wrote:
 What about something like the following?

 double x;
 double y;
 with (strictprecision) {
   y = x;
 }
That has immediate problems with things like function calls that might or might not be inlined.
it could apply only to operations on fundamental types within the region and guarantee nothing for any called code. It could even guarantee to not apply to any called code even if inlined. I think in practice this wouldn't be particularly inconvenient.
I think it would be very inconvenient, as it will make problems for use of generic code. Also, it is too blunt - it'll cover a whole set of code, rather than just the one spot where it would matter.
Nov 07 2013
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 11/6/2013 1:09 AM, Don wrote:
 But we can test this. I predict that adding some extra bits to the internal
 calculations in CTFE (to make it have eg 128 bit intermediate values instead of
 80), will cause Phobos math unit tests to break.
Bring it on! Challenge accepted!
Nov 07 2013
prev sibling next sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 23/10/13 18:16, Walter Bright wrote:
 To precisely control maximum precision, I suggest using inline assembler to use
 the exact sequence of instructions needed for double-double operations.
Could be relevant here: last year I wrote some code which divided up the closed interval [0, 1] into small increments. If the division was by powers of 10, dmd would wind up mis-calculating; with powers of 2, it was fine. By contrast GDC and LDC were fine, which makes me think their backends must implement effective decimal support for floating-point within certain limited bounds.
Oct 30 2013
prev sibling next sibling parent reply Martin Nowak <code dawg.eu> writes:
On 10/23/2013 06:16 PM, Walter Bright wrote:
 A D compiler is allowed to compute floating point results at arbitrarily
 large precision - the storage size (float, double, real) only specify
 the minimum precision.
It seems like there is some change in C99 to address excess precision. Namely assignments and casts are NOT allowed to have greater precision. Not sure if assignments refers to first storing and then loading a value. So maybe it's useful to review that rule. http://stackoverflow.com/questions/503436/how-to-deal-with-excess-precision-in-floating-point-computations/503523#503523 http://gcc.gnu.org/ml/gcc-patches/2008-11/msg00105.html
Oct 30 2013
parent Martin Nowak <code dawg.eu> writes:
On 10/30/2013 07:29 PM, Martin Nowak wrote:
 It seems like there is some change in C99 to address excess precision.
 Namely assignments and casts are NOT allowed to have greater precision.
 Not sure if assignments refers to first storing and then loading a
 value. So maybe it's useful to review that rule.
Issue 7455 - Allow a cast to discard precision from a floating point during constant folding http://d.puremagic.com/issues/show_bug.cgi?id=7455
Oct 30 2013
prev sibling parent reply Iain Buclaw <ibuclaw ubuntu.com> writes:
On Oct 23, 2013 5:21 PM, "Walter Bright" <newshound2 digitalmars.com> wrote:
 On 10/23/2013 8:44 AM, Apollo Hogan wrote:
 That is: without optimization the run-time "normalization" is correct.
With
 optimization it is broken.  That is pretty easy to work around by simply
 compiling the relevant library without optimization.  (Though it would
be nice
 to have, for example, pragmas to mark some functions as "delicate" or
 "non-optimizable".)  A bigger issue is that the compile-time
normalization call
 gives the 'wrong' answer consistently with or without optimization.  One
would
 expect that evaluating a pure function at run-time or compile-time would
give
 the same result...
A D compiler is allowed to compute floating point results at arbitrarily
large precision - the storage size (float, double, real) only specify the minimum precision.
 This behavior is fairly deeply embedded into the front end, optimizer,
and various back ends.
 To precisely control maximum precision, I suggest using inline assembler
to use the exact sequence of instructions needed for double-double operations. Why do I feel like you recommend writing code in assembler every other post you make. :o) Regards -- Iain Buclaw *(p < e ? p++ : p) = (c & 0x0f) + '0';
Oct 30 2013
parent Walter Bright <newshound2 digitalmars.com> writes:
On 10/30/2013 3:36 PM, Iain Buclaw wrote:
 Why do I feel like you recommend writing code in assembler every other post you
 make. :o)
You're exaggerating. I recommend assembler in only 1 out of 4 posts.
Nov 05 2013
prev sibling next sibling parent reply "qznc" <qznc web.de> writes:
On Wednesday, 23 October 2013 at 15:44:54 UTC, Apollo Hogan wrote:
 For example, the appended code produces the following output 
 when compiled (DMD32 D Compiler v2.063.2 under WinXP/cygwin) 
 with no optimization:

 immutable(pair)(1.1, -2.03288e-20)
 pair(1, 0.1)
 pair(1.1, -8.32667e-17)

 and the following results when compiled with optimization (-O):

 immutable(pair)(1.1, -2.03288e-20)
 pair(1, 0.1)
 pair(1.1, 0)

 The desired result would be:

 immutable(pair)(1.1, -8.32667e-17)
 pair(1, 0.1)
 pair(1.1, -8.32667e-17)

 Cheers,
 --Apollo

 import std.stdio;
 struct pair { double hi, lo; }
 pair normalize(pair q)
 {
   double h = q.hi + q.lo;
   double l = q.lo + (q.hi - h);
   return pair(h,l);
 }
 void main()
 {
   immutable static pair spn = normalize(pair(1.0,0.1));
   writeln(spn);
   writeln(pair(1.0,0.1));
   writeln(normalize(pair(1.0,0.1)));
 }
I can replicate it here. Here is an objdump diff of normalize: Optimized: | Unoptimized: 08076bdc <_D6fptest9normalizeFS6fptest4pairZS6fptest4pair>: 08076bdc <_D6fptest9normalizeFS6fptest4pairZS6fptest4pair>: 8076bdc: 55 push %ebp 8076bdc: 55 push %ebp 8076bdd: 8b ec mov %esp,%ebp 8076bdd: 8b ec mov %esp,%ebp 8076bdf: 83 ec 10 sub $0x10,%esp | 8076bdf: 83 ec 14 sub $0x14,%esp 8076be2: dd 45 08 fldl 0x8(%ebp) 8076be2: dd 45 08 fldl 0x8(%ebp) 8076be5: d9 c0 fld %st(0) | 8076be5: dc 45 10 faddl 0x10(%ebp) 8076be7: 89 c1 mov %eax,%ecx | 8076be8: dd 5d ec fstpl -0x14(%ebp) 8076be9: dc 45 10 faddl 0x10(%ebp) | 8076beb: dd 45 08 fldl 0x8(%ebp) 8076bec: dd 55 f0 fstl -0x10(%ebp) | 8076bee: dc 65 ec fsubl -0x14(%ebp) 8076bef: de e9 fsubrp %st,%st(1) | 8076bf1: dc 45 10 faddl 0x10(%ebp) 8076bf1: dd 45 f0 fldl -0x10(%ebp) | 8076bf4: dd 5d f4 fstpl -0xc(%ebp) 8076bf4: d9 c9 fxch %st(1) | 8076bf7: dd 45 ec fldl -0x14(%ebp) 8076bf6: dc 45 10 faddl 0x10(%ebp) | 8076bfa: dd 18 fstpl (%eax) 8076bf9: dd 5d f8 fstpl -0x8(%ebp) | 8076bfc: dd 45 f4 fldl -0xc(%ebp) 8076bfc: dd 45 f8 fldl -0x8(%ebp) | 8076bff: dd 58 08 fstpl 0x8(%eax) 8076bff: d9 c9 fxch %st(1) | 8076c02: c9 leave 8076c01: dd 19 fstpl (%ecx) | 8076c03: c2 10 00 ret $0x10 8076c03: dd 59 08 fstpl 0x8(%ecx) | 8076c06: 90 nop 8076c06: 8b e5 mov %ebp,%esp | 8076c07: 90 nop 8076c08: 5d pop %ebp | 8076c08: 90 nop 8076c09: c2 10 00 ret $0x10 | 8076c09: 90 nop > 8076c0a: 90 nop > 8076c0b: 90 nop I cannot see any significant difference. The fadd-fsub-fadd sequence seems to be the same in both cases.
Oct 23 2013
parent reply "qznc" <qznc web.de> writes:
On Thursday, 24 October 2013 at 06:12:03 UTC, qznc wrote:
 import std.stdio;
 struct pair { double hi, lo; }
 pair normalize(pair q)
 {
  double h = q.hi + q.lo;
  double l = q.lo + (q.hi - h);
  return pair(h,l);
 }
 void main()
 {
  immutable static pair spn = normalize(pair(1.0,0.1));
  writeln(spn);
  writeln(pair(1.0,0.1));
  writeln(normalize(pair(1.0,0.1)));
 }
I cannot see any significant difference. The fadd-fsub-fadd sequence seems to be the same in both cases.
More observations. It requires -m32 to reproduce. On 64bit is gives the desired output even in optimized form. I wrote a small C program for comparison: ------ #include <stdio.h> typedef struct pair { double hi, lo; } pair; pair normalize(pair q) { double h = q.hi + q.lo; double l = q.lo + (q.hi - h); //double l = q.lo + (q.hi - (q.hi + q.lo)); pair res = {h,l}; return res; } int main() { pair x = {1.0, 0.1}; pair y = normalize(x); printf("%g %g\n", y.hi, y.lo); return 0; } ------ gcc -m32 normtest_replicate.c; ./a.out Output: 1.1 -8.32667e-17 Note the commented line, if you use that instead of the line above, then Output: 1.1 0 The C semantic says that h must be rounded to double. In the second case l is computed with full hardware precision instead. For a shorter example, two versions: double x = a - b; double y = c + (a - b); double y = c + x; In C those have different semantics in terms of the precision, but in D they are the equivalent.
Oct 24 2013
parent "qznc" <qznc web.de> writes:
On Thursday, 24 October 2013 at 07:27:09 UTC, qznc wrote:
 On Thursday, 24 October 2013 at 06:12:03 UTC, qznc wrote:
 import std.stdio;
 struct pair { double hi, lo; }
 pair normalize(pair q)
 {
 double h = q.hi + q.lo;
 double l = q.lo + (q.hi - h);
 return pair(h,l);
 }
 void main()
 {
 immutable static pair spn = normalize(pair(1.0,0.1));
 writeln(spn);
 writeln(pair(1.0,0.1));
 writeln(normalize(pair(1.0,0.1)));
 }
I cannot see any significant difference. The fadd-fsub-fadd sequence seems to be the same in both cases.
More observations. It requires -m32 to reproduce. On 64bit is gives the desired output even in optimized form. I wrote a small C program for comparison: ------ #include <stdio.h> typedef struct pair { double hi, lo; } pair; pair normalize(pair q) { double h = q.hi + q.lo; double l = q.lo + (q.hi - h); //double l = q.lo + (q.hi - (q.hi + q.lo)); pair res = {h,l}; return res; } int main() { pair x = {1.0, 0.1}; pair y = normalize(x); printf("%g %g\n", y.hi, y.lo); return 0; } ------ gcc -m32 normtest_replicate.c; ./a.out Output: 1.1 -8.32667e-17 Note the commented line, if you use that instead of the line above, then Output: 1.1 0 The C semantic says that h must be rounded to double. In the second case l is computed with full hardware precision instead. For a shorter example, two versions: double x = a - b; double y = c + (a - b); double y = c + x; In C those have different semantics in terms of the precision, but in D they are the equivalent.
Ah, finally found the relevant part of the spec: "It's possible that, due to greater use of temporaries and common subexpressions, optimized code may produce a more accurate answer than unoptimized code." http://dlang.org/float.html
Oct 24 2013
prev sibling parent Martin Nowak <code dawg.eu> writes:
It's called excess precision and regularly causes confusion.
http://d.puremagic.com/issues/show_bug.cgi?id=6531
Oct 30 2013