www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Integer semantic in D, what are the tradeoff ? Do we have numbers ?

reply "deadalnix" <deadalnix gmail.com> writes:
Following the thread on integer semantic, i wanted to know if 
some data are available on the tradeoff we are doing. Let me 
first explain what they are.

In D, integer are guaranteed to loop, ie uint.max + 1 == 0 . In 
C, the compiler is allowed to consider that operation do not 
overflow, which allow some optimization. See 2 examples bellow :

if(a < a + 1) { ... }

On that case, a compiler can decide that the condition is alway 
true in C. This may seems like a stupid piece of code, but in 
fact this is something that the compiler can do on a regular 
basis. You have to keep in mind that the example usually not 
appears in a way that is as obvious as this, but after several 
code transformation (function inlining, constant propagation, etc 
. . .).

Another example is (x + 1) * 2; The compiler may decide to 
rewrite it as 2 * x + 2 as this is done in one operation on many 
CPU, when (x + 1) * 2 is not always possible. Both are 
equivalent, expect if the integer overflow.

As seen, ensuring that integer loop properly is costly (if 
someone have numbers, I'd be happy to know how much costly it is).

On the other hand, we have predictable results on integer 
computation when they overflow. I wonder how much of a gain it 
is. In my experience, most piece of code are doomed anyway when 
integer overflow occurs.

An other solution is to use checked integers. But this is even 
more costly. This have been discussed quite a lot in the previous 
thread, so i want to concentrate to the first issue.

How much performances are sacrificed compared to a looser integer 
semantic (I frankly don't know), and how much programs benefit 
from it (I suspect very little, but I may be wrong) ?
Dec 16 2012
next sibling parent "Tim =?UTF-8?B?xIxhcyI=?= <darkuranium gmail.com> writes:
On Sunday, 16 December 2012 at 22:52:44 UTC, deadalnix wrote:
 Following the thread on integer semantic, i wanted to know if 
 some data are available on the tradeoff we are doing. Let me 
 first explain what they are.

 In D, integer are guaranteed to loop, ie uint.max + 1 == 0 . In 
 C, the compiler is allowed to consider that operation do not 
 overflow, which allow some optimization. See 2 examples bellow :

 if(a < a + 1) { ... }

 On that case, a compiler can decide that the condition is alway 
 true in C. This may seems like a stupid piece of code, but in 
 fact this is something that the compiler can do on a regular 
 basis. You have to keep in mind that the example usually not 
 appears in a way that is as obvious as this, but after several 
 code transformation (function inlining, constant propagation, 
 etc . . .).

 Another example is (x + 1) * 2; The compiler may decide to 
 rewrite it as 2 * x + 2 as this is done in one operation on 
 many CPU, when (x + 1) * 2 is not always possible. Both are 
 equivalent, expect if the integer overflow.

 As seen, ensuring that integer loop properly is costly (if 
 someone have numbers, I'd be happy to know how much costly it 
 is).

 On the other hand, we have predictable results on integer 
 computation when they overflow. I wonder how much of a gain it 
 is. In my experience, most piece of code are doomed anyway when 
 integer overflow occurs.

 An other solution is to use checked integers. But this is even 
 more costly. This have been discussed quite a lot in the 
 previous thread, so i want to concentrate to the first issue.

 How much performances are sacrificed compared to a looser 
 integer semantic (I frankly don't know), and how much programs 
 benefit from it (I suspect very little, but I may be wrong) ?

FYI, in C, unsigned integers are guaranteed to be modulo 2^N (it is not called "overflow" in such a case). In other words, `UINT_MAX + 1U == 0`. Only *signed* integers can overflow or underflow; I do not know the D's semantics in this case.
Dec 16 2012
prev sibling next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Sunday, 16 December 2012 at 22:56:08 UTC, Tim Čas wrote:
 FYI, in C, unsigned integers are guaranteed to be modulo 2^N 
 (it is not called "overflow" in such a case). In other words, 
 `UINT_MAX + 1U == 0`.

 Only *signed* integers can overflow or underflow; I do not know 
 the D's semantics in this case.

You seems to be right. The still still hold for signed integer.
Dec 16 2012
prev sibling next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 12/16/2012 11:52 PM, deadalnix wrote:
 Following the thread on integer semantic, i wanted to know if some data
 are available on the tradeoff we are doing. Let me first explain what
 they are.

 In D, integer are guaranteed to loop, ie uint.max + 1 == 0 . In C, the
 compiler is allowed to consider that operation do not overflow, which
 allow some optimization. See 2 examples bellow :

 if(a < a + 1) { ... }

 On that case, a compiler can decide that the condition is alway true in
 C. This may seems like a stupid piece of code, but in fact this is
 something that the compiler can do on a regular basis. You have to keep
 in mind that the example usually not appears in a way that is as obvious
 as this, but after several code transformation (function inlining,
 constant propagation, etc . . .).

 Another example is (x + 1) * 2; The compiler may decide to rewrite it as
 2 * x + 2 as this is done in one operation on many CPU, when (x + 1) * 2
 is not always possible. Both are equivalent, expect if the integer
 overflow.

Even if they do. I claim that 2 * x + 2 = (x + 1) * 2 holds for every int x. The only difference is that one of the two might set the CPU flags whereas the other wont. But flags are not exposed to D code anyway.
 As seen, ensuring that integer loop properly is costly (if someone have
 numbers, I'd be happy to know how much costly it is).

 On the other hand, we have predictable results on integer computation
 when they overflow. I wonder how much of a gain it is. In my experience,
 most piece of code are doomed anyway when integer overflow occurs.

It is not a very strong case, but the code may accidentally still work. Also, it is much easier to hunt down a bug when it does not disappear when unrelated code changes and is not dependent on optimization or debug flags.
 An other solution is to use checked integers. But this is even more
 costly. This have been discussed quite a lot in the previous thread, so
 i want to concentrate to the first issue.

 How much performances are sacrificed compared to a looser integer
 semantic (I frankly don't know), and how much programs benefit from it
 (I suspect very little, but I may be wrong) ?

I have no hard data, but I think that at the point where such things start to matter, I perform the optimizations myself anyway.
Dec 16 2012
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 12/17/2012 01:39 AM, deadalnix wrote:
 On Monday, 17 December 2012 at 00:27:42 UTC, Timon Gehr wrote:
 On 12/16/2012 11:52 PM, deadalnix wrote:
 Following the thread on integer semantic, i wanted to know if some data
 are available on the tradeoff we are doing. Let me first explain what
 they are.

 In D, integer are guaranteed to loop, ie uint.max + 1 == 0 . In C, the
 compiler is allowed to consider that operation do not overflow, which
 allow some optimization. See 2 examples bellow :

 if(a < a + 1) { ... }

 On that case, a compiler can decide that the condition is alway true in
 C. This may seems like a stupid piece of code, but in fact this is
 something that the compiler can do on a regular basis. You have to keep
 in mind that the example usually not appears in a way that is as obvious
 as this, but after several code transformation (function inlining,
 constant propagation, etc . . .).

 Another example is (x + 1) * 2; The compiler may decide to rewrite it as
 2 * x + 2 as this is done in one operation on many CPU, when (x + 1) * 2
 is not always possible. Both are equivalent, expect if the integer
 overflow.

Even if they do. I claim that 2 * x + 2 = (x + 1) * 2 holds for every int x. The only difference is that one of the two might set the CPU flags whereas the other wont. But flags are not exposed to D code anyway.

Use x * 4 / 2 => x * 2

Yup, that works.
 It is not a very strong case, but the code may accidentally still
 work. Also, it is much easier to hunt down a bug when it does not
 disappear when unrelated code changes and is not dependent on
 optimization or debug flags.

Maybe the compiler should generate a trap in debug mode and go full speed in release mode ? Code that accidentally work is not really a desirable result (even if i understand that this is arguably better).

If overflow semantics are changed to C's, then checking int overflow should be in a separate flag from release.
Dec 16 2012
prev sibling next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Monday, 17 December 2012 at 00:27:42 UTC, Timon Gehr wrote:
 On 12/16/2012 11:52 PM, deadalnix wrote:
 Following the thread on integer semantic, i wanted to know if 
 some data
 are available on the tradeoff we are doing. Let me first 
 explain what
 they are.

 In D, integer are guaranteed to loop, ie uint.max + 1 == 0 . 
 In C, the
 compiler is allowed to consider that operation do not 
 overflow, which
 allow some optimization. See 2 examples bellow :

 if(a < a + 1) { ... }

 On that case, a compiler can decide that the condition is 
 alway true in
 C. This may seems like a stupid piece of code, but in fact 
 this is
 something that the compiler can do on a regular basis. You 
 have to keep
 in mind that the example usually not appears in a way that is 
 as obvious
 as this, but after several code transformation (function 
 inlining,
 constant propagation, etc . . .).

 Another example is (x + 1) * 2; The compiler may decide to 
 rewrite it as
 2 * x + 2 as this is done in one operation on many CPU, when 
 (x + 1) * 2
 is not always possible. Both are equivalent, expect if the 
 integer
 overflow.

Even if they do. I claim that 2 * x + 2 = (x + 1) * 2 holds for every int x. The only difference is that one of the two might set the CPU flags whereas the other wont. But flags are not exposed to D code anyway.

Use x * 4 / 2 => x * 2
 It is not a very strong case, but the code may accidentally 
 still work. Also, it is much easier to hunt down a bug when it 
 does not disappear when unrelated code changes and is not 
 dependent on optimization or debug flags.

Maybe the compiler should generate a trap in debug mode and go full speed in release mode ? Code that accidentally work is not really a desirable result (even if i understand that this is arguably better).
Dec 16 2012
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 12/16/2012 2:52 PM, deadalnix wrote:
 How much performances are sacrificed compared to a looser integer semantic (I
 frankly don't know), and how much programs benefit from it (I suspect very
 little, but I may be wrong) ?

Recall that C is standardized to work on 1's complement machines, which affects the details of the "undefined behavior" rules. D is standardized to work only on 2's complement machines, so less is undefined.
Dec 17 2012