digitalmars.D - Integer semantic in D, what are the tradeoff ? Do we have numbers ?
- "deadalnix" <deadalnix gmail.com> Dec 16 2012
- "Tim =?UTF-8?B?xIxhcyI=?= <darkuranium gmail.com> Dec 16 2012
- "deadalnix" <deadalnix gmail.com> Dec 16 2012
- Timon Gehr <timon.gehr gmx.ch> Dec 16 2012
- Timon Gehr <timon.gehr gmx.ch> Dec 16 2012
- "deadalnix" <deadalnix gmail.com> Dec 16 2012
- Walter Bright <newshound2 digitalmars.com> Dec 17 2012
Following the thread on integer semantic, i wanted to know if
some data are available on the tradeoff we are doing. Let me
first explain what they are.
In D, integer are guaranteed to loop, ie uint.max + 1 == 0 . In
C, the compiler is allowed to consider that operation do not
overflow, which allow some optimization. See 2 examples bellow :
if(a < a + 1) { ... }
On that case, a compiler can decide that the condition is alway
true in C. This may seems like a stupid piece of code, but in
fact this is something that the compiler can do on a regular
basis. You have to keep in mind that the example usually not
appears in a way that is as obvious as this, but after several
code transformation (function inlining, constant propagation, etc
. . .).
Another example is (x + 1) * 2; The compiler may decide to
rewrite it as 2 * x + 2 as this is done in one operation on many
CPU, when (x + 1) * 2 is not always possible. Both are
equivalent, expect if the integer overflow.
As seen, ensuring that integer loop properly is costly (if
someone have numbers, I'd be happy to know how much costly it is).
On the other hand, we have predictable results on integer
computation when they overflow. I wonder how much of a gain it
is. In my experience, most piece of code are doomed anyway when
integer overflow occurs.
An other solution is to use checked integers. But this is even
more costly. This have been discussed quite a lot in the previous
thread, so i want to concentrate to the first issue.
How much performances are sacrificed compared to a looser integer
semantic (I frankly don't know), and how much programs benefit
from it (I suspect very little, but I may be wrong) ?
Dec 16 2012
On Sunday, 16 December 2012 at 22:52:44 UTC, deadalnix wrote:Following the thread on integer semantic, i wanted to know if some data are available on the tradeoff we are doing. Let me first explain what they are. In D, integer are guaranteed to loop, ie uint.max + 1 == 0 . In C, the compiler is allowed to consider that operation do not overflow, which allow some optimization. See 2 examples bellow : if(a < a + 1) { ... } On that case, a compiler can decide that the condition is alway true in C. This may seems like a stupid piece of code, but in fact this is something that the compiler can do on a regular basis. You have to keep in mind that the example usually not appears in a way that is as obvious as this, but after several code transformation (function inlining, constant propagation, etc . . .). Another example is (x + 1) * 2; The compiler may decide to rewrite it as 2 * x + 2 as this is done in one operation on many CPU, when (x + 1) * 2 is not always possible. Both are equivalent, expect if the integer overflow. As seen, ensuring that integer loop properly is costly (if someone have numbers, I'd be happy to know how much costly it is). On the other hand, we have predictable results on integer computation when they overflow. I wonder how much of a gain it is. In my experience, most piece of code are doomed anyway when integer overflow occurs. An other solution is to use checked integers. But this is even more costly. This have been discussed quite a lot in the previous thread, so i want to concentrate to the first issue. How much performances are sacrificed compared to a looser integer semantic (I frankly don't know), and how much programs benefit from it (I suspect very little, but I may be wrong) ?
FYI, in C, unsigned integers are guaranteed to be modulo 2^N (it is not called "overflow" in such a case). In other words, `UINT_MAX + 1U == 0`. Only *signed* integers can overflow or underflow; I do not know the D's semantics in this case.
Dec 16 2012
On Sunday, 16 December 2012 at 22:56:08 UTC, Tim Čas wrote:FYI, in C, unsigned integers are guaranteed to be modulo 2^N (it is not called "overflow" in such a case). In other words, `UINT_MAX + 1U == 0`. Only *signed* integers can overflow or underflow; I do not know the D's semantics in this case.
You seems to be right. The still still hold for signed integer.
Dec 16 2012
On 12/16/2012 11:52 PM, deadalnix wrote:Following the thread on integer semantic, i wanted to know if some data are available on the tradeoff we are doing. Let me first explain what they are. In D, integer are guaranteed to loop, ie uint.max + 1 == 0 . In C, the compiler is allowed to consider that operation do not overflow, which allow some optimization. See 2 examples bellow : if(a < a + 1) { ... } On that case, a compiler can decide that the condition is alway true in C. This may seems like a stupid piece of code, but in fact this is something that the compiler can do on a regular basis. You have to keep in mind that the example usually not appears in a way that is as obvious as this, but after several code transformation (function inlining, constant propagation, etc . . .). Another example is (x + 1) * 2; The compiler may decide to rewrite it as 2 * x + 2 as this is done in one operation on many CPU, when (x + 1) * 2 is not always possible. Both are equivalent, expect if the integer overflow.
Even if they do. I claim that 2 * x + 2 = (x + 1) * 2 holds for every int x. The only difference is that one of the two might set the CPU flags whereas the other wont. But flags are not exposed to D code anyway.As seen, ensuring that integer loop properly is costly (if someone have numbers, I'd be happy to know how much costly it is). On the other hand, we have predictable results on integer computation when they overflow. I wonder how much of a gain it is. In my experience, most piece of code are doomed anyway when integer overflow occurs.
It is not a very strong case, but the code may accidentally still work. Also, it is much easier to hunt down a bug when it does not disappear when unrelated code changes and is not dependent on optimization or debug flags.An other solution is to use checked integers. But this is even more costly. This have been discussed quite a lot in the previous thread, so i want to concentrate to the first issue. How much performances are sacrificed compared to a looser integer semantic (I frankly don't know), and how much programs benefit from it (I suspect very little, but I may be wrong) ?
I have no hard data, but I think that at the point where such things start to matter, I perform the optimizations myself anyway.
Dec 16 2012
On 12/17/2012 01:39 AM, deadalnix wrote:On Monday, 17 December 2012 at 00:27:42 UTC, Timon Gehr wrote:On 12/16/2012 11:52 PM, deadalnix wrote:Following the thread on integer semantic, i wanted to know if some data are available on the tradeoff we are doing. Let me first explain what they are. In D, integer are guaranteed to loop, ie uint.max + 1 == 0 . In C, the compiler is allowed to consider that operation do not overflow, which allow some optimization. See 2 examples bellow : if(a < a + 1) { ... } On that case, a compiler can decide that the condition is alway true in C. This may seems like a stupid piece of code, but in fact this is something that the compiler can do on a regular basis. You have to keep in mind that the example usually not appears in a way that is as obvious as this, but after several code transformation (function inlining, constant propagation, etc . . .). Another example is (x + 1) * 2; The compiler may decide to rewrite it as 2 * x + 2 as this is done in one operation on many CPU, when (x + 1) * 2 is not always possible. Both are equivalent, expect if the integer overflow.
Even if they do. I claim that 2 * x + 2 = (x + 1) * 2 holds for every int x. The only difference is that one of the two might set the CPU flags whereas the other wont. But flags are not exposed to D code anyway.
Use x * 4 / 2 => x * 2
Yup, that works.It is not a very strong case, but the code may accidentally still work. Also, it is much easier to hunt down a bug when it does not disappear when unrelated code changes and is not dependent on optimization or debug flags.
Maybe the compiler should generate a trap in debug mode and go full speed in release mode ? Code that accidentally work is not really a desirable result (even if i understand that this is arguably better).
If overflow semantics are changed to C's, then checking int overflow should be in a separate flag from release.
Dec 16 2012
On Monday, 17 December 2012 at 00:27:42 UTC, Timon Gehr wrote:On 12/16/2012 11:52 PM, deadalnix wrote:Following the thread on integer semantic, i wanted to know if some data are available on the tradeoff we are doing. Let me first explain what they are. In D, integer are guaranteed to loop, ie uint.max + 1 == 0 . In C, the compiler is allowed to consider that operation do not overflow, which allow some optimization. See 2 examples bellow : if(a < a + 1) { ... } On that case, a compiler can decide that the condition is alway true in C. This may seems like a stupid piece of code, but in fact this is something that the compiler can do on a regular basis. You have to keep in mind that the example usually not appears in a way that is as obvious as this, but after several code transformation (function inlining, constant propagation, etc . . .). Another example is (x + 1) * 2; The compiler may decide to rewrite it as 2 * x + 2 as this is done in one operation on many CPU, when (x + 1) * 2 is not always possible. Both are equivalent, expect if the integer overflow.
Even if they do. I claim that 2 * x + 2 = (x + 1) * 2 holds for every int x. The only difference is that one of the two might set the CPU flags whereas the other wont. But flags are not exposed to D code anyway.
Use x * 4 / 2 => x * 2It is not a very strong case, but the code may accidentally still work. Also, it is much easier to hunt down a bug when it does not disappear when unrelated code changes and is not dependent on optimization or debug flags.
Maybe the compiler should generate a trap in debug mode and go full speed in release mode ? Code that accidentally work is not really a desirable result (even if i understand that this is arguably better).
Dec 16 2012
On 12/16/2012 2:52 PM, deadalnix wrote:How much performances are sacrificed compared to a looser integer semantic (I frankly don't know), and how much programs benefit from it (I suspect very little, but I may be wrong) ?
Recall that C is standardized to work on 1's complement machines, which affects the details of the "undefined behavior" rules. D is standardized to work only on 2's complement machines, so less is undefined.
Dec 17 2012









"Tim =?UTF-8?B?xIxhcyI=?= <darkuranium gmail.com> 