digitalmars.D - d-programming-language.org

eles <eles eles.com> writes:
1. please update http://www.d-programming-language.org/faq.html#q6
("Why fall through on switch statements?").

"The reason D doesn't change this is for the same reason that
integral promotion rules and operator precedence rules were kept the
same - to make code that looks the same as in C operate the same. If
it had subtly different semantics, it will cause frustratingly subtle
bugs." was a good reason in its time, but it no longer applies (since
such code is simply *not* accepted).

now, flames are on (again) - well, I have no alternative.

2. http://www.d-programming-language.org/faq.html#case_range (Why
doesn't the case range statement use the case X..Y: syntax?)

saying that "Case (1) has a VERY DIFFERENT meaning from (2) and (3).
(1) is inclusive of Y, and (2) and (3) are exclusive of Y. Having a
very different meaning means it should have a distinctly different
syntax." is a POST-rationalization (and, personally speaking, a real
shame for such a nice language as D) of a bad choice, namely the
right-exclusive limit in the x..y syntax. Just imagine that you could
allow the "case X..Y" syntax and avoid that explanation (and FAQ entry).

I have made several arguments for changing this syntax:
-logical consistency (not just ease of writing code or compilers)
-the maximum index of an array is always representable in the same
number of bits as any other index (while the length is not)
-multi-dimensional slicing (possibly a future feature) is far more
convenient when x..y is inclusive (just imagine remembering which
elements are left out along each of those many dimensions of the
data)
-Ruby has implemented the inclusive syntax too (so there was some
need for it, although the right-exclusive syntax was already
available)
-disjoint array slices would have disjoint index ranges (imagine
cases where consecutive slices overlap a bit, as in moving-average
applications)
-now, the fact that the "case X..Y" syntax would become allowable

I know it is implemented that way (and Vala and Python went with
it). What would be the cost of replacing those X..Y with X..(Y-1)?
Aren't the above reasons good enough to consider such a change?

Well, (almost) enough for now. I also maintain that unsigned types
should throw out-of-range exceptions (in debug mode, so that release
will run as fast as it gets) when decremented below zero, unless
specifically marked as *circular* (i.e. intended behavior) or
something like this. This will prevent some bugs. I see those quite
often in my students' homework.
Jul 03 2011
KennyTM~ <kennytm gmail.com> writes:
On Jul 4, 11 02:40, eles wrote:
 1. please update http://www.d-programming-language.org/faq.html#q6
 ("Why fall through on switch statements?").

 "The reason D doesn't change this is for the same reason that
 integral promotion rules and operator precedence rules were kept the
 same - to make code that looks the same as in C operate the same. If
 it had subtly different semantics, it will cause frustratingly subtle
 bugs." was a good reason at its time, it is no more (since code is
 simply *not* accepted).

The whole FAQ needs to be reviewed:

1. Most of the "D 2.0 FAQ" is missing (e.g. 'Where is my simple language?')
2. 'How do I write my own D compiler for CPU X?' - the alternative backend has not been updated for 9 years. As a guide, it's better to refer to some Wiki4D article or the LDC interface.
3. 'Where can I get an IDE for D?' - Just link to a Wiki4D article. It is missing newer IDEs like Visual D.
4. 'What about templates?' - There's no need to keep this D 0.x item...
5. 'Why fall through on switch statements?' - As eles's comment.
6. 'Why doesn't D use reference counting for garbage collection?' - Add std.typecons.RefCounted
7. 'Can't a sufficiently smart compiler figure out that a function is pure automatically?' - But now we have purity inference :)

 now, flames are on (again) - well, I have no alternative.

Jul 03 2011
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 7/3/11 1:58 PM, KennyTM~ wrote:
 On Jul 4, 11 02:40, eles wrote:
 1. please update http://www.d-programming-language.org/faq.html#q6
 ("Why fall through on switch statements?").

 "The reason D doesn't change this is for the same reason that
 integral promotion rules and operator precedence rules were kept the
 same - to make code that looks the same as in C operate the same. If
 it had subtly different semantics, it will cause frustratingly subtle
 bugs." was a good reason at its time, it is no more (since code is
 simply *not* accepted).

 The whole FAQ needs to be reviewed:

 1. Most of the "D 2.0 FAQ" is missing (e.g. 'Where is my simple language?')
 2. 'How do I write my own D compiler for CPU X?' - the alternative backend has not been updated for 9 years. As a guide, it's better to refer to some Wiki4D article or the LDC interface.
 3. 'Where can I get an IDE for D?' - Just link to a Wiki4D article. It is missing newer IDEs like Visual D.
 4. 'What about templates?' - There's no need to keep this D 0.x item...
 5. 'Why fall through on switch statements?' - As eles's comment.
 6. 'Why doesn't D use reference counting for garbage collection?' - Add std.typecons.RefCounted
 7. 'Can't a sufficiently smart compiler figure out that a function is pure automatically?' - But now we have purity inference :)

 now, flames are on (again) - well, I have no alternative.


Great points. I'll update the FAQ when I find the time, and please send your pull requests too!

Andrei
Jul 03 2011
KennyTM~ <kennytm gmail.com> writes:
On Jul 4, 11 04:16, Andrei Alexandrescu wrote:
Great points. I'll update the FAQ when I'll find the time, and please send your pull requests too! Andrei

Submitted http://d.puremagic.com/issues/show_bug.cgi?id=6243 to keep track of the list.
Jul 03 2011
Walter Bright <newshound2 digitalmars.com> writes:
On 7/3/2011 11:40 AM, eles wrote:
 I know that is implemented that way (and vala and python went with
 it). What would be the cost to replace those X..Y with X..(Y-1)?
 Aren't the above reasons worthy to consider such a change?

Regardless of the merits of your argument, such a change will silently break nearly every D program out there. It's way, way too late.
Jul 03 2011
eles <eles eles.com> writes:
 Regardless of the merits of your argument, such a change will silently break
 nearly every D program out there. It's way, way too late.

Forbid (or not) the former (..) and introduce a new one (...):

2..4 => 2, 3
2...4 => 2, 3, 4
Jul 03 2011
eles <eles eles.com> writes:
 2..4 => 2, 3
 2...4 => 2, 3, 4

Alternative syntaxes could use (:). One could also have stepped ranges (inclusive or exclusive), let's say:

1:2:5 => 1, 3, 5

although this is not so practical for the compiler (I think that slices are kept as a pointer/length pair, so stepped slices would require a third variable and further complications).

Anyway, I won't come back to this issue again (and I hope to be able to keep my word). I saw it as an improvement. If the cost of the disruption is greater than the benefit of the improvement, well. Anyway.
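
For readers who want to compare the notations discussed above with something that already runs, here is a minimal sketch using `std.range.iota`. The helper names `exclusive`, `inclusive` and `stepped` are assumptions made for illustration; they are not part of Phobos and were not proposed in the thread.

```d
import std.array : array;
import std.range : iota;
import std.stdio : writeln;

// Right-exclusive range, matching D's existing x .. y convention.
int[] exclusive(int lo, int hi)
{
    return iota(lo, hi).array;
}

// Inclusive range, emulated by extending the upper bound by one.
// (Sketch only: hi + 1 overflows when hi == int.max.)
int[] inclusive(int lo, int hi)
{
    return iota(lo, hi + 1).array;
}

// Stepped inclusive range in the spirit of the "1:2:5" notation.
int[] stepped(int lo, int step, int hi)
{
    return iota(lo, hi + 1, step).array;
}

void main()
{
    writeln(exclusive(2, 4));   // [2, 3]
    writeln(inclusive(2, 4));   // [2, 3, 4]
    writeln(stepped(1, 2, 5));  // [1, 3, 5]
}
```

Note that `iota` produces a lazy range of values rather than a slice of existing memory, so it sidesteps the pointer/length concern raised in the post.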
Jul 03 2011
David Nadlinger <see klickverbot.at> writes:
On 7/3/11 8:40 PM, eles wrote:
 -logic (not just code-writing or compile-writing easiness) consistency
 -the fact that multi-dimensional slicing (possibly, a future feature)
 is far more convenient when x..y is inclusive (just imagine
 remembering which those elements are left out on those many-
 dimensions of the data)
 -the fact that Ruby has implemented the inclusive syntax too (so
 there was some need to do that although the right-exclusive syntax
 was already available)
 -disjoint array slices would have disjoint index ranges (imagine
 cases where consecutive slices overlap a bit, like in moving average
 applications)

As discussed before, many people here, myself included, think that these arguments are purely subjective, and you don't make them any better by stating them over and over again.

Like I suggested in the previous thread, would you maybe consider just going ahead and writing some D code? While I imagine that you probably won't join the »open-right camp«, you might discover that the slicing syntax is not as big an issue as you are trying to make it appear.

Also, this is not a question of »now, flames are on (again)«: whether to use open-right or closed slicing is a design decision where the arguments for both alternatives are roughly equivalent. D has gone with the former, and I don't quite see why you just can't accept this as a fact.

Besides, as Walter pointed out, there is no way the semantics of slicing could be changed at this point. The only thing that could be considered would be using something like a[0...1] for closed indexing.

David
Jul 03 2011
Daniel Gibson <metalcaedes gmail.com> writes:
Am 03.07.2011 20:40, schrieb eles:
 
 2. http://www.d-programming-language.org/faq.html#case_range (Why
 doesn't the case range statement use the case X..Y: syntax?)
 
 saying that "Case (1) has a VERY DIFFERENT meaning from (2) and (3).
 (1) is inclusive of Y, and (2) and (3) are exclusive of Y. Having a
 very different meaning means it should have a distinctly different
 syntax." is a POST-rationalization (and, personally speaking, a real
 shame for such a nice language as D) of a bad choice that was
 exclusive-right limit in x..y syntax. Just imagine that you could
 allow "case X..Y" syntax and avoid that explanation (and faq).
 
 I made several arguments for changing this syntax:
 -logic (not just code-writing or compile-writing easiness) consistency
 -consistent representation on the same number of bits of the maximum
 index of an array (while the length is not representable)
 -the fact that multi-dimensional slicing (possibly, a future feature)
 is far more convenient when x..y is inclusive (just imagine
 remembering which those elements are left out on those many-
 dimensions of the data)
 -the fact that Ruby has implemented the inclusive syntax too (so
 there was some need to do that although the right-exclusive syntax
 was already available)
 -disjoint array slices would have disjoint index ranges (imagine
 cases where consecutive slices overlap a bit, like in moving average
 applications)
 -now, the fact that "case X..Y"  syntax will be allowable
 
 I know that is implemented that way (and vala and python went with
 it). What would be the cost to replace those X..Y with X..(Y-1)?
 Aren't the above reasons worthy to consider such a change?
 

Not this again. This (the right border being exclusive) was already discussed at length. With you, actually. So why do you bring this up again? Just accept that D handles ranges, slices, ... this way; it's not going to change anyway.

Cheers,
- Daniel
Jul 03 2011
Timon Gehr <timon.gehr gmx.ch> writes:
eles wrote:
 1. please update http://www.d-programming-language.org/faq.html#q6
 ("Why fall through on switch statements?").

 "The reason D doesn't change this is for the same reason that
 integral promotion rules and operator precedence rules were kept the
 same - to make code that looks the same as in C operate the same. If
 it had subtly different semantics, it will cause frustratingly subtle
 bugs." was a good reason at its time, it is no more (since code is
 simply *not* accepted).

 now, flames are on (again) - well, I have no alternative.

I have to disagree.
 2. http://www.d-programming-language.org/faq.html#case_range (Why
 doesn't the case range statement use the case X..Y: syntax?)

 saying that "Case (1) has a VERY DIFFERENT meaning from (2) and (3).
 (1) is inclusive of Y, and (2) and (3) are exclusive of Y. Having a
 very different meaning means it should have a distinctly different
 syntax." is a POST-rationalization (and, personally speaking, a real
 shame for such a nice language as D) of a bad choice that was
 exclusive-right limit in x..y syntax. Just imagine that you could
 allow "case X..Y" syntax and avoid that explanation (and faq).

Agreed, that answer does not make much sense. The reason why the current syntax case x: .. case y: is better is this:

case x..y: // case statement that matches a range.
case x: .. case y: // range of case statements.

The first one would be the wrong way round.

As to the included/excluded inconsistency: Different applications require different conventions. After all, x..y is just a pair that ought to be interpreted as a range. It is a bit unfortunate, but I am quite sure it is the right decision/a reasonable trade-off.
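
To make the two conventions concrete, the following sketch puts a case range (both endpoints included) next to a slice (right endpoint excluded). The function name `classify` and the sample values are assumptions chosen for illustration only.

```d
import std.stdio : writeln;

// Classify a small integer using case ranges; both ends are included.
string classify(int x)
{
    string result;
    switch (x)
    {
        case 1: .. case 3:      // matches 1, 2 and 3
            result = "low";
            break;
        case 4: .. case 6:      // matches 4, 5 and 6
            result = "high";
            break;
        default:
            result = "other";
            break;
    }
    return result;
}

void main()
{
    int[] a = [10, 11, 12, 13, 14];
    writeln(a[1 .. 3]);     // [11, 12]: index 3 is not part of the slice
    writeln(classify(3));   // low: 3 is part of the case range
}
```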
 I made several arguments for changing this syntax:
 -logic (not just code-writing or compile-writing easiness) consistency

a[0..$];
a[0..$-1];

Which one looks more consistent (assuming both should return a slice of the entire array a) and why?
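
As a small illustration of how the existing right-exclusive rule behaves, the sketch below splits an array at an index so that the two halves meet without overlap or gap. The helper names `head` and `tail` are assumptions for this example.

```d
import std.stdio : writeln;

// Everything before index k.
int[] head(int[] a, size_t k)
{
    return a[0 .. k];
}

// Everything from index k to the end.
int[] tail(int[] a, size_t k)
{
    return a[k .. $];
}

void main()
{
    int[] a = [1, 2, 3, 4, 5];
    writeln(head(a, 2));        // [1, 2]
    writeln(tail(a, 2));        // [3, 4, 5]
    writeln(a[0 .. $] == a);    // true: the whole array
    writeln(a[2 .. 2].length);  // 0: an empty slice needs no special case
}
```

The same index `k` ends one slice and starts the next, and the slice length is simply the difference of its bounds.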
 -consistent representation on the same number of bits of the maximum
 index of an array (while the length is not representable)

This is not a valid argument, please let it rest. An array whose length is not representable is itself not representable, because it does not fit into your machine's address space.
 -the fact that multi-dimensional slicing (possibly, a future feature)
 is far more convenient when x..y is inclusive (just imagine
 remembering which those elements are left out on those many-
 dimensions of the data)

How would that work? Array slices are references to the same data. It is quite safe to say D will *never* have multi-dimensional built-in array slicing.
 -the fact that Ruby has implemented the inclusive syntax too (so
 there was some need to do that although the right-exclusive syntax
 was already available)

a[i,j];   //start+length in ruby iirc
a[i..j];  //inclusive in ruby
a[i...j]; //exclusive in ruby

I assume Ruby offers exclusive slicing because having just inclusive slicing was not considered quite sufficient (or alternatively, given that they also implemented start index+length, they just wanted to prevent discussions like this one).

Ruby slices have value semantics and are therefore quite different from D slices.
 -disjoint array slices would have disjoint index ranges (imagine
 cases where consecutive slices overlap a bit, like in moving average
 applications)

That is a special case, usually you need disjoint slices. And furthermore:

// moving average of n consecutive elements
// as you stated it requires slices, I am providing a naive O(n^2) solution.
int n=...;
double[] a=...;
assert(a.length>=n);
auto avg = new double[](a.length-n+1);

// 1. right inclusive slicing
//foreach(i, ref x;avg) x = average(a[i..i+n-1]);

// 2. right exclusive slicing
foreach(i, ref x;avg) x = average(a[i..i+n]);

/* No comment */
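
The snippet above leaves `n`, `a` and `average` unspecified. A self-contained version might look like the sketch below, where the `average` helper, the `movingAverage` wrapper and the sample data are assumptions filled in for illustration.

```d
import std.stdio : writeln;

// Arithmetic mean of a non-empty slice (assumed helper).
double average(const(double)[] xs)
{
    double total = 0;
    foreach (x; xs)
        total += x;
    return total / xs.length;
}

// Naive moving average over windows of n consecutive elements.
double[] movingAverage(const(double)[] a, size_t n)
{
    assert(n > 0 && a.length >= n);
    auto avg = new double[](a.length - n + 1);
    foreach (i, ref x; avg)
        x = average(a[i .. i + n]);   // window of exactly n elements
    return avg;
}

void main()
{
    writeln(movingAverage([1.0, 2.0, 3.0, 4.0], 2)); // [1.5, 2.5, 3.5]
}
```

With the right-exclusive bound, the window length `n` appears directly in the slice expression, with no `-1` adjustment.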
 -now, the fact that "case X..Y"  syntax will be allowable

That is an anti-reason because case X..Y: is flawed as explained above.
 I know that is implemented that way (and vala and python went with
 it). What would be the cost to replace those X..Y with X..(Y-1)?
 Aren't the above reasons worthy to consider such a change?

The above reasons are biased in that they do not mention any of the *benefits* that the right-exclusive semantics bring. Also, I personally think they are not valid reasons. What would be the benefit of having to replace those X..Y with X..Y-1? It certainly bears a cost, including that if you fail to do it, your program silently breaks.
 Well, (almost) enough for now. I also maintain that unsigned types
 should throw out-of-range exceptions (in debug mode, so that release
 will run as fast as it gets) when decremented below zero, unless
 specifically marked as *circular* (i.e. intended behavior) or
 something like this. This will prevent some bugs. I see those quite
 often in my student's homeworks.

True, you seldom *want* an unsigned integer to underflow. I am not sure if it is worth the slowdown though.

In practice, I think unsigned types are good for having access to all comparison (and in C, shift) operators the hardware provides and for requiring less space than long/cent/BigInt in large arrays if your values are positive and lie in certain ranges. Not much more.

Is there a reason that those students use unsigned counter variables so often? Are they programming for a 16 bit architecture?

Cheers,
-Timon
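
The kind of bug under discussion can be reproduced in a few lines. This is a minimal sketch; the function names `decrement` and `sumReversed` are made up for the example.

```d
import std.stdio : writeln;

// Decrementing an unsigned zero wraps around instead of failing.
uint decrement(uint x)
{
    return x - 1;
}

// Sum the elements back to front without an unsigned loop counter.
int sumReversed(const(int)[] a)
{
    // A loop such as `for (size_t i = a.length - 1; i >= 0; --i)` would
    // never terminate, because `i >= 0` is always true for an unsigned i.
    int total = 0;
    foreach_reverse (x; a)
        total += x;
    return total;
}

void main()
{
    writeln(decrement(0));            // 4294967295, i.e. uint.max
    writeln(sumReversed([1, 2, 3]));  // 6
}
```

Using `foreach_reverse` avoids the counter entirely, which is one way to sidestep the problem without any change to the language.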
Jul 03 2011
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 7/3/11 1:40 PM, eles wrote:
 1. please update http://www.d-programming-language.org/faq.html#q6
 ("Why fall through on switch statements?").

 "The reason D doesn't change this is for the same reason that
 integral promotion rules and operator precedence rules were kept the
 same - to make code that looks the same as in C operate the same. If
 it had subtly different semantics, it will cause frustratingly subtle
 bugs." was a good reason at its time, it is no more (since code is
 simply *not* accepted).
 now, flames are on (again) - well, I have no alternative.

Recently we have tightened the semantics of switch to reject "case leak". Perhaps that should be reflected in the FAQ. Did you have other change in mind? Why would the topic be flame-inducing?
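
As a sketch of what the tightened rule means in practice: a case with statements may no longer silently run into the next case, and fall-through has to be requested with `goto case`. The function `score` and its values are assumptions for illustration; how a given compiler version reports implicit fall-through (deprecation, warning or error) may differ.

```d
import std.stdio : writeln;

// Accumulate a score, with one intentional fall-through.
int score(int x)
{
    int total = 0;
    switch (x)
    {
        case 1:
            total += 1;
            goto case;      // explicit fall-through into case 2
        case 2:
            total += 2;
            break;
        case 3:
        case 4:             // adjacent labels with no code in between are fine
            total += 10;
            break;
        default:
            break;
    }
    return total;
}

void main()
{
    writeln(score(1));  // 3
    writeln(score(2));  // 2
    writeln(score(4));  // 10
}
```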
 2. http://www.d-programming-language.org/faq.html#case_range (Why
 doesn't the case range statement use the case X..Y: syntax?)

 saying that "Case (1) has a VERY DIFFERENT meaning from (2) and (3).
 (1) is inclusive of Y, and (2) and (3) are exclusive of Y. Having a
 very different meaning means it should have a distinctly different
 syntax." is a POST-rationalization (and, personally speaking, a real
 shame for such a nice language as D) of a bad choice that was
 exclusive-right limit in x..y syntax. Just imagine that you could
 allow "case X..Y" syntax and avoid that explanation (and faq).

That's not a post-rationalization. It's exactly the reason for the choice.
 I made several arguments for changing this syntax:
 -logic (not just code-writing or compile-writing easiness) consistency
 -consistent representation on the same number of bits of the maximum
 index of an array (while the length is not representable)
 -the fact that multi-dimensional slicing (possibly, a future feature)
 is far more convenient when x..y is inclusive (just imagine
 remembering which those elements are left out on those many-
 dimensions of the data)
 -the fact that Ruby has implemented the inclusive syntax too (so
 there was some need to do that although the right-exclusive syntax
 was already available)
 -disjoint array slices would have disjoint index ranges (imagine
 cases where consecutive slices overlap a bit, like in moving average
 applications)
 -now, the fact that "case X..Y"  syntax will be allowable

 I know that is implemented that way (and vala and python went with
 it). What would be the cost to replace those X..Y with X..(Y-1)?
 Aren't the above reasons worthy to consider such a change?

No, and it's not worth rehashing again the merits and demerits of various arguments. Please don't bring this up again. Thank you.
 Well, (almost) enough for now. I also maintain that unsigned types
 should throw out-of-range exceptions (in debug mode, so that release
 will run as fast as it gets) when decremented below zero, unless
 specifically marked as *circular* (i.e. intended behavior) or
 something like this. This will prevent some bugs. I see those quite
 often in my student's homeworks.

Safe on top of flexible is the best design. If there is anything preventing you from defining a type with the behavior you mention, you may want to file a bug. Thanks, Andrei
Jul 03 2011
bearophile <bearophileHUGS lycos.com> writes:
Andrei:

 Safe on top of flexible is the best design. If there is anything 
 preventing you from defining a type with the behavior you mention, you 
 may want to file a bug.

I am not a believer. To be "good enough", compile-time/run-time integral overflow checks need to be built in, even more so than associative arrays. How do you implement the compile-time ones? Run-time integral overflows need to be tested everywhere you don't explicitly ask for them not to be, otherwise you will not catch most bugs.

Bye,
bearophile
Jul 03 2011
bearophile <bearophileHUGS lycos.com> writes:
What I meant:

Run-time: I have written an enhancement request for LLVM about optimizing much
better the simple operations needed to spot and use the overflows. The LLVM dev
team implemented it in 2.8 or 2.9. Such optimizations are not optional: if
you want people to use overflow tests, they need to be efficient. Even the
advanced optimizations done by LLVM weren't good enough until a few months ago.

Compile-time: D is able to run code at compile time too, but only where you ask
for it explicitly, by using or assigning the result where a compile-time constant
is required. I think this means compile-time overflow tests will usually not
happen.

There are routines for run-time overflow tests in C and C++, but I am not
seeing them used. In Delphi, by contrast, I use overflow tests all the time, and
I see code written by other people that has run-time overflow tests switched on.
I think that to catch integral overflow bugs in programs you can't just add a
SafeInt struct; you need a compiler-wide switch. Otherwise most people will not
use it. Array bounds tests are able to catch bugs in normal D code written by
everybody because you don't need to use a SafeArray instead of the built-in
arrays, and because array bounds tests are active by default: you need a switch
to disable them. A bit more syntax is needed to disable tests locally, where
needed.

Bye,
bearophile
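
For comparison, a library-level check of the kind being debated can be sketched with druntime's `core.checkedint` module, which was added to D after this thread took place. The wrapper name `checkedSub` is an assumption for the example; it is opt-in per call, which is precisely the limitation the post objects to.

```d
import core.checkedint : subu;
import std.stdio : writeln;

// Subtract two unsigned values and throw if the result would wrap.
uint checkedSub(uint a, uint b)
{
    bool overflow = false;   // subu sets this flag on wrap-around; it never clears it
    immutable result = subu(a, b, overflow);
    if (overflow)
        throw new Exception("unsigned subtraction wrapped around");
    return result;
}

void main()
{
    writeln(checkedSub(5, 3));  // 2
    try
    {
        checkedSub(0, 1);
    }
    catch (Exception e)
    {
        writeln("caught: ", e.msg);  // caught: unsigned subtraction wrapped around
    }
}
```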
Jul 03 2011
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 7/3/11 3:50 PM, bearophile wrote:
 Andrei:

 Safe on top of flexible is the best design. If there is anything
 preventing you from defining a type with the behavior you mention,
 you may want to file a bug.

I am not a believer. Compile time/run time integral overflows to be "good enough" need to be built-in, more than associative arrays. How do you implement the compile-time ones? Run-time integral overflows need to be tested everywhere you don't explicitly ask for them to not happen, otherwise you will not catch most bugs. Bye, bearophile

This and others (zero- vs. one-based indexing, closed vs. open intervals etc.) are issues with well-understood tradeoffs that could go either way. Making a choice in such matters becomes part of a language's ethos. After a while it becomes clear that rehashing such matters without qualitatively new arguments is futile. Andrei
Jul 04 2011
eles <eles eles.com> writes:
 various arguments. Please don't bring this up again. Thank you.

I won't, but this is because I made that decision minutes ago.
 Well, (almost) enough for now. I also maintain that unsigned types
 should throw out-of-range exceptions (in debug mode, so that release
 will run as fast as it gets) when decremented below zero, unless
 specifically marked as *circular* (i.e. intended behavior) or
 something like this. This will prevent some bugs. I see those quite
 often in my student's homeworks.

 Safe on top of flexible is the best design. If there is anything
 preventing you from defining a type with the behavior you mention, you
 may want to file a bug.

Type information will disappear after compilation. Defining a class over unsigned int (and overloading operators) on top of that will be way too much overhead. In fact, this is what the compiler would do for the debug type.

To not break the existing code: define "uncircular" types that will throw that exception. This could be implemented as a keyword, since it is more like an annotation.
Jul 03 2011
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 7/3/11 4:23 PM, eles wrote:
 various arguments. Please don't bring this up again. Thank you.

I won't, but this is because I made that decision minutes ago.

Even better.
 Well, (almost) enough for now. I also maintain that unsigned types
 should throw out-of-range exceptions (in debug mode, so that release
 will run as fast as it gets) when decremented below zero, unless
 specifically marked as *circular* (i.e. intended behavior) or
 something like this. This will prevent some bugs. I see those quite
 often in my student's homeworks.

 Safe on top of flexible is the best design. If there is anything
 preventing you from defining a type with the behavior you mention, you
 may want to file a bug.

type information will disappear after compilation. defining a class over unsigned int (and overloading operators) on top of that will be way too much overhead. in fact, this is what compiler would do for the debug type.

D is not Java. You may want to define a struct instead of a class. Save for the overflow checks, speed of such a wrapper should stay largely unchanged.
 to not break the existing code: define "uncircular" types that will
 throw that exception. this could be implemented as a  keyword, since
 is more like a annotation.

I have an idea - how about the notation Uncircular!uint to designate such a type? Andrei
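
A rough sketch of what such a library type could look like is given below. Only the name `Uncircular` comes from the post; the member `value`, the choice of operators and the use of a plain `Exception` are assumptions, and the checks are always on here, whereas the request in the thread was for debug-only checking (which could be approximated by moving them into a `debug` block).

```d
import std.stdio : writeln;
import std.traits : isUnsigned;

// Hypothetical wrapper around an unsigned type that refuses to wrap around.
struct Uncircular(T) if (isUnsigned!T)
{
    T value;

    // Checked + and -; other operators are omitted in this sketch.
    Uncircular opBinary(string op)(Uncircular rhs) const
        if (op == "+" || op == "-")
    {
        static if (op == "-")
        {
            if (rhs.value > value)
                throw new Exception("unsigned underflow");
        }
        else
        {
            if (rhs.value > T.max - value)
                throw new Exception("unsigned overflow");
        }
        return Uncircular(cast(T) mixin("value " ~ op ~ " rhs.value"));
    }

    // Checked pre-decrement.
    ref Uncircular opUnary(string op : "--")()
    {
        if (value == 0)
            throw new Exception("unsigned underflow");
        --value;
        return this;
    }
}

void main()
{
    auto a = Uncircular!uint(3);
    auto b = Uncircular!uint(5);
    writeln((b - a).value);     // 2
    try
    {
        auto c = a - b;         // would wrap for a plain uint
    }
    catch (Exception e)
    {
        writeln("caught: ", e.msg);  // caught: unsigned underflow
    }
}
```

Because it is a struct holding a single field, the wrapper has the same size as the underlying integer; the extra cost is the comparison performed in each checked operation.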
Jul 03 2011
eles <eles eles.com> writes:
 I have an idea - how about the notation Uncircular!uint to designate
 such a type?
 Andrei

Either put it into the standard language, or I have a better one: what about dropping printf and starting to use:

mov ah, 9
int 0x21

instead? I am sure it can be done. So, why not drop all of D and go back to coding in assembler?

The point is that "it can be done even in the current context" is a shallow excuse for rejecting better ways to achieve something.

BTW, I really look forward to being shown a piece of D code that *cannot* be done in assembler. Then, I assure you, I will stop looking for better alternatives to existing ones.
Jul 04 2011
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 7/4/11 1:49 PM, eles wrote:
 I have an idea - how about the notation Uncircular!uint to designate
 such a type?
 Andrei

Either put it into the standard language, either I have a better one. what about dropping printf and start using: mov ah, 9 int 0x21 instead? I am sure it can be done. So, why not dropping all D and start to code back into assembler? The point is that "it can be done even if the current context" is a shallow excuse for rejecting better ways to achieve something. BTW, I really look forward to show me an piece of D code that *cannot* be done in assembler. Then, I ensure you, I will stop looking for better alternatives to existing ones.

I don't see much merit in the comparison with assembler. Beyond that, it's a tad presumptuous to assume that you understand the superiority of automated bounds checking whereas others are unable to reach the same understanding.

Automatic built-in overflow checking is a choice in the language design space that has distinct consequences. It's easy to execute in the compiler, so the entry barrier is low. To claim that language designers who didn't go that way simply missed the point makes for a weak story.

Andrei
Jul 04 2011
eles <eles eles.com> writes:
 Automatic built-in overflow checking is a choice in the language design
 space that has distinct consequences. It's easy to execute in the
 compiler so the entry barrier is low. To claim that language designers
 who didn't go that way simply missed the point makes for a weak story.
 Andrei

Well, I was a bit harsh. Apologies. The force of such things lies not in that "they are easy" but in that "they are standard".
Jul 04 2011
bearophile <bearophileHUGS lycos.com> writes:
Steven Schveighoffer:

 To the point -- lots of existing D and C code uses the properties of  
 integer overflow.  If integer overflow is assumed to be an error, then  
 that code is broken, even though the code *expects* overflow to occur, and  
 in fact might *depend* on it occurring.

In this case you wrap the code in something that allows it to overflow without errors, like:

unsafe(overflows) { // code here }

------------------------

Andrei:

This and others (zero- vs. one-based indexing, closed vs. open intervals etc.)
are issues with well-understood tradeoffs that could go either way.<

Integral overflows are not the same thing as indexing and intervals. The latter two are equivalent ways to write the same thing, while overflow checks are a way to spot a class of bugs in code.

Making a choice in such matters becomes part of a language's ethos.<

Right, and I think D Zen is pro-safety.

After a while it becomes clear that rehashing such matters without
qualitatively new arguments is futile.<

I have answered because you have said wrong things. You have implicitly said that good overflow tests are doable with library code, and I have explained why you are wrong. This isn't futile.

Bye,
bearophile
Jul 04 2011
eles <eles eles.com> writes:
 Yes non-release mode is slower, but we are probably talking
 about a very significant slowdown here.  A lot of integer math
 happens in
 D.

What about testing only for user selected variables/types?
 I think a much more effective fix for the language would be to make

 length a signed type.  Then you simply eliminate 99% of integer

 (it's very rare that a signed integer overflows, but not so

 an unsigned one does).

I do not like that (Java-like) path too much. Why lose half of the length range?
Jul 04 2011
Timon Gehr <timon.gehr gmx.ch> writes:
eles wrote:
 Yes non-release mode is slower, but we are probably talking
 about a very significant slowdown here.  A lot of integer math
 happens in
 D.

What about testing only for user selected variables/types?

How is Uncircular!uint not up to that particular task? Arguably, it can be less efficient than a built-in solution, but why do you need that efficiency for debug builds? Cheers, -Timon
Jul 04 2011
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 7/4/11 2:48 PM, bearophile wrote:
 Steven Schveighoffer:

 To the point -- lots of existing D and C code uses the properties
 of integer overflow.  If integer overflow is assumed to be an
 error, then that code is broken, even though the code *expects*
 overflow to occur, and in fact might *depend* on it occurring.

In this case you wrap the code in something that allows it to overflow without errors, like:

unsafe(overflows) { // code here }

This approach has a number of issues. First, addressing transitivity is difficult. If the code in such a scope calls a function, either every function has two versions, or chooses one way to go about it. Each choice has obvious drawbacks. Second, programmers are notoriously bad at choosing which code is affecting bottom line performance, yet this feature explicitly puts the burden on the coder. So code will be littered with amends, yet still be overall slower. This feature has very poor scalability.
 ------------------------

 Andrei:

 This and others (zero- vs. one-based indexing, closed vs. open
 intervals etc.) are issues with well-understood tradeoffs that
 could go either way.

Integral overflows are not the same thing as indexing and intervals.

Of course they're not the same thing. Commonalities and differences.
 Those last two are equivalent ways to write the same thing, while overflow
 tests are a way to spot a class of bugs in code.

Well they also are a solid way to slow down all code.
 Making a choice in such matters becomes part of a language's
 ethos.

Right, and I think D Zen is pro-safety.

You are using a different version of safety than D does. D defines very precisely safety as memory safety. Your definition is larger, less precise, and more ad-hoc.
 After a while it becomes clear that rehashing such matters without
 qualitatively new arguments is futile.

I have answered because you have said wrong things. You have implicitly said that good overflow tests are doable with library code, and I have explained why you are wrong. This isn't futile.

Probably one good thing to get past is the attitude that in such a discussion the other is "wrong". These are simple matters of which understanding does not require any amount of special talent or competence. So the first step is to understand that some may actually value a different choice with different consequences than yours because they find the costs unacceptable. As someone who makes numerous posts and bug reports regarding speed of D code, you should definitely have more appreciation for that view. Andrei
Jul 04 2011
parent reply Mehrdad <wfunction hotmail.com> writes:
Since I didn't see this being mentioned anywhere, I thought I'd mention 
it...

On 7/4/2011 1:58 PM, Andrei Alexandrescu wrote:
 On 7/4/11 2:48 PM, bearophile wrote:
 In this case you wrap the code in something that allows it to
 overflow without errors, like:

 unsafe(overflows) { // code here }

This approach has a number of issues. First, addressing transitivity is difficult. If the code in such a scope calls a function, either every function has two versions, or chooses one way to go about it. Each choice has obvious drawbacks.

C# chooses to limit the scope to the current function, and it works pretty well. The use is to modify the behavior of the *operators*, and hence, there's no transitivity issue because that's just not what it's used for.
 Second, programmers are notoriously bad at choosing which code is
 affecting bottom line performance, yet this feature explicitly puts the
 burden on the coder. So code will be littered with amends, yet still be
 overall slower. This feature has very poor scalability.

Actually, there is **NO** performance issue -- at least not in C#. In fact, if you run this program (with or without optimizations), you will see that they're literally the same almost all the time:

using System;

static class Program
{
    static long globalVar = 0; //Make it static so it doesn't get optimized
    static void Main()
    {
        const long COUNT = 100000000;
        for (;;)
        {
            var start = Environment.TickCount;
            for (long i = 0; i < COUNT; i++)
                checked { globalVar = i * i; }
            System.Console.WriteLine("Checked: {0}", Environment.TickCount - start);

            start = Environment.TickCount;
            for (long i = 0; i < COUNT; i++)
                unchecked { globalVar = i * i; }
            System.Console.WriteLine("Unchecked: {0}", Environment.TickCount - start);
        }
    }
}

There is literally no performance issue. Ever.

Just my 2c
Jul 04 2011
next sibling parent Benjamin Lindley <benjameslindley gmail.com> writes:
On 7/4/2011 7:38 PM, Mehrdad wrote:
 Actually, there is **NO** performance issue -- at least not in C#. In
 fact, if you run this program (with or without optimizations), you will
 see that they're literally the same almost all the time:

 using System;

 static class Program
 {
 static long globalVar = 0; //Make it static so it doesn't get optimized
 static void Main()
 {
 const long COUNT = 100000000;
 for (;;)
 {
 var start = Environment.TickCount;
 for (long i = 0; i < COUNT; i++)
 checked { globalVar = i * i; }
 System.Console.WriteLine("Checked: {0}", Environment.TickCount - start);

 start = Environment.TickCount;
 for (long i = 0; i < COUNT; i++)
 unchecked { globalVar = i * i; }
 System.Console.WriteLine("Unchecked: {0}", Environment.TickCount - start);
 }
 }
 }

 There is literally no performance issue. Ever.


 Just my 2c

I ran your program, and my results are different than yours.

Checked: 1497
Unchecked: 531
Checked: 1138
Unchecked: 453
Checked: 1092
Unchecked: 468
Checked: 1092
Unchecked: 452
Checked: 1077
Unchecked: 452
Checked: 1108
Unchecked: 452
Checked: 1186
Unchecked: 452
Checked: 1092
Unchecked: 437
Checked: 1092
Unchecked: 452
Checked: 1092
Unchecked: 453
Checked: 1138
Unchecked: 437
Jul 04 2011
prev sibling next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 7/4/11 7:38 PM, Mehrdad wrote:
 Since I didn't see this being mentioned anywhere, I thought I'd mention
 it...

 On 7/4/2011 1:58 PM, Andrei Alexandrescu wrote:
 On 7/4/11 2:48 PM, bearophile wrote:
 In this case you wrap the code in something that allows it to
 overflow without errors, like:

 unsafe(overflows) { // code here }

This approach has a number of issues. First, addressing transitivity is difficult. If the code in such a scope calls a function, either every function has two versions, or chooses one way to go about it. Each choice has obvious drawbacks.

C# chooses to limit the scope to the current function, and it works pretty well. The use is to modify the behavior of the *operators*, and hence, there's no transitivity issue because that's just not what it's used for.

Well that's a choice with its inherent tradeoffs.
 Second, programmers are notoriously bad at choosing which code is
 affecting bottom line performance, yet this feature explicitly puts the
 burden on the coder. So code will be littered with amends, yet still be
 overall slower. This feature has very poor scalability.

Actually, there is **NO** performance issue -- at least not in C#. In fact, if you run this program (with or without optimizations), you will see that they're literally the same almost all the time:

using System;

static class Program
{
    static long globalVar = 0; //Make it static so it doesn't get optimized
    static void Main()
    {
        const long COUNT = 100000000;
        for (;;)
        {
            var start = Environment.TickCount;
            for (long i = 0; i < COUNT; i++)
                checked { globalVar = i * i; }
            System.Console.WriteLine("Checked: {0}", Environment.TickCount - start);

            start = Environment.TickCount;
            for (long i = 0; i < COUNT; i++)
                unchecked { globalVar = i * i; }
            System.Console.WriteLine("Unchecked: {0}", Environment.TickCount - start);
        }
    }
}

There is literally no performance issue. Ever.

Isn't it a bit of a stretch to derive a definitive conclusion from one small test? Andrei
Jul 04 2011
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 7/4/2011 5:38 PM, Mehrdad wrote:
 Actually, there is **NO** performance issue -- at least not in C#. In fact, if
 you run this program (with or without optimizations), you will see that they're
 literally the same almost all the time:

It's a bit extraordinary that there is no cost in executing extra instructions. I've seen many programmers (including experts) led down a primrose path on optimizations because they thought A was happening when actually B was. It pays to check the assembler output and see what's happening. In your benchmark case, it is possible that the optimizer discovers that i is in the range 0..COUNT, and that i*i can therefore never overflow, and then the overflow check is eliminated. Or it is possible that the compiler realizes that globalVar, despite being global, is never read from, and hence the assignments to it are dead, and hence the i*i is never computed at all. Or that the loop is unrolled 10 times, and the overflow check is only done once per 10 iterations, burying its cost. In other words perhaps it's a special case optimization happening, and is not indicative at all of the actual cost of overflow checks in the usual case, which cannot be optimized away.
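As a rough illustration of the pitfalls listed above, here is a sketch of a benchmark kernel in D that tries to avoid them. The operands come from a runtime array, so their range cannot be proven at compile time, and the result is returned, so the arithmetic is not dead code. The function names are hypothetical, and the sketch assumes a D runtime recent enough to provide `core.checkedint`, which did not exist when this thread was written.

```d
import core.checkedint : adds, muls;

/// Sum of squares with ordinary wrapping arithmetic.
long sumSquaresWrapping(const long[] xs)
{
    long total = 0;
    foreach (x; xs)
        total += x * x;
    return total;
}

/// Same computation, but every multiply and add is checked.
/// `overflowed` is set to true if any step overflowed.
long sumSquaresChecked(const long[] xs, out bool overflowed)
{
    long total = 0;
    bool flag = false; // core.checkedint only ever sets this flag, never clears it
    foreach (x; xs)
        total = adds(total, muls(x, x, flag), flag);
    overflowed = flag;
    return total;
}

unittest
{
    bool overflowed;
    assert(sumSquaresWrapping([1L, 2L, 3L]) == 14);
    assert(sumSquaresChecked([1L, 2L, 3L], overflowed) == 14);
    assert(!overflowed);
}
```

Timing the two functions over the same large input, and printing both totals, would give a fairer comparison than the loop above, though checking the generated assembler is still the only way to know what is actually being measured.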
Jul 04 2011
prev sibling next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Steven Schveighoffer:

 Or, use a separate type which throws the errors if you wish.

I have recently explained why this is not good enough, or even currently impossible: http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=139950
 I don't want  
 runtime errors thrown in code that I didn't intend to throw.  most of the  
 time, overflows will not occur, so I don't want to go through all my code  
 and have to throw these decorations up where I know it's safe.

The idea is to add two switches to DMD that activate the integral overflow tests (one for signed and one for signed and unsigned). If you compile your code without those, the runtime tests will not happen.
 Besides, D is a systems language, and has no business doing checks
 on every integer instruction.

This argument doesn't hold, Delphi and Ada too are system languages.
 Yes non-release mode is slower, but we are probably talking
 about a very significant slowdown here.  A lot of integer math happens in D.

Have you done benchmarks or are you refusing something just on a gut feeling? I have done benchmarks of the overflow tests in C# and Delphi and the slowdown we are seeing is not significantly worse than the slowdown caused by array bound tests. And release mode is probably going to be independent from the overflows switches. So you are allowed to compile in non-release mode but without overflow tests.
 I think a much more effective fix for the language would be to make slice  
 length a signed type.  Then you simply eliminate 99% of integer overflows  
 (it's very rare that a signed integer overflows, but not so unlikely that  
 an unsigned one does).

My bug report on this was recently closed by Andrei. I don't mind unsigned fixnums if their bounds are enforced.
 At this point, any change to the semantics of the builtin types will be  
 extremely disruptive.  We should focus our efforts elsewhere.

This thing is more than named arguments or syntax sugar to unpack tuples :-) Bye, bearophile
Jul 04 2011
next sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 05.07.2011 1:10, bearophile wrote:
 Steven Schveighoffer:

 Or, use a separate type which throws the errors if you wish.

I have recently explained why this is not good enough, or even currently impossible: http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=139950
 I don't want
 runtime errors thrown in code that I didn't intend to throw.  most of the
 time, overflows will not occur, so I don't want to go through all my code
 and have to throw these decorations up where I know it's safe.

 Besides, D is a systems language, and has no business doing checks
 on every integer instruction.


lacks normal pointer arithmetic:

<quote>
In Delphi 2009, pointer arithmetic, as usable for the PChar type (and PAnsiChar and PWideChar), is now also possible for other pointer types. When and where this is possible is governed by the new $POINTERMATH compiler directive. Pointer arithmetic is generally switched off, but it can be switched on for a piece of code using {$POINTERMATH ON}, and off again using {$POINTERMATH OFF}. For pointer types compiled with pointer arithmetic (pointer math) turned on, pointer arithmetic is generally possible.

Apparently, in Delphi 2009, the new pointer arithmetic doesn't work as intended for pointers to generic types yet. Whatever type the parametric type is instantiated as, indices are not scaled by SizeOf(T), as expected.
</quote>

http://rvelthuis.de/articles/articles-pointers.html
also:
http://stackoverflow.com/questions/4303880/delphi-pointer-arithmetic

-- 
Dmitry Olshansky
Jul 04 2011
parent reply Max Klyga <max.klyga gmail.com> writes:
On 2011-07-05 08:31:46 +0300, Dmitry Olshansky said:

 On 05.07.2011 1:10, bearophile wrote:
 Steven Schveighoffer:
 
 Or, use a separate type which throws the errors if you wish.

I have recently explained why this is not good enough, or even currently impossible: http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=139950
 I don't want
 runtime errors thrown in code that I didn't intend to throw.  most of the
 time, overflows will not occur, so I don't want to go through all my code
 and have to throw these decorations up where I know it's safe.

The idea is to add two switches to DMD that activate the integral overflow tests (one for signed and one for signed and unsigned). If you compile your code without those, the runtime tests will not happen.
 Besides, D is a systems language, and has no business doing checks
 on every integer instruction.


_still_ lacks normal pointer arithmetic:

<quote>
In Delphi 2009, pointer arithmetic, as usable for the PChar type (and PAnsiChar and PWideChar), is now also possible for other pointer types. When and where this is possible is governed by the new $POINTERMATH compiler directive. Pointer arithmetic is generally switched off, but it can be switched on for a piece of code using {$POINTERMATH ON}, and off again using {$POINTERMATH OFF}. For pointer types compiled with pointer arithmetic (pointer math) turned on, pointer arithmetic is generally possible.

Apparently, in Delphi 2009, the new pointer arithmetic doesn't work as intended for pointers to generic types yet. Whatever type the parametric type is instantiated as, indices are not scaled by SizeOf(T), as expected.
</quote>

http://rvelthuis.de/articles/articles-pointers.html
also:
http://stackoverflow.com/questions/4303880/delphi-pointer-arithmetic

Pointer arithmetic is not strictly necessary for a systems language. Several operating systems were written in Oberon. It has no pointer arithmetics. Should it be considered not a systems language?
Jul 04 2011
parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 05.07.2011 10:35, Max Klyga wrote:
 On 2011-07-05 08:31:46 +0300, Dmitry Olshansky said:

 On 05.07.2011 1:10, bearophile wrote:
 Steven Schveighoffer:

 Or, use a separate type which throws the errors if you wish.

I have recently explained why this is not good enough, or even currently impossible: http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=139950
 I don't want
 runtime errors thrown in code that I didn't intend to throw.  most 
 of the
 time, overflows will not occur, so I don't want to go through all 
 my code
 and have to throw these decorations up where I know it's safe.

The idea is to add two switches to DMD that activate the integral overflow tests (one for signed and one for signed and unsigned). If you compile your code without those, the runtime tests will not happen.
 Besides, D is a systems language, and has no business doing checks
 on every integer instruction.


_still_ lacks normal pointer arithmetic:

<quote>
In Delphi 2009, pointer arithmetic, as usable for the PChar type (and PAnsiChar and PWideChar), is now also possible for other pointer types. When and where this is possible is governed by the new $POINTERMATH compiler directive. Pointer arithmetic is generally switched off, but it can be switched on for a piece of code using {$POINTERMATH ON}, and off again using {$POINTERMATH OFF}. For pointer types compiled with pointer arithmetic (pointer math) turned on, pointer arithmetic is generally possible.

Apparently, in Delphi 2009, the new pointer arithmetic doesn't work as intended for pointers to generic types yet. Whatever type the parametric type is instantiated as, indices are not scaled by SizeOf(T), as expected.
</quote>

http://rvelthuis.de/articles/articles-pointers.html
also:
http://stackoverflow.com/questions/4303880/delphi-pointer-arithmetic

Pointer arithmetic is not strictly necessary for a systems language.

Well, having given this some more thought, it's not exactly necessary; what is needed much more is the ability to treat a memory location as either an address or an integer. Given that, arithmetic can be done on integers and the result cast back to a pointer, though that gives you even _less_ safety than direct pointer arithmetic (which at least disallows / and * on pointers, and tracks the size of the object).
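A minimal sketch in D of what "arithmetic on integers, then cast back" looks like next to built-in pointer arithmetic. The helper name `advanceViaInteger` is made up for illustration; the point is that the scaling by the element size has to be written by hand.

```d
/// Advances a pointer by `count` elements using plain integer math.
/// Unlike `p + count`, nothing here scales by T.sizeof automatically.
T* advanceViaInteger(T)(T* p, size_t count)
{
    size_t address = cast(size_t) p;   // treat the address as an integer
    address += count * T.sizeof;       // manual scaling, easy to get wrong
    return cast(T*) address;           // and back to a pointer
}

unittest
{
    int[4] data = [10, 20, 30, 40];
    int* viaPointer = data.ptr + 2;                    // built-in pointer arithmetic
    int* viaInteger = advanceViaInteger(data.ptr, 2);  // same address, less help from the type system
    assert(viaPointer is viaInteger);
    assert(*viaInteger == 30);
}
```

Forgetting the `* T.sizeof` still compiles and simply yields a misaligned address, which is the loss of safety described above.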
 Several operating systems were written in Oberon. It has no pointer 
 arithmetics. Should it be considered not a systems language?

IMHO, yes, generally the features to work as close as possible to "metal" should be first-class in a systems language, else you could fit almost any language with a meta library of "intrinsic functions" and call it a systems language (that would make sense as long as you have a native code compiler).

Looking at Oberon, it seems to do the latter, i.e. stay Pascal but use intrinsic functions (spec of Oberon-07, page 15):

SYSTEM additional definitions that are particular to the specific, underlying computer. In the following, v stands for a variable, x, a, and n for expressions.

Function procedures:

Name        Argument types    Result type    Function
ADR(v)      any               INTEGER        address of variable v
SIZE(T)     any type          INTEGER        size in bytes
BIT(a, n)   a, n: INTEGER     BOOLEAN        bit n of mem[a]

Proper procedures:

Name        Argument types                   Function
GET(a, v)   a: INTEGER; v: any basic type    v := mem[a]
PUT(a, x)   a: INTEGER; x: any basic type    mem[a] := x

It seems Go took the same route, but also using magic types:
http://golang.org/pkg/unsafe/

-- 
Dmitry Olshansky
Jul 05 2011
prev sibling parent reply eles <eles eles.com> writes:
 Note that I've proposed a solution elsewhere to use c* (i.e. cint, clong,
 cuint, culong, etc.) to mean "checked" integral, then you can alias it to
 int or alias it to your struct type depending on debug flags you pass to
 dmd.  This should be a reasonable solution, I might actually use it in
 certain cases, if it's that easy.

I think it is a good proposition. However, name choice conflicts a bit with the choice for complex (creal, cfloat etc.), so better names are needed. Let's say: suint, sulong ("safe") or some other proposal.
 I don't want
 runtime errors thrown in code that I didn't intend to throw.  most of
 the
 time, overflows will not occur, so I don't want to go through all my
 code
 and have to throw these decorations up where I know it's safe.

The idea is to add two switches to DMD that activate the integral overflow tests (one for signed and one for signed and unsigned). If you compile your code without those, the runtime tests will not happen.
 See the solution I posted.  This can be done with a custom type and an
 alias.  It just shouldn't take over builtin types.

Not take over (replace) existing types, but provide alternative types (see "suint" above). The idea is to put those types into the standard language. Only then will they be universally available and take off.
 Are you kidding?  An integer operation is a single instruction.  Testing
 for overflow requires another instruction.  Granted the test may be less
 than the operation, but it's going to slow things down.  We are not
 talking about outputting strings or computing some complex value.  Integer
 math is the bread-and-butter of cpus.  It's what they were made for, to
 compute things.  Herein lies the problem -- you are going to be slowing
 down instructions that take up a large chunk of the code.  It doesn't take
 benchmarking to see that without hardware support, testing all integer
 math instructions is going to cause significant slowdown (I'd say at least
 10%, probably more like 25%) in all code.

In fact, providing "safe" (i.e. overflow-checked) types will not force you to use them. If you don't like them, or you consider that you don't need to slow down your debug build (the release build should run at the same speed), then you'll have the choice of using the classic (let's say "unsafe") types.
 Comparing them to array bounds tests isn't exactly conclusive, because
 array bounds tests prevent a much more insidious bug -- memory
 corruption.  With overflow, you may have a subtle bug, but your code
 should still be sane (i.e. you can debug it).  With memory corruption, the
 code is not sane, you can't necessarily trust anything it says after the
 corruption occurs.

But you can use tools such as valgrind to detect memory corruption, while you cannot use analysis tools to detect *unintended* overflows.
 Typically with an overflow bug, I have bad behavior that immediately
 shows.  With memory corruption, the actual crash may occur long after the
 corruption occurred.

It does not mean that "dormant" bugs like overflows cannot exist and, in the long run, they could be as dangerous (for example, they may also lead to memory corruption if a "positive index" becomes... negative, i.e. overflows). Besides, yes, memory corruption is a dangerous bug, but the fact that this has been addressed does not mean that other sources of bugs should now be neglected. Finally, "typically" is a bit subjective here; it depends on each one's experience.
Jul 05 2011
parent Timon Gehr <timon.gehr gmx.ch> writes:
eles wrote:
 ...
 It does not mean that "dormant" bugs like overflows cannot exist and,
 in the long run, they could be as dangerous (for example, they may
 also lead to memory corruption if "positive index" becomes...
 negative - ie. overflows). Besides, yes, memory corruption is a
 dangerous bug but the fact that this has been addressed does not
 mean that other sources of bugs should now be neglected. Finally,
 "typically" is a bit subjective here, it depends on each one's
 experience.

See http://www.youtube.com/watch?v=kYUrqdUyEpI for some experience in turning off some Ada overflow checks in a buggy system. The problem is that such dormant bugs may be deadly when they occur. They don't occur during debug runs, but mess up the released application. The best way to get a reliable and well-performing system is to write correct code. :o) (You can also install some higher-level error detection and correction capabilities.) Cheers, -Timon
Jul 05 2011
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 7/4/2011 12:48 PM, bearophile wrote:
 In this case you wrap the code in something that allows it to overflow without
 errors, like:

 unsafe(overflows) {
      // code here
 }

Regardless of the merits of doing overflow checks, this is the completely wrong way to go about it. This must be a property of the type, not of its usage. It gets semantically pretty wacky when you have function calls for "code here".
Jul 04 2011
prev sibling parent eles <eles eles.com> writes:
 I think that's a really good idea.  I'm sure there's a great assembly
 newsgroup or forum where you can post your ideas and mock the D community
 for the bunch of boobs that we are.

For the record, I have nothing against the D community. Nor against the D language and I wouldn't spend time here if it would be just to "mock". Yes, I might sometimes choose the wrong methods (inefficient ones) when defending some points-of-view. Speaking about irony and mocking, you made a similar choice in your words above.
Jul 04 2011
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 04 Jul 2011 14:49:03 -0400, eles <eles eles.com> wrote:

 I have an idea - how about the notation Uncircular!uint to designate
 such a type?
 Andrei

Either put it into the standard language, or I have a better one. What about dropping printf and starting to use:

mov ah, 9
int 0x21

instead? I am sure it can be done. So why not drop all of D and go back to coding in assembler?

I think that's a really good idea. I'm sure there's a great assembly newsgroup or forum where you can post your ideas and mock the D community for the bunch of boobs that we are.
 The point is that "it can be done even in the current context" is a
 shallow excuse for rejecting better ways to achieve something.

I think the rejection is for the assumption of superiority. I.e. your way isn't better, or even possible, given the existing code base.

To the point -- lots of existing D and C code uses the properties of integer overflow. If integer overflow is assumed to be an error, then that code is broken, even though the code *expects* overflow to occur, and in fact might *depend* on it occurring.

A more logical path is to build a new type, that's not in any existing code base, which handles overflow using exceptions. Then a developer can choose to use this new type if he wants to throw an exception on overflow.

If this is not acceptable to you, I suggest you drop it anyways -- altering the builtin types is not going to happen. Ever.

-Steve
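One common example of code that depends on wraparound is a multiplicative hash. The sketch below is a 32-bit FNV-1a hash written in D; it is offered only as an illustration of the kind of code meant above, not as something taken from an existing D code base. The multiplication is expected to overflow on nearly every step, and the result is only correct if it silently wraps modulo 2^32.

```d
/// 32-bit FNV-1a hash. The multiplication is meant to wrap modulo 2^32.
uint fnv1a(const(ubyte)[] data)
{
    uint hash = 2166136261u;   // FNV offset basis
    foreach (b; data)
    {
        hash ^= b;
        hash *= 16777619u;     // FNV prime; overflows on almost every step
    }
    return hash;
}

unittest
{
    assert(fnv1a(null) == 2166136261u);                     // empty input: offset basis
    assert(fnv1a(cast(const(ubyte)[]) "a") == 0xE40C292C);  // published FNV-1a test vector
}
```

If every unsigned multiply trapped on overflow, this function would fail on its first byte.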
Jul 04 2011
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 04 Jul 2011 15:48:12 -0400, bearophile <bearophileHUGS lycos.com>  
wrote:

 Steven Schveighoffer:

 To the point -- lots of existing D and C code uses the properties of
 integer overflow.  If integer overflow is assumed to be an error, then
 that code is broken, even though the code *expects* overflow to occur,  
 and
 in fact might *depend* on it occurring.

In this case you wrap the code in something that allows it to overflow without errors, like:

unsafe(overflows) {
    // code here
}

Or, use a separate type which throws the errors if you wish.

I don't want runtime errors thrown in code that I didn't intend to throw. Most of the time, overflows will not occur, so I don't want to go through all my code and have to throw these decorations up where I know it's safe.

Besides, D is a systems language, and has no business doing checks on every integer instruction.

Yes non-release mode is slower, but we are probably talking about a very significant slowdown here. A lot of integer math happens in D.

I think a much more effective fix for the language would be to make slice length a signed type. Then you simply eliminate 99% of integer overflows (it's very rare that a signed integer overflows, but not so unlikely that an unsigned one does).

At this point, any change to the semantics of the builtin types will be extremely disruptive. We should focus our efforts elsewhere.

-Steve
Jul 04 2011
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 04 Jul 2011 16:16:04 -0400, eles <eles eles.com> wrote:

 Yes non-release mode is slower, but we are probably talking
 about a very significant slowdown here.  A lot of integer math
 happens in
 D.

What about testing only for user selected variables/types?

That's fine, just define a new type that does this, so it's user selected. The ultimate goal is to define something like:

struct checkedInt { ... } // throws on overflows

debug(checkOverflows)
    alias checkedInt cint;
else
    alias int cint; // default to no overflow checking

Now you use cint anywhere you want to selectively enable overflow checking. I don't know if the language is good enough to allow the definition of checkedInt to behave just like a builtin int.
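A minimal sketch of how such a type might look in present-day D. The name `CheckedInt` and the choice of a plain `Exception` are assumptions made for illustration, and the sketch relies on `core.checkedint`, a druntime module that did not exist when this thread was written. It covers only `+`, `-` and `*` with an `int` right-hand side, so it is far from behaving "just like a builtin int".

```d
import core.checkedint : adds, subs, muls;

/// Hypothetical checked integer: throws instead of wrapping.
struct CheckedInt
{
    int value;
    alias value this; // lets a CheckedInt be read wherever an int is expected

    /// Handles `a + b`, `a - b` and `a * b` for an int right-hand side.
    CheckedInt opBinary(string op)(int rhs) const
        if (op == "+" || op == "-" || op == "*")
    {
        bool overflow = false;
        static if (op == "+")
            immutable result = adds(value, rhs, overflow);
        else static if (op == "-")
            immutable result = subs(value, rhs, overflow);
        else
            immutable result = muls(value, rhs, overflow);

        if (overflow)
            throw new Exception("integer overflow in " ~ op);
        return CheckedInt(result);
    }
}

// Checked only when compiled with -debug=checkOverflows.
debug (checkOverflows)
    alias cint = CheckedInt;
else
    alias cint = int;

unittest
{
    assert((CheckedInt(2) * 21).value == 42);

    bool thrown = false;
    try { auto r = CheckedInt(int.max) + 1; }
    catch (Exception e) { thrown = true; }
    assert(thrown);
}
```

Code written against `cint` then pays for the checks only in builds that ask for them, which is the trade-off described above.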
 I think a much more effective fix for the language would be to make
 slice length a signed type.  Then you simply eliminate 99% of integer
 overflows (it's very rare that a signed integer overflows, but not so
 unlikely that an unsigned one does).

I do not like that (Java-like) path too much. Why lose half of the length range?

I'm not saying it's the best solution, but it does solve a very common problem. The largest issue with overflow is for unsigned types, because most integers are closer to zero than they are to their maximum, regardless of whether they are signed or not. So having a signed integer protects against most cases of overflow -- when the integer goes less than zero. Unless I'm solving coding puzzles, I rarely have cases where an integer exceeds its maximum. Note that for 64-bit D, this effectively becomes a moot point -- a length of 2^63-1 is larger than any possible memory configuration you can currently have. There is also the possibility of defining a large_array type which does use size_t for the length. Anyway, this doesn't mean I think signed lengths are better than unsigned lengths. It just means I think solving overflows using signed lengths is a better option than solving them using overflow detection and exception throwing. -Steve
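The "less than zero" case can be sketched in a few lines of D. The two helper functions are hypothetical and exist only to contrast the unsigned and signed computations on an empty array.

```d
/// Index of the last element, computed the naive way.
/// With an unsigned length, an empty array makes this wrap around.
size_t lastIndexUnsigned(const int[] arr)
{
    return arr.length - 1; // size_t.max when arr.length == 0
}

/// Same computation with the length converted to a signed type first.
long lastIndexSigned(const int[] arr)
{
    return cast(long) arr.length - 1; // -1 when arr.length == 0
}

unittest
{
    int[] empty;
    assert(lastIndexUnsigned(empty) == size_t.max); // silently wrapped
    assert(lastIndexSigned(empty) == -1);           // an obviously invalid index
    assert(lastIndexSigned([10, 20, 30]) == 2);
}
```

The same wraparound is what makes a decrementing loop such as `for (size_t i = arr.length - 1; i >= 0; --i)` run forever, since an unsigned `i` is never less than zero.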
Jul 05 2011
prev sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 04 Jul 2011 17:10:04 -0400, bearophile <bearophileHUGS lycos.com>  
wrote:

 Steven Schveighoffer:

 Or, use a separate type which throws the errors if you wish.

I have recently explained why this is not good enough, or even currently impossible: http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=139950

You mean because people won't use them? That's exactly my point. I don't *want* to use them. You do, so use them. I might use it in cases where I'm nervous, especially when using decrementing for-loops. But I don't think there would be a significant improvement in code quality if overflows are continually checked. For one, it might be a *nuisance* error, i.e. overflow doesn't matter or was expected. Two, it's a runtime error, so you must execute code in a certain way to see a failure. Note that I've proposed a solution elsewhere to use c* (i.e. cint, clong, cuint, culong, etc.) to mean "checked" integral, then you can alias it to int or alias it to your struct type depending on debug flags you pass to dmd. This should be a reasonable solution, I might actually use it in certain cases, if it's that easy.
 I don't want
 runtime errors thrown in code that I didn't intend to throw.  most of  
 the
 time, overflows will not occur, so I don't want to go through all my  
 code
 and have to throw these decorations up where I know it's safe.

The idea is to add two switches to DMD that activate the integral overflow tests (one for signed and one for signed and unsigned). If you compile your code without those, the runtime tests will not happen.

See the solution I posted. This can be done with a custom type and an alias. It just shouldn't take over builtin types.
 Besides, D is a systems language, and has no business doing checks
 on every integer instruction.

This argument doesn't hold, Delphi and Ada too are system languages.

I don't use them, and I'm less likely to now, knowing they do this. Any systems language that does checks on every integer operation is going to give you a significant slowdown factor.
 Yes non-release mode is slower, but we are probably talking
 about a very significant slowdown here.  A lot of integer math happens  
 in D.

Have you done benchmarks or are you refusing something just on a gut feeling? I have done benchmarks of the overflow tests in C# and Delphi and the slowdown we are seeing is not significantly worse than the slowdown caused by array bound tests. And release mode is probably going to be independent from the overflows switches. So you are allowed to compile in non-release mode but without overflow tests.

Are you kidding? An integer operation is a single instruction. Testing for overflow requires another instruction. Granted the test may be less than the operation, but it's going to slow things down. We are not talking about outputting strings or computing some complex value. Integer math is the bread-and-butter of cpus. It's what they were made for, to compute things. Herein lies the problem -- you are going to be slowing down instructions that take up a large chunk of the code. It doesn't take benchmarking to see that without hardware support, testing all integer math instructions is going to cause significant slowdown (I'd say at least 10%, probably more like 25%) in all code. Comparing them to array bounds tests isn't exactly conclusive, because array bounds tests prevent a much more insidious bug -- memory corruption. With overflow, you may have a subtle bug, but your code should still be sane (i.e. you can debug it). With memory corruption, the code is not sane, you can't necessarily trust anything it says after the corruption occurs. Typically with an overflow bug, I have bad behavior that immediately shows. With memory corruption, the actual crash may occur long after the corruption occurred. -Steve
Jul 05 2011