digitalmars.D - [Submission] D Slices

reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
This is my final submission for the D article contest.

This takes into account all the fixes and suggestions from the first draft  
review.

http://www.dsource.org/projects/dcollections/wiki/ArrayArticle

-Steve
May 31 2011
next sibling parent reply eles <eles eles.com> writes:
 int[] b = a[0..2];   // This is a 'slicing' operation.  b now refers to
 the first two elements of a

Isn't that the first *three* elements of a?
May 31 2011
parent reply Daniel Gibson <metalcaedes gmail.com> writes:
On 31.05.2011 15:13, eles wrote:
 int[] b = a[0..2];   // This is a 'slicing' operation.  b now refers to
 the first two elements of a
 
 is not the first *three* elements of a?

No, it contains a[0] and a[1]. The right boundary of a slice is exclusive.

This makes sense, so you can do stuff like a[1..$] (== a[1..a.length]) to get a slice that contains all elements of a except for the first one (a[0]).

Cheers,
- Daniel
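
A minimal D sketch of the half-open rule described above. The array contents and the helper names `firstTwo` and `allButFirst` are made up for illustration.

```d
import std.stdio : writeln;

// Hypothetical helper: the slice a[0 .. 2] from the article's example.
int[] firstTwo(int[] a)
{
    return a[0 .. 2]; // lower bound included, upper bound excluded
}

// Hypothetical helper: everything except the first element.
int[] allButFirst(int[] a)
{
    return a[1 .. $]; // inside the brackets, $ means a.length
}

void main()
{
    int[] a = [10, 20, 30, 40];
    writeln(firstTwo(a));    // a[0] and a[1] only
    writeln(allButFirst(a)); // a[1] up to and including the last element
}
```

Because the upper bound is excluded, a[1 .. $] needs no "- 1" to reach the last element.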
May 31 2011
parent reply eles <eles eles.com> writes:
 The right boundary of a slice is exclusive.

I think this should be stated more explicitly in the article.

 This makes sense, so you can
 do stuff like a[1..$] (== a[1..a.length]) to get a slice that contains
 all elements of a except for the first one (a[0]).

I disagree. I do not have much influence here, but I will defend my point of view.

I find it quite unpleasant to have to remember which of the left and right bounds is exclusive. Moreover, this precludes slicing with a[i1..i2] where i1 and i2 are only known at runtime and it may be that i2<i1 (and not necessarily i1<i2). You will be forced to write something like:

if (i1 > i2) b = a[i2..i1]; else b = a[i1..i2];

and it is easy to forget, several lines below, whether a[i1] or a[i2] still belongs to the slice.

For example, it would be marvellous to implement the definite-integral convention as int(a,i1,i2)=sign(i1-i2)*sum(a[i1..i2]), in line with the mathematical convention.

For me, the right solution would have been for a[0..$-1] to select all the elements of the array. This way, "$" still means "the length" and one has a clear view of the selection going from a[0] to a["length"-1], as expected by someone used to 0-based indexes. Slicing would always be inclusive, which is a matter of consistency (and is in line with what Walter said about the car battery). Why break this convention?

If it is wrong, design it to appear wrong.
May 31 2011
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/31/11 8:29 AM, eles wrote:
 The right boundary of a slice is exclusive.

 I think it should be stated more obvious in the paper.

 This makes sense, so you can
 do stuff like a[1..$] (== a[1..a.length]) to get a slice that contains
 all elements of a except for the first one (a[0]).

 I disagree, but I have not much influence here, although I will defend
 my point of view.

 I find it quite unpleasant to remember which of the left and right
 bounds are exclusive and, moreover, this precludes slicing with
 a[i1..i2] where i1 and i2 are only known at the runtime and may be
 i2<i1 (and not necessarily i1<i2). You will be forced to do smthng
 like:

 if (i1 > i2) b = a[i2..i1]; else b = a[i1..i2];

 and is easy to forget (several lines below) if a[i1] or a[i2] still
 belongs to the slice or no.

if (i1 > i2) swap(i1, i2);
...

I think it would be a bad application of defensive programming on the part of the compiler to liberally allow swapping limits in a slice. Most often such situations indicate a bug in the program.
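
One way to keep such a swap local is to confine it to a small helper whose parameters are by-value copies. The sketch below assumes a hypothetical function `sliceBetween`; it is not part of Phobos, only `swap` from std.algorithm is.

```d
import std.algorithm : swap;

// Hypothetical helper: slice between two indices given in either order.
T[] sliceBetween(T)(T[] a, size_t i1, size_t i2)
{
    // i1 and i2 are copies, so the caller's variables are left untouched.
    if (i1 > i2)
        swap(i1, i2);
    return a[i1 .. i2]; // still half-open: a[i2] is excluded after the swap
}

void main()
{
    int[] a = [1, 2, 3, 4, 5];
    size_t lo = 3, hi = 1;
    assert(sliceBetween(a, lo, hi) == [2, 3]);
    assert(lo == 3 && hi == 1); // unchanged in the caller
}
```

The swap is opt-in here: plain a[i1 .. i2] keeps reporting reversed bounds as an error.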
 For example, it would be marvellous to implement definite integral
 convention as: int(a,i1,i2)=sign(i1-i2)*sum(a[i1..i2]) w.r.t. the
 mathematical convention.

 For me, the right solution would have been to consider the selection a
 [0..$-1] to select all the elements of the array. This way, "$" still
 means "the length" and one has a clear view of the selection going
 from a[0] to a["length"-1] as expected by someone used to 0-based
 indexes. The slicing would be always inclusive, which is a matter of
 consistence (and goes in line with Walter speaking about the car
 battery). Why to break this convention?

 If it is wrong, design it to appear being wrong.

The "war" between open-right and closed-right limits has been waged ever since Fortran was first invented. It may seem that closed-right limits are more natural, but they are marred by problems of varied degrees of subtlety. For example, representing an empty interval is tenuous. Particularly if you couple it with liberal limit swapping, what is a[1 .. 0]? An empty slice, or the same as the two-element a[0 .. 1]?

Experience has shown that open-right has "won".

Andrei
May 31 2011
parent reply eles <eles eles.com> writes:
 if (i1 > i2) swap(i1, i2);

That will affect the values from that point onward, which is not necessarily intended. It is also a bit of overhead for solving such a small issue. OK, it is simple to swap, but... why be forced to do it?

 The "war" between open-right and closed-right limit has been waged ever
 since Fortran was first invented. It may seem that closed-right limits
 are more natural, but they are marred by problems of varied degrees of
 subtlety. For example, representing an empty interval is tenuous.
 Particularly if you couple it with liberal limit swapping, what is a[1
 .. 0]? An empty slice or the same as the two-elements a[0 .. 1]?
 Experience has shown that open-right has "won".
 Andrei

I agree that the issue is not simple (otherwise there would have been no "war"). I know of no other examples where open-right limits are used. The one other slicing-capable language that I use (intensively) is Matlab. It uses closed-left and closed-right limits, and I tend to see it as a winner in the engineering field precisely because of its ability to manipulate arrays (its older name is MATrix LABoratory, and this is exactly what I am using it for). Some examples of syntax (arrays are 1-based in Matlab):

a=[1,2,3,4]; % array with elements: 1, 2, 3, 4 (of length 4)
b=a(1:3); %b is an array with elements: 1, 2, 3 (of length 3)
c=a(end-1:end); %c is an array with elements: 3, 4 (of length 2)
d=a(2:1); %d is empty
e=a(1:end); %e is an array with elements: 1,2,3,4 (of length 4)
f=a(1:10); %ERROR (exceeds array dimension)
g=a(1:1); %g is an array with elements: 1 (of length 1)

Although there is no direct way to represent an empty array using the closed-left-and-right syntax, I doubt this is important in the real world. Matlab has a special symbol ("[]") for representing empty arrays. Where is the need to represent an empty interval using slice syntax in D?

Moreover, when writing a[1..1] in D, is this an empty or a non-empty array? One convention says that the element a[1] is part of the slice, since the left limit is included; the other says that a[1] is not part of the slice, precisely because the right limit is excluded. So which one weighs more? Is a[1..1] an empty array or an array with one element?

As for syntax consistency, for me a[0..$-1] has the main advantage of not inducing the feeling that a[$] is an element of the array. Elements are counted from 0 to $-1 ("length"-1), and the syntax a[0..$-1] is just a confirmation of that (i.e. "counting all elements").

I am sorry, but I tend to follow the Matlab convention. If there is a field where Matlab is really strong (besides visualization), it is matrix and array operations. A standard document about what Matlab is able to do is: http://home.online.no/~pjacklam/matlab/doc/mtt/doc/mtt.pdf
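
A small D sketch of how the a[1..1] question resolves under the half-open rule: the slice is empty, and a slice's length is the upper bound minus the lower bound. The array contents and the helper name `sliceLength` are illustrative assumptions.

```d
// Hypothetical helper: number of elements in a[lo .. hi].
size_t sliceLength(int[] a, size_t lo, size_t hi)
{
    return a[lo .. hi].length; // equals hi - lo for any valid bounds
}

void main()
{
    int[] a = [1, 2, 3, 4];
    assert(sliceLength(a, 1, 1) == 0); // a[1 .. 1] is empty
    assert(sliceLength(a, 1, 2) == 1); // a[1 .. 2] holds only a[1]
    assert(a[1 .. 2][0] == 2);
    assert(a[$ .. $].length == 0);     // an empty slice at the very end is valid
}
```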
May 31 2011
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/31/11 10:10 AM, eles wrote:
 I agree that the issue is not simple (else, there would have been no
 "war"). I know no other examples where open-right limits are used. I
 use (intensively) just another language capable of slicing, that is
 Matlab. It uses the closed-left and closed-right limits and I tend to
 see it as a winner in the engineering field, precisely because of its
 ability to manipulate arrays (its older name is MATrix LABoratory and
 this is exaxtly why I am using it for). Some examples of syntax
 (arrays are 1-based in Matlab):

 a=[1,2,3,4]; % array with elements: 1, 2, 3, 4 (of length 4)
 b=a(1:3); %b is an array with elements: 1, 2, 3 (of length 3)
 c=a(end-1:end); %c is an array with elements: 3, 4 (of length 2)
 d=a(2:1); %d is empty
 e=a(1:end); %e is an array with elements: 1,2,3,4 (of length 4)
 f=a(1:10); %ERROR (exceeds array dimension)
 g=a(1:1); %g is an array with elements: 1 (of length 1)

I agree that 1-based indexing works pretty well with closed-right intervals; I forgot to mention that in my first response. Even with such an approach, certain things are less obvious, e.g. the number of elements in an interval a..b is b-a+1, not b-a.

All in all, D doesn't attempt to break new ground with open-right intervals. Choosing otherwise would make it gratuitously and jarringly different from all of its major influencers. Though I understand the choice looks odd coming from Matlab, the same could be said about a lot of other languages.

Andrei
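
To make the b-a arithmetic concrete for D's half-open slices, here is a sketch in which one index k both ends the left slice and starts the right one. The helper name `cutAndRejoin` is hypothetical.

```d
// Hypothetical helper: cut a at index k and glue the two halves back together.
int[] cutAndRejoin(int[] a, size_t k)
{
    int[] left = a[0 .. k];  // k - 0 elements
    int[] right = a[k .. $]; // a.length - k elements
    // Adjacent half-open slices neither overlap nor leave a gap.
    assert(left.length + right.length == a.length);
    return left ~ right;     // concatenation allocates a new array
}

void main()
{
    int[] a = [1, 2, 3, 4, 5];
    assert(cutAndRejoin(a, 2) == a);
    assert(cutAndRejoin(a, 0) == a); // empty left half
    assert(cutAndRejoin(a, 5) == a); // empty right half
}
```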
May 31 2011
parent reply eles <eles eles.com> writes:
 I agree that 1-based indexing works pretty well with closed-right
 intervals; forgot to mention that in my first response. Even with such
 an approach, certain things are less obvious, e.g. the number of
 elements in an interval a..b is b-a+1, not b-a.

Yes, but in C too, when going from a[0] to a[N-1], people know there are... (N-1)-0+1 elements (so, b-a+1). It is the same.

Now, why is it that:

for(iterator from a[0] to a[N-1]){ //etc. }
//let us use the above notation for for(i=0; i<=N-1; i++)

is acceptable, but suddenly it is no longer acceptable to write

a[for(iterator from 0 to N-1)]

and one must use

a[for(iterator from 0 to N)]

in order to achieve exactly the same?

The last two expressions are just mental placeholders for a[0..N-1] and for a[0..N] respectively.

 All in all D doesn't attempt to break new ground with open-right
 intervals. It would be gratuitously and jarringly different from all of
 its major influencers. Though I understand the choice looks odd coming
 from Matlab, the same could be said about a lot of other languages.

I don't see that ground. Maybe I simply lack information. Can you help?
May 31 2011
next sibling parent reply Mafi <mafi example.org> writes:
On 31.05.2011 18:16, eles wrote:
 Now, why:

 for(iterator from a[0] to a[N-1]){ //etc. }
 //let use the above notation for for(i=0; i<=N-1; i++)

 is acceptable, but sudden is no more acceptable to write

 a[for(iterator from 0 to N-1)]

 and one must use

 a[for(iterator from 0 to N]]

 in order to achieve exactly the same?

 The last two expressions are just mental placeholders for a[0..N-1]
 and for a[0..N] respectively.

Let for(element from a[0] to a[n]) be a notation for for(i=0; i < n; i++). But then, assuming closed intervals, why do I have to write a[i..n-1] to achieve the same?

You see? This argument of yours is not really good.

Mafi
May 31 2011
parent reply eles <eles eles.com> writes:
== Quote from Mafi (mafi example.org)'s article
 On 31.05.2011 18:16, eles wrote:
 Now, why:

 for(iterator from a[0] to a[N-1]){ //etc. }
 //let use the above notation for for(i=0; i<=N-1; i++)

 is acceptable, but sudden is no more acceptable to write

 a[for(iterator from 0 to N-1)]

 and one must use

 a[for(iterator from 0 to N]]

 in order to achieve exactly the same?

 The last two expressions are just mental placeholders for a[0..N-1]
 and for a[0..N] respectively.

 let for(element from a[0] to a[n]) be a notation for for(i=0; i < n;
 i++) but then, assuming closed intervals, why do I have to write
 a[i..n-1] to archive the same? You see? This argument of yours is not
 really good. Mafi

Why not write for(i=0; i<=n-1; i++)? Conceptually, you are iterating over the elements from a[0] to a[n-1]; this is why the index goes from 0 to n-1 (even if you write i<n).

What if the length of the array already equals UTYPE_MAX, i.e. the maximum value representable in size_t? (I assume that indexes are of type size_t and go from 0 to UTYPE_MAX.) How would you then write the condition for the last element?

UTYPE_MAX<UTYPE_MAX+1

The right side is an overflow.

You see? This argument of yours is not really good.
May 31 2011
parent eles <eles eles.com> writes:
 if n is unsigned int, and 0, then this becomes i = 0; i <=

 Basically, using subtraction in loop conditions is a big no-no.

Yes, I have been trapped there, more than once. And yes, n=0 is a special case. And yes, it is a big no-no. It also appears when *i* is an unsigned int and you are decrementing it and comparing it (for equality) against 0: the loop is infinite.

For me, conceptually, the problem is simply that unsigned types, once at zero, should throw an exception if they are decremented. It is illogical to allow such an operation. However, type information is only available at compile time, so it is difficult to take measures against it while the program is running. I thought, however, that those exceptions could be thrown in the "debug" version.

However, that is no reason to make UTYPE_MAX a big no-no instead. I think a better solution is needed for the following problem:

- avoiding decrementing the index outside its logical domain
- maintaining the conceptually consistent representation of both the index and n as unsigned ints.

Unfortunately, I have no good solution for that problem. Java dropped unsigned types completely and thus got rid of this problem. But for me it is still a bit illogical to force an index or an array length to be signed, as they are by definition *always* non-negative quantities. It is like being forced to use just half of the available representable range in order to avoid some corner case.

 You are
 much better off to write using addition:
 i + 1 <= n
 But then i < n looks so much better.

I doubt it. It is a matter of taste. I think "looks" does not necessarily mean "good".

 COnceptually, you are iterating elements from a[0] to a[n-1], this is
 why index goes from 0 to n-1 (even if you write i<n).

 What about if length of the array already equals UTYPE_MAX, ie. the
 maximum value representable on the size_t? (I assume that indexes are
 on size_t and they go from 0 to UTYPE_MAX).

 items in any slice.
 Second, there are very very few use cases where size_t.max is used

 iteration.

That is not the point. "There are very very few cases where array lengths could not be passed as a second parameter to the function, beside the pointer to the array". "There are very very few cases when switch branches should not fall through". "There are very very few cases when more than 640 kB of RAM would be necessary (so, let's introduce the A20 gate (http://en.wikipedia.org/wiki/A20_line#A20_gate)!!!)" etc.

The point is: why allow an inconsistency?
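
For reference, a short D sketch of the unsigned wrap-around discussed in this post, together with one way to iterate downward without ever decrementing below zero. The helper name `indicesDescending` is hypothetical; `foreach_reverse` over a half-open range is a standard D construct.

```d
// Hypothetical helper: collect the indices of 0 .. n in descending order.
size_t[] indicesDescending(size_t n)
{
    size_t[] visited;
    // foreach_reverse over the half-open range 0 .. n visits n-1 down to 0.
    // The code never writes n - 1 and never compares an unsigned index
    // against zero after a decrement.
    foreach_reverse (i; 0 .. n)
        visited ~= i;
    return visited;
}

void main()
{
    size_t n = 0;
    // Unsigned subtraction wraps around instead of going negative, so a
    // condition like i <= n - 1 holds for every i when n == 0.
    assert(n - 1 == size_t.max);

    size_t[] expected = [2, 1, 0];
    assert(indicesDescending(3) == expected);
    assert(indicesDescending(0).length == 0); // empty range: body never runs
}
```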
May 31 2011
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/31/11 11:16 AM, eles wrote:
 I agree that 1-based indexing works pretty well with closed-right
 intervals; forgot to mention that in my first response. Even with such
 an approach, certain things are less obvious, e.g. the number of
 elements in an interval a..b is b-a+1, not b-a.

 Yes, but in C too, when going from a[0] to a[N-1], people know there
 are... (N-1)-0+1 elements (so, b-a+1). It is the same.

 Now, why:

 for(iterator from a[0] to a[N-1]){ //etc. }
 //let use the above notation for for(i=0; i<=N-1; i++)

 is acceptable, but sudden is no more acceptable to write

 a[for(iterator from 0 to N-1)]

 and one must use

 a[for(iterator from 0 to N]]

 in order to achieve exactly the same?

 The last two expressions are just mental placeholders for a[0..N-1]
 and for a[0..N] respectively.

 All in all D doesn't attempt to break new ground with open-right
 intervals. It would be gratuitously and jarringly different from all of
 its major influencers. Though I understand the choice looks odd coming
 from Matlab, the same could be said about a lot of other languages.

 I don't see that ground. Maybe I simply lack information. Can you help?

As I mentioned, the issues involved are of increasing subtlety. As you wrote, C programmers iterate upward like this:

for (int i = 0; i < N; i++)

However, if N is the length of an array, it has the unsigned type size_t, and therefore i should follow suit:

for (size_t i = 0; i < N; i++)

This is _not_ equivalent to:

for (size_t i = 0; i <= N - 1; i++)

because at N == 0, iteration will take a very long time (and go to all kinds of seedy places).

But often C programmers iterate with pointers between given limits:

for (int* p = a; p < a + N; p++)

It would appear that this does take care of the N == 0 case, so the loop should be equivalent to:

for (int* p = a; p <= a + N - 1; p++)

Alas, it's not. There is a SPECIAL RULE in the C programming language that says pointers EXACTLY ONE past the end of an array are ALWAYS comparable for (in)equality and ordering with pointers inside the array. There is no such rule for pointers one BEFORE the array.

Therefore, C programmers who use pointers and lengths or pairs of pointers ALWAYS need to use open-right intervals, because otherwise they would be unable to iterate with well-defined code. Essentially, C legalizes open-right and only open-right intervals inside the language. C++ carried that rule through.

C++ programmers who use iterators actually write loops a bit differently. To go from a to b they write:

for (SomeIterator it = a; it != b; ++it)

They use "!=" instead of "<" because there are iterators that compare for inequality but not for ordering. Furthermore, they use ++it instead of it++ because the latter may be more expensive. A conscientious C++ programmer wants to write code that makes the weakest assumptions about the types involved.

This setup has as a direct consequence the fact that ranges expressed as pairs of iterators MUST use the right-open convention. Otherwise it would be impossible to express an empty range, and people would have a very hard time iterating even a non-empty one, as they'd have to write:

// Assume non-empty closed range
for (SomeIterator it = a; ; ++it)
{
    ... code ...
    if (it == b) break;
}

or something similar.

The STL also carefully defines the behavior of iterators one past the valid range, a move with many nice consequences, including consistency with pointers and the ease of expressing an empty range as one with a == b. Really, if there was any doubt that right-open is a good choice, the STL blew it off its hinges.

That's not to say right-open ranges are always best. One example where a closed range is necessary is with expressing total closed sets. Consider e.g. a random function:

uint random(uint min, uint max);

In this case it's perfectly normal for random to return a number in [min, max]. If random were defined as:

uint random(uint min, uint one_after_max);

it would be impossible (or highly unintuitive) to generate a random number that has uint.max as its largest value.

It could be said that closed intervals make it difficult to express empty ranges, whereas right-open intervals make it difficult to express total ranges. It turns out the former are a common case whereas the latter is a rarely-encountered one. That's why we use closed intervals in std.random and case statements but open intervals everywhere else.

At this point, right-open is so embedded in the ethos of D that it would be pretty much impossible to introduce an exception for slices without confusing a lot of people -- all for hardly one good reason.

Andrei
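
Two of the points above can be sketched in D itself: iterating a half-open pointer pair (which also covers the empty range), and a case range as one of the places where D uses a closed interval. The function names `sumBetween` and `isLowerAscii` are hypothetical and exist only for this illustration.

```d
// Sum the ints in the half-open pointer range [begin, end).
// Pointer arithmetic like this is @system code in D.
int sumBetween(const(int)* begin, const(int)* end)
{
    int total = 0;
    // != suffices: begin == end is the empty range, so the body never runs.
    for (auto p = begin; p != end; ++p)
        total += *p;
    return total;
}

// A case range is closed on both sides: 'a' and 'z' are both matched.
bool isLowerAscii(char c)
{
    switch (c)
    {
        case 'a': .. case 'z':
            return true;
        default:
            return false;
    }
}

void main()
{
    int[] a = [1, 2, 3];
    // a.ptr + a.length points one past the last element; it is compared
    // against but never dereferenced.
    assert(sumBetween(a.ptr, a.ptr + a.length) == 6);
    assert(sumBetween(a.ptr, a.ptr) == 0); // empty range
    assert(isLowerAscii('a') && isLowerAscii('z') && !isLowerAscii('A'));
}
```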
May 31 2011
parent Walter Bright <newshound2 digitalmars.com> writes:
On 5/31/2011 10:57 AM, Andrei Alexandrescu wrote:
 As I mentioned, the issues involved are of increasing subtlety.

Thanks, Andrei. I think this is a pretty awesome summary of the issues.
May 31 2011
prev sibling next sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
eles wrote:
 if (i1 > i2) swap(i1, i2);

 That will affect further values from that point onward, which could
 not be necessarily intended. Also, is a bit of overhead for solving
 such a little issue. OK, it is simple to swap but... why to be force
 to do it?

 The "war" between open-right and closed-right limit has been waged ever
 since Fortran was first invented. It may seem that closed-right limits
 are more natural, but they are marred by problems of varied degrees of
 subtlety. For example, representing an empty interval is tenuous.
 Particularly if you couple it with liberal limit swapping, what is a[1
 .. 0]? An empty slice or the same as the two-elements a[0 .. 1]?
 Experience has shown that open-right has "won".
 Andrei

 I agree that the issue is not simple (else, there would have been no
 "war"). I know no other examples where open-right limits are used. I
 use (intensively) just another language capable of slicing, that is
 Matlab. It uses the closed-left and closed-right limits and I tend to
 see it as a winner in the engineering field, precisely because of its
 ability to manipulate arrays (its older name is MATrix LABoratory and
 this is exaxtly why I am using it for). Some examples of syntax
 (arrays are 1-based in Matlab):

 a=[1,2,3,4]; % array with elements: 1, 2, 3, 4 (of length 4)
 b=a(1:3); %b is an array with elements: 1, 2, 3 (of length 3)
 c=a(end-1:end); %c is an array with elements: 3, 4 (of length 2)
 d=a(2:1); %d is empty
 e=a(1:end); %e is an array with elements: 1,2,3,4 (of length 4)
 f=a(1:10); %ERROR (exceeds array dimension)
 g=a(1:1); %g is an array with elements: 1 (of length 1)

 Although there is no straight way to represent an empty array using
 the close-left&right syntax, I doubt this is important in real world.
 Matlab has a special symbol (which is "[]") for representing empty
 arrays. Where the need to represent an empty interval using slice
 syntax in D?

 Moreover, when writing in D: a[1..1] is this an empty or a non-empty
 array? One convention says that the element a[1] is part of the slice,
 since the left limit is included and the other convention says a[1] is
 not part of the slice precisely because the right limit is excluded.
 So, which one weights more? So, is a[1..1] an empty array or an array
 with one element?

See Andrej Mitrovic's image.
 However, in syntax consistency, for me a[0..$-1] has the main
 advantage of not inducing the feeling that a[$] is an element of the
 array. Elements are counted from 0 to $-1 ("length"-1) and the syntax
 a[0..$-1] is just a confirmation for that (ie. "counting all
 elements").

 I am sorry, but I tend to follow the Matlab convention. If there is a
 field where Matlab is really strong (besides visualization), is
 matrix and array operation. A standard document about what Matlab is
 able to do is: http://home.online.no/~pjacklam/matlab/doc/mtt/doc/
 mtt.pdf

Ah okay. Now I get the context. D slices are quite different from Matlab arrays/vectors. Matlab slices are value types; all the data is always copied on slicing.

But in Matlab:

v=[1,2,3,4];
a=v(3:1);
b=v(1:3);

will give you a=[], b=[1,2,3], right? So you cannot swap them freely in Matlab either?

But D slices are different from Matlab slices. They are merely a window onto the same data. Slicing and changing values in the slice will change the original array. For such semantics, the way D does it is just right.

In D there isn't even a way to create 2D slices from 2D arrays, so D slices really are not what you want for linear algebra. If you want to do matrix processing in D, you would probably first have to implement a vector and a matrix struct. They would have semantics defined by you, so you could have slicing just as you want it (I do not know how a very nice syntax can be achieved though).

Timon
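
The "window onto the same data" behaviour can be sketched as follows; `.dup` is the built-in way to get a Matlab-like independent copy. The helper names `writeThroughSlice` and `writeToCopy` are made up for this illustration.

```d
// Hypothetical helper: write through a slice of a and report a[1] afterwards.
int writeThroughSlice(int[] a)
{
    int[] window = a[1 .. 3]; // no copy: window refers to a[1] and a[2]
    window[0] = 99;           // same memory as a[1]
    return a[1];
}

// Hypothetical helper: write to a .dup copy instead.
int writeToCopy(int[] a)
{
    int[] copy = a[1 .. 3].dup; // .dup allocates an independent copy
    copy[0] = 99;
    return a[1];
}

void main()
{
    int[] a = [1, 2, 3, 4];
    assert(writeToCopy(a) == 2);        // original untouched
    assert(writeThroughSlice(a) == 99); // original changed through the slice
}
```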
May 31 2011
prev sibling next sibling parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2011-05-31 11:10:59 -0400, eles <eles eles.com> said:

 I agree that the issue is not simple (else, there would have been no
 "war"). I know no other examples where open-right limits are used. I
 use (intensively) just another language capable of slicing, that is
 Matlab. It uses the closed-left and closed-right limits and I tend to
 see it as a winner in the engineering field, precisely because of its
 ability to manipulate arrays (its older name is MATrix LABoratory and
 this is exaxtly why I am using it for).

I don't think you can enter this debate without bringing in the other war, about zero-based vs. one-based indices. Matlab's first index is one, and I think this fits very naturally with the closed-right limit. In many fields, one-based indices are common because they are easier to reason with. But in computers, where the index is generally an offset from a base address, zero-based is much more prevalent, as it better represents what the machine is actually doing. And it follows that open-ended limits on the right are the norm.

--
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
May 31 2011
parent reply eles <eles eles.com> writes:
 I don't think you can enter this debate without bringing the other war
 about zero-based indices vs. one-based indices. Matlab first's index
 number is number one, and I think this fits very naturally with the
 closed-right limit. In many fields, one-based indices are common
 because they are easier to reason with. But in computers, where the
 index is generally an offset from a base address, zero-based is much
 more prevalent as it better represents what the machine is actually
 doing. And it follows that open-ended limits on the right are the norm.

Actually, I think 0-based vs. 1-based is of little importance to whether limits should be closed or open. Why do you think otherwise? Why is it "natural" to use an open limit for 0-based indexing? I think it is quite subjective. Why on the right and not on the left?

Can you point me to an example of such a norm? I have no knowledge of that use.

I hope that some day someone will not have to write an article like http://drdobbs.com/blogs/cpp/228701625 but targeting... D's biggest mistake was to use an open limit on the right.

I think the idea of reducing array information to just a pointer seemed very natural to Kernighan and Ritchie. Otherwise they would not have used it.
May 31 2011
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/31/11 11:10 AM, eles wrote:
 I hope some day someone would not have to write a paper like http://
 drdobbs.com/blogs/cpp/228701625 but targetting... D's biggest mistake
 was to use open-limit on the right.

I sure wish that were the biggest mistake! :o)

Andrei
May 31 2011
parent reply eles <eles yaho.com> writes:
== Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s
article
 On 5/31/11 11:10 AM, eles wrote:
 I hope some day someone would not have to write a paper like http://
 drdobbs.com/blogs/cpp/228701625 but targetting... D's biggest mistake
 was to use open-limit on the right.

 I sure wish that were the biggest mistake! :o)
 Andrei

Maybe you are right and there are others, too.

This is off-topic, but I don't understand why D did not choose to require an explicit declaration of intent, "break" or "fall", after branches in switch statements (while dropping the implicit "fall"). It would not change the behavior of existing or inherited (from C) code. It would just flag such code as illegal and force the programmer to revise it and make sure it behaves as intended.
May 31 2011
next sibling parent reply KennyTM~ <kennytm gmail.com> writes:
On Jun 1, 11 02:12, eles wrote:
 == Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s
 article
 On 5/31/11 11:10 AM, eles wrote:
 I hope some day someone would not have to write a paper like http://
 drdobbs.com/blogs/cpp/228701625 but targetting... D's biggest mistake
 was to use open-limit on the right.

 I sure wish that were the biggest mistake! :o)
 Andrei

Maybe you are right and there are others, too. Is off-topic, but I won't understand why D did not choose to explicitly declare intentions of "break" or "fall" after branches in switch statements (while dropping implicit "fall"). It won't break existing or inherited (from C) code. It will just signal that it is illegal and force the programmer to revise it and to make sure it behaves as intended.

This has been discussed a lot of times before. See http://digitalmars.com/d/archives/digitalmars/D/About_switch_case_statements..._101110.html#N101112.
May 31 2011
parent reply eles <eles eles.com> writes:
 This has been discussed a lot of times before. See
 http://digitalmars.com/d/archives/digitalmars/D/

Thank you. I know. I have followed this newsgroup since the days when EvilOne Minayev was still posting here, and I also witnessed D 1.0, the old D group dying and the new digitalmars.D (because of Google indexing, while we are on the subject of indexes) and so on.

I know this has been discussed before. What I do not agree with is the conclusion and the reasoning behind it, which mainly is: "One of D's design policies is that a D code that looks like C code should behave like C."

Well, that is a BAAAAD policy! Yes, if C code *is accepted as is*, then it should behave *like C*. I agree! But there is also a lot of C code that is simply *not accepted*. And I think it was Don Clugston who tracked down, some time ago, a strange and difficult bug caused by the fact that D was still accepting the cumbersome C syntax for declaring function pointers and arrays in the old style. Walter got rid of that syntax.

So the point is not about *accepting C-like code and imposing a different behavior* (which would be a distraction for C programmers and prone to bugs, because old habits die hard) but about *simply NOT accepting some kinds of C code*. With errors and so on.

Even for the existing C code, all that this asks is to write a "fall" in those branches that do not have a "break". Are there so many? Is this an overwhelming task?
May 31 2011
parent KennyTM~ <kennytm gmail.com> writes:
On Jun 1, 11 03:48, eles wrote:
 This has been discussed a lot of times before. See
 http://digitalmars.com/d/archives/digitalmars/D/

Thank you. I know. I follow this newsgroup since the days when EvilOne Minayev was still posting here and also witnessed D 1.0 and old D group dying and the new digitalmars.D (because of Google indexing - while we are at indexes) and so on. I know this was discussing before. What I do not agree is the conclusion and the reason behind, which mainly is: "One of D's design policies is that a D code that looks like C code should behave like C." Wel, is BAAAAD policy! Yes, if C code *is accepted as is*, then it should behave *like C*. I agree! But there is also a lot of C code that is simply *not accepted*. An I think it was Don Clugston that tracked down some time ago some strange and difficult bug caused by the fact that D was still accepting the C cumbersome syntax for declaring function pointers and arrays in the old style. Walter got rid of that syntax. So, the point is not about *accepting C-like code and imposing a different behavior* (which would be a distraction for C programmers and prone to bugs because old habits die hard) but *simply NOT accept some kind of C-code*. With errors and so on. Even for the existing C code, all that this is asking is to write some "fall" for those branches that do not have "break". Are there so many? Is this an overwhelming task?

I agree that implicit fall-through should be disallowed for a non-empty case. Also, the conclusion is "Walter thinks he uses fall-through frequently, so it should stay", not really because of C.
May 31 2011
prev sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
eles wrote:
 == Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s
 article
 On 5/31/11 11:10 AM, eles wrote:
 I hope some day someone would not have to write a paper like http://
 drdobbs.com/blogs/cpp/228701625 but targetting... D's biggest mistake
 was to use open-limit on the right.

 I sure wish that were the biggest mistake! :o)
 Andrei

Maybe you are right and there are others, too. Is off-topic, but I won't understand why D did not choose to explicitly declare intentions of "break" or "fall" after branches in switch statements (while dropping implicit "fall"). It won't break existing or inherited (from C) code. It will just signal that it is illegal and force the programmer to revise it and to make sure it behaves as intended.

In my understanding, you use switch if you want fall through somewhere and if() else if() else ... otherwise. (or you remember to break properly) Yes, it can cause bugs. The other possibility would be to add excessive verbosity. (Have a look into the Eiffel programming language.) I am happy with how D's switch works, but I understand your concerns too.

Timon
May 31 2011
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/31/11 1:36 PM, Timon Gehr wrote:
 eles wrote:
 == Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s
 article
 On 5/31/11 11:10 AM, eles wrote:
 I hope some day someone would not have to write a paper like


 drdobbs.com/blogs/cpp/228701625 but targetting... D's biggest


 was to use open-limit on the right.

Andrei

Maybe you are right and there are others, too. Is off-topic, but I won't understand why D did not choose to explicitly declare intentions of "break" or "fall" after branches in switch statements (while dropping implicit "fall"). It won't break existing or inherited (from C) code. It will just signal that it is illegal and force the programmer to revise it and to make sure it behaves as intended.

In my understanding, you use switch if you want fall through somewhere and if() else if() else ... otherwise. (or you remember to break properly) Yes, it can cause bugs. The other possibility would be to add excessive verbosity. (Have a look into the Eiffel programming language.) I am happy with how D's switch works, but I understand your concerns too. Timon

I think a better guideline is to use switch if you have choices of similar likelihood and if/else chains otherwise. This discussion has been carried before. People have collected metrics about their code and others'. It turned out that even Walter, who perceives using fall through "all the time" was factually wrong (see e.g. http://www.digitalmars.com/pnews/read.php?server=news.digitalmars.com&group=digitalmars.D&artnum=101125). The evidence collected makes it clear to any reasonable observer that enforcing flow of control statements after code in case labels would mark a net improvement for D. How big? Probably not bigger than operating other improvements. But definitely not negligible. Andrei
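To make "flow of control statements after code in case labels" concrete, here is a minimal sketch in D in which every non-empty case ends in an explicit statement. `goto case;` is the existing D statement for a deliberate fall-through into the next label; the function name `classify` and its strings are made up for illustration and do not come from the thread.

```d
import std.stdio;

// Hypothetical helper, for illustration only.
string classify(int n)
{
    string result;
    switch (n)
    {
        case 0:
            result ~= "zero ";
            goto case;          // deliberate, explicit fall-through into case 1
        case 1:
            result ~= "small";
            break;
        case 2, 3:              // several labels sharing one body (empty cases)
            result ~= "medium";
            break;
        default:
            result ~= "large";
            break;
    }
    return result;
}

void main()
{
    assert(classify(0) == "zero small");
    assert(classify(1) == "small");
    assert(classify(3) == "medium");
    assert(classify(9) == "large");
    writeln("ok");
}
```

Written this way, the intent of each branch is visible at the end of the branch itself, which is the property being argued for here.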
May 31 2011
parent eles <eles eles.com> writes:
 The evidence collected makes it clear to any reasonable observer

 enforcing flow of control statements after code in case labels would
 mark a net improvement for D. How big? Probably not bigger than
 operating other improvements. But definitely not negligible.
 Andrei

Then why not fight in order to win that "not negligible" advantage for D? Is the D language in such good shape that it can afford the luxury of making mistakes? There is no Google, no Sun Microsystems, no Microsoft behind D! Why not aim for the clear cut? Allowing implicit "fall" is just the opposite of what Walter was presenting at the latest conference: "do not rely on educating users, enforce it by design!". *Enforce* it! I fail to see a pertinent reason why... not.
May 31 2011
prev sibling next sibling parent reply David Nadlinger <see klickverbot.at> writes:
On 5/31/11 6:10 PM, eles wrote:
 Actually, I think 0-based vs. 1-based is of little importance in the
 fact if limits should be closed or open. Why do you think the
 contrary?
 Why it is "natural" to use open-limit for 0-based? I think it is
 quite subjective. Why to the right and not to the left?

for (int i = 0; i < array.length; ++i) {}

array[0..array.length]

David
May 31 2011
parent reply eles <eles eles.com> writes:
== Quote from David Nadlinger (see klickverbot.at)'s article
 On 5/31/11 6:10 PM, eles wrote:
 Actually, I think 0-based vs. 1-based is of little importance in


 fact if limits should be closed or open. Why do you think the
 contrary?
 Why it is "natural" to use open-limit for 0-based? I think it is
 quite subjective. Why to the right and not to the left?

array[0..array.length] David

First: what happens in your example if array.length=2^n-1, where n is the number of bits used to represent i? How would you increment ++i then?

Second:

for (int i = 0; i <= array.length-1; i++) {}

array[0..array.length-1]

I should note that from mathematical point of view, "<" is not a relation of order, as it lacks reflexivity (meaning a<b and b>a does not imply a==b).

Only <= and >= are relations of order, as they are also reflexive (as a<=b and b<=a implies a==b).
May 31 2011
next sibling parent reply KennyTM~ <kennytm gmail.com> writes:
On Jun 1, 11 02:16, eles wrote:
 == Quote from David Nadlinger (see klickverbot.at)'s article
 On 5/31/11 6:10 PM, eles wrote:
 Actually, I think 0-based vs. 1-based is of little importance in


 fact if limits should be closed or open. Why do you think the
 contrary?
 Why it is "natural" to use open-limit for 0-based? I think it is
 quite subjective. Why to the right and not to the left?

array[0..array.length] David

First: what happens in your example if array.length=2^n-1, where n is the number of bits used to represent i? How would you increment ++i then? Second: for (int i = 0; i<= array.length-1; i++) {} array[0..array.length-1] I should note that from mathematical point of view, "<" is not a relation of order, as it lacks reflexivity (meaning a<b and b>a does not imply a==b). Only<= and>= are relations of order, as they are also reflexive (as a<=b and b<=a implies a==b).

If your program needs an array of 4 billion elements (2e+19 elements on a 64-bit system), you're programming it wrong. This is a problem that won't arise in practice. The advantages of using a closed range are small compared with those of an open range (as shown by other people).
May 31 2011
parent reply eles <eles eles.com> writes:
 If your program needs an array of 4 billion elements (2e+19

 64-bit system), you're programming it wrong.
 This is an problem that won't arise practically.

You are limiting yourself to Desktop PC. It is not always the case. There are also other platforms (embedded D, let's say), where constraints could be different. Second, you are addressing the rarity of the case. Yes, is a scholastic case w.r.t. the today needs. What is wrong and what is good is quite subjective to say it.

 The advantages of using closed range is small compared with those of
 open range (as shown by other people).

But what are the advantages of using open range over closed range? I fail to see any advantages. Yes, I see those other people (and quite many), but I fail to see any advantage except the louder talk.
May 31 2011
parent reply Jesse Phillips <jessekphillips+D gmail.com> writes:
eles Wrote:

 But what are the advantages of using open range over closed range? I
 fail to see any advantages. Yes, I see those other people (and quite
 many), but I fail to see any advantage except the louder talk.

Quite simply, there aren't any. But frankly your claims for having a closed range also aren't advantages. It is all style and familiarity (no different than the python whitespace).
May 31 2011
parent reply eles <eles eles.com> writes:
 Quite simply, there aren't any. But frankly your claims for having

 familiarity (no different than the python whitespace).

Good point and thanks. However, D is not trying to become Python and, more, it does not try to conquer the world of interpreted languages. And there is a *big* world outside Python world. There are no advantages of one vs. other side, let's accept that (although I have some doubts). But, let's keep it straight. At least is a better score than some hours ago.
May 31 2011
next sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
eles wrote:
 You can live a life without having to think about many things.
 However, experience proves that thinking is generally better than not
 "having to think about it". What if a bug?

eles wrote:
 Allowing implicit "fall" is just opposite of that Walter was
 presenting at the latest conference: "do not rely on educating users,
 enforce it by design!".

Also hopefully my last post on the subject. Timon
May 31 2011
prev sibling parent Jesse Phillips <jessekphillips+D gmail.com> writes:
eles Wrote:

 Quite simply, there aren't any. But frankly your claims for having

familiarity (no different than the python whitespace). Good point and thanks. However, D is not trying to become Python and, more, it does not try to conquer the world of interpreted languages. And there is a *big* world outside Python world. There are no advantages of one vs. other side, let's accept that (although I have some doubts). But, let's keep it straight. At least is a better score than some hours ago.

This has nothing to do with Python. My claim is that you and others will say it is inconsistent and unintuitive, while myself and others will claim that there is nothing it needs to be consistent with, and that it is in fact intuitive. Consistency can be argued, but then what is it supposed to be consistent with? Walter's goals? Other languages? iota or uniform (random number generator)?

I can't tell you what the best choice is, but I can tell you I'm not writing:

for(int i = 0; i <= arr.length-1; i++)

just because it adds 3 extra characters. Just as I wouldn't write

for(int i = 1; i < arr.length + 1; i++)

for a 1-based array.
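Beyond the extra characters, the inclusive form has one practical wrinkle in D that a short sketch can show: `arr.length` is a `size_t` (unsigned), so for an empty array `arr.length - 1` wraps around to `size_t.max` instead of becoming -1. The helper names `countExclusive` and `countInclusive` are hypothetical and exist only for this illustration.

```d
import std.stdio;

// Hypothetical helpers: count how many iterations each loop form performs.
size_t countExclusive(const int[] arr)
{
    size_t steps = 0;
    for (size_t i = 0; i < arr.length; i++)   // exclusive upper bound
        ++steps;
    return steps;
}

size_t countInclusive(const int[] arr)
{
    // Guard required: for an empty array, arr.length - 1 wraps to size_t.max,
    // so the loop below would not stop where intended.
    if (arr.length == 0)
        return 0;
    size_t steps = 0;
    for (size_t i = 0; i <= arr.length - 1; i++)   // inclusive upper bound
        ++steps;
    return steps;
}

void main()
{
    int[] empty;
    assert(empty.length - 1 == size_t.max);   // unsigned wrap-around
    assert(countExclusive(empty) == 0);
    assert(countInclusive(empty) == 0);       // only thanks to the guard
    assert(countExclusive([10, 20, 30]) == 3);
    assert(countInclusive([10, 20, 30]) == 3);
    writeln("ok");
}
```

Both forms visit the same elements for a non-empty array; only the inclusive one needs the special case for length zero.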
May 31 2011
prev sibling next sibling parent eles <eles eles.com> writes:
== Quote from Andrej Mitrovic (andrej.mitrovich gmail.com)'s article
 Quite frankly I don't give a damn about for loops because I hardly

 them. Whenever I can, I use foreach, and if I get stuck and think

 foreach can't be used I rather think twice and consider to do a

 redesign of my code rather than start using ancient for loops.

Foreach is an abstraction for... for. When it comes to abstractions, please read this article: http://www.joelonsoftware.com/articles/LeakyAbstractions.html I hate giving links, but sometimes one must. You can live in a world of abstractions. But what happens when things no longer work? What if foreach does not work and the nice world of abstractions gets stuck? The simple truth is that for loops are the base on which abstractions like foreach are constructed. And if abstractions are leaky... well, then.

 I don't want to deal with these silly signed/unsigned/overflow issues at

 Foreach for the win. Foreach and lockstep for the double win.
 In fact, from what my *.d searches tell me, most of my for loops

 from C relics of code which I've translated into D.
 But I don't do scientific computing, so that's just me. :)

You are not solving the problem. You are hiding it. I also do *a lot* of scientific computing. Processing arrays is the daily basis of my work.
May 31 2011
prev sibling parent David Nadlinger <see klickverbot.at> writes:
On 5/31/11 8:16 PM, eles wrote:
 == Quote from David Nadlinger (see klickverbot.at)'s article
 On 5/31/11 6:10 PM, eles wrote:
 Actually, I think 0-based vs. 1-based is of little importance in


 fact if limits should be closed or open. Why do you think the
 contrary?
 Why it is "natural" to use open-limit for 0-based? I think it is
 quite subjective. Why to the right and not to the left?

array[0..array.length] David

First: what happens in your example if array.length=2^n-1, where n is the number of bits used to represent i? How would you increment ++i then?

As far as I can see, there wouldn't be any problem with array.length being size_t.max in this example.
 Second:

 for (int i = 0; i<= array.length-1; i++) {}

   array[0..array.length-1]


 I should note that from mathematical point of view, "<" is not a
 relation of order, as it lacks reflexivity (meaning a<b and b>a does
 not imply a==b).

 Only<= and>= are relations of order, as they are also reflexive (as
 a<=b and b<=a implies a==b).

The term you are looking for is probably »(weak) partial order«, whereas < is sometimes called »strict partial order«. I don't quite see how that's relevant here, though. David
May 31 2011
prev sibling next sibling parent reply Mafi <mafi example.org> writes:
Am 31.05.2011 18:10, schrieb eles:
 I don't think you can enter this debate without bringing the other

 about zero-based indices vs. one-based indices. Matlab first's index
 number is number one, and I think this fits very naturally with the
 closed-right limit. In many fields, one-based indices are common
 because they are easier to reason with. But in computers, where the
 index is generally an offset from a base address, zero-based is much
 more prevalent as it better represents what the machine is actually
 doing. And it follows that open-ended limits on the right are the

Actually, I think 0-based vs. 1-based is of little importance in the fact if limits should be closed or open. Why do you think the contrary? Why it is "natural" to use open-limit for 0-based? I think it is quite subjective. Why to the right and not to the left? Can you redirect me towards an example of such norm? I have no knowledge of that use. I hope some day someone would not have to write a paper like http://drdobbs.com/blogs/cpp/228701625 but targeting... D's biggest mistake was to use open-limit on the right. I think the idea of dropping array information to just a pointer was seen as very natural to Kernighan and Ritchie. Otherwise they would have not used it.

In my opinion it is natural to use half-open intervals for zero-based indices. My reasoning:

//zero-based
int[8] zero;
//indices from 0 up to and excluding 8 -> [0,8)

//one-based
int[8] one;
//in many languages the indices from 1 up to and including 8 -> [1,8]

Then, using the same type of interval seems natural for slicing.

Mafi
May 31 2011
parent reply eles <eles eles.com> writes:
 In my opinion it is natural to use half open intervals for zero-

 indices. My reasoning:
 //zero-based
 int[8] zero; //indices from 0 upto and excluding 8 -> [0,8)
 //one-based
 int[8] one;
 //in many languages the indices from 1 upto and including 8 -> [1,8]
 Then, using the same type of interval seems natural for slicing.
 Mafi

Well, in my opinion it is not, as from mathematical point of view, 1..N-1 is a *closed* field http://en.wikipedia.org/wiki/Field_(mathematics) w.r.t. addition and multiplication. It simply says that there are *no outside* elements. They are 8 elements, and they go from 0 to 7. "8" never appears among those elements.
May 31 2011
parent reply KennyTM~ <kennytm gmail.com> writes:
On Jun 1, 11 02:24, eles wrote:
 In my opinion it is natural to use half open intervals for zero-

 indices. My reasoning:
 //zero-based
 int[8] zero; //indices from 0 upto and excluding 8 ->  [0,8)
 //one-based
 int[8] one;
 //in many languages the indices from 1 upto and including 8 ->  [1,8]
 Then, using the same type of interval seems natural for slicing.
 Mafi

Well, in my opinion it is not, as from mathematical point of view, 1..N-1 is a *closed* field http://en.wikipedia.org/wiki/Field_(mathematics) w.r.t. addition and multiplication. It simply says that there are *no outside* elements. They are 8 elements, and they go from 0 to 7. "8" never appears among those elements.

Array indices do not form a field in D. What's the point bringing it in?
May 31 2011
parent reply eles <eles eles.com> writes:
 Array indices do not form a field in D. What's the point bringing it

Yes, they are (it is true no matter that we speak of D or not). Or, more exactly, they should be a field. They are numbers, after all. Secondly, why they should not be a field? Yes, the current approach introduces a "n+1"th parameter needed to characterize an array with just... n elements. Why that?
May 31 2011
next sibling parent reply "Nick Sabalausky" <a a.a> writes:
"eles" <eles eles.com> wrote in message news:is3h9b$on3$1 digitalmars.com...
 Array indices do not form a field in D. What's the point bringing it

Yes, they are (it is true no matter that we speak of D or not). Or, more exactly, they should be a field. They are numbers, after all. Secondly, why they should not be a field? Yes, the current approach introduces a "n+1"th parameter needed to characterize an array with just... n elements. Why that?

Because that's not supposed to be n, it's supposed to be the length. In programming, length tends to be a much more useful way to describe the size of something than the index of the last element.
May 31 2011
parent eles <eles eles.com> writes:
 Because that's not supposed to be n, it's supposed to be the

 programming, length tends to a much more useful way to descrbe the

 something than the index of the last element.

Then why not slice with a[beginning..length]? It should be even better. However, this is my last message on the question. I cannot sustain a war against everyone, I have a life outside D that I have to take care of, and I lack the infinity of time for just myself or D. And I am not paid to post here, either. Anyway, I consider the choice D is making a mistake. A bitter one in the long run. Well, Cassandra would have predicted the same.
May 31 2011
prev sibling parent KennyTM~ <kennytm gmail.com> writes:
On Jun 1, 11 03:58, eles wrote:
 Array indices do not form a field in D. What's the point bringing it

Yes, they are (it is true no matter that we speak of D or not). Or, more exactly, they should be a field. They are numbers, after all.

No they aren't. If it is a 'field' as in Z_8, then we should have a[7] == a[-1] == a[15], which is not true (the other two throw RangeError, and practically they should). Indices are simply the subset "[0, 8) & Z" of all accessible integers.

(Actually Z_8 is just a commutative ring, not a field. A field has many more requirements than having addition and multiplication. Please understand the term before you use it.)
 Secondly, why they should not be a field? Yes, the current approach
 introduces a "n+1"th parameter needed to characterize an array with
 just... n elements. Why that?

'a[i .. j]' means a slice containing elements a[k] where k in [i, j). No j+1 involved :)
May 31 2011
prev sibling parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2011-05-31 12:10:24 -0400, eles <eles eles.com> said:

 I don't think you can enter this debate without bringing the other war
 about zero-based indices vs. one-based indices. Matlab first's index
 number is number one, and I think this fits very naturally with the
 closed-right limit. In many fields, one-based indices are common
 because they are easier to reason with. But in computers, where the
 index is generally an offset from a base address, zero-based is much
 more prevalent as it better represents what the machine is actually
 doing. And it follows that open-ended limits on the right are the norm.

Actually, I think 0-based vs. 1-based is of little importance in the fact if limits should be closed or open. Why do you think the contrary?

Good question. Actually, I'm not too sure if 1-based is better suited with closed. But I can say that it makes a lot of sense for 0-based arrays to be open.

Take this array of bytes for instance, where the 'end' pointer is located just after the last element:

	.        begin                                     end
	address: 0xA0  0xA1  0xA2  0xA3  0xA4  0xA5  0xA6  0xA7  0xA8 ...
	index:  [   0,    1,    2,    3,    4,    5,    6 ]

Now, observe those properties (where '&' means "address of"):

	length = &end - &begin
	&end = &begin + length
	&begin = &end - length

Now, let's say end is *on* the last element instead of after it:

	.        begin                               end
	address: 0xA0  0xA1  0xA2  0xA3  0xA4  0xA5  0xA6  0xA7  0xA8 ...
	index:  [   0,    1,    2,    3,    4,    5,    6 ]

	length = &end - &begin + 1
	&end = &begin + length - 1
	&begin = &end - length + 1

Not only do you always need to add or subtract one from the result, but since the "end" index ('$' in D) for an empty array is now -1 you can't use unsigned integers anymore (at least not without huge complications).
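The begin/end identities can be checked with raw pointers in D. This is only a sketch under stated assumptions: the helper names `lengthFromExclusiveEnd` and `lengthFromInclusiveEnd` are hypothetical, and the one-past-the-end pointer is computed but never dereferenced.

```d
import std.stdio;

// Hypothetical helper: length when 'end' points just after the last element.
size_t lengthFromExclusiveEnd(const(ubyte)* begin, const(ubyte)* end)
{
    return cast(size_t)(end - begin);
}

// Hypothetical helper: length when 'last' points *on* the last element.
// An empty array has no valid 'last', so this form cannot describe it.
size_t lengthFromInclusiveEnd(const(ubyte)* begin, const(ubyte)* last)
{
    return cast(size_t)(last - begin) + 1;
}

void main()
{
    ubyte[] bytes = [1, 2, 3, 4, 5, 6, 7];
    const(ubyte)* begin = bytes.ptr;
    const(ubyte)* end = begin + bytes.length;  // one past the last element
    const(ubyte)* last = end - 1;              // on the last element

    assert(lengthFromExclusiveEnd(begin, end) == bytes.length);
    assert(lengthFromInclusiveEnd(begin, last) == bytes.length);
    assert(end - bytes.length == begin);

    // With an exclusive end, the empty range is simply begin == end.
    assert(lengthFromExclusiveEnd(begin, begin) == 0);
    writeln("ok");
}
```

The exclusive form represents the empty range with no special value, which is the unsigned-integer point made just above.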
 Why it is "natural" to use open-limit for 0-based? I think it is
 quite subjective.

Similar observations can be made with zero-based indices:

	&elem = &begin + index
	index = &elem - &begin

and one-based indices:

	&elem = &begin + index - 1
	index = &elem - &begin + 1

So they have pretty much the same problem that you must do +1/-1 everywhere. Although now the "end" index for an empty array becomes zero again (one less than the first index), so we've solved our unsigned integer problem.

Now, we could make the compiler automatically add those +1/-1 everywhere, but that won't erase the time those extra calculations take on your processor. In a tight loop this can be noticeable.
 Why to the right and not to the left?

Because you start counting indices from the left. If you make it open on the left instead, you'll have to insert those +1/-1 everywhere again.

That said, in the rare situations where indices are counted from the right it certainly makes sense to make it open on the left and closed on the right. C++'s reverse iterators and D's retro are open ended on the left... although depending on your point of view you could say they just swap what's right and left and that it's still open on the right. :-)

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
May 31 2011
parent reply eles <eles eles.com> writes:
 Actually, I think 0-based vs. 1-based is of little importance in


 fact if limits should be closed or open. Why do you think the
 contrary?


 with closed. But I can say that it makes a lot of sense for 0-based
 arrays to be open.
 Take this array of bytes for instance, where the 'end' pointer is
 located just after the last element:

However, if the length of the array is exactly the maximum value that can be represented on size_t (which is, I assume, the type used internally to represent the indexes), addressing the pointer *just after the last element* is impossible (through normal index operations). Then, why should characterizing an array depend on something... "just after the last element"? What if there is nothing there? What if, on a microcontroller, the last element of the array is simply the highest addressable memory location? (like the "end of the world") Why introduce something completely inconsistent with the array (namely, something "just after the last element") to characterize something that should be regarded as... self-contained? What if such a concept (data after the last element of the array) is simply nonsense on some architecture?
 	.        begin                                     end
 	address: 0xA0  0xA1  0xA2  0xA3  0xA4  0xA5  0xA6  0xA7

 	index:  [   0,    1,    2,    3,    4,    5,    6 ]
 Now, observe those properties (where '&' means "address of"):
 	length = &end - &begin
 	&end = &begin + length
 	&begin = &end - length
 Now, let's say end is *on* the last element instead of after it:
 	.        begin                               end
 	address: 0xA0  0xA1  0xA2  0xA3  0xA4  0xA5  0xA6  0xA7

 	index:  [   0,    1,    2,    3,    4,    5,    6 ]
 	length = &end - &begin + 1
 	&end = &begin + length - 1
 	&begin = &end - length + 1
 Not only you always need to add or substract one from the result,

 since the "end" index ('$' in D) for an empty array is now -1 you

 use unsigned integers anymore (at least not without huge

But, it is just the very same operations that you would make for 0-based indexes, as well as for 1-based indexes. It is just like replacing +j with -j (the square roots of -1). The whole mathematics would remain just as it is, because the two are completely interchangeable. Because, in the first place, the choice of +j and of -j is *arbitrary*.

The same stands for 0-based vs. 1-based (i.e. the field Zn could be from 0 to n-1 as well as from 1 to n). The choice is arbitrary.

But this choice *has nothing to do* with the rest of the mathematical theory, and nothing to do with the closed-limit vs. open-limit problem.
 Why it is "natural" to use open-limit for 0-based? I think it is
 quite subjective.

 	&elem = &begin + index
 	index = &elem - &begin
 and one-based indices:
 	&elem = &begin + index - 1
 	index = &elem - &begin + 1
 So they have pretty much the same problem that you must do +1/-1
 everywhere. Although now the "end" index for an empty array becomes
 zero again (one less than the first index), so we've solved our
 unsigned integer problem.
 Now, we could make the compiler automatically add those +1/-1
 everywhere, but that won't erase the time those extra calculations

 on your processor. In a tight loop this can be noticeable.
 Why to the right and not to the left?


 on the left instead, you'll have to insert those +1/-1 everywhere

 That said, in the rare situations where indices are counted from the
 right it certainly make sense to make it open on the left and

 the right. C++'s reverse iterators and D's retro are open ended on

 left... although depending on your point of view you could say they
 just swap what's right and left and that it's still open on the

 :-)

They are rare situations... sometimes. Have you ever translated an array by one position from right to left? You almost always use a decrementing index, to avoid overwriting values.

But this is not the main point. The main point is that things are unnecessarily complex with an open limit on either the right or the left. I simply fail to see why not to choose the simplest solution.

I stress that even when you write

for(i=0; i<n; i++){ some_processing(a[i]); }

*conceptually*, the only set of *useful values* of index i is from 0 to n-1. *Why* go further up if it is *not needed*?

Seriously, it reminds me of the epicycles (http://en.wikipedia.org/wiki/Deferent_and_epicycle) used to explain how planets move while preciously placing Earth at the center of the universe. It is simply not there, as much as we would like it to be.
May 31 2011
next sibling parent =?ISO-8859-1?Q?Ali_=C7ehreli?= <acehreli yahoo.com> writes:
On 05/31/2011 12:38 PM, eles wrote:

 What if, on a
 microcontroller, the last element of the array is simply the highest
 addressable memory location? (like the "end of the world") Why
 introducing something completely inconsistent with the array (namely,
 something "just after the last element") to characterize something
 that should be regarded as... self-contained? What if such a concept
 (data after the last element of the array) is simply nonsense on some
 architecture?

As Andrei also said elsewhere on this thread, it's not possible in the C++ language. I would expect the D spec to require that the address of the one-past-the-end element be valid.

Ali
May 31 2011
prev sibling next sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
eles wrote:
 Actually, I think 0-based vs. 1-based is of little importance in


 fact if limits should be closed or open. Why do you think the
 contrary?


 with closed. But I can say that it makes a lot of sense for 0-based
 arrays to be open.
 Take this array of bytes for instance, where the 'end' pointer is
 located just after the last element:

However, if the length of the array is exactly the maximum value that can be represented on size_t (which is, I assume, the type used internally to represent the indexes), addressing the pointer *just after the last element* is impossible (through normal index operations).

I already disproved you on this, why do you keep insisting it is a valid point? Did you miss my post? In more detail: Assume your size_t has 3 bits. Ergo, you can index into 8 memory locations, namely 0,1,2,3,4,5,6,7. The maximum value that can be represented in 3 bits is 7. You have a byte array of that maximum length that takes up memory locations 0,1,2,3,4,5,6. You can point to the end of your array in that case, which is location 7. Yes, of course, your array could end just at location 7, but that is a different issue (and a trivial one, just don't allocate an array that ends at location 7). What is the point? (You can slice quite peacefully without having to care about the element that is off by one anyways. You don't need that pointer for slicing.)
 Then, why characterizing an array should depend of something... "just
 after the last element"? What if there is nothing there? What if, on a
 microcontroller, the last element of the array is simply the highest
 addressable memory location? (like the "end of the world") Why
 introducing something completely inconsistent with the array (namely,
 something "just after the last element") to characterize something
 that should be regarded as... self-contained? What if such a concept
 (data after the last element of the array) is simply nonsense on some
 architecture?

Give up the last word of memory for arrays if you need that pointer. Store something different there. It has already been done for the first word on most platforms I am aware of. If you just personally like the other way of slicing more, why don't you name it by that? The latest arguments you have given are esoteric enough that I am starting to suspect it is the case. Tell me if I'm wrong. Are there any _practical_ drawbacks of half-open intervals? There are many _practical_ benefits. Timon
May 31 2011
prev sibling next sibling parent "Nick Sabalausky" <a a.a> writes:
"eles" <eles eles.com> wrote in message news:is3g3j$n2a$1 digitalmars.com...
 Actually, I think 0-based vs. 1-based is of little importance in


 fact if limits should be closed or open. Why do you think the
 contrary?


 with closed. But I can say that it makes a lot of sense for 0-based
 arrays to be open.
 Take this array of bytes for instance, where the 'end' pointer is
 located just after the last element:

However, if the length of the array is exactly the maximum value that can be represented on size_t (which is, I assume, the type used internally to represent the indexes), addressing the pointer *just after the last element* is impossible (through normal index operations). Then, why should characterizing an array depend on something... "just after the last element"? What if there is nothing there? What if, on a microcontroller, the last element of the array is simply the highest addressable memory location? (like the "end of the world") Why introduce something completely inconsistent with the array (namely, something "just after the last element") to characterize something that should be regarded as... self-contained? What if such a concept (data after the last element of the array) is simply nonsense on some architecture?

The need to have an array of *exactly* 2^32 or 2^64 elements, instead of (2^32)-1 or (2^64)-1, is extremely rare. But when right-inclusive is in effect, there are a *lot* of times you need to use both "index" and "index-1" (instead of just "index", which is usually all you need with exclusive-right). Not only is that an extra instruction, which can be bad in an inner loop, but it's also an extra value that may need to take up an extra register. So why should the computer have to deal with all that extra baggage, just for the sake of an extremely rare case when someone might want that *one* extra element in an already huge array?
 I simply fail to see why not to choose the simplest solution.

In my experience, exclusive-right *is* much simpler. It only sounds more complex on the surface, but it leads to far greater simplicity in its consequences, and *that's* where simplicity really counts. What I've said in the paragraph above is only one of those simplified consequences. I've mentioned other ways in which it simplifies things in another post further below. (One such simplification is this nice clean property: arr[a..b].length == b-a )
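The property arr[a..b].length == b-a, and a related one (splitting at any index k leaves two adjacent slices with no overlap and no gap), can be checked with a few lines of D. The helper name `splitsCleanly` is hypothetical and used only for this sketch.

```d
import std.stdio;

// Hypothetical helper: true if, for every split point k, the two slices
// arr[0 .. k] and arr[k .. $] concatenate back to the original array.
bool splitsCleanly(const int[] arr)
{
    foreach (k; 0 .. arr.length + 1)
    {
        if (arr[0 .. k] ~ arr[k .. $] != arr)
            return false;
    }
    return true;
}

void main()
{
    int[] arr = [10, 20, 30, 40, 50];
    size_t a = 1, b = 4;

    assert(arr[a .. b].length == b - a);   // no +1/-1 correction needed
    assert(arr[a .. b] == [20, 30, 40]);   // arr[b] itself is excluded
    assert(arr[a .. a].length == 0);       // empty slice needs no special value
    assert(splitsCleanly(arr));
    writeln("ok");
}
```

With an inclusive right bound, the same split would have to be written with k-1 on one side, and the empty slice would need a bound below the start index.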
May 31 2011
prev sibling parent Michel Fortin <michel.fortin michelf.com> writes:
On 2011-05-31 15:38:27 -0400, eles <eles eles.com> said:

 But, it is just the very same operations that you would make for 0-
 based indexes, as well as for 1-based indexes. Is just like you would
 replace +j with -j (squares of -1). The whole mathematics would
 remain just as they are, because the two are completely inter-
 exchangeable. Because, in the first place, the choice of +j and of -j
 is *arbitrary*.
 
 The same stands for 0-based vs. 1-based (ie. the field Zn could be
 from 0 to n-1 as well as from 1 to n). The choice is arbitrary.
 
 But this choice *has nothing to do* with the rest of the mathematical
 theory, and nothing to do with the closed-limit vs. open-limit
 problem.

You're arguing about mathematics. Mathematically, there is little difference. I'll concede that. It's about efficiency. It's about the compiler emitting fewer instructions, the processor making fewer calculations, and about your program running faster. It also makes some things conceptually simpler, but that's hard to see until you're used to it. I think Nick Sabalausky made a good case for that.

--
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
May 31 2011
prev sibling next sibling parent reply eles <eles eles.com> writes:
== Quote from Andrej Mitrovic (andrej.mitrovich gmail.com)'s article
 This is how I got to terms with it long ago: http://i.imgur.com/

 When it's a slice, it's basically two anchors or gaps at some
 location, and whatever items are between the anchors is your result.
 Otherwise with indexing it's the number that starts at that offset (in
 this case it would be at the location at right from the gap).

Nice picture, but then why is foo[1] 8 rather than 4? And what is foo[9]? For me it is a bit cumbersome to put anchors "in-between" while still writing foo[1]. Yes, it defends the status quo, but what if the whole choice is wrong? Maybe we should write foo[1.5].

I am not (or, at least, I am trying not to be) a Matlab defender, but in slicing it is like a... state of the art. Octave and Scilab are copying it. Why reinvent the wheel? Occam's razor: complex explanations (like the anchors in the picture) are required to explain unnatural choices. What if someone needs a model for ]-Inf,3] (i.e. the left limit is open)? Will he still have to use the right-limit-is-open convention? The same for ]-2,0].

Rather, interpreting the slicing as a[start_position..slice_length] is more appealing (including for empty arrays) than the whole anchor concept. We could then have a[1..0] to represent the empty array, and a[0..$] to represent the entire array. However, in that case a[1..1] would be a slice of just one element, containing a[1].

The bottom line is that I maintain my pov that a[closed_limit...open_limit] is a wrong choice. I fail to see any good example in its favor and I know no other successful language (well, besides D) that uses this convention, nor the reasons why it would do so.
May 31 2011
next sibling parent reply KennyTM~ <kennytm gmail.com> writes:
On Jun 1, 11 00:02, eles wrote:
 == Quote from Andrej Mitrovic (andrej.mitrovich gmail.com)'s article
 This is how I got to terms with it long ago: http://i.imgur.com/

 When it's a slice, it's basically two anchors or gaps at some
 location, and whatever items are between the anchors is your result.
 Otherwise with indexing it's the number that starts at that offset (in
 this case it would be at the location at right from the gap).

Nice picture, but then why is foo[1] 8 rather than 4? And what is foo[9]? For me it is a bit cumbersome to put anchors "in-between" while still writing foo[1]. Yes, it defends the status quo, but what if the whole choice is wrong? Maybe we should write foo[1.5].

I am not (or, at least, I am trying not to be) a Matlab defender, but in slicing it is like a... state of the art. Octave and Scilab are copying it. Why reinvent the wheel? Occam's razor: complex explanations (like the anchors in the picture) are required to explain unnatural choices. What if someone needs a model for ]-Inf,3] (i.e. the left limit is open)? Will he still have to use the right-limit-is-open convention? The same for ]-2,0].

Rather, interpreting the slicing as a[start_position..slice_length] is more appealing (including for empty arrays) than the whole anchor concept. We could then have a[1..0] to represent the empty array, and a[0..$] to represent the entire array. However, in that case a[1..1] would be a slice of just one element, containing a[1].

The bottom line is that I maintain my pov that a[closed_limit...open_limit] is a wrong choice. I fail to see any good example in its favor and I know no other successful language (well, besides D) that uses this convention, nor the reasons why it would do so.

Is Python successful?
 >>> a = [0,1,2,3,4,5,6]
 >>> a[3:5]
 [3, 4]

In C++'s iterator concept, x.end() points to one position after the last element, so the "range" (x.begin(), x.end()) has an open limit at the end. Every C++ algorithm that operates on a pair of iterators uses the [...) range concept.

BTW, Ruby has both of them
 >> a = [0,1,2,3,4,5,6]
 => [0, 1, 2, 3, 4, 5, 6]
 >> a[3..5]
 => [3, 4, 5]
 >> a[3...5]
 => [3, 4]
May 31 2011
next sibling parent reply eles <eles eles.com> writes:
 Is Python successful?
  >>> a = [0,1,2,3,4,5,6]
  >>> a[3:5]
 [3, 4]

Well, I am not a Python user, I give you credit for that. Actually, I don't really appreciate Python for its indentation choice, but that's a matter of taste.
 In C++'s iterator concept, x.end() points to one position after the last
 element, so the "range" (x.begin(), x.end()) has an open limit at
 the end. Every C++ algorithm that operates on a pair of iterators
 uses the [...) range concept.

That's a good point. I work in C, not in C++. I think it comes from: for(i=0; i<N; i++){} while other languages use more for(i=0; i<=N-1; i++){}
 BTW, Ruby has both of them
  >> a = [0,1,2,3,4,5,6]
 => [0, 1, 2, 3, 4, 5, 6]
  >> a[3..5]
 => [3, 4, 5]
  >> a[3...5]
 => [3, 4]

So, they have studied the well established ground of using an open- right limit and... chose to implement the other way too. Good for them. But, if using right-open limit is the natural and established way... why then, the choice they made?
May 31 2011
parent Timon Gehr <timon.gehr gmx.ch> writes:
eles wrote:
 Is Python successful?
  >>> a = [0,1,2,3,4,5,6]
  >>> a[3:5]
 [3, 4]

Well, I am not a Python user, I give you credit for that. Actually, I don't really appreciate Python for its indentation choice, but that's a matter of taste.
 In C++'s iterator concept, x.end() points to one position after the last
 element, so the "range" (x.begin(), x.end()) has an open limit at
 the end. Every C++ algorithm that operates on a pair of iterators
 uses the [...) range concept.

That's a good point. I work in C, not in C++. I think it comes from: for(i=0; i<N; i++){} while other languages use more for(i=0; i<=N-1; i++){}

Indeed, but you agree that in that special case, the first one is better? In C++ you cannot have the same for iterators, because iterators are not necessarily ordered. E.g.:

    map<int,int> m;
    /* fill map with values */
    for(map<int,int>::iterator it=m.begin(); it!=m.end(); ++it){}

But that is not really important for D slices.
 BTW, Ruby has both of them
  >> a = [0,1,2,3,4,5,6]
 => [0, 1, 2, 3, 4, 5, 6]
  >> a[3..5]
 => [3, 4, 5]
  >> a[3...5]
 => [3, 4]

So, they have studied the well established ground of using an open- right limit and... chose to implement the other way too. Good for them. But, if using right-open limit is the natural and established way... why then, the choice they made?

Probably because in different contexts both make sense. Half-open intervals save you from many +-1 errors. A very simple and archetypical example: binary search in array a for value v.

Assuming _closed_ slicing intervals:

    while(a.length>1){
        a = v<=a[$/2] ? a[0..$/2] : a[$/2+1..$-1];
    }

Do you see the bug? People do that all the time when implementing binary search.

Half-open slicing intervals (correct):

    while(a.length>1){
        a = v<a[$/2] ? a[0..$/2] : a[$/2..$];
    }

Half-open intervals encourage a correct implementation of binary search! They cannot be that bad, right?

Well, I might have biased this display a little bit. Fact is that many programmers fail to implement binary search correctly. It can be assumed that most of those use closed intervals for their l and h. It gets really simple once the intervals are half-open. It is easier to reason about half-open intervals than about closed intervals when indexes are 0-based.

Another valid reason is that the slicing of half-open intervals can be done in fewer machine instructions ;).

slicing with closed intervals:

    b=a[i..j]; <=> b.ptr:=a.ptr+i; b.length:=j-i+1; // boom

slicing with half-open intervals:

    b=a[i..j]; <=> b.ptr:=a.ptr+i; b.length:=j-i; // nice!

When you want to have closed interval slicing, you do it by making the overhead visible:

    b=a[i..j+1];

Timon
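The half-open loop above can be wrapped into a complete function. This is only a sketch under a few assumptions: the array is sorted in ascending order, the element type is int, and the name `contains` is invented for the example.

```d
/// Returns true if v occurs in the sorted (ascending) array a.
/// Hypothetical wrapper around the half-open narrowing loop.
bool contains(const(int)[] a, int v)
{
    if (a.length == 0)
        return false;            // nothing to search

    while (a.length > 1)
    {
        // Inside the brackets, $ is a.length. Both halves are
        // non-empty and strictly shorter, so the loop terminates.
        a = v < a[$ / 2] ? a[0 .. $ / 2] : a[$ / 2 .. $];
    }
    return a[0] == v;            // exactly one candidate is left
}
```

Note that the midpoint index appears as the end of one half and the start of the other, unchanged; that is where the closed-interval version has to juggle +1 and -1.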
May 31 2011
prev sibling parent Ali Çehreli <acehreli yahoo.com> writes:
On 05/31/2011 09:09 AM, KennyTM~ wrote:

 BTW, Ruby has both of them

  >> a = [0,1,2,3,4,5,6]
 => [0, 1, 2, 3, 4, 5, 6]
  >> a[3..5]
 => [3, 4, 5]
  >> a[3...5]
 => [3, 4]

Isn't that too subtle? I wonder whether that's a common problem in Ruby programming.

That reminds me of uniform(), Phobos's experiment in open/closed ending: :)

    import std.random;

    void main()
    {
        // Also available: "()", "[)" (the default) and "(]"
        auto result = uniform!"[]"(0, 5);
        assert((result >= 0) && (result <= 5));
    }

Ali
May 31 2011
prev sibling parent reply eles <eles eles.com> writes:
== Quote from Andrej Mitrovic (andrej.mitrovich gmail.com)'s article
 On 5/31/11, eles <eles eles.com> wrote:
 Nice picture, but then why foo[1] is rather 8 than 4? And what is foo
 [9]?

 foo[0] would be out of range too. I've only used this picture because
 back years ago when I was trying to understand how this whole system
 works there was an explanation similar to this one in some book, the
 name of which I forgot. But it did 'click' with me when I saw it
 presented like that.

Yes. Except that if you assume that indexes of an array are represented, let's say, on an unsigned type UTYPE of n bits, and let's denote by UTYPE_MAX the value 2^n-1 (i.e. the maximum value representable on n bits), and if you have an array of exactly UTYPE_MAX elements:

    a[0],...,a[UTYPE_MAX]

then, to slice something including the last element, you would need to represent the UTYPE_MAX+1 value on n bits, which is of course impossible.

In your example, assume that n=3. So, to slice a substring of two elements containing the last element, you cannot write a[7..9]. It simply overflows. The only choice you have is to write the a[7..$] thing where, at least conceptually, $ would still be an oxymoron represented on... n+1 bits (because n are not enough).

Consider such a string and that you need to process slices of two elements (I remind you that the length of the array is UTYPE_MAX+1 and that its indices go from 0 to UTYPE_MAX):

    //closed-limits approach
    for(i=0; i<=UTYPE_MAX-1; i++){
        b=a[i..i+1]; //still possible even for i=UTYPE_MAX-1
        some_process(b);
    }

or

    for(i=0; i<UTYPE_MAX; i++){
        b=a[i..i+1]; //still possible even for the last value of i, which is UTYPE_MAX-1
        some_process(b);
    }

while

    //for the open-limit approach
    for(i=0; i<=UTYPE_MAX-1; i++){
        b=a[i..i+2]; //needed to include the last element
        some_process(b);
    }

or

    for(i=0; i<UTYPE_MAX; i++){
        b=a[i..i+2]; //needed to include the last element
        some_process(b);
    }

However, the last two will overflow when trying to compute i+2 for i=UTYPE_MAX-1, as it would give... UTYPE_MAX+1. You would be forced to use some inconsistent behavior such as:

    for(i=0; i<UTYPE_MAX-1; i++){ //process all slices except the last
        b=a[i..i+2]; //now, it is still possible to compute i+2 for i=UTYPE_MAX-2, the maximum value of i
        some_process(b);
    }
    some_process(a[UTYPE_MAX-1..$]); //the *only* way to write a slice containing the last element of the array.

Using closed-limits does not require inconsistency into representation and processing of data, at least in this case.
May 31 2011
parent Timon Gehr <timon.gehr gmx.ch> writes:
eles wrote:
 == Quote from Andrej Mitrovic (andrej.mitrovich gmail.com)'s article
 On 5/31/11, eles <eles eles.com> wrote:
 Nice picture, but then why foo[1] is rather 8 than 4? And what is foo
 [9]?

foo[0] would be out of range too. I've only used this picture because back years ago when I was trying to understand how this whole system works there was an explanation similar to this one in some book, the name of which I forgot. But it did 'click' with me when I saw it presented like that.

Yes. Except that if you assume that indexes of an array are represented, let's say on an unsigned type UTYPE of n bits and let's note UTYPE_MAX the 2^n-1 value (i.e. the maximum representable value on n bits) and if you have an array of exactly UTYPE_MAX elements: a[0],...,a[UTYPE_MAX] [snip.]

Those are UTYPE_MAX+1 elements. It's natural that you will run into trouble if you cannot even represent the length of your array with your size_t type. If you have just a[0],...,a[UTYPE_MAX-1], (UTYPE_MAX elements), it will work just fine. Moreover, it is unlikely that this will happen in 64bit, because your array (of ints) would take up about 64 Million TB. And you could resolve the issue by reducing its size by 4 bytes. Timon
May 31 2011
prev sibling parent reply Ali Çehreli <acehreli yahoo.com> writes:
On 05/31/2011 08:10 AM, eles wrote:

 I know no other examples where open-right limits are used.

The C++ standard library uses open-right with its pairs of iterators. The second iterator "points" at one beyond the last element of the range. Ali
May 31 2011
parent reply eles <eles eles.com> writes:
== Quote from Ali Çehreli (acehreli yahoo.com)'s article
 On 05/31/2011 08:10 AM, eles wrote:
  > I know no other examples where open-right limits are used.
 The C++ standard library uses open-right with its pairs of iterators.
 The second iterator "points" at one beyond the last element of the range.
 Ali

C'mon, if C++ is such a good standard, then D would have never appeared. Why not drop D completely and go to C++ then? The fact that C++ uses it that way does not make it good for us. D chose slices, not iterators. Maybe we should remind ourselves why it did so in the first place. Now, we are taking the "enemy" (well, it's a joke) as a reference in the matter?
May 31 2011
next sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
eles wrote:
 == Quote from Ali Çehreli (acehreli yahoo.com)'s article
 On 05/31/2011 08:10 AM, eles wrote:
  > I know no other examples where open-right limits are used.
 The C++ standard library uses open-right with its pairs of iterators.
 The second iterator "points" at one beyond the last element of the range.
 Ali

C'mon, if C++ is such a good standard, then D would have never appeared. Why not drop D completely and go to C++ then? The fact that C++ uses it that way does not make it good for us. D chose slices, not iterators. Maybe we should remind ourselves why it did so in the first place.

D slices do not require being able to point to invalid data past the end of the array.
 Now, we are taking the "enemy" (well, is a joke) as a reference in
 the matter?

Timon
May 31 2011
prev sibling next sibling parent Ali Çehreli <acehreli yahoo.com> writes:
On 05/31/2011 01:19 PM, eles wrote:
 == Quote from Ali Çehreli (acehreli yahoo.com)'s article
 On 05/31/2011 08:10 AM, eles wrote:
   >  I know no other examples where open-right limits are used.
 The C++ standard library uses open-right with its pairs of iterators.
 The second iterator "points" at one beyond the last element of the range.
 Ali

C'mon, if C++ is such a good standard, then D would have never appeared. Why not drop D completely and go to C++ then?

D improves over C++.
 The fact that C++ uses it that way does not make it good for us.

Familiarity is very important. I came to D from C++ and it made sense to me right away. (Just like it did not make sense to you because it is different in MatLab.) Ali
May 31 2011
prev sibling next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/31/11 3:19 PM, eles wrote:
 == Quote from Ali Çehreli (acehreli yahoo.com)'s article
 On 05/31/2011 08:10 AM, eles wrote:
   >  I know no other examples where open-right limits are used.
 The C++ standard library uses open-right with its pairs of iterators.
 The second iterator "points" at one beyond the last element of the range.
 Ali

C'mon, if C++ is such a good standard, then D would have never appeared. Why not drop D completely and go to C++ then? The fact that C++ uses it that way does not make it good for us. D chose slices, not iterators. Maybe we should remind ourselves why it did so in the first place. Now, we are taking the "enemy" (well, it's a joke) as a reference in the matter?

I think this is not even remotely fair-play. The man simply responded to your question, and even quoted it. You shouldn't reframe his answer as if he argued it that way. Honest, I suggest we all just end this. This is well-trodden ground, it's unlikely new things will be learned, and definitely the semantics won't change. Andrei
May 31 2011
prev sibling parent reply "Nick Sabalausky" <a a.a> writes:
"eles" <eles eles.com> wrote in message news:is3ihf$qnu$1 digitalmars.com...
 == Quote from Ali Çehreli (acehreli yahoo.com)'s article
 On 05/31/2011 08:10 AM, eles wrote:
  > I know no other examples where open-right limits are used.
 The C++ standard library uses open-right with its pairs of iterators.
 The second iterator "points" at one beyond the last element of the range.
 Ali

C'mon, if C++ is such a good standard, then D would have never appeared. Why not drop D completely and go to C++ then? The fact that C++ uses it that way does not make it good for us. D chose slices, not iterators. Maybe we should remind ourselves why it did so in the first place. Now, we are taking the "enemy" (well, it's a joke) as a reference in the matter?

First you complain about D doing it differently from other popular languages. Then when you're given a counterexample, you start complaining that D *isn't* doing it differently??? You're really just trolling, aren't you?
May 31 2011
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/31/11 4:37 PM, Andrej Mitrovic wrote:
 As for switch statements (and this is getting offtopic), there are
 cases where fallback is nice to have..

 final switch (param)
 {
      case FILE_NEW:
      case FILE_OPEN:
      case FILE_SAVE:
      case FILE_SAVE_AS:
      case EDIT_UNDO:
          throw new Exception("unimplemented!");  // or there would be one method that handles these cases
      case EDIT_CUT:
          handleCut();
          break;
      case EDIT_COPY:
          handleCopy();
          break;
      case EDIT_PASTE:
          handlePaste();
          break;
      case EDIT_CLEAR:
          handleClear();
          break;
 }
 return 0;

 If you didn't have fallback, you would probably have to add some kind
 of new statement like "goto next" or "fallback" on each of those
 cases.

Currently the best proposed behavior is to only require control flow if there is actual code before it. So adjacent case labels don't require any. Your example code would work unmodified. For the rare cases where fall-through is actually needed, goto case xxx; has been proposed. It has the advantage that it's robust to reordering of labels, and the disadvantage that it's not robust to swapping label names. Arguably, such swapping is an infrequent refactoring. Andrei
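A minimal sketch of the two situations described above: adjacent case labels with no code between them, and an explicit fall-through. It assumes a D2 compiler in which `goto case;` is available (the thread goes on to point out that it already exists); the `Cmd` enum and the `handle` function are hypothetical names for this example only.

```d
// Hypothetical command set, standing in for FILE_NEW, EDIT_CUT, etc.
enum Cmd { fileNew, fileOpen, editCut, editCopy }

/// Returns a log of which handlers ran for command c.
string handle(Cmd c)
{
    string log;
    final switch (c)
    {
        // Adjacent labels with no code in between share one body.
        case Cmd.fileNew:
        case Cmd.fileOpen:
            log ~= "file;";
            break;

        case Cmd.editCut:
            log ~= "cut;";
            goto case;      // explicit fall-through into the next case

        case Cmd.editCopy:
            log ~= "copy;";
            break;
    }
    return log;
}
```

Here `handle(Cmd.editCut)` runs both the cut and the copy branch, while the fall-through remains visible in the source instead of depending on a missing break.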
May 31 2011
next sibling parent bearophile <bearophileHUGS lycos.com> writes:
Andrei:

 Currently the best proposed behavior is to only require control flow if 
 there is actual code before it. So adjacent case labels don't require 
 any. Your example code would work unmodified.

We have already discussed a little about this. The basic rule is to require a control flow statement at the end of each case. The special case you refer to is when there are empty case labels. I agree this special case is not going to cause bugs. In general I hate special cases, but now I have changed my mind and I now agree this one is acceptable. I hope to see such small change to D switches in a short enough time, reducing code breakage too :-) Thank you for improving D, bearophile
May 31 2011
prev sibling parent bearophile <bearophileHUGS lycos.com> writes:
Andrei:

 Currently the best proposed behavior is to only require control flow if 
 there is actual code before it. So adjacent case labels don't require 
 any. Your example code would work unmodified.

On the other hand this offers two different ways to write the same thing, because the empty ones can already be written separated by commas:

    case FILE_NEW, FILE_OPEN, FILE_SAVE, FILE_SAVE_AS, EDIT_UNDO:

The Zen of Python says: "There should be one-- and preferably only one --obvious way to do it."

Bye,
bearophile
May 31 2011
prev sibling parent reply KennyTM~ <kennytm gmail.com> writes:
On Jun 1, 11 05:37, Andrej Mitrovic wrote:
 On 5/31/11, Nick Sabalausky<a a.a>  wrote:
 "eles"<eles eles.com>  wrote in message news:is3ihf$qnu$1 digitalmars.com...
 == Quote from Ali Çehreli (acehreli yahoo.com)'s article
 On 05/31/2011 08:10 AM, eles wrote:
   >  I know no other examples where open-right limits are used.
 The C++ standard library uses open-right with its pairs of iterators.
 The second iterator "points" at one beyond the last element of the range.
 Ali

C'mon, if C++ is such a good standard, then D would have never appeared. Why not drop D completely and go to C++ then? The fact that C++ uses it that way does not make it good for us. D chose slices, not iterators. Maybe we should remind ourselves why it did so in the first place. Now, we are taking the "enemy" (well, it's a joke) as a reference in the matter?

First you complain about D doing it differently from other popular languages. Then when you're given a counterexample, you start complaining that D *isn't* doing it differently??? You're really just trolling, aren't you?

Matlab programmers, Java programmers, they all want D to act their way. As for switch statements (and this is getting offtopic), there are cases where fallback is nice to have..

It was already off-topic by the time the closed vs. open range debate started :)
 final switch (param)
 {
      case FILE_NEW:
      case FILE_OPEN:
      case FILE_SAVE:
      case FILE_SAVE_AS:
      case EDIT_UNDO:

In D you'd write it as case FILE_NEW, FILE_OPEN, FILE_SAVE, FILE_SAVE_AS, EDIT_UNDO: this doesn't need fall-through.
          throw new Exception("unimplemented!");  // or there would be one method that handles these cases
      case EDIT_CUT:
          handleCut();
          break;
      case EDIT_COPY:
          handleCopy();
          break;
      case EDIT_PASTE:
          handlePaste();
          break;
      case EDIT_CLEAR:
          handleClear();
          break;
 }
 return 0;

 If you didn't have fallback, you would probably have to add some kind
 of new statement like "goto next" or "fallback" on each of those
 cases.

It already exists. It is called# goto case;
 On the other hand, having to add break everywhere is a pain too. So
 maybe the change would benefit overall.

To avoid surprising C users, I don't think you can avoid having to add break everywhere. But it's still good to disallow implicit fall-through. I think the C# behavior* is good enough.
 The switch statement is a weird construct. It's as if suddenly you're
 coding in Python with no curly braces. :)

*: http://msdn.microsoft.com/en-us/library/06tc147t(v=VS.100).aspx
#: http://d-programming-language.org/statement.html#GotoStatement
May 31 2011
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/31/11 4:46 PM, KennyTM~ wrote:
 On Jun 1, 11 05:37, Andrej Mitrovic wrote:
 If you didn't have fallback, you would probably have to add some kind
 of new statement like "goto next" or "fallback" on each of those
 cases.

It already exists. It is called# goto case;

Sigh. Unless it's a recent addition, I didn't know about it, Walter missed the case during proofreading, and consequently that's not documented in TDPL. Andrei
May 31 2011
next sibling parent KennyTM~ <kennytm gmail.com> writes:
On Jun 1, 11 06:03, Andrei Alexandrescu wrote:
 On 5/31/11 4:46 PM, KennyTM~ wrote:
 On Jun 1, 11 05:37, Andrej Mitrovic wrote:
 If you didn't have fallback, you would probably have to add some kind
 of new statement like "goto next" or "fallback" on each of those
 cases.

It already exists. It is called# goto case;

Sigh. Unless it's a recent addition, I didn't know about it, Walter missed the case during proofreading, and consequently that's not documented in TDPL. Andrei

This exists even in D1!* I hope this construct won't be removed, I use it a lot :p.

*: http://www.digitalmars.com/d/1.0/statement.html#GotoStatement
May 31 2011
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/31/2011 07:58 PM, Jonathan M Davis wrote:
 On 2011-05-31 15:03, Andrei Alexandrescu wrote:
 On 5/31/11 4:46 PM, KennyTM~ wrote:
 On Jun 1, 11 05:37, Andrej Mitrovic wrote:
 If you didn't have fallback, you would probably have to add some kind
 of new statement like "goto next" or "fallback" on each of those
 cases.

It already exists. It is called# goto case;

Sigh. Unless it's a recent addition, I didn't know about it, Walter missed the case during proofreading, and consequently that's not documented in TDPL.

He probably always uses implicit fallthrough in his own code, and we know that he doesn't use fallthrough in switch statements as much as he thought that he did, so I suppose that it's not a great surprise that he missed it. It would have been nice to have it in TDPL though. - Jonathan M Davis

Probably it would be best to take this opportunity to make the language change and then document it in a future edition of the book. With "goto case;" reducing the incremental cost of falling through to near non-existence, I think there is no excuse to keep the current behavior. Also, I think the runtime error on not handling all cases should be eliminated as well. It's very non-D-ish and error prone in the extreme (as unhandled cases tend to be often rare too). Right there with the HiddenFunc error. I think improving switch and hidden functions to err during compilation would be right in line with our push to making correct code easy to write and incorrect code difficult to write. Andrei
May 31 2011
prev sibling parent bearophile <bearophileHUGS lycos.com> writes:
Andrej Mitrovic:

 #: http://d-programming-language.org/statement.html#GotoStatement

Doh! I know where it comes from, it's just funky compared to the rest of C/D. :)

Some of those rules about case are not so intuitive/explicit, like goto; They are handy, but not so easy to read and understand. Bye, bearophile
May 31 2011
prev sibling next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
eles wrote:
 The right boundary of a slice is exclusive.

I think it should be stated more obvious in the paper.
 This makes sense, so you can
 do stuff like a[1..$] (== a[1..a.length]) to get a slice that

 all elements of a except for the first one (a[0]).

I disagree, but I have not much influence here, although I will defend my point of view. I find it quite unpleasant to remember which of the left and right bounds are exclusive and, moreover, this

No. I have never met half-open intervals that are open on the left side. All you have to remember is that the interval is half-open.
 precludes slicing with a[i1..i2] where i1 and i2 are only known at
 the runtime and may be i2<i1 (and not necessarily i1<i2).

It is always an error to slice a[i1..i2] with i2<i1. It is a flaw in your code if you do not know that i1<i2. You can make this sure even if you do not know the exact bounds.
 You will be forced to do smthng like:

 if(i1>i2)
  b=a[i1..i2]
 else
  b=a[i2..i1]
 end

More semicolons and curly brackets and less "end" please. =) Your code as is (and if it compiled), will throw an exception in debug or safe mode if i1!=i2. You understand that? I also do not get what point you are trying to sell with this snippet, can you explain what you want to do with i2<i1?
 and is easy to forget (several lines below) if a[i1] or a[i2] still
 belongs to the slice or no.

You can always do a[i1..i2+1] to get inclusive slicing.
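As a rough sketch of that idea, an inclusive slice can be built on top of the half-open one, ordering the bounds first so that i2<i1 is handled explicitly. The helper name `inclusive` is invented for this example; it assumes both indexes are valid for the array.

```d
/// Returns the elements a[lo], ..., a[hi] inclusive, where lo and hi
/// are the smaller and larger of i1 and i2 (hypothetical helper).
int[] inclusive(int[] a, size_t i1, size_t i2)
{
    import std.algorithm.comparison : max, min;

    immutable lo = min(i1, i2);
    immutable hi = max(i1, i2);
    return a[lo .. hi + 1];   // the +1 makes the inclusive bound visible
}
```

With a = [0,1,2,3,4,5,6], both `inclusive(a, 3, 5)` and `inclusive(a, 5, 3)` select the elements 3, 4 and 5.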
 For example, it would be marvellous to implement definite integral
 convention as: int(a,i1,i2)=sign(i1-i2)*sum(a[i1..i2]) w.r.t. the
 mathematical convention.

 For me, the right solution would have been to consider the selection a
 [0..$-1] to select all the elements of the array. This way, "$" still
 means "the length" and one has a clear view of the selection going
 from a[0] to a["length"-1] as expected by someone used to 0-based
 indexes. The slicing would be always inclusive, which is a matter of
 consistence (and goes in line with Walter speaking about the car
 battery). Why to break this convention?

 If it is wrong, design it to appear being wrong.

It is not wrong, but the right thing to do. I suspect you have not done much coding (in D) given your code examples. half-open is the best choice for representing intervals wherever it is possible. It reduces the amount of +-1 bugs in your code considerably. (try it) Timon
May 31 2011
parent reply eles <eles eles.com> writes:
Without being rude, but I find that you are too much a zealot and too
little of a listener. You are firing bullets in many, but wrong
directions, trying to scare and to impress.

Yes, I have not much experience with D, although I have some with C.
The syntax is not for D, but for general pseudo-code whose purpose
was to convey the idea, and not to inflame spirits about narrow
bracketing issues. The matter was not if a semicolon, or a bracket
for that matter, is appropriate or no.

In the same manner, you would try to shout down the relativity theory
for... improper English spelling (not to say that I am a second
Einstein...) because the author was... of German origin.

Although I disagree with Andrei's p.o.v. concerning exclusive right
limit, I much more appreciate his style to reply. Probably it is
because being so great, he feels no compulsive need to emphasize his
stature w.r.t. those that are littler than him. And he follows and
fights the idea, maybe because he was able to overlook the
insignificant bracketing issues for something more important. But,
probably, his attitude is the attribute of great minds, not of the
littlest ones.

There is a saying... saying "you do not see the forest because of the
trees".

My apologies (to the others) for this little... bullet.

May 31 2011
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
eles wrote:
 Without being rude, but I find that you are too much a zealot and too
 little of a listener. You are firing bullets in many, but wrong
 directions, trying to scary and to impress.

Oh, sorry. I did not want to be rude. Communicating over newsgroups tends to work poorly in that respect.
 Yes, I have no much experience with D, although I have some with C.
 The syntax is not for D, but for general pseudo-code whose purpose
 was to convey the idea, and not to inflame spirits about narrow
 bracketing issues. The matter was not if a semicolon, or a bracket
 for that matter, is appropriate or no.

I agree syntax is not very important for that matter. It was not to be taken too seriously; it was a joke, annotated by ":)". But reviewing my post I can understand how it might have offended you. Sorry for that. If somebody codes very much in a language, his pseudo-code tends to look much like that language. That's where I inferred that you do not have much experience with the way things work in D. Yet you seemed quite certain that D was doing it all wrong. Try D some time. It is great.
 In the same manner, you would try to shout down the relativity theory
 for... improper English spelling (not to say that I am a second
 Einstein...) because the author was... of German origin.

I think you are concentrating too much on my comments about syntax. Fun fact: Most physicists did shout down the relativity theory in the beginning. There was even a book titled "100 authors against Einstein." or similar. Einstein did receive the Nobel prize, not for papers on relativity theory but for his discovery of the photoelectric effect. Today, most people will say relativity theory when they hear Einstein. So if you have a really good point, please share. I agree that I have to get better at that. And I was not disagreeing with your point of view because of syntactic issues, but most likely because I have other use cases in mind than you. Again, what exactly do you want to do with array slices? There are perfectly valid uses of the slicing semantics you suggest, but I think they are not a good fit for built-in D arrays.
 Although I disagree with Andrei's p.o.v. concerning exclusive right
 limit, I much more appreciate his style to reply. Probably it is
 because being so great, he feels no compulsive need to emphasize his
 stature w.r.t. those that are littler than him. And he follows and
 fights the idea, maybe because he was able to overlook the
 insignificant bracketing issues for something more important. But,
 probably, his attitude is the attribute of great minds, not of the
 littlest ones.

Yes, his answer was indeed quite political. Also, his English is better than mine. I have no need to "emphasize my stature", I am trying to give arguments for what I consider correct.
 There is a saying... saying "you do not see the forest because of the
 trees".

Please explain how it applies to me. True, I did not understand what you were trying to do in your code snippet. That is why I asked.
 My apologies (to the others) for this little... bullet.

I hope you won't take this message in a similar way.
Timon
May 31 2011
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/31/11 10:28 AM, Timon Gehr wrote:
 eles wrote:
 Without being rude, but I find that you are too much a zealot and too
 little of a listener. You are firing bullets in many, but wrong
 directions, trying to scary and to impress.

Oh, sorry. I did not want to be rude. Communicating over newsgroups tends to work poorly in that respect.

Many thanks to you both for graciously keeping this exchange civil. Andrei
May 31 2011
prev sibling parent reply "Nick Sabalausky" <a a.a> writes:
"eles" <eles eles.com> wrote in message 
news:is2qgc$2o7h$1 digitalmars.com...
 The right boundary of a slice is exclusive.

I think it should be stated more obvious in the paper.
 This makes sense, so you can
 do stuff like a[1..$] (== a[1..a.length]) to get a slice that

 all elements of a except for the first one (a[0]).

I disagree, but I have not much influence here, although I will defend my point of view. I find it quite unpleasant to remember which of the left and right bounds are exclusive and, moreover, this precludes slicing with a[i1..i2] where i1 and i2 are only known at the runtime and may be i2<i1 (and not necessarily i1<i2).

When I first started with D, ages ago, I was skeptical about the half-open slices. It sounded unbalanced. But since then, I've been won over by it for a few reasons (in no particular order):

1. After years of using D, I've never had a problem with it working that way. And even initially, I found it very easy to learn and get used to it.

2. I've found it easier to avoid off-by-one errors. I don't have to think about them as much.

3. arr[a..b].length == b-a <-- That's a *very* nice, clean, useful property to have. And I think it's one of the main reasons for #2 above. In fact, this actually makes it feel more balanced to me than having inclusive on both ends.

4. The following:

 string str = "ABCDEF";
 int splitIndex = 3;
 string part1 = str[0 .. splitIndex];
 string part2 = str[splitIndex .. $];
 assert(part1 ~ part2 == str);

Ie, when you split an array, you can use the same index for both halves. No "+1"-ing. It just works. Don't have to think about it.
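
For anyone who wants to try points 3 and 4 for themselves, here is a minimal sketch in D. The helper name splitAt is made up for illustration and is not from Phobos or from the article; everything else is plain built-in slicing.

```d
// Hypothetical helper (not from the thread or from Phobos): split `data`
// at `index` into two non-overlapping halves that share one boundary.
T[][2] splitAt(T)(T[] data, size_t index)
{
    return [data[0 .. index], data[index .. $]];
}

unittest
{
    int[] arr = [10, 20, 30, 40, 50];

    // The length of a half-open slice is the difference of its bounds.
    assert(arr[1 .. 4].length == 4 - 1);

    // Equal bounds give an empty slice; no special syntax is needed.
    assert(arr[2 .. 2].length == 0);

    // The same index closes the first half and opens the second.
    auto halves = splitAt(arr, 2);
    assert(halves[0] == [10, 20]);
    assert(halves[1] == [30, 40, 50]);
    assert(halves[0] ~ halves[1] == arr);
}
```

Saved to a file, this should run with something like "dmd -unittest -main -run file.d". With inclusive bounds the two halves would need index-1 and index respectively, which is exactly the adjustment the half-open form avoids.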
May 31 2011
next sibling parent "Nick Sabalausky" <a a.a> writes:
"Nick Sabalausky" <a a.a> wrote in message 
news:is3hoc$peu$1 digitalmars.com...

Another way I often like to think of it is, if you'll pardon the ASCII-art, a slice looks like this:

 ___________
<___________<

Like stackable cups. They "fit" together. Nice, neat and tidy. But if it were inclusive on both ends, it would look like this:

 ___________
<___________>

Those don't fit together. Messy.
May 31 2011
prev sibling parent reply eles <eles eles.com> writes:
 a few reasons (in no particular order):
 1. After years of using D, I've never had a problem with it working

 way. And even initially, I found it very easy to learn and get used

You become an "educated" user. How many? I bet that *you* will never write while(some_condition); do_some_intended_loop_processing(); but this happens! What to do when it happens?
 2. I've found it easier to avoid off-by-one errors. I don't have to

 about them as much.

Why? I fail to see the reason behind it. It would be as simple as subtracting 1 from the right limit. Is this more prone to errors?
 3. arr[a..b].length == b-a     <-- That's a *very* nice, clean,

 property to have. And I think it's one of the main reasons for #2

 fact, this actually makes it feel more balanced to me than having

 on both ends.

Speaking of this, using an alternative syntax such as a[slice_beginning..slice_length] would even allow you to not even make that subtraction in the first place, let alone adding 1. But the fact is that when you are counting numbers going from 6 to 9, to find *the number of numbers* you do: 9-6 PLUS 1.
 4. The following:
 string str = "ABCDEF";
 int splitIndex = 3;
 string part1 = str[0 .. splitIndex];
 string part2 = str[splitIndex .. $];
 assert(part1 ~ part2 == str);
 Ie, when you split an array, you can use the same index for both

 "+1"-ing. It just works. Don't have to think about it.

You can live a life without having to think about many things. However, experience proves that thinking is generally better than not "having to think about it". What if a bug? Is not more logically to index those halves with:

 a[0..length_half-1]
 a[length_half..length-1]

We speak about *disjoint* halves here. Why not disjoint indexes? What if you have a "center" (common) character that you want to keep in both halves? You would have to write:

 string part1 = str[0 .. splitIndex+1];
 string part2 = str[splitIndex .. $];

Is this more logical than writing:

 string part1 = str[0 .. splitIndex];
 string part2 = str[splitIndex .. $];

?
May 31 2011
parent "Nick Sabalausky" <a a.a> writes:
"eles" <eles eles.com> wrote in message news:is3j1h$thh$1 digitalmars.com...
 a few reasons (in no particular order):
 1. After years of using D, I've never had a problem with it working

 way. And even initially, I found it very easy to learn and get used

You become an "educated" user. How many? I bet that *you* will never write while(some_condition); do_some_intended_loop_processing(); but this happens! What to do when it happens?

That's a syntax error in D.
 2. I've found it easier to avoid off-by-one errors. I don't have to

 about them as much.

Why? I fail to see the reason behind.

Seriously, I've just stated that that's been my experience. Whether or not I can give a reason for why that's been my experience is irrelevant. If I couldn't give a reason, would that somehow change the fact of what I experienced? Of course not.
 It would be as simple as
 subtracting 1 from the right limit. Is this more prone to errors?

In my experience, yes.
 3. arr[a..b].length == b-a     <-- That's a *very* nice, clean,

 property to have. And I think it's one of the main reasons for #2

 fact, this actually makes it feel more balanced to me than having

 on both ends.

Speaking of this, using an alternative syntax such as a[slice_beginning..slice_length] would even allow you to not even make that subtraction in the first place, let alone adding 1.

Using [slice_beginning..slice_length] is essentially like mixing types. You have an index and then you have a length. If you want to use the end of that slice as the beginning of another slice, you have to convert it from length to index. If you want to use the start of the slice as the end of another slice, you have to convert it from index to length. But the way we do it now, you just have two indices. The same "type", so to speak. So why not use what's simpler? ;)
 But the fact is that when you are counting numbers going from 6 to 9,
 to find *the number of numbers* you do: 9-6 PLUS 1.

That's not even relevant.
 4. The following:
 string str = "ABCDEF";
 int splitIndex = 3;
 string part1 = str[0 .. splitIndex];
 string part2 = str[splitIndex .. $];
 assert(part1 ~ part2 == str);
 Ie, when you split an array, you can use the same index for both

 "+1"-ing. It just works. Don't have to think about it.

You can live a life without having to think about many things. However, experience proves that thinking is generally better than not "having to think about it". What if a bug?

That's one of the most ridiculous arguments I've heard. If I started typing everything in "leet" or pig latin, you'd have to think a lot more to read it, and I'd have to think a lot more to write it, but it sure as hell wouldn't improve anything because it's thinking about things that we don't actually need to think about.
 Is not more logically to index those halves with:

 a[0..length_half-1]
 a[length_half..length-1]

Yes, that is not more logical. It's less logical and more messy.
 We speak about *disjoint* halves here. Why not disjoint indexes?

 What if you have a "center" (common) character that you want to keep
 in both halves?

 You would have to write:

 string part1 = str[0 .. splitIndex+1];
 string part2 = str[splitIndex .. $];

 Is this more logical than writing:

 string part1 = str[0 .. splitIndex];
 string part2 = str[splitIndex .. $];

 ?

First of all, yes, the first *is* more logical simply because you're deliberately duplicating an element, therefore it makes sense there would be an adjustment to one of the endpoints. Secondly, when would anyone ever need to do that? It's obviously only a rare need at best. I'm getting the impression that you're just making up arguments for the sake of arguing.
May 31 2011
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 31 May 2011 08:34:12 -0400, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:

 This is my final submission for the D article contest.

 This takes into account all the fixes and suggestions from the first  
 draft review.

 http://www.dsource.org/projects/dcollections/wiki/ArrayArticle

 -Steve

Shortly after submitting this, I've discovered that dsource is not working. Something really fishy is going on... Opera is giving me warnings about the web site requesting access to my computer's data. I have a backup copy (full html + images), but I need to find a place for it. If dsource doesn't come back in the next few hours, I'll find another home (probably the D wiki). -Steve
May 31 2011
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 31 May 2011 09:23:24 -0400, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:

 On Tue, 31 May 2011 08:34:12 -0400, Steven Schveighoffer  
 <schveiguy yahoo.com> wrote:

 This is my final submission for the D article contest.

 This takes into account all the fixes and suggestions from the first  
 draft review.

 http://www.dsource.org/projects/dcollections/wiki/ArrayArticle

 -Steve

Shortly after submitting this, I've discovered that dsource is not working. Something really fishy is going on... Opera is giving me warnings about the web site requesting access to my computer's data. I have a backup copy (full html + images), but I need to find a place for it. If dsource doesn't come back in the next few hours, I'll find another home (probably the D wiki).

Seems to be back, and actually it may be a problem in my local subnet. -Steve
May 31 2011
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 31 May 2011 09:29:48 -0400, eles <eles eles.com> wrote:

 The right boundary of a slice is exclusive.

I think it should be stated more obvious in the paper.

It's now clarified in the comment. Thanks for the feedback. -Steve
May 31 2011
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
This is how I got to terms with it long ago: http://i.imgur.com/lSkvs.png

When it's a slice, it's basically two anchors or gaps at some
location, and whatever items are between the anchors is your result.
Otherwise with indexing it's the number that starts at that offset (in
this case it would be at the location at right from the gap).
May 31 2011
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 31 May 2011 11:10:59 -0400, eles <eles eles.com> wrote:

 if (i1 > i2) swap(i1, i2);

That will affect further values from that point onward, which could not be necessarily intended. Also, is a bit of overhead for solving such a little issue. OK, it is simple to swap but... why to be force to do it?

Or:

T[] arbitrarySlice(T)(T[] data, size_t i1, size_t i2) // rename as desired
{
    return (i2 < i1) ? data[i2..i1] : data[i1..i2];
}

b = a.arbitrarySlice(i1, i2);

Not perfect, but doable. In any case, it's not a common need to reverse indexes.
 The "war" between open-right and closed-right limit has been waged

 since Fortran was first invented. It may seem that closed-right

 are more natural, but they are marred by problems of varied degrees

 subtlety. For example, representing an empty interval is tenuous.
 Particularly if you couple it with liberal limit swapping, what is a

 .. 0]? An empty slice or the same as the two-elements a[0 .. 1]?
 Experience has shown that open-right has "won".
 Andrei

I agree that the issue is not simple (else, there would have been no "war"). I know no other examples where open-right limits are used. I use (intensively) just another language capable of slicing, that is Matlab. It uses the closed-left and closed-right limits and I tend to see it as a winner in the engineering field, precisely because of its ability to manipulate arrays (its older name is MATrix LABoratory and this is exactly what I am using it for). Some examples of syntax (arrays are 1-based in Matlab):

 a=[1,2,3,4];    % array with elements: 1, 2, 3, 4 (of length 4)
 b=a(1:3);       %b is an array with elements: 1, 2, 3 (of length 3)
 c=a(end-1:end); %c is an array with elements: 3, 4 (of length 2)
 d=a(2:1);       %d is empty
 e=a(1:end);     %e is an array with elements: 1,2,3,4 (of length 4)
 f=a(1:10);      %ERROR (exceeds array dimension)
 g=a(1:1);       %g is an array with elements: 1 (of length 1)

Although there is no straight way to represent an empty array using the close-left&right syntax, I doubt this is important in real world. Matlab has a special symbol (which is "[]") for representing empty arrays. Where is the need to represent an empty interval using slice syntax in D?

An "empty" slice does not have length but it does have position. For example, a binary search feature may return an empty slice indicating where an element *would* be.
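
To make the "empty but positioned" idea concrete, here is a rough sketch in D. The function locate is a hand-written, hypothetical binary search made up for this illustration (it is not a Phobos function); it returns either a one-element slice holding the value or an empty slice sitting at the spot where the value would be inserted.

```d
// Hypothetical helper: return the slice of `sorted` that holds `value`,
// or an empty slice located where `value` would be inserted.
const(int)[] locate(const(int)[] sorted, int value)
{
    size_t lo = 0, hi = sorted.length;   // search the half-open range [lo, hi)
    while (lo < hi)
    {
        immutable mid = lo + (hi - lo) / 2;
        if (sorted[mid] < value)
            lo = mid + 1;
        else
            hi = mid;
    }
    // lo is now the first index whose element is >= value.
    immutable found = lo < sorted.length && sorted[lo] == value;
    return sorted[lo .. lo + (found ? 1 : 0)];
}

unittest
{
    const(int)[] data = [1, 3, 5, 7];

    auto hit = locate(data, 5);
    assert(hit.length == 1 && hit[0] == 5);

    auto miss = locate(data, 4);
    assert(miss.length == 0);            // empty ...
    assert(miss.ptr == data.ptr + 2);    // ... but positioned just before the 5
}
```

With closed intervals the "not found" result would need a separate representation for the position, whereas here sorted[lo .. lo] carries it for free.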
 Moreover, when writing in D: a[1..1] is this an empty or a non-empty
 array? One convention says that the element a[1] is part of the
 slice, since the left limit is included and the other convention says
 a[1] is not part of the slice precisely because the right limit is
 excluded. So, which one weights more? So, is a[1..1] an empty array
 or an array with one element?

This really is simply a matter of taste and experience. It's like learning a new language, some words may sound or be spelled similarly, but have completely different meanings. You just have to learn the nuances. Here there is truly no "right" answer. To me, the Matlab syntax looks as strange as the D syntax looks to you :) I also have trouble with 1-based array languages. But that doesn't mean it's "wrong", just something I'm not used to. One thing I will say though, being able to refer to the element after the interval makes things easier to code in containers. For example, in C++ STL, there is always an end iterator, even if it's not represented by the length. It gives you an extra anchor at which you can insert elements, meaning you don't need the "insertAfter" primitive as well as the insert primitive. Always having an open interval on the right allows for nice flow in code. -Steve
May 31 2011
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 5/31/11, eles <eles eles.com> wrote:
 Nice picture, but then why foo[1] is rather 8 than 4? And what is foo
 [9]?

foo[9] is out of range. If foo[1] was actually referring to 4, then foo[0] would be out of range too. I've only used this picture because back years ago when I was trying to understand how this whole system works there was an explanation similar to this one in some book, the name of which I forgot. But it did 'click' with me when I saw it presented like that.
May 31 2011
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
Quite frankly I don't give a damn about for loops because I hardly use
them. Whenever I can, I use foreach, and if I get stuck and think that
foreach can't be used I rather think twice and consider to do a small
redesign of my code rather than start using ancient for loops. I don't
want to deal with these silly signed/unsigned/overflow issues at all.
Foreach for the win. Foreach and lockstep for the double win.

In fact, from what my *.d searches tell me, most of my for loops come
from C relics of code which I've translated into D.

But I don't do scientific computing, so that's just me. :)
May 31 2011
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 31 May 2011 14:20:20 -0400, eles <eles eles.com> wrote:

 == Quote from Mafi (mafi example.org)'s article
 Am 31.05.2011 18:16, schrieb eles:
 Now, why:

 for(iterator from a[0] to a[N-1]){ //etc. }
 //let use the above notation for for(i=0; i<=N-1; i++)

 is acceptable, but sudden is no more acceptable to write

 a[for(iterator from 0 to N-1)]

 and one must use

 a[for(iterator from 0 to N]]

 in order to achieve exactly the same?

 The last two expressions are just mental placeholders for a


 and for a[0..N] respectively.

for(element from a[0] to a[n]) be a notation for for(i=0; i < n; i++) but then, assuming closed intervals, why do I have to write a[i..n-1] to achieve the same? You see? This argument of yours is not really good. Mafi

why not write: for(i=0; i<=n-1; i++)

if n is unsigned int, and 0, then this becomes

 i = 0; i <= uint.max; i++

Basically, using subtraction in loop conditions is a big no-no. You are much better off to write using addition:

 i + 1 <= n

But then i < n looks so much better.
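
A small sketch of that wrap-around in D, for anyone who wants to see it happen. The helper countHalfOpen is a made-up name used only to count iterations; the rest is ordinary unsigned arithmetic on size_t.

```d
// Hypothetical helper: count how many times a half-open loop runs for length n.
size_t countHalfOpen(size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)   // i ranges over [0, n)
        count++;
    return count;
}

unittest
{
    size_t n = 0;
    size_t i = 0;

    // Unsigned subtraction wraps around instead of going negative.
    assert(n - 1 == size_t.max);

    // So the closed-interval condition is true for every i when n == 0 ...
    assert(i <= n - 1);

    // ... while the half-open condition correctly runs zero times.
    assert(countHalfOpen(0) == 0);
    assert(countHalfOpen(3) == 3);
}
```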
 Conceptually, you are iterating elements from a[0] to a[n-1], this is
 why index goes from 0 to n-1 (even if you write i<n).

 What about if length of the array already equals UTYPE_MAX, ie. the
 maximum value representable on the size_t? (I assume that indexes are
 on size_t and they go from 0 to UTYPE_MAX).

First, size_t is the limit of your address space. You cannot have size_t items in any slice. Second, there are very very few use cases where size_t.max is used as an iteration. -Steve
May 31 2011
prev sibling next sibling parent "Simen Kjaeraas" <simen.kjaras gmail.com> writes:
On Tue, 31 May 2011 20:29:51 +0200, KennyTM~ <kennytm gmail.com> wrote:

 If your program needs an array of 4 billion elements (2e+19 elements on  
 64-bit system), you're programming it wrong.

Not true. (for 32 bits) True for simple arrays, yes. Not for other ranges that may e.g. lazily read files. That said, if you do that, you really, really should consider using longs instead of ints. -- Simen
May 31 2011
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On 2011-05-31 13:19, eles wrote:
 == Quote from Ali Çehreli (acehreli yahoo.com)'s article
 
 On 05/31/2011 08:10 AM, eles wrote:
 I know no other examples where open-right limits are used.

The C++ standard library uses open-right with its pairs of

iterators.
 The second iterator "points" at one beyond the last element of the

range.
 Ali

C'mon, if C++ is such a good standard, then D would have never appeared. Why not dropping D completely and go to C++ then? The fact that C++ uses it that way does not making it good for us. D choose slices, not iterators. Maybe we should remind why it did it in the first place. Now, we are taking the "enemy" (well, is a joke) as a reference in the matter?

The fact that C++ does something does not inherently make that choice good or bad. C++ has both good and bad stuff in it. D tries to take the good stuff and improve or remove the bad stuff, but ultimately, D has a lot in common with C++. So, the fact that C++ has a particular feature or behavior says _nothing_ in and of itself as to whether D should have that same feature or behavior. They must all be taken or left individually on their own merits.

Iterators have a lot going for them. Ranges just take that to the next level. And whether the right end of a range is open or closed has _nothing_ to do with why ranges were implemented in Phobos instead of iterators. It was determined that having the right end be open was desirable, so that's what was implemented.

And honestly, given how fantastic a lot of the STL is, in many cases, the fact that the STL did it is definitely _not_ a mark against it. The STL didn't do everything right, and D and Phobos strive to improve upon it by creating a superior solution, but that doesn't mean that the fact that the STL has a particular design decision makes it a bad decision - almost the opposite in fact. But ultimately, all design decisions need to be examined in their own right, and whether a particular language or library made a particular design decision doesn't in and of itself say anything about whether that decision was a good one or not.

- Jonathan M Davis
May 31 2011
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 5/31/11, Nick Sabalausky <a a.a> wrote:
 "eles" <eles eles.com> wrote in message news:is3ihf$qnu$1 digitalmars.com...

 == Quote from Ali Çehreli (acehreli yahoo.com)'s article
 On 05/31/2011 08:10 AM, eles wrote:
  > I know no other examples where open-right limits are used.
 The C++ standard library uses open-right with its pairs of iterators.
 The second iterator "points" at one beyond the last element of the range.

 Ali

C'mon, if C++ is such a good standard, then D would have never appeared. Why not dropping D completely and go to C++ then? The fact that C++ uses it that way does not making it good for us. D choose slices, not iterators. Maybe we should remind why it did it in the first place. Now, we are taking the "enemy" (well, is a joke) as a reference in the matter?

First you complain about D doing it differently from other popular languages. Then when you're given a counterexample, you start complaining that D *isn't* doing it differently??? You're really just trolling, aren't you?

Matlab programmers, Java programmers, they all want D to act their way. As for switch statements (and this is getting offtopic), there are cases where fallback is nice to have..

final switch (param)
{
    case FILE_NEW:
    case FILE_OPEN:
    case FILE_SAVE:
    case FILE_SAVE_AS:
    case EDIT_UNDO:
        throw new Exception("unimplemented!"); // or there would be one method that handles these cases
    case EDIT_CUT:
        handleCut();
        break;
    case EDIT_COPY:
        handleCopy();
        break;
    case EDIT_PASTE:
        handlePaste();
        break;
    case EDIT_CLEAR:
        handleClear();
        break;
}
return 0;

If you didn't have fallback, you would probably have to add some kind of new statement like "goto next" or "fallback" on each of those cases. On the other hand, having to add break everywhere is a pain too. So maybe the change would benefit overall. The switch statement is a weird construct. It's as if suddenly you're coding in Python with no curly braces. :)
May 31 2011
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
I see. Well, 'goto case xxx' has the added advantage that it already works. :)

I would prefer it to be just 'goto next' or something similar, but
that hijacks yet another keyword. I've only needed fallbacks after
code blocks a few times, so I'd be ok with the change as you've
described it.
May 31 2011
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 6/1/11, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
 On 5/31/11 4:46 PM, KennyTM~ wrote:
 On Jun 1, 11 05:37, Andrej Mitrovic wrote:
 If you didn't have fallback, you would probably have to add some kind
 of new statement like "goto next" or "fallback" on each of those
 cases.

It already exists. It is called# goto case;

Sigh. Unless it's a recent addition, I didn't know about it, Walter missed the case during proofreading, and consequently that's not documented in TDPL. Andrei

Oh this worked for quite a while now. :p
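
For reference, a minimal sketch of the two forms of goto case in D. The function describe and its strings are invented for illustration. As far as I know, current D compilers reject implicit fall-through out of a non-empty case, so the explicit form is the one to reach for.

```d
// Hypothetical example: build a string to show which cases were visited.
string describe(int level)
{
    string result;
    switch (level)
    {
        case 2:
            result ~= "two,";
            goto case;      // explicit fall-through into the next case (case 1)
        case 1:
            result ~= "one,";
            goto case 0;    // jump to a specific case by value
        case 0:
            result ~= "zero";
            break;
        default:
            result = "other";
            break;
    }
    return result;
}

unittest
{
    assert(describe(2) == "two,one,zero");
    assert(describe(1) == "one,zero");
    assert(describe(0) == "zero");
    assert(describe(5) == "other");
}
```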
May 31 2011
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 5/31/11, KennyTM~ <kennytm gmail.com> wrote:
 *: http://msdn.microsoft.com/en-us/library/06tc147t(v=VS.100).aspx
 #: http://d-programming-language.org/statement.html#GotoStatement

Doh! I know where it comes from, it's just funky compared to the rest of C/D. :)
May 31 2011
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 5/31/11, KennyTM~ <kennytm gmail.com> wrote:
 In D you'd write it as

         case FILE_NEW, FILE_OPEN, FILE_SAVE, FILE_SAVE_AS, EDIT_UNDO:

Maybe. But if each case is going to have its own handler, it makes sense to separate them each on its own line. Then as you implement each case from top to bottom your commits might look like:

commit 1:
    case FILE_NEW:
        handleFileNew(); // just finished writing handler
        break;
    case FILE_OPEN:
    case FILE_SAVE:
    case FILE_SAVE_AS:
    case EDIT_UNDO:
        throw new Exception();

commit 2:
    case FILE_NEW:
        handleFileNew();
        break;
    case FILE_OPEN:
        handleFileOpen(); // done with this now
        break;
    case FILE_SAVE:
    case FILE_SAVE_AS:
    case EDIT_UNDO:
        throw new UnimplementedException();

Etcetera..

*But*, I wouldn't want my FILE_OPEN case to accidentally fall through and throw an "UnimplementedException". So I'd be all for a change in this regard.
May 31 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On 2011-05-31 15:03, Andrei Alexandrescu wrote:
 On 5/31/11 4:46 PM, KennyTM~ wrote:
 On Jun 1, 11 05:37, Andrej Mitrovic wrote:
 If you didn't have fallback, you would probably have to add some kind
 of new statement like "goto next" or "fallback" on each of those
 cases.

It already exists. It is called# goto case;

Sigh. Unless it's a recent addition, I didn't know about it, Walter missed the case during proofreading, and consequently that's not documented in TDPL.

He probably always uses implicit fallthrough in his own code, and we know that he doesn't use fallthrough in switch statements as much as he thought that he did, so I suppose that it's not a great surprise that he missed it. It would have been nice to have it in TDPL though. - Jonathan M Davis
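
For readers who have not met it, here is a minimal sketch of what an explicit fall-through with "goto case;" might look like. The enum Command and the function dispatch are made up for illustration and come from nobody's actual code; only the "goto case;" statement itself is the language feature being discussed.

```d
import std.stdio;

// Hypothetical command set, for illustration only.
enum Command { fileNew, fileOpen, fileSave }

// Returns the names of the handlers that run for a given command.
string[] dispatch(Command cmd)
{
    string[] log;
    switch (cmd)
    {
        case Command.fileNew:
            log ~= "new";
            goto case;      // explicit fall-through into the next case label
        case Command.fileOpen:
            log ~= "open";
            break;
        case Command.fileSave:
            log ~= "save";
            break;
        default:
            break;
    }
    return log;
}

void main()
{
    writeln(dispatch(Command.fileNew));   // runs the "new" handler, then "open"
    writeln(dispatch(Command.fileOpen));  // runs only the "open" handler
}
```

The point of the sketch is that the fall-through is spelled out at the place where it happens, so a missing break can no longer be mistaken for an intended one.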
May 31 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On 2011-05-31 18:57, Andrei Alexandrescu wrote:
 On 05/31/2011 07:58 PM, Jonathan M Davis wrote:
 On 2011-05-31 15:03, Andrei Alexandrescu wrote:
 On 5/31/11 4:46 PM, KennyTM~ wrote:
 On Jun 1, 11 05:37, Andrej Mitrovic wrote:
 If you didn't have fallback, you would probably have to add some kind
 of new statement like "goto next" or "fallback" on each of those
 cases.

It already exists. It is called# goto case;

Sigh. Unless it's a recent addition, I didn't know about it, Walter missed the case during proofreading, and consequently that's not documented in TDPL.

He probably always uses implicit fallthrough in his own code, and we know that he doesn't use fallthrough in switch statements as much as he thought that he did, so I suppose that it's not a great surprise that he missed it. It would have been nice to have it in TDPL though. - Jonathan M Davis

Probably it would be best to take this opportunity to make the language change and then document it in a future edition of the book. With "goto case;" reducing the incremental cost of falling through to near non-existence, I think there is no excuse to keep the current behavior. Also, I think the runtime error on not handling all cases should be eliminated as well. It's very non-D-ish and error prone in the extreme (as unhandled cases tend to be often rare too). Right there with the HiddenFunc error. I think improving switch and hidden functions to err during compilation would be right in line with our push to making correct code easy to write and incorrect code difficult to write.

Definitely. Also, it would be nice if final switch actually worked. In my experience, it never does, though maybe that's changed since I last messed with it. Regardless, there are definitely some basic changes which can be made to switch which don't reduce its usability but which definitely reduce the number of errors that you're likely to get when using it. - Jonathan M Davis
May 31 2011
prev sibling next sibling parent reply Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 6/1/11, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
 Also, I think the runtime error on not handling all cases should be
 eliminated as well. It's very non-D-ish and error prone in the extreme
 (as unhandled cases tend to be often rare too). Right there with the
 HiddenFunc error.

ass, and I really want them to surface during compilation instead of getting triggered at random during runtime.
Jun 01 2011
parent reply "Daniel Murphy" <yebblies nospamgmail.com> writes:
"Andrej Mitrovic" <andrej.mitrovich gmail.com> wrote in message 
news:mailman.517.1306925428.14074.digitalmars-d puremagic.com...
 On 6/1/11, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
 Also, I think the runtime error on not handling all cases should be
 eliminated as well. It's very non-D-ish and error prone in the extreme
 (as unhandled cases tend to be often rare too). Right there with the
 HiddenFunc error.

ass, and I really want them to surface during compilation instead of getting triggered at random during runtime.

Shouldn't final switch be the way to handle this though? It seems like making it a compile time error will just force the programmer to insert 'default: assert(0);', which is basically what the compiler does anyway. Or have I missed the point?
Jun 01 2011
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 06/01/2011 10:26 AM, Daniel Murphy wrote:
 "Andrej Mitrovic"<andrej.mitrovich gmail.com>  wrote in message
 news:mailman.517.1306925428.14074.digitalmars-d puremagic.com...
 On 6/1/11, Andrei Alexandrescu<SeeWebsiteForEmail erdani.org>  wrote:
 Also, I think the runtime error on not handling all cases should be
 eliminated as well. It's very non-D-ish and error prone in the extreme
 (as unhandled cases tend to be often rare too). Right there with the
 HiddenFunc error.

ass, and I really want them to surface during compilation instead of getting triggered at random during runtime.

Shouldn't final switch be the way to handle this though?

I think that should be the default behavior of switch.
 It seems like
 making it a compile time error will just force the programmer to insert
 'default: assert(0);', which is basically what the compiler does anyway.  Or
 have I missed the point?

1. Most switch statements already have a default label.

2. Those that don't should be inspected and annotated by the programmer, not rely on the compiler to automagically insert code assuming the programmer's intent.

Every single time I had that assertion firing, the cause was a bug in my code. We can't leave that to a dynamic error, particularly since unhandled cases may be arbitrarily infrequent. The current state of affairs is choosing the wrong tradeoff, pure and simple.

Andrei
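
A rough sketch of the two styles being weighed here, assuming final switch behaves as the language specification describes (every enum member needs a case and no default is allowed). The names Color, name, and classify are invented for the example.

```d
import std.stdio;

// Hypothetical enum, for illustration only.
enum Color { red, green, blue }

// final switch: each member of Color must have its own case. Adding a
// member to Color without adding a case here is rejected at compile time.
string name(Color c)
{
    final switch (c)
    {
        case Color.red:   return "red";
        case Color.green: return "green";
        case Color.blue:  return "blue";
    }
}

// Ordinary switch with the default written out by the programmer, so the
// handling of "everything else" is a visible decision, not an implicit one.
string classify(int n)
{
    switch (n)
    {
        case 0:  return "zero";
        case 1:  return "one";
        default: return "other";
    }
}

void main()
{
    writeln(name(Color.green));
    writeln(classify(7));
}
```

In both functions the unhandled-case question is settled when the code is written or compiled, which is the tradeoff argued for above; neither relies on an error that only fires at run time.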
Jun 01 2011
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 31 May 2011 16:16:37 -0400, eles <eles eles.com> wrote:

 if n is unsigned int, and 0, then this becomes i = 0; i <=

 Basically, using subtraction in loop conditions is a big no-no.

Yes, I have been trapped there. More than once. And yes, n=0 is a special case. And yes, it is a big no-no. It also appears when *i* is an unsigned int and you are decrementing it and comparing (with equality) against 0. The loop is infinite. For me, conceptually, the problem is simply that unsigned types, once at zero, should throw an exception if they are decremented. It is illogical to allow such an operation. However, type information is only available at compile time, so when the program is running it is difficult to take measures against it.

This is not practical. It would be too expensive to check because the hardware does not support it.
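
A small sketch of the wraparound being described, plus one common way to count down with an unsigned index without looping forever. The function names lastIndexClosed and countdown are hypothetical and exist only for this illustration.

```d
import std.stdio;

// Upper bound that a loop written as "i <= n - 1" would use.
// For n == 0 the subtraction wraps around to uint.max instead of going negative.
uint lastIndexClosed(uint n)
{
    return n - 1;
}

// Visits the indexes n-1, n-2, ..., 0 using an unsigned counter.
// The test "i-- > 0" compares before decrementing, so the loop stops
// cleanly at zero; a condition like "i >= 0" would always be true for a uint.
uint[] countdown(uint n)
{
    uint[] visited;
    for (uint i = n; i-- > 0; )
        visited ~= i;
    return visited;
}

void main()
{
    writeln(lastIndexClosed(0) == uint.max);  // the n == 0 trap
    writeln(countdown(3));                    // indexes 2, 1, 0
    writeln(countdown(0).length);             // no iterations at all
}
```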
 I thought, however, that those exception should be thrown in the
 "debug" version.

 However, there is no reason to make, instead, UTYPE_MAX a big no-no.

As has been said, multiple times, UTYPE_MAX is not a valid index, and that is not because of the open-interval on the right. It's because of addressing in a zero-based index world, you simply can't have an array bigger than your address space. An array with UTYPE_MAX as a valid index must have at least UTYPE_MAX + 1 elements.

How this can possibly be a "huge factor" for you, while having to put +1 and -1 all over the place for mundane array indexing is not, is beyond my comprehension. If you are looking for an iron-clad reason for closed-right indexing, this isn't it.

But, it's not really important anyways. The open-right indexing ship has not only sailed, it's made it across the ocean, formed several colonies, and is currently declaring independence. You're about a decade too late with this argument. How much of a possibility would you think Matlab has of changing its indexing scheme to be like D's? About the same chance as D adopting Matlab's, I'd say.

If this is a reason you will not use D, then I'm sorry that you won't be with us, but that's just life.

-Steve
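
To make the "+1 and -1 all over the place" point concrete, here is a short sketch of how open-right slices compose. The helper splitsCleanly is a made-up name for this example; the slice expressions are ordinary D.

```d
import std.stdio;

// Hypothetical helper: checks the properties that make the open-right
// convention convenient, for any split point with 0 <= mid <= a.length.
bool splitsCleanly(const int[] a, size_t mid)
{
    auto left  = a[0 .. mid];   // the first mid elements
    auto right = a[mid .. $];   // everything from index mid onward

    return left.length == mid                        // length is simply hi - lo
        && left.length + right.length == a.length    // nothing lost, nothing doubled
        && (left ~ right) == a                       // the two halves rebuild the array
        && a[mid .. mid].length == 0;                // an empty slice needs no special case
}

void main()
{
    int[] a = [10, 20, 30, 40, 50];
    writeln(a[0 .. 2]);           // [10, 20]: a[0] and a[1], but not a[2]
    writeln(a[1 .. $]);           // [20, 30, 40, 50]: all but the first element
    writeln(splitsCleanly(a, 2)); // true
}
```

With a closed right bound, the same split would need mid - 1 on one side or mid + 1 on the other, and the empty slice would have no natural spelling.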
Jun 01 2011
prev sibling next sibling parent Ali Çehreli <acehreli yahoo.com> writes:
On 05/31/2011 05:34 AM, Steven Schveighoffer wrote:
 This is my final submission for the D article contest.

 This takes into account all the fixes and suggestions from the first
 draft review.

 http://www.dsource.org/projects/dcollections/wiki/ArrayArticle

 -Steve

Thank you, Steve! This great article now has a Turkish translation: http://ddili.org/makale/d_dilimleri.html Ali
Jun 01 2011
prev sibling parent "Alexander Malakhov" <anm programmer.net> writes:
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote in a message dated Wed, 01 Jun 2011 00:57:14 +0700:
 As I mentioned, the issues involved are of increasing subtlety. As you
 wrote, C programmers iterate upward like this:
 ...

Maybe add this to http://www.digitalmars.com/d/2.0/rationale.html (and btw, I don't see "rationale" on d-p-l.org).

Say, if I saw on some forum "open-right proved to be inferior in almost all areas, hence D's choice sucks", I don't think there are a lot of programmers out there who would instantly come up with what Andrei wrote.

-- 
Alexander
Jun 06 2011