digitalmars.D - Yet another include/exclusive slicing thread

• Tomer Altman (66/66) Oct 22 2004 First of all, I know and understand that things are hard to change, that...
• Russ Lewis (19/19) Oct 22 2004 I used to be a vocal proponent of this very same thing. Walter's
• Sjoerd van Leent (9/31) Oct 23 2004 I think that a solution to this should be possible. Why not let people
• Walter (4/10) Oct 31 2004 While that would technically work, I suspect that there would be constan...
• Anders F Björklund (6/17) Oct 23 2004 Java chose exclusive ranges too, if that helps...
• David Medlock (13/13) Oct 23 2004 I think the whole issue stems from the ridiculous notion that we start
• Anders F Björklund (6/11) Oct 24 2004 I think you meant: "for i := 1 to length do" as readable :-)
• Derek (12/32) Oct 24 2004 I'm with you here too. I know that D's heritage does not permit it to us...
• Ben Hinkle (14/14) Oct 23 2004 I see where you are coming from - Fortran and MATLAB both include the
• David Medlock (6/22) Oct 25 2004 I like the (often downtrodden) pascal language a lot, because it allows
```First of all, I know and understand that things are hard to change, that Walter
has absolute authority over the language specification and the fact this topic
has been brought up many times already. However I believe (hope) that my
following post will pour more light on the subject and help turn the matters to
the best.

WARNING: LONG POST!

The main advantage of an inclusive range for the start and end indices of a
slice is that it is intuitive and makes the programmer's intent clear, since
".." is a universal, symmetric notation for a "range":
A[0..2] affects cells 0,1,2.

Now the points against it, for each of which I will give a counter-opinion:
1. To denote the slice is until the end of the array, one must use (length - 1).

While this is correct numerically, one logically thinks about a range in terms
of start and end and not in terms of "length". Therefore a new property with
a name such as "last", "end" or "maxIndex" could be introduced, equal
to (length - 1). In this case, A[0..last] can be written instead of
A[0..length-1], keeping the clarity.
Adding a "last" property can also help in case one wants to simply change the
last cell in the array:
A[last] = 5;

2. Programmers are used to the for(i=0;i<length;++i) idiom.

While it is true that right now this is the case, I believe it became this way
for historical reasons, as shorthand for writing code such as
"for(i=0;i<=length-1;++i)", which is certainly confusing and long.

In my experience, even though "for(i=0;i<length;++i)" is better for simple loops,
almost always whenever I have a more complex loop that goes from a variable
index "n" to a variable index "m", I prefer using the <= notation as it gives
a clear idea of where the loop stops.

I then realized that the reason why the <= notation isn't the common one is that
the "length" of the array doesn't actually fit in a loop over indices; it is just
a number from which one can derive the maximum index.

Since the system is structured, adding a "last" property as in point 1 would
remove the shorthand motive for using "for(i=0;i<length;++i)". Instead it
could be "for(i=0;i<=last;++i)".

Moreover, in case you DO refer to length, the usage of ".." is
counter-intuitive. As other people noted, adding an index-and-length
notation can be useful regardless of a start..end notation.

The reasoning behind D is to make a language which is like C and C++, but is
better designed, reducing the amount of thought one needs to read or write
code that does what one intuitively means. In this case, it means removing
the need for the exclusive < notation and instead using a symmetric <= notation
using a new language feature ("lastIndex").

Note: In both of my points, the code can be compiled by translating the new
version to the old version and compiling it just the same, so efficiency isn't
hindered.

3. Creating a zero-length array should be easy, so A[0..0] is a good way.

If an index-and-length notation were added (for example with the
syntax A[4#10], since "#" implies a number of elements), then creating a
zero-length array is trivial: A[0#0] would work.

4. Thousands of lines of D code have already been written, we shouldn't change
it now for legacy reasons.

While this is true for the reason C and C++ stayed backwards compatible, the
proportions are completely different, thousands versus hundreds of millions.
This caused many of the faults these languages now suffer from and are the
reason new languages (Java, D, etc.) were designed.
The language is still fresh and evolving. Changing it now might prove to be much
more fruitful when it gets popular and actually reaches millions of lines of
complex code.

On this note, I suggest adding a mandatory "language version" declaration at the
top of source files so that the compiler can act accordingly as the language
and libraries evolve. For example, "version 0.56" has a certain feature, while
"version 0.6" still has that feature but maybe with a different meaning (such as
this post's topic). That way, even if the meaning of an expression changes, old
code can be compiled appropriately. Of course, if the version declaration is
missing, the compiler will fall back to the version just before versioning was added.

I hope very much that this will help make D a better language for all of us.
```
Oct 22 2004
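To make the proposal above concrete, here is a minimal sketch in D of how the three suggested notations could be translated into the existing half-open slices. The helper names `sliceInclusive`, `sliceCount` and `lastIndex` are hypothetical stand-ins for `A[first..last]`, `A[start#count]` and the `last` property; none of them is part of the language or its library. The sketch assumes current D, where `$` denotes the array length inside an index expression.

```d
import std.stdio;

// Hypothetical helper: a slice that INCLUDES index `last`
// (the proposal's inclusive A[first..last]).
T[] sliceInclusive(T)(T[] a, size_t first, size_t last)
{
    return a[first .. last + 1]; // translate to D's half-open form
}

// Hypothetical helper: a start-and-count slice (the proposal's A[start#count]).
T[] sliceCount(T)(T[] a, size_t start, size_t count)
{
    return a[start .. start + count];
}

// Hypothetical stand-in for the proposed `last` property.
// It is only defined for a non-empty array.
size_t lastIndex(T)(T[] a)
{
    assert(a.length > 0, "an empty array has no last index");
    return a.length - 1;
}

void main()
{
    int[] a = [10, 20, 30, 40, 50];

    assert(sliceInclusive(a, 0, 2) == [10, 20, 30]);  // cells 0, 1, 2
    assert(sliceInclusive(a, 0, lastIndex(a)) == a);  // the whole array
    assert(sliceCount(a, 1, 3) == [20, 30, 40]);
    assert(sliceCount(a, 0, 0).length == 0);          // the proposal's A[0#0]

    a[lastIndex(a)] = 5;                              // the proposal's A[last] = 5
    assert(a[$ - 1] == 5);                            // `$` is the length in current D
    writeln("ok");
}
```

One thing the sketch brings out: the inclusive form cannot express an empty slice starting at index 0 with unsigned indices, which is why the proposal needs the separate start-and-count notation for that case.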
Russ Lewis <spamhole-2001-07-16 deming-os.org> writes:
```I used to be a vocal proponent of this very same thing.  Walter's
response to me, basically, was that it was designed that way because it
made the code easier to write, more often.  Basically, he said that it's
more common for you to know the index of the element after the last than
it was for you to know the index of the last one.  For instance, say you
want to slice an array into two pieces, and you know the index where it
should start.  With Walter's current design, the code looks like this:
char[] array = <whatever>;
int sliceIndx = <whatever>;
char[] slice1 = array[0..sliceIndx];
char[] slice2 = array[sliceIndx..length];
With inclusive ranges, you have to add two extra "-1"s to the code:
char[] slice1 = array[0..sliceIndx-1];
char[] slice2 = array[sliceIndx..length-1];

At the time, I didn't believe him.  However, in my D experience since,
I've had to say that I think he is right.  It is far more common to use
non-inclusive ranges than inclusive ones.

So, I'm a convert.  Yes, it looks confusing, and takes a little while to
learn.  But it's probably the best way to do things after all.
```
Oct 22 2004
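The split described above can be sketched as a small runnable D program. The helper `splitAt` is a hypothetical name introduced for illustration, and the sketch assumes current D (`$` for the length, `string` instead of `char[]`). It also shows two properties that follow from half-open ranges: the length of a slice is simply end minus start, and an empty slice needs no special case.

```d
import std.stdio;
import std.typecons : tuple;

// Hypothetical helper: split an array into two pieces at index i.
// The same index ends the first piece and starts the second.
auto splitAt(T)(T[] a, size_t i)
{
    return tuple(a[0 .. i], a[i .. $]);
}

void main()
{
    string text = "hello world";
    auto parts = splitAt(text, 5);

    assert(parts[0] == "hello");
    assert(parts[1] == " world");

    // No element is lost or duplicated at the split point.
    assert(parts[0].length + parts[1].length == text.length);

    // The length of a half-open slice is end - start.
    assert(text[2 .. 7].length == 7 - 2);

    // An empty slice is just start == end.
    assert(text[5 .. 5].length == 0);

    writeln(parts[0], "|", parts[1]);
}
```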
Sjoerd van Leent <svanleent wanadoo.nl> writes:
```Russ Lewis wrote:
I used to be a vocal proponent of this very same thing.  Walter's
response to me, basically, was that it was designed that way because it
made the code easier to write, more often.  Basically, he said that it's
more common for you to know the index of the element after the last than
it was for you to know the index of the last one.  For instance, say you
want to slice an array into two pieces, and you know the index where it
should start.  With Walter's current design, the code looks like this:
char[] array = <whatever>;
int sliceIndx = <whatever>;
char[] slice1 = array[0..sliceIndx];
char[] slice2 = array[sliceIndx..length];
With inclusive ranges, you have to add two extra "-1"s to the code:
char[] slice1 = array[0..sliceIndx-1];
char[] slice2 = array[sliceIndx..length-1];

At the time, I didn't believe him.  However, in my D experience since,
I've had to say that I think he is right.  It is far more common to use
non-inclusive ranges than inclusive ones.

So, I'm a convert.  Yes, it looks confusing, and takes a little while to
learn.  But it's probably the best way to do things after all.

I think that a solution to this should be possible. Why not let people
decide for themselves whether to use inclusive or exclusive notation for slicing?
The following should be possible to implement:

char[] slice1 = array[0 .. length];	// The same thing
char[] slice2 = array[start : end];	// Different operator

Could such a solution be the one you're looking for?

Regards,
Sjoerd
```
Oct 23 2004
"Walter" <newshound digitalmars.com> writes:
```"Sjoerd van Leent" <svanleent wanadoo.nl> wrote in message
news:cldcc1$2hue$1 digitaldaemon.com...
I think that a solution to this should be possible. Why not let people
decide for themselves whether to use inclusive or exclusive notation for slicing?
The following should be possible to implement:

char[] slice1 = array[0 .. length]; // The same thing
char[] slice2 = array[start : end]; // Different operator

Could such a solution be the one you're looking for?

While that would technically work, I suspect that there would be constant
confusion over which was which.
```
Oct 31 2004
Anders F Björklund <afb algonet.se> writes:
```Tomer Altman wrote:

While this is true for the reason C and C++ stayed backwards compatible, the
proportions are completely different, thousands versus hundreds of millions.
This caused many of the faults these languages now suffer from and are the
reason new languages (Java, D, etc.) were designed.
The language is still fresh and evolving. Changing it now might prove to be much
more fruitful when it gets popular and actually reaches millions of lines of
complex code.

Java chose exclusive ranges too, if that helps...

http://java.sun.com/j2se/1.5.0/docs/api/java/lang/String.html#substring(int,%20int)
public String substring(int beginIndex, int endIndex)
Returns a new string that is a substring of this string.
The substring begins at the specified beginIndex and extends to the character
at index endIndex - 1. Thus the length of the substring is endIndex-beginIndex.

Some of us think that it's a *good thing*, just as
we like arrays to start from zero and not from one?

--anders
```
Oct 23 2004
David Medlock <amedlock nospam.org> writes:
```I think the whole issue stems from the ridiculous notion that we start
counting things at zero in programming languages.

It's completely counterintuitive unless you have been writing compilers
and you know that:
char *p;
p[2] == *(p + 2)

With one-based indexes, the inclusive idea has more merit.
No need for a[length-1], just a[length] for the last item.

for( i=1; i<=length; i++ ) ... looks more readable to me than
for( i=0; i<length; i++)

This is definitely *not* a critique of Walter by any means, since he has
made C familiarity a priority.  It's more through the C legacy that this has come to pass.

My $0.02; spend it wisely.
```
Oct 23 2004
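The pointer identity mentioned above can be checked directly in D, which keeps C-style pointer arithmetic. This is a minimal sketch; `sumOneBased` is a hypothetical helper added only to show what a one-based loop costs when the storage underneath stays zero-based.

```d
// Hypothetical helper: iterate with one-based indices over zero-based storage.
int sumOneBased(int[] a)
{
    int sum = 0;
    for (size_t i = 1; i <= a.length; i++)
        sum += a[i - 1]; // the "- 1" moves from the slice bound into the index
    return sum;
}

void main()
{
    int[] a = [10, 20, 30, 40];
    int* p = a.ptr; // pointer to element 0

    // Indexing is pointer arithmetic: the index is an offset from the start.
    assert(p[2] == *(p + 2));
    assert(p[0] == *p);          // offset 0 is the first element
    assert(&a[2] - &a[0] == 2);  // element 2 sits two elements past the start

    assert(sumOneBased(a) == 100);
}
```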
Anders F Björklund <afb algonet.se> writes:
```David Medlock wrote:

With one-based indexes, the inclusive idea has more merit.
No need for a[length-1], just a[length] for the last item.

for( i=1; i<=length; i++ ) ... looks more readable to me than
for( i=0; i<length; i++)

I think you meant: "for i := 1 to length do" as readable :-)

Since D uses C-style arrays, its exclusive indexing makes sense?
(just as inclusive indexing would make sense with Pascal arrays)

And of course, for array loops, the "foreach" is excellent...

--anders
```
Oct 24 2004
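As a small illustration of the `foreach` remark above: when a loop visits every element, `foreach` removes the index bounds from the code entirely, so the inclusive versus exclusive question never comes up. The helper `total` is a hypothetical name used only for this sketch.

```d
import std.stdio;

// Hypothetical helper: sum the elements without writing any bounds.
int total(int[] a)
{
    int t = 0;
    foreach (x; a)
        t += x;
    return t;
}

void main()
{
    int[] a = [1, 2, 3, 4];
    assert(total(a) == 10);

    // The index is still available when it is needed.
    foreach (i, x; a)
        writeln(i, ": ", x);

    // `ref` allows modifying the elements in place.
    foreach (ref x; a)
        x *= 2;
    assert(a == [2, 4, 6, 8]);
}
```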
Derek <derek psyc.ward> writes:
```On Sat, 23 Oct 2004 21:38:46 -0400, David Medlock wrote:

I think the whole issue stems from the ridiculous notion that we start
counting things at zero in programming languages.

It's completely counterintuitive unless you have been writing compilers
and you know that:
char *p;
p[2] == *(p + 2)

With one-based indexes, the inclusive idea has more merit.
No need for a[length-1], just a[length] for the last item.

for( i=1; i<=length; i++ ) ... looks more readable to me than
for( i=0; i<length; i++)

This is definitely *not* a critique of Walter by any means, since he has
made C familiarity a priority.  It's more through the C legacy that this has come to pass.

My $0.02; spend it wisely.

I'm with you here too. I know that D's heritage does not permit it to use
1-based indexing so I'm not debating its pros and cons here.

I think of 0-based indexes as not really indexes at all but offsets to the
beginning of the element. I've been programming for more than 25 years and
a large part of that is with C, and yet 1-based indexing always seems more
natural to me. I now do a lot of programming with Euphoria and with
Progress, both of which use 1-based indexing, and it is just easier to
read/comprehend and explain to normal people (not programmers!).

--
Derek
Melbourne, Australia
```
Oct 24 2004
Ben Hinkle <bhinkle4 juno.com> writes:
```I see where you are coming from - Fortran and MATLAB both include the
endpoint in slices (and they both use 1-based indexing instead of
0-based). Non-programmers tend to like that more than C-style. One
area where including the endpoint makes sense is with custom
containers like a sorted associative array - why should a slice of
such an array need to know the key for the element after the desired
slice? Similarly in a linked list the slice from one node to another
should probably include the endpoint. The difference becomes important
when items are added to the list - do they go into the slice or after
the slice? In MinTL slicing by integers will exclude the endpoint and
slicing by key or node will include the endpoint.

I think people in this newsgroup are a bit worn out right now, though,
so I don't expect this topic to get much debate.

-Ben
```
Oct 23 2004
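The point about slicing a sorted container by key can be sketched in D. This is not MinTL's API; `sliceByKey` is a hypothetical helper over a plain sorted array, using `assumeSorted`, `lowerBound` and `upperBound` from Phobos' `std.range`. The caller names the first and last keys it wants and never needs to know the key that comes after the slice; internally the result is still an ordinary half-open slice.

```d
import std.range : assumeSorted;

// Hypothetical helper (not MinTL): all keys k with lo <= k <= hi.
// Assumes sortedKeys is sorted in ascending order and lo <= hi.
int[] sliceByKey(int[] sortedKeys, int lo, int hi)
{
    assert(lo <= hi, "the key range must not be reversed");
    auto s = assumeSorted(sortedKeys);

    // Number of keys strictly below lo: index of the first key to keep.
    size_t start = s.lowerBound(lo).length;

    // Total minus the keys strictly above hi: index just past the last key to keep.
    size_t end = sortedKeys.length - s.upperBound(hi).length;

    return sortedKeys[start .. end];
}

void main()
{
    int[] keys = [2, 3, 5, 8, 13, 21];

    assert(sliceByKey(keys, 3, 13) == [3, 5, 8, 13]); // both endpoints included
    assert(sliceByKey(keys, 4, 9) == [5, 8]);         // endpoints need not be keys
    assert(sliceByKey(keys, 14, 20).length == 0);     // nothing in range
}
```

With integer keys one could pass `hi + 1` to a half-open interface instead, but for key types without an obvious successor (strings, for example) the inclusive form is the natural one to offer.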
David Medlock <amedlock nospam.org> writes:
```Ben Hinkle wrote:
I see where you are coming from - Fortran and MATLAB both include the
endpoint in slices (and they both use 1-based indexing instead of
0-based). Non-programmers tend to like that more than C-style. One
area where including the endpoint makes sense is with custom
containers like a sorted associative array - why should a slice of
such an array need to know the key for the element after the desired
slice? Similarly in a linked list the slice from one node to another
should probably include the endpoint. The difference becomes important
when items are added to the list - do they go into the slice or after
the slice? In MinTL slicing by integers will exclude the endpoint and
slicing by key or node will include the endpoint.

I think people in this newsgroup are a bit worn out right now, though,
so I don't expect this topic to get much debate.

-Ben

I like the (often downtrodden) Pascal language a lot, because it allows
you to set the range of your array.

I found some old Delphi Code I wrote like 2 years ago and It was