D - comments on m..n array index syntax. make it m through n inclusive


Chris Friesen <cfriesen nortelnetworks.com> writes:

On the whole, it looks pretty good.  I have already given my thoughts on generic
programming, but I wanted to make a comment on your array index range notation.


Quoted from your document:
   In general, (a[n..m] op e) is defined as: 

        for (i = n; i < m; i++)
            a[i] op e;


        s[] = t[];              the 3 elements of t[3] are copied into s[3]
        s[1..2] = t[0..1];      same as s[1] = t[0]
        s[0..2] = t[1..3];      same as s[0] = t[1], s[1] = t[2]
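
As a rough illustration of the quoted rule, here is a minimal sketch in present-day D syntax. The helper name `fillRange` and the array contents are made up for this example; it only shows that the loop form and the built-in slice cover the same half-open range of indices.

```d
import std.stdio;

// Hypothetical helper: spells out "a[n..m] = e" as the loop quoted above.
void fillRange(int[] a, size_t n, size_t m, int e)
{
    for (size_t i = n; i < m; i++)
        a[i] = e;
}

void main()
{
    int[] viaLoop  = [0, 0, 0, 0];
    int[] viaSlice = [0, 0, 0, 0];

    fillRange(viaLoop, 1, 3, 7);   // touches indices 1 and 2, never 3
    viaSlice[1 .. 3] = 7;          // built-in slice assignment over the same range

    assert(viaLoop == [0, 7, 7, 0]);
    assert(viaLoop == viaSlice);

    int[3] s;                      // elements start out as 0
    int[3] t = [10, 20, 30];
    s[0 .. 2] = t[1 .. 3];         // s[0] = t[1], s[1] = t[2]
    assert(s == [20, 30, 0]);
    writeln(s);
}
```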


While I can see how this came from C/C++, I think it's very confusing.  I think
it would make a whole lot more sense to read the [m..n] notation as being the
range of indices which are covered.  This would then be identical behaviour to
math programs such as maple.  Plus, it has the added advantage of being
syntactically similar to accessing a single array element.

Thus,

a[1] = b[1];       obvious
a[1..3] = b[1..3];      same as a[1]=b[1], a[2]=b[2], a[3]=b[3]
a[1..3] = b[6..8];      same as a[1]=b[6], a[2]=b[7], a[3]=b[8]

Translating to English, the m..n notation converts to "take elements m through
n", which I think makes a lot more sense than "take elements m through n-1".
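
The inclusive reading proposed here can be emulated on top of the end-exclusive slice. The sketch below assumes a hypothetical helper named `inclusive`, which is not part of D, and simply adds one to the upper bound.

```d
// Hypothetical helper, not part of D: the elements a[m] through a[n], both included.
T[] inclusive(T)(T[] a, size_t m, size_t n)
{
    return a[m .. n + 1];   // the built-in slice excludes its upper bound
}

void main()
{
    int[] a = new int[10];
    int[] b = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90];

    int[] dst = a.inclusive(1, 3);
    int[] src = b.inclusive(6, 8);
    dst[] = src[];          // a[1]=b[6], a[2]=b[7], a[3]=b[8]

    assert(a[1] == 60 && a[2] == 70 && a[3] == 80);
    assert(a[0] == 0 && a[4] == 0);
}
```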

As a final piece of syntactical sugar, what about something like

a[1,4,7] = b[3,2,8];   same as a[1]=b[3], a[4]=b[2], a[7]=b[8]

where you specify a list of indices to copy?
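
Without dedicated syntax, the index-list copy could be approximated with an ordinary function. This is only a sketch: `copyAt` is a hypothetical name, and the two index lists are assumed to have equal length.

```d
// Hypothetical helper: performs dst[dstIdx[k]] = src[srcIdx[k]] for each k.
void copyAt(int[] dst, const size_t[] dstIdx, const int[] src, const size_t[] srcIdx)
{
    assert(dstIdx.length == srcIdx.length, "index lists must match in length");
    foreach (k; 0 .. dstIdx.length)
        dst[dstIdx[k]] = src[srcIdx[k]];
}

void main()
{
    int[] a = new int[8];
    int[] b = [0, 10, 20, 30, 40, 50, 60, 70, 80];

    // Stands in for the proposed a[1,4,7] = b[3,2,8];
    copyAt(a, [1, 4, 7], b, [3, 2, 8]);

    assert(a[1] == 30 && a[4] == 20 && a[7] == 80);
}
```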


Chris





-- 
Chris Friesen                    | MailStop: 043/33/F10  
Nortel Networks                  | work: (613) 765-0557
3500 Carling Avenue              | fax:  (613) 765-2986
Nepean, ON K2H 8E9 Canada        | email: cfriesen nortelnetworks.com

Aug 16 2001

"Walter" <walter digitalmars.com> writes:

"Chris Friesen" <cfriesen nortelnetworks.com> wrote in message
news:3B7C20B2.E2B8595F nortelnetworks.com...
 On the whole, it looks pretty good.  I have already given my thoughts on
 generic programming, but I wanted to make a comment on your array index
 range notation.

 Quoted from your document:
    In general, (a[n..m] op e) is defined as:
         for (i = n; i < m; i++)
             a[i] op e;
         s[] = t[];              the 3 elements of t[3] are copied into s[3]
         s[1..2] = t[0..1];      same as s[1] = t[0]
         s[0..2] = t[1..3];      same as s[0] = t[1], s[1] = t[2]

 While I can see how this came from C/C++, I think it's very confusing.  I
 think it would make a whole lot more sense to read the [m..n] notation as
 being the range of indices which are covered.  This would then be identical
 behaviour to math programs such as maple.  Plus, it has the added advantage
 of being syntactically similar to accessing a single array element.

 Thus,

 a[1] = b[1];       obvious
 a[1..3] = b[1..3];      same as a[1]=b[1], a[2]=b[2], a[3]=b[3]
 a[1..3] = b[6..8];      same as a[1]=b[6], a[2]=b[7], a[3]=b[8]

 Translating to English, the m..n notation converts to "take elements m
 through n", which I think makes a lot more sense than "take elements m
 through n-1".

That's a good point. But I am so used to writing loops that go from n to
m-1, that diverging from that will cause a lot of inadvertent bugs.


 As a final piece of syntactical sugar, what about something like
 a[1,4,7] = b[3,2,8];   same as a[1]=b[3], a[4]=b[2], a[7]=b[8]
 where you specify a list of indices to copy?

That does look pretty neat, but are there enough uses of this to justify the
feature?

Aug 16 2001

Chris Friesen <chris_friesen sympatico.ca> writes:

Walter wrote:
 "Chris Friesen" <cfriesen nortelnetworks.com> wrote in message

 
 Translating to English, the m..n notation converts to "take elements m
 through n", which I think makes a lot more sense than "take elements m
 through n-1".
 
 That's a good point. But I am so used to writing loops that go from n to
 m-1, that diverging from that will cause a lot of inadvertent bugs.

Sure, but then you just write
a[m..n-1] = b[m..n-1]

Doesn't that make more sense than writing

a[0..n] = b[0..n]

when you only have n elements to begin with?  Since it's a whole new syntax
anyway, I would like to make it something logical and obvious to a new user.
Thinking back to my old programming days when loops were "for i = 1 to 10 do"...

I think that having it obvious in the statement what the range of values is will
end up being clearer in the end.  I think the concept of ranges would be useful
in switch statements as well, but I'll address that in another thread.

 As a final piece of syntactical sugar, what about something like
 a[1,4,7] = b[3,2,8];   same as a[1]=b[3], a[4]=b[2], a[7]=b[8]
 where you specify a list of indices to copy?

 
 That does work pretty neat, but are there enough uses of this to justify the
 feature?

I kind of doubt it.  Like I said, syntactical sugar.

Chris

Aug 16 2001

"Sheldon Simms" <sheldon semanticedge.com> writes:

In article <9lhqng$d9n$1 digitaldaemon.com>, "Walter"
<walter digitalmars.com> wrote:

 "Chris Friesen" <cfriesen nortelnetworks.com> wrote in message
 news:3B7C20B2.E2B8595F nortelnetworks.com...
 On the whole, it looks pretty good.  I have already given my thoughts
 on generic programming, but I wanted to make a comment on your array
 index range notation.

 Quoted from your document:
    In general, (a[n..m] op e) is defined as:
         for (i = n; i < m; i++)
             a[i] op e;
         s[] = t[];              the 3 elements of t[3] are copied into s[3]
         s[1..2] = t[0..1];      same as s[1] = t[0]
         s[0..2] = t[1..3];      same as s[0] = t[1], s[1] = t[2]

 While I can see how this came from C/C++, I think it's very confusing.
 I think it would make a whole lot more sense to read the [m..n] notation
 as being the range of indices which are covered.  This would then be
 identical behaviour to math programs such as maple.  Plus, it has the
 added advantage of being syntactically similar to accessing a single
 array element.

 Thus,

 a[1] = b[1];       obvious
 a[1..3] = b[1..3];      same as a[1]=b[1], a[2]=b[2], a[3]=b[3]
 a[1..3] = b[6..8];      same as a[1]=b[6], a[2]=b[7], a[3]=b[8]

 Translating to English, the m..n notation converts to "take elements m
 through n", which I think makes a lot more sense than "take elements m
 through n-1".
 
 That's a good point. But I am so used to writing loops that go from n to
 m-1, that diverging from that will cause a lot of inadvertent bugs.

I'm very used to writing loops like that too, but this notation in the
D document really confused me at first. I think it's completely
counterintuitive and agree with Chris 100%.

-- 
Sheldon Simms / sheldon semanticedge.com

Aug 17 2001

Christophe de Dinechin <descubes earthlink.net> writes:

Yes, this is counterintuitive. But it should not be in the language in the
first place. The definition of something like indexing should be in the library.

What if I want range-checked indexes? What if I don't want them? What if I want
a stride (that is, elements are A[0], A[4], A[8], but A[1] and A[0] are the
same)? What if I want a different base (indexes in range 1..100 rather than
0..99)?
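
To make the library-side idea concrete, here is a minimal sketch of a user-defined wrapper with a configurable index base and an explicit range check. The type name `Based` and its fields are hypothetical and chosen for this example only; stride and unchecked variants would follow the same pattern.

```d
// Hypothetical library type: an array view whose first valid index is `base`.
struct Based(T)
{
    T[] data;
    size_t base;

    // Range-checked element access, returned by reference so it can be assigned to.
    ref T opIndex(size_t i)
    {
        assert(i >= base && i - base < data.length, "index out of range");
        return data[i - base];
    }
}

void main()
{
    auto v = Based!int([10, 20, 30], 1);   // valid indexes are 1..3 inclusive

    assert(v[1] == 10);
    assert(v[3] == 30);

    v[2] = 99;                             // writes through the returned reference
    assert(v.data[1] == 99);
}
```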


Chris Friesen wrote:

 On the whole, it looks pretty good.  I have already given my thoughts on
 generic programming, but I wanted to make a comment on your array index range
 notation.

 Quoted from your document:
    In general, (a[n..m] op e) is defined as:

         for (i = n; i < m; i++)
             a[i] op e;

         s[] = t[];              the 3 elements of t[3] are copied into s[3]
         s[1..2] = t[0..1];      same as s[1] = t[0]
         s[0..2] = t[1..3];      same as s[0] = t[1], s[1] = t[2]

 While I can see how this came from C/C++, I think it's very confusing.  I think
 it would make a whole lot more sense to read the [m..n] notation as being the
 range of indices which are covered.  This would then be identical behaviour to
 math programs such as maple.  Plus, it has the added advantage of being
 syntactically similar to accessing a single array element.

I agree there. This notation is confusing in the extreme. If you want to have
this behavior, use the mathematical notation:

    s[1..2[ = t[0..1[

For mathematicians, this means 1 to 2, 2 excluded. I can see that this might be
difficult to parse, though...


 As a final piece of syntactical sugar, what about something like

 a[1,4,7] = b[3,2,8];   same as a[1]=b[3], a[4]=b[2], a[7]=b[8]

This is problematic in the presence of multi-dimensional arrays.

Aug 17 2001

reiter nomadics.com (Mac Reiter) writes:

(First, pardon me if the array slicing syntax debate is over.  I just
found out about D a few days ago, and just started looking at the spec
seriously today.)

On Thu, 16 Aug 2001 15:36:18 -0400, Chris Friesen
<cfriesen nortelnetworks.com> wrote:

Quoted from your document:
   In general, (a[n..m] op e) is defined as: 

        for (i = n; i < m; i++)
            a[i] op e;


        s[] = t[];              the 3 elements of t[3] are copied into s[3]
        s[1..2] = t[0..1];      same as s[1] = t[0]
        s[0..2] = t[1..3];      same as s[0] = t[1], s[1] = t[2]


While I can see how this came from C/C++, I think it's very confusing.  I think

Just to throw another vote in here - when I first read the description
of slicing in the D spec, I assumed it was a typo.  I then read a
later line where a slice was described as:

	int a[10];
	int b[];

	b = a;
	b = a[];
	b = a[0 .. a.length];

This explains WHY the syntax is the way it is, but I must strenuously
agree that it does not justify it.  Since "off by one" errors are
second only to pointer handling errors in programming, any new syntax
should be very clear and intuitive in its use.
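
For readers following along, a small sketch of what the spec excerpt implies, written in present-day D declaration syntax (`int[10] a` rather than `int a[10]`). The helper `sliceLength` is a hypothetical name; the point is that a half-open slice `a[lo .. hi]` holds exactly `hi - lo` elements, which is why `a[0 .. a.length]` is the whole array.

```d
// Hypothetical helper: number of elements in the half-open slice a[lo .. hi].
size_t sliceLength(const int[] a, size_t lo, size_t hi)
{
    return a[lo .. hi].length;
}

void main()
{
    int[10] a;
    int[] b;

    b = a[];
    assert(b.length == 10);

    b = a[0 .. a.length];          // same ten elements, bounds written out
    assert(b is a[]);              // same pointer and same length

    assert(sliceLength(a[], 3, 7) == 7 - 3);
    assert(sliceLength(a[], 5, 5) == 0);   // equal bounds give an empty slice
}
```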

I would also agree that some form of exclusive bound would be
acceptable, though hard to parse:

		b = a[0 .. a.length-1];
replaced by:
		b = a[0 .. a.length);

If the currently described exclusive ending bound remains in D, I
would simply have to remove the slicing syntax from my set of tools,
because I would always get it wrong -- I've switched from Basic to
C/C++ enough times to know that much.

Mac Reiter

Jan 11 2002

"Pavel Minayev" <evilone omen.ru> writes:

"Mac Reiter" <reiter nomadics.com> wrote in message
news:3c3f702f.27813613 news.digitalmars.com...

 This explains WHY the syntax is the way it is, but I must strenuously
 agree that it does not justify it.  Since "off by one" errors are
 second only to pointer handling errors in programming, any new syntax
 should be very clear and intuitive in its use.

...
 If the currently described exclusive ending bound remains in D, I
 would simply have to remove the slicing syntax from my set of tools,
 because I would always get it wrong -- I've switched from Basic to
 C/C++ enough times to know that much.

I thought the same when I argued on the topic.
Now, after using it for a while, I have to agree with Walter that the
end-exclusive form is what you need in 90% of cases. It's not as
counter-intuitive as one might think; in fact, I haven't made
any mistakes with this syntax so far! Just try to write something
using slices heavily and you'll see for yourself....

Jan 11 2002

"Walter" <walter digitalmars.com> writes:

"Pavel Minayev" <evilone omen.ru> wrote in message
news:a1o00d$31ed$1 digitaldaemon.com...
 "Mac Reiter" <reiter nomadics.com> wrote in message
 news:3c3f702f.27813613 news.digitalmars.com...

 This explains WHY the syntax is the way it is, but I must strenuously
 agree that it does not justify it.  Since "off by one" errors are
 second only to pointer handling errors in programming, any new syntax
 should be very clear and intuitive in its use.

 ...
 If the currently described exclusive ending bound remains in D, I
 would simply have to remove the slicing syntax from my set of tools,
 because I would always get it wrong -- I've switched from Basic to
 C/C++ enough times to know that much.

 I thought the same when I argued on the topic.
 Now, after I used it for a while, I have to agree with Walter that
 end-exclusive form is what you need in 90% cases. It's not so
 counter-intuitive as one might think, in fact, I didn't yet make
 any mistakes with this syntax so far! Just try to write something
 using slices heavily and you'll see it for yourself....

Look at the string.d code for examples!

Jan 11 2002

reiter nomadics.com (Mac Reiter) writes:

On Sat, 12 Jan 2002 03:30:04 +0300, "Pavel Minayev" <evilone omen.ru>
wrote:

I apologize up front for the length of this posting.  Unfortunately, I
do not have the time necessary to edit it down while maintaining the
points I am trying to make.

"Mac Reiter" <reiter nomadics.com> wrote in message
news:3c3f702f.27813613 news.digitalmars.com...

 This explains WHY the syntax is the way it is, but I must strenuously
 agree that it does not justify it.  Since "off by one" errors are
 second only to pointer handling errors in programming, any new syntax
 should be very clear and intuitive in its use.

...
 If the currently described exclusive ending bound remains in D, I
 would simply have to remove the slicing syntax from my set of tools,
 because I would always get it wrong -- I've switched from Basic to
 C/C++ enough times to know that much.

I thought the same when I argued on the topic.
Now, after I used it for a while, I have to agree with Walter that
end-exclusive form is what you need in 90% cases. It's not so
counter-intuitive as one might think, in fact, I didn't yet make
any mistakes with this syntax so far! Just try to write something
using slices heavily and you'll see it for yourself....

The mere thought causes me to wake up at night in a cold sweat from
maintenance nightmares.  

How many times do C programmers blow up stacks and heaps because they
forget to allocate enough space for the NUL terminator at the end of a C string?

How many programmers are going to assume they need the -1 on the final
bound and end up one element short all the time?

How many programmers are going to think they copied the entire array
and wonder why their code explodes or throws an exception when they
try to access that last element?

Any decision will work for people who program exclusively in the given
language.  Java people got used to January being 0 and December being
11, eventually.  But a lot of programs got bad dates and lots of
exceptions thrown regarding December, too.  Experienced C programmers
don't have problems remembering that scanf needs the address for all
variable types EXCEPT strings (char arrays), but *EVERY* new and some
intermediate C programmers have blown up programs because of it.

This form saves typing "-1" 90% of the time.  But it generates a giant
blind spot when you have an off-by-one error the other 10% of the
time, because when you're trying to do the code review you look at it
and it *looks* like it does the right thing, but in reality it is
leaving the last element off.  Code should do what it says.
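
The blind spot described here can be shown in a few lines. This is a sketch with hypothetical helper names (`allButLast`, `whole`): under the end-exclusive rule, the version that looks careful because of its "-1" is the one that silently drops an element.

```d
// Looks like "first through last", but under end-exclusive rules drops the last element.
int[] allButLast(int[] a)
{
    assert(a.length > 0, "needs at least one element");
    return a[0 .. a.length - 1];
}

// The form that actually covers every element.
int[] whole(int[] a)
{
    return a[0 .. a.length];
}

void main()
{
    int[] a = [1, 2, 3, 4];

    assert(whole(a) == [1, 2, 3, 4]);
    assert(allButLast(a) == [1, 2, 3]);   // one element short
}
```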

I don't mind an end-exclusive form.  I just don't think it should use
the end-INCLUSIVE syntax.  Most of us had some math thrown at us along
the way, and we know that [] includes both endpoints and [) does not
include the last endpoint.  If nothing else, seeing the ) at the end
of the range will make you stop and think about what you are looking
at.

	a[5..7] should be a[5], a[6], and a[7]
	a[5..7) should be a[5] and a[6]

If your newsreader font is really small, the second line used a
closing parenthesis instead of a closing square bracket.

This might be difficult to parse, but it really shouldn't be.  I would
expect some kind of "grouping stack" that keeps track of the most
recent outstanding opening symbol.  If that is the case, all you have
to do is accept a closing parenthesis as a valid match to an opening
square bracket.
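
A toy version of that "grouping stack" idea might look like the sketch below. It is not how any D compiler works; `balanced` is a hypothetical function that only checks delimiters, with the single relaxation that ')' is accepted as a match for '['.

```d
// Returns true when every opener is closed in order.
// ')' may close '[' (the proposed half-open form); ']' may not close '('.
bool balanced(string src)
{
    char[] stack;
    foreach (char c; src)
    {
        if (c == '[' || c == '(')
        {
            stack ~= c;                       // push the opener
        }
        else if (c == ']' || c == ')')
        {
            if (stack.length == 0)
                return false;                 // closer with nothing open
            char open = stack[$ - 1];
            stack = stack[0 .. $ - 1];        // pop
            bool ok = (open == '[') || (open == '(' && c == ')');
            if (!ok)
                return false;
        }
    }
    return stack.length == 0;
}

void main()
{
    assert(balanced("a[5..7]"));
    assert(balanced("a[5..7)"));      // accepted under the relaxed rule
    assert(!balanced("f(a]"));        // ']' cannot close '('
    assert(!balanced("a[5..7"));      // still open at the end
}
```

One consequence visible even in this toy: once ')' can close '[', a mistyped closer is no longer caught by the matcher, which is the typo risk raised later in the thread.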

Even conversion of existing code to the new format *shouldn't* be too
hard (says the non-compiler-writer, possibly with no ground to stand
on).  Make an intermediate version of the compiler that accepts either
form, and treats both of them as end-exclusive.  Have that compiler
dump out a new file with the closing ] converted to a ).  Users can
use this compiler to convert code files and prepare for the new
version of the compiler that supports end-exclusive AND end-inclusive.
A conversion like this needs to be done as early as possible, because
if you think it will be hard now, imagine how hard it will be as more
and more code accumulates.

Alternatively, some kind of #pragma-like device could be used, but
then you have to choose which behavior is default, and code reviewers
have to check for #pragmas to understand the code they are reading,
and it all just gets nasty.

Ultimately, since D is Mr. Bright's language, it will do whatever he
wants it to do.  I do know that I have had to do a LOT of maintenance
programming and code reviews of other people's code, and that I rarely
get the opportunity to work exclusively in one language for extended
periods of time.  My comments about blind spots and the principle of
least astonishment -- A system and its commands should behave the way
most people would predict, that is, the system should operate with
"least astonishment" -- come from experience and practice.  I realize
that Mr. Bright also has tremendous experience, having viewed his
substantial list of commercial programming successes.  But if []
remains end-exclusive, and if our company eventually started using it
for production work, the style policy would have to require that array
slicing not be used unless no alternative was available, and require
specific boilerplate commentary when it was used.  This would be
necessary to avoid astonishing new programmers and/or reviewers who
came across the code.  I already do similar things when I need to use
<= in a for loop instead of <, especially if it is a nested loop and
one loop uses < but the other uses <=.  That looks like a bug, so I
comment it to explain why it is that way.  Every use of array slicing
looks like a bug to me, so every use would require a comment
explaining its purpose.

Again, I apologize for the length of this posting.
Mac

Jan 14 2002

"Pavel Minayev" <evilone omen.ru> writes:

"Mac Reiter" <reiter nomadics.com> wrote in message
news:3c42f52d.258466765 news.digitalmars.com...

 How many programmers are going to assume they need the -1 on the final
 bound and end up one element short all the time?

These were my words... earlier.

 This form saves typing "-1" 90% of the time.  But it generates a giant
 blind spot when you have an off-by-one error the other 10% of the
 time, because when you're trying to do the code review you look at it
 and it *looks* like it does the right thing, but in reality it is
 leaving the last element off.  Code should do what it says.

My reply is simple: RTFM first. Always!
On the other hand, Walter should probably have written it in red and
bold, all capitals: "array slices are end-exclusive!", at the beginning
of the reference =)

Jan 14 2002

Roland <rv ronetech.com> writes:

Mac Reiter wrote:

                 b = a[0 .. a.length);

Sorry, but on a French keyboard ']' and ')' are on the same key; the first
is just typed with AltGr. Dangerous, isn't it?

Me, I still think that even if the inclusive-exclusive form is more usable than it
seems, it is hard to sell.
I like the mathematical [a..b[ form, but I understand the parser doesn't.
For newcomers: this topic was discussed in the "arrays slicing range" thread.

Roland

Jan 16 2002

"Pavel Minayev" <evilone omen.ru> writes:

"Roland" <rv ronetech.com> wrote in message
news:3C45A4FC.128C2C8F ronetech.com...

 I like the mathematical [a..b[ form, but i understand the parser don't.

I always thought (a..b] and [a..b) are mathematical forms, aren't they?

Jan 16 2002

Roland <rv ronetech.com> writes:

Pavel Minayev wrote:

 "Roland" <rv ronetech.com> wrote in message
 news:3C45A4FC.128C2C8F ronetech.com...

 I like the mathematical [a..b[ form, but i understand the parser don't.

 I always thought (a..b] and [a..b) are mathematical forms, aren't they?

Not as I was taught math.
In fact I wouldn't care about the notation style if ')' and ']' were not so close on my
keyboard (same key).

Roland

Jan 17 2002

DrWhat? <DrWhat nospam.madscientist.co.uk> writes:

Mac Reiter wrote:

[snip]

 I would also agree that some form of exclusive bound would be
 acceptable, though hard to parse:
 
 b = a[0 .. a.length-1];
 replaced by:
 b = a[0 .. a.length);

That is rather unpleasant (and likely to produce typos); however, that
is commonly what programmers want. Would my solution be
better ...

b = a[0 .. a.last]

where a.last == a.length-1
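
A property like this can be sketched as a free function and called with D's uniform function call syntax. The name `last` is taken from the suggestion above and is not a built-in array property. Note that with the end-exclusive slice the whole array is `a[0 .. a.last + 1]`, not `a[0 .. a.last]`.

```d
// Hypothetical helper mirroring the suggested a.last: index of the final element.
size_t last(T)(const T[] a)
{
    assert(a.length > 0, "empty array has no last index");
    return a.length - 1;
}

void main()
{
    int[] a = [10, 20, 30];

    assert(a.last == 2);
    assert(a[a.last] == 30);
    assert(a[$ - 1] == 30);              // built-in `$` is the array length inside []

    int[] b = a[0 .. a.last + 1];        // end-exclusive: +1 to include the last element
    assert(b == a);
    assert(a[0 .. a.last] == [10, 20]);  // without the +1 the last element is dropped
}
```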

 If the currently described exclusive ending bound remains in D, I
 would simply have to remove the slicing syntax from my set of tools,
 because I would always get it wrong -- I've switched from Basic to
 C/C++ enough times to know that much.

I can imagine that could give errors,  and D (IMHO) would be better
without such gotchas.

 Mac Reiter

Feb 14 2002

"Pavel Minayev" <evilone omen.ru> writes:

"DrWhat?" <DrWhat nospam.madscientist.co.uk> wrote in message
news:a4hnfh$109f$1 digitaldaemon.com...

 that is rather unpleasant (and likely to produce typos), however that
 is commonly what programmers what, whould may solution be
 better ...

 b = a[0 .. a.last]

 where a.last == a.length-1

Practice shows that the form [0 .. a.length] (end-exclusive) is more
practically convenient than end-inclusive one. Otherwise, this is
a matter of taste.
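
One common case where the end-exclusive form is convenient is splitting an array at an index: the same value serves as the end of one piece and the start of the next, with no "+1" or "-1". A minimal sketch, using a hypothetical helper `splitAt`:

```d
// Hypothetical helper: splits a into a[0 .. k] and a[k .. a.length].
int[][2] splitAt(int[] a, size_t k)
{
    return [a[0 .. k], a[k .. a.length]];
}

void main()
{
    int[] a = [1, 2, 3, 4, 5];

    auto parts = splitAt(a, 2);
    assert(parts[0] == [1, 2]);
    assert(parts[1] == [3, 4, 5]);

    // The pieces neither overlap nor leave a gap.
    assert((parts[0] ~ parts[1]) == a);
    assert(parts[0].length + parts[1].length == a.length);

    // Splitting at 0 gives an empty first piece without special-casing.
    assert(splitAt(a, 0)[0].length == 0);
}
```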

 I can imagine that could give errors,  and D (IMHO) would be better
 without such gotchas.

Too late too late =)

Feb 14 2002
