digitalmars.D - Question about Slicing - possible bug

reply "TechnoZeus" <TechnoZeus PeoplePC.com> writes:
Okay,
if this is a bug then probably just about everyone here knows about it.
That's why I haven't said anything before...
but it looks like a bug to me,
so I have to ask.

Is a[0..4] supposed to mean elements number 0 through 4 of the array named "a",
as I would intuitively expect it to,
or is it supposed to mean elements number 0 through 3 of the array named "a",
which is what it acts like in the following two tests...

unittest
{
 char[] a;
 a="abcdefg";
 a=a[0..2]~a[4..6];
 writef("[%s]",a);
 assert(a=="abcefg");
}

//_____________________

unittest
{
 char[] a;
 a="abcdefg";
 a=a[0..3]~a[4..7];
 writef("[%s]",a);
 assert(a=="abcefg");
}

Anyone who knows for sure what was "intended", please tell me...
because I want to use slicing in my project, but if it's broken,
I would rather wait until it gets fixed.

By the way,
this is the case with other types of arrays that I have tested also,
not just with arrays of type char.

TZ
Apr 20 2005
next sibling parent reply xs0 <xs0 xs0.com> writes:
TechnoZeus wrote:
 Okay,
 if this is a bug then probably just about everyone here knows about it.
 That's why I haven't said anything before...
 but it looks like a bug to me,
 so I have to ask.
 
 Is a[0..4] supposed to mean elements number 0 through 4 of the array named "a",
 as I would intuitively expect it to,
 or is it supposed to mean elements number 0 through 3 of the array named "a",
 which is what it acts like in the following two tests...

0 through 3. I guess the point is to allow slices of length 0.

xs0
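As a rough illustration of that answer, the sketch below checks which elements a[0..4] covers and that equal bounds yield an empty slice. It assumes a current D2 compiler, where string literals are immutable, so `string` stands in for the `char[]` used in the 2005 tests; something like `dmd -unittest -main -run slices.d` (file name hypothetical) should run it.

```d
unittest
{
    // Assumption: D2, so `string` replaces the thread's char[].
    string a = "abcdefg";

    // a[0..4] holds indices 0, 1, 2 and 3; index 4 ('e') is excluded.
    assert(a[0 .. 4] == "abcd");
    assert(a[0 .. 4].length == 4);

    // Equal bounds give an empty slice, which an inclusive upper
    // bound could not express.
    assert(a[3 .. 3].length == 0);
}
```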
Apr 20 2005
next sibling parent "TechnoZeus" <TechnoZeus PeoplePC.com> writes:
Ah, okay... so I guess it could be considered 0 "to" 4, in a sense.
Not intuitive at all, in my opinion..
but learnable, and useable.
I hope this one is carefully documented,
because it could lead to a lot of confusion otherwise,
and possibly to inconsistent implementations in the long run
(like the BASIC "for" statement, which allows 0 iterations on some
implementations,
but not on others.)

I think a more sensible solution would have been to include a length-based
slicing syntax,
such as for example a[0:10] meaning 10 elements of a[] starting at element
number 0.
To me, this would make much more sense for a way to allow 0 length strings in a
format like
a[x:0]
where x is any integer.
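The length-based form can be approximated on top of the existing syntax. A minimal sketch, assuming a D2 compiler; `sliceLen` is a hypothetical helper name, not part of D or Phobos, and the a[start:count] notation remains only the proposal described above:

```d
// Hypothetical helper: `count` elements starting at index `start`,
// mirroring the proposed a[start:count].
T[] sliceLen(T)(T[] a, size_t start, size_t count)
{
    return a[start .. start + count];
}

unittest
{
    string a = "abcdefghijkl";
    assert(sliceLen(a, 0, 10) == "abcdefghij"); // a[0:10] in the proposed notation
    assert(sliceLen(a, 4, 0).length == 0);      // a[x:0] is empty
}
```

The result is still an ordinary slice into the same array, so nothing is copied.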

Of course, it would then be up to the implementation whether to allow
a[4:-5]
as the reverse of a[0:5]
or consider it an error.
Personally, I would think it nice to be able to slice an array as easily
forward as backward.
Is that possible under the current implementation?

TZ
Apr 20 2005
prev sibling parent reply Stewart Gordon <smjg_1998 yahoo.com> writes:
xs0 wrote:
 TechnoZeus wrote:

 Is a[0..4] supposed to mean elements number 0 through 4 of the 
 array named "a", as I would intuitively expect it to, or is it 
 supposed to mean elements number 0 through 3 of the array named 
 "a", which is what it acts like in the following two tests...

0 through 3, I guess the point is to allow slices of length 0..

The point is to facilitate efficient operations without fencepost errors. A useful way to think of it is that the numbers refer to the intervals between elements, rather than the elements themselves.

Stewart.

-- 
My e-mail is valid but not my primary mailbox. Please keep replies on the 'group where everyone may benefit.
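One way to picture the "intervals between elements" view is to treat the bounds as cut points. The sketch below assumes a D2 compiler (hence `string` rather than `char[]`) and only checks properties that follow from half-open bounds:

```d
unittest
{
    string a = "abcdefg";

    // Bounds as cut points between elements:
    //   | a | b | c | d | e | f | g |
    //   0   1   2   3   4   5   6   7
    // The length of a[i..j] is j - i, with no +1 or -1 adjustment.
    assert(a[2 .. 5] == "cde");
    assert(a[2 .. 5].length == 5 - 2);

    // Splitting at any cut point k and rejoining restores the array.
    foreach (k; 0 .. a.length + 1)
        assert(a[0 .. k] ~ a[k .. $] == a);
}
```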
Apr 21 2005
parent reply "TechnoZeus" <TechnoZeus PeoplePC.com> writes:
Then what interval would b[4..2] refer to?
Would that be b[3]~b[2] or would it be b[4]~b[3] ?

TZ
Apr 21 2005
parent reply Stewart Gordon <smjg_1998 yahoo.com> writes:
TechnoZeus wrote:
 Then what interval would b[4..2] refer to?
 Would that be b[3]~b[2] or would it be b[4]~b[3] ?

At the moment, b[4..2] is not a valid slice. But there are various things that it could in theory be defined to mean:

(a) the same as b[2..4], in accordance with .substring in JS (at least as IE, Mozilla and Safari all implement it - not sure if this is standard though)
(b) [b[3], b[2]], with either the view that the slice ends are intervals between elements or that the higher end is excluded
(c) [b[4], b[3]], with the view that the right-hand end is excluded
(d) b[4..$] ~ b[0..2], taking slices to wrap around (but then would b[3..3] be an empty array or b[3..$] ~ b[0..3]?)

Any of these implementations would slow D programs down, as the slice would need to be classified at runtime even in release mode. And except for (a), they would no longer be slices into the array.

Stewart.

-- 
My e-mail is valid but not my primary mailbox. Please keep replies on the 'group where everyone may benefit.
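To make the last point concrete, here is a sketch of what option (b) would amount to in user code, assuming a D2 compiler and Phobos. `reversedCopy` is a hypothetical helper name; it has to allocate, so its result no longer aliases the original array the way an ordinary slice does.

```d
import std.algorithm.mutation : reverse;

// Hypothetical helper for option (b): the elements between cut points
// lo and hi, in reverse order, returned as a fresh copy.
T[] reversedCopy(T)(T[] b, size_t lo, size_t hi)
{
    auto r = b[lo .. hi].dup;   // .dup allocates a separate copy
    reverse(r);                 // reverses the copy in place
    return r;
}

unittest
{
    int[] b = [10, 11, 12, 13, 14, 15];
    auto r = reversedCopy(b, 2, 4);
    assert(r == [13, 12]);      // [b[3], b[2]]

    r[0] = 99;
    assert(b[3] == 13);         // writing to the copy leaves b untouched

    b[2 .. 4][0] = 77;
    assert(b[2] == 77);         // an ordinary slice aliases the array
}
```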
Apr 21 2005
parent reply "TechnoZeus" <TechnoZeus PeoplePC.com> writes:
Hmmm... Now you have me curious. Why couldn't a reverse direction slice be a slice into the array? It would seem to me that there should be nothing preventing it.

TZ
Apr 22 2005
parent reply Stewart Gordon <smjg_1998 yahoo.com> writes:
TechnoZeus wrote:
<snip>
 Hmmm... Now you have me curious.  Why couldn't a reverse direction 
 slice be a slice into the array?  It would seem to me that there 
 should be nothing preventing it.

1. It would require an extra piece of information to be stored about each array: the direction in which it goes in memory. Which would mean a change in the ABI - either add a direction field (increasing memory requirements), or make length an int (instead of uint) with the sign denoting direction (decreasing the maximum length of an array).
2. Checking the direction might slow programs down a little.
3. Code that extracts pointers from arrays (either to mark a point in the array or to interface with an external API) will break if the array is backwards in memory.

It could be done with Norbert's long-standing MDA proposal, but wouldn't make a practical addition to plain linear arrays.

Stewart.

-- 
My e-mail is valid but not my primary mailbox. Please keep replies on the 'group where everyone may benefit.
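The layout behind these points can be observed through the `.ptr` and `.length` properties. A small sketch, assuming a D2 compiler; it shows only that a slice is a start pointer plus a length running forward in memory, which is what pointer-extracting code relies on:

```d
unittest
{
    int[] a = [0, 1, 2, 3, 4, 5];
    int[] s = a[2 .. 5];

    // The slice starts 2 elements into a's memory and runs forward.
    assert(s.ptr == a.ptr + 2);
    assert(s.length == 3);

    // Code that takes the pointer assumes ascending addresses:
    // the next element of the slice is at the next higher address.
    int* p = s.ptr;
    assert(*(p + 1) == a[3]);
}
```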
Apr 22 2005
parent "TechnoZeus" <TechnoZeus PeoplePC.com> writes:
Ah, okay.  Didn't realize that an unsigned int was used.

TZ
Apr 23 2005
prev sibling parent reply "Regan Heath" <regan netwin.co.nz> writes:
On Wed, 20 Apr 2005 16:38:14 -0500, TechnoZeus <TechnoZeus PeoplePC.com>  
wrote:
 Okay,
 if this is a bug then probably just about everyone here knows about it.
 That's why I haven't said anything before...
 but it looks like a bug to me,
 so I have to ask.

 Is a[0..4] supposed to mean elements number 0 through 4 of the array  
 named "a", as I would intuitively expect it to,
 or is it supposed to mean elements number 0 through 3 of the array named  
 "a", which is what it acts like in the following two tests...

0 thru 3, i.e. indices 0,1 and 2. (not 3).

It's inclusive of the first number, and exclusive of the last. Allowing "a[0 .. a.length]" or "a[0 .. $]" ($ means a.length in this case).

Strangely I've never found it anything but intuitive.. which is odd when I actually think about it.

Of course "a[a.length]" will give array bounds error. So...
 unittest
 {
  char[] a;
  a="abcdefg";
  a=a[0..2]~a[4..6];
  writef("[%s]",a);
  assert(a=="abcefg");
 }

This will assert.
 //_____________________

 unittest
 {
  char[] a;
  a="abcdefg";
  a=a[0..3]~a[4..7];
  writef("[%s]",a);
  assert(a=="abcefg");
 }

This will not.

Regan
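The role of `$` and of a.length as a bound can be sketched as follows, assuming a D2 compiler (so `string` replaces `char[]`). The out-of-bounds index a[a.length] is deliberately not executed here, since it would stop the test:

```d
unittest
{
    string a = "abcdefg";

    // Inside the brackets, $ stands for a.length.
    assert(a[0 .. $] == a[0 .. a.length]);
    assert(a[4 .. $] == "efg");

    // The last valid index is a.length - 1 ...
    assert(a[$ - 1] == 'g');
    // ... while a.length itself is valid only as a slice bound.
    assert(a[$ .. $].length == 0);
}
```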
Apr 20 2005
parent reply "Regan Heath" <regan netwin.co.nz> writes:
On Thu, 21 Apr 2005 10:03:03 +1200, Regan Heath <regan netwin.co.nz> wrote:
 On Wed, 20 Apr 2005 16:38:14 -0500, TechnoZeus <TechnoZeus PeoplePC.com>  
 wrote:
 Okay,
 if this is a bug then probably just about everyone here knows about it.
 That's why I haven't said anything before...
 but it looks like a bug to me,
 so I have to ask.

 Is a[0..4] supposed to mean elements number 0 through 4 of the array  
 named "a", as I would intuitively expect it to,
 or is it supposed to mean elements number 0 through 3 of the array  
 named "a", which is what it acts like in the following two tests...

0 thru 3, i.e. indices 0,1 and 2. (not 3).

Oops. What a mess I am. 0 thru 3, i.e. indices 0,1,2 and 3. (not 4). I wrote that first time, then "corrected" myself without re-reading the original slice above.
 It's inclusive of the first number, and exclusive of the last.
 Allowing "a[0 .. a.length]" or "a[0 .. $]" ($ means a.length in this  
 case).

Instead of having to type a[0 .. a.length-1] etc.
 Strangely I've never found it anything but intuitive.. which is odd when  
 I actually think about it.

You're probably thinking this comment was all lies given my mistake above :) In all honesty I didn't even look at the slice when I corrected it, but rather the first part of my reply "0 thru 3" and interpreted it as a slice, thus the mistake.

Regan
Apr 20 2005
parent "TechnoZeus" <TechnoZeus PeoplePC.com> writes:
Mistakes happen.  It's part of being human.  (Although some programmers claim
not to be.  Hehe)

TZ
Apr 20 2005