digitalmars.D.bugs - Array bounds check on array slices.

• Dave (3/3) Nov 27 2005 Shouldn't:
• Unknown W. Brackets (8/15) Nov 27 2005 Why should it? I can see plenty of cases where I would want that to
• Dave (15/29) Nov 28 2005 Hmm, doesn't seem perfect to me <g> Consider:
• Stewart Gordon (20/25) Nov 28 2005
• Walter Bright (5/10) Dec 01 2005 You're right that this can be a problem. The solution currently used is ...
• Bruno Medeiros (8/37) Nov 29 2005 Yes, it is legal for it to point there, but it's ok since it won't be
• Stewart Gordon (23/32) Nov 30 2005
• Bruno Medeiros (11/31) Nov 30 2005 Ah, that. Yes, it happens, and is a known behaviour. See "Setting
• Dave (9/41) Nov 30 2005 Well put, and to cover cases like that the GC ABI becomes more complicat...
• Georg Wrede (12/54) Nov 29 2005 Yes, this is counter-intuitive. OTOH, the C specification (since decades...
• Dave (6/60) Nov 30 2005 Yes, but this is D, not C, and easy array slicing is a D feature <g>.
```Shouldn't:

int[] i = a[length .. length];

produce an ArrayBoundsError?
```
Nov 27 2005
"Unknown W. Brackets" <unknown simplemachines.org> writes:
```Why should it?  I can see plenty of cases where I would want that to
work, being that it's a slice of zero elements at the end of the array.
That's not out of bounds, is it?

However, these:

int[] i = a[length + 1 .. length + 1];
int[] i = a[length .. length + 1];

Definitely should not work.  And they don't.  Sounds perfect to me 8).

-[Unknown]

Shouldn't:

int[] i = a[length .. length];

produce an ArrayBoundsError?

```
Nov 27 2005
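The three slices discussed in the post above can be checked directly. What follows is a minimal sketch, assuming a present-day D compiler with array bounds checking enabled (that is, not built with `-release` or `-boundscheck=off`). The helper name `sliceIsAccepted` is hypothetical. `core.exception.RangeError` is what current runtimes throw; the 2005 runtime discussed in this thread reported `ArrayBoundsError` instead.

```d
import core.exception : RangeError;

/// Hypothetical helper: true if a[lo .. hi] passes the runtime bounds check.
bool sliceIsAccepted(int[] a, size_t lo, size_t hi)
{
    try
    {
        auto s = a[lo .. hi];          // bounds are checked here, no element is read
        return s.length == hi - lo;
    }
    catch (RangeError e)               // assumption: bounds checking is enabled
    {
        return false;
    }
}

void main()
{
    int[] a = [10, 20, 30];

    // Zero elements at the very end of the array: accepted.
    assert(sliceIsAccepted(a, a.length, a.length));

    // Either index past a.length: rejected.
    assert(!sliceIsAccepted(a, a.length, a.length + 1));
    assert(!sliceIsAccepted(a, a.length + 1, a.length + 1));
}
```

Catching a `RangeError` is done here only to make the outcome observable; ordinary code would normally let it terminate the program.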
```"Unknown W. Brackets" <unknown simplemachines.org> wrote in message
news:dmdtm2$18d3$1 digitaldaemon.com...
Why should it?  I can see plenty of cases where I would want that to work,
being that it's a slice of zero elements at the end of the array. That's
not out of bounds, is it?

However, these:

int[] i = a[length + 1 .. length + 1];
int[] i = a[length .. length + 1];

Definitely should not work.  And they don't.  Sounds perfect to me 8).

-[Unknown]

Hmm, doesn't seem perfect to me <g> Consider:

Since array[length .. length] actually points *past* the end of the array,
for non-GC allocated memory, it is currently legal through slicing to point
into memory not 'owned' by the array, which to me really should not be legal
(and the rule for slice leading indexes, whatever it turns out to be, really
should be consistent between D array and pointer slicing).

Also, with GC allocated memory, allocations using power-of-2 bytes have to
be doubled because of this so the next larger 'bucket' is assigned and
initialized (see phobos/internal/gc/gc.d and gcx.d). Seems pretty wasteful..

Finally, it just seems semantically inconsistent to me, and will probably
lead to a lot of D newbie problems and hard to find bugs.

[Since this seems legal and not a bug, I'm cross-posting to digitalmars.D
for more discussion..]

Shouldn't:

int[] i = a[length .. length];

produce an ArrayBoundsError?

```
Nov 28 2005
Stewart Gordon <smjg_1998 yahoo.com> writes:
```Dave wrote:
<snip>
Since array[length .. length] actually points *past* the end of the array,
for non-GC allocated memory, it is currently legal through slicing to point
into memory not 'owned' by the array, which to me really should not be legal
(and the rule for slice leading indexes, whatever it turns out to be, really
should be consistent between D array and pointer slicing).

<snip>

The problem is that it is convenient to write something like

qwert = qwert[1..$];

to cut off the first element of an array, regardless of whether anything
remains.

But I'm guessing the problem to which you refer is that the beginning of
such a slice might be the beginning of another array altogether, and so
later changing its .length can overwrite the contents of the other
array.  Can this actually happen with the current DMD/GDC?  Or does the
structure of the GC heap somehow prevent it?

--
-----BEGIN GEEK CODE BLOCK-----
Version: 3.1
GCS/M d- s:- C++  a->--- UB  P+ L E  W++  N+++ o K-  w++  O? M V? PS-
PE- Y? PGP- t- 5? X? R b DI? D G e++>++++ h-- r-- !y
------END GEEK CODE BLOCK------

My e-mail is valid but not my primary mailbox.  Please keep replies on
the 'group where everyone may benefit.
```
Nov 28 2005
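The convenience argument in the post above is easiest to see in a loop that consumes an array from the front. This is a minimal sketch, assuming a present-day D compiler; the function name `sumByPopping` is hypothetical. The point is the last iteration, where one element remains and `qwert[1 .. $]` becomes `qwert[1 .. 1]`, a zero-length slice at the end of the array.

```d
/// Hypothetical example: sums the elements by repeatedly dropping the first one.
int sumByPopping(const(int)[] qwert)
{
    int total = 0;
    while (qwert.length > 0)
    {
        total += qwert[0];
        // With one element left this is qwert[1 .. 1]: start index == length.
        qwert = qwert[1 .. $];
    }
    return total;
}

void main()
{
    assert(sumByPopping([1, 2, 3]) == 6);
    assert(sumByPopping(null) == 0);   // empty input: the loop body never runs
}
```

If a start index equal to the length were rejected, the loop would need a special case for the final element.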
"Walter Bright" <newshound digitalmars.com> writes:
```"Stewart Gordon" <smjg_1998 yahoo.com> wrote in message
news:dmfc73$2hec$1 digitaldaemon.com...
But I'm guessing the problem to which you refer is that the beginning of
such a slice might be the beginning of another array altogether, and so
later changing its .length can overwrite the contents of the other
array.  Can this actually happen with the current DMD/GDC?  Or does the
structure of the GC heap somehow prevent it?

You're right that this can be a problem. The solution currently used is to
allocate a bit more than necessary for the array, so an end slice will not
be at the beginning of some other array later in memory.
```
Dec 01 2005
Bruno Medeiros <daiphoenixNO SPAMlycos.com> writes:
```Dave wrote:
"Unknown W. Brackets" <unknown simplemachines.org> wrote in message
news:dmdtm2$18d3$1 digitaldaemon.com...

Why should it?  I can see plenty of cases where I would want that to work,
being that it's a slice of zero elements at the end of the array. That's
not out of bounds, is it?

However, these:

int[] i = a[length + 1 .. length + 1];
int[] i = a[length .. length + 1];

Definitely should not work.  And they don't.  Sounds perfect to me 8).

-[Unknown]

Hmm, doesn't seem perfect to me <g> Consider:

Since array[length .. length] actually points *past* the end of the array,
for non-GC allocated memory, it is currently legal through slicing to point
into memory not 'owned' by the array, which to me really should not be legal

Yes, it is legal for it to point there, but it's ok since it won't be
legal to use the array now that it has length 0.

Shouldn't:

int[] i = a[length .. length];

produce an ArrayBoundsError?

It will happen if you try to use the array (i.e., index any element).

--
Bruno Medeiros - CS/E student
"Certain aspects of D are a pathway to many abilities some consider to
be... unnatural."
```
Nov 29 2005
Stewart Gordon <smjg_1998 yahoo.com> writes:
```Bruno Medeiros wrote:
Dave wrote:

<snip>
Hmm, doesn't seem perfect to me <g> Consider:

Since array[length .. length] actually points *past* the end of the
array, for non-GC allocated memory, it is currently legal through
slicing to point into memory not 'owned' by the array, which to me
really should not be legal

Yes, it is legal for it to point there, but it's ok since it won't be
legal to use the array now that it has length 0.

<snip>

You miss the point.

Suppose you have

int[] qwert, yuiop;
qwert.length = 16;
yuiop.length = 16;

and the two arrays happen to end up adjacent in memory.  Now you set

int[] asdfg = qwert[16..16];

then asdfg will point to the beginning of yuiop.  Therefore if you then
try to set asdfg.length, then because asdfg.ptr is at the beginning of a
heap-allocated block, it will try to lengthen asdfg in place, thereby
overwriting the contents of yuiop.

Stewart.

--
-----BEGIN GEEK CODE BLOCK-----
Version: 3.1
GCS/M d- s:- C++  a->--- UB  P+ L E  W++  N+++ o K-  w++  O? M V? PS-
PE- Y? PGP- t- 5? X? R b DI? D G e++>++++ h-- r-- !y
------END GEEK CODE BLOCK------

My e-mail is valid but not my primary mailbox.  Please keep replies on
the 'group where everyone may benefit.
```
Nov 30 2005
Bruno Medeiros <daiphoenixNO SPAMlycos.com> writes:
```Stewart Gordon wrote:

You miss the point.

Suppose you have

int[] qwert, yuiop;
qwert.length = 16;
yuiop.length = 16;

and the two arrays happen to end up adjacent in memory.  Now you set

int[] asdfg = qwert[16..16];

then asdfg will point to the beginning of yuiop.  Therefore if you then
try to set asdfg.length, then because asdfg.ptr is at the beginning of a
heap-allocated block, it will try to lengthen asdfg in place, thereby
overwriting the contents of yuiop.

Stewart.

Ah, that. Yes, it happens, and is a known behaviour. See "Setting
Dynamic Array Length" (where a similar example is presented) on
http://www.digitalmars.com/d/arrays.html . In particular : "To guarantee
copying behavior, use the .dup property to ensure a unique array that
can be resized."
I do agree that overall this doesn't seem a very elegant, clean behaviour.

--
Bruno Medeiros - CS/E student
"Certain aspects of D are a pathway to many abilities some consider to
be... unnatural."
```
Nov 30 2005
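The `.dup` advice quoted in the post above can be sketched as follows, reusing the array names from the thread. This assumes a present-day D compiler, and the helper `resizedCopy` is hypothetical. It only illustrates the documented guarantee that a duplicated array can be resized without touching any other array; it does not try to reproduce the overwrite itself, which depends on how the allocator happens to place the blocks.

```d
/// Hypothetical helper: returns an independent copy of src resized to newLength.
int[] resizedCopy(int[] src, size_t newLength)
{
    int[] copy = src.dup;      // the copy shares no memory with src
    copy.length = newLength;   // resizing affects only the copy
    return copy;
}

void main()
{
    int[] qwert = new int[16];
    int[] yuiop = new int[16];
    yuiop[] = 7;

    // Zero-length slice at the end of qwert, duplicated before it is resized.
    int[] asdfg = resizedCopy(qwert[16 .. 16], 4);
    asdfg[] = 99;

    assert(asdfg.length == 4);
    foreach (value; yuiop)
        assert(value == 7);    // yuiop is untouched wherever it sits in memory
}
```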
```In article <dmk21u$dqt$1 digitaldaemon.com>, Stewart Gordon says...
Bruno Medeiros wrote:
Dave wrote:

<snip>
Hmm, doesn't seem perfect to me <g> Consider:

Since array[length .. length] actually points *past* the end of the
array, for non-GC allocated memory, it is currently legal through
slicing to point into memory not 'owned' by the array, which to me
really should not be legal

Yes, it is legal for it to point there, but it's ok since it won't be
legal to use the array now that it has length 0.

<snip>

You miss the point.

Suppose you have

int[] qwert, yuiop;
qwert.length = 16;
yuiop.length = 16;

and the two arrays happen to end up adjacent in memory.  Now you set

int[] asdfg = qwert[16..16];

then asdfg will point to the beginning of yuiop.  Therefore if you then
try to set asdfg.length, then because asdfg.ptr is at the beginning of a
heap-allocated block, it will try to lengthen asdfg in place, thereby
overwriting the contents of yuiop.

Stewart.

Well put, and to cover cases like that the GC ABI becomes more complicated and
less efficient, not to mention the pain this could cause when people start doing
custom memory management in earnest.

Not to mention my biggest nit, which is that it is inconsistent with the rest of
the array bounds rules and doesn't make sense semantically. This is how most
newbies will envision array slicing for [length .. length]:

for(int i = length; i < length; i++) { ... } // WTF?

- Dave

```
Nov 30 2005
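The loop analogy in the post above can be compared with the slice it stands for. This is a minimal sketch, assuming a present-day D compiler; both function names are hypothetical. It shows only that the index loop and the iteration over `a[length .. length]` both run zero times, and takes no position on whether the slice ought to be legal.

```d
/// Hypothetical helper: counts iterations of the index loop from lo up to hi.
size_t countIndexLoop(size_t lo, size_t hi)
{
    size_t n = 0;
    for (size_t i = lo; i < hi; i++)
        n++;
    return n;
}

/// Hypothetical helper: counts the elements visited in the slice a[lo .. hi].
size_t countSliceLoop(const(int)[] a, size_t lo, size_t hi)
{
    size_t n = 0;
    foreach (x; a[lo .. hi])
        n++;
    return n;
}

void main()
{
    int[] a = [10, 20, 30];
    assert(countIndexLoop(a.length, a.length) == 0);
    assert(countSliceLoop(a, a.length, a.length) == 0);
}
```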
Georg Wrede <georg.wrede nospam.org> writes:
```Dave wrote:
"Unknown W. Brackets" <unknown simplemachines.org> wrote in message
news:dmdtm2$18d3$1 digitaldaemon.com...

Why should it? I can see plenty of cases where I would want that to
work, being that it's a slice of zero elements at the end of the
array. That's not out of bounds, is it?

However, these:
int[] i = a[length + 1 .. length + 1];
int[] i = a[length .. length + 1];

Definitely should not work.  And they don't.  Sounds perfect to me
8).

Yes, this is counter-intuitive. OTOH, the C specification (since decades
ago) specified that (IIRC) you should always be able to point "one past"
_anything_. The case (specifically) being even one past end-of-memory.

Adding to this (in a massive way, too) is the STL. There the whole idea
of handling _any_ collection is _based_ on the ability to point one-past
the end.

Following from that: it should be okay to point to one-past. At the same
time, obviously, it should be illegal to actually dereference that
address, since "everyone knows" that it either contains garbage,
unrelated stuff, or causes a hardware trap, killing the program.

Now, the example code does pointing, but not dereferencing. Hence, no error.

```
Nov 29 2005
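The distinction drawn in the post above, forming a one-past-the-end address versus dereferencing it, can be sketched with raw pointers. This is a minimal example assuming a present-day D compiler; the function name `sumByPointer` is hypothetical. The `end` pointer is only ever compared against, never read through, in the same way an STL-style end iterator is used.

```d
/// Hypothetical example: sums an array by walking a pointer up to a
/// one-past-the-end sentinel.
int sumByPointer(const(int)[] a)
{
    const(int)* p = a.ptr;
    const(int)* end = a.ptr + a.length;   // one past the last element

    int total = 0;
    for (; p != end; ++p)
        total += *p;                      // p is always strictly before end here
    return total;
}

void main()
{
    assert(sumByPointer([1, 2, 3]) == 6);
    assert(sumByPointer(null) == 0);      // begin == end, so nothing is read
}
```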
```In article <438CDCD4.3020709 nospam.org>, Georg Wrede says...
Dave wrote:
"Unknown W. Brackets" <unknown simplemachines.org> wrote in message
news:dmdtm2$18d3$1 digitaldaemon.com...

Why should it? I can see plenty of cases where I would want that to
work, being that it's a slice of zero elements at the end of the
array. That's not out of bounds, is it?

However, these:
int[] i = a[length + 1 .. length + 1];
int[] i = a[length .. length + 1];

Definitely should not work.  And they don't.  Sounds perfect to me
8).

Yes, this is counter-intuitive. OTOH, the C specification (since decades
ago) specified that (IIRC) you should always be able to point "one past"
_anything_. The case (specifically) being even one past end-of-memory.

Yes, but this is D, not C, and easy array slicing is a D feature <g>.

There's nothing getting in the way of backwards compatibility with C since you
can still point one past with naked pointers using memory allocated by a C
compliant allocator (ie: std.c.stdlib.malloc). That doesn't mean D has to carry
this legacy forward with any of the D specific features.

Adding to this (in a massive way, too) is the STL. There the whole idea
of handling _any_ collection is _based_ on the ability to point one-past
the end.

Following from that: it should be okay to point to one-past. At the same
time, obviously, it should be illegal to actually dereference that
address, since "everyone knows" that it either contains garbage,
unrelated stuff, or causes a hardware trap, killing the program.

Now, the example code does pointing, but not dereferencing. Hence, no error.

```
Nov 30 2005