www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Arrays are sufficient for ArrayLists? Really??

reply Mehrdad <wfunction hotmail.com> writes:
Hi!

I posted this question on StackOverflow about D:

http://stackoverflow.com/questions/6015008/how-to-delete-an-element-
from-an-array-in-d

and the answers are quite surprising to me.

Are arrays really supposed to be substitutes for array lists if we
can't remove anything from them? It seems to me like that's a really
basic feature we're missing here... it's really annoying whenever I
find out that each of my arrays needs its own "counter" variable,
even though it already has a length and a capacity.

Doesn't that mean we still need an ArrayList(T) type, as powerful as
arrays are supposed to be in D?
May 16 2011
next sibling parent reply Daniel Gibson <metalcaedes gmail.com> writes:
Am 16.05.2011 11:54, schrieb Mehrdad:
 Hi!
 
 I posted this question on StackOverflow about D:
 
 http://stackoverflow.com/questions/6015008/how-to-delete-an-element-
 from-an-array-in-d
 
 and the answers are quite surprising to me.
 
 Are arrays really supposed to be substitutes for array lists if we
 can't remove anything from them? It seems to me like that's a really
 basic feature we're missing here... it's really annoying whenever I
 find out that each of my arrays needs its own "counter" variable,
 even though it already has a length and a capacity.
 
 Doesn't that mean we still need an ArrayList(T) type, as powerful as
 arrays are supposed to be in D?
They're called Arrays, not Lists (or ArrayLists), so way would you expect a delete functions? Deleting from an Array is O(n) (if it's not the last element - and in that case you can just do "arr.length = arr.length-1;" in D), while deleting from a typical (Linked) List is O(1), i.e. it is pretty expensive and should be avoided, so it makes sense to not support it directly. IMHO arrays are not supposed to substitute ArrayLists. They're a basic type built in the language, while ArrayLists are usually classes that implement a list interface on top of arrays. Because of the lack of built in dynamic arrays (in Java etc) ArrayLists are frequently used like dynamic arrays. So just because ArrayLists are used like dynamic arrays it doesn't mean that dynamic arrays should behave like ArrayLists and support all their features natively. If you want something like an ArrayList in D, have a look at std.container.Array :-) Cheers, - Daniel
May 16 2011
parent reply Mehrdad <wfunction hotmail.com> writes:
 They're called Arrays, not Lists (or ArrayLists), so way would you expect a
delete functions?
I thought they're supposed to substitute for lists (otherwise we'd have an ArrayList type in Phobos).
 If you want something like an ArrayList in D, have a look at
std.container.Array :-)
That doesn't work well the the garbage collector. :\
 As has been mentioned, std.algorithm.remove can be of help. You may want to
look at three of its
capabilities in particular: (a) remove multiple offsets in one pass, e.g. remove(a, 0, 4) removes the first and fifth element, (b) you can remove subranges, e.g. remove(a, tuple(1, 3)) removes the second through fourth element, and (c) if you don't care about the order in which elements are left after removal you may want to look into unstable remove, which does considerably less work. Thanks for the idea. This seems great, except for a couple of things: - I _do_ need the order to stay the same, so I can't just put in the last element. - I only need to remove one element at a time. - I still don't understand how this helps. Either this modifies the array directly, in which case adding a new element to the array after a removal would still cause a reallocation every time, or this returns a filtered range, in which case it doesn't do what I need (since it doesn't modify the array directly)... am I missing something?
May 16 2011
next sibling parent Sean Kelly <sean invisibleduck.org> writes:
On May 16, 2011, at 11:29 AM, Mehrdad wrote:
=20
 As has been mentioned, std.algorithm.remove can be of help. You may =
want to look at three of its
 capabilities in particular: (a) remove multiple offsets in one pass, =
e.g. remove(a, 0, 4) removes
 the first and fifth element, (b) you can remove subranges, e.g. =
remove(a, tuple(1, 3)) removes the
 second through fourth element, and (c) if you don't care about the =
order in which elements are left
 after removal you may want to look into unstable remove, which does =
considerably less work.
=20
 Thanks for the idea. This seems great, except for a couple of things:
=20
 - I _do_ need the order to stay the same, so I can't just put in the =
last element. Thus the stable remove, which doesn't affect element order.
 - I still don't understand how this helps. Either this modifies the =
array directly, in which case
 adding a new element to the array after a removal would still cause a =
reallocation every time, or
 this returns a filtered range, in which case it doesn't do what I need =
(since it doesn't modify the
 array directly)... am I missing something?
Removing and then appending one element to an array won't cause any = allocations to occur. The GC lock may be acquired to determine whether = the memory block backing the array has room for the new element, but = that's it.=
May 16 2011
prev sibling next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
 Thanks for the idea. This seems great, except for a couple of things:

 - I _do_ need the order to stay the same, so I can't just put in the last
element.

 - I only need to remove one element at a time.

 - I still don't understand how this helps. Either this modifies the array
directly, in which case
 adding a new element to the array after a removal would still cause a
reallocation every time, or
 this returns a filtered range, in which case it doesn't do what I need (since
it
doesn't modify the
 array directly)... am I missing something?
1. Are you sure you need an array rather than a linked list? Removal and insertion in the middle of the array is very inefficient if you have many elements. (but you cannot index into a list efficiently, duh) 2. The garbage collector guarantees amortized constant runtime for insertion at the back of the array. What exactly do you need the data structure for? Timon
May 16 2011
next sibling parent Russel Winder <russel russel.org.uk> writes:
On Mon, 2011-05-16 at 18:45 +0000, Timon Gehr wrote:
[ . . . ]
=20
 What exactly do you need the data structure for?
I have to admit, my reaction was, "and about b~~~~~ time someone asked that question". How on earth can one answer a question about data structures without knowing what the use case is?
 Timon
--=20 Russel. =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D Dr Russel Winder t: +44 20 7585 2200 voip: sip:russel.winder ekiga.n= et 41 Buckmaster Road m: +44 7770 465 077 xmpp: russel russel.org.uk London SW11 1EN, UK w: www.russel.org.uk skype: russel_winder
May 16 2011
prev sibling parent Mehrdad <wfunction hotmail.com> writes:
1. Are you sure you need an array rather than a linked list?

Yes. I'm only asking about a functionality already present in every other
language I know (C++'s vector.erase,


2. The garbage collector guarantees amortized constant runtime for insertion at
the back of the array.

The problem is that that is _only_ true if the array is NOT a slice of another
array!

As an example, let's say I define:

void removeAt(T)(ref T[] arr, size_t index)
{
    foreach (i, ref item; arr[index .. $ - 1])
        item = arr[i + 1];
    arr = arr[0 .. $ - 1];
}

and then I have:

auto arr = [1, 2, 3];
arr.removeAt(0);
arr ~= 4;

The last statement ***WILL*** cause a reallocation, so that doesn't help.
May 16 2011
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/16/11 1:29 PM, Mehrdad wrote:
 As has been mentioned, std.algorithm.remove can be of help. You may want to
look at three of its
capabilities in particular: (a) remove multiple offsets in one pass, e.g. remove(a, 0, 4) removes the first and fifth element, (b) you can remove subranges, e.g. remove(a, tuple(1, 3)) removes the second through fourth element, and (c) if you don't care about the order in which elements are left after removal you may want to look into unstable remove, which does considerably less work. Thanks for the idea. This seems great, except for a couple of things: - I _do_ need the order to stay the same, so I can't just put in the last element.
Then you'd need to use the default swapping strategy.
 - I only need to remove one element at a time.
Then you may want to pass only one index.
 - I still don't understand how this helps. Either this modifies the array
directly, in which case
 adding a new element to the array after a removal would still cause a
reallocation every time, or
 this returns a filtered range, in which case it doesn't do what I need (since
it doesn't modify the
 array directly)... am I missing something?
After you remove some elements from an array by using std.algorithm.remove, the array capacity stays the same. Appending to it again will not reallocate the array if the capacity is sufficient. It is possible a reallocation does occur even if the capacity would be sufficient, but that's a rare phenomenon caused by implementation paraphernalia. The formal guarantee is that ~= takes amortized constant time. If you need an absolute guarantee that the array stays as initially allocated, you can't use ~=, but you can still use std.algorithm.remove. Andrei
May 16 2011
next sibling parent reply Mehrdad <wfunction hotmail.com> writes:
Andrei:
 After you remove some elements from an array by using std.algorithm.remove,
the array capacity stays the
same. Sean:
 Removing and then appending one element to an array won't cause any
allocations to occur.
How is that possible, though? (See the example in my response to Timon.)
May 16 2011
parent Sean Kelly <sean invisibleduck.org> writes:
On May 16, 2011, at 12:01 PM, Mehrdad wrote:
=20
 Sean:
 Removing and then appending one element to an array won't cause any =
allocations to occur.
=20
 How is that possible, though? (See the example in my response to =
Timon.) I wasn't thinking about how remove() worked. You'd have to adjust the = length property of the original array before appending or it would think = you were appending to a slice.=
May 16 2011
prev sibling next sibling parent reply Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
You could try checking capacity() and using reserve() in non time-critical code.
May 16 2011
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
 1. Are you sure you need an array rather than a linked list?

 Yes. I'm only asking about a functionality already present in every other
language I know (C++'s vector.erase,


 2. The garbage collector guarantees amortized constant runtime for insertion at
the back of the array.
 The problem is that that is _only_ true if the array is NOT a slice of another
array!
 As an example, let's say I define:

 void removeAt(T)(ref T[] arr, size_t index)
 {
    foreach (i, ref item; arr[index .. $ - 1])
         item = arr[i + 1];
     arr = arr[0 .. $ - 1];
 }

 and then I have:

 auto arr = [1, 2, 3];
 arr.removeAt(0);
 arr ~= 4;

 The last statement ***WILL*** cause a reallocation, so that doesn't help.
What about: void removeAt(T)(ref T[] arr, size_t index) { foreach (i, ref item; arr[1 .. index+1]) item = arr[i - 1]; arr = arr[1 .. $]; //note how no valid data is beyond the end of the array } I did not test it but in theory this should give you the amortized constant guarantee for ~=. Timon
May 16 2011
next sibling parent reply %u <wfunction hotmail.com> writes:
Timon:
 What about:
void removeAt(T)(ref T[] arr, size_t index) { foreach (i, ref item; arr[1 .. index+1]) item = arr[i - 1]; arr = arr[1 .. $]; //note how no valid data is beyond the end of the array } Clever, but if you do this with a big enough number of items, you'll exhaust all memory. Be careful. :P Steven:
 For the OP, you may want to consider using ArrayList from
dcollections, which supports removing an individual or range of elements, or using foreach to purge the list. Note that this class does call assumeSafeAppend on its data storage since it can make more assumptions. Ah, seems to be what I need. I guess I was also trying to argue this should really be part of the language too, though (not just Phobos), given that concatenation and even hashtables are also part of the language. It seems to be missing badly. :\
May 16 2011
next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
Mehrdad wrote:
 Timon:
 What about:
void removeAt(T)(ref T[] arr, size_t index) { foreach (i, ref item; arr[1 .. index+1]) item = arr[i - 1]; arr = arr[1 .. $]; //note how no valid data is beyond the end of the array } Clever, but if you do this with a big enough number of items, you'll exhaust all memory. Be careful. :P
Total memory usage is at most 2 times the maximum size the array reaches during its lifetime. Test it. Arguably, this is not a very good guarantee, because you might have multiple arrays that oscillate from big to small with some offset. But I was only trying to improve on runtime. Btw: removing an element always takes linear time. relocating happens only straight after removal and takes, well, linear time. In an asymptotic view, your implementation performs optimally, while never using too much memory. Timon
May 16 2011
next sibling parent reply Mehrdad <wfunction hotmail.com> writes:
On 5/16/2011 12:25 PM, Timon Gehr wrote:
 Mehrdad wrote:
 Timon:
 What about:
void removeAt(T)(ref T[] arr, size_t index) { foreach (i, ref item; arr[1 .. index+1]) item = arr[i - 1]; arr = arr[1 .. $]; //note how no valid data is beyond the end of the array } Clever, but if you do this with a big enough number of items, you'll exhaust all memory. Be careful. :P
Total memory usage is at most 2 times the maximum size the array reaches during its lifetime. Test it.
Whoops, my bad, I was under the impression that the array could potentially grow forever if I added elements, which obviously isn't true... yeah, seems like this would work too, interesting! I'll definitely test it, thanks!
May 16 2011
parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On 2011-05-16 12:26, Mehrdad wrote:
 On 5/16/2011 12:25 PM, Timon Gehr wrote:
 Mehrdad wrote:
 Timon:
 What about:
void removeAt(T)(ref T[] arr, size_t index) { foreach (i, ref item; arr[1 .. index+1]) item = arr[i - 1]; arr = arr[1 .. $]; //note how no valid data is beyond the end of the array } Clever, but if you do this with a big enough number of items, you'll exhaust all memory. Be careful. :P
Total memory usage is at most 2 times the maximum size the array reaches during its lifetime. Test it.
Whoops, my bad, I was under the impression that the array could potentially grow forever if I added elements, which obviously isn't true... yeah, seems like this would work too, interesting! I'll definitely test it, thanks!
It'll grow, but if it doesn't have enough space to grow, it has to reallocate, which means that any other slices/arrays which pointed to any portion of that array will not point to the new array. And you obviously can't have an array bigger than the amount of memory that you have, so you definitely can't make it grow forever. But it would be incredibly rare to have an application which needed an array larger than you could make. - Jonathan M Davis
May 16 2011
prev sibling parent reply Mehrdad <wfunction hotmail.com> writes:
On 5/16/2011 12:25 PM, Andrei Alexandrescu wrote:
 On 5/16/11 2:25 PM, Timon Gehr wrote:
 Mehrdad wrote:
 Timon:
 What about:
void removeAt(T)(ref T[] arr, size_t index) { foreach (i, ref item; arr[1 .. index+1]) item = arr[i - 1]; arr = arr[1 .. $]; //note how no valid data is beyond the end of the array } Clever, but if you do this with a big enough number of items, you'll exhaust all memory. Be careful. :P
Total memory usage is at most 2 times the maximum size the array reaches during its lifetime. Test it.
Testing would also reveal that this is a buggy replica of std.algorithm.remove. At best we should suggest reusing library code over attempting to reinvent it. Thanks, Andrei
Oh, I see. Wait, what bug are you referring to, though? On 5/16/2011 12:25 PM, Steven Schveighoffer wrote:
 Steven:
 For the OP, you may want to consider using ArrayList from
dcollections, which supports removing an individual or range of elements, or using foreach to purge the list. Note that this class does call assumeSafeAppend on its data storage since it can make more assumptions. Ah, seems to be what I need. I guess I was also trying to argue this should really be part of the language too, though (not just Phobos), given that concatenation and even hashtables are also part of the language. It seems to be missing badly. :\
builtin arrays are full of features that are "good enough" for most usages. Removing elements the way that is least confusing is inefficient, and removing elements the most efficient way can be confusing or incorrect depending on the application. Rather than guess what the user expects, the runtime just leaves it up to the developer/standard lib. -Steve
Hm... is it really that atypical, though? We already have addition, is removal really that unlikely of an operation?
May 16 2011
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/16/11 2:31 PM, Mehrdad wrote:
 On 5/16/2011 12:25 PM, Andrei Alexandrescu wrote:
 On 5/16/11 2:25 PM, Timon Gehr wrote:
 Mehrdad wrote:
 Timon:
 What about:
void removeAt(T)(ref T[] arr, size_t index) { foreach (i, ref item; arr[1 .. index+1]) item = arr[i - 1]; arr = arr[1 .. $]; //note how no valid data is beyond the end of the array } Clever, but if you do this with a big enough number of items, you'll exhaust all memory. Be careful. :P
Total memory usage is at most 2 times the maximum size the array reaches during its lifetime. Test it.
Testing would also reveal that this is a buggy replica of std.algorithm.remove. At best we should suggest reusing library code over attempting to reinvent it. Thanks, Andrei
Oh, I see. Wait, what bug are you referring to, though?
I was mistaken and removed my post. The code ingeniously redefines the problem - instead of removing from the array by shifting its tail downwards, it shifts elements upwards from the head of the array. A nice hack, but I don't think it does a lot in helping your situation. There's a bit of history wrt ~=. Initially the behavior was to never reallocate if there was enough capacity. But that caused a stomping issue - people would create an array, pass a slice of it to a function, and then the function would clobber elements in the arrays outside the slice. This was a major breakage of modular behavior so we decided to change the behavior: as long as you keep references to the original array, appending to a slice of it will cause reallocation of the array. One thing you could do is to think of what you're trying to achieve on a larger scale. Assuming a slice to grow over the original array is arguably a bad behavior that I'm sure could be fixed by small changes in the design. Andrei
May 16 2011
next sibling parent Mehrdad <wfunction hotmail.com> writes:
On 5/16/2011 12:38 PM, Andrei Alexandrescu wrote:
 Oh, I see.
 Wait, what bug are you referring to, though?
I was mistaken and removed my post. The code ingeniously redefines the problem - instead of removing from the array by shifting its tail downwards, it shifts elements upwards from the head of the array. A nice hack, but I don't think it does a lot in helping your situation. There's a bit of history wrt ~=. Initially the behavior was to never reallocate if there was enough capacity. But that caused a stomping issue - people would create an array, pass a slice of it to a function, and then the function would clobber elements in the arrays outside the slice. This was a major breakage of modular behavior so we decided to change the behavior: as long as you keep references to the original array, appending to a slice of it will cause reallocation of the array. One thing you could do is to think of what you're trying to achieve on a larger scale. Assuming a slice to grow over the original array is arguably a bad behavior that I'm sure could be fixed by small changes in the design. Andrei
It seems to be fine for what I need, although I'm going to put a comment in my code because I'm sure I'm going to run into a problem with it down the road... for now it seems safe though. :) Thanks!
May 16 2011
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/16/11 2:38 PM, Andrei Alexandrescu wrote:
 On 5/16/11 2:31 PM, Mehrdad wrote:
 On 5/16/2011 12:25 PM, Andrei Alexandrescu wrote:
 On 5/16/11 2:25 PM, Timon Gehr wrote:
 Mehrdad wrote:
 Timon:
 What about:
void removeAt(T)(ref T[] arr, size_t index) { foreach (i, ref item; arr[1 .. index+1]) item = arr[i - 1]; arr = arr[1 .. $]; //note how no valid data is beyond the end of the array } Clever, but if you do this with a big enough number of items, you'll exhaust all memory. Be careful. :P
Total memory usage is at most 2 times the maximum size the array reaches during its lifetime. Test it.
Testing would also reveal that this is a buggy replica of std.algorithm.remove. At best we should suggest reusing library code over attempting to reinvent it. Thanks, Andrei
Oh, I see. Wait, what bug are you referring to, though?
I was mistaken and removed my post. The code ingeniously redefines the problem - instead of removing from the array by shifting its tail downwards, it shifts elements upwards from the head of the array. A nice hack, but I don't think it does a lot in helping your situation.
In fact I even need to take that back. In order to work correctly, the function would have to iterate downwards. It _is_ indeed buggy, and I should stop emitting opinions when I'm short on time... Andrei
May 16 2011
next sibling parent reply Mehrdad <wfunction hotmail.com> writes:
On 5/16/2011 12:41 PM, Andrei Alexandrescu wrote:
 I was mistaken and removed my post. The code ingeniously redefines the
 problem - instead of removing from the array by shifting its tail
 downwards, it shifts elements upwards from the head of the array. A nice
 hack, but I don't think it does a lot in helping your situation.
In fact I even need to take that back. In order to work correctly, the function would have to iterate downwards. It _is_ indeed buggy, and I should stop emitting opinions when I'm short on time... Andrei
I'm not sure what you mean by iterating downwards, but I just noticed another bug: That would _still_ stomp over a superarray if this array is a slice of it, right? Is there any way to find out if an array is a slice of another array?
May 16 2011
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/16/11 2:44 PM, Mehrdad wrote:
 On 5/16/2011 12:41 PM, Andrei Alexandrescu wrote:
 I was mistaken and removed my post. The code ingeniously redefines the
 problem - instead of removing from the array by shifting its tail
 downwards, it shifts elements upwards from the head of the array. A nice
 hack, but I don't think it does a lot in helping your situation.
In fact I even need to take that back. In order to work correctly, the function would have to iterate downwards. It _is_ indeed buggy, and I should stop emitting opinions when I'm short on time... Andrei
I'm not sure what you mean by iterating downwards, but I just noticed another bug: That would _still_ stomp over a superarray if this array is a slice of it, right?
Iterating downwards would mean: if you want to remove the third element from a = [0, 1, 2, 3, 4]; by using head shifting, you'd need to assign a[2]=a[1], a[1]=a[0] (backwards that is) to obtain [0, 0, 1, 3, 4]. Then you take a[1 .. $] which is the result. The function as implemented first assigns a[1]=a[0] and then a[2]=a[1], leading to [0, 0, 0, 3, 4] which is not what's needed.
 Is there any way to find out if an array is a slice of another array?
No. If there would, it would be a rather advanced function. Allow me to reiterate my suggestion: it's likely you have a simple problem with a simple solution based on slices, which are a simple abstraction. Therefore I suggest you rethink a bit of your design, and it's possible you won't need assumeSafeAppend et al. Andrei
May 16 2011
next sibling parent Mehrdad <wfunction hotmail.com> writes:
On 5/16/2011 12:53 PM, Andrei Alexandrescu wrote:
 On 5/16/11 2:44 PM, Mehrdad wrote:
 The function as implemented first assigns a[1]=a[0] and then a[2]=a[1],
 leading to [0, 0, 0, 3, 4] which is not what's needed.
Oh yeah good point.
 Is there any way to find out if an array is a slice of another array?
No. If there would, it would be a rather advanced function. Allow me to reiterate my suggestion: it's likely you have a simple problem with a simple solution based on slices, which are a simple abstraction. Therefore I suggest you rethink a bit of your design, and it's possible you won't need assumeSafeAppend et al. Andrei
Yeah, seems like even if it was, it wouldn't solve the problem in all cases. I'll try going around the problem, thanks.
May 16 2011
prev sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
foreach is a very bad choice for solving this. I blindly took it over from the
original code. Need to get some sleep :).

This now definitely works, and is also the shortest:

void removeAt(T)(ref T[] arr, size_t index){
    for(auto i = index; i; i--) arr[i] = arr[i - 1];
    arr = arr[1 .. $];
}

Timon
May 16 2011
prev sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
 In fact I even need to take that back. In order to work correctly, the
 function would have to iterate downwards. It _is_ indeed buggy, and I
 should stop emitting opinions when I'm short on time...

 Andrei
Whoops, you are right: void removeAt(T)(ref T[] arr, size_t index) { foreach (i, ref item; retro(arr[1 .. index+1])) item = arr[i - 1]; arr = arr[1 .. $]; } Timon
May 16 2011
next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
Timon Gehr wrote:
 void removeAt(T)(ref T[] arr, size_t index)
 {
    foreach (i, ref item; retro(arr[1 .. index+1]))
         item = arr[i - 1];
     arr = arr[1 .. $];
 }
Sorry, still wrong: void removeAt(T)(ref T[] arr, size_t index) { foreach (i, ref item; retro(arr[1 .. index+1])) item = arr[index - i - 1]; arr = arr[1 .. $]; }
May 16 2011
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/16/11 2:56 PM, Timon Gehr wrote:
 Timon Gehr wrote:
 void removeAt(T)(ref T[] arr, size_t index)
 {
     foreach (i, ref item; retro(arr[1 .. index+1]))
          item = arr[i - 1];
      arr = arr[1 .. $];
 }
Sorry, still wrong: void removeAt(T)(ref T[] arr, size_t index) { foreach (i, ref item; retro(arr[1 .. index+1])) item = arr[index - i - 1]; arr = arr[1 .. $]; }
I think it would take less time to actually paste the function in a file and try it. A cute possibility: void removeAt(T)(ref T[] arr, size_t index) { copy(retro(arr[0 .. index]), retro(arr[1 .. index + 1])); arr = arr[1 .. $]; } Andrei
May 16 2011
parent reply Mehrdad <wfunction hotmail.com> writes:
On 5/16/2011 12:59 PM, Andrei Alexandrescu wrote:
 On 5/16/11 2:56 PM, Timon Gehr wrote:
 I think it would take less time to actually paste the function in a file
 and try it. A cute possibility:

 void removeAt(T)(ref T[] arr, size_t index)
 {
 copy(retro(arr[0 .. index]), retro(arr[1 .. index + 1]));
 arr = arr[1 .. $];
 }


 Andrei
Sorry guys, _none_ of these unfortunately works. See CyberShadow's reply on my question on SO: http://stackoverflow.com/q/6015008/541686
May 16 2011
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/16/11 3:04 PM, Mehrdad wrote:
 On 5/16/2011 12:59 PM, Andrei Alexandrescu wrote:
 On 5/16/11 2:56 PM, Timon Gehr wrote:
 I think it would take less time to actually paste the function in a file
 and try it. A cute possibility:

 void removeAt(T)(ref T[] arr, size_t index)
 {
 copy(retro(arr[0 .. index]), retro(arr[1 .. index + 1]));
 arr = arr[1 .. $];
 }


 Andrei
Sorry guys, _none_ of these unfortunately works. See CyberShadow's reply on my question on SO: http://stackoverflow.com/q/6015008/541686
I tested the function above. Andrei
May 16 2011
next sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
Andrei Alexandrescu wrote:
 A On 5/16/11 3:04 PM, Mehrdad wrote:
 On 5/16/2011 12:59 PM, Andrei Alexandrescu wrote:
 On 5/16/11 2:56 PM, Timon Gehr wrote:
 I think it would take less time to actually paste the function in a file
 and try it. A cute possibility:

 void removeAt(T)(ref T[] arr, size_t index)
 {
 copy(retro(arr[0 .. index]), retro(arr[1 .. index + 1]));
 arr = arr[1 .. $];
 }


 Andrei
Sorry guys, _none_ of these unfortunately works. See CyberShadow's reply on my question on SO: http://stackoverflow.com/q/6015008/541686 I tested the function above. Andrei
Timon Gehr wrote:
 void removeAt(T)(ref T[] arr, size_t index){
     for(auto i = index; i; i--) arr[i] = arr[i - 1];
     arr = arr[1 .. $];
 }
This one works too. Timon
May 16 2011
prev sibling parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Mon, 16 May 2011 23:11:50 +0300, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 I tested the function above.
OP's intention was to create a version of removeAt which modifies the array in-place only when there are no other aliases towards the data. The initial approach was to attempt to check if the passed array is a slice of a larger array, but as it was discussed, it's flawed. -- Best regards, Vladimir mailto:vladimir thecybershadow.net
May 16 2011
next sibling parent reply Mehrdad <wfunction hotmail.com> writes:
On 5/16/2011 1:15 PM, Vladimir Panteleev wrote:
 On Mon, 16 May 2011 23:11:50 +0300, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 I tested the function above.
OP's intention was to create a version of removeAt which modifies the array in-place only when there are no other aliases towards the data. The initial approach was to attempt to check if the passed array is a slice of a larger array, but as it was discussed, it's flawed.
+1
May 16 2011
parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 16 May 2011 16:17:45 -0400, Mehrdad <wfunction hotmail.com> wrote:

 On 5/16/2011 1:15 PM, Vladimir Panteleev wrote:
 On Mon, 16 May 2011 23:11:50 +0300, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 I tested the function above.
OP's intention was to create a version of removeAt which modifies the array in-place only when there are no other aliases towards the data. The initial approach was to attempt to check if the passed array is a slice of a larger array, but as it was discussed, it's flawed.
+1
Wait, I don't think the approach is flawed, just not correctly implemented (see my suggestion with capacity). If you have an array: [1, 2, 3, 4] and you want to remove the 2nd element, another alias to the original array would be: [1, 3, 4, 4] Are you trying to argue that the last 4 in that array is still valid data? If so, then yes, it's impossible to tell if there are other aliases to the same data. All you can tell is if the data ends at the valid data for that block (via the capacity property). I can't see any use case for continuing to use the original alias, because the data is obviously corrupt. Any alias that still references that last element would be corrupted. If you passed in the slice [0..3], you get an original aliased array of: [1, 3, 3, 4] Now, anything that references that third element is corrupt. But that fourth element is not corrupted, it wasn't part of the arguments to removeAt. So you can continue to refer to that element, and a properly implemented removeAt should avoid stomping on that element with a later append. -Steve
May 16 2011
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/16/11 3:15 PM, Vladimir Panteleev wrote:
 On Mon, 16 May 2011 23:11:50 +0300, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 I tested the function above.
OP's intention was to create a version of removeAt which modifies the array in-place only when there are no other aliases towards the data. The initial approach was to attempt to check if the passed array is a slice of a larger array, but as it was discussed, it's flawed.
The parameters of the problem have been re(de)fined several times during this discussion. Various responses focused on as many perceptions of what the request was. Some code snippets were wrong in the sense that they didn't work within their own specification. Those are different from responses that do work within their charter, hence my clarification. It does seem that all answers were ill-fitted to the way the problem has been ultimately formulated. Andrei
May 16 2011
next sibling parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Mon, 16 May 2011 23:22:21 +0300, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 5/16/11 3:15 PM, Vladimir Panteleev wrote:
 On Mon, 16 May 2011 23:11:50 +0300, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 I tested the function above.
OP's intention was to create a version of removeAt which modifies the array in-place only when there are no other aliases towards the data. The initial approach was to attempt to check if the passed array is a slice of a larger array, but as it was discussed, it's flawed.
The parameters of the problem have been re(de)fined several times during this discussion. Various responses focused on as many perceptions of what the request was. Some code snippets were wrong in the sense that they didn't work within their own specification. Those are different from responses that do work within their charter, hence my clarification. It does seem that all answers were ill-fitted to the way the problem has been ultimately formulated.
I wasn't challenging your statement, sorry if I made it sound like that. -- Best regards, Vladimir mailto:vladimir thecybershadow.net
May 16 2011
prev sibling parent Mehrdad <wfunction hotmail.com> writes:
On 5/16/2011 1:22 PM, Andrei Alexandrescu wrote:
 It does seem that all answers were ill-fitted to the way the problem has
 been ultimately formulated.

 Andrei
Yeah, there's no clean answer to this that's of this form, so I'm just working around the problem. Thanks everybody for your input, I appreciate it! :)
May 16 2011
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 16 May 2011 15:53:38 -0400, Timon Gehr <timon.gehr gmx.ch> wrote:

 In fact I even need to take that back. In order to work correctly, the
 function would have to iterate downwards. It _is_ indeed buggy, and I
 should stop emitting opinions when I'm short on time...

 Andrei
Whoops, you are right: void removeAt(T)(ref T[] arr, size_t index) { foreach (i, ref item; retro(arr[1 .. index+1])) item = arr[i - 1]; arr = arr[1 .. $]; }
Hehe, won't work. Ranges can only support a single foreach parameter. I think you may have to resort to for loops here (/me purposely ignores the hideous beast foreach_reverse). Be very careful with indexes on that one. -Steve
May 16 2011
prev sibling parent reply Mehrdad <wfunction hotmail.com> writes:
On 5/16/2011 12:53 PM, Timon Gehr wrote:
 In fact I even need to take that back. In order to work correctly, the
 function would have to iterate downwards. It _is_ indeed buggy, and I
 should stop emitting opinions when I'm short on time...

 Andrei
Whoops, you are right: void removeAt(T)(ref T[] arr, size_t index) { foreach (i, ref item; retro(arr[1 .. index+1])) item = arr[i - 1]; arr = arr[1 .. $]; } Timon
Wouldn't that stomp on the super-slice of arr, though? As was pointed out on SO, the problem is actually always there: if someone passes arr[0 .. $] to a function, it will look as if the original array was passed, although it wasn't. Seems like it's a lot uglier than I'd thought...
May 16 2011
parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 16 May 2011 15:52:23 -0400, Mehrdad <wfunction hotmail.com> wrote:

 On 5/16/2011 12:53 PM, Timon Gehr wrote:
 In fact I even need to take that back. In order to work correctly, the
 function would have to iterate downwards. It _is_ indeed buggy, and I
 should stop emitting opinions when I'm short on time...

 Andrei
Whoops, you are right: void removeAt(T)(ref T[] arr, size_t index) { foreach (i, ref item; retro(arr[1 .. index+1])) item = arr[i - 1]; arr = arr[1 .. $]; } Timon
Wouldn't that stomp on the super-slice of arr, though? As was pointed out on SO, the problem is actually always there: if someone passes arr[0 .. $] to a function, it will look as if the original array was passed, although it wasn't. Seems like it's a lot uglier than I'd thought...
arr[0..$] shouldn't be an lvalue (it is, but I think that's a bug). Therefore, you shouldn't be able to pass it as a ref argument (is this in bugzilla?). -Steve
May 16 2011
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
Steven Schveighoffer wrote:
 On Mon, 16 May 2011 15:52:23 -0400, Mehrdad <wfunction hotmail.com> wrote:

 On 5/16/2011 12:53 PM, Timon Gehr wrote:
 In fact I even need to take that back. In order to work correctly, the
 function would have to iterate downwards. It _is_ indeed buggy, and I
 should stop emitting opinions when I'm short on time...

 Andrei
Whoops, you are right: void removeAt(T)(ref T[] arr, size_t index) { foreach (i, ref item; retro(arr[1 .. index+1])) item = arr[i - 1]; arr = arr[1 .. $]; } Timon
Wouldn't that stomp on the super-slice of arr, though? As was pointed out on SO, the problem is actually always there: if someone passes arr[0 .. $] to a function, it will look as if the original array was passed, although it wasn't. Seems like it's a lot uglier than I'd thought...
arr[0..$] shouldn't be an lvalue (it is, but I think that's a bug). Therefore, you shouldn't be able to pass it as a ref argument (is this in bugzilla?). -Steve
I think it is a little bit more subtle than "not lvalue", because you actually have to be able to do something like: arr[0..2] = arr[3..5]; Where a slice appears on the left hand side of the assignment. (also a[]=b[]). Timon
May 16 2011
parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 16 May 2011 16:14:33 -0400, Timon Gehr <timon.gehr gmx.ch> wrote:

 Steven Schveighoffer wrote:
 On Mon, 16 May 2011 15:52:23 -0400, Mehrdad <wfunction hotmail.com>  
 wrote:

 On 5/16/2011 12:53 PM, Timon Gehr wrote:
 In fact I even need to take that back. In order to work correctly,  
 the
 function would have to iterate downwards. It _is_ indeed buggy, and I
 should stop emitting opinions when I'm short on time...

 Andrei
Whoops, you are right: void removeAt(T)(ref T[] arr, size_t index) { foreach (i, ref item; retro(arr[1 .. index+1])) item = arr[i - 1]; arr = arr[1 .. $]; } Timon
Wouldn't that stomp on the super-slice of arr, though? As was pointed out on SO, the problem is actually always there: if someone passes arr[0 .. $] to a function, it will look as if the original array was passed, although it wasn't. Seems like it's a lot uglier than I'd thought...
arr[0..$] shouldn't be an lvalue (it is, but I think that's a bug). Therefore, you shouldn't be able to pass it as a ref argument (is this in bugzilla?). -Steve
I think it is a little bit more subtle than "not lvalue", because you actually have to be able to do something like: arr[0..2] = arr[3..5]; Where a slice appears on the left hand side of the assignment. (also a[]=b[]).
ref is not required in those cases, the array can be passed by value (and by value, I mean the pointer and length, not the data). You only need the array to be an lvalue when you want the passed-in array's pointer and length to be modified. -Steve
May 16 2011
prev sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 16 May 2011 15:16:16 -0400, %u <wfunction hotmail.com> wrote:

 Timon:
 What about:
void removeAt(T)(ref T[] arr, size_t index) { foreach (i, ref item; arr[1 .. index+1]) item = arr[i - 1]; arr = arr[1 .. $]; //note how no valid data is beyond the end of the array } Clever, but if you do this with a big enough number of items, you'll exhaust all memory. Be careful. :P
It should be fine, as long as you are using appending to grow the array. When an append requires a reallocation, it will only reserve enough space for the valid elements. However, there are certain cases that will make the array continue to grow in place by adding more pages. But this will only happen if you get more than half a page of data, and pages are available to concatenate. It's actually a pretty clever way to avoid hitting the runtime when you need to remove an element. As long as it doesn't invalidate other references to the array elements, you should be good.
 Steven:
 For the OP, you may want to consider using ArrayList from
dcollections, which supports removing an individual or range of elements, or using foreach to purge the list. Note that this class does call assumeSafeAppend on its data storage since it can make more assumptions. Ah, seems to be what I need. I guess I was also trying to argue this should really be part of the language too, though (not just Phobos), given that concatenation and even hashtables are also part of the language. It seems to be missing badly. :\
builtin arrays are full of features that are "good enough" for most usages. Removing elements the way that is least confusing is inefficient, and removing elements the most efficient way can be confusing or incorrect depending on the application. Rather than guess what the user expects, the runtime just leaves it up to the developer/standard lib. -Steve
May 16 2011
prev sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 16 May 2011 15:12:17 -0400, Timon Gehr <timon.gehr gmx.ch> wrote:


 As an example, let's say I define:

 void removeAt(T)(ref T[] arr, size_t index)
 {
    foreach (i, ref item; arr[index .. $ - 1])
         item = arr[i + 1];
     arr = arr[0 .. $ - 1];
 }

 and then I have:

 auto arr = [1, 2, 3];
 arr.removeAt(0);
 arr ~= 4;

 The last statement ***WILL*** cause a reallocation, so that doesn't  
 help.
You need to do arr.assumeSafeAppend(); Otherwise, the runtime is not aware of your invalidation of the last element.
 What about:
 void removeAt(T)(ref T[] arr, size_t index)
 {
    foreach (i, ref item; arr[1 .. index+1])
         item = arr[i - 1];
     arr = arr[1 .. $]; //note how no valid data is beyond the end of the  
 array
 }

 I did not test it but in theory this should give you the amortized  
 constant
 guarantee for ~=.
Yes, this should solve the problem, depending on the situation. Essentially, as long as you are not keeping other references to the array you expect to be updated properly (this solution "moves" the original pointer to the right each time), it should work. -Steve
May 16 2011
parent reply Mehrdad <wfunction hotmail.com> writes:
Steven:
 You need to do arr.assumeSafeAppend();  Otherwise, the runtime is no
aware of your invalidation of the last element. See my edit on SO: http://stackoverflow.com/q/6015008/541686
May 16 2011
parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 16 May 2011 15:23:31 -0400, Mehrdad <wfunction hotmail.com> wrote:

 Steven:
 You need to do arr.assumeSafeAppend();  Otherwise, the runtime is no
aware of your invalidation of the last element. See my edit on SO: http://stackoverflow.com/q/6015008/541686
I'm not an SO user, so I'll post my answer here: assumeSafeAppend is named that way because you are making an assumption, possibly an unsafe one :) You are saying, "hey, runtime, assume that despite what you know about this array's data, it's safe to append in this case," and the runtime says "ok, boss, no problem." It cannot tell that you intended to use that data you have just invalidated. What you want to do is determine if the array is safe to append *before* you eliminate the elements at the end (i.e. the elements you are "throwing away" are at the end of valid data anyways). You can do this with the capacity property: auto oldcap = arr.capacity; // do your dirty work ... if(oldcap) // capacity != 0 means you could originally append to this array, or that it filled up the block. arr.assumeSafeAppend(); This should do the trick. -Steve
May 16 2011
prev sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 16 May 2011 14:53:21 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 5/16/11 1:29 PM, Mehrdad wrote:
 As has been mentioned, std.algorithm.remove can be of help. You may  
 want to look at three of its
capabilities in particular: (a) remove multiple offsets in one pass, e.g. remove(a, 0, 4) removes the first and fifth element, (b) you can remove subranges, e.g. remove(a, tuple(1, 3)) removes the second through fourth element, and (c) if you don't care about the order in which elements are left after removal you may want to look into unstable remove, which does considerably less work. Thanks for the idea. This seems great, except for a couple of things: - I _do_ need the order to stay the same, so I can't just put in the last element.
Then you'd need to use the default swapping strategy.
 - I only need to remove one element at a time.
Then you may want to pass only one index.
 - I still don't understand how this helps. Either this modifies the  
 array directly, in which case
 adding a new element to the array after a removal would still cause a  
 reallocation every time, or
 this returns a filtered range, in which case it doesn't do what I need  
 (since it doesn't modify the
 array directly)... am I missing something?
After you remove some elements from an array by using std.algorithm.remove, the array capacity stays the same. Appending to it again will not reallocate the array if the capacity is sufficient.
This is not true. In order to adjust the used array length, you must call assumeSafeAppend on the resulting array (which I would recommend you do after using remove in this case). Remove can't do it for you, because it is a generic algorithm which takes a range, it doesn't "know" that it's working with an array. Plus it doesn't know that it's working with a dynamic array. Calling assumeSafeAppend could be a completely useless call. For the OP, you may want to consider using ArrayList from dcollections, which supports removing an individual or range of elements, or using foreach to purge the list. Note that this class does call assumeSafeAppend on its data storage since it can make more assumptions. http://www.dsource.org/projects/dcollections -Steve
May 16 2011
prev sibling next sibling parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Mon, 16 May 2011 12:54:33 +0300, Mehrdad <wfunction hotmail.com> wrote:

 Hi!

 I posted this question on StackOverflow about D:

 http://stackoverflow.com/questions/6015008/how-to-delete-an-element-
 from-an-array-in-d

 and the answers are quite surprising to me.
I've updated my answer. Sorry my original answer wasn't very helpful. -- Best regards, Vladimir mailto:vladimir thecybershadow.net
May 16 2011
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 05/16/2011 04:54 AM, Mehrdad wrote:
 Hi!

 I posted this question on StackOverflow about D:

 http://stackoverflow.com/questions/6015008/how-to-delete-an-element-
 from-an-array-in-d

 and the answers are quite surprising to me.

 Are arrays really supposed to be substitutes for array lists if we
 can't remove anything from them? It seems to me like that's a really
 basic feature we're missing here... it's really annoying whenever I
 find out that each of my arrays needs its own "counter" variable,
 even though it already has a length and a capacity.

 Doesn't that mean we still need an ArrayList(T) type, as powerful as
 arrays are supposed to be in D?
As has been mentioned, std.algorithm.remove can be of help. You may want to look at three of its capabilities in particular: (a) remove multiple offsets in one pass, e.g. remove(a, 0, 4) removes the first and fifth element, (b) you can remove subranges, e.g. remove(a, tuple(1, 3)) removes the second through fourth element, and (c) if you don't care about the order in which elements are left after removal you may want to look into unstable remove, which does considerably less work. Andrei
May 16 2011