digitalmars.D.learn - std.container & ranges

Max Wolter (19/19) Oct 30 2011 Hello there.

Jonathan M Davis (12/34) Oct 30 2011 linearRemove (and stableLinearRemove) takes a _range_ not a value. cell ...

Max Wolter (11/45) Oct 30 2011 Hello there.

Jonathan M Davis (37/49) Oct 30 2011 You would be doing exactly the same thing in C++ except that it would be...

Mike Parker (15/18) Oct 30 2011 IMO, it's much more intuitive to say list.remove(item). It's the first

Jonathan M Davis (17/38) Oct 30 2011 I definitely agree that the documentation should be improved to at least...

Tobias Pankrath (20/27) Oct 31 2011 But this thread serves as a proof that it is not obvious and something l...

Dmitry Olshansky (49/74) Oct 31 2011 It is, let's use remove from std.algorithm! ... though there is no

Jonathan M Davis (25/55) Oct 31 2011 Probably because it has to swap elements around, which requires a
Steven Schveighoffer (8/72) Oct 31 2011 ahem, using dcollections:

Dmitry Olshansky (11/89) Oct 31 2011 While computer happily does O(n) work here, my brain screams in

Steven Schveighoffer (13/105) Oct 31 2011 That could also be added (and I agree it's a good idea). Alas, interfac...

Kagamin (8/14) Nov 01 2011 may be a generic iteration handler would be more useful?

travert phare.normalesup.org (Christophe) (12/31) Nov 02 2011 That's not easier than using Tobias' improved foreach:

Dmitry Olshansky (9/14) Oct 31 2011 bugged, but
Tobias Pankrath (5/9) Nov 03 2011 To be honest, I don't understand this.

Steven Schveighoffer (5/13) Nov 03 2011 SList is a poor singly linked list implementation. It does not support ...

Dmitry Olshansky (25/40) Nov 03 2011 Yeah, poor wording going to kill me one day :) And looking at how SList

Steven Schveighoffer (19/58) Nov 03 2011 No, it's necessarily doing O(n) search for current node because it has n...

Timon Gehr (16/60) Nov 03 2011 The container interface does not expose references to the 'Node' struct,...

Steven Schveighoffer (12/78) Nov 03 2011 It does, actually. A range is a reference to a Node struct. But I stil...

Timon Gehr (10/93) Nov 03 2011 I can do that, but it will have to wait a few days, as I am quite busy

Steven Schveighoffer (6/104) Nov 03 2011 No rush. I just wanted to know if you would do it, and if not, I would ...

Timon Gehr (16/123) Nov 03 2011 My point is, if you have a struct like this:

Steven Schveighoffer (9/144) Nov 03 2011 Yes, this looks like a viable solution. Not sure how it would be done i...

Steven Schveighoffer (7/14) Nov 03 2011 Looking at dcollections' List interface, the one thing I can't implement...

Dmitry Olshansky (7/67) Nov 03 2011 removeAfter ? ;) Hm though it does hit as strange. Looking at my code, I...

Steven Schveighoffer (23/35) Nov 03 2011 For the moment, no. I am not sure whether this is the right decision or...

Dmitry Olshansky (9/44) Nov 03 2011 Well, I was mostly talking about this kind of scenario i.e. skip

Steven Schveighoffer (11/69) Nov 03 2011 c1 is a cursor, so there is no need to search for it (the point of a

Ali =?iso-8859-1?q?=C7ehreli?= (7/16) Nov 03 2011 Matt Austern's excellent "STL Singly Linked Lists" presentation is

Max Wolter (12/61) Nov 01 2011 Hello again.

Steven Schveighoffer (26/116) Nov 02 2011 The basic response to this is, when dealing with containers generically ...

Paolo Invernizzi (30/53) Nov 02 2011 do is simple, the syntax should also be simple, this is what I love =
Ary Manzana (16/136) Nov 02 2011 I don't really understand what's wrong with inefficient methods. You can...

Steven Schveighoffer (30/80) Nov 02 2011 Or use the right container for the job?

Ary Manzana (13/82) Nov 02 2011 But I'm sure in this algorithm I have for this app I'm making, my

Steven Schveighoffer (39/137) Nov 02 2011 Then your specific application can use std.algorithm.find to implement t...

Max Wolter (19/158) Nov 02 2011 Hello.

Steven Schveighoffer (25/73) Nov 03 2011 A linked list (any list really) is important if you want to maintain

Max Wolter (3/7) Nov 04 2011 Well, thank god that's cleared up.

Mike Parker (7/10) Nov 03 2011 This is the root of the problem. How are we supposed to know that we

Jonathan M Davis (5/16) Nov 03 2011 It's quite clear that we need to do a better job getting the concept of ...
Jacob Carlborg (5/17) Nov 03 2011 What about having a remove method on a container that calls remove in
Steven Schveighoffer (16/28) Nov 03 2011 The primitive for a container is remove(range). Ranges are essential to...

travert phare.normalesup.org (Christophe) (48/50) Nov 03 2011 Programmers have to learn ranges to use containers. Hiding ranges is not...

Steven Schveighoffer (23/73) Nov 03 2011 Preaching to the choir sir :)

Marco Leise (34/44) Nov 05 2011 While I stumble over this. I found that I sometimes need a certain

Steven Schveighoffer (16/66) Nov 05 2011 You should be able to do this with dcollections:

Steven Schveighoffer (25/43) Oct 31 2011 I offer an alternative collection library to std.container which has a

Max Wolter <awishformore gmail.com> writes:

Hello there.

I seem to be having problems wrapping my head around how to use the 
ranges in the context of containers in phobos. Specifically, I can't 
seem to figure out how to remove an element from a linked list.

          foreach(cell; organism)
          {
             if(cell.x == x && cell.y == y)
             {
                organism.stableLinearRemove(cell);
                break;
             }
          }

mind.d(123): Error: function 
std.container.SList!(Cell).SList.linearRemove (Range r) is not callable 
using argument types (Cell)
mind.d(123): Error: cannot implicitly convert expression (cell) of type 
cell.Cell to Take!(Range)

I somehow get the feeling such a basic operation should just...work?

/Max

Oct 30 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Sunday, October 30, 2011 11:38:30 Max Wolter wrote:
 Hello there.
 
 I seem to be having problems wrapping my head around how to use the
 ranges in the context of containers in phobos. Specifically, I can't
 seem to figure out how to remove an element from a linked list.
 
           foreach(cell; organism)
           {
              if(cell.x == x && cell.y == y)
              {
                 organism.stableLinearRemove(cell);
                 break;
              }
           }
 
 mind.d(123): Error: function
 std.container.SList!(Cell).SList.linearRemove (Range r) is not callable
 using argument types (Cell)
 mind.d(123): Error: cannot implicitly convert expression (cell) of type
 cell.Cell to Take!(Range)
 
 I somehow get the feeling such a basic operation should just...work?

linearRemove (and stableLinearRemove) takes a _range_ not a value. cell is an 
element in the list, not a range over the list. The range that it takes must 
be either of type SList.Range or Take!(SList.Range). You get that range by 
slicing an SList. Take!(SList.Range) is for the case where you want only a 
portion of the beginning of a range rather than the whole range. Your example 
actually has a really simple solution:

auto found = find!((a){return a.x == x && a.y == y;})(organism[]);
organism.stableLinearRemove(take(found, 1));

It finds the element in the list that you're looking for, and then it passes a 
range with that one element to stableLinearRemove so that it'll remove it.

- Jonathan M Davis

Oct 30 2011

Max Wolter <awishformore gmail.com> writes:

On 10/30/2011 6:45 PM, Jonathan M Davis wrote:
 On Sunday, October 30, 2011 11:38:30 Max Wolter wrote:
 Hello there.

 I seem to be having problems wrapping my head around how to use the
 ranges in the context of containers in phobos. Specifically, I can't
 seem to figure out how to remove an element from a linked list.

            foreach(cell; organism)
            {
               if(cell.x == x&&  cell.y == y)
               {
                  organism.stableLinearRemove(cell);
                  break;
               }
            }

 mind.d(123): Error: function
 std.container.SList!(Cell).SList.linearRemove (Range r) is not callable
 using argument types (Cell)
 mind.d(123): Error: cannot implicitly convert expression (cell) of type
 cell.Cell to Take!(Range)

 I somehow get the feeling such a basic operation should just...work?

 linearRemove (and stableLinearRemove) takes a _range_ not a value. cell is an
 element in the list, not a range over the list. The range that it takes must
 be either of type SList.Range or Take!(SList.Range). You get that range by
 slicing an SList. Take!(SList.Range) is for the case where you want only a
 portion of the beginning of a range rather than the whole range. Your example
 actually has a really simple solution:

 auto found = find!((a){return a.x == x&&  a.y == y;})(organism[]);
 organism.stableLinearRemove(take(found, 1));

 It finds the element in the list that you're looking for, and then it passes a
 range with that one element to stableLinearRemove so that it'll remove it.

 - Jonathan M Davis

Hello there.

Thank you very much for the explanation.

However, while I really liked the concept of ranges in Andrei's book and 
a lot of it seems intuitive and faster than using iterators, I can't 
shake the feeling that in this case, it's just unnecessarily convoluted.

Maybe it's just the fact that the container library is still very basic, 
but I don't think I should go through such a complicated procedure to 
remove an known element from a list. It's just not a "very simple" or 
intuitive solution, which is something I came to love D for thus far.

/Max

Oct 30 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Sunday, October 30, 2011 20:53:02 Max Wolter wrote:
 Hello there.
 
 Thank you very much for the explanation.
 
 However, while I really liked the concept of ranges in Andrei's book and
 a lot of it seems intuitive and faster than using iterators, I can't
 shake the feeling that in this case, it's just unnecessarily convoluted.
 
 Maybe it's just the fact that the container library is still very basic,
 but I don't think I should go through such a complicated procedure to
 remove an known element from a list. It's just not a "very simple" or
 intuitive solution, which is something I came to love D for thus far.

You would be doing exactly the same thing in C++ except that it would be with 
an iterator rather than a range. You would use find to find the iterator to the 
element that you wanted and then you'd pass that to the list's erase function. 
It is only more complicated in D in that you get a range which _starts_ with 
the element that you want (essentially, you get the iterator to that element 
plus the end iterator for the list), so if you want to remove only the first 
element, you take it from the front of the range. Other than that, it's the 
same as in C++. You can't remove an element from a list in C++ by giving that 
element to the list anymore than you can in D. In both cases, you need an 
iterator or a range.

So, in comparison to C++, there's no significant difference. Now, Java does
have 
a remove function which will take an element and remove the first occurence of 
that element from a list, and we could theoretically add one, but why? It 
would just be duplicating the functionality that find already gives. Java 
doesn't use iterators the way the C++ does, and it doesn't have ranges, so it 
_can't_ have a find function the way that C++ and D do, but D can. And in some 
cases, find can do it more efficiently than your loop could have.

I grant you that if you're not used to using find like this in C++ (and far, 
far too many C++ programmers use loops instead of find - in part because pre-
C++11, there were no lambdas and any time you need a custom comparison 
function, it's a pain to get one), then it's not immediately intuitive, but 
it's far more flexible and powerful than removing elements by giving the 
element to a remove function on the list. And if you really want to use a 
loop, then you still can. You just can't use foreach.

for(auto r = organism[]; !r.empty; r.popFront())
{
    if(r.front.x == x && r.front.y == y)
    {
        organism.stableLinearRemove(take(r, 1));
        break;
    }
}

but that's a lot more verbose than simply using find, and as I said, in at 
least some cases, find can be more efficient than simply looping, since it's 
optimized for finding elements.

- Jonathan M Davis

Oct 30 2011

Mike Parker <aldacron gmail.com> writes:

On 10/31/2011 5:28 AM, Jonathan M Davis wrote:


 So, in comparison to C++, there's no significant difference. Now, Java does
have
 a remove function which will take an element and remove the first occurence of
 that element from a list, and we could theoretically add one, but why?

IMO, it's much more intuitive to say list.remove(item). It's the first 
thing a lot of people expect coming from a Java background, that's for 
sure. The first time I tried to use SList, being unfamiliar with ranges 
as I was, it took a while to figure out what I needed to do. IIRC, I had 
to post here to ask.

The problem is that you really have to understand ranges and the Phobos 
functions that manipulate them before you can begin to use containers. I 
think that's wrong. Ideally, containers should be usable without having 
to know about find and Take and whatever else. This isn't the first time 
this question has come up in the NG and I've got a feeling it won't be 
the last.

At the very least it would be nice to see something in the documentation 
tying std.algorithm and std.range together with std.container. Something 
to point the way for those to whom it isn't obvious.

Oct 30 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Monday, October 31, 2011 12:11:45 Mike Parker wrote:
 On 10/31/2011 5:28 AM, Jonathan M Davis wrote:
 So, in comparison to C++, there's no significant difference. Now, Java
 does have a remove function which will take an element and remove the
 first occurence of that element from a list, and we could theoretically
 add one, but why?

 IMO, it's much more intuitive to say list.remove(item). It's the first
 thing a lot of people expect coming from a Java background, that's for
 sure. The first time I tried to use SList, being unfamiliar with ranges
 as I was, it took a while to figure out what I needed to do. IIRC, I had
 to post here to ask.
 
 The problem is that you really have to understand ranges and the Phobos
 functions that manipulate them before you can begin to use containers. I
 think that's wrong. Ideally, containers should be usable without having
 to know about find and Take and whatever else. This isn't the first time
 this question has come up in the NG and I've got a feeling it won't be
 the last.
 
 At the very least it would be nice to see something in the documentation
 tying std.algorithm and std.range together with std.container. Something
 to point the way for those to whom it isn't obvious.

I definitely agree that the documentation should be improved to at least give 
the basic example of using find with linearRemove. Adding something similar to 
C++'s remove_if which removes all elements which match a predicate could also 
be useful, but I don't at all agree that there should be a function which 
takes an element and removes the first element which is equal to it. find
allows 
you to do that just fine, and such a remove function would simply be 
duplicating its functionality.

As for understanding ranges, it's definitely true that there needs to be more 
documentation and/or articles on them so that the concept can better 
communicated. That's a definite flaw in the current documentation. But there's
a 
lot in Phobos which you're just not going to be able to use if you don't 
understand ranges, and the library would definitely be worse if it used them 
less, since they're such a powerful concept, so I think that the problem is 
the lack of communication on ranges, not the fact that ranges are used so 
heavily in Phobos.

- Jonathan M Davis

Oct 30 2011

Tobias Pankrath <tobias pankrath.net> writes:

Jonathan M Davis wrote:

  find allows
 you to do that just fine, and such a remove function would simply be
 duplicating its functionality.

But this thread serves as a proof that it is not obvious and something like 
containers should be obvious to use. Deleting an element with a certain 
value is a common task and should should not take
three function calls, with functions from three different modules.

 You would be doing exactly the same thing in C++ except that it would be
 with an iterator rather than a range. You would use find to find the
 iterator to the element that you wanted and then you'd pass that to the
 list's erase function. 

I don't think we should refer to C++ as an benchmark for ease of use.

How do you delete every occurence of v? Not just the first one? Is this
equally "easy" with find and take? I didn't find an easy way not 
envolving a loop within 10 Minutes by reading the documentation.

What do you think of an construct like this:

foreach(item, itemRange; container)
{
// itemRange is a range only containing item
// itemRange.length == 1 && itemRange.front == item
}

That would be analogous to foreach over array with an index. You might think
that iterating explicity may not be elegant. But it is what people want to
do and it should not be any harder than necessary.

--
Tobias

Oct 31 2011

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

On 31.10.2011 11:16, Tobias Pankrath wrote:
 Jonathan M Davis wrote:

   find allows
 you to do that just fine, and such a remove function would simply be
 duplicating its functionality.

 But this thread serves as a proof that it is not obvious and something like
 containers should be obvious to use. Deleting an element with a certain
 value is a common task and should should not take
 three function calls, with functions from three different modules.

 You would be doing exactly the same thing in C++ except that it would be
 with an iterator rather than a range. You would use find to find the
 iterator to the element that you wanted and then you'd pass that to the
 list's erase function.

 I don't think we should refer to C++ as an benchmark for ease of use.

 How do you delete every occurence of v? Not just the first one? Is this
 equally "easy" with find and take? I didn't find an easy way not
 envolving a loop within 10 Minutes by reading the documentation.

It is, let's use remove from std.algorithm! ... though there is no 
overload of remove for forward ranges ... crap, worthy of bugzilla 
report. Or SList could be special enough to offer a built-in remove with 
predicate done in O(n).

However this should work( yet because of apparent bug in SList it 
doesn't). The bug is:
Line 1294 of std.container should be:
	 for (; !r.empty; r.popFront())
...
or it will remove the whole container.

Code:
import std.container, std.range;
import std.functional, std.algorithm, std.stdio;

void removeFromList(alias pred, T)(ref SList!T s)
{
	static if(is(typeof(pred) == string))
		alias unaryFun!pred Pred;
	else
		alias pred Pred;
	for(auto r=s[]; !r.empty; )
	{
		if(Pred(r.front))
		{
			r = s.linearRemove(take(r,1));
			continue;
		}
		else
			r.popFront();
	}
}

void main()
{
	SList!int list(20, 30, 40, 50, 60); // will work with stock Phobos if 
the first element % 20 != 0
	removeFromList!"a % 20 == 0"(list);
	writeln(list[]);
	
}

And more importantly, it still would be horribly slow O(N^2). 
Personally, because of that I'd prefer hand-rolled intrusive 
singly-linked list any time of day.

 What do you think of an construct like this:

 foreach(item, itemRange; container)
 {
 // itemRange is a range only containing item
 // itemRange.length == 1&&  itemRange.front == item
 }

 That would be analogous to foreach over array with an index. You might think
 that iterating explicity may not be elegant. But it is what people want to
 do and it should not be any harder than necessary.

You still can, just use plain range interface with empty/front/popFront.
for(auto r = list[]; !r.empty; r.popFront())
{
	...
}


-- 
Dmitry Olshansky

Oct 31 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Monday, October 31, 2011 13:16:11 Dmitry Olshansky wrote:
 On 31.10.2011 11:16, Tobias Pankrath wrote:
 Jonathan M Davis wrote:
   find allows
 
 you to do that just fine, and such a remove function would simply be
 duplicating its functionality.

 
 But this thread serves as a proof that it is not obvious and something
 like containers should be obvious to use. Deleting an element with a
 certain value is a common task and should should not take
 three function calls, with functions from three different modules.
 
 You would be doing exactly the same thing in C++ except that it would
 be
 with an iterator rather than a range. You would use find to find the
 iterator to the element that you wanted and then you'd pass that to
 the
 list's erase function.

 
 I don't think we should refer to C++ as an benchmark for ease of use.
 
 How do you delete every occurence of v? Not just the first one? Is this
 equally "easy" with find and take? I didn't find an easy way not
 envolving a loop within 10 Minutes by reading the documentation.

 
 It is, let's use remove from std.algorithm! ... though there is no
 overload of remove for forward ranges ... crap, worthy of bugzilla
 report.

Probably because it has to swap elements around, which requires a 
bidirectional range.

 Or SList could be special enough to offer a built-in remove with
 predicate done in O(n).

It would be valuable in general IMHO. We should add something similar to the 
STL's remove_if function on linked lists. The fact that std.algorith.remove 
must shuffle elements around instead of actually removing them (similar to what 
the STL does with its free function erase) just shows how it doesn't really 
work all that well. It can be done _much_ more efficiently by putting remove_if 
(or whatever we would call it) on the container. It's also less confusing, 
since std.algorithm.remove (and the STL's erase function) doesn't actually 
_anything_. You have to pass the returned range (or iterator in C++) to the 
container's linearRemove function (or erase in C++) to remove the elements, 
which is _highly_ confusing (but since iterators and ranges don't really know 
about their containers beyond the elements that they reference, it's the best 
that can be done with a free function).

For the case where you're searching for a specific element to remove though, I 
see no reason to add another function. That's what find is for (and given that 
we have lambdas, it's easy to use even in the case where you need a predicate 
other than ==), and if you really want to, you can iterate the range 
explicitly and pass the portion of the range to the container's remove 
function that you want to remove. The main problem with it is that we don't 
have proper examples, and we don't have enough documentation explaining ranges 
in general, so it's not immediately obvious to people how to properly use a 
container's remove function.

- Jonathan M Davis

Oct 31 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Mon, 31 Oct 2011 05:16:11 -0400, Dmitry Olshansky  
<dmitry.olsh gmail.com> wrote:

 On 31.10.2011 11:16, Tobias Pankrath wrote:
 Jonathan M Davis wrote:

   find allows
 you to do that just fine, and such a remove function would simply be
 duplicating its functionality.

 But this thread serves as a proof that it is not obvious and something  
 like
 containers should be obvious to use. Deleting an element with a certain
 value is a common task and should should not take
 three function calls, with functions from three different modules.

 You would be doing exactly the same thing in C++ except that it would  
 be
 with an iterator rather than a range. You would use find to find the
 iterator to the element that you wanted and then you'd pass that to the
 list's erase function.

 I don't think we should refer to C++ as an benchmark for ease of use.

 How do you delete every occurence of v? Not just the first one? Is this
 equally "easy" with find and take? I didn't find an easy way not
 envolving a loop within 10 Minutes by reading the documentation.

 It is, let's use remove from std.algorithm! ... though there is no  
 overload of remove for forward ranges ... crap, worthy of bugzilla  
 report. Or SList could be special enough to offer a built-in remove with  
 predicate done in O(n).

 However this should work( yet because of apparent bug in SList it  
 doesn't). The bug is:
 Line 1294 of std.container should be:
 	 for (; !r.empty; r.popFront())
 ...
 or it will remove the whole container.

 Code:
 import std.container, std.range;
 import std.functional, std.algorithm, std.stdio;

 void removeFromList(alias pred, T)(ref SList!T s)
 {
 	static if(is(typeof(pred) == string))
 		alias unaryFun!pred Pred;
 	else
 		alias pred Pred;
 	for(auto r=s[]; !r.empty; )
 	{
 		if(Pred(r.front))
 		{
 			r = s.linearRemove(take(r,1));
 			continue;
 		}
 		else
 			r.popFront();
 	}
 }

 void main()
 {
 	SList!int list(20, 30, 40, 50, 60); // will work with stock Phobos if  
 the first element % 20 != 0
 	removeFromList!"a % 20 == 0"(list);
 	writeln(list[]);
 	
 }

 And more importantly, it still would be horribly slow O(N^2).  
 Personally, because of that I'd prefer hand-rolled intrusive  
 singly-linked list any time of day.

ahem, using dcollections:

foreach(ref doRemove, cell; &organism.purge)
     doRemove = cell.x == x && cell.y == y;

complexity: O(n)

:)

-Steve

Oct 31 2011

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

On 31.10.2011 18:38, Steven Schveighoffer wrote:
 On Mon, 31 Oct 2011 05:16:11 -0400, Dmitry Olshansky
 <dmitry.olsh gmail.com> wrote:

 On 31.10.2011 11:16, Tobias Pankrath wrote:
 Jonathan M Davis wrote:

 find allows
 you to do that just fine, and such a remove function would simply be
 duplicating its functionality.

 But this thread serves as a proof that it is not obvious and
 something like
 containers should be obvious to use. Deleting an element with a certain
 value is a common task and should should not take
 three function calls, with functions from three different modules.

 You would be doing exactly the same thing in C++ except that it
 would be
 with an iterator rather than a range. You would use find to find the
 iterator to the element that you wanted and then you'd pass that to the
 list's erase function.

 I don't think we should refer to C++ as an benchmark for ease of use.

 How do you delete every occurence of v? Not just the first one? Is this
 equally "easy" with find and take? I didn't find an easy way not
 envolving a loop within 10 Minutes by reading the documentation.

 It is, let's use remove from std.algorithm! ... though there is no
 overload of remove for forward ranges ... crap, worthy of bugzilla
 report. Or SList could be special enough to offer a built-in remove
 with predicate done in O(n).

 However this should work( yet because of apparent bug in SList it
 doesn't). The bug is:
 Line 1294 of std.container should be:
 for (; !r.empty; r.popFront())
 ...
 or it will remove the whole container.

 Code:
 import std.container, std.range;
 import std.functional, std.algorithm, std.stdio;

 void removeFromList(alias pred, T)(ref SList!T s)
 {
 static if(is(typeof(pred) == string))
 alias unaryFun!pred Pred;
 else
 alias pred Pred;
 for(auto r=s[]; !r.empty; )
 {
 if(Pred(r.front))
 {
 r = s.linearRemove(take(r,1));
 continue;
 }
 else
 r.popFront();
 }
 }

 void main()
 {
 SList!int list(20, 30, 40, 50, 60); // will work with stock Phobos if
 the first element % 20 != 0
 removeFromList!"a % 20 == 0"(list);
 writeln(list[]);

 }

 And more importantly, it still would be horribly slow O(N^2).
 Personally, because of that I'd prefer hand-rolled intrusive
 singly-linked list any time of day.

 ahem, using dcollections:

 foreach(ref doRemove, cell; &organism.purge)
 doRemove = cell.x == x && cell.y == y;

 complexity: O(n)

While computer happily does O(n) work here, my brain screams in 
cognitive dissonance figuring out that this assignment actually does 
delete... meh voodoo through and through :)

I'd rather
organism.remove!((cell){ return cell.x == x && cell.y == y; })();
or something like that.
But it's not a question of form, but more of a missing stuff that should 
be there.

 :)

 -Steve


-- 
Dmitry Olshansky

Oct 31 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Mon, 31 Oct 2011 12:40:37 -0400, Dmitry Olshansky  
<dmitry.olsh gmail.com> wrote:

 On 31.10.2011 18:38, Steven Schveighoffer wrote:
 On Mon, 31 Oct 2011 05:16:11 -0400, Dmitry Olshansky
 <dmitry.olsh gmail.com> wrote:

 On 31.10.2011 11:16, Tobias Pankrath wrote:
 Jonathan M Davis wrote:

 find allows
 you to do that just fine, and such a remove function would simply be
 duplicating its functionality.

 But this thread serves as a proof that it is not obvious and
 something like
 containers should be obvious to use. Deleting an element with a  
 certain
 value is a common task and should should not take
 three function calls, with functions from three different modules.

 You would be doing exactly the same thing in C++ except that it
 would be
 with an iterator rather than a range. You would use find to find the
 iterator to the element that you wanted and then you'd pass that to  
 the
 list's erase function.

 I don't think we should refer to C++ as an benchmark for ease of use.

 How do you delete every occurence of v? Not just the first one? Is  
 this
 equally "easy" with find and take? I didn't find an easy way not
 envolving a loop within 10 Minutes by reading the documentation.

 It is, let's use remove from std.algorithm! ... though there is no
 overload of remove for forward ranges ... crap, worthy of bugzilla
 report. Or SList could be special enough to offer a built-in remove
 with predicate done in O(n).

 However this should work( yet because of apparent bug in SList it
 doesn't). The bug is:
 Line 1294 of std.container should be:
 for (; !r.empty; r.popFront())
 ...
 or it will remove the whole container.

 Code:
 import std.container, std.range;
 import std.functional, std.algorithm, std.stdio;

 void removeFromList(alias pred, T)(ref SList!T s)
 {
 static if(is(typeof(pred) == string))
 alias unaryFun!pred Pred;
 else
 alias pred Pred;
 for(auto r=s[]; !r.empty; )
 {
 if(Pred(r.front))
 {
 r = s.linearRemove(take(r,1));
 continue;
 }
 else
 r.popFront();
 }
 }

 void main()
 {
 SList!int list(20, 30, 40, 50, 60); // will work with stock Phobos if
 the first element % 20 != 0
 removeFromList!"a % 20 == 0"(list);
 writeln(list[]);

 }

 And more importantly, it still would be horribly slow O(N^2).
 Personally, because of that I'd prefer hand-rolled intrusive
 singly-linked list any time of day.

 ahem, using dcollections:

 foreach(ref doRemove, cell; &organism.purge)
 doRemove = cell.x == x && cell.y == y;

 complexity: O(n)

 While computer happily does O(n) work here, my brain screams in  
 cognitive dissonance figuring out that this assignment actually does  
 delete... meh voodoo through and through :)

 I'd rather
 organism.remove!((cell){ return cell.x == x && cell.y == y; })();
 or something like that.
 But it's not a question of form, but more of a missing stuff that should  
 be there.

That could also be added (and I agree it's a good idea).  Alas, interface  
templates still don't work, so it'd have to be defined as a convention.

Note that this isn't really the intended usage for purge (though it works  
quite nicely as an afterthought).  Typically, when I use purge, I'm  
iterating through all the elements, processing them somehow, and then  
deciding to remove them or not based on the processing.

For instance, a main thread processing results from sub-threads, and  
removing ones that are finished.

Think of it as an implementation of Java's Iterator.remove:

http://download.oracle.com/javase/7/docs/api/java/util/Iterator.html#remove()

-Steve

Oct 31 2011

Kagamin <spam here.lot> writes:

Steven Schveighoffer Wrote:

 ahem, using dcollections:
 
 foreach(ref doRemove, cell; &organism.purge)
      doRemove = cell.x == x && cell.y == y;
 
 complexity: O(n)

may be a generic iteration handler would be more useful?

foreach(ref handler, cell; &organism.each)
   if(cell.x == x && cell.y == y) handler.removeCurrent();

it could provide a whole api, say, you may want to have lookahead

foreach(ref handler, cell; &organism.each)
   if(cell.x == x && cell.y == y && handler.next.x==0)
      handler.removeCurrent();

Nov 01 2011

travert phare.normalesup.org (Christophe) writes:

Kagamin , dans le message (digitalmars.D.learn:30362), a écrit :
 Steven Schveighoffer Wrote:
 
 ahem, using dcollections:
 
 foreach(ref doRemove, cell; &organism.purge)
      doRemove = cell.x == x && cell.y == y;
 
 complexity: O(n)

 
 may be a generic iteration handler would be more useful?
 
 foreach(ref handler, cell; &organism.each)
    if(cell.x == x && cell.y == y) handler.removeCurrent();
 
 it could provide a whole api, say, you may want to have lookahead
 
 foreach(ref handler, cell; &organism.each)
    if(cell.x == x && cell.y == y && handler.next.x==0)
       handler.removeCurrent();


That's not easier than using Tobias' improved foreach:

foreach(cell, cellRange; organism[])
  if (cell.x == x && cell.y == y)
    organism.remove(cellRange);


If you want to use algorithm specialized and optimized for the 
container, I would prefer to use purge than each + handler. purge seems 
better to me, because it can be specialized for the container and for 
the action, and do not require to learn a new handler structure. "each" 
do not seem to add a lot  to foreach(cell, cellRange; range), since it 
must be able to cope with any operation provided by handler, and any 
operation combinaisons...

Nov 02 2011

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

Looks like it was accidental cross-posting:

On 31.10.2011 20:11, Alessandro Stamatto wrote:
     it still would be horribly slow O(N^2).
     Personally, because of that I'd prefer hand-rolled intrusive
     singly-linked list any time of day.


 Now you're scaring me... You're saying that SList in D not only is 

bugged, but
 a templated remove_if would run in O(N^2) instead of O(N) ????!!!!

Basically I'm saying that with current SList *implementation* it will 
run in O(N^2), because for some reason it always do a check on each 
range passed to remove that it does belong to this list. (i.e. it walks 
from head till hits it or it's own tail)

-- 
Dmitry Olshansky

Oct 31 2011

Tobias Pankrath <tobias pankrath.net> writes:

Dmitry Olshansky wrote:

 And more importantly, it still would be horribly slow O(N^2).
 Personally, because of that I'd prefer hand-rolled intrusive
 singly-linked list any time of day.
 

To be honest, I don't understand this. 
A "remove_if" for non-intrusive single-linked lists should be doable in 
O(N).

I still think the interface to container lacks a important feature here.

Nov 03 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Thu, 03 Nov 2011 10:02:22 -0400, Tobias Pankrath <tobias pankrath.net>  
wrote:

 Dmitry Olshansky wrote:

 And more importantly, it still would be horribly slow O(N^2).
 Personally, because of that I'd prefer hand-rolled intrusive
 singly-linked list any time of day.

 To be honest, I don't understand this.
 A "remove_if" for non-intrusive single-linked lists should be doable in
 O(N).

SList is a poor singly linked list implementation.  It does not support  
O(1) removal.

-Steve

Nov 03 2011

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

On 03.11.2011 19:34, Steven Schveighoffer wrote:
 On Thu, 03 Nov 2011 10:02:22 -0400, Tobias Pankrath
 <tobias pankrath.net> wrote:

 Dmitry Olshansky wrote:

 And more importantly, it still would be horribly slow O(N^2).
 Personally, because of that I'd prefer hand-rolled intrusive
 singly-linked list any time of day.

 To be honest, I don't understand this.
 A "remove_if" for non-intrusive single-linked lists should be doable in
 O(N).


Yeah, poor wording going to kill me one day :) And looking at how SList 
works with "checked" remove does O(N^2) turned out to be a context for 
reference for intrusive singly-linked list, just my bad.

As for why I'd rather go with intrusive lists, it's because of it 
usually uses less memory due to node structures being more padding-free.
Anyway the point should have been focused on _hand-rolled_ part, mostly 
because SList is plain not ready for prime time*. And btw singly-linked 
list it's not hard to implement and you can fine tune it how you like it 
e.g. they do play nice with free list allocation strategy. Not to say of 
circular lists and/or using sentinels.
 SList is a poor singly linked list implementation. It does not support
 O(1) removal.

 -Steve

Aye, looking at SList implementation I can say that it sort of tries to 
verify that this is a correct list. Otherwise it would be O(1). Indeed 
passing wrong range/iterator is quite bad in linked lists and could lead 
to all manner of funky bugs, but the cost is horrific. Especially since 
insert doesn't do this extra check.

Actually, it opens up a question of "Checked ranges" akin to "Checked 
iterators" used in many STL implementations. Any ideas what they can 
carry around as an "ID" of list? Root pointer won't cut it, as it can be 
easily removed during iteration. If SList was a class it's reference 
probably would do.

* I think I'll scrap together a pull request to address both these 
issues though.

-- 
Dmitry Olshansky

Nov 03 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Thu, 03 Nov 2011 12:35:36 -0400, Dmitry Olshansky  
<dmitry.olsh gmail.com> wrote:

 On 03.11.2011 19:34, Steven Schveighoffer wrote:
 On Thu, 03 Nov 2011 10:02:22 -0400, Tobias Pankrath
 <tobias pankrath.net> wrote:

 Dmitry Olshansky wrote:

 And more importantly, it still would be horribly slow O(N^2).
 Personally, because of that I'd prefer hand-rolled intrusive
 singly-linked list any time of day.

 To be honest, I don't understand this.
 A "remove_if" for non-intrusive single-linked lists should be doable in
 O(N).


 Yeah, poor wording going to kill me one day :) And looking at how SList  
 works with "checked" remove does O(N^2) turned out to be a context for  
 reference for intrusive singly-linked list, just my bad.

 As for why I'd rather go with intrusive lists, it's because of it  
 usually uses less memory due to node structures being more padding-free.
 Anyway the point should have been focused on _hand-rolled_ part, mostly  
 because SList is plain not ready for prime time*. And btw singly-linked  
 list it's not hard to implement and you can fine tune it how you like it  
 e.g. they do play nice with free list allocation strategy. Not to say of  
 circular lists and/or using sentinels.
 SList is a poor singly linked list implementation. It does not support
 O(1) removal.

 -Steve

 Aye, looking at SList implementation I can say that it sort of tries to  
 verify that this is a correct list. Otherwise it would be O(1). Indeed  
 passing wrong range/iterator is quite bad in linked lists and could lead  
 to all manner of funky bugs, but the cost is horrific. Especially since  
 insert doesn't do this extra check.

No, it's necessarily doing O(n) search for current node because it has no  
reference to the previous node.

The range type for a SList has a single pointer to the currently iterated  
node.  How do you remove that node without having access to the  
head/previous pointer?

I suggested to Andrei having the range keep track of the *previous*  
node/head, but he did not like that idea (too many dereferences for  
front()).  Another option is to have a removeAllButFirst method, but  
that's kind of awkward...

 Actually, it opens up a question of "Checked ranges" akin to "Checked  
 iterators" used in many STL implementations. Any ideas what they can  
 carry around as an "ID" of list? Root pointer won't cut it, as it can be  
 easily removed during iteration. If SList was a class it's reference  
 probably would do.

dcollections stipulates that all ranges/cursors can be verified in O(lgn)  
time or better to belong to a specific container.  In some cases, this  
adds an extra word to the range/cursor, and in others, it's easy to  
determine or the owner-reference was already needed.  Since everything is  
a class, the fallback is to just stick an owner class instance in the  
range.

This stipulation is necessary to allow safe slicing.

-Steve

Nov 03 2011

Timon Gehr <timon.gehr gmx.ch> writes:

On 11/03/2011 06:13 PM, Steven Schveighoffer wrote:
 On Thu, 03 Nov 2011 12:35:36 -0400, Dmitry Olshansky
 <dmitry.olsh gmail.com> wrote:

 On 03.11.2011 19:34, Steven Schveighoffer wrote:
 On Thu, 03 Nov 2011 10:02:22 -0400, Tobias Pankrath
 <tobias pankrath.net> wrote:

 Dmitry Olshansky wrote:

 And more importantly, it still would be horribly slow O(N^2).
 Personally, because of that I'd prefer hand-rolled intrusive
 singly-linked list any time of day.

 To be honest, I don't understand this.
 A "remove_if" for non-intrusive single-linked lists should be doable in
 O(N).


 Yeah, poor wording going to kill me one day :) And looking at how
 SList works with "checked" remove does O(N^2) turned out to be a
 context for reference for intrusive singly-linked list, just my bad.

 As for why I'd rather go with intrusive lists, it's because of it
 usually uses less memory due to node structures being more padding-free.
 Anyway the point should have been focused on _hand-rolled_ part,
 mostly because SList is plain not ready for prime time*. And btw
 singly-linked list it's not hard to implement and you can fine tune it
 how you like it e.g. they do play nice with free list allocation
 strategy. Not to say of circular lists and/or using sentinels.
 SList is a poor singly linked list implementation. It does not support
 O(1) removal.

 -Steve

 Aye, looking at SList implementation I can say that it sort of tries
 to verify that this is a correct list. Otherwise it would be O(1).
 Indeed passing wrong range/iterator is quite bad in linked lists and
 could lead to all manner of funky bugs, but the cost is horrific.
 Especially since insert doesn't do this extra check.

 No, it's necessarily doing O(n) search for current node because it has
 no reference to the previous node.

 The range type for a SList has a single pointer to the currently
 iterated node. How do you remove that node without having access to the
 head/previous pointer?

The container interface does not expose references to the 'Node' struct, 
therefore the following approach would be practical:

0. Have a sentinel for end of list. (O(1) additional memory for the 
entire list). It is the only node in the list that has a null 'next' 
pointer.

example, for removing just the current node:
1. if the following node is not the sentinel
1.1. Move value from following node into the current node.
1.2. Remove following node.
2. if the following node is the sentinel
2.1. erase the contents of the current node (iff it has indirections)
2.2. set it's 'next' field to null


Removing an entire range is just erasing the contents of the current 
node and setting the 'next' field of the associated Node to zero. 
Removing a Take!Range is a simple generalization of the algorithm above.

Nov 03 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Thu, 03 Nov 2011 13:47:28 -0400, Timon Gehr <timon.gehr gmx.ch> wrote:

 On 11/03/2011 06:13 PM, Steven Schveighoffer wrote:
 On Thu, 03 Nov 2011 12:35:36 -0400, Dmitry Olshansky
 <dmitry.olsh gmail.com> wrote:

 On 03.11.2011 19:34, Steven Schveighoffer wrote:
 On Thu, 03 Nov 2011 10:02:22 -0400, Tobias Pankrath
 <tobias pankrath.net> wrote:

 Dmitry Olshansky wrote:

 And more importantly, it still would be horribly slow O(N^2).
 Personally, because of that I'd prefer hand-rolled intrusive
 singly-linked list any time of day.

 To be honest, I don't understand this.
 A "remove_if" for non-intrusive single-linked lists should be doable  
 in
 O(N).


 Yeah, poor wording going to kill me one day :) And looking at how
 SList works with "checked" remove does O(N^2) turned out to be a
 context for reference for intrusive singly-linked list, just my bad.

 As for why I'd rather go with intrusive lists, it's because of it
 usually uses less memory due to node structures being more  
 padding-free.
 Anyway the point should have been focused on _hand-rolled_ part,
 mostly because SList is plain not ready for prime time*. And btw
 singly-linked list it's not hard to implement and you can fine tune it
 how you like it e.g. they do play nice with free list allocation
 strategy. Not to say of circular lists and/or using sentinels.
 SList is a poor singly linked list implementation. It does not support
 O(1) removal.

 -Steve

 Aye, looking at SList implementation I can say that it sort of tries
 to verify that this is a correct list. Otherwise it would be O(1).
 Indeed passing wrong range/iterator is quite bad in linked lists and
 could lead to all manner of funky bugs, but the cost is horrific.
 Especially since insert doesn't do this extra check.

 No, it's necessarily doing O(n) search for current node because it has
 no reference to the previous node.

 The range type for a SList has a single pointer to the currently
 iterated node. How do you remove that node without having access to the
 head/previous pointer?

 The container interface does not expose references to the 'Node' struct,  
 therefore the following approach would be practical:

It does, actually.  A range is a reference to a Node struct.  But I still  
think the following approach will work.

 0. Have a sentinel for end of list. (O(1) additional memory for the  
 entire list). It is the only node in the list that has a null 'next'  
 pointer.

 example, for removing just the current node:
 1. if the following node is not the sentinel
 1.1. Move value from following node into the current node.
 1.2. Remove following node.
 2. if the following node is the sentinel
 2.1. erase the contents of the current node (iff it has indirections)
 2.2. set it's 'next' field to null


 Removing an entire range is just erasing the contents of the current  
 node and setting the 'next' field of the associated Node to zero.  
 Removing a Take!Range is a simple generalization of the algorithm above.

This would be a good addition to the type.  It could not be a  
stableRemove, however, because it would invalidate any ranges pointing at  
the node you removed.  Can you put together a pull request?  If not, I can  
see about doing it.  One issue, you could not do this if the value is  
immutable/const.

I could use this idea, I think, to implement a singly linked list in  
dcollections as well (the prospect of not having O(1) removal is what has  
stopped me).  Thanks for the idea!

-Steve

Nov 03 2011

Timon Gehr <timon.gehr gmx.ch> writes:

On 11/03/2011 07:15 PM, Steven Schveighoffer wrote:
 On Thu, 03 Nov 2011 13:47:28 -0400, Timon Gehr <timon.gehr gmx.ch> wrote:

 On 11/03/2011 06:13 PM, Steven Schveighoffer wrote:
 On Thu, 03 Nov 2011 12:35:36 -0400, Dmitry Olshansky
 <dmitry.olsh gmail.com> wrote:

 On 03.11.2011 19:34, Steven Schveighoffer wrote:
 On Thu, 03 Nov 2011 10:02:22 -0400, Tobias Pankrath
 <tobias pankrath.net> wrote:

 Dmitry Olshansky wrote:

 And more importantly, it still would be horribly slow O(N^2).
 Personally, because of that I'd prefer hand-rolled intrusive
 singly-linked list any time of day.

 To be honest, I don't understand this.
 A "remove_if" for non-intrusive single-linked lists should be
 doable in
 O(N).


 Yeah, poor wording going to kill me one day :) And looking at how
 SList works with "checked" remove does O(N^2) turned out to be a
 context for reference for intrusive singly-linked list, just my bad.

 As for why I'd rather go with intrusive lists, it's because of it
 usually uses less memory due to node structures being more
 padding-free.
 Anyway the point should have been focused on _hand-rolled_ part,
 mostly because SList is plain not ready for prime time*. And btw
 singly-linked list it's not hard to implement and you can fine tune it
 how you like it e.g. they do play nice with free list allocation
 strategy. Not to say of circular lists and/or using sentinels.
 SList is a poor singly linked list implementation. It does not support
 O(1) removal.

 -Steve

 Aye, looking at SList implementation I can say that it sort of tries
 to verify that this is a correct list. Otherwise it would be O(1).
 Indeed passing wrong range/iterator is quite bad in linked lists and
 could lead to all manner of funky bugs, but the cost is horrific.
 Especially since insert doesn't do this extra check.

 No, it's necessarily doing O(n) search for current node because it has
 no reference to the previous node.

 The range type for a SList has a single pointer to the currently
 iterated node. How do you remove that node without having access to the
 head/previous pointer?

 The container interface does not expose references to the 'Node'
 struct, therefore the following approach would be practical:

 It does, actually. A range is a reference to a Node struct. But I still
 think the following approach will work.

 0. Have a sentinel for end of list. (O(1) additional memory for the
 entire list). It is the only node in the list that has a null 'next'
 pointer.

 example, for removing just the current node:
 1. if the following node is not the sentinel
 1.1. Move value from following node into the current node.
 1.2. Remove following node.
 2. if the following node is the sentinel
 2.1. erase the contents of the current node (iff it has indirections)
 2.2. set it's 'next' field to null


 Removing an entire range is just erasing the contents of the current
 node and setting the 'next' field of the associated Node to zero.
 Removing a Take!Range is a simple generalization of the algorithm above.

 This would be a good addition to the type. It could not be a
 stableRemove, however, because it would invalidate any ranges pointing
 at the node you removed. Can you put together a pull request?

I can do that, but it will have to wait a few days, as I am quite busy 
at the moment.

 If not, I
 can see about doing it. One issue, you could not do this if the value is
 immutable/const.

That is true. But how to fix this? Resolving it by simply not exposing 
the remove method if it is impossible to implement the straightforward 
way would be an option, but I don't think that is satisfying. Probably 
it could be made to work by creating some kind of head mutable 
structure. I will think about it. It should even be possible to 
generalize std.typecons.Rebindable for structs to help this idea.

 I could use this idea, I think, to implement a singly linked list in
 dcollections as well (the prospect of not having O(1) removal is what
 has stopped me). Thanks for the idea!

Nice!

Nov 03 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Thu, 03 Nov 2011 14:50:31 -0400, Timon Gehr <timon.gehr gmx.ch> wrote:

 On 11/03/2011 07:15 PM, Steven Schveighoffer wrote:
 On Thu, 03 Nov 2011 13:47:28 -0400, Timon Gehr <timon.gehr gmx.ch>  
 wrote:

 On 11/03/2011 06:13 PM, Steven Schveighoffer wrote:
 On Thu, 03 Nov 2011 12:35:36 -0400, Dmitry Olshansky
 <dmitry.olsh gmail.com> wrote:

 On 03.11.2011 19:34, Steven Schveighoffer wrote:
 On Thu, 03 Nov 2011 10:02:22 -0400, Tobias Pankrath
 <tobias pankrath.net> wrote:

 Dmitry Olshansky wrote:

 And more importantly, it still would be horribly slow O(N^2).
 Personally, because of that I'd prefer hand-rolled intrusive
 singly-linked list any time of day.

 To be honest, I don't understand this.
 A "remove_if" for non-intrusive single-linked lists should be
 doable in
 O(N).


 Yeah, poor wording going to kill me one day :) And looking at how
 SList works with "checked" remove does O(N^2) turned out to be a
 context for reference for intrusive singly-linked list, just my bad.

 As for why I'd rather go with intrusive lists, it's because of it
 usually uses less memory due to node structures being more
 padding-free.
 Anyway the point should have been focused on _hand-rolled_ part,
 mostly because SList is plain not ready for prime time*. And btw
 singly-linked list it's not hard to implement and you can fine tune  
 it
 how you like it e.g. they do play nice with free list allocation
 strategy. Not to say of circular lists and/or using sentinels.
 SList is a poor singly linked list implementation. It does not  
 support
 O(1) removal.

 -Steve

 Aye, looking at SList implementation I can say that it sort of tries
 to verify that this is a correct list. Otherwise it would be O(1).
 Indeed passing wrong range/iterator is quite bad in linked lists and
 could lead to all manner of funky bugs, but the cost is horrific.
 Especially since insert doesn't do this extra check.

 No, it's necessarily doing O(n) search for current node because it has
 no reference to the previous node.

 The range type for a SList has a single pointer to the currently
 iterated node. How do you remove that node without having access to  
 the
 head/previous pointer?

 The container interface does not expose references to the 'Node'
 struct, therefore the following approach would be practical:

 It does, actually. A range is a reference to a Node struct. But I still
 think the following approach will work.

 0. Have a sentinel for end of list. (O(1) additional memory for the
 entire list). It is the only node in the list that has a null 'next'
 pointer.

 example, for removing just the current node:
 1. if the following node is not the sentinel
 1.1. Move value from following node into the current node.
 1.2. Remove following node.
 2. if the following node is the sentinel
 2.1. erase the contents of the current node (iff it has indirections)
 2.2. set it's 'next' field to null


 Removing an entire range is just erasing the contents of the current
 node and setting the 'next' field of the associated Node to zero.
 Removing a Take!Range is a simple generalization of the algorithm  
 above.

 This would be a good addition to the type. It could not be a
 stableRemove, however, because it would invalidate any ranges pointing
 at the node you removed. Can you put together a pull request?

 I can do that, but it will have to wait a few days, as I am quite busy  
 at the moment.

No rush.  I just wanted to know if you would do it, and if not, I would do  
it.

 If not, I
 can see about doing it. One issue, you could not do this if the value is
 immutable/const.

 That is true. But how to fix this? Resolving it by simply not exposing  
 the remove method if it is impossible to implement the straightforward  
 way would be an option, but I don't think that is satisfying. Probably  
 it could be made to work by creating some kind of head mutable  
 structure. I will think about it. It should even be possible to  
 generalize std.typecons.Rebindable for structs to help this idea.

Hm... rebindable doesn't help if the item is a struct.  You'd have to  
store an allocated struct on the heap.

-Steve

Nov 03 2011

Timon Gehr <timon.gehr gmx.ch> writes:

On 11/03/2011 09:46 PM, Steven Schveighoffer wrote:
 On Thu, 03 Nov 2011 14:50:31 -0400, Timon Gehr <timon.gehr gmx.ch> wrote:

 On 11/03/2011 07:15 PM, Steven Schveighoffer wrote:
 On Thu, 03 Nov 2011 13:47:28 -0400, Timon Gehr <timon.gehr gmx.ch>
 wrote:

 On 11/03/2011 06:13 PM, Steven Schveighoffer wrote:
 On Thu, 03 Nov 2011 12:35:36 -0400, Dmitry Olshansky
 <dmitry.olsh gmail.com> wrote:

 On 03.11.2011 19:34, Steven Schveighoffer wrote:
 On Thu, 03 Nov 2011 10:02:22 -0400, Tobias Pankrath
 <tobias pankrath.net> wrote:

 Dmitry Olshansky wrote:

 And more importantly, it still would be horribly slow O(N^2).
 Personally, because of that I'd prefer hand-rolled intrusive
 singly-linked list any time of day.

 To be honest, I don't understand this.
 A "remove_if" for non-intrusive single-linked lists should be
 doable in
 O(N).


 Yeah, poor wording going to kill me one day :) And looking at how
 SList works with "checked" remove does O(N^2) turned out to be a
 context for reference for intrusive singly-linked list, just my bad.

 As for why I'd rather go with intrusive lists, it's because of it
 usually uses less memory due to node structures being more
 padding-free.
 Anyway the point should have been focused on _hand-rolled_ part,
 mostly because SList is plain not ready for prime time*. And btw
 singly-linked list it's not hard to implement and you can fine
 tune it
 how you like it e.g. they do play nice with free list allocation
 strategy. Not to say of circular lists and/or using sentinels.
 SList is a poor singly linked list implementation. It does not
 support
 O(1) removal.

 -Steve

 Aye, looking at SList implementation I can say that it sort of tries
 to verify that this is a correct list. Otherwise it would be O(1).
 Indeed passing wrong range/iterator is quite bad in linked lists and
 could lead to all manner of funky bugs, but the cost is horrific.
 Especially since insert doesn't do this extra check.

 No, it's necessarily doing O(n) search for current node because it has
 no reference to the previous node.

 The range type for a SList has a single pointer to the currently
 iterated node. How do you remove that node without having access to
 the
 head/previous pointer?

 The container interface does not expose references to the 'Node'
 struct, therefore the following approach would be practical:

 It does, actually. A range is a reference to a Node struct. But I still
 think the following approach will work.

 0. Have a sentinel for end of list. (O(1) additional memory for the
 entire list). It is the only node in the list that has a null 'next'
 pointer.

 example, for removing just the current node:
 1. if the following node is not the sentinel
 1.1. Move value from following node into the current node.
 1.2. Remove following node.
 2. if the following node is the sentinel
 2.1. erase the contents of the current node (iff it has indirections)
 2.2. set it's 'next' field to null


 Removing an entire range is just erasing the contents of the current
 node and setting the 'next' field of the associated Node to zero.
 Removing a Take!Range is a simple generalization of the algorithm
 above.

 This would be a good addition to the type. It could not be a
 stableRemove, however, because it would invalidate any ranges pointing
 at the node you removed. Can you put together a pull request?

 I can do that, but it will have to wait a few days, as I am quite busy
 at the moment.

 No rush. I just wanted to know if you would do it, and if not, I would
 do it.

 If not, I
 can see about doing it. One issue, you could not do this if the value is
 immutable/const.

 That is true. But how to fix this? Resolving it by simply not exposing
 the remove method if it is impossible to implement the straightforward
 way would be an option, but I don't think that is satisfying. Probably
 it could be made to work by creating some kind of head mutable
 structure. I will think about it. It should even be possible to
 generalize std.typecons.Rebindable for structs to help this idea.

 Hm... rebindable doesn't help if the item is a struct. You'd have to
 store an allocated struct on the heap.

 -Steve

My point is, if you have a struct like this:

struct S{
     const int* x;
     immutable int y;
}

Then that could be stored in the list as

struct _S{
     const(int)* x;
     int y;

     S getS(){return S(x,y);}
}

Without putting the type system under pressure.
But on second thought, std.typecons.Rebindable can probably not 
meaningfully be generalized to structs because it could not deal with 
member functions.

Nov 03 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Thu, 03 Nov 2011 17:13:55 -0400, Timon Gehr <timon.gehr gmx.ch> wrote:

 On 11/03/2011 09:46 PM, Steven Schveighoffer wrote:
 On Thu, 03 Nov 2011 14:50:31 -0400, Timon Gehr <timon.gehr gmx.ch>  
 wrote:

 On 11/03/2011 07:15 PM, Steven Schveighoffer wrote:
 On Thu, 03 Nov 2011 13:47:28 -0400, Timon Gehr <timon.gehr gmx.ch>
 wrote:

 On 11/03/2011 06:13 PM, Steven Schveighoffer wrote:
 On Thu, 03 Nov 2011 12:35:36 -0400, Dmitry Olshansky
 <dmitry.olsh gmail.com> wrote:

 On 03.11.2011 19:34, Steven Schveighoffer wrote:
 On Thu, 03 Nov 2011 10:02:22 -0400, Tobias Pankrath
 <tobias pankrath.net> wrote:

 Dmitry Olshansky wrote:

 And more importantly, it still would be horribly slow O(N^2).
 Personally, because of that I'd prefer hand-rolled intrusive
 singly-linked list any time of day.

 To be honest, I don't understand this.
 A "remove_if" for non-intrusive single-linked lists should be
 doable in
 O(N).


 Yeah, poor wording going to kill me one day :) And looking at how
 SList works with "checked" remove does O(N^2) turned out to be a
 context for reference for intrusive singly-linked list, just my  
 bad.

 As for why I'd rather go with intrusive lists, it's because of it
 usually uses less memory due to node structures being more
 padding-free.
 Anyway the point should have been focused on _hand-rolled_ part,
 mostly because SList is plain not ready for prime time*. And btw
 singly-linked list it's not hard to implement and you can fine
 tune it
 how you like it e.g. they do play nice with free list allocation
 strategy. Not to say of circular lists and/or using sentinels.
 SList is a poor singly linked list implementation. It does not
 support
 O(1) removal.

 -Steve

 Aye, looking at SList implementation I can say that it sort of  
 tries
 to verify that this is a correct list. Otherwise it would be O(1).
 Indeed passing wrong range/iterator is quite bad in linked lists  
 and
 could lead to all manner of funky bugs, but the cost is horrific.
 Especially since insert doesn't do this extra check.

 No, it's necessarily doing O(n) search for current node because it  
 has
 no reference to the previous node.

 The range type for a SList has a single pointer to the currently
 iterated node. How do you remove that node without having access to
 the
 head/previous pointer?

 The container interface does not expose references to the 'Node'
 struct, therefore the following approach would be practical:

 It does, actually. A range is a reference to a Node struct. But I  
 still
 think the following approach will work.

 0. Have a sentinel for end of list. (O(1) additional memory for the
 entire list). It is the only node in the list that has a null 'next'
 pointer.

 example, for removing just the current node:
 1. if the following node is not the sentinel
 1.1. Move value from following node into the current node.
 1.2. Remove following node.
 2. if the following node is the sentinel
 2.1. erase the contents of the current node (iff it has indirections)
 2.2. set it's 'next' field to null


 Removing an entire range is just erasing the contents of the current
 node and setting the 'next' field of the associated Node to zero.
 Removing a Take!Range is a simple generalization of the algorithm
 above.

 This would be a good addition to the type. It could not be a
 stableRemove, however, because it would invalidate any ranges pointing
 at the node you removed. Can you put together a pull request?

 I can do that, but it will have to wait a few days, as I am quite busy
 at the moment.

 No rush. I just wanted to know if you would do it, and if not, I would
 do it.

 If not, I
 can see about doing it. One issue, you could not do this if the value  
 is
 immutable/const.

 That is true. But how to fix this? Resolving it by simply not exposing
 the remove method if it is impossible to implement the straightforward
 way would be an option, but I don't think that is satisfying. Probably
 it could be made to work by creating some kind of head mutable
 structure. I will think about it. It should even be possible to
 generalize std.typecons.Rebindable for structs to help this idea.

 Hm... rebindable doesn't help if the item is a struct. You'd have to
 store an allocated struct on the heap.

 -Steve

 My point is, if you have a struct like this:

 struct S{
      const int* x;
      immutable int y;
 }

 Then that could be stored in the list as

 struct _S{
      const(int)* x;
      int y;

      S getS(){return S(x,y);}
 }

 Without putting the type system under pressure.
 But on second thought, std.typecons.Rebindable can probably not  
 meaningfully be generalized to structs because it could not deal with  
 member functions.

Yes, this looks like a viable solution.  Not sure how it would be done in  
a generic way.  I think what is best is to have getS be like this:

S *getS(){ return cast(S*)&this;}

And never provide access to this

As long as S.opCmp, S.opEquals, and S.toHash are all const, this should be  
no issue (this needs to be a requirement).

I wonder if we really have to go through such trouble...

-Steve

Nov 03 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Thu, 03 Nov 2011 14:50:31 -0400, Timon Gehr <timon.gehr gmx.ch> wrote:

 On 11/03/2011 07:15 PM, Steven Schveighoffer wrote:
 I could use this idea, I think, to implement a singly linked list in
 dcollections as well (the prospect of not having O(1) removal is what
 has stopped me). Thanks for the idea!

 Nice!

Looking at dcollections' List interface, the one thing I can't implement  
is back() (i.e. get the last element in the list).  I can implement  
everything else.  It might be worth it to make an exception for this (i.e.  
just throw if someone calls back()) in order to have a singly-linked list  
implementation.

-Steve

Nov 03 2011

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

On 03.11.2011 21:13, Steven Schveighoffer wrote:
 On Thu, 03 Nov 2011 12:35:36 -0400, Dmitry Olshansky
 <dmitry.olsh gmail.com> wrote:

 On 03.11.2011 19:34, Steven Schveighoffer wrote:
 On Thu, 03 Nov 2011 10:02:22 -0400, Tobias Pankrath
 <tobias pankrath.net> wrote:

 Dmitry Olshansky wrote:

 And more importantly, it still would be horribly slow O(N^2).
 Personally, because of that I'd prefer hand-rolled intrusive
 singly-linked list any time of day.

 To be honest, I don't understand this.
 A "remove_if" for non-intrusive single-linked lists should be doable in
 O(N).


 Yeah, poor wording going to kill me one day :) And looking at how
 SList works with "checked" remove does O(N^2) turned out to be a
 context for reference for intrusive singly-linked list, just my bad.

 As for why I'd rather go with intrusive lists, it's because of it
 usually uses less memory due to node structures being more padding-free.
 Anyway the point should have been focused on _hand-rolled_ part,
 mostly because SList is plain not ready for prime time*. And btw
 singly-linked list it's not hard to implement and you can fine tune it
 how you like it e.g. they do play nice with free list allocation
 strategy. Not to say of circular lists and/or using sentinels.
 SList is a poor singly linked list implementation. It does not support
 O(1) removal.

 -Steve

 Aye, looking at SList implementation I can say that it sort of tries
 to verify that this is a correct list. Otherwise it would be O(1).
 Indeed passing wrong range/iterator is quite bad in linked lists and
 could lead to all manner of funky bugs, but the cost is horrific.
 Especially since insert doesn't do this extra check.

 No, it's necessarily doing O(n) search for current node because it has
 no reference to the previous node.

 The range type for a SList has a single pointer to the currently
 iterated node. How do you remove that node without having access to the
 head/previous pointer?

removeAfter ? ;) Hm though it does hit as strange. Looking at my code, I 
usually keep previous node but only when I remove elements during iteration.

 I suggested to Andrei having the range keep track of the *previous*
 node/head, but he did not like that idea (too many dereferences for
 front()). Another option is to have a removeAllButFirst method, but
 that's kind of awkward...

And then using a sentinel as in Timon's idea doesn't look half bad.

 Actually, it opens up a question of "Checked ranges" akin to "Checked
 iterators" used in many STL implementations. Any ideas what they can
 carry around as an "ID" of list? Root pointer won't cut it, as it can
 be easily removed during iteration. If SList was a class it's
 reference probably would do.

 dcollections stipulates that all ranges/cursors can be verified in
 O(lgn) time or better to belong to a specific container. In some cases,
 this adds an extra word to the range/cursor, and in others, it's easy to
 determine or the owner-reference was already needed. Since everything is
 a class, the fallback is to just stick an owner class instance in the
 range.

 This stipulation is necessary to allow safe slicing.

Seems reasonable, I'd expect checks to go away in release, right(?).


-- 
Dmitry Olshansky

Nov 03 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Thu, 03 Nov 2011 14:08:31 -0400, Dmitry Olshansky  
<dmitry.olsh gmail.com> wrote:

 On 03.11.2011 21:13, Steven Schveighoffer wrote:
 dcollections stipulates that all ranges/cursors can be verified in
 O(lgn) time or better to belong to a specific container. In some cases,
 this adds an extra word to the range/cursor, and in others, it's easy to
 determine or the owner-reference was already needed. Since everything is
 a class, the fallback is to just stick an owner class instance in the
 range.

 This stipulation is necessary to allow safe slicing.

 Seems reasonable, I'd expect checks to go away in release, right(?).

For the moment, no.  I am not sure whether this is the right decision or  
not, because once you get beyond arrays, when to do bounds checks becomes  
fuzzy.

For example, imagine you have this:

auto ts = new TreeSet!int(1, 3, 5, 7, 9);

What does this mean?

auto r = ts[2..4];

Note that a range type for a treeset has a pointer to a begin and end node  
for the container.  For arrays, not doing a bounds check is simply less  
code.  For a RBTree, you still have to look for the specific node, even if  
you are in release mode.

Another example:

auto ts2 = ts.dup; // duplicate the treeset

auto c1 = ts2.elemAt(3); // get the node for 3 in ts2

auto r2 = ts[c1..ts.end];

Here I can say, we can skip the belongs check (which is an O(lgn) walk up  
the tree).

But I'm still doing it in release mode.  I'm not sure what's best.  Should  
I just do the least work possible, or should I make it consistent with  
ts[2..4]?

-Steve

Nov 03 2011

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

On 03.11.2011 22:37, Steven Schveighoffer wrote:
 On Thu, 03 Nov 2011 14:08:31 -0400, Dmitry Olshansky
 <dmitry.olsh gmail.com> wrote:

 On 03.11.2011 21:13, Steven Schveighoffer wrote:
 dcollections stipulates that all ranges/cursors can be verified in
 O(lgn) time or better to belong to a specific container. In some cases,
 this adds an extra word to the range/cursor, and in others, it's easy to
 determine or the owner-reference was already needed. Since everything is
 a class, the fallback is to just stick an owner class instance in the
 range.

 This stipulation is necessary to allow safe slicing.

 Seems reasonable, I'd expect checks to go away in release, right(?).

 For the moment, no. I am not sure whether this is the right decision or
 not, because once you get beyond arrays, when to do bounds checks
 becomes fuzzy.

 For example, imagine you have this:

 auto ts = new TreeSet!int(1, 3, 5, 7, 9);

 What does this mean?

 auto r = ts[2..4];

 Note that a range type for a treeset has a pointer to a begin and end
 node for the container. For arrays, not doing a bounds check is simply
 less code. For a RBTree, you still have to look for the specific node,
 even if you are in release mode.

 Another example:

 auto ts2 = ts.dup; // duplicate the treeset

 auto c1 = ts2.elemAt(3); // get the node for 3 in ts2

 auto r2 = ts[c1..ts.end];

hm-h will that actually work? I mean searching in ts for node from ts2?

 Here I can say, we can skip the belongs check (which is an O(lgn) walk
 up the tree).

Well, I was mostly talking about this kind of scenario i.e. skip 
checking that cursor do belong to this collection. While in ts[2..4] 
example there is no (explicit) cursors so it's a different situation.

 But I'm still doing it in release mode. I'm not sure what's best. Should
 I just do the least work possible, or should I make it consistent with
 ts[2..4]?

I'd say release mode should avoid as much extra work as possible while 
keeping primary functionality intact, but that's just me.


-- 
Dmitry Olshansky

Nov 03 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Thu, 03 Nov 2011 16:27:46 -0400, Dmitry Olshansky  
<dmitry.olsh gmail.com> wrote:

 On 03.11.2011 22:37, Steven Schveighoffer wrote:
 On Thu, 03 Nov 2011 14:08:31 -0400, Dmitry Olshansky
 <dmitry.olsh gmail.com> wrote:

 On 03.11.2011 21:13, Steven Schveighoffer wrote:
 dcollections stipulates that all ranges/cursors can be verified in
 O(lgn) time or better to belong to a specific container. In some  
 cases,
 this adds an extra word to the range/cursor, and in others, it's easy  
 to
 determine or the owner-reference was already needed. Since everything  
 is
 a class, the fallback is to just stick an owner class instance in the
 range.

 This stipulation is necessary to allow safe slicing.

 Seems reasonable, I'd expect checks to go away in release, right(?).

 For the moment, no. I am not sure whether this is the right decision or
 not, because once you get beyond arrays, when to do bounds checks
 becomes fuzzy.

 For example, imagine you have this:

 auto ts = new TreeSet!int(1, 3, 5, 7, 9);

 What does this mean?

 auto r = ts[2..4];

 Note that a range type for a treeset has a pointer to a begin and end
 node for the container. For arrays, not doing a bounds check is simply
 less code. For a RBTree, you still have to look for the specific node,
 even if you are in release mode.

 Another example:

 auto ts2 = ts.dup; // duplicate the treeset

 auto c1 = ts2.elemAt(3); // get the node for 3 in ts2

 auto r2 = ts[c1..ts.end];

 hm-h will that actually work? I mean searching in ts for node from ts2?

c1 is a cursor, so there is no need to search for it (the point of a  
cursor is to keep a reference to an element for later use).  All you have  
to do is verify it's in the container (and the ordering is valid).

If unchecked, it will result in likely a segfault, because the two  
endpoints are not connected.

 Here I can say, we can skip the belongs check (which is an O(lgn) walk
 up the tree).

 Well, I was mostly talking about this kind of scenario i.e. skip  
 checking that cursor do belong to this collection. While in ts[2..4]  
 example there is no (explicit) cursors so it's a different situation.

So what should happen?  Should it throw?

 But I'm still doing it in release mode. I'm not sure what's best. Should
 I just do the least work possible, or should I make it consistent with
 ts[2..4]?

 I'd say release mode should avoid as much extra work as possible while  
 keeping primary functionality intact, but that's just me.

Yes, it's a dilemma that I'm not sure what the right answer is.  It does  
make sense that release mode should just avoid the checks altogether.

-Steve

Nov 03 2011

Ali =?iso-8859-1?q?=C7ehreli?= <acehreli yahoo.com> writes:

On Thu, 03 Nov 2011 22:08:31 +0400, Dmitry Olshansky wrote:

 On 03.11.2011 21:13, Steven Schveighoffer wrote:
 The range type for a SList has a single pointer to the currently
 iterated node. How do you remove that node without having access to the
 head/previous pointer?

 removeAfter ? ;) Hm though it does hit as strange. Looking at my code, I
 usually keep previous node but only when I remove elements during
 iteration.

Matt Austern's excellent "STL Singly Linked Lists" presentation is 
relevant:

  http://accu.org/index.php/accu_branches/accu_usa/past

Slide 11 is titled "Solving the insert problem: off-by-one iterators". 
Also slide 16.

Ali

Nov 03 2011

Max Wolter <awishformore gmail.com> writes:

On 10/30/2011 9:28 PM, Jonathan M Davis wrote:
 On Sunday, October 30, 2011 20:53:02 Max Wolter wrote:
 Hello there.

 Thank you very much for the explanation.

 However, while I really liked the concept of ranges in Andrei's book and
 a lot of it seems intuitive and faster than using iterators, I can't
 shake the feeling that in this case, it's just unnecessarily convoluted.

 Maybe it's just the fact that the container library is still very basic,
 but I don't think I should go through such a complicated procedure to
 remove an known element from a list. It's just not a "very simple" or
 intuitive solution, which is something I came to love D for thus far.

 You would be doing exactly the same thing in C++ except that it would be with
 an iterator rather than a range. You would use find to find the iterator to the
 element that you wanted and then you'd pass that to the list's erase function.
 It is only more complicated in D in that you get a range which _starts_ with
 the element that you want (essentially, you get the iterator to that element
 plus the end iterator for the list), so if you want to remove only the first
 element, you take it from the front of the range. Other than that, it's the
 same as in C++. You can't remove an element from a list in C++ by giving that
 element to the list anymore than you can in D. In both cases, you need an
 iterator or a range.

 So, in comparison to C++, there's no significant difference. Now, Java does
have
 a remove function which will take an element and remove the first occurence of
 that element from a list, and we could theoretically add one, but why? It
 would just be duplicating the functionality that find already gives. Java
 doesn't use iterators the way the C++ does, and it doesn't have ranges, so it
 _can't_ have a find function the way that C++ and D do, but D can. And in some
 cases, find can do it more efficiently than your loop could have.

 I grant you that if you're not used to using find like this in C++ (and far,
 far too many C++ programmers use loops instead of find - in part because pre-
 C++11, there were no lambdas and any time you need a custom comparison
 function, it's a pain to get one), then it's not immediately intuitive, but
 it's far more flexible and powerful than removing elements by giving the
 element to a remove function on the list. And if you really want to use a
 loop, then you still can. You just can't use foreach.

 for(auto r = organism[]; !r.empty; r.popFront())
 {
      if(r.front.x == x&&  r.front.y == y)
      {
          organism.stableLinearRemove(take(r, 1));
          break;
      }
 }

 but that's a lot more verbose than simply using find, and as I said, in at
 least some cases, find can be more efficient than simply looping, since it's
 optimized for finding elements.

 - Jonathan M Davis

Hello again.

I've read all further replies in this thread, but it seems to me this is 
the most appropriate place to respond.

Simply put, all of those options are too verbose. If what you want to do 
is simple, the syntax should also be simple, this is what I love about 
D. As a side-note, I don't come from Java, but I still expect a 
container to be able to remove an element by passing it to a method. The 
verbosity and the details of this operation should be hidden in the 
implementation of the method and I shouldn't need to know about the 
details. Otherwise, I could just as well implement my own container.

/Max

Nov 01 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Tue, 01 Nov 2011 16:53:26 -0400, Max Wolter <awishformore gmail.com>  
wrote:

 On 10/30/2011 9:28 PM, Jonathan M Davis wrote:
 On Sunday, October 30, 2011 20:53:02 Max Wolter wrote:
 Hello there.

 Thank you very much for the explanation.

 However, while I really liked the concept of ranges in Andrei's book  
 and
 a lot of it seems intuitive and faster than using iterators, I can't
 shake the feeling that in this case, it's just unnecessarily  
 convoluted.

 Maybe it's just the fact that the container library is still very  
 basic,
 but I don't think I should go through such a complicated procedure to
 remove an known element from a list. It's just not a "very simple" or
 intuitive solution, which is something I came to love D for thus far.

 You would be doing exactly the same thing in C++ except that it would  
 be with
 an iterator rather than a range. You would use find to find the  
 iterator to the
 element that you wanted and then you'd pass that to the list's erase  
 function.
 It is only more complicated in D in that you get a range which _starts_  
 with
 the element that you want (essentially, you get the iterator to that  
 element
 plus the end iterator for the list), so if you want to remove only the  
 first
 element, you take it from the front of the range. Other than that, it's  
 the
 same as in C++. You can't remove an element from a list in C++ by  
 giving that
 element to the list anymore than you can in D. In both cases, you need  
 an
 iterator or a range.

 So, in comparison to C++, there's no significant difference. Now, Java  
 does have
 a remove function which will take an element and remove the first  
 occurence of
 that element from a list, and we could theoretically add one, but why?  
 It
 would just be duplicating the functionality that find already gives.  
 Java
 doesn't use iterators the way the C++ does, and it doesn't have ranges,  
 so it
 _can't_ have a find function the way that C++ and D do, but D can. And  
 in some
 cases, find can do it more efficiently than your loop could have.

 I grant you that if you're not used to using find like this in C++ (and  
 far,
 far too many C++ programmers use loops instead of find - in part  
 because pre-
 C++11, there were no lambdas and any time you need a custom comparison
 function, it's a pain to get one), then it's not immediately intuitive,  
 but
 it's far more flexible and powerful than removing elements by giving the
 element to a remove function on the list. And if you really want to use  
 a
 loop, then you still can. You just can't use foreach.

 for(auto r = organism[]; !r.empty; r.popFront())
 {
      if(r.front.x == x&&  r.front.y == y)
      {
          organism.stableLinearRemove(take(r, 1));
          break;
      }
 }

 but that's a lot more verbose than simply using find, and as I said, in  
 at
 least some cases, find can be more efficient than simply looping, since  
 it's
 optimized for finding elements.

 - Jonathan M Davis

 Hello again.

 I've read all further replies in this thread, but it seems to me this is  
 the most appropriate place to respond.

 Simply put, all of those options are too verbose. If what you want to do  
 is simple, the syntax should also be simple, this is what I love about  
 D. As a side-note, I don't come from Java, but I still expect a  
 container to be able to remove an element by passing it to a method. The  
 verbosity and the details of this operation should be hidden in the  
 implementation of the method and I shouldn't need to know about the  
 details. Otherwise, I could just as well implement my own container.

The basic response to this is, when dealing with containers generically  
(that is, you know you have a container, but you don't know what type),  
the "remove this element" operation is not necessarily a good primitive to  
have.

Simply because from the myriad of containers, only some can implement this  
operation efficiently.  Java embeds this operation in the interface, which  
means any interface you have to a container could potentially use O(n)  
time to remove that element.  Such an innocuous piece of syntax *should*  
have a cost if it's not efficient IMO.

BTW, the original question doesn't provide enough information to say  
"remove this element."  Even in Java, if you aren't using the default  
comparison, you must use a comparator method to determine which one to  
remove.  If cell.x == x && cell.y == y *is* the comparison operator for  
the type, then the syntax gets much simpler, because you don't need to  
pass a specialized comparison function.

In dcollections, removing a specific element (using the default comparison  
operator for that element) on a *fast lookup* container is as simple as:

container.remove(container.find(x));

Which removes the element x if it's found.  However, this is not defined  
for containers which use O(n) time to search (such as linked list), you  
must use std.algorithm.find for that:

container.remove(find(container[], x).begin);

Should work, and takes O(n) time.

-Steve

Nov 02 2011

Paolo Invernizzi <arathorn fastwebnet.it> writes:

On Nov 2, 2011, at 12:48 PM, Steven Schveighoffer wrote:

 Hello again.
=20
 I've read all further replies in this thread, but it seems to me this =


is the most appropriate place to respond.
=20
 Simply put, all of those options are too verbose. If what you want to =


do is simple, the syntax should also be simple, this is what I love =
about D. As a side-note, I don't come from Java, but I still expect a =
container to be able to remove an element by passing it to a method. The =
verbosity and the details of this operation should be hidden in the =
implementation of the method and I shouldn't need to know about the =
details. Otherwise, I could just as well implement my own container.
=20
 The basic response to this is, when dealing with containers =

generically (that is, you know you have a container, but you don't know =
what type), the "remove this element" operation is not necessarily a =
good primitive to have.
=20
 Simply because from the myriad of containers, only some can implement =

this operation efficiently.  Java embeds this operation in the =
interface, which means any interface you have to a container could =
potentially use O(n) time to remove that element.  Such an innocuous =
piece of syntax *should* have a cost if it's not efficient IMO.
=20
 BTW, the original question doesn't provide enough information to say =

"remove this element."  Even in Java, if you aren't using the default =
comparison, you must use a comparator method to determine which one to =
remove.  If cell.x =3D=3D x && cell.y =3D=3D y *is* the comparison =
operator for the type, then the syntax gets much simpler, because you =
don't need to pass a specialized comparison function.
=20
 In dcollections, removing a specific element (using the default =

comparison operator for that element) on a *fast lookup* container is as =
simple as:
=20
 container.remove(container.find(x));
=20
 Which removes the element x if it's found.  However, this is not =

defined for containers which use O(n) time to search (such as linked =
list), you must use std.algorithm.find for that:
=20
 container.remove(find(container[], x).begin);
=20
 Should work, and takes O(n) time.
=20
 -Steve


I can't really understand what is wrong with an inefficient "remove this =
element" primitive if it's really what the user had to do...
Once it's in the documentation that the operation is inefficient, why =
the user must be forced to dig into the newsgroup to find out some code?=20=

Why can't that be furnished in a free function, for example?

Cheers, Paolo

Nov 02 2011

Ary Manzana <ary esperanto.org.ar> writes:

On 11/2/11 8:48 AM, Steven Schveighoffer wrote:
 On Tue, 01 Nov 2011 16:53:26 -0400, Max Wolter <awishformore gmail.com>
 wrote:

 On 10/30/2011 9:28 PM, Jonathan M Davis wrote:
 On Sunday, October 30, 2011 20:53:02 Max Wolter wrote:
 Hello there.

 Thank you very much for the explanation.

 However, while I really liked the concept of ranges in Andrei's book
 and
 a lot of it seems intuitive and faster than using iterators, I can't
 shake the feeling that in this case, it's just unnecessarily
 convoluted.

 Maybe it's just the fact that the container library is still very
 basic,
 but I don't think I should go through such a complicated procedure to
 remove an known element from a list. It's just not a "very simple" or
 intuitive solution, which is something I came to love D for thus far.

 You would be doing exactly the same thing in C++ except that it would
 be with
 an iterator rather than a range. You would use find to find the
 iterator to the
 element that you wanted and then you'd pass that to the list's erase
 function.
 It is only more complicated in D in that you get a range which
 _starts_ with
 the element that you want (essentially, you get the iterator to that
 element
 plus the end iterator for the list), so if you want to remove only
 the first
 element, you take it from the front of the range. Other than that,
 it's the
 same as in C++. You can't remove an element from a list in C++ by
 giving that
 element to the list anymore than you can in D. In both cases, you
 need an
 iterator or a range.

 So, in comparison to C++, there's no significant difference. Now,
 Java does have
 a remove function which will take an element and remove the first
 occurence of
 that element from a list, and we could theoretically add one, but
 why? It
 would just be duplicating the functionality that find already gives.
 Java
 doesn't use iterators the way the C++ does, and it doesn't have
 ranges, so it
 _can't_ have a find function the way that C++ and D do, but D can.
 And in some
 cases, find can do it more efficiently than your loop could have.

 I grant you that if you're not used to using find like this in C++
 (and far,
 far too many C++ programmers use loops instead of find - in part
 because pre-
 C++11, there were no lambdas and any time you need a custom comparison
 function, it's a pain to get one), then it's not immediately
 intuitive, but
 it's far more flexible and powerful than removing elements by giving the
 element to a remove function on the list. And if you really want to
 use a
 loop, then you still can. You just can't use foreach.

 for(auto r = organism[]; !r.empty; r.popFront())
 {
 if(r.front.x == x&& r.front.y == y)
 {
 organism.stableLinearRemove(take(r, 1));
 break;
 }
 }

 but that's a lot more verbose than simply using find, and as I said,
 in at
 least some cases, find can be more efficient than simply looping,
 since it's
 optimized for finding elements.

 - Jonathan M Davis

 Hello again.

 I've read all further replies in this thread, but it seems to me this
 is the most appropriate place to respond.

 Simply put, all of those options are too verbose. If what you want to
 do is simple, the syntax should also be simple, this is what I love
 about D. As a side-note, I don't come from Java, but I still expect a
 container to be able to remove an element by passing it to a method.
 The verbosity and the details of this operation should be hidden in
 the implementation of the method and I shouldn't need to know about
 the details. Otherwise, I could just as well implement my own container.

 The basic response to this is, when dealing with containers generically
 (that is, you know you have a container, but you don't know what type),
 the "remove this element" operation is not necessarily a good primitive
 to have.

 Simply because from the myriad of containers, only some can implement
 this operation efficiently. Java embeds this operation in the interface,
 which means any interface you have to a container could potentially use
 O(n) time to remove that element. Such an innocuous piece of syntax
 *should* have a cost if it's not efficient IMO.

 BTW, the original question doesn't provide enough information to say
 "remove this element." Even in Java, if you aren't using the default
 comparison, you must use a comparator method to determine which one to
 remove. If cell.x == x && cell.y == y *is* the comparison operator for
 the type, then the syntax gets much simpler, because you don't need to
 pass a specialized comparison function.

 In dcollections, removing a specific element (using the default
 comparison operator for that element) on a *fast lookup* container is as
 simple as:

 container.remove(container.find(x));

 Which removes the element x if it's found. However, this is not defined
 for containers which use O(n) time to search (such as linked list), you
 must use std.algorithm.find for that:

 container.remove(find(container[], x).begin);

 Should work, and takes O(n) time.

 -Steve

I don't really understand what's wrong with inefficient methods. You can 
have inefficient methods that are convenient, like removing an element 
by the default comparison, or giving it a delegate to match the 
element(s) to remove.

You profile your application. Is that method the bottle-neck? If so, you 
change it to a more efficient one. If not, you are happy you had that 
method there, performing in an inefficient way, but which doesn't matter 
that much compared to, say, opening an SQL connection.

Programmers want to program, fast. They have schedules, they need to 
deliver. They don't need to always find the best solution. They can find 
a compromise between "working" and "fast", move on, and later profile 
and worry about what matters most.

Programmers don't want to fight with the language or think "Oh, so to 
remove an element I need to use this operation and combine it with that 
one and with that other one"...

Nov 02 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Wed, 02 Nov 2011 08:40:19 -0400, Ary Manzana <ary esperanto.org.ar>  
wrote:

 On 11/2/11 8:48 AM, Steven Schveighoffer wrote:
 The basic response to this is, when dealing with containers generically
 (that is, you know you have a container, but you don't know what type),
 the "remove this element" operation is not necessarily a good primitive
 to have.

 Simply because from the myriad of containers, only some can implement
 this operation efficiently. Java embeds this operation in the interface,
 which means any interface you have to a container could potentially use
 O(n) time to remove that element. Such an innocuous piece of syntax
 *should* have a cost if it's not efficient IMO.

 BTW, the original question doesn't provide enough information to say
 "remove this element." Even in Java, if you aren't using the default
 comparison, you must use a comparator method to determine which one to
 remove. If cell.x == x && cell.y == y *is* the comparison operator for
 the type, then the syntax gets much simpler, because you don't need to
 pass a specialized comparison function.

 In dcollections, removing a specific element (using the default
 comparison operator for that element) on a *fast lookup* container is as
 simple as:

 container.remove(container.find(x));

 Which removes the element x if it's found. However, this is not defined
 for containers which use O(n) time to search (such as linked list), you
 must use std.algorithm.find for that:

 container.remove(find(container[], x).begin);

 Should work, and takes O(n) time.

 -Steve

 I don't really understand what's wrong with inefficient methods. You can  
 have inefficient methods that are convenient, like removing an element  
 by the default comparison, or giving it a delegate to match the  
 element(s) to remove.

 You profile your application. Is that method the bottle-neck? If so, you  
 change it to a more efficient one. If not, you are happy you had that  
 method there, performing in an inefficient way, but which doesn't matter  
 that much compared to, say, opening an SQL connection.

 Programmers want to program, fast. They have schedules, they need to  
 deliver. They don't need to always find the best solution. They can find  
 a compromise between "working" and "fast", move on, and later profile  
 and worry about what matters most.

 Programmers don't want to fight with the language or think "Oh, so to  
 remove an element I need to use this operation and combine it with that  
 one and with that other one"...

Or use the right container for the job?

Where it really comes into play is generic programming.

Let's say I write some algorithm that removes certain elements from a  
container:

removeElements(C, T)(C c, T t[]...)
{
    foreach(x; t)
      c.remove(t);
}

What's the complexity of this algorithm?  For a HashSet, for instance, it  
will be O(n) where n is the number of elements to remove.

But for an ArrayList, it will be O(n*m) where m is the number of elements  
in c.

So what we get is, an algorithm whose complexity depends on the complexity  
of an operation that varies *widely*.  What you end up with is things like  
sorting algorithms which are O(n^2) or even worse.

What omitting those methods from the containers that don't support it does  
is allow you to make *predictably efficient* generic algorithms.  For  
example, you know sorting is going to be at most O(nlgn).  I don't care  
how beautiful the syntax is, a quadratic sort is epic fail, and should be  
avoided at all costs.

std.container tries to allow having these inefficient methods by changing  
the name (so algorithms can still claim efficiency by not using those  
names).  My stance is this makes the API overly complex for very little  
gain.  If something is inefficient, either don't use it, or use the range  
interface via std.algorithm.  Why should I write a linear search algorithm  
when std.algorithm already has done this?

-Steve

Nov 02 2011

Ary Manzana <ary esperanto.org.ar> writes:

On 11/2/11 10:12 AM, Steven Schveighoffer wrote:
 On Wed, 02 Nov 2011 08:40:19 -0400, Ary Manzana <ary esperanto.org.ar>
 wrote:

 On 11/2/11 8:48 AM, Steven Schveighoffer wrote:
 The basic response to this is, when dealing with containers generically
 (that is, you know you have a container, but you don't know what type),
 the "remove this element" operation is not necessarily a good primitive
 to have.

 Simply because from the myriad of containers, only some can implement
 this operation efficiently. Java embeds this operation in the interface,
 which means any interface you have to a container could potentially use
 O(n) time to remove that element. Such an innocuous piece of syntax
 *should* have a cost if it's not efficient IMO.

 BTW, the original question doesn't provide enough information to say
 "remove this element." Even in Java, if you aren't using the default
 comparison, you must use a comparator method to determine which one to
 remove. If cell.x == x && cell.y == y *is* the comparison operator for
 the type, then the syntax gets much simpler, because you don't need to
 pass a specialized comparison function.

 In dcollections, removing a specific element (using the default
 comparison operator for that element) on a *fast lookup* container is as
 simple as:

 container.remove(container.find(x));

 Which removes the element x if it's found. However, this is not defined
 for containers which use O(n) time to search (such as linked list), you
 must use std.algorithm.find for that:

 container.remove(find(container[], x).begin);

 Should work, and takes O(n) time.

 -Steve

 I don't really understand what's wrong with inefficient methods. You
 can have inefficient methods that are convenient, like removing an
 element by the default comparison, or giving it a delegate to match
 the element(s) to remove.

 You profile your application. Is that method the bottle-neck? If so,
 you change it to a more efficient one. If not, you are happy you had
 that method there, performing in an inefficient way, but which doesn't
 matter that much compared to, say, opening an SQL connection.

 Programmers want to program, fast. They have schedules, they need to
 deliver. They don't need to always find the best solution. They can
 find a compromise between "working" and "fast", move on, and later
 profile and worry about what matters most.

 Programmers don't want to fight with the language or think "Oh, so to
 remove an element I need to use this operation and combine it with
 that one and with that other one"...

 Or use the right container for the job?

 Where it really comes into play is generic programming.

 Let's say I write some algorithm that removes certain elements from a
 container:

 removeElements(C, T)(C c, T t[]...)
 {
 foreach(x; t)
 c.remove(t);
 }

 What's the complexity of this algorithm? For a HashSet, for instance, it
 will be O(n) where n is the number of elements to remove.

 But for an ArrayList, it will be O(n*m) where m is the number of
 elements in c.

But I'm sure in this algorithm I have for this app I'm making, my 
collection won't have more than 50 elements. So everything will be O(1). 
I need to remove an element from the collection. I really don't care 
about the complexity of the operation, because if n is 50, everything is 
O(1).

Why can't I have an inefficient (for you) remove operation that, for me, 
will be ok? I don't want to spend 2 hours browsing the ranges algorithms 
to figure out how to combine them to remove an element...

My point is, as someone else said it in another post: add inefficient 
operations and tell the programmer so. Then he can decide what to do. If 
he's sure that that performance penalty is not a big issue for him, 
he'll be happy to have that method available.

Nov 02 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Wed, 02 Nov 2011 09:17:39 -0400, Ary Manzana <ary esperanto.org.ar>  
wrote:

 On 11/2/11 10:12 AM, Steven Schveighoffer wrote:
 On Wed, 02 Nov 2011 08:40:19 -0400, Ary Manzana <ary esperanto.org.ar>
 wrote:

 On 11/2/11 8:48 AM, Steven Schveighoffer wrote:
 The basic response to this is, when dealing with containers  
 generically
 (that is, you know you have a container, but you don't know what  
 type),
 the "remove this element" operation is not necessarily a good  
 primitive
 to have.

 Simply because from the myriad of containers, only some can implement
 this operation efficiently. Java embeds this operation in the  
 interface,
 which means any interface you have to a container could potentially  
 use
 O(n) time to remove that element. Such an innocuous piece of syntax
 *should* have a cost if it's not efficient IMO.

 BTW, the original question doesn't provide enough information to say
 "remove this element." Even in Java, if you aren't using the default
 comparison, you must use a comparator method to determine which one to
 remove. If cell.x == x && cell.y == y *is* the comparison operator for
 the type, then the syntax gets much simpler, because you don't need to
 pass a specialized comparison function.

 In dcollections, removing a specific element (using the default
 comparison operator for that element) on a *fast lookup* container is  
 as
 simple as:

 container.remove(container.find(x));

 Which removes the element x if it's found. However, this is not  
 defined
 for containers which use O(n) time to search (such as linked list),  
 you
 must use std.algorithm.find for that:

 container.remove(find(container[], x).begin);

 Should work, and takes O(n) time.

 -Steve

 I don't really understand what's wrong with inefficient methods. You
 can have inefficient methods that are convenient, like removing an
 element by the default comparison, or giving it a delegate to match
 the element(s) to remove.

 You profile your application. Is that method the bottle-neck? If so,
 you change it to a more efficient one. If not, you are happy you had
 that method there, performing in an inefficient way, but which doesn't
 matter that much compared to, say, opening an SQL connection.

 Programmers want to program, fast. They have schedules, they need to
 deliver. They don't need to always find the best solution. They can
 find a compromise between "working" and "fast", move on, and later
 profile and worry about what matters most.

 Programmers don't want to fight with the language or think "Oh, so to
 remove an element I need to use this operation and combine it with
 that one and with that other one"...

 Or use the right container for the job?

 Where it really comes into play is generic programming.

 Let's say I write some algorithm that removes certain elements from a
 container:

 removeElements(C, T)(C c, T t[]...)
 {
 foreach(x; t)
 c.remove(t);
 }

 What's the complexity of this algorithm? For a HashSet, for instance, it
 will be O(n) where n is the number of elements to remove.

 But for an ArrayList, it will be O(n*m) where m is the number of
 elements in c.

 But I'm sure in this algorithm I have for this app I'm making, my  
 collection won't have more than 50 elements. So everything will be O(1).  
 I need to remove an element from the collection. I really don't care  
 about the complexity of the operation, because if n is 50, everything is  
 O(1).

Then your specific application can use std.algorithm.find to implement the  
equivalent.  Only the algorithm body changes, not the usage of the  
algorithm.

 Why can't I have an inefficient (for you) remove operation that, for me,  
 will be ok?

I never said you couldn't (and I've even given examples of such  
implementations).  It's just not neatly packaged into a method.

But again, if the method is exactly the same as the efficient version for  
other containers, it becomes *impossible* to design an algorithm that  
guarantees any sort of complexity.  As I said before, quadratic sort is  
epic fail, and needs to always be avoided.

I'll give you a scenario:

People often complain that Linked List does not have an opIndex on it.   
Yes it's inefficient, but so what?  "I know it's inefficient, let me  
decide whether it's worth it or not."

So let's say I add it to LinkList.  Those people are happy.

But now, LinkList becomes defined as a *random-access-range* according to  
std.ranges.  Therefore, std.algorithm.sort(linklist) compiles!  And is now  
something like O(n^3).

Whereas LinkList already defines a sort method, which uses mergesort  
(O(nlgn)).  So are you going to realize this when reading someones code  
and you see:

sort(somelist);

That it's going to be horribly inefficient?  Why shouldn't we strive for a  
library where such things just don't compile?

 I don't want to spend 2 hours browsing the ranges algorithms to figure  
 out how to combine them to remove an element...

You severely exaggerate the amount of time needed to browse the  
algorithms.  I wrote an equivalent in less than a minute.  And I don't  
even look at std.algorithm maybe once every few months (and didn't look at  
it to write the solution).

 My point is, as someone else said it in another post: add inefficient  
 operations and tell the programmer so. Then he can decide what to do. If  
 he's sure that that performance penalty is not a big issue for him,  
 he'll be happy to have that method available.

The operation is available using std.algorithm.  There is no need to  
reimplement it as a method, sanctioned by the container.  As I said  
before, if you find yourself wanting to remove specific elements from a  
container, perhaps Array is not the smartest choice for a container.   
There are plenty of others.

There is something to be said for a language/library that discourages or  
prevents you from writing dumb code.  It means that I'm that much more  
confident a piece of code written in that language is more efficient than  
one written in another language.

-Steve

Nov 02 2011

Max Wolter <awishformore gmail.com> writes:

On 11/2/2011 2:41 PM, Steven Schveighoffer wrote:
 On Wed, 02 Nov 2011 09:17:39 -0400, Ary Manzana <ary esperanto.org.ar>
 wrote:

 On 11/2/11 10:12 AM, Steven Schveighoffer wrote:
 On Wed, 02 Nov 2011 08:40:19 -0400, Ary Manzana <ary esperanto.org.ar>
 wrote:

 On 11/2/11 8:48 AM, Steven Schveighoffer wrote:
 The basic response to this is, when dealing with containers
 generically
 (that is, you know you have a container, but you don't know what
 type),
 the "remove this element" operation is not necessarily a good
 primitive
 to have.

 Simply because from the myriad of containers, only some can implement
 this operation efficiently. Java embeds this operation in the
 interface,
 which means any interface you have to a container could potentially
 use
 O(n) time to remove that element. Such an innocuous piece of syntax
 *should* have a cost if it's not efficient IMO.

 BTW, the original question doesn't provide enough information to say
 "remove this element." Even in Java, if you aren't using the default
 comparison, you must use a comparator method to determine which one to
 remove. If cell.x == x && cell.y == y *is* the comparison operator for
 the type, then the syntax gets much simpler, because you don't need to
 pass a specialized comparison function.

 In dcollections, removing a specific element (using the default
 comparison operator for that element) on a *fast lookup* container
 is as
 simple as:

 container.remove(container.find(x));

 Which removes the element x if it's found. However, this is not
 defined
 for containers which use O(n) time to search (such as linked list),
 you
 must use std.algorithm.find for that:

 container.remove(find(container[], x).begin);

 Should work, and takes O(n) time.

 -Steve

 I don't really understand what's wrong with inefficient methods. You
 can have inefficient methods that are convenient, like removing an
 element by the default comparison, or giving it a delegate to match
 the element(s) to remove.

 You profile your application. Is that method the bottle-neck? If so,
 you change it to a more efficient one. If not, you are happy you had
 that method there, performing in an inefficient way, but which doesn't
 matter that much compared to, say, opening an SQL connection.

 Programmers want to program, fast. They have schedules, they need to
 deliver. They don't need to always find the best solution. They can
 find a compromise between "working" and "fast", move on, and later
 profile and worry about what matters most.

 Programmers don't want to fight with the language or think "Oh, so to
 remove an element I need to use this operation and combine it with
 that one and with that other one"...

 Or use the right container for the job?

 Where it really comes into play is generic programming.

 Let's say I write some algorithm that removes certain elements from a
 container:

 removeElements(C, T)(C c, T t[]...)
 {
 foreach(x; t)
 c.remove(t);
 }

 What's the complexity of this algorithm? For a HashSet, for instance, it
 will be O(n) where n is the number of elements to remove.

 But for an ArrayList, it will be O(n*m) where m is the number of
 elements in c.

 But I'm sure in this algorithm I have for this app I'm making, my
 collection won't have more than 50 elements. So everything will be
 O(1). I need to remove an element from the collection. I really don't
 care about the complexity of the operation, because if n is 50,
 everything is O(1).

 Then your specific application can use std.algorithm.find to implement
 the equivalent. Only the algorithm body changes, not the usage of the
 algorithm.

 Why can't I have an inefficient (for you) remove operation that, for
 me, will be ok?

 I never said you couldn't (and I've even given examples of such
 implementations). It's just not neatly packaged into a method.

 But again, if the method is exactly the same as the efficient version
 for other containers, it becomes *impossible* to design an algorithm
 that guarantees any sort of complexity. As I said before, quadratic sort
 is epic fail, and needs to always be avoided.

 I'll give you a scenario:

 People often complain that Linked List does not have an opIndex on it.
 Yes it's inefficient, but so what? "I know it's inefficient, let me
 decide whether it's worth it or not."

 So let's say I add it to LinkList. Those people are happy.

 But now, LinkList becomes defined as a *random-access-range* according
 to std.ranges. Therefore, std.algorithm.sort(linklist) compiles! And is
 now something like O(n^3).

 Whereas LinkList already defines a sort method, which uses mergesort
 (O(nlgn)). So are you going to realize this when reading someones code
 and you see:

 sort(somelist);

 That it's going to be horribly inefficient? Why shouldn't we strive for
 a library where such things just don't compile?

 I don't want to spend 2 hours browsing the ranges algorithms to figure
 out how to combine them to remove an element...

 You severely exaggerate the amount of time needed to browse the
 algorithms. I wrote an equivalent in less than a minute. And I don't
 even look at std.algorithm maybe once every few months (and didn't look
 at it to write the solution).

 My point is, as someone else said it in another post: add inefficient
 operations and tell the programmer so. Then he can decide what to do.
 If he's sure that that performance penalty is not a big issue for him,
 he'll be happy to have that method available.

 The operation is available using std.algorithm. There is no need to
 reimplement it as a method, sanctioned by the container. As I said
 before, if you find yourself wanting to remove specific elements from a
 container, perhaps Array is not the smartest choice for a container.
 There are plenty of others.

 There is something to be said for a language/library that discourages or
 prevents you from writing dumb code. It means that I'm that much more
 confident a piece of code written in that language is more efficient
 than one written in another language.

 -Steve

Hello.

You generally arguing this is all nice and good, but this is a very 
specific case.

I am using a LinkList because in my code, the elements are iterated over 
a million times and during this, I add stuff in-between elements all the 
time. However, I will be removing stuff *very* rarely. I am thus using 
the appropriate container and it doesn't matter whether the remove will 
be inefficient.

To put it another way: if removing elements was of crucial importance to 
the performance of my code in the first place, I wouldn't (and 
shouldn't) be using a LinkList. Therefore, implementing an inefficient 
method which does this won't be of consequence. Finally, the fundamental 
statement I'm trying to make here is: adding and removing *single* 
elements should be a straightforward method call for *any* container.

As a side note, your example about generic programming is really neat 
and makes sense; personally, I would never want a linked list with 
indexes and it's also a horrible analogy to the complaint at hand.

/Max

Nov 02 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Wed, 02 Nov 2011 19:24:03 -0400, Max Wolter <awishformore gmail.com>  
wrote:

 On 11/2/2011 2:41 PM, Steven Schveighoffer wrote:

 I never said you couldn't (and I've even given examples of such
 implementations). It's just not neatly packaged into a method.

 But again, if the method is exactly the same as the efficient version
 for other containers, it becomes *impossible* to design an algorithm
 that guarantees any sort of complexity. As I said before, quadratic sort
 is epic fail, and needs to always be avoided.

 I'll give you a scenario:

 People often complain that Linked List does not have an opIndex on it.
 Yes it's inefficient, but so what? "I know it's inefficient, let me
 decide whether it's worth it or not."

 So let's say I add it to LinkList. Those people are happy.

 But now, LinkList becomes defined as a *random-access-range* according
 to std.ranges. Therefore, std.algorithm.sort(linklist) compiles! And is
 now something like O(n^3).

 Whereas LinkList already defines a sort method, which uses mergesort
 (O(nlgn)). So are you going to realize this when reading someones code
 and you see:

 sort(somelist);

 That it's going to be horribly inefficient? Why shouldn't we strive for
 a library where such things just don't compile?

 Hello.

 You generally arguing this is all nice and good, but this is a very  
 specific case.

 I am using a LinkList because in my code, the elements are iterated over  
 a million times and during this, I add stuff in-between elements all the  
 time. However, I will be removing stuff *very* rarely. I am thus using  
 the appropriate container and it doesn't matter whether the remove will  
 be inefficient.

A linked list (any list really) is important if you want to maintain  
insertion order.  If that's not important, don't use a list, a hash or  
tree is about as efficient.  There exist (not in D) hybrid container types  
that allow O(1) removal using value, and maintain insertion order.

In fact, the actual remove is not inefficient if you have a reference to  
an element (not just the value).  Unfortunately, for SList, this is not  
the case, but it should be fixed at some point.

But I still maintain, anything in std.container is not just a container  
for your code, it's a container that tries to cater to the most common  
needs.  If you want a remove(value) function, it is trivial to write.

 To put it another way: if removing elements was of crucial importance to  
 the performance of my code in the first place, I wouldn't (and  
 shouldn't) be using a LinkList.

As long as you don't need to search for the element to remove using its  
value, removal in a linked list should be O(1).  A linked list that does  
not allow O(1) removal and O(1) insertion given a topological reference is  
a failure (yes, that includes the current version of SList).

 Therefore, implementing an inefficient method which does this won't be  
 of consequence. Finally, the fundamental statement I'm trying to make  
 here is: adding and removing *single* elements should be a  
 straightforward method call for *any* container.

Adding, yes.  Removing a container-selected element, yes.  Removing a  
*specific* element, no.  Inserting an element in a *specific location*,  
no.  std.container has taken the stance that primitives should reflect the  
efficiency of the operation.

This is not the only valid position to have.  It's just std.container's  
position.  For example, Java allows this.

 As a side note, your example about generic programming is really neat  
 and makes sense; personally, I would never want a linked list with  
 indexes and it's also a horrible analogy to the complaint at hand.

People have asked for it and argued vigorously for it on this newsgroup,  
just as you are arguing for this.

-Steve

Nov 03 2011

Max Wolter <awishformore gmail.com> writes:

 As long as you don't need to search for the element to remove using its
 value, removal in a linked list should be O(1). A linked list that does
 not allow O(1) removal and O(1) insertion given a topological reference
 is a failure (yes, that includes the current version of SList).

Well, thank god that's cleared up.

Time to move on.

/Max

Nov 04 2011

Mike Parker <aldacron gmail.com> writes:

On 11/2/2011 10:41 PM, Steven Schveighoffer wrote:

 Then your specific application can use std.algorithm.find to implement
 the equivalent. Only the algorithm body changes, not the usage of the
 algorithm.

This is the root of the problem. How are we supposed to know that we 
need something from std.algorithm to remove something from a container 
in std.container? It's non-intuitive and non-obvious unless you already 
have more than a passing familiarity with ranges. If we can't have 
remove(element) on a container, then we need some good documentation in 
the std.container module that points the way.

Nov 03 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Thursday, November 03, 2011 16:41:27 Mike Parker wrote:
 On 11/2/2011 10:41 PM, Steven Schveighoffer wrote:
 Then your specific application can use std.algorithm.find to implement
 the equivalent. Only the algorithm body changes, not the usage of the
 algorithm.

 
 This is the root of the problem. How are we supposed to know that we
 need something from std.algorithm to remove something from a container
 in std.container? It's non-intuitive and non-obvious unless you already
 have more than a passing familiarity with ranges. If we can't have
 remove(element) on a container, then we need some good documentation in
 the std.container module that points the way.

It's quite clear that we need to do a better job getting the concept of ranges 
across (probably with at least one explanatory article) and that 
std.container's documentation needs improvement.

- Jonathan M Davis

Nov 03 2011

Jacob Carlborg <doob me.com> writes:

On 2011-11-03 08:41, Mike Parker wrote:
 On 11/2/2011 10:41 PM, Steven Schveighoffer wrote:

 Then your specific application can use std.algorithm.find to implement
 the equivalent. Only the algorithm body changes, not the usage of the
 algorithm.

 This is the root of the problem. How are we supposed to know that we
 need something from std.algorithm to remove something from a container
 in std.container? It's non-intuitive and non-obvious unless you already
 have more than a passing familiarity with ranges. If we can't have
 remove(element) on a container, then we need some good documentation in
 the std.container module that points the way.

What about having a remove method on a container that calls remove in 
std.algorithm, just as a convenience.

-- 
/Jacob Carlborg

Nov 03 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Thu, 03 Nov 2011 03:41:27 -0400, Mike Parker <aldacron gmail.com> wrote:

 On 11/2/2011 10:41 PM, Steven Schveighoffer wrote:

 Then your specific application can use std.algorithm.find to implement
 the equivalent. Only the algorithm body changes, not the usage of the
 algorithm.

 This is the root of the problem. How are we supposed to know that we  
 need something from std.algorithm to remove something from a container  
 in std.container? It's non-intuitive and non-obvious unless you already  
 have more than a passing familiarity with ranges. If we can't have  
 remove(element) on a container, then we need some good documentation in  
 the std.container module that points the way.

The primitive for a container is remove(range).  Ranges are essential to  
containers, and should be the major interface to them.  remove(value)  
translates to remove(findRangeFor(value)).  I'm asserting that we should  
not promote this "shortcut" if it's unavoidably linear in nature,  
otherwise, remove(value) has the stigma of being slow, even for containers  
which can do it quickly.  It's not the only choice, and there are ways to  
argue both sides.  But the fact that one can still do this using  
std.algorithm makes it at least so you can have the best of both worlds --  
it's difficult to do, not impossible, but we can still develop generic  
code with complexity guarantees.

I agree the documentation needs more care.  I think std.container suffers  
 from neglect in other areas too.  I argued this position, however, because  
even though it's not spelled out in std.container's docs, it *is* the  
position std.container is taking.

-Steve

Nov 03 2011

travert phare.normalesup.org (Christophe) writes:

"Steven Schveighoffer" , dans le message (digitalmars.D.learn:30402), a
 The primitive for a container is remove(range).  Ranges are essential to  
 containers, and should be the major interface to them.

Programmers have to learn ranges to use containers. Hiding ranges is not 
helping them.

But here, it is more complicated than that, because range may be more 
powerful than iterators, they are less friendly to use when a single 
element reference has to be used.

c.remove(find(c.all, E));

will not remove the first occurence of E, but all elements beyond E.
so instead we have to write:

c.remove(take(find(c[], E), 1));

Then it's too much.

The problem is that range are not practical to refer to a single element 
in the container. We need to have single-element reference to manipulate 
the range. Then a function should be used to find such one-element 
reference. std.find is already taken, and can hardly be changed 
(although it should be name popUntil), but we can still use a find 
method of the container.

auto r = take(find(c[], E), 1);

should just be written:

auto r = c.find(E);

Then the syntax to remove a single element from c is acceptable:
c.remove(c.find(E)).

Now let us remove several consecutive elements from c, let us say, all 
elements between the value E and the next value F:

| auto r = find(c[], E);
| int i=0;
| foreach(e, r)
|   if (e == F) break;
|   else ++i;
| c.remove(take(r, i+1));

That is not practical at all, and in addition, it is not efficient, 
since r is walked again from E to F.

If we add little methods to single element reference to make them behave 
a little like iterators, we can recreate ranges from two single element 
references, and regain all the power of iterators. To remove all 
elements from E to the next F included:

auto r = c.find( E);
c.remove(r, r.find(F)++);
// or c.remove(range(r, r.find(F)++));

(I use the find method a bit like a joker in this exemple, it is just 
to get the idea).

In conclusion, to refer to a single element of a container for simple 
operations, range looses against iterator. Ranges even loose to refer to 
a range of consecutive elements...
Many alternatives are possible, but a simple iterator structure to refer 
to a single element, and that you can combine to recreate a range (and 
use all range algorithms) would be enough, and would complete the range 
interface to make them have no drawback against iterators.

Nov 03 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Thu, 03 Nov 2011 12:32:06 -0400, Christophe  
<travert phare.normalesup.org> wrote:

 "Steven Schveighoffer" , dans le message (digitalmars.D.learn:30402), a
 The primitive for a container is remove(range).  Ranges are essential to
 containers, and should be the major interface to them.

 Programmers have to learn ranges to use containers. Hiding ranges is not
 helping them.

 But here, it is more complicated than that, because range may be more
 powerful than iterators, they are less friendly to use when a single
 element reference has to be used.

 c.remove(find(c.all, E));

 will not remove the first occurence of E, but all elements beyond E.
 so instead we have to write:

 c.remove(take(find(c[], E), 1));

 Then it's too much.

 The problem is that range are not practical to refer to a single element
 in the container. We need to have single-element reference to manipulate
 the range. Then a function should be used to find such one-element
 reference. std.find is already taken, and can hardly be changed
 (although it should be name popUntil), but we can still use a find
 method of the container.

 auto r = take(find(c[], E), 1);

 should just be written:

 auto r = c.find(E);

 Then the syntax to remove a single element from c is acceptable:
 c.remove(c.find(E)).

 Now let us remove several consecutive elements from c, let us say, all
 elements between the value E and the next value F:

 | auto r = find(c[], E);
 | int i=0;
 | foreach(e, r)
 |   if (e == F) break;
 |   else ++i;
 | c.remove(take(r, i+1));

 That is not practical at all, and in addition, it is not efficient,
 since r is walked again from E to F.

 If we add little methods to single element reference to make them behave
 a little like iterators, we can recreate ranges from two single element
 references, and regain all the power of iterators. To remove all
 elements from E to the next F included:

 auto r = c.find( E);
 c.remove(r, r.find(F)++);
 // or c.remove(range(r, r.find(F)++));

 (I use the find method a bit like a joker in this exemple, it is just
 to get the idea).

 In conclusion, to refer to a single element of a container for simple
 operations, range looses against iterator. Ranges even loose to refer to
 a range of consecutive elements...
 Many alternatives are possible, but a simple iterator structure to refer
 to a single element, and that you can combine to recreate a range (and
 use all range algorithms) would be enough, and would complete the range
 interface to make them have no drawback against iterators.

Preaching to the choir sir :)

http://www.dsource.org/projects/dcollections/browser/branches/d2/concepts.txt

If you can convince Andrei that cursors are a good addition to  
std.container, you have my admiration.  I've tried and failed quite a few  
times.

To illustrate syntax:

auto cursor = c.elemAt(E);
c.remove(c[cursor..c.end]); // note, $ could be used here, but opDollar is  
broken.

Note, this only works if c supports elemAt.  For example, TreeSet supports  
this, LinkList does not.  But dcollections supports even other slicing  
abilities.  For example TreeSet can do this too:

c.remove(c[E..F]);

If you have a container that doesn't support elemAt, you can use  
std.algorithm.find, and the dcollections' range.begin method to get a  
valid cursor:

auto cursor = find(c[], E).begin;

I realize this isn't much better than take(find(c[], E), 1), but I don't  
know if you realize what a task/chore it would be to create a simple  
shortcut in a class for std.algorithm.find.

-Steve

Nov 03 2011

"Marco Leise" <Marco.Leise gmx.de> writes:

Am 02.11.2011, 12:48 Uhr, schrieb Steven Schveighoffer  
<schveiguy yahoo.com>:
 In dcollections, removing a specific element (using the default  
 comparison operator for that element) on a *fast lookup* container is as  
 simple as:

 container.remove(container.find(x));

 Which removes the element x if it's found.  However, this is not defined  
 for containers which use O(n) time to search (such as linked list), you  
 must use std.algorithm.find for that:

 container.remove(find(container[], x).begin);

 Should work, and takes O(n) time.

 -Steve

While I stumble over this. I found that I sometimes need a certain  
container (data structure/algorithm), but with an additional fast removal  
or lookup. So I returned a "handle" (a pointer or array index) for any  
inserted element that you can store (in the element itself for example)  
and use for lookup/removal. Templates bake the correct struct for this:

// The element I want to store

struct Element {
   // ... actual data
   size_t handle; // a pointer to a node in disguise, returned from  
insert(...)
}

// A collection node

struct Node {
   Element e;
   Node* prev;
   Node* next;
}

// Usage

elem.handle = list.insert(elem);
list.remove(elem.handle);

Now you have a double-linked list with O(1) removal, as long as you keep  
the handle around. The limitation is, that the handle has to remain  
constant. So a binary heap with the elements embedded in an array and no  
indirection through a pointer would not work. As soon as elements are  
reorganized, their handle (array index in this case) becomes invalid.
But the point is, that you can reduce the time complexity for removal in  
every data structure like this in an implementation agnostic way (unless  
plain array of elements) and I think it is better than Java's approach  
that just works, but may result in an O(n) lookup operation.
Iterators with remove functionality overlap with this partially. (They  
don't need an extra handle stored somewhere, but generally go over the  
complete list.)

Nov 05 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Sat, 05 Nov 2011 17:28:34 -0400, Marco Leise <Marco.Leise gmx.de> wrote:

 Am 02.11.2011, 12:48 Uhr, schrieb Steven Schveighoffer  
 <schveiguy yahoo.com>:
 In dcollections, removing a specific element (using the default  
 comparison operator for that element) on a *fast lookup* container is  
 as simple as:

 container.remove(container.find(x));

 Which removes the element x if it's found.  However, this is not  
 defined for containers which use O(n) time to search (such as linked  
 list), you must use std.algorithm.find for that:

 container.remove(find(container[], x).begin);

 Should work, and takes O(n) time.

 -Steve

 While I stumble over this. I found that I sometimes need a certain  
 container (data structure/algorithm), but with an additional fast  
 removal or lookup. So I returned a "handle" (a pointer or array index)  
 for any inserted element that you can store (in the element itself for  
 example) and use for lookup/removal. Templates bake the correct struct  
 for this:

 // The element I want to store

 struct Element {
    // ... actual data
    size_t handle; // a pointer to a node in disguise, returned from  
 insert(...)
 }

 // A collection node

 struct Node {
    Element e;
    Node* prev;
    Node* next;
 }

 // Usage

 elem.handle = list.insert(elem);
 list.remove(elem.handle);

You should be able to do this with dcollections:

struct Element(T) {
    T data;
    List!T.cursor handle;
}

elem.handle = list.insert(list.end, elem.data); // x is the value being  
inserted
list.remove(elem.handle);  // removes the element you inserted earlier.

Note, you have to specify an insertion point (this is similar to C++).   
But note that you can't exactly create a handle to a list of the Element  
type, because you need that cursor type to define Element.

 Now you have a double-linked list with O(1) removal, as long as you keep  
 the handle around. The limitation is, that the handle has to remain  
 constant. So a binary heap with the elements embedded in an array and no  
 indirection through a pointer would not work. As soon as elements are  
 reorganized, their handle (array index in this case) becomes invalid.
 But the point is, that you can reduce the time complexity for removal in  
 every data structure like this in an implementation agnostic way (unless  
 plain array of elements) and I think it is better than Java's approach  
 that just works, but may result in an O(n) lookup operation.
 Iterators with remove functionality overlap with this partially. (They  
 don't need an extra handle stored somewhere, but generally go over the  
 complete list.)

Definitely.  This is the benefit of iterators/cursors.  The ability to  
refer to exactly one element makes things much more straightforward.  It's  
why dcollections has cursors.

-Steve

Nov 05 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Sun, 30 Oct 2011 06:38:30 -0400, Max Wolter <awishformore gmail.com>  
wrote:

 Hello there.

 I seem to be having problems wrapping my head around how to use the  
 ranges in the context of containers in phobos. Specifically, I can't  
 seem to figure out how to remove an element from a linked list.

           foreach(cell; organism)
           {
              if(cell.x == x && cell.y == y)
              {
                 organism.stableLinearRemove(cell);
                 break;
              }
           }

 mind.d(123): Error: function  
 std.container.SList!(Cell).SList.linearRemove (Range r) is not callable  
 using argument types (Cell)
 mind.d(123): Error: cannot implicitly convert expression (cell) of type  
 cell.Cell to Take!(Range)

 I somehow get the feeling such a basic operation should just...work?

I offer an alternative collection library to std.container which has a  
simpler (IMO) interface and features.

There is specifically a feature for removal during iteration.  It would  
look like this:

foreach(ref doRemove, cell; &organism.purge)
{
    doRemove = cell.x == x && cell.y == y; // flag indicating the current  
element should be removed
    // note, not necessary to break here
    if(doRemove)
       break;
}

Note two things, removal during iteration is directly supported, meaning  
it does not derail the iteration; and removal during iteration is a quick  
operation (O(lg(n)) or better) in terms of the individual removal.

If you have a linked list, you can also do it the "range" way:

organism.remove(find!((a){return a.x == x && a.y ==  
y;})(organism[]).begin);

begin is an accessor for a dcollections range that returns a reference  
(cursor) to the first element in the range.

Note also that dcollections' linked list is doubly-linked, so removal is  
O(1).

-Steve

Oct 31 2011

D Programming

C/C++ Programming

Other

digitalmars.D.learn - std.container & ranges