www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - opApply Vs. Ranges: What should take precedence?

reply dsimcha <dsimcha yahoo.com> writes:
I was playing around with dcollections today and it reminded me of a subtle
unresolved issue.  This has been brought up here before, but always buried
deep in some other thread.  I think it deserves its own thread for some
serious debate.

What should take precedence when a class or struct defines both a range
interface and an opApply with one variable?  My vote is for opApply to take
precedence for the following reasons:

1.  In general, if someone defines a range interface, and then also defines an
opApply interface, they probably have a good reason for doing so, since ranges
provide a superset of opApply functionality.

2.  Some things can't be iterated over as efficiently w/o control of the call
stack.  For example, iterating over trees w/ control of the call stack is easy
and efficient.  Iterating without control of the call stack requires a heap
allocation for an explicit stack.

3.  opApply can be more efficient than ranges if the range functions (front,
popFront, empty) are virtual.  For runtime polymorphic class objects, it may
make sense to provide opApply as an optimization, even if the semantics are
the same as range-based foreach.
Nov 15 2009
next sibling parent BLS <windevguy hotmail.de> writes:
On 15/11/2009 20:47, dsimcha wrote:
 2.  Some things can't be iterated over as efficiently w/o control of the call
 stack.  For example, iterating over trees w/ control of the call stack is easy
 and efficient.  Iterating without control of the call stack requires a heap
 allocation for an explicit stack.

This is why we need std.collections. Simply as proof of product for ranges. Or as QA unit, if you prefer that.
Nov 15 2009
prev sibling next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sun, 15 Nov 2009 14:47:23 -0500, dsimcha <dsimcha yahoo.com> wrote:

 I was playing around with dcollections today and it reminded me of a  
 subtle
 unresolved issue.  This has been brought up here before, but always  
 buried
 deep in some other thread.  I think it deserves its own thread for some
 serious debate.

 What should take precedence when a class or struct defines both a range
 interface and an opApply with one variable?  My vote is for opApply to  
 take
 precedence for the following reasons:

 1.  In general, if someone defines a range interface, and then also  
 defines an
 opApply interface, they probably have a good reason for doing so, since  
 ranges
 provide a superset of opApply functionality.

This is only the case where ranges *can* implement foreach. There are other reasons for opApply, for instance overloading, or having multiple indexes.
 2.  Some things can't be iterated over as efficiently w/o control of the  
 call
 stack.  For example, iterating over trees w/ control of the call stack  
 is easy
 and efficient.  Iterating without control of the call stack requires a  
 heap
 allocation for an explicit stack.

Hm... I disagree here. My Tree implementation iterates over all the elements without recursion. To answer the original question -- opApply should be chosen, and this is not debatable. There is only *one* purpose for opApply -- to hook onto foreach. If you defined both opApply and equivalent range functions and range functions where chosen first, opApply would be wasted code. If you want to debate whether opApply should even exist anymore, that is a valid debate, but I think that ranges need to be much more flexible in terms of foreach before that happens. I don't think ranges will ever be useful for defining foreach on an interface. -Steve
Nov 16 2009
parent reply dsimcha <dsimcha yahoo.com> writes:
== Quote from Steven Schveighoffer (schveiguy yahoo.com)'s article
 On Sun, 15 Nov 2009 14:47:23 -0500, dsimcha <dsimcha yahoo.com> wrote:
 I was playing around with dcollections today and it reminded me of a
 subtle
 unresolved issue.  This has been brought up here before, but always
 buried
 deep in some other thread.  I think it deserves its own thread for some
 serious debate.

 What should take precedence when a class or struct defines both a range
 interface and an opApply with one variable?  My vote is for opApply to
 take
 precedence for the following reasons:

 1.  In general, if someone defines a range interface, and then also
 defines an
 opApply interface, they probably have a good reason for doing so, since
 ranges
 provide a superset of opApply functionality.

other reasons for opApply, for instance overloading, or having multiple indexes.
 2.  Some things can't be iterated over as efficiently w/o control of the
 call
 stack.  For example, iterating over trees w/ control of the call stack
 is easy
 and efficient.  Iterating without control of the call stack requires a
 heap
 allocation for an explicit stack.

elements without recursion.

Yes, but looking at your implementation, you have parent pointers, which are necessary anyhow for RB trees. This makes it easier. If you aren't using an RB tree, storing parent pointers counts as an inefficiency. So does requiring a heap allocation for an explicit stack every time a range object is created.
 To answer the original question -- opApply should be chosen, and this is
 not debatable.  There is only *one* purpose for opApply -- to hook onto
 foreach.  If you defined both opApply and equivalent range functions and
 range functions where chosen first, opApply would be wasted code.

Agreed. I'm starting to think that this is enough of a slam-dunk to go in Bugzilla.
 If you want to debate whether opApply should even exist anymore, that is a
 valid debate, but I think that ranges need to be much more flexible in
 terms of foreach before that happens.

As I've said a few times before, I totally agree w/ you that, while the functionality of opApply and ranges overlap, they have different enough goals and design tradeoffs to both have a legitimate place in the language.
Nov 16 2009
parent reply Walter Bright <newshound1 digitalmars.com> writes:
And here I was thinking perhaps opApply should just be dumped in favor 
of ranges.
Nov 20 2009
parent Justin Johansson <no spam.com> writes:
Bill Baxter wrote:
 On Fri, Nov 20, 2009 at 3:36 PM, Walter Bright
 <newshound1 digitalmars.com> wrote:
 And here I was thinking perhaps opApply should just be dumped in favor of
 ranges.

I think the opApply should take precedence. The only reason to define opApply is because foreach uses it. Ranges on the other hand are useful in other situations. --bb

Is is absolutely necessary for opApply to take a ref parameter? (I'm asking from D1 familiarity not D2 so question might not be relevant). My understanding is that by taking a ref parameter there is an additional pointer dereference to get to the actual datum that is the subject of current application. Of course the current regime does appear to allow a foreach to modify the items in the "container" being iterated over, though myself, I've only ever used foreach in readonly mode. Justin Johansson
Nov 20 2009
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Mon, 16 Nov 2009 09:12:52 -0500, dsimcha <dsimcha yahoo.com> wrote:

 == Quote from Steven Schveighoffer (schveiguy yahoo.com)'s article
 My Tree implementation iterates over all the
 elements without recursion.

Yes, but looking at your implementation, you have parent pointers, which are necessary anyhow for RB trees. This makes it easier. If you aren't using an RB tree, storing parent pointers counts as an inefficiency. So does requiring a heap allocation for an explicit stack every time a range object is created.

Ah yes, you are right. Good point. A range is impossible on such a container without an allocated stack.
 To answer the original question -- opApply should be chosen, and this is
 not debatable.  There is only *one* purpose for opApply -- to hook onto
 foreach.  If you defined both opApply and equivalent range functions and
 range functions where chosen first, opApply would be wasted code.

Agreed. I'm starting to think that this is enough of a slam-dunk to go in Bugzilla.

You mean opApply is not preferred? If so, it definitely should go in bugzilla. When I see it, I'll vote it up. -Steve
Nov 16 2009
prev sibling next sibling parent Bill Baxter <wbaxter gmail.com> writes:
On Fri, Nov 20, 2009 at 3:36 PM, Walter Bright
<newshound1 digitalmars.com> wrote:
 And here I was thinking perhaps opApply should just be dumped in favor of
 ranges.

I think the opApply should take precedence. The only reason to define opApply is because foreach uses it. Ranges on the other hand are useful in other situations. --bb
Nov 20 2009
prev sibling next sibling parent Bill Baxter <wbaxter gmail.com> writes:
On Fri, Nov 20, 2009 at 4:22 PM, Justin Johansson <no spam.com> wrote:
 Bill Baxter wrote:
 On Fri, Nov 20, 2009 at 3:36 PM, Walter Bright
 <newshound1 digitalmars.com> wrote:
 And here I was thinking perhaps opApply should just be dumped in favor =



 ranges.

I think the opApply should take precedence. The only reason to define opApply is because foreach uses it. Ranges on the other hand are useful in other situations. --bb

Is is absolutely necessary for opApply to take a ref parameter? =A0(I'm a=

 from D1 familiarity not D2 so question might not be relevant).

 My understanding is that by taking a ref parameter there is an additional
 pointer dereference to get to the actual datum that is the subject of
 current application. =A0Of course the current regime does appear to allow=

 foreach to modify the items in the "container" being iterated over, thoug=

 myself, I've only ever used foreach in readonly mode.

Making the delegate's arg "const ref" seems to work in D2. And prevents writing to the argument, as expected. --bb
Nov 20 2009
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 20 Nov 2009 19:32:10 -0500, Bill Baxter <wbaxter gmail.com> wrote:

 On Fri, Nov 20, 2009 at 4:22 PM, Justin Johansson <no spam.com> wrote:
 Bill Baxter wrote:
 On Fri, Nov 20, 2009 at 3:36 PM, Walter Bright
 <newshound1 digitalmars.com> wrote:
 And here I was thinking perhaps opApply should just be dumped in  
 favor of
 ranges.

I think the opApply should take precedence. The only reason to define opApply is because foreach uses it. Ranges on the other hand are useful in other situations. --bb

Is is absolutely necessary for opApply to take a ref parameter? (I'm asking from D1 familiarity not D2 so question might not be relevant). My understanding is that by taking a ref parameter there is an additional pointer dereference to get to the actual datum that is the subject of current application. Of course the current regime does appear to allow a foreach to modify the items in the "container" being iterated over, though myself, I've only ever used foreach in readonly mode.

Making the delegate's arg "const ref" seems to work in D2. And prevents writing to the argument, as expected.

It still requires an lvalue, which is not always convenient... I have proposed an enhancement for this, see http://d.puremagic.com/issues/show_bug.cgi?id=2443 -Steve
Nov 23 2009
prev sibling parent Bill Baxter <wbaxter gmail.com> writes:
On Mon, Nov 23, 2009 at 4:27 AM, Steven Schveighoffer
<schveiguy yahoo.com> wrote:
 On Fri, 20 Nov 2009 19:32:10 -0500, Bill Baxter <wbaxter gmail.com> wrote=

 On Fri, Nov 20, 2009 at 4:22 PM, Justin Johansson <no spam.com> wrote:
 Bill Baxter wrote:
 On Fri, Nov 20, 2009 at 3:36 PM, Walter Bright
 <newshound1 digitalmars.com> wrote:
 And here I was thinking perhaps opApply should just be dumped in favo=





 of
 ranges.

I think the opApply should take precedence. The only reason to define opApply is because foreach uses it. Ranges on the other hand are useful in other situations. --bb

Is is absolutely necessary for opApply to take a ref parameter? =A0(I'm asking from D1 familiarity not D2 so question might not be relevant). My understanding is that by taking a ref parameter there is an addition=



 pointer dereference to get to the actual datum that is the subject of
 current application. =A0Of course the current regime does appear to all=



 foreach to modify the items in the "container" being iterated over,
 though
 myself, I've only ever used foreach in readonly mode.

Making the delegate's arg "const ref" seems to work in D2. And prevents writing to the argument, as expected.

It still requires an lvalue, which is not always convenient... I have proposed an enhancement for this, see http://d.puremagic.com/issues/show_bug.cgi?id=3D2443

Yeh, that enhancement request makes total sense. I was certainly baffled by the inability to use a non-ref type there, too. --bb
Nov 23 2009