digitalmars.D - Iterators for D

Walter Bright (20/20) Nov 06 2006 It's becoming increasingly obvious that D needs iterators. While opApply...

Walter Bright (3/4) Nov 06 2006 Make that .begin

rm (16/22) Nov 06 2006 that would give something like this?

Sean Kelly (8/26) Nov 06 2006 Oh right, that's what doesn't seem to work with the C++ model--such an

rm (5/10) Nov 06 2006 I didn't read the discussion, but wasn't there a topic about

Sean Kelly (12/23) Nov 06 2006 How? Say I do this:

Hasan Aljudy (4/35) Nov 06 2006 instead of:

Bill Baxter (5/46) Nov 06 2006 The point is that you could replace char[] with SomeCollection and the

Dave (6/35) Nov 06 2006 Would this mechanism work for, say, arrays of structs (not struct*[] 's)...

Dave (4/43) Nov 06 2006 I should add - will native arrays of structs (say) provide this by defau...

Walter Bright (2/4) Nov 06 2006 Yes. It would be no different than for native arrays of ints.

Dave (2/7) Nov 06 2006 Cool!

Tom S (6/9) Nov 06 2006 Could you elaborate on how this would look like ? E.g. a usage example

Walter Bright (2/5) Nov 06 2006 With an iterator, what is the key?

Tom S (2/8) Nov 06 2006 You've got a point. I failed to see the big picture here.
Bill Baxter (17/23) Nov 06 2006 It could just be the count. Say I'm trying to linearize a non-linear
rm (3/9) Nov 06 2006 with an iterator, what is the "value"?

Sean Kelly (8/28) Nov 06 2006 Is it worth discussing whether the pointer-style C++ iterators are the

Walter Bright (2/8) Nov 06 2006 Hmm, or it could be done starting with .end and going to .begin.

Sean Kelly (7/17) Nov 06 2006 Yup. About the only potentially weird thing there is that, assuming C++...

Bill Baxter (51/71) Nov 06 2006 How about pointers (unknown length arrays)?

Walter Bright (14/82) Nov 06 2006 It's simpler than that. Pointers can be converted to arrays:

Bill Baxter (22/71) Nov 06 2006 The C++ way really embodies two design choices:

Walter Bright (2/10) Nov 06 2006 That's a very good point. Got any ideas on that?

Craig Black (8/18) Nov 07 2006 When writing custom C++ iterators, I find that end() is not ever necessa...

Tom S (3/8) Nov 07 2006 Also in some cases, 'end()' is really unnatural to write. E.g. with
Kristian Kilpi (42/50) Nov 07 2006 er

Karen Lanrap (15/16) Nov 06 2006 In the general case the structure to be linearized is a directed

Mike Capp (4/7) Nov 06 2006 Wouldn't this make iterators incompatible with e.g. linked lists? Seems ...

Walter Bright (4/12) Nov 06 2006 To work, all it has to do is support [0], and throw an exception if the

Bill Baxter (34/50) Nov 06 2006 This is one point where there's a huge divide between C++ iterators and

Sean Kelly (11/53) Nov 06 2006 For what it's worth, C++98 style iterator traits won't work in D because...

Kirk McDonald (63/92) Nov 06 2006 What are the arguments against that, again? I do not recall, and it's

Walter Bright (18/92) Nov 06 2006 Because D has many cases of implicit dereferencing, I thought it would

Kirk McDonald (34/156) Nov 06 2006 Wouldn't that be p[0] = ...?

Walter Bright (21/80) Nov 06 2006 I must have misunderstood you. I thought you were asking if [n] worked

Bill Baxter (31/45) Nov 06 2006 The only thing that comes to mind is if you get multiple indirections.

Jarrett Billingsley (16/18) Nov 06 2006 Somehow..

Walter Bright (8/32) Nov 06 2006 The problem is if the iterator is an actual pointer, there is no .value

Jarrett Billingsley (4/11) Nov 06 2006 And yet another is to allow overriding of opDeref and opDerefAssign ;)
Sean Kelly (5/34) Nov 06 2006 Is there really any reason to support pointers as iterators though? C++...

Bill Baxter (10/24) Nov 06 2006 Well, they could. It's up to Mr. Compiler Writer if a pointer has a

Walter Bright (10/16) Nov 06 2006 Consider the following struct:

Bill Baxter (10/31) Nov 06 2006 It's a name conflict. They happen. I'd have trouble if I wanted to
Charles D Hixson (15/36) Nov 07 2006 Maybe that's why Ada used "`" as a property marker rather than

Sean Kelly (6/30) Nov 07 2006 I've used it a few times, but generally more when I'm throwing together

Walter Bright (2/5) Nov 06 2006 Can you be more specific about what the problems and solutions are?

Georg Wrede (8/16) Nov 07 2006 For one thing, if pointers still are supported, then all algorithms have...
Sean Kelly (27/33) Nov 07 2006 A direct application of pointers as iterators couldn't be done exactly

Walter Bright (3/14) Nov 07 2006 Hmm, instead of 'pointers as iterators', have 'arrays as iterators'.

Sean Kelly (5/20) Nov 07 2006 I went ahead and wrote everything up during my commute this morning.
Benji Smith (29/40) Nov 07 2006 I agree completely. As someone with a background in Java, C#, and Python...
Bill Baxter (13/28) Nov 07 2006 OOh, I like the direction this is heading.

Ary Manzana (3/8) Nov 07 2006 In fact, Java has them:

Daniel Keep (33/84) Nov 07 2006 Hang on; aren't we back to where we are *right now*?

Bill Baxter (14/35) Nov 07 2006 Close, but right now we don't have a good way to iterate over multiple

Daniel Keep (22/76) Nov 07 2006 Well, as I said, once you have your sequence expressed as a coroutine,

Sean Kelly (7/13) Nov 08 2006 That's pretty much it. Saving the current register data, loading the

Benji Smith (11/21) Nov 08 2006 What about containers where there are multiple valid orders of

Chris Nicholson-Sauls (13/39) Nov 08 2006 It should be roughly equivelant in cost to explicitly calling the method...

Sean Kelly (17/106) Nov 08 2006 Essentially. But D has no formal description of an iterator type. I

Bill Baxter (15/137) Nov 08 2006 I wrote some mesh (graph) manipulation algorithms in Python. I really

=?iso-8859-1?q?Knud_S=F8rensen?= (3/8) Nov 06 2006 What are those cases ? Maybe we can find a way to fix the problem with

Walter Bright (4/14) Nov 06 2006 One such case is the usefulness of being able to provide an input

Bill Baxter (12/27) Nov 06 2006 Another case is the desire to iterate over multiple containers at the
KlausO (6/10) Nov 07 2006 Some questions:

Chris Nicholson-Sauls (67/80) Nov 07 2006 Mostly agreed. For example, one case I seem to find mentioned a few tim...

Bill Baxter (5/23) Nov 07 2006 Other way around. Tree traversal is harder to do with C++ style

Craig Black (6/26) Nov 07 2006 It's not a huge problem, but passing an argument needlessly is both ugly...
Oskar Linde (107/121) Nov 07 2006 I apologize for this long post. Writing too much has become a bad habit

nazo (11/17) Nov 07 2006 I think that this is more better.
Walter Bright (12/53) Nov 07 2006 Right.

Dave (6/45) Nov 12 2006 For arrays, we already have arr.ptr, what if array.end was simply added?

Burton Radons (53/53) Nov 10 2006 I don't like C++ iterators; they're a very old design made to fit a

Walter Bright (2/2) Nov 10 2006 As nice as yield is for some types of problems, they aren't going to

Walter Bright <newshound digitalmars.com> writes:

It's becoming increasingly obvious that D needs iterators. While opApply 
  is far better for some kinds of iteration (such as recursively 
traversing a directory), iterators are more efficient in some cases, and 
allow for things that opApply makes difficult.

So hear are the design goals:

1) Works with dynamic and stat arrays
2) Doesn't need to work with associative arrays
3) Should be possible to achieve "pointer efficiency" with it
4) Needs to be able to restrict lvalue access (what C++ does with const 
iterators)
5) Needs to work seamlessly with foreach
6) Iterators need to be copyable

So here's one design:

.being property returns an iterator that starts at the beginning

.end returns an iterator that is at the end

Overloading * doesn't work with D. But, instead:

Overload opIndex for rvalue access
Overload opIndexAssign for lvalue access

Overloading opIndex also will work for random access

foreach loops will not be able to have a 'key' parameter.

Nov 06 2006

Walter Bright <newshound digitalmars.com> writes:

Walter Bright wrote:
 .being property returns an iterator that starts at the beginning

Make that .begin

Also, overloading += will be used to advance the iterator.

Nov 06 2006

rm <roel.mathys gmail.com> writes:

Walter Bright wrote:
 Walter Bright wrote:
 .being property returns an iterator that starts at the beginning

 
 Make that .begin
 
 Also, overloading += will be used to advance the iterator.

that would give something like this?

import std.stdio;

void main()
{
	char[] s = "hello";
	// += for advancing an iterator, why not ++?
	for (iterator i = s.begin; i != s.end; i += 1)
		writefln(s[i]);

	// ??? i is an iterator or a char  ???
	foreach(i; s)
		// extend writefln to take iterators ?
		// but that would mean that it must be possible
		// to determine a value from an iterator
		writefln(i);
}

Nov 06 2006

Sean Kelly <sean f4.ca> writes:

rm wrote:
 Walter Bright wrote:
 Walter Bright wrote:
 .being property returns an iterator that starts at the beginning

 Make that .begin

 Also, overloading += will be used to advance the iterator.

 
 that would give something like this?
 
 import std.stdio;
 
 void main()
 {
     char[] s = "hello";
     // += for advancing an iterator, why not ++?
     for (iterator i = s.begin; i != s.end; i += 1)
         writefln(s[i]);

Oh right, that's what doesn't seem to work with the C++ model--such an 
iterator wouldn't work in foreach:

     foreach( e; s.begin() ) // nowhere to supply s.end()

I think the Java style may be more appropriate:

     foreach( e; s.fwdIter() )
     foreach( e; s.revIter() )


Sean

Nov 06 2006

rm <roel.mathys gmail.com> writes:

 I think the Java style may be more appropriate:
 
     foreach( e; s.fwdIter() )
     foreach( e; s.revIter() )
 

I didn't read the discussion, but wasn't there a topic about 
foreach_reverse? Should that not cover s.revIter()?

Secondly, why must one give in the foreach statement the first iterator?
I'd say that there the iterable must be given?

roel

Nov 06 2006

Sean Kelly <sean f4.ca> writes:

rm wrote:
 
 I think the Java style may be more appropriate:

     foreach( e; s.fwdIter() )
     foreach( e; s.revIter() )

 
 I didn't read the discussion, but wasn't there a topic about 
 foreach_reverse? Should that not cover s.revIter()?

How?  Say I do this:

     foreach_reverse( e; s.begin() )

I've passed an iterator pointing to the beginning of a sequence to an 
algorithm that wants to iterate backwards, so best case the iterator is 
bidirectional and we will visit exactly one element (the theoretical 
"last" in this case) before exiting the loop.

 Secondly, why must one give in the foreach statement the first iterator?
 I'd say that there the iterable must be given?

Something like that, yes.  You need a unified object that knows when 
it's reached the end of the sequence.  So assuming C++ style iterators 
you'd have to wrap them in an object that takes care of the comparisons 
and looping to work with foreach.


Sean

Nov 06 2006

Hasan Aljudy <hasan.aljudy gmail.com> writes:

rm wrote:
 Walter Bright wrote:
 Walter Bright wrote:
 .being property returns an iterator that starts at the beginning

 Make that .begin

 Also, overloading += will be used to advance the iterator.

 
 that would give something like this?
 
 import std.stdio;
 
 void main()
 {
     char[] s = "hello";
     // += for advancing an iterator, why not ++?
     for (iterator i = s.begin; i != s.end; i += 1)
         writefln(s[i]);
 
     // ??? i is an iterator or a char  ???
     foreach(i; s)
         // extend writefln to take iterators ?
         // but that would mean that it must be possible
         // to determine a value from an iterator
         writefln(i);
 }
 

Why would I use
     for (iterator i = s.begin; i != s.end; i += 1)
         writefln(s[i]);

instead of:

     for (int i = 0; i < s.length; i++)
         writefln(s[i]);

?

Nov 06 2006

Bill Baxter <dnewsgroup billbaxter.com> writes:

Hasan Aljudy wrote:
 
 
 rm wrote:
 Walter Bright wrote:
 Walter Bright wrote:
 .being property returns an iterator that starts at the beginning

 Make that .begin

 Also, overloading += will be used to advance the iterator.

 that would give something like this?

 import std.stdio;

 void main()
 {
     char[] s = "hello";
     // += for advancing an iterator, why not ++?
     for (iterator i = s.begin; i != s.end; i += 1)
         writefln(s[i]);

     // ??? i is an iterator or a char  ???
     foreach(i; s)
         // extend writefln to take iterators ?
         // but that would mean that it must be possible
         // to determine a value from an iterator
         writefln(i);
 }

 
 Why would I use
  >     for (iterator i = s.begin; i != s.end; i += 1)
  >         writefln(s[i]);
 
 instead of:
 
  >     for (int i = 0; i < s.length; i++)
  >         writefln(s[i]);
 
 ?

The point is that you could replace char[] with SomeCollection and the 
code should still work.  But the same mechanism should also work on the 
built-in types.

--bb

Nov 06 2006

Dave <Dave_member pathlink.com> writes:

Walter Bright wrote:
 It's becoming increasingly obvious that D needs iterators. While opApply 
  is far better for some kinds of iteration (such as recursively 
 traversing a directory), iterators are more efficient in some cases, and 
 allow for things that opApply makes difficult.
 
 So hear are the design goals:
 
 1) Works with dynamic and stat arrays
 2) Doesn't need to work with associative arrays
 3) Should be possible to achieve "pointer efficiency" with it
 4) Needs to be able to restrict lvalue access (what C++ does with const 
 iterators)
 5) Needs to work seamlessly with foreach
 6) Iterators need to be copyable
 
 So here's one design:
 
 .being property returns an iterator that starts at the beginning
 
 .end returns an iterator that is at the end
 
 Overloading * doesn't work with D. But, instead:
 
 Overload opIndex for rvalue access
 Overload opIndexAssign for lvalue access
 
 Overloading opIndex also will work for random access
 
 foreach loops will not be able to have a 'key' parameter.

Would this mechanism work for, say, arrays of structs (not struct*[] 's) so
that we could avoid the 
overhead of either a) copying the struct and/or b) the extra level of
indirection in the loop? (In 
other words, could the iterator provide a reference to each element by default).

Thanks,

- Dave

Nov 06 2006

Dave <Dave_member pathlink.com> writes:

Dave wrote:
 Walter Bright wrote:
 It's becoming increasingly obvious that D needs iterators. While 
 opApply  is far better for some kinds of iteration (such as 
 recursively traversing a directory), iterators are more efficient in 
 some cases, and allow for things that opApply makes difficult.

 So hear are the design goals:

 1) Works with dynamic and stat arrays
 2) Doesn't need to work with associative arrays
 3) Should be possible to achieve "pointer efficiency" with it
 4) Needs to be able to restrict lvalue access (what C++ does with 
 const iterators)
 5) Needs to work seamlessly with foreach
 6) Iterators need to be copyable

 So here's one design:

 .being property returns an iterator that starts at the beginning

 .end returns an iterator that is at the end

 Overloading * doesn't work with D. But, instead:

 Overload opIndex for rvalue access
 Overload opIndexAssign for lvalue access

 Overloading opIndex also will work for random access

 foreach loops will not be able to have a 'key' parameter.

 
 Would this mechanism work for, say, arrays of structs (not struct*[] 's) 
 so that we could avoid the overhead of either a) copying the struct 
 and/or b) the extra level of indirection in the loop? (In other words, 
 could the iterator provide a reference to each element by default).
 
 Thanks,
 
 - Dave

I should add - will native arrays of structs (say) provide this by default w/o
having to add 
specialized operators?

Thanks.

Nov 06 2006

Walter Bright <newshound digitalmars.com> writes:

Dave wrote:
 I should add - will native arrays of structs (say) provide this by 
 default w/o having to add specialized operators?

Yes. It would be no different than for native arrays of ints.

Nov 06 2006

Dave <Dave_member pathlink.com> writes:

Walter Bright wrote:
 Dave wrote:
 I should add - will native arrays of structs (say) provide this by 
 default w/o having to add specialized operators?

 
 Yes. It would be no different than for native arrays of ints.

Cool!

Nov 06 2006

Tom S <h3r3tic remove.mat.uni.torun.pl> writes:

Walter Bright wrote:
 Overload opIndex for rvalue access
 Overload opIndexAssign for lvalue access

Could you elaborate on how this would look like ? E.g. a usage example 
of iterators with the proposed design.


 foreach loops will not be able to have a 'key' parameter.

What's the particular reason for that restriction ?


--
Tomasz Stachowiak

Nov 06 2006

Walter Bright <newshound digitalmars.com> writes:

Tom S wrote:
 Walter Bright wrote:
 foreach loops will not be able to have a 'key' parameter.

 What's the particular reason for that restriction ?

With an iterator, what is the key?

Nov 06 2006

Tom S <h3r3tic remove.mat.uni.torun.pl> writes:

Walter Bright wrote:
 Tom S wrote:
 Walter Bright wrote:
 foreach loops will not be able to have a 'key' parameter.

 What's the particular reason for that restriction ?

 
 With an iterator, what is the key?

You've got a point. I failed to see the big picture here.

Nov 06 2006

Bill Baxter <dnewsgroup billbaxter.com> writes:

Walter Bright wrote:
 Tom S wrote:
 Walter Bright wrote:
 foreach loops will not be able to have a 'key' parameter.

 What's the particular reason for that restriction ?

 
 With an iterator, what is the key?

It could just be the count.  Say I'm trying to linearize a non-linear 
datastructure into a pre-existing array:


    foreach(int i, val v; hierarchical_thing.iterator())
    {
       linearized[i] = val;
    }

Also having the count there would make it easier to do something like 
iterating lock step in funky ways over different objects:

    foreach(int i, val v; thing.randomized_iterator()) {
       other_thing[i] = val;
    }

Obviously you could count things yourself if you had to.
But I find for instance in Python that I use 'enumerate()' quite often. 
  enumerate() takes an iterator and returns an iterator that spits out 
(count,value) tuples.

--bb

Nov 06 2006

rm <roel.mathys gmail.com> writes:

Walter Bright wrote:
 Tom S wrote:
 Walter Bright wrote:
 foreach loops will not be able to have a 'key' parameter.

 What's the particular reason for that restriction ?

 
 With an iterator, what is the key?

with an iterator, what is the "value"?
Is it the key-value *pair*, or is it just the key?

Nov 06 2006

Sean Kelly <sean f4.ca> writes:

Walter Bright wrote:
 It's becoming increasingly obvious that D needs iterators. While opApply 
  is far better for some kinds of iteration (such as recursively 
 traversing a directory), iterators are more efficient in some cases, and 
 allow for things that opApply makes difficult.
 
 So hear are the design goals:
 
 1) Works with dynamic and stat arrays
 2) Doesn't need to work with associative arrays
 3) Should be possible to achieve "pointer efficiency" with it
 4) Needs to be able to restrict lvalue access (what C++ does with const 
 iterators)
 5) Needs to work seamlessly with foreach
 6) Iterators need to be copyable
 
 So here's one design:
 
 ..being property returns an iterator that starts at the beginning
 
 ..end returns an iterator that is at the end

Is it worth discussing whether the pointer-style C++ iterators are the 
best approach?  They certainly are if random access iterators are a 
requirement, but if we're just implementing forward and reverse 
iterators, the all-in-one Java approach may be better.  In that vein, I 
  assume alose that reverse iteration would work with reverse iterators 
and an .rbegin type method instead of .begin and foreach_reverse?


Sean

Nov 06 2006

Walter Bright <newshound digitalmars.com> writes:

Sean Kelly wrote:
 Is it worth discussing whether the pointer-style C++ iterators are the 
 best approach?  They certainly are if random access iterators are a 
 requirement, but if we're just implementing forward and reverse 
 iterators, the all-in-one Java approach may be better.  In that vein, I 
  assume alose that reverse iteration would work with reverse iterators 
 and an .rbegin type method instead of .begin and foreach_reverse?

Hmm, or it could be done starting with .end and going to .begin.

Nov 06 2006

Sean Kelly <sean f4.ca> writes:

Walter Bright wrote:
 Sean Kelly wrote:
 Is it worth discussing whether the pointer-style C++ iterators are the 
 best approach?  They certainly are if random access iterators are a 
 requirement, but if we're just implementing forward and reverse 
 iterators, the all-in-one Java approach may be better.  In that vein, 
 I  assume alose that reverse iteration would work with reverse 
 iterators and an .rbegin type method instead of .begin and 
 foreach_reverse?

 
 Hmm, or it could be done starting with .end and going to .begin.

Yup.  About the only potentially weird thing there is that, assuming C++ 
  style iterators, it would require reverse iterators and forward 
iterators to be comparable to one another.  And, I suppose, for .end to 
return a reverse style iterator even for sequences that don't support 
reverse iteration?


Sean

Nov 06 2006

Bill Baxter <wbaxter gmail.com> writes:

Walter Bright wrote:
 It's becoming increasingly obvious that D needs iterators. While opApply 
  is far better for some kinds of iteration (such as recursively 
 traversing a directory), iterators are more efficient in some cases, and 
 allow for things that opApply makes difficult.
 
 So hear are the design goals:
 
 1) Works with dynamic and stat arrays

How about pointers (unknown length arrays)?
   int *some_numbers;

Is it required that a pointer be a valid iterator?
I.e. should it be possible to have this template function:

    print_all(some_numbers, some_numbers+theLength);

Or will that case require a wrapper object that contains theLength?

    auto iter = PointerIterator(some_numbers, theLength);
    print_all(iter);

 2) Doesn't need to work with associative arrays

Why not?  Do you mean it doesn't need to because you can just iterate 
over keys or values?

 3) Should be possible to achieve "pointer efficiency" with it
 4) Needs to be able to restrict lvalue access (what C++ does with const 
 iterators)
 5) Needs to work seamlessly with foreach
 6) Iterators need to be copyable

 So here's one design:
 
 .being property returns an iterator that starts at the beginning
 
 .end returns an iterator that is at the end

That's like C++'s way.  Iterator is basically a generalized pointer.

  is like a pointer that knows it's own limits.

the C++ way working for D because, in practice, references and 
dereferencing get used pretty heavily there.  On the other hand 


One of the major design goals for C++ was that for a basic array a 
regular pointer should be a valid iterator.  Thus the iterator cannot 
know where it's own end() is, because a pointer does not have such 
knowledge.  But aside from cute demos on SGI's STL web page where 
algorithms are applied to plain arrays using pointers,  I've never seen 
that sort of thing used in the real world.  You almost always use a 
std::vector or std::list or something besides a raw array.  So pointer 
equivalence seems to be pretty useless to me.  Even moreso with D.  In 
the rare case you need to work with raw pointers, you can always make an 
iterator wrapper.

On the plus side for C++ style:
* iterators are lightweight (just one size_t/pointer in most cases)
* algorithms and methods all take begin/end pairs making things uniform.
Minuses
* Iterators are dumb and error prone and easily go out of bounds
* algorithms all require begin/end pairs making things cumbersome

I think the minuses outweigh the plusses.
But maybe it's useful to take one concrete example:  A generic mergesort 
or quicksort algorithm.

In C++ recursive calls just pass the limits of the range to the children
mergesort(begin, end)
{
     ...
     mergesort(begin, (end-begin)/2);
     mergesort((end-begin)/2, end);
}
It's very efficient.

In the iterator-is-object approach I'm not sure how that works.
If all you've got is .next() or .hasNext() I'm not sure what you're 
supposed to do to create an iterator over half the range represented by 
the original iterator.


 foreach loops will not be able to have a 'key' parameter.

Don't understand that one.


--bb

Nov 06 2006

Walter Bright <newshound digitalmars.com> writes:

Bill Baxter wrote:
 Walter Bright wrote:
 It's becoming increasingly obvious that D needs iterators. While 
 opApply  is far better for some kinds of iteration (such as 
 recursively traversing a directory), iterators are more efficient in 
 some cases, and allow for things that opApply makes difficult.

 So hear are the design goals:

 1) Works with dynamic and stat arrays

 
 How about pointers (unknown length arrays)?
   int *some_numbers;
 
 Is it required that a pointer be a valid iterator?
 I.e. should it be possible to have this template function:
 
    print_all(some_numbers, some_numbers+theLength);
 
 Or will that case require a wrapper object that contains theLength?
 
    auto iter = PointerIterator(some_numbers, theLength);
    print_all(iter);

It's simpler than that. Pointers can be converted to arrays:
	auto array = some_numbers[0 .. theLength];
and then you can get the iterator.

 2) Doesn't need to work with associative arrays

 Why not?  Do you mean it doesn't need to because you can just iterate 
 over keys or values?

Because that falls into my big complaint with iterators - there's no way 
to efficiently 'linearize' a binary tree. Might as well just iterate 
over the keys or values.


 That's like C++'s way.  Iterator is basically a generalized pointer.

Yes.


  is like a pointer that knows it's own limits.

 the C++ way working for D because, in practice, references and 
 dereferencing get used pretty heavily there.  On the other hand 


I think the C++ like way will be a lot more efficient, and I think it 
will work. Reference types are not needed.

 One of the major design goals for C++ was that for a basic array a 
 regular pointer should be a valid iterator.  Thus the iterator cannot 
 know where it's own end() is, because a pointer does not have such 
 knowledge.  But aside from cute demos on SGI's STL web page where 
 algorithms are applied to plain arrays using pointers,  I've never seen 
 that sort of thing used in the real world.  You almost always use a 
 std::vector or std::list or something besides a raw array.  So pointer 
 equivalence seems to be pretty useless to me.  Even moreso with D.  In 
 the rare case you need to work with raw pointers, you can always make an 
 iterator wrapper.

I think it'll work fine, since pointers can be easily converted to 
dynamic arrays (see above).


 On the plus side for C++ style:
 * iterators are lightweight (just one size_t/pointer in most cases)
 * algorithms and methods all take begin/end pairs making things uniform.
 Minuses
 * Iterators are dumb and error prone and easily go out of bounds
 * algorithms all require begin/end pairs making things cumbersome
 
 I think the minuses outweigh the plusses.
 But maybe it's useful to take one concrete example:  A generic mergesort 
 or quicksort algorithm.
 
 In C++ recursive calls just pass the limits of the range to the children
 mergesort(begin, end)
 {
     ...
     mergesort(begin, (end-begin)/2);
     mergesort((end-begin)/2, end);
 }
 It's very efficient.
 
 In the iterator-is-object approach I'm not sure how that works.
 If all you've got is .next() or .hasNext() I'm not sure what you're 
 supposed to do to create an iterator over half the range represented by 
 the original iterator.

Iterators can be any type, not just objects, that support [], ++, +=, and =.

 foreach loops will not be able to have a 'key' parameter.

 Don't understand that one.

What's the key for an unknown iterator type?

Nov 06 2006

Bill Baxter <dnewsgroup billbaxter.com> writes:

Walter Bright wrote:
 Bill Baxter wrote:
 Walter Bright wrote:
 It's becoming increasingly obvious that D needs iterators. While 
 opApply  is far better for some kinds of iteration (such as 
 recursively traversing a directory), iterators are more efficient in 
 some cases, and allow for things that opApply makes difficult.

 So hear are the design goals:

 1) Works with dynamic and stat arrays

 How about pointers (unknown length arrays)?
   int *some_numbers;

 Is it required that a pointer be a valid iterator?
 I.e. should it be possible to have this template function:

    print_all(some_numbers, some_numbers+theLength);

 Or will that case require a wrapper object that contains theLength?

    auto iter = PointerIterator(some_numbers, theLength);
    print_all(iter);

 
 It's simpler than that. Pointers can be converted to arrays:
     auto array = some_numbers[0 .. theLength];
 and then you can get the iterator.

Nice!  Didn't know you could do that.


 I think the C++ like way will be a lot more efficient, and I think it 
 will work. Reference types are not needed.

The C++ way really embodies two design choices:
1) iterators are defined by 'concept'
    --> Means generic iteration functions all need to be templates
    --> The other alternative is define them by interfaces.
        --> these two aren't necessarily mutually exclusive though
2) an iterator is a pointer -- begin() and end() are kept separate.
    --> The other alternative is 'an iterator is a range'

Definitely 1) will be more efficient than using interfaces,
but I don't thing pointer semantics, 2), are required for 1) to work.

It may be possible for D iterators to have the speed of C++ while having 
the convenience and safety of Java's iterators that know their bounds.

 I think the minuses outweigh the plusses.
 But maybe it's useful to take one concrete example:  A generic 
 mergesort or quicksort algorithm.

 In C++ recursive calls just pass the limits of the range to the children
 mergesort(begin, end)
 {
     ...
     mergesort(begin, (end-begin)/2);
     mergesort((end-begin)/2, end);
 }
 It's very efficient.

 In the iterator-is-object approach I'm not sure how that works.
 If all you've got is .next() or .hasNext() I'm not sure what you're 
 supposed to do to create an iterator over half the range represented 
 by the original iterator.

 
 Iterators can be any type, not just objects, that support [], ++, +=, 
 and =.

I think the iterator-as-range idea is at least worth discussing.  99% of 
the time you use an iterator you also need to know where to stop.  It's 
pretty similar to how arrays in D keep a length, because 99% of the time 
when you have an array you need to know how long it is too.  So it makes 
sense to keep those two bits of information together.  Similarly in C++, 
when you have an iterator, you almost always need the end() too.  And 
any time you want to pass around iterators you end up having to pass the 
two bits of information around separately.

--bb

Nov 06 2006

Walter Bright <newshound digitalmars.com> writes:

Bill Baxter wrote:
 I think the iterator-as-range idea is at least worth discussing.  99% of 
 the time you use an iterator you also need to know where to stop.  It's 
 pretty similar to how arrays in D keep a length, because 99% of the time 
 when you have an array you need to know how long it is too.  So it makes 
 sense to keep those two bits of information together.  Similarly in C++, 
 when you have an iterator, you almost always need the end() too.  And 
 any time you want to pass around iterators you end up having to pass the 
 two bits of information around separately.

That's a very good point. Got any ideas on that?

Nov 06 2006

"Craig Black" <cblack ara.com> writes:

When writing custom C++ iterators, I find that end() is not ever necessary.
If end() is not used, it means that a little more smarts have to be added to
the iterator itself so that the iterator knows when to stop.  In some cases
this means that the iterator needs a pointer to the collection/container
object in order to get that information.

-Craig

"Walter Bright" <newshound digitalmars.com> wrote in message
news:eioqvl$26t0$2 digitaldaemon.com...
 Bill Baxter wrote:
 I think the iterator-as-range idea is at least worth discussing.  99% of
 the time you use an iterator you also need to know where to stop.  It's
 pretty similar to how arrays in D keep a length, because 99% of the time
 when you have an array you need to know how long it is too.  So it makes
 sense to keep those two bits of information together.  Similarly in C++,
 when you have an iterator, you almost always need the end() too.  And
 any time you want to pass around iterators you end up having to pass the
 two bits of information around separately.

 That's a very good point. Got any ideas on that?

Nov 07 2006

Tom S <h3r3tic remove.mat.uni.torun.pl> writes:

Craig Black wrote:
 When writing custom C++ iterators, I find that end() is not ever necessary.
 If end() is not used, it means that a little more smarts have to be added to
 the iterator itself so that the iterator knows when to stop.  In some cases
 this means that the iterator needs a pointer to the collection/container
 object in order to get that information.

Also in some cases, 'end()' is really unnatural to write. E.g. with 
cyclic containers.

Nov 07 2006

"Kristian Kilpi" <kjkilpi gmail.com> writes:

On Tue, 07 Nov 2006 16:03:13 +0200, Craig Black <cblack ara.com> wrote:
 When writing custom C++ iterators, I find that end() is not ever  =

 necessary.
 If end() is not used, it means that a little more smarts have to be  =

 added to
 the iterator itself so that the iterator knows when to stop.  In some =

 =

 cases
 this means that the iterator needs a pointer to the collection/contain=

er
 object in order to get that information.

A benefit of having a pointer to the iterated container is of course tha=
t  =

boundary checks are always up to date, even if you change the container =
 =

while iterating it. Well, I'm not saying that it would be the way to do =
 =

the iterators, or that it's not the way to do them.

Btw, I have used (in C++) iterators having the following functions:

   isEnd()    //does the iterator point to the end position?  (isNotEnd(=
)  =

etc provided for convenience)
              //there is also: isBegin(), isFirst(), isLast()
   toBegin()  //to position -1 (which is outside the container)
   toFirst()  //to position 0
   toLast()   //to position end - 1
   toEnd()    //the end position points outside the container
   toNext()
   toPrev()
   get()      //returns the current item
   read()     //=3D=3D get() + toNext()
   getNext()
   getPrev()
   readBackward()  //=3D=3D get() + toPrev()
   ...

I have named iterators capable of changing the container as 'Writer's  =

('wr' + 'iter'). They have the following functions:

   set()
   write()  //=3D=3D set() + toNext()
   ...


Ok, there are a lot of functions... but my actual point is that the  =

iterator has two positions, 'begin' and 'end', that point *outside* the =
 =

container. When writing (or setting) a value to 'end', the value is  =

appended to the end of the container (very natural operation for files).=
  =

The same way "i.toBegin(); i.set(val);" inserts 'val' to the start of th=
e  =

container (very unnatural operation for files, and actually causes an  =

error if tried for files).

Nov 07 2006

Karen Lanrap <karen digitaldaemon.com> writes:

Walter Bright wrote:

 there's no way to efficiently 'linearize' a binary tree

In the general case the structure to be linearized is a directed 
graph. Every linearization has to at least implicitely build a 
spanning tree for that graph.

A priori in spanning trees the number of childs of a node is limited 
only by the number of nodes of the structure.

Therefore every linearizator must be able to handle trees of 
arbitrary degree.

From tree grammars it is known, that top down attributions and bottom 
up attributions are incomparable to each other when only one run is 
allowed.

Because the goal of any linearization is to compute some attribution 
of the structure every 'traverser' must assist the process outlined 
above, where in the general case more than one run is needed.

Now: what is efficiency in this general case?

Nov 06 2006

Mike Capp <mike.capp gmail.com> writes:

 Overload opIndex for rvalue access
 [...]
 Overloading opIndex also will work for random access

Wouldn't this make iterators incompatible with e.g. linked lists? Seems to
defeat
much of the purpose.

(It might be possible to implement random access for a list, but it's not
something you'd want to encourage.)

Nov 06 2006

Walter Bright <newshound digitalmars.com> writes:

Mike Capp wrote:
 Overload opIndex for rvalue access
 [...]
 Overloading opIndex also will work for random access

 
 Wouldn't this make iterators incompatible with e.g. linked lists? Seems to
defeat
 much of the purpose.

To work, all it has to do is support [0], and throw an exception if the 
index is anything but 0.

 (It might be possible to implement random access for a list, but it's not
 something you'd want to encourage.)

I agree.

Nov 06 2006

Bill Baxter <dnewsgroup billbaxter.com> writes:

Walter Bright wrote:
 Mike Capp wrote:
 Overload opIndex for rvalue access
 [...]
 Overloading opIndex also will work for random access

 Wouldn't this make iterators incompatible with e.g. linked lists? 
 Seems to defeat
 much of the purpose.

 
 To work, all it has to do is support [0], and throw an exception if the 
 index is anything but 0.
 
 (It might be possible to implement random access for a list, but it's not
 something you'd want to encourage.)

 
 I agree.

This is one point where there's a huge divide between C++ iterators and 
Java iterators.

In C++ there is no such thing as *the* iterator type or interface.  What 
C++ (or STL, rather) has is collection of *iterator concepts*.  What 
that means is that if opIndex doesn't make sense for the the collection 
in question, you just leave it out.  A 'concept' is like an interface 
but for templates.  Here are the concepts in the STL:

Trivial Iterator - http://www.sgi.com/tech/stl/trivial.html
Input Iterator - http://www.sgi.com/tech/stl/InputIterator.html
Output Iterator - http://www.sgi.com/tech/stl/OutputIterator.html
Forward Iterator - http://www.sgi.com/tech/stl/ForwardIterator.html
Bidirectional Iterator - 
http://www.sgi.com/tech/stl/BidirectionalIterator.html
Random Access Iterator
http://www.sgi.com/tech/stl/RandomAccessIterator.html

If you want your container to support forward iteration, then your 
iterator can be of any type you wish, but it must support x++, ++x, and 
*x operations.



pros and cons.
* Main pro is that functions that take generic iterators can be ordinary 
functions rather than templates.
* Main con is that such functions can't generally be as efficient as raw 
pointer manipulation, because you have to make a virtual method call 
through the interface pointer to get each next element.


D is in a very interesting place in that it is high level enough that 
the C++ model doesn't quite make sense, but low level enough that the 
Java model doesn't quite make sense either.

D may be able to have the best of both worlds.  Define *both* standard 
concepts *and* standard interfaces that implement those concepts, and 
then libraries can write either templatized algorithms for speed, or 
regular function based algorithms for code-size, use with inheritance, etc.

--bb

Nov 06 2006

Sean Kelly <sean f4.ca> writes:

Bill Baxter wrote:
 Walter Bright wrote:
 Mike Capp wrote:
 Overload opIndex for rvalue access
 [...]
 Overloading opIndex also will work for random access

 Wouldn't this make iterators incompatible with e.g. linked lists? 
 Seems to defeat
 much of the purpose.

 To work, all it has to do is support [0], and throw an exception if 
 the index is anything but 0.

 (It might be possible to implement random access for a list, but it's 
 not
 something you'd want to encourage.)

 I agree.

 
 This is one point where there's a huge divide between C++ iterators and 
 Java iterators.
 
 In C++ there is no such thing as *the* iterator type or interface.  What 
 C++ (or STL, rather) has is collection of *iterator concepts*.  What 
 that means is that if opIndex doesn't make sense for the the collection 
 in question, you just leave it out.  A 'concept' is like an interface 
 but for templates.

For what it's worth, C++98 style iterator traits won't work in D because 
of D's simpler template lookup rules.  I think we'll have to go with 
something much closer to C++0x concept checking if we want to overload 
algorithms for different iterator categories.


 pros and cons.
 * Main pro is that functions that take generic iterators can be ordinary 
 functions rather than templates.
 * Main con is that such functions can't generally be as efficient as raw 
 pointer manipulation, because you have to make a virtual method call 
 through the interface pointer to get each next element.

Hrm... you could still use a Java design without the base class and have 
everyone just use templates though.

 D is in a very interesting place in that it is high level enough that 
 the C++ model doesn't quite make sense, but low level enough that the 
 Java model doesn't quite make sense either.
 
 D may be able to have the best of both worlds.  Define *both* standard 
 concepts *and* standard interfaces that implement those concepts, and 
 then libraries can write either templatized algorithms for speed, or 
 regular function based algorithms for code-size, use with inheritance, etc.

That's kind of what I was thinking, though if iterators all inherit from 
a common base class then the calls will be virtual regardless, unless 
the DMD optimizer gets a bit fancier.


Sean

Nov 06 2006

Kirk McDonald <kirklin.mcdonald gmail.com> writes:

Walter Bright wrote:
 It's becoming increasingly obvious that D needs iterators. While opApply 
  is far better for some kinds of iteration (such as recursively 
 traversing a directory), iterators are more efficient in some cases, and 
 allow for things that opApply makes difficult.
 
 So hear are the design goals:
 
 1) Works with dynamic and stat arrays
 2) Doesn't need to work with associative arrays
 3) Should be possible to achieve "pointer efficiency" with it
 4) Needs to be able to restrict lvalue access (what C++ does with const 
 iterators)
 5) Needs to work seamlessly with foreach
 6) Iterators need to be copyable
 
 So here's one design:
 
 .begin property returns an iterator that starts at the beginning
 
 .end returns an iterator that is at the end
 
 Overloading * doesn't work with D. But, instead:
 

What are the arguments against that, again? I do not recall, and it's 
worth at least reviewing why we don't have opDeref and opDerefAssign.

As you know, Walter, a major benefit of C++-style iterators is that they 
are semantically as close as possible to pointers. Would [] be changed 
to dereference pointers? I sure hope not. If we go this route, I would 
suggest adding these unary-* overloads.

 Overload opIndex for rvalue access
 Overload opIndexAssign for lvalue access
 

I think you mean opSlice and opSliceAssign.

Bar bar;
auto foo = bar.begin();
Bar baz = foo[];
foo[] = Bar.init;

 Overloading opIndex also will work for random access
 
 foreach loops will not be able to have a 'key' parameter.

So, I would assume these iterators would work something like this:

If a class provides .begin and .end, and these both return the same 
type, and that type provides the requisite iterator methods, it is 
foreach-able.

If a class provides .begin, .end, and opApply, one of the two iteration 
methods has to take precedence. I would hope it's opApply.

For a type T, its associated iterator type should be available via 
T.iterator. This has to be standard.

An iterable type T MUST provide these methods:
I begin()
I end()

An iterator type I MUST provide the following overloads:
E opSlice()
I opPostInc()

E is the type of an element in the collection.

These are enough to describe a read-only forward iterator, which is 
adequate for foreach. In other words, given a type T that meets the 
requirements outlined above, the following is adequate to iterate 
through it:

foreach(e; t) {
     // ...
}

'e' would be of type E.

We run into a curious problem involving pre-increment, which uses 
opAddAssign. Classes for which random-access is problematic should not 
be required to provide this, which is essentially a random-access 
method. There are two options here:

1) Only require that iterators support opPostInc. (This further removes 
them from actual pointers.)

2) Introduce a convention whereby a non-random-access iterator throws an 
exception if any value other than 1 is passed to its opAddAssign overload.

Extending these rules for bidirectional iterators and random-access 
iterators, as well as read-write iterators, is left as an exercise to 
the reader. (And not a very difficult one.)

All of these strange edge-cases and differences between pointers and 
possible iterator types convince me that C++-STYLE ITERATORS ARE NOT FOR 
D. D, unlike C++, does not supply enough operator overloads for their 
semantics to be identical to those of pointers, which was the whole 
point of C++-style iterators in the first place.

Instead, we should look to other languages for inspiration. I suggest 
Java and Python, whose iterator semantics are similar. Other posters 
have already made suggestions on this front. I would only suggest the 
"opIter" method for getting an iterator, and that the iterator class 
itself be available via T.iterator (whether it is a nested class or an 
alias shouldn't matter). Classes can provide methods for returning 
alternate iterators, as well. (A class providing opIter should be 
iterated over as easily as an iterator itself.)

-- 
Kirk McDonald
Pyd: Wrapping Python with D
http://pyd.dsource.org

Nov 06 2006

Walter Bright <newshound digitalmars.com> writes:

Kirk McDonald wrote:
 Walter Bright wrote:
 Overloading * doesn't work with D. But, instead:

 What are the arguments against that, again?

Because D has many cases of implicit dereferencing, I thought it would 
cause more trouble than it's worth.

 I do not recall, and it's 
 worth at least reviewing why we don't have opDeref and opDerefAssign.
 
 As you know, Walter, a major benefit of C++-style iterators is that they 
 are semantically as close as possible to pointers. Would [] be changed 
 to dereference pointers?

You've always been able to:
	int* p;
	p[1] = ...

 I sure hope not. If we go this route, I would 
 suggest adding these unary-* overloads.
 
 Overload opIndex for rvalue access
 Overload opIndexAssign for lvalue access

 
 I think you mean opSlice and opSliceAssign.

No, I meant Index.

 Overloading opIndex also will work for random access

 foreach loops will not be able to have a 'key' parameter.

 
 So, I would assume these iterators would work something like this:
 
 If a class provides .begin and .end, and these both return the same 
 type, and that type provides the requisite iterator methods, it is 
 foreach-able.

Yes, though they will return a value, not a type.

 If a class provides .begin, .end, and opApply, one of the two iteration 
 methods has to take precedence. I would hope it's opApply.

I was leaning towards the other way around.

 For a type T, its associated iterator type should be available via 
 T.iterator. This has to be standard.

It's not needed, as typeof(T.begin) will do it.

 An iterable type T MUST provide these methods:
 I begin()
 I end()

Yes.

 An iterator type I MUST provide the following overloads:
 E opSlice()

No. opIndex()

 I opPostInc()

Probably opAddAssign()

 E is the type of an element in the collection.


 These are enough to describe a read-only forward iterator, which is 
 adequate for foreach. In other words, given a type T that meets the 
 requirements outlined above, the following is adequate to iterate 
 through it:
 
 foreach(e; t) {
     // ...
 }
 
 'e' would be of type E.

Yes.

 We run into a curious problem involving pre-increment, which uses 
 opAddAssign. Classes for which random-access is problematic should not 
 be required to provide this, which is essentially a random-access 
 method. There are two options here:
 
 1) Only require that iterators support opPostInc. (This further removes 
 them from actual pointers.)
 
 2) Introduce a convention whereby a non-random-access iterator throws an 
 exception if any value other than 1 is passed to its opAddAssign overload.

I could go either way with that.

 Extending these rules for bidirectional iterators and random-access 
 iterators, as well as read-write iterators, is left as an exercise to 
 the reader. (And not a very difficult one.)
 
 All of these strange edge-cases and differences between pointers and 
 possible iterator types convince me that C++-STYLE ITERATORS ARE NOT FOR 
 D. D, unlike C++, does not supply enough operator overloads for their 
 semantics to be identical to those of pointers, which was the whole 
 point of C++-style iterators in the first place.

I think it does provide enough, in fact, it will be less wacky than in 
C++ (look at the wretched iterator/const_iterator dichotomy). It'll work 
with core arrays, and will not need those typedefs.


 Instead, we should look to other languages for inspiration. I suggest 
 Java and Python, whose iterator semantics are similar. Other posters 
 have already made suggestions on this front. I would only suggest the 
 "opIter" method for getting an iterator, and that the iterator class 
 itself be available via T.iterator (whether it is a nested class or an 
 alias shouldn't matter). Classes can provide methods for returning 
 alternate iterators, as well. (A class providing opIter should be 
 iterated over as easily as an iterator itself.)

Nov 06 2006

Kirk McDonald <kirklin.mcdonald gmail.com> writes:

Walter Bright wrote:
 Kirk McDonald wrote:
 Walter Bright wrote:
 Overloading * doesn't work with D. But, instead:

 What are the arguments against that, again?

 
 Because D has many cases of implicit dereferencing, I thought it would 
 cause more trouble than it's worth.
 
 I do not recall, and it's worth at least reviewing why we don't have 
 opDeref and opDerefAssign.

 As you know, Walter, a major benefit of C++-style iterators is that 
 they are semantically as close as possible to pointers. Would [] be 
 changed to dereference pointers?

 
 You've always been able to:
     int* p;
     p[1] = ...
 

Wouldn't that be p[0] = ...?

Is that how you intend to use opIndex to "dereference" the iterator? 
That's /terrifying/. Although it is consistent with pointers, using [0] 
for everything is just asking for trouble with regards to 
non-random-access iterators. And though it's probably not totally 
meaningless to a random reader of the code, it comes pretty darned close.

 I sure hope not. If we go this route, I would suggest adding these 
 unary-* overloads.

 Overload opIndex for rvalue access
 Overload opIndexAssign for lvalue access

 I think you mean opSlice and opSliceAssign.

 
 No, I meant Index.
 
 Overloading opIndex also will work for random access

 foreach loops will not be able to have a 'key' parameter.

 So, I would assume these iterators would work something like this:

 If a class provides .begin and .end, and these both return the same 
 type, and that type provides the requisite iterator methods, it is 
 foreach-able.

 
 Yes, though they will return a value, not a type.
 

s/these both return the same type/their return types are the same/

 If a class provides .begin, .end, and opApply, one of the two 
 iteration methods has to take precedence. I would hope it's opApply.

 
 I was leaning towards the other way around.
 

It is slightly easier and clearer to explicitly say:

foreach(e; t.begin) { }

than

foreach(e; &t.opApply) { }

opApply is an operator overload. .begin would just be a special method. 
I would expect the operator to take precedence.

 For a type T, its associated iterator type should be available via 
 T.iterator. This has to be standard.

 
 It's not needed, as typeof(T.begin) will do it.
 

Due to the way in which D's type inference from properties works, you'll 
need typeof(T.begin()). Otherwise, it will derive the type of the method 
itself. (Which isn't even a valid type, as such.)

 An iterable type T MUST provide these methods:
 I begin()
 I end()

 
 Yes.
 
 An iterator type I MUST provide the following overloads:
 E opSlice()

 
 No. opIndex()

I'm referring specifically to the no-index form of opSlice (meaning i[] 
would "dereference" the iterator). (A no-index opIndex doesn't work.) 
This of course is inconsistent with pointers, so saying i[0] is better 
on that count.

 
 I opPostInc()

 
 Probably opAddAssign()
 

Oh, there's another one:

int opEquals(I)

 E is the type of an element in the collection.

 
 
 These are enough to describe a read-only forward iterator, which is 
 adequate for foreach. In other words, given a type T that meets the 
 requirements outlined above, the following is adequate to iterate 
 through it:

 foreach(e; t) {
     // ...
 }

 'e' would be of type E.

 
 Yes.
 
 We run into a curious problem involving pre-increment, which uses 
 opAddAssign. Classes for which random-access is problematic should not 
 be required to provide this, which is essentially a random-access 
 method. There are two options here:

 1) Only require that iterators support opPostInc. (This further 
 removes them from actual pointers.)

 2) Introduce a convention whereby a non-random-access iterator throws 
 an exception if any value other than 1 is passed to its opAddAssign 
 overload.

 
 I could go either way with that.
 

Neither one is particularly elegant.

 Extending these rules for bidirectional iterators and random-access 
 iterators, as well as read-write iterators, is left as an exercise to 
 the reader. (And not a very difficult one.)

 All of these strange edge-cases and differences between pointers and 
 possible iterator types convince me that C++-STYLE ITERATORS ARE NOT 
 FOR D. D, unlike C++, does not supply enough operator overloads for 
 their semantics to be identical to those of pointers, which was the 
 whole point of C++-style iterators in the first place.

 
 I think it does provide enough, in fact, it will be less wacky than in 
 C++ (look at the wretched iterator/const_iterator dichotomy). It'll work 
 with core arrays, and will not need those typedefs.
 

Without the ability to overload the dereference operator, any attempt to 
write a class that behaves like a pointer in D will be unwieldy, or at 
least ugly. Admittedly, this isn't much of an issue when just using 
foreach, but might be an issue with any kind of STL-like algorithms 
library. (Which is what you're inviting when using C++'s iterator 
semantics.)

 
 Instead, we should look to other languages for inspiration. I suggest 
 Java and Python, whose iterator semantics are similar. Other posters 
 have already made suggestions on this front. I would only suggest the 
 "opIter" method for getting an iterator, and that the iterator class 
 itself be available via T.iterator (whether it is a nested class or an 
 alias shouldn't matter). Classes can provide methods for returning 
 alternate iterators, as well. (A class providing opIter should be 
 iterated over as easily as an iterator itself.)

 
 

-- 
Kirk McDonald
Pyd: Wrapping Python with D
http://pyd.dsource.org

Nov 06 2006

Walter Bright <newshound digitalmars.com> writes:

Kirk McDonald wrote:
 Walter Bright wrote:
 You've always been able to:
     int* p;
     p[1] = ...

 Wouldn't that be p[0] = ...?

I must have misunderstood you. I thought you were asking if [n] worked 
on pointers. Yes, it does.

 Is that how you intend to use opIndex to "dereference" the iterator? 
 That's /terrifying/. Although it is consistent with pointers, using [0] 
 for everything is just asking for trouble with regards to 
 non-random-access iterators. And though it's probably not totally 
 meaningless to a random reader of the code, it comes pretty darned close.

I agree it might be startling at first because it's unusual. Once one is 
past that, however, is it that bad?

 If a class provides .begin, .end, and opApply, one of the two 
 iteration methods has to take precedence. I would hope it's opApply.

 I was leaning towards the other way around.

 
 It is slightly easier and clearer to explicitly say:
 
 foreach(e; t.begin) { }
 
 than
 
 foreach(e; &t.opApply) { }
 
 opApply is an operator overload. .begin would just be a special method. 
 I would expect the operator to take precedence.

The:
	foreach(e; t.begin) { }
would not be supported. It would be:
	foreach (e; t) { }
If an aggregate had both, and iterators would be chosen by default, 
overriding with:
	foreach (e; &t.opApply) { }
would not require any new conventions or syntax. So I think this is the 
right way to go.


 For a type T, its associated iterator type should be available via 
 T.iterator. This has to be standard.

 It's not needed, as typeof(T.begin) will do it.

 
 Due to the way in which D's type inference from properties works, you'll 
 need typeof(T.begin()). Otherwise, it will derive the type of the method 
 itself. (Which isn't even a valid type, as such.)

Yes, you're right.

 No. opIndex()

 
 I'm referring specifically to the no-index form of opSlice (meaning i[] 
 would "dereference" the iterator). (A no-index opIndex doesn't work.) 
 This of course is inconsistent with pointers, so saying i[0] is better 
 on that count.

Ok, I understand now where you were coming from with the slice.


 I opPostInc()

 Probably opAddAssign()

 
 Oh, there's another one:
 
 int opEquals(I)

Yes, I'd overlooked that.

 I could go either way with that.

 Neither one is particularly elegant.

True, but that inelegance is hidden away in the library, but the 
flexibility is probably worth it.

 I think it does provide enough, in fact, it will be less wacky than in 
 C++ (look at the wretched iterator/const_iterator dichotomy). It'll 
 work with core arrays, and will not need those typedefs.

 Without the ability to overload the dereference operator, any attempt to 
 write a class that behaves like a pointer in D will be unwieldy, or at 
 least ugly. Admittedly, this isn't much of an issue when just using 
 foreach, but might be an issue with any kind of STL-like algorithms 
 library. (Which is what you're inviting when using C++'s iterator 
 semantics.)

I still think it isn't necessary, but you might be thinking of a use 
case I haven't. Can you provide an example?

Nov 06 2006

Bill Baxter <dnewsgroup billbaxter.com> writes:

Walter Bright wrote:
 Kirk McDonald wrote:

 
 I think it does provide enough, in fact, it will be less wacky than 
 in C++ (look at the wretched iterator/const_iterator dichotomy). 
 It'll work with core arrays, and will not need those typedefs.

 Without the ability to overload the dereference operator, any attempt 
 to write a class that behaves like a pointer in D will be unwieldy, or 
 at least ugly. Admittedly, this isn't much of an issue when just using 
 foreach, but might be an issue with any kind of STL-like algorithms 
 library. (Which is what you're inviting when using C++'s iterator 
 semantics.)

 
 I still think it isn't necessary, but you might be thinking of a use 
 case I haven't. Can you provide an example?

The only thing that comes to mind is if you get multiple indirections. 
For instance, in C++ I frequently end up creating things like vectors of 
lists or lists of vectors.  Or vectors of pointers.  Or sets of list 
iterators. Then to actually use the things I need two dereferences.

But with built-in GC, you don't really need smart pointers so often. And 
with "dot-is-all-you-need" that also eliminates many cases for dereferences.

But still I'm sure you could come up with situations where to get at the 
thing pointed to you'd need a string of [0]'s

      iteriteriter[0][0][0]

Actually, in some ways it's an improvement over the prefix deref 
operator because it reads consistently from left to right.  In c++ if 
you had iterator to vector of iterator and you wanted to deref it you'd 
need:

      *((*iterveciter)[i])

Or maybe the inner parens are unnecessary.  I can't recall, which is 
another reason why using postfix for everything is better -- no question 
about precedence rules.

So I'm sold on dereferencing being a postfix operation.  But I would 
still like it better if there were some way to get a compile-time error 
if I try to do
     iter[2]  // Error! This iterator isn't random access!

Would introducing a special symbol be of any use?  Like [*]?

     iter[*]

It would just compile to opIndex(0), but it declares the intent "this is 
dereferencing, not arbitrary indexing".  I guess it's kind of pointless 
if there's no compiler support for enforcing its use.  I guess it would 
only be useful if there were an opDeref.

Anyway, whatever you do with dereferencing -- I'm convinced you should 
absolutely not make it a unary prefix operator.  It should be postfix.

--bb

Nov 06 2006

"Jarrett Billingsley" <kb3ctd2 yahoo.com> writes:

"Walter Bright" <newshound digitalmars.com> wrote in message 
news:eio6u5$1j6o$1 digitaldaemon.com...

 Overload opIndex for rvalue access
 Overload opIndexAssign for lvalue access

Somehow..

foreach(i; foo)
{
    auto x = i[0];
    i[0] = x + 1;
}

You often say that operator overloading should be used only what it should 
be used for, and taking the index of an iterator means.. what exactly?

Would it be nicer to just use a .value property instead?

foreach(i; foo)
{
    auto x = i.value;
    i.value = x + 1;
}

Nov 06 2006

Walter Bright <newshound digitalmars.com> writes:

Jarrett Billingsley wrote:
 "Walter Bright" <newshound digitalmars.com> wrote in message 
 news:eio6u5$1j6o$1 digitaldaemon.com...
 
 Overload opIndex for rvalue access
 Overload opIndexAssign for lvalue access

 
 Somehow..
 
 foreach(i; foo)
 {
     auto x = i[0];
     i[0] = x + 1;
 }
 
 You often say that operator overloading should be used only what it should 
 be used for, and taking the index of an iterator means.. what exactly?
 
 Would it be nicer to just use a .value property instead?
 
 foreach(i; foo)
 {
     auto x = i.value;
     i.value = x + 1;
 } 

The problem is if the iterator is an actual pointer, there is no .value 
property for a pointer.

Another possibility is to have:
	*x
rewritten to:
	x[0]
if x is not a pointer.

Nov 06 2006

"Jarrett Billingsley" <kb3ctd2 yahoo.com> writes:

"Walter Bright" <newshound digitalmars.com> wrote in message 
news:eiosro$28g6$1 digitaldaemon.com...

 The problem is if the iterator is an actual pointer, there is no .value 
 property for a pointer.

 Another possibility is to have:
 *x
 rewritten to:
 x[0]
 if x is not a pointer.

And yet another is to allow overriding of opDeref and opDerefAssign ;) 
Which would, of course, be the _optimal_ solution, but..

Nov 06 2006

Sean Kelly <sean f4.ca> writes:

Walter Bright wrote:
 Jarrett Billingsley wrote:
 "Walter Bright" <newshound digitalmars.com> wrote in message 
 news:eio6u5$1j6o$1 digitaldaemon.com...

 Overload opIndex for rvalue access
 Overload opIndexAssign for lvalue access

 Somehow..

 foreach(i; foo)
 {
     auto x = i[0];
     i[0] = x + 1;
 }

 You often say that operator overloading should be used only what it 
 should be used for, and taking the index of an iterator means.. what 
 exactly?

 Would it be nicer to just use a .value property instead?

 foreach(i; foo)
 {
     auto x = i.value;
     i.value = x + 1;
 } 

 
 The problem is if the iterator is an actual pointer, there is no .value 
 property for a pointer.

Is there really any reason to support pointers as iterators though?  C++ 
libraries even seem to be moving away from that towards more robust and 
less error-prone iterator objects.


Sean

Nov 06 2006

Bill Baxter <dnewsgroup billbaxter.com> writes:

Sean Kelly wrote:
 Would it be nicer to just use a .value property instead?

 foreach(i; foo)
 {
     auto x = i.value;
     i.value = x + 1;
 } 

 The problem is if the iterator is an actual pointer, there is no 
 .value property for a pointer.


Well, they could.  It's up to Mr. Compiler Writer if a pointer has a 
value property or not.

 Is there really any reason to support pointers as iterators though?  C++ 
 libraries even seem to be moving away from that towards more robust and 
 less error-prone iterator objects.

Agreed.  I don't think I've ever actually used a raw pointer with a 
fancy STL algorithm.  I was actually really surprised the first time I 
saw it in the examples on the SGI STL page.  Other than those examples, 
I've never seen it in real code.

And that should go double for D, where you rarely even see pointers to 
begin with.

--bb

Nov 06 2006

Walter Bright <newshound digitalmars.com> writes:

Bill Baxter wrote:
 Sean Kelly wrote:
 The problem is if the iterator is an actual pointer, there is no 
 .value property for a pointer.


 
 Well, they could.  It's up to Mr. Compiler Writer if a pointer has a 
 value property or not.

Consider the following struct:

	struct Foo
	{
		int value;
	}

Foo* p;

p.value;

Is p.value the entire contents of Foo (as it would be for a proposed 
.value property) or just Foo.value? "value" is a pretty common field name.

Nov 06 2006

Bill Baxter <dnewsgroup billbaxter.com> writes:

Walter Bright wrote:
 Bill Baxter wrote:
 Sean Kelly wrote:
 The problem is if the iterator is an actual pointer, there is no 
 .value property for a pointer.


 Well, they could.  It's up to Mr. Compiler Writer if a pointer has a 
 value property or not.

 
 Consider the following struct:
 
     struct Foo
     {
         int value;
     }
 
 Foo* p;
 
 p.value;
 
 Is p.value the entire contents of Foo (as it would be for a proposed 
 .value property) or just Foo.value? "value" is a pretty common field name.

It's a name conflict.  They happen.  I'd have trouble if I wanted to 
have a sizeof member in my struct too.  If you want to be sure to get 
the struct's member then use p[0].value.

'Value' is too common, though.  If you were going to go that route it 
should definitely be something like .iterVal or some other word less 
likely to conflict.

I'm not saying it's the right way to go, just that it could be made to 
work if needed.

--bb

Nov 06 2006

Charles D Hixson <charleshixsn earthlink.net> writes:

Walter Bright wrote:
 Bill Baxter wrote:
 Sean Kelly wrote:
 The problem is if the iterator is an actual pointer, there is no 
 .value property for a pointer.


 Well, they could.  It's up to Mr. Compiler Writer if a pointer has a 
 value property or not.

 
 Consider the following struct:
 
     struct Foo
     {
         int value;
     }
 
 Foo* p;
 
 p.value;
 
 Is p.value the entire contents of Foo (as it would be for a proposed 
 .value property) or just Foo.value? "value" is a pretty common field name.

Maybe that's why Ada used "`" as a property marker rather than 
"."?  Of course there's the Python approach...naming it __value__.

A suitable solution might be to name it "valueof".  That's a 
much less common field name.  The problem is that this is 
something used quite frequently, so one would want it to be 
short.  Perhaps "V"?  Then one would write
struct Foo
{  int value;
}

Foo* p;

p.value;
p.V;
and have two distinct forms.  (Yes, V is used occasionally as 
a variable name...but pretty rarely.)

Nov 07 2006

Sean Kelly <sean f4.ca> writes:

Bill Baxter wrote:
 Sean Kelly wrote:
 Would it be nicer to just use a .value property instead?

 foreach(i; foo)
 {
     auto x = i.value;
     i.value = x + 1;
 } 

 The problem is if the iterator is an actual pointer, there is no 
 .value property for a pointer.


 
 Well, they could.  It's up to Mr. Compiler Writer if a pointer has a 
 value property or not.
 
 Is there really any reason to support pointers as iterators though?  
 C++ libraries even seem to be moving away from that towards more 
 robust and less error-prone iterator objects.

 
 Agreed.  I don't think I've ever actually used a raw pointer with a 
 fancy STL algorithm.  I was actually really surprised the first time I 
 saw it in the examples on the SGI STL page.  Other than those examples, 
 I've never seen it in real code.

I've used it a few times, but generally more when I'm throwing together 
a quick test than writing actual production code.  But since D arrays 
aren't the same as C++ arrays and we have the option of getting a robust 
object back (via a.begin() or whatever), why not take it?


Sean

Nov 07 2006

Walter Bright <newshound digitalmars.com> writes:

Sean Kelly wrote:
 Is there really any reason to support pointers as iterators though?  C++ 
 libraries even seem to be moving away from that towards more robust and 
 less error-prone iterator objects.

Can you be more specific about what the problems and solutions are?

Nov 06 2006

Georg Wrede <georg.wrede nospam.org> writes:

Walter Bright wrote:
 Sean Kelly wrote:
 
 Is there really any reason to support pointers as iterators though?  
 C++ libraries even seem to be moving away from that towards more 
 robust and less error-prone iterator objects.

 
 
 Can you be more specific about what the problems and solutions are?

For one thing, if pointers still are supported, then all algorithms have 
to be written so they can use them.

If we skip "real pointer compatibility", then we can decide on smarter 
iterators.

Example:

With pointer-compatible iterators, it is impossible to traverse the tree.

With smarter iterators one could.

Nov 07 2006

Sean Kelly <sean f4.ca> writes:

Walter Bright wrote:
 Sean Kelly wrote:
 Is there really any reason to support pointers as iterators though?  
 C++ libraries even seem to be moving away from that towards more 
 robust and less error-prone iterator objects.

 
 Can you be more specific about what the problems and solutions are?

A direct application of pointers as iterators couldn't be done exactly 
the C++ way because C++ uses traits templates to describe the operations 
an iterator supports.  And D will not overload templates from different 
modules, so the user could not provide overloads for his own iterators. 
  This isn't a huge problem however, as we could probably write a set of 
generic traits templates that apply to classes/structs implemented the 
'right' way rather than requiring the user to provide his own.  This 
would be roughly similar to how a user can derive from std::iterator<> 
to add the required typedefs to his object in C++.

More generally though, I find the need to use two iterators to identify 
a range to be somewhat cumbersome and annoying.  And there have been 
enough others who feel this way that there has been at least an informal 
proposal for "all in one" iterators for C++.  I haven't bothered to 
check whether it was ever formalized though.

Basically, I'm wondering whether it might be easier and more D-like to 
use iterators for stepping and opIndex for random access, instead of 
iterators for everything.  Since D has slicing, the argument for using 
iterators to define the boundaries of a range of randomly accessible 
elements seems kind of small to me.  ie.

     sort( a.begin, a.begin + 5 ); // sort the first 5 elements
or
     sort( a[0 .. 5] );

I find the second to be cleaner to look at.  But I'm undecided whether 
we'd lose any power or flexibility by not bothering with random access 
iterators.


Sean

Nov 07 2006

Walter Bright <newshound digitalmars.com> writes:

Sean Kelly wrote:
 Since D has slicing, the argument for using 
 iterators to define the boundaries of a range of randomly accessible 
 elements seems kind of small to me.  ie.
 
     sort( a.begin, a.begin + 5 ); // sort the first 5 elements
 or
     sort( a[0 .. 5] );
 
 I find the second to be cleaner to look at.  But I'm undecided whether 
 we'd lose any power or flexibility by not bothering with random access 
 iterators.

Hmm, instead of 'pointers as iterators', have 'arrays as iterators'. 
That definitely sounds like a good area to explore.

Nov 07 2006

Sean Kelly <sean f4.ca> writes:

Walter Bright wrote:
 Sean Kelly wrote:
 Since D has slicing, the argument for using iterators to define the 
 boundaries of a range of randomly accessible elements seems kind of 
 small to me.  ie.

     sort( a.begin, a.begin + 5 ); // sort the first 5 elements
 or
     sort( a[0 .. 5] );

 I find the second to be cleaner to look at.  But I'm undecided whether 
 we'd lose any power or flexibility by not bothering with random access 
 iterators.

 
 Hmm, instead of 'pointers as iterators', have 'arrays as iterators'. 
 That definitely sounds like a good area to explore.

I went ahead and wrote everything up during my commute this morning. 
It's posted separately as "Iterator straw-man."  And I apologize for it 
being a bit longer than expected :-)

Sean

Nov 07 2006

Benji Smith <dlanguage benjismith.net> writes:

Sean Kelly wrote:
 Since D has slicing, the argument for using iterators to define the 
 boundaries of a range of randomly accessible elements seems kind of 
 small to me.  ie.

     sort( a.begin, a.begin + 5 ); // sort the first 5 elements
 or
     sort( a[0 .. 5] );

 I find the second to be cleaner to look at.  But I'm undecided whether 
 we'd lose any power or flexibility by not bothering with random access 
 iterators.


(among others), but without significant C++ experience, my concept of an 
iterator is: "an object with methods for getting the current element, 
determining whether a 'next' element exists, and for advancing the 
cursor to the next element".

The C++ iterators look very weird (and very very wrong) to me.

Search and sort routines should use opIndex, opIndexAssign, and opSlice. 
If an object doesn't implement opIndex or opIndexAssign, it's not 
sortable using generic sorting algorithms. Tough luck.

Iteration, though, is completely distinct from sorting and searching, 
and should use a different mechanism.

Walter Bright wrote:
 Hmm, instead of 'pointers as iterators', have 'arrays as iterators'.
 That definitely sounds like a good area to explore.

Perhaps an iterator always steps through an array (or an object 
implementing a List or Iterable interface), and in order for an object 
to be foreach-able, you'll need to first transform an object into an 
array (or construct a List or an Iterable wrapping the object).

Arrays could still get their own special-case implementation of foreach, 
avoiding interfaces for maximum speed. And, if a class supports opIndex, 
then you could avoid making a virtual method call through the Iterable 
interface.

But I think it's also desirable, in many many cases, to implement an 
Iterator class, without supporting opIndex. The only methods in the 
iterator would be hasNext() and next().

Foreach should support real arrays, classes with opIndex, and Java-style 
iterators. Searching and sorting should only use opIndex, opIndex, 
opCmp, and opEquals. Iterators should only be used for iteration.

Anyhoo, those are my two cents.

--Benji Smith

Nov 07 2006

Bill Baxter <dnewsgroup billbaxter.com> writes:

Walter Bright wrote:
 Sean Kelly wrote:
 Since D has slicing, the argument for using iterators to define the 
 boundaries of a range of randomly accessible elements seems kind of 
 small to me.  ie.

     sort( a.begin, a.begin + 5 ); // sort the first 5 elements
 or
     sort( a[0 .. 5] );

 I find the second to be cleaner to look at.  But I'm undecided whether 
 we'd lose any power or flexibility by not bothering with random access 
 iterators.

 
 Hmm, instead of 'pointers as iterators', have 'arrays as iterators'. 
 That definitely sounds like a good area to explore.

OOh, I like the direction this is heading.

Sean I don't have time to read the straw-man right now but that 
definitely solves my recursive mergesort question.

And Benji's comment about simple iterators and ranges being distinct 
things that shouldn't be lumped together also seems on target.  If 
iterators don't have to be ranges then it's simple to see how to support 
iterators over infinite or circular containers.  They just never stop 
spitting out 'next()'s

But how do you handle generically pointing to an element then?  Like you 
have with an iterator to a node in linked list.  I guess you can have 
iterators with hasPrev() getPrev() (or --) type methods too.

--bb

Nov 07 2006

Ary Manzana <ary esperanto.org.ar> writes:

Bill Baxter escribi�:
 But how do you handle generically pointing to an element then?  Like you 
 have with an iterator to a node in linked list.  I guess you can have 
 iterators with hasPrev() getPrev() (or --) type methods too.
 
 --bb

In fact, Java has them:

http://java.sun.com/j2se/1.5.0/docs/api/java/util/ListIterator.html

Nov 07 2006

Daniel Keep <daniel.keep.lists gmail.com> writes:

Walter Bright wrote:
 Sean Kelly wrote:
 Since D has slicing, the argument for using iterators to define the
 boundaries of a range of randomly accessible elements seems kind of
 small to me.  ie.

     sort( a.begin, a.begin + 5 ); // sort the first 5 elements
 or
     sort( a[0 .. 5] );

 I find the second to be cleaner to look at.  But I'm undecided whether
 we'd lose any power or flexibility by not bothering with random access
 iterators.

 
 Hmm, instead of 'pointers as iterators', have 'arrays as iterators'.
 That definitely sounds like a good area to explore.

Hang on; aren't we back to where we are *right now*?

Things that need random access override opIndex and opIndexAssign.

Things which you can take a range of values from override opSlice and
opSliceAssign.

Things that require lock-step iteration override opApply, or supply a
function that returns a delegate that "runs" a foreach loop.

About the only thing this *doesn't* cover are bi-directional iterators
(say, iterating back and forth over an infinite series).

I know some people want to be able to stop iteration and resume it, but
couldn't that be solved by making opApply return a generator like what
Python does?  Those things are written the same way as a current opApply
(sans all the return value from the delegate stuff), and can be advanced
manually.

I know this can be done.  I wrote the coroutine module for StackThreads
a while back, and one of the *very first* things I did was implement
generic generators.  The base generator class essentially just
implemented opApply switching over to the coroutine every time it needed
the next value.  Here's a port of Python's range() function (well, the
one and zero argument versions :P):

 class
 range : Generator!(uint)
 {
     // These just let you use range() or range(n)
     static
     range
     opCall()
     {
         return range(uint.max);
     }

     static
     range
     opCall(uint limit)
     {
         return new range(limit);
     }

 protected:
     uint limit;

     // Or you can use new range(n)
     this(uint limit)
     {
         this.limit = limit;
         super();
     }

     // Where the magic happens.
     override
     uint
     run()
     {
         uint i = 0;
         while( i < limit )
             yield(i++);    // yield in D: *very* sexy

         StopGenerator();   // throws an exception, imitating what
                            // Python does to stop an iterator.
     }
 }

Now, from that, we can implement opApply AND next/current in the base
class.  Going with coroutines kills both the opApply and forward
iterator birds with one stone.

As for bi-directional iterators... I actually can't think of anywhere
I'd use them.  I just keep thinking "I could just use random access to
the same end".  Python's never suffered from not having them...

Anyway, just some random thoughts.  Haven't had coffee yet :3

	-- Daniel

-- 
Unlike Knuth, I have neither proven or tried the above; it may not even
make sense.

v2sw5+8Yhw5ln4+5pr6OFPma8u6+7Lw4Tm6+7l6+7D
i28a2Xs3MSr2e4/6+7t4TNSMb6HTOp5en5g6RAHCP  http://hackerkey.com/

Nov 07 2006

Bill Baxter <dnewsgroup billbaxter.com> writes:

Daniel Keep wrote:
 
 Walter Bright wrote:
 Sean Kelly wrote:
 Since D has slicing, the argument for using iterators to define the
 boundaries of a range of randomly accessible elements seems kind of
 small to me.  ie.

     sort( a.begin, a.begin + 5 ); // sort the first 5 elements
 or
     sort( a[0 .. 5] );

 I find the second to be cleaner to look at.  But I'm undecided whether
 we'd lose any power or flexibility by not bothering with random access
 iterators.

 Hmm, instead of 'pointers as iterators', have 'arrays as iterators'.
 That definitely sounds like a good area to explore.

 
 Hang on; aren't we back to where we are *right now*?

Close, but right now we don't have a good way to iterate over multiple 
things at once.
   for(i,j; i.hasNext()&&j.hasNext(); i++,j++)
   {
      // do something with current values of i and j
   }

Or as you say to stop iteration or a generic way to return a pointer to 
a particular spot in a container.

 About the only thing this *doesn't* cover are bi-directional iterators
 (say, iterating back and forth over an infinite series).

 [intersting stuff about generators and D]

If generators can handle the above cases *AND* do it with code that 
simple to create and use *AND* do all that as efficiently as for loops, 
then it sounds great to me.  My impression was that any sort of 
opApply/coroutine thing in D was not going to be so efficient.

--bb

Nov 07 2006

Daniel Keep <daniel.keep.lists gmail.com> writes:

Bill Baxter wrote:
 Daniel Keep wrote:
 Walter Bright wrote:
 Sean Kelly wrote:
 Since D has slicing, the argument for using iterators to define the
 boundaries of a range of randomly accessible elements seems kind of
 small to me.  ie.

     sort( a.begin, a.begin + 5 ); // sort the first 5 elements
 or
     sort( a[0 .. 5] );

 I find the second to be cleaner to look at.  But I'm undecided whether
 we'd lose any power or flexibility by not bothering with random access
 iterators.

 Hmm, instead of 'pointers as iterators', have 'arrays as iterators'.
 That definitely sounds like a good area to explore.

 Hang on; aren't we back to where we are *right now*?

 
 Close, but right now we don't have a good way to iterate over multiple
 things at once.
   for(i,j; i.hasNext()&&j.hasNext(); i++,j++)
   {
      // do something with current values of i and j
   }
 
 Or as you say to stop iteration or a generic way to return a pointer to
 a particular spot in a container.
 
 About the only thing this *doesn't* cover are bi-directional iterators
 (say, iterating back and forth over an infinite series).

 
 [intersting stuff about generators and D]

 
 If generators can handle the above cases *AND* do it with code that
 simple to create and use *AND* do all that as efficiently as for loops,
 then it sounds great to me.  My impression was that any sort of
 opApply/coroutine thing in D was not going to be so efficient.
 
 --bb

Well, as I said, once you have your sequence expressed as a coroutine,
writing hasNext() and getCurrent() is trivial.  Once you have that,
iterating multiple things in lockstep is also trivial:

 class zip : Generator!(Tuple!(T1,T2))
 {
     // ...
     Generator!(T1) seq1;
     Generator!(T2) seq2;
     // ...
     void run()
     {
         while( seq1.hasNext && seq2.hasNext )
             yield(new Tuple!(T1,T2)(seq1.next,seq2.next));
     }
     // ...
 }

Or something to that effect :P

As for efficiency, I'm not sure how to go about testing that so that the
results mean anything.  I'm sure my current implementation could be made
faster, but I don't know how.

Just for loose comparison, the game EVE Online has the record for the
most number of people simultaneously online on a single server at once.
 And it's servers are built using coroutines running in *Python*.

I suppose the bottleneck (if it is one) is how fast you can do
user-space context switches.  Which is basically a function call, moving
the base of the stack, and switching over stuff like the exception
handling registers.  Or somesuch... Mikola Lysenko wrote StackThreads;
my attempt only mostly worked :P

	-- Daniel

-- 
Unlike Knuth, I have neither proven or tried the above; it may not even
make sense.

v2sw5+8Yhw5ln4+5pr6OFPma8u6+7Lw4Tm6+7l6+7D
i28a2Xs3MSr2e4/6+7t4TNSMb6HTOp5en5g6RAHCP  http://hackerkey.com/

Nov 07 2006

Sean Kelly <sean f4.ca> writes:

Daniel Keep wrote:
 
 I suppose the bottleneck (if it is one) is how fast you can do
 user-space context switches.  Which is basically a function call, moving
 the base of the stack, and switching over stuff like the exception
 handling registers.  Or somesuch... Mikola Lysenko wrote StackThreads;
 my attempt only mostly worked :P

That's pretty much it.  Saving the current register data, loading the 
other register data, and jumping to a new instruction address.  There's 
no way this can be as efficient as an integer increment instruction (for 
loop iteration) but it's orders of magnitude faster than a kernel thread 
context switch.


Sean

Nov 08 2006

Benji Smith <dlanguage benjismith.net> writes:

Daniel Keep wrote:
 Things that need random access override opIndex and opIndexAssign.
 
 Things which you can take a range of values from override opSlice and
 opSliceAssign.
 
 Things that require lock-step iteration override opApply, or supply a
 function that returns a delegate that "runs" a foreach loop.
 
 About the only thing this *doesn't* cover are bi-directional iterators
 (say, iterating back and forth over an infinite series).

What about containers where there are multiple valid orders of 
iteration? For a binary tree class, I might want to perform iteration 
using depth-first, breadth-first, in-order, or reverse-order traversals. 
How does the opApply/opApplyReverse solution address these needs?

And isn't a delegate-call about as computationally expensive as a 
virtual method call (in the case of an Iterable iterface)?

If the delegate solution is really better than an Iterable interface, 
why create special cases for opApply and opApplyReverse? Why not just 
say that foreach always requires an iteration delegate?

--benji

Nov 08 2006

Chris Nicholson-Sauls <ibisbasenji gmail.com> writes:

Benji Smith wrote:
 Daniel Keep wrote:
 
 Things that need random access override opIndex and opIndexAssign.

 Things which you can take a range of values from override opSlice and
 opSliceAssign.

 Things that require lock-step iteration override opApply, or supply a
 function that returns a delegate that "runs" a foreach loop.

 About the only thing this *doesn't* cover are bi-directional iterators
 (say, iterating back and forth over an infinite series).

 
 
 What about containers where there are multiple valid orders of 
 iteration? For a binary tree class, I might want to perform iteration 
 using depth-first, breadth-first, in-order, or reverse-order traversals. 
 How does the opApply/opApplyReverse solution address these needs?
 
 And isn't a delegate-call about as computationally expensive as a 
 virtual method call (in the case of an Iterable iterface)?

It should be roughly equivelant in cost to explicitly calling the method,
meaning if the 
method is virtual then you will get that cost, and if it is final then it is
like any 
function call.  Since Iterable methods will very likely almost always be
virtual, there is 
at least a chance of delegates being a little cheaper /some of the time/ --
when they can 
be pointers to final methods.

 If the delegate solution is really better than an Iterable interface, 
 why create special cases for opApply and opApplyReverse? Why not just 
 say that foreach always requires an iteration delegate?
 
 --benji

Well, when foreach(;) was first added to D, I don't think anyone had thought
about 
delegates-as-aggregates yet.  In fact, it was just added a couple of versions
back.  So I 
suppose in a sense, opApply*Reverse is just legacy -- though it makes it
straight-forward 
to iterate simple, linear-only collections.

I suppose it would be safe enough to do away with opApply*Reverse now, but I
don't think 
it hurts anything to leave them be, either.

-- Chris Nicholson-Sauls

Nov 08 2006

Sean Kelly <sean f4.ca> writes:

Daniel Keep wrote:
 
 Walter Bright wrote:
 Sean Kelly wrote:
 Since D has slicing, the argument for using iterators to define the
 boundaries of a range of randomly accessible elements seems kind of
 small to me.  ie.

     sort( a.begin, a.begin + 5 ); // sort the first 5 elements
 or
     sort( a[0 .. 5] );

 I find the second to be cleaner to look at.  But I'm undecided whether
 we'd lose any power or flexibility by not bothering with random access
 iterators.

 Hmm, instead of 'pointers as iterators', have 'arrays as iterators'.
 That definitely sounds like a good area to explore.

 
 Hang on; aren't we back to where we are *right now*?

Essentially.  But D has no formal description of an iterator type.  I 
think it would be a useful convention to establish, and doing so 
requires determining what types of iterators should be supported, and how.

 Things that need random access override opIndex and opIndexAssign.
 
 Things which you can take a range of values from override opSlice and
 opSliceAssign.
 
 Things that require lock-step iteration override opApply, or supply a
 function that returns a delegate that "runs" a foreach loop.
 
 About the only thing this *doesn't* cover are bi-directional iterators
 (say, iterating back and forth over an infinite series).

This wouldn't be too difficult to do by adding a prev() method and 
atBegin() or some such.  Or maybe the Java style of hasNext(), next(), 
hasPrev(), prev().

 I know some people want to be able to stop iteration and resume it, but
 couldn't that be solved by making opApply return a generator like what
 Python does?  Those things are written the same way as a current opApply
 (sans all the return value from the delegate stuff), and can be advanced
 manually.
 
 I know this can be done.  I wrote the coroutine module for StackThreads
 a while back, and one of the *very first* things I did was implement
 generic generators.  The base generator class essentially just
 implemented opApply switching over to the coroutine every time it needed
 the next value.  Here's a port of Python's range() function (well, the
 one and zero argument versions :P):
 
 class
 range : Generator!(uint)
 {
     // These just let you use range() or range(n)
     static
     range
     opCall()
     {
         return range(uint.max);
     }

     static
     range
     opCall(uint limit)
     {
         return new range(limit);
     }

 protected:
     uint limit;

     // Or you can use new range(n)
     this(uint limit)
     {
         this.limit = limit;
         super();
     }

     // Where the magic happens.
     override
     uint
     run()
     {
         uint i = 0;
         while( i < limit )
             yield(i++);    // yield in D: *very* sexy

         StopGenerator();   // throws an exception, imitating what
                            // Python does to stop an iterator.
     }
 }

 
 Now, from that, we can implement opApply AND next/current in the base
 class.  Going with coroutines kills both the opApply and forward
 iterator birds with one stone.

Interesting idea--I'll admit I don't have much experience with Python. 
But I think both approaches have value.  Using coroutines just to step 
across two sequences simultaneously seems unnecessarily complex, and not 
terribly efficient compared to a simple pointer increment.  But it does 
offer a tremendous amount of power for more complex cases.

 As for bi-directional iterators... I actually can't think of anywhere
 I'd use them.  I just keep thinking "I could just use random access to
 the same end".  Python's never suffered from not having them...

I don't think I've ever needed them either, but they can come in handy 
when implementing one container in terms of another.  Say an iterable 
stack and queue (unlikely, I know) that are implemented via a doubly 
linked-list.


Sean

Nov 08 2006

Bill Baxter <wbaxter gmail.com> writes:

Sean Kelly wrote:
 Daniel Keep wrote:
 
 Walter Bright wrote:

 Sean Kelly wrote:

 Since D has slicing, the argument for using iterators to define the
 boundaries of a range of randomly accessible elements seems kind of
 small to me.  ie.

     sort( a.begin, a.begin + 5 ); // sort the first 5 elements
 or
     sort( a[0 .. 5] );

 I find the second to be cleaner to look at.  But I'm undecided whether
 we'd lose any power or flexibility by not bothering with random access
 iterators.

 Hmm, instead of 'pointers as iterators', have 'arrays as iterators'.
 That definitely sounds like a good area to explore.


 Hang on; aren't we back to where we are *right now*?

 
 
 Essentially.  But D has no formal description of an iterator type.  I 
 think it would be a useful convention to establish, and doing so 
 requires determining what types of iterators should be supported, and how.
 
 Things that need random access override opIndex and opIndexAssign.

 Things which you can take a range of values from override opSlice and
 opSliceAssign.

 Things that require lock-step iteration override opApply, or supply a
 function that returns a delegate that "runs" a foreach loop.

 About the only thing this *doesn't* cover are bi-directional iterators
 (say, iterating back and forth over an infinite series).

 
 
 This wouldn't be too difficult to do by adding a prev() method and 
 atBegin() or some such.  Or maybe the Java style of hasNext(), next(), 
 hasPrev(), prev().
 
 I know some people want to be able to stop iteration and resume it, but
 couldn't that be solved by making opApply return a generator like what
 Python does?  Those things are written the same way as a current opApply
 (sans all the return value from the delegate stuff), and can be advanced
 manually.

 I know this can be done.  I wrote the coroutine module for StackThreads
 a while back, and one of the *very first* things I did was implement
 generic generators.  The base generator class essentially just
 implemented opApply switching over to the coroutine every time it needed
 the next value.  Here's a port of Python's range() function (well, the
 one and zero argument versions :P):

 class
 range : Generator!(uint)
 {
     // These just let you use range() or range(n)
     static
     range
     opCall()
     {
         return range(uint.max);
     }

     static
     range
     opCall(uint limit)
     {
         return new range(limit);
     }

 protected:
     uint limit;

     // Or you can use new range(n)
     this(uint limit)
     {
         this.limit = limit;
         super();
     }

     // Where the magic happens.
     override
     uint
     run()
     {
         uint i = 0;
         while( i < limit )
             yield(i++);    // yield in D: *very* sexy

         StopGenerator();   // throws an exception, imitating what
                            // Python does to stop an iterator.
     }
 }


 Now, from that, we can implement opApply AND next/current in the base
 class.  Going with coroutines kills both the opApply and forward
 iterator birds with one stone.

 
 
 Interesting idea--I'll admit I don't have much experience with Python. 
 But I think both approaches have value.  Using coroutines just to step 
 across two sequences simultaneously seems unnecessarily complex, and not 
 terribly efficient compared to a simple pointer increment.  But it does 
 offer a tremendous amount of power for more complex cases.
 
 As for bi-directional iterators... I actually can't think of anywhere
 I'd use them.  I just keep thinking "I could just use random access to
 the same end".  Python's never suffered from not having them...

 
 
 I don't think I've ever needed them either, but they can come in handy 
 when implementing one container in terms of another.  Say an iterable 
 stack and queue (unlikely, I know) that are implemented via a doubly 
 linked-list.

I wrote some mesh (graph) manipulation algorithms in Python.  I really 
needed a linked list for that and a way to point to e.g. an edge in a 
list of edges coming out of a vertex that wouldn't be invalidated by 
inserting edges here and there before or after it, and I needed both to 
find the previous edge and the following edge.  In C++ std::list would 
have been the clear choice.

Doing that with indexes into to a standard python list just isn't 
practical (I started out trying it that way because Python FAQ's seem to 
say "you don't need a linked list, just use a python list".  But with a 
python list, every time you insert one new item you have to go find 
everyone out there that's holding onto an index into this list and get 
them to update their values.  Just not practical.

So sometimes bidirectional iterators are needed.  Even in Python.

--bb

Nov 08 2006

=?iso-8859-1?q?Knud_S=F8rensen?= <12tkvvb02 sneakemail.com> writes:

On Mon, 06 Nov 2006 12:46:01 -0800, Walter Bright wrote:

 It's becoming increasingly obvious that D needs iterators. While opApply 
   is far better for some kinds of iteration (such as recursively 
 traversing a directory), iterators are more efficient in some cases, and 
 allow for things that opApply makes difficult.
 

What are those cases ? Maybe we can find a way to fix the problem with
opApply.

Nov 06 2006

Walter Bright <newshound digitalmars.com> writes:

Knud S�rensen wrote:
 On Mon, 06 Nov 2006 12:46:01 -0800, Walter Bright wrote:
 
 It's becoming increasingly obvious that D needs iterators. While opApply 
   is far better for some kinds of iteration (such as recursively 
 traversing a directory), iterators are more efficient in some cases, and 
 allow for things that opApply makes difficult.

 
 What are those cases ? Maybe we can find a way to fix the problem with
 opApply.

One such case is the usefulness of being able to provide an input 
iterator to a parsing function, which may itself pass the iterator off 
to other parsing functions.

Nov 06 2006

Bill Baxter <dnewsgroup billbaxter.com> writes:

Walter Bright wrote:
 Knud S�rensen wrote:
 On Mon, 06 Nov 2006 12:46:01 -0800, Walter Bright wrote:

 It's becoming increasingly obvious that D needs iterators. While 
 opApply   is far better for some kinds of iteration (such as 
 recursively traversing a directory), iterators are more efficient in 
 some cases, and allow for things that opApply makes difficult.

 What are those cases ? Maybe we can find a way to fix the problem with
 opApply.

 
 One such case is the usefulness of being able to provide an input 
 iterator to a parsing function, which may itself pass the iterator off 
 to other parsing functions.

Another case is the desire to iterate over multiple containers at the 
same time, say as you would do in a sorted list merge operation. 
opApply could handle this one if D had real coroutines (i.e. multiple 
opApply's could be running simultaneously)

Another case is the basic desire to hold onto a pointer to a container 
for later use.  For instance if you have an STL style linked list 
container where the implementation (the actual nodes) are hidden from 
the user.  You still want the user to be able to keep something that 
functions like a pointer to a particular insertion point in the list 
(and be able to use that in generic sequence manipulation algorithms).

--bb

Nov 06 2006

KlausO <oberhofer users.sf.net> writes:

Walter Bright wrote:
 
 One such case is the usefulness of being able to provide an input 
 iterator to a parsing function, which may itself pass the iterator off 
 to other parsing functions.

Some questions:

Is your data structure a graph or a tree ?

What does your iterator iterating ?

I ask because I currently investigate the visitor pattern to traverse 
graph like data structures (see attached code for an example).

Nov 07 2006

Chris Nicholson-Sauls <ibisbasenji gmail.com> writes:

Knud S�rensen wrote:
 On Mon, 06 Nov 2006 12:46:01 -0800, Walter Bright wrote:
 
 
It's becoming increasingly obvious that D needs iterators. While opApply 
  is far better for some kinds of iteration (such as recursively 
traversing a directory), iterators are more efficient in some cases, and 
allow for things that opApply makes difficult.

 
 
 What are those cases ? Maybe we can find a way to fix the problem with
 opApply.
 

Mostly agreed.  For example, one case I seem to find mentioned a few times is
providing 
depth-first versus breadth-first iteration on a tree -- supposedly this is hard
to do with 
foreach/opApply.  While I do agree that opApply is not optimal (because you
can't tell 
opApply which way you want to go) I thought this was part of the utility of the
recent 
delegates-as-aggregates feature?  The following is just off the very top of my
head, so it 
might not be ideal/perfect, but I do believe it provides this ability, using
current D:

<code>
struct Tree(int B = 2, T) {
   version (Tree_NoWarnings) {}
   else {
     static if (B == 0) {
       static assert(false, "A tree with zero branches is just useless.");
     }
     else static if (B == 1) {
       static assert(false, "A tree with only one branch may as well be an
array.");
     }
   }

   Tree*[B] children ;
   T        value    ;

   int walkDepth (int delegate(inout Tree!(T)) dlg) {
     int result ;

     foreach (x; children) {
       if (x && result = x.walkDepth(dlg)) {
         goto end;
       }
     }
     if (result = dlg(this)) {
       goto end;
     }

     end:
       return result;
   }

   int walkBreadth (int delegate(inout Tree!(T)) dlg) {
     int         result          ;
     Tree!(T)*[] gen    = [this] ,
                 next            ;

     while (gen.length) {
       foreach (x; gen) {
         if (result = dlg(x.value)) {
           goto end;
         }
         foreach (y; x.children) {
           if (y) {
             next ~= y;
           }
         }
       }
       gen = next.dup;
       next.length = 0;
     }

     end:
       return result;
   }
}

Tree!(2, int) binary_int_tree ;
// ...
foreach (node; &binary_int_tree.walkDepth) {
   // ...
}
foreach (node; &binary_int_tree.walkBreadth) {
   // ...
}
</code>

Unless I'm missing something entirely, I don't see any major utility in this
case for 
having a seperate Iterator entity...

-- Chris Nicholson-Sauls

Nov 07 2006

Bill Baxter <dnewsgroup billbaxter.com> writes:

Chris Nicholson-Sauls wrote:
 Knud S�rensen wrote:
 On Mon, 06 Nov 2006 12:46:01 -0800, Walter Bright wrote:


 It's becoming increasingly obvious that D needs iterators. While 
 opApply  is far better for some kinds of iteration (such as 
 recursively traversing a directory), iterators are more efficient in 
 some cases, and allow for things that opApply makes difficult.


 What are those cases ? Maybe we can find a way to fix the problem with
 opApply.

 
 Mostly agreed.  For example, one case I seem to find mentioned a few 
 times is providing depth-first versus breadth-first iteration on a tree 
 -- supposedly this is hard to do with foreach/opApply.  

Other way around.  Tree traversal is harder to do with C++ style 
iterators, easy to do with opApply.  I assume your code is correct but 
didn't look at it too closely.

--bb

Nov 07 2006

"Craig Black" <cblack ara.com> writes:

"Walter Bright" <newshound digitalmars.com> wrote in message 
news:eio6u5$1j6o$1 digitaldaemon.com...
 It's becoming increasingly obvious that D needs iterators. While opApply 
 is far better for some kinds of iteration (such as recursively traversing 
 a directory), iterators are more efficient in some cases, and allow for 
 things that opApply makes difficult.

 So hear are the design goals:

 1) Works with dynamic and stat arrays
 2) Doesn't need to work with associative arrays
 3) Should be possible to achieve "pointer efficiency" with it
 4) Needs to be able to restrict lvalue access (what C++ does with const 
 iterators)
 5) Needs to work seamlessly with foreach
 6) Iterators need to be copyable

 So here's one design:

 .being property returns an iterator that starts at the beginning

 .end returns an iterator that is at the end

 Overloading * doesn't work with D. But, instead:

 Overload opIndex for rvalue access
 Overload opIndexAssign for lvalue access

 Overloading opIndex also will work for random access

 foreach loops will not be able to have a 'key' parameter.

It's not a huge problem, but passing an argument needlessly is both ugly and 
slightly inefficient.  Perhaps opIndex and opIndexAssign could be overloaded 
without any parameters.

-Craig

Nov 07 2006

Oskar Linde <oskar.lindeREM OVEgmail.com> writes:

Walter Bright wrote:
 It's becoming increasingly obvious that D needs iterators. While opApply 
  is far better for some kinds of iteration (such as recursively 
 traversing a directory), iterators are more efficient in some cases, and 
 allow for things that opApply makes difficult.
 
 So hear are the design goals:
 
 1) Works with dynamic and stat arrays
 2) Doesn't need to work with associative arrays
 3) Should be possible to achieve "pointer efficiency" with it
 4) Needs to be able to restrict lvalue access (what C++ does with const 
 iterators)
 5) Needs to work seamlessly with foreach
 6) Iterators need to be copyable

I apologize for this long post. Writing too much has become a bad habit 
of mine lately it seems. This took a while writing, and in the meantime 
new posts appeared that say much of the same. If you don't bother 
reading this, read Sean's posts instead, as I agree 100 % with him on 
this. :)

What I suggest is that we should not try to copy the C++ "iterator == 
pointer" idiom, but instead aim for an iterator design that is 
self-contained and simple.

For me, the iterator concept is simple. An iterator is an entity that 
allows an iterative way of accessing data. An iterator needs:
1. a way to reference the data
2. a way to traverse the data
3. a way to signal the end of the data


second iterator that just signals the end of the iterative range.)

An iterator could also be a generator.
Here are some examples (translated to D) of how iterators are 
implemented in some languages other than C++:

Java
----

class Iterator(T) {
	bool hasNext(); // self explanatory
	T* next();      // steps and returns reference to next element
}

while(i.hasNext())
	writefln(*i.next());



-----------------------

class Iterator(T) {
	bool MoveNext();  // false if no more elements
	T Current();      // property getter
}

while(i.MoveNext)
	writefln(i.Current);


Python
------

class Iterator(T) {
	T next();
}

while(true)
	try
		writefln(i.next);
	catch (StopIteration)
		break;


Taking inspiration from the above and following Walter's design goals 
above, here is my suggestion:

interface Iterator(T) {

	T value;         // getter (rvalue access)
	T value(T);      // setter (lvalue access)
	int opApply(...) // support foreach style iteration
}

An alternative is supplying a value property that returns a pointer to 
the element, but that would not allow limiting lvalue access for read 
only iterators and introduces a pointer, which might be bad by itself.

Usage:

// explicit iteration
while(i.step) {
	writefln(i.value);
	i.value = 5;
}

// implicit iteration
foreach(o; i)
	writefln(o);


I will briefly explain my reasoning:

As I said, I believe the iterator should be clean and simple. Saying 
that it is OK for the iterator implementations to be dirty and messy, 
but that the implementations will be hidden away in libraries and only 
be the headache of library writers will lead to just that - only library 
writers will write iterators.

As Sean hints at, "random access iterators" are not pure iterators. The 
random access part is orthogonal to the iterative part. In D, a perfect 
way to implement a random access view is defining a .length property and 
an opIndex method. A random access view doesn't need to be iterable and 
an iterator doesn't need to provide random access. A D slice is the 
perfect example of a random access view.

I see no reason to use operator overloading when the result isn't 
replaceable with a pointer anyway.

Bill Baxter's points regarding iterator as a concept vs an 
interface/abstract class are very valid. And as Bill says, maybe both 
should be supported. One of them could always be converted to the other.

Regarding efficiency, here is one sample implementation that a compiler 
should(*) be able to make just as efficient as for/foreach loops:

struct Iterator(T) {
     T* ptr;
     T* end;

     bool step() { return ++ptr != end; }
     T value() { return *ptr; }
     T value(T x) { return *ptr = x; }
     // mixin OpApplyMixin;
}

Iterator!(T) iter(T)(T[] x) {
     Iterator!(T) ret;// = {x.ptr-1, x.ptr+x.length}; // ICE
     ret.ptr = x.ptr-1;
     ret.end = x.ptr+x.length;
     return ret;
}

void main() {
     int[] arr = [1,2,3,4,5];

     auto i = arr.iter();

     while(i.step) {
         writefln("%s",i.value);
         i.value = 5;
     }
}

/Oskar

Nov 07 2006

nazo <lovesyao gmail.com> writes:

interface Iterator(T) {

    T value;         // getter (rvalue access)
    T value(T);      // setter (lvalue access)
    int opApply(...) // support foreach style iteration
}

I think that this is more better.

interface Iterator(T) {

                        // also support in pointer with opBack().
     bool opNext(int);  // speed hack
     T opValue;         // getter (rvalue access). also support in pointer
     T opValue(T);      // setter (lvalue access). also support in pointer
     int opApply(...) // support foreach style iteration
}

while(iter.next)//call opNext()
   writefln(*iter);//call opValue()

Nov 07 2006

Walter Bright <newshound digitalmars.com> writes:

Oskar Linde wrote:
 As Sean hints at, "random access iterators" are not pure iterators. The 
 random access part is orthogonal to the iterative part. In D, a perfect 
 way to implement a random access view is defining a .length property and 
 an opIndex method. A random access view doesn't need to be iterable and 
 an iterator doesn't need to provide random access. A D slice is the 
 perfect example of a random access view.

You two might be right.

 I see no reason to use operator overloading when the result isn't 
 replaceable with a pointer anyway.

Right.

 Bill Baxter's points regarding iterator as a concept vs an 
 interface/abstract class are very valid. And as Bill says, maybe both 
 should be supported. One of them could always be converted to the other.

Yes.

 Regarding efficiency, here is one sample implementation that a compiler 
 should(*) be able to make just as efficient as for/foreach loops:
 
 struct Iterator(T) {
     T* ptr;
     T* end;
 
     bool step() { return ++ptr != end; }
     T value() { return *ptr; }
     T value(T x) { return *ptr = x; }
     // mixin OpApplyMixin;
 }
 
 Iterator!(T) iter(T)(T[] x) {
     Iterator!(T) ret;// = {x.ptr-1, x.ptr+x.length}; // ICE
     ret.ptr = x.ptr-1;
     ret.end = x.ptr+x.length;
     return ret;
 }
 
 void main() {
     int[] arr = [1,2,3,4,5];
 
     auto i = arr.iter();
 
     while(i.step) {
         writefln("%s",i.value);
         i.value = 5;
     }
 }

It's good, but the pointer-before-the-start is problematic. This can 
cause problems on some architectures. Pointing past the end is 
explicitly supported, but before the beginning is not. I'd also prefer 
the step() get 'stuck' at the end, instead of going past it if it gets 
called again (this makes sense for input iterators). So I suggest instead:

	for (auto i = arr.iter(); !i.atEnd(); i.step())
		...

in other words, separate out the 'end' test from the 'step' operation.

Nov 07 2006

Dave <Dave_member pathlink.com> writes:

Walter Bright wrote:
 struct Iterator(T) {
     T* ptr;
     T* end;

     bool step() { return ++ptr != end; }
     T value() { return *ptr; }
     T value(T x) { return *ptr = x; }
     // mixin OpApplyMixin;
 }

 Iterator!(T) iter(T)(T[] x) {
     Iterator!(T) ret;// = {x.ptr-1, x.ptr+x.length}; // ICE
     ret.ptr = x.ptr-1;
     ret.end = x.ptr+x.length;
     return ret;
 }

 void main() {
     int[] arr = [1,2,3,4,5];

     auto i = arr.iter();

     while(i.step) {
         writefln("%s",i.value);
         i.value = 5;
     }
 }

 
 It's good, but the pointer-before-the-start is problematic. This can 
 cause problems on some architectures. Pointing past the end is 
 explicitly supported, but before the beginning is not. I'd also prefer 
 the step() get 'stuck' at the end, instead of going past it if it gets 
 called again (this makes sense for input iterators). So I suggest instead:
 
     for (auto i = arr.iter(); !i.atEnd(); i.step())
         ...
 
 in other words, separate out the 'end' test from the 'step' operation.

For arrays, we already have arr.ptr, what if array.end was simply added?

auto slice = array[50..100];
for(auto i = slice.ptr; i != slice.end; i++) { ... }

Then for UDT custom iterators, ptr, end and op[Pre|Post][Inc|Dec] could be
overloaded?

Just a thought..

Nov 12 2006

Burton Radons <burton-radons smocky.com> writes:

I don't like C++ iterators; they're a very old design made to fit a 
language that was unmodifiable. They can't work with destructive or 
unbound iterators (well, they can, but you just gotta hope to hell the 
user doesn't use the interface in an unusual way), they can't work with 
associative arrays or tree structures without allocation (or wasting 
space in the structure for parent node pointers), and they're VERY 
verbose and distributed. Plus they allow iterating to an illegal index 
(the end index), which requires an additional state value for some 
iterables.

Python's generators are a far better solution, like:

    void i_range (T) (T from, T to, T step = cast (T) 1, yield T index)
    {
        for (index = from; index < to; index += step)
            yield;
    }

    foreach (i; i_range (0, 100)) ...

Or:

    yield T i_range (T) (T from, T to, T step = cast (T) 1)
    {
        T index;

        for (index = from; index < to; index += step)
            yield index;
    }

You've been meaning to allow frame pointers for proper closures anyway, 
right? :P

The implementation in iterators using the simplest interface I can think 
of would be something like:

    struct IRange (T, T from, T to, T step = cast (T) 1)
    {
        T index = from;

        static IRange opCall (T value)
        {
            IRange result;

            result.index = value;
            return result;
        }

        T opThis ()
        {
            return index;
        }

        IRange opNext ()
        {
            assert (index <= end);
            return opCall (index + 1);
        }

        IRange opEnd ()
        {
            return opCall (to);
        }
    }

    foreach (i; IRange! (int, 0, 100)) ...

Notice how the SINGLE line indicating how the arguments are interpreted 
are hidden within a whole bunch of boilerplate.

Nov 10 2006

Walter Bright <newshound digitalmars.com> writes:

As nice as yield is for some types of problems, they aren't going to 
happen in D anytime soon, because they're a lot of tricky work to implement.

Nov 10 2006

D Programming

C/C++ Programming

Other

digitalmars.D - Iterators for D