www.digitalmars.com         C & C++   DMDScript  

D - Ruby's iterators

reply "Robert" <no spam.ne.jp> writes:
How about Ruby's iterators, Walter?

The iterators are very funny and useful features.
http://www.rubycentral.com/book/tut_containers.html#S2

By using the iterators, one can declare "functions" like "foreach".
e.g.


class Foo


    end
    def my_each

        for i in 0... array.length

            yield( array[i])
        end
    end
end

def main
    foo = Foo.new([1, 2, 3])
    foo.my_each {

        print elem
    }
end; main


"my_each" is similar to D's "opApply".

//////////// D's code ////////////
class Foo {
    uint[] m_array;
    this(uint[] array) { m_array = array; }
    int opApply(int delegate(inout uint) yield) {
        for(int i = 0; i < m_array.length; ++i) {
            int res = yield(m_array[i]);
            if(res != 0) return res;
        }
        return 0;
    }
}

int main() {
    static const uint[] array = [1, 2, 3];
    Foo foo = new Foo(array);
    foreach(uint i; foo) { printf("%d", i); }
    return 0;
}
//////////// end ////////////

But, the iterators are more general than "opApply".
e.g.


def my_times(n)
    for i in 0...n
        yield(i)
    end
end
def main
    my_times(5) { |i|

    }
end; main


This code can be translated to D by using function pointer:

//////////// D's code ////////////
void times(int n, void function(int) yield) {
    for(int i = 0; i < n; ++i) {
        yield(i);
    }
}
int main() {
    times(5, function void(int i) {
        printf("%d", i);
    });

    void print(int i) {
        printf("%d", i);
    }
    times(5, print);
    return 0;
}
//////////// end ////////////

But, Ruby's code is smarter IMO.

If one could also use iterators in D,
it would become:

//////////// D's? code ////////////
int times(int n; int) {
    for(int i = 0; i < n; ++i) {
        int res = yield(i);
        if(res != 0) return res;
    }
    return 0;
}
int main() {
    times(5; int i) {
        printf("%d", i);
    }
    return 0;
}
//////////// end ////////////

If there are no parameters,

//////////// D's? code ////////////
int times(void; int) {
    for(int i = 0; i < n; ++i) {
        int res = yield(i);
        if(res != 0) return res;
    }
    return 0;
}
int main() {
    times(void; int i) {
        printf("%d", i);
    }
    times(int i) {    // if possible...
        printf("%d", i);
    }
    return 0;
}
//////////// end ////////////
Jan 04 2004
parent reply "Walter" <walter digitalmars.com> writes:
I looked into using a 'yield' setup in D, which needs coroutines to
implement. Unfortunately, there doesn't seem to be a clean way of doing
coroutines under win32, and I abandoned it in favor of a technique I know
will work and is portable.
Jan 05 2004
next sibling parent reply "Robert" <no spam.ne.jp> writes:
To my amateur eyes, "yield" setup is almost the same
as "foreach" and "opApply" setup,
or simple callback setup with function literal
except for behaviors of break, continue, etc.
Where does the difference, whether coroutines are needed or not,
come from?

"Walter" <walter digitalmars.com> wrote in message
news:btbcgq$1r5t$2 digitaldaemon.com...
 I looked into using a 'yield' setup in D, which needs coroutines to
 implement. Unfortunately, there doesn't seem to be a clean way of doing
 coroutines under win32, and I abandoned it in favor of a technique I know
 will work and is portable.
Jan 05 2004
parent reply Robert <Robert_member pathlink.com> writes:
1. Normal iterators

Definition:
iterator times(int n) block(int) {
for(int i = 0; i < n; ++i) {
yield(i);
}
}

Call:
times(3) block(int i) {
printf("%d\n", i);
}

The block is passed through a hidden parameter
"int delegate(inout int) __d_yield" as foreach and opApply.
The stereotype:
int __d_yeild_result = __d_yield(i);
if(__d_yeild_result != 0) {
return __d_yeild_result;
}
and
return 0;
are abbreviated as above.
Therefore, if the loop is broken,
the iterator "times" ends at "yield(i);".

The syntax is different from foreach and opApply,
but the mechanism is same as them *completely*.


2. No parameters of the block

Definition:
iterator times(int n) block() {
for(int i = 0; n; ++i) {
yield();
}
}

or

iterator times(int n) {
for(int i = 0; n; ++i) {
yield();
}
}

if possible.

Call:
times(3) block() {
printf("%d\n", i);
}

or

times(3) {
printf("%d\n", i);
}

if possible.


3. Inheritance of iteration

Definition:
iterator threeTimes() block(int) {
yield times(3);
}

Call:
threeTimes() block(int i) {
printf("%d\n", i);
}

Block delegate __d_yield is passed to times.
So, continuation (or fiber) is *not* needed.


4. Finalization

Definition:
iterator times(int n) block(int) {
for(int i = 0; i < n; ++i) {
yield(i) {
puts("Break: times");
}
}
puts("End: times");
}

iterator threeTimes() block(int) {
yield times(3) {
puts("Break: threeTimes");
}
puts("End: threeTimes");
}

If "yield" has a block,
the block is performed only before return at "yield".

Robert (Japanese)
Jan 10 2004
parent reply "Sean L. Palmer" <palmer.sean verizon.net> writes:
The problem with iterators is that a for loop is the complete wrong way to
approach the problem from an architectural standpoint... you want the code
generation etc to be controlled by the client code;  the iterator is a tool.
But with for-loop semantics we have to pass in a function to be executed or
otherwise do it more-or-less inside out.  What if the whole problem was
turned around?  What kind of construct is the opposite of a for-loop?

for-loop:

for (variable/start; condition; increment)
    action;

what we want is more like:

do action
for (variable/start; condition; increment);

Then the for part could be some kind of function and get called (used) by
different actions.

Maybe make range types deal with iterator values, int being a "generic" kind
of iterator somewhat akin to void*.  Not tied to any specific container.
Loose, or unsafe, iterators, that to be used must be specified along with an
array.

Somehow this whole iterator mess needs to be integrated with array slicing.

For instance I want to be able to write:

int x[400];

x[] = x[] * 2;

I also want to be able to do this:

template (T)
class mycontainer
{
    mycontainer(int count) { guts.length = count; }
    private T[] guts;
    property int length  { get() { return guts.length; } set(int v) {
guts.length = v; } }
    T[] opIndex(range(int) rng) { return guts[rng]; }
}

mycontainer(int) x(400);

x[] = x[] * 2;

or

range(int) rng = 0..x.length;
x[rng] = x[rng] * 2;

What is a slice anyway but a sequence of values you may access by discrete
values along the controlled range.  The mechanism by which those values are
accessed doesn't have to be specified.  Put whatever iterator machinery you
want there.

That would make x[] mean not only an array of values of unspecified size,
but a way to produce x.length values by random access, in any sequence the
user cares to access them.  Works well with for-loops.  It would also mean
that x[] represents traversal of all of the values inside x, in an
unspecified order (the "default" order).  This should match the order of
indexing by integer, but the actual mechanism of making the expression map
to each element should be handled by cooperation between the compiler and
the container classes involved, and should be able to be done more
efficiently than a naive for-loop would be able to manage.

I need sleep!!

Sean


"Robert" <Robert_member pathlink.com> wrote in message
news:btpuis$9ub$1 digitaldaemon.com...
 1. Normal iterators

 Definition:
 iterator times(int n) block(int) {
 for(int i = 0; i < n; ++i) {
 yield(i);
 }
 }

 Call:
 times(3) block(int i) {
 printf("%d\n", i);
 }

 The block is passed through a hidden parameter
 "int delegate(inout int) __d_yield" as foreach and opApply.
 The stereotype:
 int __d_yeild_result = __d_yield(i);
 if(__d_yeild_result != 0) {
 return __d_yeild_result;
 }
 and
 return 0;
 are abbreviated as above.
 Therefore, if the loop is broken,
 the iterator "times" ends at "yield(i);".

 The syntax is different from foreach and opApply,
 but the mechanism is same as them *completely*.


 2. No parameters of the block

 Definition:
 iterator times(int n) block() {
 for(int i = 0; n; ++i) {
 yield();
 }
 }

 or

 iterator times(int n) {
 for(int i = 0; n; ++i) {
 yield();
 }
 }

 if possible.

 Call:
 times(3) block() {
 printf("%d\n", i);
 }

 or

 times(3) {
 printf("%d\n", i);
 }

 if possible.


 3. Inheritance of iteration

 Definition:
 iterator threeTimes() block(int) {
 yield times(3);
 }

 Call:
 threeTimes() block(int i) {
 printf("%d\n", i);
 }

 Block delegate __d_yield is passed to times.
 So, continuation (or fiber) is *not* needed.


 4. Finalization

 Definition:
 iterator times(int n) block(int) {
 for(int i = 0; i < n; ++i) {
 yield(i) {
 puts("Break: times");
 }
 }
 puts("End: times");
 }

 iterator threeTimes() block(int) {
 yield times(3) {
 puts("Break: threeTimes");
 }
 puts("End: threeTimes");
 }

 If "yield" has a block,
 the block is performed only before return at "yield".

 Robert (Japanese)
Jan 11 2004
parent "Robert" <no spam.ne.jp> writes:
"Sean L. Palmer" <palmer.sean verizon.net> wrote in message
news:btr677$29ri$1 digitaldaemon.com...
 The problem with iterators is that a for loop is the complete wrong way to
 approach the problem from an architectural standpoint... you want the code
 generation etc to be controlled by the client code;  the iterator is a
tool.
 But with for-loop semantics we have to pass in a function to be executed
or
 otherwise do it more-or-less inside out.  What if the whole problem was
 turned around?  What kind of construct is the opposite of a for-loop?

 for-loop:

 for (variable/start; condition; increment)
     action;

 what we want is more like:

 do action
 for (variable/start; condition; increment);

 Then the for part could be some kind of function and get called (used) by
 different actions.
The former describes processing flow. First, do "variable/start", next do "condition", next do "action", next go to "for(...)", next "increment", next "condition", ... In case of the latter, what is executed first comes last of the statement. I think it is practically not good. It's as if BibTeX's style file...
 Maybe make range types deal with iterator values, int being a "generic"
kind
 of iterator somewhat akin to void*.  Not tied to any specific container.
 Loose, or unsafe, iterators, that to be used must be specified along with
an
 array.

 Somehow this whole iterator mess needs to be integrated with array
slicing.
 For instance I want to be able to write:

 int x[400];

 x[] = x[] * 2;
<snip> It's similar to Fortran90: a(:,:,:) = b(:,:,:) + c(:,:,:) d(:) = e(3,:) It is powerful and easy to be parallelized, but it is too restrictive. IMHO, It may be needed almost only in numerical calculation programs...
Jan 11 2004
prev sibling next sibling parent reply Hauke Duden <H.NS.Duden gmx.net> writes:
Walter wrote:
 I looked into using a 'yield' setup in D, which needs coroutines to
 implement. Unfortunately, there doesn't seem to be a clean way of doing
 coroutines under win32, and I abandoned it in favor of a technique I know
 will work and is portable.
I think "fibers" in Win32 (see CreateFiber) are essentially coroutines (sort of manually-switched threads). But I agree that this is not a common enough feature in OSs to require it for D. For example, it is not available on Windows95. And I'm sure portable OSs like WinCE, PalmOS, etc. will also not provide such a rarely used feature. Hauke
Jan 05 2004
parent "Matthew" <matthew.hat stlsoft.dot.org> writes:
"Hauke Duden" <H.NS.Duden gmx.net> wrote in message
news:btbhsf$23lk$1 digitaldaemon.com...
 Walter wrote:
 I looked into using a 'yield' setup in D, which needs coroutines to
 implement. Unfortunately, there doesn't seem to be a clean way of doing
 coroutines under win32, and I abandoned it in favor of a technique I
know
 will work and is portable.
I think "fibers" in Win32 (see CreateFiber) are essentially coroutines (sort of manually-switched threads).
It is.
 But I agree that this is not a common enough feature in OSs to require
 it for D.
In a sense you're right, since it is flawed. But it is a fantastic feature. Alas, I can't give you a copy of next month's CUJ, but it'll be out in a few days. (Sorry to be a prima donna about it, but you sign away your rights to broadcast when you give it to the magazine.)
 For example, it is not available on Windows95. And I'm sure
 portable OSs like WinCE, PalmOS, etc. will also not provide such a
 rarely used feature.
The techniques described in the article would *absolutely* be useful in D, and provide some amazing capabilities for merging enumeration models. One of the things I intend to propose to Walter for 1.1 is that D provides co-routines that are like fibers, but without all the flaws of the Win32 version, and I'm hoping he'll have the patience to let me help him do all the weird assembler gunk, as I'm a real greenie in that area. (Apart from the new atomic functions coming in STLSoft 1.7.1 in Feb, I've not written any assembler since the mid 90s.)
Jan 05 2004
prev sibling parent "Matthew" <matthew.hat stlsoft.dot.org> writes:
 I looked into using a 'yield' setup in D, which needs coroutines to
 implement. Unfortunately, there doesn't seem to be a clean way of doing
 coroutines under win32, and I abandoned it in favor of a technique I know
 will work and is portable.
There sort of is. Check out next February's CUJ (which will be out in a few days). To have a truly workable solution, though, would require an assembler expert - where would we get one of them, I wonder? <G> - to implement a proper API, as the Win32 fiber api doesn't cut the mustard.
Jan 05 2004