digitalmars.D - A problem with generators

bearophile (43/43) Mar 17 2010 D currently has two ways to define a generator, the first is the older o...

dsimcha (9/19) Mar 17 2010 intrusive. And this syntax doesn't even gain performance, because dmd is...

bearophile (13/16) Mar 17 2010 That's a defect, not a quality ;-)

Adam D. Ruppe (7/9) Mar 17 2010 It says exactly what it does, and enables something you couldn't do befo...

bearophile (6/9) Mar 17 2010 It shows too much the machinery, this can lead to bugs, requires more co...

Andrei Alexandrescu (11/27) Mar 17 2010 I honestly don't believe that's the case. If you're looking for such

Walter Bright (3/7) Mar 17 2010 I don't agree that go shows that at all, we will only know what it shows...

bearophile (3/3) Mar 17 2010 And by the way, you seem to have missed the purpose of my post: it was a...

Adam D. Ruppe (8/13) Mar 17 2010 That's a *really* subjective statement. I understand pretty close to

bearophile (7/8) Mar 17 2010 Try to read after the line:

Lutger (13/48) Mar 17 2010 Hi, is this generator thing really the same as opApply? I thought this i...

bearophile (14/21) Mar 17 2010 Currently opApply is not efficient with dmd. But LDC is able to often in...

=?ISO-8859-1?Q?Pelle_M=E5nsson?= (3/46) Mar 17 2010 While I think it should be done properly with ranges and not opApply, I

bearophile (5/6) Mar 17 2010 What do you mean?
bearophile (4/5) Mar 17 2010 Note that syntax was invented on the spot, it's not perfect, as you have...

Norbert Nemec (72/145) Mar 18 2010 I really like that simplified, "yield"-based syntax. However, I believe

Regan Heath (3/31) Mar 18 2010 So, these generators are essentially inline functions?

Norbert Nemec (8/9) Mar 18 2010 No. They may be inlined, but this is just an optimization just like for

Walter Bright (3/14) Mar 18 2010 How is that different from a local delegate that refers to local variabl...

Andrei Alexandrescu (5/20) Mar 18 2010 I think the difference is that e.g. a coroutine can repeatedly return

Walter Bright (6/27) Mar 18 2010 Right, the delegate would have to keep its persistent state in outer loc...

Norbert Nemec (5/9) Mar 19 2010 Which concept of generators do you refer to? The stack-based generators

Walter Bright (2/12) Mar 19 2010 How can it do that, and yet allow the calling function also call other f...

Norbert Nemec (15/29) Mar 19 2010 When a generator is called for the first time, it shifts the ESP to make...

Norbert Nemec (9/25) Mar 19 2010 It is pretty much the same in the trivial case of one loop having one

bearophile <bearophileHUGS lycos.com> writes:

D currently has two ways to define a generator, the first is the older one with
opApply (And opApplyReverse):
int opApply(int delegate(ref Type [, ...]) dg);


The syntax of opApply is terrible: hard to remember, bug-prone, noisy,
intrusive. And this syntax doesn't even gain performance, because dmd is not
very good at inlining here.
Several times in the past I have said that a very good syntax for this is the
Python one:

def foo():
  yield 5

It's really easy to use and remember, not bug prone. This possible syntax is
about 75 times better than the current opApply:

yield(int) foo() {
  yield 5;
}


That's sugar for something like:

struct foo {
    int opApply(int delegate(ref int) dg) {
        int result;
        int r = 5;
        result = dg(r);
        if (result)
            goto END;
      END:
        return result;
    }
}





The other way to create a struct/class generator in D2 is with the Range
protocol, with the methods that ask for the next item, etc. I have seen that
such generators sometimes are harder to write, because you have to manage the
state yourself, but they can be more efficient and they are more powerful.

Both ways are useful, because they are useful for different situations. They
are one the inverted-out version of the other.



of a problem, that increases the computational complexity of the code when
there is a nested generator.


syntax and the performance (both languages have not yet implemented it).

Python version:
http://python.org/dev/peps/pep-0380/

With that this code:

def foo():
  for x in some_iterable:
    yield x


Becomes:

def foo():
  yield from some_iterable




http://kirillosenkov.blogspot.com/2007/10/yield-foreach.html

http://connect.microsoft.com/VisualStudio/feedback/details/256934/yield-return-to-also-yield-collections

http://blogs.msdn.com/wesdyer/archive/2007/03/23/all-about-iterators.aspx


As D2 will start coming out of its embryo state, it too will probably have to
fece this problem with iterators. In the meantime I hope the opApply syntax
will be replaced by something better (the yield I've shown).

Bye,
bearophile

Mar 17 2010

dsimcha <dsimcha yahoo.com> writes:

== Quote from bearophile (bearophileHUGS lycos.com)'s article
 D currently has two ways to define a generator, the first is the older one with

opApply (And opApplyReverse):
 int opApply(int delegate(ref Type [, ...]) dg);
 The syntax of opApply is terrible: hard to remember, bug-prone, noisy,

intrusive. And this syntax doesn't even gain performance, because dmd is not
very
good at inlining here.
 Several times in the past I have said that a very good syntax for this is the

Python one:
 def foo():
   yield 5
 It's really easy to use and remember, not bug prone. This possible syntax is

about 75 times better than the current opApply:
 yield(int) foo() {
   yield 5;
 }

The **BIG** advantage of the status quo, where the loop body is a delegate
instead
of being "magic", is that it enables lots of hackish things to be done, like
running different iterations of the loop in different threads.

Mar 17 2010

bearophile <bearophileHUGS lycos.com> writes:

dsimcha:
 The **BIG** advantage of the status quo, where the loop body is a delegate
instead
 of being "magic", is that it enables lots of hackish things to be done, like
 running different iterations of the loop in different threads.

That's a defect, not a quality ;-)

Something like this:
yield(int) foo() { yield 5; }

Has advantages:
- It's much simpler to remember, while I need to look in code snippets every
time I want to write an opApply
- It's shorter to write and to read, so you need less time to write code, you
put less bugs in the code, and your program gets smaller. It saves several
lines of code. It's easy to see if it's correct.

learn it, and other programmers surely will find it simpler than the opAssign.
And If you want to do hacks there's the other syntax, based on range iteration
methods.

As you say, the good thing of opApply is that it contains no magic, the syntax
can almost be mapped (if you know D very well) to the assembly code the
compiler will generate. But there's not enough sugar there.

Anyway, D2 is finalized, so I don't think yield will be added soon. It's OK.
But I think opApply is a hack that I don't like to see in D2. Go language shows
that it's better to not have a feature than having a bad one, even if it's a
very important feature like exceptions, classes, etc, because new features can
be added later, while it's harder to remove them.

Bye,
bearophile

Mar 17 2010

"Adam D. Ruppe" <destructionator gmail.com> writes:

On Wed, Mar 17, 2010 at 05:47:12PM -0400, bearophile wrote:
 But I think opApply is a hack that I don't like to see in D2.

It says exactly what it does, and enables something you couldn't do before.

How is that a hack?

 Go language shows

The Go language sucks.

-- 
Adam D. Ruppe
http://arsdnet.net

Mar 17 2010

bearophile <bearophileHUGS lycos.com> writes:

Adam D. Ruppe:
 It says exactly what it does, and enables something you couldn't do before.

Even assembly says what it does, but here I want more sugar :-) It helps.


 How is that a hack?

It shows too much the machinery, this can lead to bugs, requires more code,
requires more memory to remember it, you need more time to read it and see
if/where it's wrong.


 The Go language sucks.

In future it can suck less :-) And even a bad product or ignorant person can
give useful suggestions. The authors of Go are not ignorant people: Robert
Griesemer, Rob Pike and Ken Thompson. I am not even worth brushing their
sandals :-) They are giving a lesson in almost the limits of minimalism in a
modern language.

Bye,
bearophile

Mar 17 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 03/17/2010 05:03 PM, bearophile wrote:
 Adam D. Ruppe:
 It says exactly what it does, and enables something you couldn't do
 before.

 Even assembly says what it does, but here I want more sugar :-) It
 helps.


 How is that a hack?

 It shows too much the machinery, this can lead to bugs, requires more
 code, requires more memory to remember it, you need more time to read
 it and see if/where it's wrong.


 The Go language sucks.

 In future it can suck less :-) And even a bad product or ignorant
 person can give useful suggestions. The authors of Go are not
 ignorant people: Robert Griesemer, Rob Pike and Ken Thompson. I am
 not even worth brushing their sandals :-)

Me neither, but we should judge Go on its own merit.

 They are giving a lesson in
 almost the limits of minimalism in a modern language.

I honestly don't believe that's the case. If you're looking for such 
lessons, there are plenty of much better examples of minimalistic 
languages. Go is simply an unfinished language; it doesn't have the kind 
of features that make a minimalistic language be also comprehensive. 
There are problems for which Go does not have an answer yet and that's 
simply because an answer hasn't been thought of yet. From what I see so 
far, it's unlikely revolutionary solutions are right down the corner. Go 
is a rehash of old ideas - but fortunately fixes the typo in O_CREAT.


Andrei

Mar 17 2010

Walter Bright <newshound1 digitalmars.com> writes:

bearophile wrote:
 Go language
 shows that it's better to not have a feature than having a bad one, even if
 it's a very important feature like exceptions, classes, etc, because new
 features can be added later, while it's harder to remove them.


I don't agree that go shows that at all, we will only know what it shows after 
it's been in use for a few years.

Mar 17 2010

bearophile <bearophileHUGS lycos.com> writes:

And by the way, you seem to have missed the purpose of my post: it was about
the problem of computational complexity of nested generators.

Bye,
bearophile

Mar 17 2010

"Adam D. Ruppe" <destructionator gmail.com> writes:

On Wed, Mar 17, 2010 at 05:26:14PM -0400, bearophile wrote:
 It's really easy to use and remember, not bug prone. This possible syntax is
about 75 times better than the current opApply:
 
 yield(int) foo() {
   yield 5;
 }


That's a *really* subjective statement. I understand pretty close to
instantly what opApply does - sends your data to the function inside
the loop.

Even with your explanation, I still don't really get what yield does.

-- 
Adam D. Ruppe
http://arsdnet.net

Mar 17 2010

bearophile <bearophileHUGS lycos.com> writes:

Adam D. Ruppe:
 Even with your explanation, I still don't really get what yield does.

Try to read after the line:
"That's sugar for something like:"

You can see a generator that uses opApply that will be clear.

It's a way to spit out things out of a struct/class, it's a way to define a
generator, that is something that spits out a new item every time you ask it
to, with a foreach.

Bye,
bearophile

Mar 17 2010

Lutger <lutger.blijdestijn gmail.com> writes:

bearophile wrote:

 D currently has two ways to define a generator, the first is the older one
 with opApply (And opApplyReverse): int opApply(int delegate(ref Type [,
 ...]) dg);
 
 
 The syntax of opApply is terrible: hard to remember, bug-prone, noisy,
 intrusive. And this syntax doesn't even gain performance, because dmd is
 not very good at inlining here. Several times in the past I have said that
 a very good syntax for this is the Python one:
 
 def foo():
   yield 5
 
 It's really easy to use and remember, not bug prone. This possible syntax
 is about 75 times better than the current opApply:
 
 yield(int) foo() {
   yield 5;
 }
 
 
 That's sugar for something like:
 
 struct foo {
     int opApply(int delegate(ref int) dg) {
         int result;
         int r = 5;
         result = dg(r);
         if (result)
             goto END;
       END:
         return result;
     }
 }
 

Hi, is this generator thing really the same as opApply? I thought this is  
more like co-routines. Can yield be as efficient as current opApply? 
(assuming inlining will eventually happen where possible)

The tango docs have/had a good discussion about opApply vs Iterator, where 
the author called opApply a visitor iirc. Unfortunately the part about 
generators was never finished.

yield may be easy to understand once you understand it :) It's prettier for 
sure. But take a glimpse at this page on stackoverflow, the length of the 
answers should tell you that it's not as straightforward as you might think 
and that it's not only about the syntax:

http://stackoverflow.com/questions/231767/can-somebody-explain-me-this-
python-yield-statement

Mar 17 2010

bearophile <bearophileHUGS lycos.com> writes:

Lutger:
 Hi, is this generator thing really the same as opApply? I thought this is  
 more like co-routines.

In Python it's syntax for co-routines, but the syntax I have suggested for D is
just sugar for the opApply.


Can yield be as efficient as current opApply? (assuming inlining will
eventually happen where possible)<

Currently opApply is not efficient with dmd. But LDC is able to often inline
there, so I am happy. The yield as I mean it is as (in)efficient as the opApply.

Note that where you need more performance you can reverse inside-out the
iterator, and use the range iterator methods. If they are not virtual dmd has
no problem in inlining them.

And note that in 80-90% of your code you don't need 100% performance. There are
many cases in your code where you need something that's handy, short, easy to
remember and use, not error-prone, but you don't need to save few bits of time.


 yield may be easy to understand once you understand it :)

It contains more syntax sugar, so a system programmer can find it harder to
understand what actually it's doing. This is a disadvantage. Too much syntax
sugar gives syntax cancer :-)


It's prettier for 
 sure. But take a glimpse at this page on stackoverflow, the length of the 
 answers should tell you that it's not as straightforward as you might think 
 and that it's not only about the syntax:

I don't fully like the yield syntax of Python because they have first invented
it for a simple but very useful purpose, and then they have badly extended it
for coroutines too.
So now you have a statement that's used almost as a variable name. Ugh :-P

Regarding the semantics of python yield, it's not as simple as it was in the
beginning, because they have bolted on it the coroutine semantics. And it
contains a trap. That's probably why that thread on stack overflow is so long.
This problem is not shared by my simpler yield I have suggested probably more
than one year ago for D1.

Do you think future D newbies will will less hard to learn the opApply compared
to the yield?

Anyway, the purpose of this post was not to discuss about yield syntax. The
main problem with generators was not a problem about syntax, but to discuss the

show it very well.

Bye,
bearophile

Mar 17 2010

=?ISO-8859-1?Q?Pelle_M=E5nsson?= <pelle.mansson gmail.com> writes:

On 03/17/2010 10:26 PM, bearophile wrote:
 D currently has two ways to define a generator, the first is the older one
with opApply (And opApplyReverse):
 int opApply(int delegate(ref Type [, ...]) dg);


 The syntax of opApply is terrible: hard to remember, bug-prone, noisy,
intrusive. And this syntax doesn't even gain performance, because dmd is not
very good at inlining here.
 Several times in the past I have said that a very good syntax for this is the
Python one:

 def foo():
    yield 5

 It's really easy to use and remember, not bug prone. This possible syntax is
about 75 times better than the current opApply:

 yield(int) foo() {
    yield 5;
 }


 That's sugar for something like:

 struct foo {
      int opApply(int delegate(ref int) dg) {
          int result;
          int r = 5;
          result = dg(r);
          if (result)
              goto END;
        END:
          return result;
      }
 }





 The other way to create a struct/class generator in D2 is with the Range
protocol, with the methods that ask for the next item, etc. I have seen that
such generators sometimes are harder to write, because you have to manage the
state yourself, but they can be more efficient and they are more powerful.

 Both ways are useful, because they are useful for different situations. They
are one the inverted-out version of the other.



of a problem, that increases the computational complexity of the code when
there is a nested generator.


syntax and the performance (both languages have not yet implemented it).

 Python version:
 http://python.org/dev/peps/pep-0380/

 With that this code:

 def foo():
    for x in some_iterable:
      yield x


 Becomes:

 def foo():
    yield from some_iterable




 http://kirillosenkov.blogspot.com/2007/10/yield-foreach.html

 http://connect.microsoft.com/VisualStudio/feedback/details/256934/yield-return-to-also-yield-collections

 http://blogs.msdn.com/wesdyer/archive/2007/03/23/all-about-iterators.aspx


 As D2 will start coming out of its embryo state, it too will probably have to
fece this problem with iterators. In the meantime I hope the opApply syntax
will be replaced by something better (the yield I've shown).

 Bye,
 bearophile

While I think it should be done properly with ranges and not opApply, I 
strongly support this syntactic sugar.

Mar 17 2010

bearophile <bearophileHUGS lycos.com> writes:

Pelle M.: 
 While I think it should be done properly with ranges and not opApply,

What do you mean?
(There is already the syntax based on the range protocol, where you manage
manually the state of your iterator. I have not suggested to remove that.)

Bye,
bearophile

Mar 17 2010

bearophile <bearophileHUGS lycos.com> writes:

Pelle M�nsson:
I strongly support this syntactic sugar.<

Note that syntax was invented on the spot, it's not perfect, as you have seen
it's sugar for a struct, so what if you want a class? :-)

Bye,
bearophile

Mar 17 2010

Norbert Nemec <Norbert Nemec-online.de> writes:

I really like that simplified, "yield"-based syntax. However, I believe 
that it conflicts fundamentally with D's confusing concept of generators.

Most other languages that offer generators (e.g. Python) view the 
generator as an object that spits out values one-by-one. This is very 
similar to a co-routine.

Purely for performance reasons, D approaches the issue in a backward 
way, passing the loop body as a delegate. Indeed, this may be faster 
than full-fledged co-routines, but it is rather difficult to map a 
simplified coroutine-like syntax based on 'yield' statements to this 
very different concept by mere "syntactic sugar". Furthermore, the D 
concept scales awkwardly to combining multiple generators in one loop or 
using nested generators.

In fact, there is a simple but powerful way to design purely stack-based 
generators, sticking to the straightforward concept of co-routines 
without the overhead of full co-routines and without the need for 
delegates (inspired by the Sather language):

Make a generator a co-routine but restrict its lifetime to the inside of 
a loop. When the loop starts, it calls all the contained generators to 
initialize their local data on the regular stack. When the generators 
yield, the stack-pointer is *not* reset (as it would be at a return) but 
kept at its value. The generator routines can therefore be called again 
within the same loop and find their local data still on the stack. When 
the loop finishes, the stack pointer is reset to its initial value and 
all the local data of all the contained generators is immediately released.

This design does not interfere with the regular use of the stack for 
regular function calls within the loop. Nested loops are possible, as 
are nested generators or loops that combine values from multiple 
generators. All with a straightforward syntax and all based on pure 
stack operations that are no more costly than regular function calls and 
can be inlined just as easily.

This concept even allows to place calls to generators at arbitrary 
positions inside a loop, rather than at the beginning only. The regular 
D loop:
     foreach(auto x;mygenerator) {
         do_something(x);
     }
could be written as something like
     loop {
         do_something(#mygenerator);
     }

call that may either yield a value or break the loop. (I don't care 
about the exact syntax, but it should stick out somehow.

This can be generalised as
     loop {
         do_something();
         do_something_else(#mygenerator);
         yet_something_else(#anothergenerator);
     };

value or break the loop.

The clear restriction of this is that "native" generators of this design 
are not stand-alone objects, cannot survive outside a loop and work only 
within a single thread. More general generators, however, can easily be 
implemented as objects that offer a "native" generator as interface.

The only difficulty of this approach is that it has to be done at the 
assembler level because a unusual calling convention is needed. Typical 
high-level languages do not allow an emulation of this mechanism. I do 
not know about byte-code representations.

This whole mechanism is inspired by the Sather languages that first 
would have allowed it (even though that compiler never actually 
implemented it this way). To my knowledge, all other languages that 
offer generators did break this concept by allowing generators to exist 
beyond the lifetime of a loop and therefore depend on heap-allocated data.

I don't expect that D would ever adopt such a breaking change, even 
though it would be the most efficient implementation of generators while 
retaining the conceptually simple definition of co-routines. Still, I 
find the concept just so beautiful that I have to talk about it once in 
a while...

Greetings,
Norbert



bearophile wrote:
 D currently has two ways to define a generator, the first is the older one
with opApply (And opApplyReverse):
 int opApply(int delegate(ref Type [, ...]) dg);
 
 
 The syntax of opApply is terrible: hard to remember, bug-prone, noisy,
intrusive. And this syntax doesn't even gain performance, because dmd is not
very good at inlining here.
 Several times in the past I have said that a very good syntax for this is the
Python one:
 
 def foo():
   yield 5
 
 It's really easy to use and remember, not bug prone. This possible syntax is
about 75 times better than the current opApply:
 
 yield(int) foo() {
   yield 5;
 }
 
 
 That's sugar for something like:
 
 struct foo {
     int opApply(int delegate(ref int) dg) {
         int result;
         int r = 5;
         result = dg(r);
         if (result)
             goto END;
       END:
         return result;
     }
 }
 
 

 
 
 The other way to create a struct/class generator in D2 is with the Range
protocol, with the methods that ask for the next item, etc. I have seen that
such generators sometimes are harder to write, because you have to manage the
state yourself, but they can be more efficient and they are more powerful.
 
 Both ways are useful, because they are useful for different situations. They
are one the inverted-out version of the other.
 
 

of a problem, that increases the computational complexity of the code when
there is a nested generator.
 

syntax and the performance (both languages have not yet implemented it).
 
 Python version:
 http://python.org/dev/peps/pep-0380/
 
 With that this code:
 
 def foo():
   for x in some_iterable:
     yield x
 
 
 Becomes:
 
 def foo():
   yield from some_iterable
 
 

 
 http://kirillosenkov.blogspot.com/2007/10/yield-foreach.html
 
 http://connect.microsoft.com/VisualStudio/feedback/details/256934/yield-return-to-also-yield-collections
 
 http://blogs.msdn.com/wesdyer/archive/2007/03/23/all-about-iterators.aspx
 
 
 As D2 will start coming out of its embryo state, it too will probably have to
fece this problem with iterators. In the meantime I hope the opApply syntax
will be replaced by something better (the yield I've shown).
 
 Bye,
 bearophile

Mar 18 2010

Regan Heath <regan netmail.co.nz> writes:

Norbert Nemec wrote:
 I really like that simplified, "yield"-based syntax. However, I believe 
 that it conflicts fundamentally with D's confusing concept of generators.
 
 Most other languages that offer generators (e.g. Python) view the 
 generator as an object that spits out values one-by-one. This is very 
 similar to a co-routine.
 
 Purely for performance reasons, D approaches the issue in a backward 
 way, passing the loop body as a delegate. Indeed, this may be faster 
 than full-fledged co-routines, but it is rather difficult to map a 
 simplified coroutine-like syntax based on 'yield' statements to this 
 very different concept by mere "syntactic sugar". Furthermore, the D 
 concept scales awkwardly to combining multiple generators in one loop or 
 using nested generators.
 
 In fact, there is a simple but powerful way to design purely stack-based 
 generators, sticking to the straightforward concept of co-routines 
 without the overhead of full co-routines and without the need for 
 delegates (inspired by the Sather language):
 
 Make a generator a co-routine but restrict its lifetime to the inside of 
 a loop. When the loop starts, it calls all the contained generators to 
 initialize their local data on the regular stack. When the generators 
 yield, the stack-pointer is *not* reset (as it would be at a return) but 
 kept at its value. The generator routines can therefore be called again 
 within the same loop and find their local data still on the stack. When 
 the loop finishes, the stack pointer is reset to its initial value and 
 all the local data of all the contained generators is immediately released.

So, these generators are essentially inline functions?

R

Mar 18 2010

Norbert Nemec <Norbert Nemec-online.de> writes:

Regan Heath wrote:
 So, these generators are essentially inline functions?

No. They may be inlined, but this is just an optimization just like for 
  regular functions. If they are not inlined, these generators are 
simply functions with a special calling convention that allows them to 
retain their local data and execution point on the stack in between calls.

I did work out the assembly code for this some time ago and it worked 
nicely. Very similar to what is needed for co-routines, but it is 
possible to do everything on the stack.

Mar 18 2010

Walter Bright <newshound1 digitalmars.com> writes:

Norbert Nemec wrote:
 Regan Heath wrote:
 So, these generators are essentially inline functions?

 
 No. They may be inlined, but this is just an optimization just like for 
  regular functions. If they are not inlined, these generators are simply 
 functions with a special calling convention that allows them to retain 
 their local data and execution point on the stack in between calls.
 
 I did work out the assembly code for this some time ago and it worked 
 nicely. Very similar to what is needed for co-routines, but it is 
 possible to do everything on the stack.

How is that different from a local delegate that refers to local variables of 
the function it is lexically nested in?

Mar 18 2010

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 03/18/2010 01:49 PM, Walter Bright wrote:
 Norbert Nemec wrote:
 Regan Heath wrote:
 So, these generators are essentially inline functions?

 No. They may be inlined, but this is just an optimization just like
 for regular functions. If they are not inlined, these generators are
 simply functions with a special calling convention that allows them to
 retain their local data and execution point on the stack in between
 calls.

 I did work out the assembly code for this some time ago and it worked
 nicely. Very similar to what is needed for co-routines, but it is
 possible to do everything on the stack.

 How is that different from a local delegate that refers to local
 variables of the function it is lexically nested in?

I think the difference is that e.g. a coroutine can repeatedly return 
values from a loop. In contrast, the delegate must keep the state 
separate, which may be a tad more complicated.

Andrei

Mar 18 2010

Walter Bright <newshound1 digitalmars.com> writes:

Andrei Alexandrescu wrote:
 On 03/18/2010 01:49 PM, Walter Bright wrote:
 Norbert Nemec wrote:
 Regan Heath wrote:
 So, these generators are essentially inline functions?

 No. They may be inlined, but this is just an optimization just like
 for regular functions. If they are not inlined, these generators are
 simply functions with a special calling convention that allows them to
 retain their local data and execution point on the stack in between
 calls.

 I did work out the assembly code for this some time ago and it worked
 nicely. Very similar to what is needed for co-routines, but it is
 possible to do everything on the stack.

 How is that different from a local delegate that refers to local
 variables of the function it is lexically nested in?

 
 I think the difference is that e.g. a coroutine can repeatedly return 
 values from a loop. In contrast, the delegate must keep the state 
 separate, which may be a tad more complicated.

Right, the delegate would have to keep its persistent state in outer locals.

The trouble with a generator using the caller's stack is that then the
generator 
cannot recursively call itself (such as if it was walking a tree). In order to 
work properly in the general case, a generator has to allocate all its local 
variables on the heap.

Mar 18 2010

Norbert Nemec <Norbert Nemec-online.de> writes:

Walter Bright wrote:
 The trouble with a generator using the caller's stack is that then the 
 generator cannot recursively call itself (such as if it was walking a 
 tree). In order to work properly in the general case, a generator has to 
 allocate all its local variables on the heap.

Which concept of generators do you refer to? The stack-based generators 
that I suggested do in fact work recursively. A generator can obtain its 
values from other generators or even recursively from itself without 
needing heap space anywhere.

Mar 19 2010

Walter Bright <newshound1 digitalmars.com> writes:

Norbert Nemec wrote:
 Walter Bright wrote:
 The trouble with a generator using the caller's stack is that then the 
 generator cannot recursively call itself (such as if it was walking a 
 tree). In order to work properly in the general case, a generator has 
 to allocate all its local variables on the heap.

 
 Which concept of generators do you refer to? The stack-based generators 
 that I suggested do in fact work recursively. A generator can obtain its 
 values from other generators or even recursively from itself without 
 needing heap space anywhere.

How can it do that, and yet allow the calling function also call other
functions?

Mar 19 2010

Norbert Nemec <Norbert Nemec-online.de> writes:

Walter Bright wrote:
 Norbert Nemec wrote:
 Walter Bright wrote:
 The trouble with a generator using the caller's stack is that then 
 the generator cannot recursively call itself (such as if it was 
 walking a tree). In order to work properly in the general case, a 
 generator has to allocate all its local variables on the heap.

 Which concept of generators do you refer to? The stack-based 
 generators that I suggested do in fact work recursively. A generator 
 can obtain its values from other generators or even recursively from 
 itself without needing heap space anywhere.

 
 How can it do that, and yet allow the calling function also call other 
 functions?

When a generator is called for the first time, it shifts the ESP to make 
room for its own local variables. When it yields, the ESP remains where 
it is, so the generator's local data is protected. At any time, the ESP 
can be used as usual for regular function calls.

Whenever a loop is entered, it needs some additional space on the stack 
for all the local data of contained generators. This stack space is 
finite, simply defined by the sum of the local data of each contained 
generator. The loop's stack space is released when the loop is 
terminated. Any loops nested inside the outer loop, perhaps inside 
generators, will reserve additional stack space, but none of these stack 
spaces can survive the lifetime of an outer loop, so whenever a loop 
exits, all the stack spaces if inner loops can be freed as well.

I will try to work out an example in assembly code, but I have not used 
assembly in years, so it will take some time.

Mar 19 2010

Norbert Nemec <Norbert Nemec-online.de> writes:

Walter Bright wrote:
 Norbert Nemec wrote:
 Regan Heath wrote:
 So, these generators are essentially inline functions?

 No. They may be inlined, but this is just an optimization just like 
 for  regular functions. If they are not inlined, these generators are 
 simply functions with a special calling convention that allows them to 
 retain their local data and execution point on the stack in between 
 calls.

 I did work out the assembly code for this some time ago and it worked 
 nicely. Very similar to what is needed for co-routines, but it is 
 possible to do everything on the stack.

 
 How is that different from a local delegate that refers to local 
 variables of the function it is lexically nested in?

It is pretty much the same in the trivial case of one loop having one 
generator in the loop header. The difference shows up in more complex cases:

* Try to design a "filter" (a generator that takes another generator as 
argument) - may be possible but probably ugly to implement.

* Try to run a loop that pulls its values pairwise from two independent 
generators - I don't see a way to do this without serious mental acrobatic.

Apart from that, I guess that the runtime overhead without inlining is 
somewhat different. Would have to investigate that in more detail.

Mar 19 2010

D Programming

C/C++ Programming

Other

digitalmars.D - A problem with generators