digitalmars.D - foreach and metaprogramming

Per Eckerdal <per.eckerdal gmail.com> writes:
I have spent some time reading the archives of these newsgroups, and 
I've seen quite a lot, especially on dtl about foreach and 
foreach_reverse and how they should be designed to support that-and-that 
feature. And there are a lot of other places where people talk about 
"2.0" features.

Just my 2 cents about this:

A language should not strive to have many features, but rather a small 
set of features that allow you to do other things on top of it. (Yes, 
I'm one of those scary lisp guys.) If you add tools to create your own 
statements then you don't need to have foreach or foreach_reverse.

I don't know very much about how compilers work when optimizing, but I 
can't see how translating foreach into a for loop would reduce performance.
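
To make the comparison concrete, here is a minimal sketch in D of the same loop written both ways. The function names are made up for illustration, and it only covers built-in arrays, not user-defined containers. The point is only that the foreach form carries nothing a compiler could not turn into the plain for loop.

```d
// Sum an array with foreach.
int sumForeach(const int[] values)
{
    int total = 0;
    foreach (v; values)
        total += v;
    return total;
}

// The hand-written for loop that the foreach above corresponds to.
int sumFor(const int[] values)
{
    int total = 0;
    for (size_t i = 0; i < values.length; ++i)
        total += values[i];
    return total;
}

// Both versions visit the same elements in the same order.
unittest
{
    int[] data = [1, 2, 3, 4];
    assert(sumForeach(data) == sumFor(data));
}
```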

I don't know if OpenC++/OpenJava 
(http://www.csg.is.titech.ac.jp/openjava/) work well, but that's pretty 
much the idea. I've seen that someone has already tried to do this in D, 
but I think it deserves more attention.

Now this is _definitely_ a 2.0 feature :), but still, wouldn't it be 
cool to remove the need for static if, foreach, synchronized and 
contract programming and replace it with something even more general, 
that also lets you do cool stuff like compile-time generation of 
serialization functions, reflection/introspection/rtti implemented in 
the language.. The list goes on.

To me the question seems to be whether to implement one feature after 
another and inevitably seeing D become bloated like all the other 
languages (Java, C++) or implementing a metaobject protocol and have all 
those features in a snap and being able to extend the language beyond 
Java's wildest dreams, without bloating the core language. (Okay, I 
admit that was slightly angled towards my view, but I hope you get my point)

Don't think I dislike D, I am seriously impressed by it and like it a 
lot. I just think it could be even better :)

/Per
Nov 07 2006
Charles D Hixson <charleshixsn earthlink.net> writes:
Per Eckerdal wrote:
 ...
 To me the question seems to be whether to implement one feature after 
 another and inevitably seeing D become bloated like all the other 
 languages (Java, C++) or implementing a metaobject protocol and have all 
 those features in a snap and being able to extend the language beyond 
 Java's wildest dreams, without bloating the core language. (Okay, I 
 admit that was slightly angled towards my view, but I hope you get my 
 point)
 
 Don't think I dislike D, I am seriously impressed by it and like it a 
 lot. I just think it could be even better :)
 
 /Per

I more or less agree with you, but many people here are much more focused on speed. General features are powerful, but code generated using them tends to be (relatively) slow. Functions coded in a more specialized fashion can be faster. Whether the difference is significant or not depends on what you are planning to do.
Nov 07 2006
BCS <nill pathlink.com> writes:
== Quote from Charles D Hixson (charleshixsn earthlink.net)'s article
 I more or less agree with you, but many people here are much
 more focused on speed.  General features are powerful, but
 code generated using them tends to be (relatively) slow.
 Functions coded in a more specialized fashion can be faster.
 Whether the difference is significant or not depends on what
 you are planning to do.

I'm from the "speed is good" camp. I like D because it seems to give good speed while still being easy to use. I know of a few systems that allow for better ease of use (in some respects) but at a great cost in lost speed. As I see it, if ease of use is *all* you want, D isn't what you want.
Nov 07 2006
Walter Bright <newshound digitalmars.com> writes:
BCS wrote:
 I'm from the speed is good camp. I like D because it seems to give good speed
 while still being easy to use. I know of a few systems that allow for better
 ease of use (in some respects) but at a great cost in lost speed. As I see it,
 if ease of use is *all* you want, D isn't what you want.

While many languages do offer "ease of use" and slow code, the ease of use suffers when one does run up against the speed problem with it. Then, you've got to figure out how to recode it in another language, or figure out how to interface other languages to the slow language for the speed critical parts. When you start doing that, the productivity advantages of the "ease of use" go out the window.
Nov 07 2006
Per Eckerdal <per.eckerdal gmail.com> writes:
 I'm from the speed is good camp. I like D because it seems to give good speed
 while still being easy to use.

(Sorry for the long post.)

I don't think speed needs to be a problem if you implement it as a compile-time language for lexical-aware macros. Here is one potential way of doing this, to show you what I'm thinking about.

This would consist of four different language features.

The first one is just syntactic sugar and should be fairly easy to implement: Ruby-style blocks. It could be implemented as just sending a delegate as the last function argument:

    foo(1) { writefln("Hello"); }
    foo(1, { writefln("Hello"); });

would be the same. I don't know if this is hard to do, but I can't see why it would be. And it makes for some very elegant things at other places as well; look at Ruby for examples.

The second one is two new operators, I'll call them $ and #. $ and # can be put before expressions, statements, declarations and identifiers. Examples are:

    $int a = 5;
    ${ /* code block */ }
    class $Name {}

It binds very tightly; you should be able to assume that only the thing next to it is bound.

# returns the AST of the following expression, $ is for declaring macro blocks. $ blocks may return nothing, and then they will be replaced by nothing:

    ${int a = 5;};    // gives nothing

They can also return literals, e.g.

    int a = ${return 5;};

Or, they can return AST objects, and then they will be replaced by that tree:

    ${ return #{if (a == 5) die();}; }
    // This is equal to
    if (a == 5) die();

The argument to # must be a complete expression/declaration/statement, so #{ if (a == } is invalid. The only operation you can do with ASTs created with # is to concatenate with the ~ operator.

I think you could implement this by making all $ blocks functions in a separate "meta"-module (used only internally when compiling). So if you define a function within a macro block, you define a function in the "meta"-module, essentially defining a macro that can be invoked with ${return macroname(args);}

Given this, most of the features of this can be done, but I'd suggest adding some more syntax sugar, a macro keyword:

    macro foreach;

This declares that all foreach calls should be surrounded within ${return ;} and all arguments enclosed in #{}. So you will be able to access the compile-time function foreach as though it was a normal function. There is one more thing to add:

    macro int hello(int a) { return a; }

is equivalent to

    ${ int hello(int a) { return a; } };
    macro hello;

The last feature that might be added is a library that only compile-time code can access that allows you to reflect over classes and functions. (Probably only in the same module as the calling code, but I don't think that's a big limitation.)

As an example I will show how you can implement a nifty kind of iterators using this technique.

    class LinkedList {
        // ...
        macro each(void delegate(Type t) block) {
            return #{
                while (iterate over the list) {
                    block(data);
                }
            };
        }
        // ...
    }

    LinkedList ll = new LinkedList;
    // Add data..

    // I borrowed Ruby's syntax. I don't really like it but it works as demo
    ll.each { |Type data| writefln(data); }

(This example obviously requires the macro stuff to be done *after* templates, but I think that makes sense.) It can't be hard for the compiler to recognize inline delegate calls, so this code should be able to get expanded into no method calls at all.

To sum this all up: What I'm talking about is really only a slightly more hygienic and a lot more powerful macro system. I can't see why this would reduce performance, but I can see how this can be used to perform heavy high-level optimization (for example when you have a tree that doesn't map very well to an index operator or C++-iterators). And it would let you do lots of other cool things as well, like automated persistence layers without another DSL and things like that.

The great drawback I can see is that this would give much longer compilation times. But that's the cost of doing more at compile time and less at runtime, isn't it? (And you should be able to optimize it some, as well.)

One thing that is good about this is that you expose very few compiler internals; only create and append AST is needed, and the rest can be implemented as a normal (compile-time) API.

/Per
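
For comparison, something close to the each example can be approximated in current D without new syntax. The sketch below is only an illustration under stated assumptions, not the macro system described above: LinkedList, push, each and eachInline are hypothetical names chosen for this example. each receives the loop body as a delegate at run time, while eachInline receives it as a template alias parameter, so the call target is known at compile time and is a candidate for inlining (whether a given compiler actually inlines it is not guaranteed).

```d
// Minimal singly linked list; all names are illustrative only.
class LinkedList(T)
{
    private static struct Node { T data; Node* next; }
    private Node* head;

    // Prepend a value to the list.
    void push(T value)
    {
        head = new Node(value, head);
    }

    // Run-time version: the loop body arrives as a delegate.
    void each(scope void delegate(T) block)
    {
        for (auto n = head; n !is null; n = n.next)
            block(n.data);
    }
}

// Compile-time version: the body is a template alias parameter,
// so each distinct body gets its own instantiation of the loop.
void eachInline(alias block, T)(LinkedList!T list)
{
    for (auto n = list.head; n !is null; n = n.next)
        block(n.data);
}

unittest
{
    auto ll = new LinkedList!int;
    ll.push(3);
    ll.push(2);
    ll.push(1);

    int sum = 0;
    ll.each((int x) { sum += x; });           // delegate call per element
    assert(sum == 6);

    int[] seen;
    ll.eachInline!((int x) { seen ~= x; })(); // body bound at compile time
    assert(seen == [1, 2, 3]);
}
```

The free-function form of eachInline is a deliberate choice: a module-level template can take a local lambda as an alias argument, which keeps the sketch portable across compilers.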
Nov 08 2006
Kyle Furlong <kylefurlong gmail.com> writes:
Per Eckerdal wrote:
 I'm from the speed is good camp. I like D because it seems to give 
 good speed
 while still being easy to use.

 ...

I agree with the sentiment of making compile-time programming more powerful. I'm not sure the syntax you introduce is what we are looking for, however.
Nov 08 2006
Bill Baxter <dnewsgroup billbaxter.com> writes:
Charles D Hixson wrote:
 Per Eckerdal wrote:
 ...
 To me the question seems to be whether to implement one feature after 
 another and inevitably seeing D become bloated like all the other 
 languages (Java, C++) or implementing a metaobject protocol and have 
 all those features in a snap and being able to extend the language 
 beyond Java's wildest dreams, without bloating the core language. 
 (Okay, I admit that was slightly angled towards my view, but I hope 
 you get my point)

 Don't think I dislike D, I am seriously impressed by it and like it a 
 lot. I just think it could be even better :)

 /Per

 I more or less agree with you, but many people here are much more focused on speed. General features are powerful, but code generated using them tends to be (relatively) slow. Functions coded in a more specialized fashion can be faster. Whether the difference is significant or not depends on what you are planning to do.

And if you don't need the speed then there are a number of nice languages out there that will probably suit most folks' needs better than D, so it makes sense that D should not yield on that point.

On the other hand, there are cases where a more general feature could be as fast or faster than the specialized functions, but the general feature is much harder to implement in the compiler. In that case D tends to go for the easy fix. Working fast today is valued more than being general today and maybe working fast someday. D tends toward "fast today" and "maybe general someday". Lisp is a good example of "general today, maybe fast someday". From what I understand, there still aren't any high-quality freely-available Lisp compilers out there.

--bb
Nov 07 2006
"Andrey Khropov" <andkhropov_nosp m_mtu-net.ru> writes:
Per Eckerdal wrote:

 I have spent some time reading the archives of these newsgroups, and I've
 seen quite a lot, especially on dtl about foreach and foreach_reverse and how
 they should be designed to support that-and-that feature. And there are a lot
 of other places where people talk about "2.0" features.
 
 Just my 2 cents about this:
 
 A language should not strive to have many features, but rather a small set of
 features that allow you to do other things on top of it. (Yes, I'm one of
 those scary lisp guys.)

  If you add tools to create your own statements then
 you don't need to have foreach or foreach_reverse.

I think Nemerle is a language you'll probably find attractive. D is geared more toward lower-level systems programming, and compiler simplicity (and hence its speed) is one of the design goals.

-- AKhropov
Nov 08 2006