
digitalmars.D - BLADE 0.2Alpha: Vector operations with mixins, expression templates,

reply Don Clugston <dac nospam.com.au> writes:
I have been trying to come up with a convincing use case for the new 
mixins (and for metaprogramming in general). My best effort to date can 
be found at:
http://www.dsource.org/projects/mathextra/browser/trunk/mathextra/Blade.d

It generates near-optimal x87 asm code for BLAS1-style basic vector 
operations. 32, 64 and 80 bit vectors are all supported.

Compile with -version=BladeDebug to see the asm code which is generated.

Typical usage:

void main()
{
     auto p = Vec([1.0L, 2, 18]);    // a vector of 80-bit reals.
     auto q = Vec([3.5L, 1.1, 3.8]);  // ditto
     auto r = Vec([17.0f, 28.25, 1]); // a vector of 32-bit floats
     auto z = Vec([17.0i, 28.1i, 1i]); // a vector of 64-bit idoubles
     real d = dot(r, p+r+r);
     ireal e = dot(r, z);
     q -= ((r+p)*18.0L*314.1L - (p-r))* 35;
     d = dot(r, p+r+r);
}

Notice that mixed-length operations (real[] + float[] - double[]) are 
supported.

Like the C++ Blitz++ library, expression templates are used to convert 
vector expressions into efficient element-wise operations. Unlike that 
library, however, there is no reliance on the compiler's optimiser. 
Instead, the expression template is manipulated as text, converted into 
postfix, and then passed to a simple CTFE compile-time assembler, which 
creates highly efficient asm code which is used as a mixin.
To understand the later parts of the code, you need some knowledge of 
x87 assembler. In fact, you probably need to have read Agner Fog's 
superb Pentium optimisation manual (www.agner.org).
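
Roughly, the middle step boils down to something like this (just a sketch of the idea with invented names -- not the actual Blade.d code, and parentheses are omitted for brevity):

char[] toPostfix(char[] expr)
{
    // Shunting-yard over single-character operands: "a+b*c+d" -> "abc*+d+".
    // Runs at compile time via CTFE, so the result can drive code generation.
    char[] output, stack;
    for (size_t i = 0; i < expr.length; ++i)
    {
        char c = expr[i];
        if (c == '*' || c == '/')
        {
            // pop any waiting operators of equal precedence
            while (stack.length && (stack[$-1] == '*' || stack[$-1] == '/'))
            {
                output ~= stack[$-1];
                stack = stack[0 .. $-1];
            }
            stack ~= c;
        }
        else if (c == '+' || c == '-')
        {
            // lowest precedence: flush everything pending
            while (stack.length)
            {
                output ~= stack[$-1];
                stack = stack[0 .. $-1];
            }
            stack ~= c;
        }
        else
            output ~= c;    // operand: emit directly
    }
    while (stack.length)
    {
        output ~= stack[$-1];
        stack = stack[0 .. $-1];
    }
    return output;
}

static assert(toPostfix("a+b*c+d") == "abc*+d+");

The real code then walks the postfix string and emits roughly one x87 instruction per token; that asm string is what finally gets mixed in.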

Some observations:
* I was amazed at how simple the expression template code is (it is 
somewhat cluttered by the code to check for real/imaginary type mismatch 
errors).
* I've often read that the x87 floating-point stack is notoriously 
difficult for compilers to write code for, but it works quite well in 
this case.
* The major workarounds are:

- inability to use a tuple element directly from asm code.
- inability to define operators for built-in arrays (hence the use of 
'Vec' wrappers; a rough sketch follows this list).
- inability to index through a tuple in a CTFE function (solved by 
converting types into a string).
* There have been mutterings about how unhygienic/dangerous the new 
mixins are. In this case, the mixin forms the _entire_ body of the 
function. This is an interesting situation which I think a language 
purist will find more palatable.
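
The 'Vec' workaround is conceptually something like this (a hand-waved sketch with invented names and the old opAdd operator name -- not the actual Blade.d declarations):

struct Expr(char[] text, Operands...)
{
    // The expression string travels in the template parameter; the operand
    // values travel in this tuple field. Nothing is evaluated here.
    Operands operands;
}

struct Vec
{
    real[] data;

    // '+' returns a description of the computation instead of a result.
    // A real implementation combines the operands' strings recursively;
    // evaluation only happens when the finished expression is mixed in.
    Expr!("a+b", real[], real[]) opAdd(Vec rhs)
    {
        Expr!("a+b", real[], real[]) e;
        e.operands[0] = data;
        e.operands[1] = rhs.data;
        return e;
    }
}

Something like dot() or the -= in the example above is then the point where the accumulated string gets converted to postfix and the generated asm is mixed in.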

Enjoy.
Apr 03 2007
next sibling parent reply Walter Bright <newshound1 digitalmars.com> writes:
Don Clugston wrote:
 I have been trying to come up with a convincing use case for the new 
 mixins (and for metaprogramming in general). My best effort to date can 
 be found at:
 http://www.dsource.org/projects/mathextra/browser/trunk/mathextra/Blade.d
This is pretty incredible. Can I beg you to write an article about it?
Apr 03 2007
parent reply Don Clugston <dac nospam.com.au> writes:
Walter Bright wrote:
 Don Clugston wrote:
 I have been trying to come up with a convincing use case for the new 
 mixins (and for metaprogramming in general). My best effort to date 
 can be found at:
 http://www.dsource.org/projects/mathextra/browser/trunk/mathextra/Blade.d
This is pretty incredible.
I thought you'd like it. The sheer power of D these days... "Flies like a rocket, steers like a bicycle."
 Can I beg you to write an article about it?
OK. Are you thinking somewhere like DDJ? It's nicely focused (central idea being something like "D's metaprogramming is so powerful that compiler back-end optimisation can easily be performed in compile-time libraries") -- but it would require loads of explanation of D's coolest features. Could be a challenge to make it easily readable. But with the proposed changes to 'const', macros, and previous discussion about whether tuples should be automatically flattened, I have doubts over how stable that part of the language is. I wouldn't want to write something that didn't compile any more, or looked hopelessly obsolete by the time it was published. I've been bitten before -- CTFE blasted almost all of my previous metaprogramming code into the stone age <g>. What's the shelf life of this stuff?
Apr 04 2007
next sibling parent Georg Wrede <georg nospam.org> writes:
Don Clugston wrote:
 Could be a challenge to make it easily readable.
I've some experience proofreading and commenting similar articles. Only too happy to oblige! Also, DDJ readers may not be the CUJ crowd, but still, they can handle advanced stuff. Besides, some similar magazines help you by rewriting the article anyway, because then it has the same overall "feel" as their other articles. Articles in SciAm for example feel the same, no matter who the author is or what continent they're from.
Apr 04 2007
prev sibling next sibling parent Walter Bright <newshound1 digitalmars.com> writes:
Don Clugston wrote:
 Walter Bright wrote:
 Can I beg you to write an article about it?
OK. Are you thinking somewhere like DDJ?
Anywhere!
 It's nicely focused (central idea being something like "D's 
 metaprogramming is so powerful, that compiler back-end optimisation can 
 easily be performed in compile-time libraries") -- but it would require 
 loads of explanation of D's coolest features. Could be a challenge to 
 make it easily readable.
 
 But with the proposed changes to 'const', macros, and previous 
 discussion about whether tuples should be automatically flattened, I 
 have doubts over how stable that part of the language is. I wouldn't 
 want to write something that didn't compile any more, or looked 
 hopelessly obsolete by the time it was published.
 I've been bitten before -- CTFE blasted almost all of my previous 
 metaprogramming code into the stone age <g>. What's the shelf life of 
 this stuff?.
6 months from now, it's still a decade ahead of any other language.
Apr 04 2007
prev sibling parent reply "David B. Held" <dheld codelogicconsulting.com> writes:
Don Clugston wrote:
 [...]
 But with the proposed changes to 'const', macros, and previous 
 discussion about whether tuples should be automatically flattened, I 
 have doubts over how stable that part of the language is. I wouldn't 
 want to write something that didn't compile any more, or looked 
 hopelessly obsolete by the time it was published.
 I've been bitten before -- CTFE blasted almost all of my previous 
 metaprogramming code into the stone age <g>. What's the shelf life of 
 this stuff?.
Well, hopefully, if Walter breaks your code again, it will be to make it even more elegant (although, that will be a bit of a trick, I admit). Eventually, the language will be so powerful that you can't make your code any more compact, and then you won't have to worry about it getting broken. To be honest, I'd like to see an article about this in CUJ. I don't know if they would publish it or not, but if they did, I guarantee it would make more than a few people take a hard look at D. Dave
Apr 05 2007
parent reply Walter Bright <newshound1 digitalmars.com> writes:
David B. Held wrote:
 Don Clugston wrote:
 [...]
 But with the proposed changes to 'const', macros, and previous 
 discussion about whether tuples should be automatically flattened, I 
 have doubts over how stable that part of the language is. I wouldn't 
 want to write something that didn't compile any more, or looked 
 hopelessly obsolete by the time it was published.
 I've been bitten before -- CTFE blasted almost all of my previous 
 metaprogramming code into the stone age <g>. What's the shelf life of 
 this stuff?.
Well, hopefully, if Walter breaks your code again, it will be to make it even more elegant (although, that will be a bit of a trick, I admit). Eventually, the language will be so powerful that you can't make your code any more compact, and then you won't have to worry about it getting broken. To be honest, I'd like to see an article about this in CUJ. I don't know if they would publish it or not, but if they did, I guarantee it would make more than a few people take a hard look at D.
I'm pretty sure DDJ would publish it, and it would reach a much wider audience than CUJ, which has disappeared <g>. Anyhow, Don, this kind of stuff needs to reach a much wider audience than a random posting in this n.g. would get.
Apr 05 2007
next sibling parent reply Dan <murpsoft hotmail.com> writes:
Current compilers tend to dump IRs to a buffer and then adjust them according
to parser rules.  The wrong approach.

I totally want to see this kind of programming get implemented into the
compiler itself.  I know it's too much to ask, but if DMD could generate
optimal floating point math and cache pipelines, it would rend the gaming
industry from C++'s hands REAL quick like.

A C++ programmer can say:
"oh yeah, it's a minor upgrade made for my convenience, but I'm already
conveniently comfortable with C++"...

Until a decision maker sees that D performs 20% better, dramatically reduces
debugging time, cuts source code volume, and has better version control.
Apr 05 2007
parent janderson <askme me.com> writes:
Dan wrote:
 Current compilers tend to dump IR's to a buffer and then adjust them according
to parser rules.  The wrong approach.
 
 I totally want to see this kind of programming get implemented into the
compiler itself.  I know it's too much to ask, but if DMD could generate
optimal floating point math and cache pipelines, it would rend the gaming
industry from C++'s hands REAL quick like.
 
 A C++ programmer can say:
 "oh yeah, it's a minor upgrade made for my convenience, but I'm already
conveniently comfortable with C++"...
 
 Until a decision maker sees that D performs 20% better, dramatically reduces
debugging time, cuts source code volume, and has better version control.
That would be nice and would help, however I think it'll take more than that to convince a game company to switch to D, particularly ones using XNA. It would have to be a startup company that is not working on the 360 -- perhaps one targeting Nintendo, Sony, Linux or perhaps Windows. I say perhaps, because if a company is going to switch to anything it's probably XNA. Now if we had D for .NET, maybe the story would be slightly different. I do like the idea of using D for games, I just see it as being a hard sell. Scripting and tools are probably the main gateway for D into the game programming world. Once a few game companies start using it in their main project and can demonstrate productivity/speed improvements, then the wall may indeed start to crumble. -Joel
Apr 07 2007
prev sibling parent reply Don Clugston <dac nospam.com.au> writes:
Walter Bright wrote:
 David B. Held wrote:
 Don Clugston wrote:
 [...]
 But with the proposed changes to 'const', macros, and previous 
 discussion about whether tuples should be automatically flattened, I 
 have doubts over how stable that part of the language is. I wouldn't 
 want to write something that didn't compile any more, or looked 
 hopelessly obsolete by the time it was published.
 I've been bitten before -- CTFE blasted almost all of my previous 
 metaprogramming code into the stone age <g>. What's the shelf life of 
 this stuff?.
Well, hopefully, if Walter breaks your code again, it will be to make it even more elegant (although, that will be a bit of a trick, I admit). Eventually, the language will be so powerful that you can't make your code any more compact, and then you won't have to worry about it getting broken. To be honest, I'd like to see an article about this in CUJ. I don't know if they would publish it or not, but if they did, I guarantee it would make more than a few people take a hard look at D.
I'm pretty sure DDJ would publish it, and it would reach a much wider audience than CUJ, which has disappeared <g>. Anyhow, Don, this kind of stuff needs to reach a much wider audience than a random posting in this n.g. would get.
Ooooooh.... I just updated the code, adding a code generator for D code, for when x87 is unavailable. I could not believe how easy it was. It was about 30 lines of trivial code. Yet directly generating a 'for' loop is still better than what Blitz++ does. I reckon it'd take less than a week to implement everything those libraries do. The language synergy of tuples, mixins, CTFE, builtin strings, the ~ operator, constant folding, array slicing, template string parameters, type deduction, defined inline asm, static if, is() expressions... is astonishing. *All* of them are required in order for this to work.
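
For the curious, that fallback generator is conceptually no more than this -- a toy sketch with invented names, nowhere near the real thing:

char[] generateLoop(char[] elementExpr)
{
    // Build an ordinary element-wise loop as a string, ready for mixin(),
    // e.g. elementExpr = "a[i] + b[i]*c[i]" built from the postfix form.
    return "for (size_t i = 0; i < result.length; ++i) {\n"
         ~ "    result[i] = " ~ elementExpr ~ ";\n"
         ~ "}";
}

// usage, roughly, with arrays result, a, b, c in scope:
//   mixin(generateLoop("a[i] + b[i]*c[i]"));

The idea being that the same postfix string can drive either back end; only the last step differs.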
Apr 05 2007
next sibling parent Georg Wrede <georg nospam.org> writes:
Don Clugston wrote:
 Ooooooh....
 I just updated the code, adding a code generator for D code, for when 
 x87 is unavailable. I could not believe how easy it was. It was about 30 
 lines of trivial code. Yet directly generating a 'for' loop is still 
 better than what Blitz++ does.
 I reckon it'd take less than a week to implement everything those 
 libraries do.
 
 The language synergy of tuples, mixins, CTFE, builtin strings, the ~ 
 operator, constant folding, array slicing, template string parameters, 
 type deduction, defined inline asm, static if, is() expressions... is 
 astonishing. *All* of them are required in order for this to work.
Gees, Don, the above quote would be the perfect ingress (or at least a "quote box") in a DDJ article!!!! I'm clenching the chair so I wouldn't buy the tickets to come to your place and write it together with you, right now!
Apr 05 2007
prev sibling next sibling parent Georg Wrede <georg nospam.org> writes:
Quoting from the page:

 If you have an article idea, you should first send us an article
 proposal. This is a short statement (about one page) that says what your
 article will be about. You can mail it to our <a href="address.htm">office</a>
 or e-mail it to <a href="mailto:editors ddj.com">editors ddj.com</a>.
 Be sure to give us several ways to contact you (such as phone number,
 e-mail address, fax number, and mailing address). We'll look at your
 idea and get back to you, usually within a month.</p>
...
 <p>Although many people simply write their entire article and send it
 to us, there are several *advantages* to sending us a proposal. We can
 look at your idea and possibly suggest a particular slant you hadn't
 considered. We can also help you to organize the article.
Later, when you start writing the article itself:
 Write as if you are giving a brief, informal talk to a group of
 coworkers. You want to be concise and well organized because you don't
 want to waste their time, but you don't want to be too stuffy or
 formal because you know them.
Ah, and to give you more of the "feel" of being an article author, here's another quote from the same page:
 Illustrations and screen captures offer an opportunity to lure
 anyone looking through an issue into reading your article. They
 can also provide succinct summaries of complex ideas. Since we
 redraw most figures anyway, to conform to our size and style
 requirements,
 you should not worry about making your images very high quality.
 Many authors prefer to sketch them by hand and fax or mail them.
Heh, especially the last sentence. The same goes for the text itself. The point being, from the magazine's point of view, most stuff sent to them is by teenagers, idiots, mentals, over-the-hill old-timers -- and only a handful is stuff they'd be interested in. Now, for those few, they'd kill, lick your boots, or even sharpen your pencil for you. As to whether this would interest them, all you have to send them is the above mentioned Ingress. Let them take over from that. Simply naming Walter as a reference should get you a definite YES/NO from them. And once you get the YES, they'll hold your hand through and through. --- For you personally, having your article in DDJ is of course "nice". But the real benefit is that then you've done it, been there. So, by the time D advances yet another bit, and your "template jujitsu" turns into "template nuke" (as perceived from the audience -- you'll never notice the change yourself!), it's gonna be a lot easier to jot up something for them again. Or for the next magazine. (Ahem, I'm ashamed to say, but at least as I experienced, my first article was a bit like "doing it for the first time with a woman". Once you get over it, the next one is a piece of cake.) Thusly, both now and in the future, you're in a position to do D a /lot/ of good -- with effort that's actually less than some of your better D forages to date. PS, if they get back to you within /less/ than 4 weeks, they're drooling. Unless they've read this post, of course. ;-)
Apr 05 2007
prev sibling parent reply Walter Bright <newshound1 digitalmars.com> writes:
Don Clugston wrote:
 Ooooooh....
 I just updated the code, adding a code generator for D code, for when 
 x87 is unavailable. I could not believe how easy it was. It was about 30 
 lines of trivial code. Yet directly generating a 'for' loop is still 
 better than what Blitz++ does.
 I reckon it'd take less than a week to implement everything those 
 libraries do.
 
 The language synergy of tuples, mixins, CTFE, builtin strings, the ~ 
 operator, constant folding, array slicing, template string parameters, 
 type deduction, defined inline asm, static if, is() expressions... is 
 astonishing. *All* of them are required in order for this to work.
Just wait 'till you get AST macros! But this stuff is *awesome*. Scratch DDJ, I don't want to wait 6 months. This needs to get out now, - I suggest www.artima.com, followed up with dzone, slashdot, and digg. It would also be a real treat to have you present this at Amazon's D conference in August.
Apr 05 2007
parent reply Dan <murpsoft hotmail.com> writes:
Walter Bright Wrote:
 Just wait 'till you get AST macros!
 
 But this stuff is *awesome*. Scratch DDJ, I don't want to wait 6 months. 
 This needs to get out now, - I suggest www.artima.com, followed up with 
 dzone, slashdot, and digg.
 
 It would also be a real treat to have you present this at Amazon's D 
 conference in August.
Wow. That's the most excited I've seen Walter. : ) I'm going to have to read and understand your code. I can tell.
Apr 05 2007
parent reply Georg Wrede <georg nospam.org> writes:
Dan wrote:
 Walter Bright Wrote:
 
Just wait 'till you get AST macros!

But this stuff is *awesome*. Scratch DDJ, I don't want to wait 6 months. 
This needs to get out now, - I suggest www.artima.com, followed up with 
dzone, slashdot, and digg.

It would also be a real treat to have you present this at Amazon's D 
conference in August.
Wow. That's the most excited I've seen Walter. : )
LOL, have to admit I agree!!!
Apr 05 2007
parent reply Bill Baxter <dnewsgroup billbaxter.com> writes:
Georg Wrede wrote:
 Dan wrote:
 Walter Bright Wrote:

 Just wait 'till you get AST macros!

 But this stuff is *awesome*. Scratch DDJ, I don't want to wait 6 
 months. This needs to get out now, - I suggest www.artima.com, 
 followed up with dzone, slashdot, and digg.

 It would also be a real treat to have you present this at Amazon's D 
 conference in August.
Wow. That's the most excited I've seen Walter. : )
LOL, have to admit I agree!!!
It really is very cool and all, but did I just miss the benchmarks? I need numbers, man! You can generate all the assembly you want, but at the end of the day it's the numbers that people care about. --bb
Apr 05 2007
parent Don Clugston <dac nospam.com.au> writes:
Bill Baxter wrote:
 Georg Wrede wrote:
 Dan wrote:
 Walter Bright Wrote:

 Just wait 'till you get AST macros!

 But this stuff is *awesome*. Scratch DDJ, I don't want to wait 6 
 months. This needs to get out now, - I suggest www.artima.com, 
 followed up with dzone, slashdot, and digg.

 It would also be a real treat to have you present this at Amazon's D 
 conference in August.
Wow. That's the most excited I've seen Walter. : )
LOL, have to admit I agree!!!
It really is very cool and all, but did I just miss the benchmarks? I need numbers, man! You can generate all the assembly you want, but at the end of the day its the numbers that people care about. --bb
You're right, but it wasn't posted as a finished product! I post this sort of thing every now and again, largely to give Walter some feedback as to what the language is doing at the bleeding edge. In this case, I thought it would be relevant to AST macro development. Also I posted this to get an idea if there was much interest in this sort of thing -- I don't have a great deal of free time, so I want to make good use of it. I only got the code working a few days ago. The speed of the code will likely be affected by how well DMD optimises the tuple copying. I haven't even looked at what code it's actually generating.
Apr 06 2007
prev sibling next sibling parent Walter Bright <newshound1 digitalmars.com> writes:
Don Clugston wrote:
 * I've often read that the x87 floating-point stack is notoriously 
 difficult for compilers to write code for, but it works quite well in 
 this case.
The hard part is doing register allocation. Generating code for the stack machine is trivial. There are some complications for dealing with 87 stack overflows.
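
For instance, the postfix form maps straight onto the stack machine: each operand becomes an fld, each binary operator an arithmetic-and-pop. A hand-written sketch (not compiler output; assumes DMD-style x86 inline asm and 80-bit reals):

real example(real a, real b, real c)
{
    // computes (a+b)*c, i.e. the postfix sequence "a b + c *"
    real result;
    asm
    {
        fld a;              // push a
        fld b;              // push b
        faddp ST(1), ST;    // a+b
        fld c;              // push c
        fmulp ST(1), ST;    // (a+b)*c
        fstp result;        // pop the result into memory
    }
    return result;
}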
Apr 03 2007
prev sibling next sibling parent reply Pragma <ericanderton yahoo.removeme.com> writes:
Don Clugston wrote:
 I have been trying to come up with a convincing use case for the new 
 mixins (and for metaprogramming in general). My best effort to date can 
 be found at:
 http://www.dsource.org/projects/mathextra/browser/trunk/mathextra/Blade.d
 
 It generates near-optimal x87 asm code for BLAS1-style basic vector 
 operations. 32, 64 and 80 bit vectors are all supported.
 
 Compile with -version=BladeDebug to see the asm code which is generated.
 
 Typical usage:
 
 void main()
 {
     auto p = Vec([1.0L, 2, 18]);    // a vector of 80-bit reals.
     auto q = Vec([3.5L, 1.1, 3.8]);  // ditto
     auto r = Vec([17.0f, 28.25, 1]); // a vector of 32-bit floats
     auto z = Vec([17.0i, 28.1i, 1i]); // a vector of 64-bit idoubles
     real d = dot(r, p+r+r);
     ireal e = dot(r, z);
     q -= ((r+p)*18.0L*314.1L - (p-r))* 35;
     d = dot(r, p+r+r);
 }
 
 Notice that mixed-length operations (real[] + float[] - double[]) are 
 supported.
 
 Like the C++ Blitz++ library, expression templates are used to convert 
 vector expressions into efficient element-wise operations. Unlike that 
 library, however, there is no reliance on the compiler's optimiser. 
 Instead, the expression template is manipulated as text, converted into 
 postfix, and then passed to a simple CTFE compile-time assembler, which 
 creates highly efficient asm code which is used as a mixin.
 To understand the later parts of the code, you need some knowledge of 
 x87 assembler. In fact, you probably need to have read Agner Fog's 
 superb Pentium optimisation manual (www.agner.org).
 
 Some observations:
 * I was amazed at how simple the expression template code is (it is 
 somewhat cluttered by the code to check for real/imaginary type mismatch 
 errors).
 * I've often read that the x87 floating-point stack is notoriously 
 difficult for compilers to write code for, but it works quite well in 
 this case.
 * The major workarounds are:

 - inability to define operators for built-in arrays (hence the use of 
 'Vec' wrappers).
 - inability to index through a tuple in a CTFE function (solved by 
 converting types into a string).
 * There have been mutterings about how unhygenic/dangerous the new 
 mixins are. In this case, the mixin forms the _entire_ body of the 
 function. This is an interesting situation which I think a language 
 purist will find more palatable.
 
 Enjoy.
This is a work of art Don - it's practically a compiler extension. Nice job. :)

For others that are interested in how this actually gets the job done, I found this in your documentation:

 * THEORY:
 * Expression templates are used to create an expression string of the form "(a+b*c)+d"
 * and a tuple, the entries of which correspond to a, b, c, d, ...
 * This string is converted to postfix. The postfix string is converted to
 * a string containing x87 asm, which is then mixed into a function which accepts the tuple.

-- 
- EricAnderton at yahoo
Apr 03 2007
parent reply Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
Pragma wrote:
 Don Clugston wrote:
 I have been trying to come up with a convincing use case for the new 
 mixins (and for metaprogramming in general). My best effort to date 
 can be found at:
 http://www.dsource.org/projects/mathextra/browser/trunk/mathextra/Blade.d

 It generates near-optimal x87 asm code for BLAS1-style basic vector 
 operations. 32, 64 and 80 bit vectors are all supported.

 Compile with -version=BladeDebug to see the asm code which is generated.

 Typical usage:

 void main()
 {
     auto p = Vec([1.0L, 2, 18]);    // a vector of 80-bit reals.
     auto q = Vec([3.5L, 1.1, 3.8]);  // ditto
     auto r = Vec([17.0f, 28.25, 1]); // a vector of 32-bit floats
     auto z = Vec([17.0i, 28.1i, 1i]); // a vector of 64-bit idoubles
     real d = dot(r, p+r+r);
     ireal e = dot(r, z);
     q -= ((r+p)*18.0L*314.1L - (p-r))* 35;
     d = dot(r, p+r+r);
 }

 Notice that mixed-length operations (real[] + float[] - double[]) are 
 supported.

 Like the C++ Blitz++ library, expression templates are used to convert 
 vector expressions into efficient element-wise operations. Unlike that 
 library, however, there is no reliance on the compiler's optimiser. 
 Instead, the expression template is manipulated as text, converted 
 into postfix, and then passed to a simple CTFE compile-time assembler, 
 which creates highly efficient asm code which is used as a mixin.
 To understand the later parts of the code, you need some knowledge of 
 x87 assembler. In fact, you probably need to have read Agner Fog's 
 superb Pentium optimisation manual (www.agner.org).

 Some observations:
 * I was amazed at how simple the expression template code is (it is 
 somewhat cluttered by the code to check for real/imaginary type 
 mismatch errors).
 * I've often read that the x87 floating-point stack is notoriously 
 difficult for compilers to write code for, but it works quite well in 
 this case.
 * The major workarounds are:

 - inability to define operators for built-in arrays (hence the use of 
 'Vec' wrappers).
 - inability to index through a tuple in a CTFE function (solved by 
 converting types into a string).
 * There have been mutterings about how unhygenic/dangerous the new 
 mixins are. In this case, the mixin forms the _entire_ body of the 
 function. This is an interesting situation which I think a language 
 purist will find more palatable.

 Enjoy.
This is a work of art Don - it's practically a compiler extension. Nice job. :) For others that are interested in how this actually gets the job done, I found this in your documentation: * THEORY: * Expression templates are used to create an expression string of the form "(a+b*c)+d" * and a tuple, the entries of which correspond to a, b, c, d, ... * This string is converted to postfix. The postfix string is converted to * a string containing x87 asm, which is then mixed into a function which accepts the tuple.
Hum, a minor question, is a string representation of the expressions better (easier to use, manipulate, etc.) than a tree representation? -- Bruno Medeiros - MSc in CS/E student http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Apr 07 2007
parent reply Pragma <ericanderton yahoo.removeme.com> writes:
Bruno Medeiros wrote:
 Pragma wrote:
 Don Clugston wrote:
 I have been trying to come up with a convincing use case for the new 
 mixins (and for metaprogramming in general). My best effort to date 
 can be found at:
 http://www.dsource.org/projects/mathextra/browser/trunk/mathextra/Blade.d 


 It generates near-optimal x87 asm code for BLAS1-style basic vector 
 operations. 32, 64 and 80 bit vectors are all supported.

 Compile with -version=BladeDebug to see the asm code which is generated.

 Typical usage:

 void main()
 {
     auto p = Vec([1.0L, 2, 18]);    // a vector of 80-bit reals.
     auto q = Vec([3.5L, 1.1, 3.8]);  // ditto
     auto r = Vec([17.0f, 28.25, 1]); // a vector of 32-bit floats
     auto z = Vec([17.0i, 28.1i, 1i]); // a vector of 64-bit idoubles
     real d = dot(r, p+r+r);
     ireal e = dot(r, z);
     q -= ((r+p)*18.0L*314.1L - (p-r))* 35;
     d = dot(r, p+r+r);
 }

 Notice that mixed-length operations (real[] + float[] - double[]) are 
 supported.

 Like the C++ Blitz++ library, expression templates are used to 
 convert vector expressions into efficient element-wise operations. 
 Unlike that library, however, there is no reliance on the compiler's 
 optimiser. Instead, the expression template is manipulated as text, 
 converted into postfix, and then passed to a simple CTFE compile-time 
 assembler, which creates highly efficient asm code which is used as a 
 mixin.
 To understand the later parts of the code, you need some knowledge of 
 x87 assembler. In fact, you probably need to have read Agner Fog's 
 superb Pentium optimisation manual (www.agner.org).

 Some observations:
 * I was amazed at how simple the expression template code is (it is 
 somewhat cluttered by the code to check for real/imaginary type 
 mismatch errors).
 * I've often read that the x87 floating-point stack is notoriously 
 difficult for compilers to write code for, but it works quite well in 
 this case.
 * The major workarounds are:

 - inability to define operators for built-in arrays (hence the use of 
 'Vec' wrappers).
 - inability to index through a tuple in a CTFE function (solved by 
 converting types into a string).
 * There have been mutterings about how unhygenic/dangerous the new 
 mixins are. In this case, the mixin forms the _entire_ body of the 
 function. This is an interesting situation which I think a language 
 purist will find more palatable.

 Enjoy.
This is a work of art Don - it's practically a compiler extension. Nice job. :) For others that are interested in how this actually gets the job done, I found this in your documentation: * THEORY: * Expression templates are used to create an expression string of the form "(a+b*c)+d" * and a tuple, the entries of which correspond to a, b, c, d, ... * This string is converted to postfix. The postfix string is converted to * a string containing x87 asm, which is then mixed into a function which accepts the tuple.
Hum, a minor question, is a string representation of the expressions better (easier to use, manipulate, etc.) than a tree representation?
I know Don wrote this lib, but I think I can answer this. In short: no. But it's really an unintentionally loaded question: there are some rather severe limitations as to what kind of data structures you can create at compile time. You're basically limited to tuples and strings, each of which has drawbacks of its own. You can create a tree using tuples, but then you run into problems with over-running the limitations on identifier length built into the compiler. Strings are nowhere near as flexible out of the box, but pose no storage limitations, which gives a slight edge to string-based representations of data. With some clever coding, strings can be made to store just about any structure imaginable (even if it does make for some ugly code).

-- 
- EricAnderton at yahoo
Apr 09 2007
next sibling parent reply Dan <murpsoft hotmail.com> writes:
Pragma Wrote:
 In short: no.  But it's really an unintentionally loaded question: there are
some rather severe limitations as to what 
 kind of data structures you can create at compile time.  You're basically
limited to tuples and strings, each of which 
 have drawbacks of their own.
 
 You can create a tree using tuples, but then you run into problems with
over-running the limitations on identifier 
 length built into the compiler.  Strings are no where near as flexible out of
the box, but pose no storage limitations, 
 which gives a slight edge to string-based representations of data.  With some
clever coding, strings can be made to 
 store just about any structure imaginable (even if it does make for some ugly
code).
Well said. Once the AST is reflected at compile time, I'm sure the code can be made vastly more refined; and perhaps additional code generators can be developed and *hopefully* integrated into D. Particular generators that spark my interest tend to have to do with x87, SSE extensions, and x86-64. Most compilers to date don't properly use these functionalities. Having them integrated into D Agner Fog-optimally would hugely replace a lot of C++ gaming, video, rendering and graphics engine code. *bigsmile
Apr 09 2007
next sibling parent Dave <Dave_member pathlink.com> writes:
Dan wrote:
 Pragma Wrote:
 In short: no.  But it's really an unintentionally loaded question: there are
some rather severe limitations as to what 
 kind of data structures you can create at compile time.  You're basically
limited to tuples and strings, each of which 
 have drawbacks of their own.

 You can create a tree using tuples, but then you run into problems with
over-running the limitations on identifier 
 length built into the compiler.  Strings are no where near as flexible out of
the box, but pose no storage limitations, 
 which gives a slight edge to string-based representations of data.  With some
clever coding, strings can be made to 
 store just about any structure imaginable (even if it does make for some ugly
code).
Well said. Once the AST is reflected at compile time, I'm sure the code can be made vastly more refined; and perhaps additional code generators can be developed and *hopefully* integrated into D. Particular generators that spark my interest tend to have to do with x87, SSE extensions, and x86-64. Most compilers to date don't properly use these functionalities. Having them integrated into D Agner Fog-optimally would hugely replace alot of C++ gaming, video, rendering and graphics engine code. *bigsmile
Hmmm - I wonder if a lot of what Don's been doing is serving as a "proof of concept" for the D 2.0 compiler, to be written in D of course <g> (It just seems like a lot of the new D features fit so well with compiler development. Especially for things like "proving constness", etc.) - Dave
Apr 09 2007
prev sibling next sibling parent Pragma <ericanderton yahoo.removeme.com> writes:
Dan wrote:
 Pragma Wrote:
 In short: no.  But it's really an unintentionally loaded question: there are
some rather severe limitations as to what 
 kind of data structures you can create at compile time.  You're basically
limited to tuples and strings, each of which 
 have drawbacks of their own.

 You can create a tree using tuples, but then you run into problems with
over-running the limitations on identifier 
 length built into the compiler.  Strings are no where near as flexible out of
the box, but pose no storage limitations, 
 which gives a slight edge to string-based representations of data.  With some
clever coding, strings can be made to 
 store just about any structure imaginable (even if it does make for some ugly
code).
Well said. Once the AST is reflected at compile time, I'm sure the code can be made vastly more refined; and perhaps additional code generators can be developed and *hopefully* integrated into D.
Thanks. But evaluating the AST will only take things so far. There is a huge gulf of functionality that stands between where we are now (at compile time) and what D would be like as a full-on scripting language (not that I'm advocating that it be pushed that far). Much of what sits in the middle (namely structs, multi-dimensional arrays and maps) can go a long way towards *analyzing* that tree for code generation and more.
 
 Particular generators that spark my interest tend to have to do with x87, SSE
extensions, and x86-64.  Most compilers to date don't properly use these
functionalities.  Having them integrated into D Agner Fog-optimally would
hugely replace alot of C++ gaming, video, rendering and graphics engine code.
 
Anyway, I'm grinning too. I can't wait to see what comes up next. So far, converting Enki to a compile-time generator is proving to be a huge challenge, but workable. -- - EricAnderton at yahoo
Apr 09 2007
prev sibling parent reply Paul Findlay <r.lph50+d gmail.com> writes:
 Particular generators that spark my interest tend to have to do with x87,
 SSE extensions, and x86-64.  Most compilers to date don't properly use
 these functionalities.  Having them integrated into D Agner Fog-optimally
 would hugely replace alot of C++ gaming, video, rendering and graphics
 engine code.
And it's dawning on me that some text processing can take advantage of doing 64-bit/128-bit chunks at a time - Paul
Apr 09 2007
parent Dan <murpsoft hotmail.com> writes:
Paul Findlay Wrote:

 Particular generators that spark my interest tend to have to do with x87,
 SSE extensions, and x86-64.  Most compilers to date don't properly use
 these functionalities.  Having them integrated into D Agner Fog-optimally
 would hugely replace alot of C++ gaming, video, rendering and graphics
 engine code.
And its dawning on me that some text processing can take advantage of doing 64-bit/128-bit chunks at a time
Yeah, pretty much any large (>1KB) buffer copy algorithm runs fastest through SSE2 ever since 128-bit came out. The raw power overcomes the need for tweaking the loop instead of just using rep movsd. I think the asm guys have the best code we could hope for already written up on a few pages: Agner Fog, Paul Hsieh et al. If we could leech their strategy (legally, I mean) then I can see a good use for such a thing. I guess we just ought to wait for AST reflection.
Apr 10 2007
prev sibling parent reply Don Clugston <dac nospam.com.au> writes:
Pragma wrote:
 Bruno Medeiros wrote:
 Pragma wrote:
 Don Clugston wrote:
 I have been trying to come up with a convincing use case for the new 
 mixins (and for metaprogramming in general). My best effort to date 
 can be found at:
 http://www.dsource.org/projects/mathextra/browser/trunk/mathextra/Blade.d 


 It generates near-optimal x87 asm code for BLAS1-style basic vector 
 operations. 32, 64 and 80 bit vectors are all supported.

 Compile with -version=BladeDebug to see the asm code which is 
 generated.

 Typical usage:

 void main()
 {
     auto p = Vec([1.0L, 2, 18]);    // a vector of 80-bit reals.
     auto q = Vec([3.5L, 1.1, 3.8]);  // ditto
     auto r = Vec([17.0f, 28.25, 1]); // a vector of 32-bit floats
     auto z = Vec([17.0i, 28.1i, 1i]); // a vector of 64-bit idoubles
     real d = dot(r, p+r+r);
     ireal e = dot(r, z);
     q -= ((r+p)*18.0L*314.1L - (p-r))* 35;
     d = dot(r, p+r+r);
 }

 Notice that mixed-length operations (real[] + float[] - double[]) 
 are supported.

 Like the C++ Blitz++ library, expression templates are used to 
 convert vector expressions into efficient element-wise operations. 
 Unlike that library, however, there is no reliance on the compiler's 
 optimiser. Instead, the expression template is manipulated as text, 
 converted into postfix, and then passed to a simple CTFE 
 compile-time assembler, which creates highly efficient asm code 
 which is used as a mixin.
 To understand the later parts of the code, you need some knowledge 
 of x87 assembler. In fact, you probably need to have read Agner 
 Fog's superb Pentium optimisation manual (www.agner.org).

 Some observations:
 * I was amazed at how simple the expression template code is (it is 
 somewhat cluttered by the code to check for real/imaginary type 
 mismatch errors).
 * I've often read that the x87 floating-point stack is notoriously 
 difficult for compilers to write code for, but it works quite well 
 in this case.
 * The major workarounds are:

 - inability to define operators for built-in arrays (hence the use 
 of 'Vec' wrappers).
 - inability to index through a tuple in a CTFE function (solved by 
 converting types into a string).
 * There have been mutterings about how unhygenic/dangerous the new 
 mixins are. In this case, the mixin forms the _entire_ body of the 
 function. This is an interesting situation which I think a language 
 purist will find more palatable.

 Enjoy.
This is a work of art Don - it's practically a compiler extension. Nice job. :) For others that are interested in how this actually gets the job done, I found this in your documentation: * THEORY: * Expression templates are used to create an expression string of the form "(a+b*c)+d" * and a tuple, the entries of which correspond to a, b, c, d, ... * This string is converted to postfix. The postfix string is converted to * a string containing x87 asm, which is then mixed into a function which accepts the tuple.
Hum, a minor question, is a string representation of the expressions better (easier to use, manipulate, etc.) than a tree representation?
I know Don wrote this lib, but I think I can answer this. In short: no. But it's really an unintentionally loaded question: there are some rather severe limitations as to what kind of data structures you can create at compile time. You're basically limited to tuples and strings, each of which have drawbacks of their own. You can create a tree using tuples, but then you run into problems with over-running the limitations on identifier length built into the compiler.
Actually I don't know how to make a tree nicely with tuples. Strings are no where near as flexible out of the box, but
 pose no storage limitations, which gives a slight edge to string-based 
 representations of data.  With some clever coding, strings can be made 
 to store just about any structure imaginable (even if it does make for 
 some ugly code).
There are a couple of minor things, though:
(1) In the case of generating D code to mix in, no tree is required (the string just gets mixed back in again, so it might as well stay as a string).
(2) The "separate string and tuple" structure means the expression can be optimised without needing to move any of the values around; this creates less garbage for the compiler to optimise away.
(3) Some patterns are probably easier to find with a string than with a tree.
(4) It's more natural for D coders to think in terms of arrays than lists (and there is much more language support).
(5) A string representation can easily express things like c=a+3*(a+b); where 'a' appears twice. (At present, there's no way to generate it, but theoretically the compiler could provide it in many cases.)
So -- if trees were available, I'm not sure if I'd use them, or not.
Apr 09 2007
parent reply Pragma <ericanderton yahoo.removeme.com> writes:
Don Clugston wrote:
 Pragma wrote:
 Bruno Medeiros wrote:
 Pragma wrote:
 Don Clugston wrote:
 I have been trying to come up with a convincing use case for the 
 new mixins (and for metaprogramming in general). My best effort to 
 date can be found at:
 http://www.dsource.org/projects/mathextra/browser/trunk/mathextra/Blade.d 


 It generates near-optimal x87 asm code for BLAS1-style basic vector 
 operations. 32, 64 and 80 bit vectors are all supported.

 Compile with -version=BladeDebug to see the asm code which is 
 generated.

 Typical usage:

 void main()
 {
     auto p = Vec([1.0L, 2, 18]);    // a vector of 80-bit reals.
     auto q = Vec([3.5L, 1.1, 3.8]);  // ditto
     auto r = Vec([17.0f, 28.25, 1]); // a vector of 32-bit floats
     auto z = Vec([17.0i, 28.1i, 1i]); // a vector of 64-bit idoubles
     real d = dot(r, p+r+r);
     ireal e = dot(r, z);
     q -= ((r+p)*18.0L*314.1L - (p-r))* 35;
     d = dot(r, p+r+r);
 }

 Notice that mixed-length operations (real[] + float[] - double[]) 
 are supported.

 Like the C++ Blitz++ library, expression templates are used to 
 convert vector expressions into efficient element-wise operations. 
 Unlike that library, however, there is no reliance on the 
 compiler's optimiser. Instead, the expression template is 
 manipulated as text, converted into postfix, and then passed to a 
 simple CTFE compile-time assembler, which creates highly efficient 
 asm code which is used as a mixin.
 To understand the later parts of the code, you need some knowledge 
 of x87 assembler. In fact, you probably need to have read Agner 
 Fog's superb Pentium optimisation manual (www.agner.org).

 Some observations:
 * I was amazed at how simple the expression template code is (it is 
 somewhat cluttered by the code to check for real/imaginary type 
 mismatch errors).
 * I've often read that the x87 floating-point stack is notoriously 
 difficult for compilers to write code for, but it works quite well 
 in this case.
 * The major workarounds are:

 - inability to define operators for built-in arrays (hence the use 
 of 'Vec' wrappers).
 - inability to index through a tuple in a CTFE function (solved by 
 converting types into a string).
 * There have been mutterings about how unhygenic/dangerous the new 
 mixins are. In this case, the mixin forms the _entire_ body of the 
 function. This is an interesting situation which I think a language 
 purist will find more palatable.

 Enjoy.
This is a work of art Don - it's practically a compiler extension. Nice job. :) For others that are interested in how this actually gets the job done, I found this in your documentation: * THEORY: * Expression templates are used to create an expression string of the form "(a+b*c)+d" * and a tuple, the entries of which correspond to a, b, c, d, ... * This string is converted to postfix. The postfix string is converted to * a string containing x87 asm, which is then mixed into a function which accepts the tuple.
Hum, a minor question, is a string representation of the expressions better (easier to use, manipulate, etc.) than a tree representation?
I know Don wrote this lib, but I think I can answer this. In short: no. But it's really an unintentionally loaded question: there are some rather severe limitations as to what kind of data structures you can create at compile time. You're basically limited to tuples and strings, each of which have drawbacks of their own. You can create a tree using tuples, but then you run into problems with over-running the limitations on identifier length built into the compiler.
Actually I don't know how to make a tree nicely with tuples.
Here's an example (for grins):

import std.stdio;

template Node(char[] Data,Nodes...){
    alias Data data;
    alias Nodes nodes;
}

template PrintData(char[] parent){
    const char[] PrintData = "";
}

template PrintData(char[] parent,alias Node){
    static if(parent == ""){
        const char[] PrintData = Node.data ~ "\n" ~ PrintData!(Node.data,Node.nodes);
    }
    else{
        const char[] PrintData = parent ~ "." ~ Node.data ~ "\n"
            ~ PrintData!(parent ~ "." ~ Node.data,Node.nodes);
    }
}

template PrintData(char[] parent,alias Node,V...){
    const char[] PrintData = PrintData!(parent,Node) ~ PrintData!(parent,V);
}

// example "tree" structure
alias Node!("1",
    Node!("1"),
    Node!("2",
        Node!("1"),
        Node!("2")
    )
) dataset;

void main(){
    writefln("%s",PrintData!("",dataset));
}
 
  Strings are no where near as flexible out of the box, but
 pose no storage limitations, which gives a slight edge to string-based 
 representations of data.  With some clever coding, strings can be made 
 to store just about any structure imaginable (even if it does make for 
 some ugly code).
There's a couple of minor things, though: (1) in the case of generating D code to mixin, no tree is required (the string just gets mixed back in again, so it might as well stay as a string). (2) The "separate string and tuple" structure means the expression can be optimised without needing to move any of the values around; this creates less garbage for the compiler to optimise away. (3) Some patterns are probably easier to find with a string, than a tree. (4) It's more natural for D coders to think in terms of arrays, than lists. (and there is much more language support). (5) A string representation can easily express things like c=a+3*(a+b); where 'a' appears twice. (At present, there's no way to generate it, but theoretically the compiler could provide it many cases). So -- if trees were available, I'm not sure if I'd use them, or not.
They're really handy.  I've found some uses for them already:

/*
ZeroOrMoreExpr
    = generate.ZeroOrMoreExpr()
    ::= "["  "{"  Expression:expr  "}"  "]"  [RangeSetExpr:ranges ] [Binding:binding ]
        ["*" SubExpressionTerm:delim ] [SubExpressionTerm:term];
*/
bool ZeroOrMoreExpr(inout char[] value,inout char[] bindings,inout char[] tokens,inout char[] err){
    return Rule!(
        generate.ZeroOrMoreExpr,
        And!(
            Literal!("["),
            Literal!("{"),
            Bind!(Expression,"expr"),
            Literal!("}"),
            Literal!("]"),
            Optional!(
                Bind!(RangeSetExpr,"ranges")
            ),
            Optional!(
                Bind!(Binding,"binding")
            ),
            Optional!(
                And!(
                    Literal!("*"),
                    Bind!(SubExpressionTerm,"delim")
                )
            ),
            Bind!(SubExpressionTerm,"term")
        )
    )(value,bindings,tokens,err);
}

There's a *lot* going on in this sample.  Basically what you see here is a chunk of the as-of-yet-experimental compile-time Enki parser.  This piece parses the Zero-or-more expression part of the EBNF variant that Enki supports.

The templates evaluate to CTFE's that in turn make up the parser when it's executed.  So there's several layers of compile-time evaluation going on here.  Also, what's not obvious is that "char[] tokens" is actually an *array of strings* that is stored in a single char[] array; each string's length data is encoded as a size_t mapped onto the appropriate number of chars.  The Bind!() expressions also map parsed out data to a key/value set which is stored in a similar fashion (char[] bindings).

The goals here were to side-step the limitations of D's identifier name-length while providing a user-readable toolkit for parser composition; the only limit now is placed on the length of any given rule.  I haven't found a way to unify the compile-time and run-time parser generation yet (one backend with two targets), but I anticipate using something very similar to the above by providing a CT and RT version of the template library.  At a minimum I plan on having this consume a .bnf file at compile-time, in order to create a run-time parser via mixin.

-- 
- EricAnderton at yahoo
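
The length-prefix packing looks roughly like this (a simplified sketch with invented names, not Enki's actual code):

char[] packString(char[] list, char[] s)
{
    // append s to the packed list, preceded by its length spread
    // across size_t.sizeof chars (low byte first)
    char[] lenChars;
    size_t len = s.length;
    for (int i = 0; i < size_t.sizeof; i++)
    {
        lenChars ~= cast(char)(len & 0xFF);
        len >>= 8;
    }
    return list ~ lenChars ~ s;
}

char[] unpackFirst(char[] list)
{
    // recover the first string by decoding its length prefix
    size_t len = 0;
    for (int i = size_t.sizeof - 1; i >= 0; i--)
        len = (len << 8) | cast(size_t)list[i];
    return list[size_t.sizeof .. size_t.sizeof + len];
}

It's ugly, but it works at compile time, and it dodges the identifier-length problem entirely.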
Apr 09 2007
next sibling parent reply Don Clugston <dac nospam.com.au> writes:
Pragma wrote:
 Don Clugston wrote:
 Pragma wrote:
 Bruno Medeiros wrote:
 Pragma wrote:
 Don Clugston wrote:
 I have been trying to come up with a convincing use case for the 
 new mixins (and for metaprogramming in general). My best effort to 
 date can be found at:
 http://www.dsource.org/projects/mathextra/browser/trunk/mathextra/Blade.d 


 It generates near-optimal x87 asm code for BLAS1-style basic 
 vector operations. 32, 64 and 80 bit vectors are all supported.

 Compile with -version=BladeDebug to see the asm code which is 
 generated.

 Typical usage:

 void main()
 {
     auto p = Vec([1.0L, 2, 18]);    // a vector of 80-bit reals.
     auto q = Vec([3.5L, 1.1, 3.8]);  // ditto
     auto r = Vec([17.0f, 28.25, 1]); // a vector of 32-bit floats
     auto z = Vec([17.0i, 28.1i, 1i]); // a vector of 64-bit idoubles
     real d = dot(r, p+r+r);
     ireal e = dot(r, z);
     q -= ((r+p)*18.0L*314.1L - (p-r))* 35;
     d = dot(r, p+r+r);
 }

 Notice that mixed-length operations (real[] + float[] - double[]) 
 are supported.

 Like the C++ Blitz++ library, expression templates are used to 
 convert vector expressions into efficient element-wise operations. 
 Unlike that library, however, there is no reliance on the 
 compiler's optimiser. Instead, the expression template is 
 manipulated as text, converted into postfix, and then passed to a 
 simple CTFE compile-time assembler, which creates highly efficient 
 asm code which is used as a mixin.
 To understand the later parts of the code, you need some knowledge 
 of x87 assembler. In fact, you probably need to have read Agner 
 Fog's superb Pentium optimisation manual (www.agner.org).

 Some observations:
 * I was amazed at how simple the expression template code is (it 
 is somewhat cluttered by the code to check for real/imaginary type 
 mismatch errors).
 * I've often read that the x87 floating-point stack is notoriously 
 difficult for compilers to write code for, but it works quite well 
 in this case.
 * The major workarounds are:
 - inability to use a tuple element directly from asm code (bug 

 - inability to define operators for built-in arrays (hence the use 
 of 'Vec' wrappers).
 - inability to index through a tuple in a CTFE function (solved by 
 converting types into a string).
 * There have been mutterings about how unhygenic/dangerous the new 
 mixins are. In this case, the mixin forms the _entire_ body of the 
 function. This is an interesting situation which I think a 
 language purist will find more palatable.

 Enjoy.
This is a work of art Don - it's practically a compiler extension. Nice job. :) For others that are interested in how this actually gets the job done, I found this in your documentation: * THEORY: * Expression templates are used to create an expression string of the form "(a+b*c)+d" * and a tuple, the entries of which correspond to a, b, c, d, ... * This string is converted to postfix. The postfix string is converted to * a string containing x87 asm, which is then mixed into a function which accepts the tuple.
Hum, a minor question, is a string representation of the expressions better (easier to use, manipulate, etc.) than a tree representation?
I know Don wrote this lib, but I think I can answer this. In short: no. But it's really an unintentionally loaded question: there are some rather severe limitations as to what kind of data structures you can create at compile time. You're basically limited to tuples and strings, each of which have drawbacks of their own. You can create a tree using tuples, but then you run into problems with over-running the limitations on identifier length built into the compiler.
Actually I don't know how to make a tree nicely with tuples.
 Here's an example (for grins):
 
 import std.stdio;
 
 template Node(char[] Data,Nodes...){
     alias Data data;
     alias Nodes nodes;
 }
 
 // example "tree" structure
 alias Node!("1",
     Node!("1"),
     Node!("2",
         Node!("1"),
         Node!("2")
     )
 ) dataset;
The problem I was referring to, is: how to store both values, and functions/operators, inside the tree. It seems to get messy very quickly.
 So -- if trees were available, I'm not sure if I'd use them, or not.
They're really handy. I've found some uses for them already:
I meant for this application. There's no doubt they're indispensable in other contexts.
 
 /*
 ZeroOrMoreExpr
     = generate.ZeroOrMoreExpr()
     ::= "["  "{"  Expression:expr  "}"  "]"  [RangeSetExpr:ranges ] 
 [Binding:binding ]
         ["*" SubExpressionTerm:delim ] [SubExpressionTerm:term];
 */
 bool ZeroOrMoreExpr(inout char[] value,inout char[] bindings,inout 
 char[] tokens,inout char[] err){
     return Rule!(
         generate.ZeroOrMoreExpr,
         And!(
             Literal!("["),
             Literal!("{"),
             Bind!(Expression,"expr"),
             Literal!("}"),
             Literal!("]"),
             Optional!(
                 Bind!(RangeSetExpr,"ranges")
             ),           
             Optional!(
                 Bind!(Binding,"binding")
             ),
             Optional!(
                 And!(
                     Literal!("*"),
                     Bind!(SubExpressionTerm,"delim")
                 )
             ),
             Bind!(SubExpressionTerm,"term")
         )
     )(value,bindings,tokens,err);       
 }
 
 There's a *lot* going on in this sample.  Basically what you see here is 
 a chunk of the as-of-yet-experimental compile-time Enki parser.  This 
 piece parses the Zero-or-more expression part of the EBNF variant that 
 Enki supports.
 
 The templates evaluate to CTFE's that in turn make up the parser when 
 it's executed.  So there's several layers of compile-time evaluation 
 going on here.  Also, what's not obvious is that "char[] tokens" is 
 actually an *array of strings* that is stored in a single char[] array; 
 each string's length data is encoded as a size_t mapped onto the 
 appropriate number of chars.  The Bind!() expressions also map parsed 
 out data to a key/value set which is stored in a similar fashion (char[] 
 bindings).
Seriously cool! Seems like you're generating a tree of nested mixins? Anyway, I suspect this will really benefit from any compiler improvements, eg CTFE support for AAs or nested functions. And obviously AST macros.
 The goals here were to side-step the limitations of D's identifier 
 name-length while providing a user-readable toolkit for parser 
 composition; the only limit now is placed on the length of any given 
 rule.  I haven't found a way to unify the compile-time and run-time 
 parser generation yet (one backend with two targets), but I anticipate 
 using something very similar to the above by providing a CT and RT 
 version of the template library.  At a minimum I plan on having this 
 consume a .bnf file at compile-time, in order to create a run-time 
 parser via mixin.
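(That last step would presumably reduce to a one-liner like the following sketch. makeParser and grammar.bnf are made-up names, and it assumes string imports are enabled with the -J switch.)

// grammar.bnf is read at compile time; makeParser is assumed to be a
// CTFE function returning D source for the generated parser
mixin(makeParser(import("grammar.bnf")));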
I really think the new mixins + CTFE was a breakthrough. Publish some of this stuff, and I think we'll see an exodus of many Boost developers into D.
Apr 10 2007
next sibling parent reply Walter Bright <newshound1 digitalmars.com> writes:
Don Clugston wrote:
 I really think the new mixins + CTFE was a breakthrough. Publish some of 
 this stuff, and I think we'll see an exodus of many Boost developers 
 into D.
You guys gotta publish!
Apr 10 2007
next sibling parent reply Don Clugston <dac nospam.com.au> writes:
Walter Bright wrote:
 Don Clugston wrote:
 I really think the new mixins + CTFE was a breakthrough. Publish some 
 of this stuff, and I think we'll see an exodus of many Boost 
 developers into D.
You guys gotta publish!
I've begun a draft. A historical question for you --

On this page,
http://www.artima.com/cppsource/top_cpp_software.html
Scott Meyers says that g++ was the first C++ compiler to generate native code. But I thought Zortech was older than g++. Is that correct?
Apr 10 2007
parent Walter Bright <newshound1 digitalmars.com> writes:
Don Clugston wrote:
 I've begun a draft. A historical question for you --
 
 On this page,
 http://www.artima.com/cppsource/top_cpp_software.html
 Scott Meyers says that g++ was the first C++ compiler to generate native 
 code. But I thought Zortech was older than g++. Is that correct?
Depends on how you look at it. g++ was first 'released' in December, 1987. But the release notes for it say: "The GNU C++ Compiler is still in test release, and is NOT ready for everyday use" so I don't consider that a real release. Oregon Software released their native C++ compiler (not for the PC) in Jan or Feb 1988, nobody seems to recall exactly. Zortech's first release was in Jun 1988. Michael Tiemann, author of g++, calls the Sep 1988 release the "first really stable version."
Apr 10 2007
prev sibling parent Pragma <ericanderton yahoo.removeme.com> writes:
Walter Bright wrote:
 Don Clugston wrote:
 I really think the new mixins + CTFE was a breakthrough. Publish some 
 of this stuff, and I think we'll see an exodus of many Boost 
 developers into D.
You guys gotta publish!
Baby steps Walter, baby steps... ;) -- - EricAnderton at yahoo
Apr 10 2007
prev sibling parent Pragma <ericanderton yahoo.removeme.com> writes:
Don Clugston wrote:
 The problem I was referring to, is: how to store both values, and 
 functions/operators, inside the tree. It seems to get messy very quickly.
It can, especially with the additional code you need to treat all those templates as specialized data types.
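(To make that concrete, a value/operator tree might look something like the sketch below. This is purely illustrative, not code from BLADE or Enki; Leaf and BinOp are made-up names.)

// every node carries a 'kind' tag so that consumers can static-if on it
template Leaf(int v)
{
    const char[] kind = "value";
    const int value = v;
}

template BinOp(char[] op, alias Left, alias Right)
{
    const char[] kind = "binop";
    const char[] operator = op;
    alias Left  left;
    alias Right right;
}

// (2 + 3) * 4
alias BinOp!("*", BinOp!("+", Leaf!(2), Leaf!(3)), Leaf!(4)) expr;

Walking such a tree means writing a separate recursive template for every pass over it, which is where the extra bookkeeping code comes from.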
 I meant for this application. There's no doubt they're indispensable in 
 other contexts.
Oh, my bad. Yea, it would probably be overkill for BLADE. :)
 Basically what you see here
 is a chunk of the as-of-yet-experimental compile-time Enki parser.  
 This piece parses the Zero-or-more expression part of the EBNF variant 
 that Enki supports.

 The templates evaluate to CTFE's that in turn make up the parser when 
 it's executed.  So there's several layers of compile-time evaluation 
 going on here.  Also, what's not obvious is that "char[] tokens" is 
 actually an *array of strings* that is stored in a single char[] 
 array; each string's length data is encoded as a size_t mapped onto 
 the appropriate number of chars.  The Bind!() expressions also map 
 parsed out data to a key/value set which is stored in a similar 
 fashion (char[] bindings).
Seriously cool! Seems like you're generating a tree of nested mixins?
Almost. The generate.ZeroOrMore returns an arbitrary string that may be another "map" or a chunk of runtime code; the value is placed in the "value" string that's passed in. The *rootmost* generator is what compiles all this stuff into an actual chunk of mixin-able code. It's analogous to how the current rendition of Enki works. It seems overkill, but it's needed so I can do some basic semantic analysis, and do things like declare binding vars. I'm still looking for ways to simplify the process.
 Anyway, I suspect this will really benefit from any compiler 
 improvements, eg CTFE support for AAs or nested functions. And obviously 
 AST macros.
Definitely for AA's but not so much for AST macros - I get the impression that AST manipulation will only be useful for D code, and not for completely arbitrary grammars like EBNF. Getting compile-time AA support would cut Enki's code size down by almost a third:

const char[] hackedArray = "\x00\x00\x00\x05hello\x00\x00\x00\x05cruel\x00\x00\x00\x05world";

Under the hood, this is what most of my data looks like. Dropping the support routines for manipulating such structures would help make things a lot less cumbersome. :(
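(A sketch of the kind of support routine that implies -- hypothetical code, assuming the four-char big-endian length prefix shown above. With real compile-time AAs, none of this bookkeeping would be needed.)

// fetch the index-th string out of a packed char[] block
char[] getEntry(char[] packed, uint index)
{
    uint pos = 0;
    for (uint i = 0; pos + 4 <= packed.length; i++)
    {
        uint len = (cast(uint)packed[pos]   << 24)
                 | (cast(uint)packed[pos+1] << 16)
                 | (cast(uint)packed[pos+2] << 8)
                 |  cast(uint)packed[pos+3];
        pos += 4;
        if (i == index)
            return packed[pos .. pos + len];
        pos += len;
    }
    return "";
}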
 I really think the new mixins + CTFE was a breakthrough. Publish some of 
 this stuff, and I think we'll see an exodus of many Boost developers 
 into D.
Agreed. CTFE frees us from many limitations imposed by templates, and the added flexibility of mixin() pretty much gives us the same strength as JavaScript's eval() statement, at compile time. It's a huge deal. It's possible that we might find a handful of interested devs in their camp, but I think we're still back to the same old problem as with the rest of C++: momentum. I sincerely doubt we'll see a mass exodus from one camp to this one regardless of how good a job is done here. Of course, I'll be happy to be wrong about that. :) -- - EricAnderton at yahoo
Apr 10 2007
prev sibling parent reply KlausO <oberhofer users.sf.net> writes:
Pragma wrote:
 Don Clugston wrote:
 Pragma wrote:
 Bruno Medeiros wrote:
 Pragma wrote:
This is a work of art Don - it's practically a compiler extension. Nice job. :)

For others that are interested in how this actually gets the job done, I found this in your documentation:

* THEORY:
* Expression templates are used to create an expression string of the form "(a+b*c)+d"
* and a tuple, the entries of which correspond to a, b, c, d, ...
* This string is converted to postfix. The postfix string is converted to
* a string containing x87 asm, which is then mixed into a function which accepts the tuple.
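(The infix-to-postfix step is the most conventional part. The sketch below is not BLADE's converter -- it only handles single-character operands and the four binary operators -- but it shows that a perfectly ordinary CTFE function is enough.)

// operator precedence: higher binds tighter
int prec(char c)
{
    if (c == '+' || c == '-') return 1;
    if (c == '*' || c == '/') return 2;
    return 0;
}

// convert an infix expression with single-char operands to postfix
char[] toPostfix(char[] expr)
{
    char[] output = "";
    char[] stack  = "";
    for (int i = 0; i < expr.length; i++)
    {
        char c = expr[i];
        if (c == '(')
            stack ~= c;
        else if (c == ')')
        {
            while (stack.length > 0 && stack[stack.length-1] != '(')
            {
                output ~= stack[stack.length-1];
                stack = stack[0 .. stack.length-1];
            }
            stack = stack[0 .. stack.length-1];   // discard the '('
        }
        else if (prec(c) > 0)
        {
            while (stack.length > 0 && prec(stack[stack.length-1]) >= prec(c))
            {
                output ~= stack[stack.length-1];
                stack = stack[0 .. stack.length-1];
            }
            stack ~= c;
        }
        else
            output ~= c;   // an operand: emit it directly
    }
    while (stack.length > 0)
    {
        output ~= stack[stack.length-1];
        stack = stack[0 .. stack.length-1];
    }
    return output;
}

// toPostfix("(a+b*c)+d") returns "abc*+d+"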
Hum, a minor question, is a string representation of the expressions better (easier to use, manipulate, etc.) than a tree representation?
I know Don wrote this lib, but I think I can answer this. In short: no. But it's really an unintentionally loaded question: there are some rather severe limitations as to what kind of data structures you can create at compile time. You're basically limited to tuples and strings, each of which have drawbacks of their own. You can create a tree using tuples, but then you run into problems with over-running the limitations on identifier length built into the compiler.
Actually I don't know how to make a tree nicely with tuples.
Here's an example (for grins):

import std.stdio;

template Node(char[] Data,Nodes...){
    alias Data data;
    alias Nodes nodes;
}

template PrintData(char[] parent){
    const char[] PrintData = "";
}

template PrintData(char[] parent,alias Node){
    static if(parent == ""){
        const char[] PrintData = Node.data ~ "\n" ~
            PrintData!(Node.data,Node.nodes);
    }
    else{
        const char[] PrintData = parent ~ "." ~ Node.data ~ "\n" ~
            PrintData!(parent ~ "." ~ Node.data,Node.nodes);
    }
}

template PrintData(char[] parent,alias Node,V...){
    const char[] PrintData = PrintData!(parent,Node) ~ PrintData!(parent,V);
}

// example "tree" structure
alias Node!("1",
         Node!("1"),
         Node!("2",
            Node!("1"),
            Node!("2")
         )
      ) dataset;

void main(){
    writefln("%s",PrintData!("",dataset));
}
  Strings are no where near as flexible out of the box, but
 pose no storage limitations, which gives a slight edge to 
 string-based representations of data.  With some clever coding, 
 strings can be made to store just about any structure imaginable 
 (even if it does make for some ugly code).
There's a couple of minor things, though:
(1) in the case of generating D code to mixin, no tree is required (the string just gets mixed back in again, so it might as well stay as a string).
(2) The "separate string and tuple" structure means the expression can be optimised without needing to move any of the values around; this creates less garbage for the compiler to optimise away.
(3) Some patterns are probably easier to find with a string than a tree.
(4) It's more natural for D coders to think in terms of arrays than lists (and there is much more language support).
(5) A string representation can easily express things like c=a+3*(a+b); where 'a' appears twice. (At present, there's no way to generate it, but theoretically the compiler could provide it in many cases.)

So -- if trees were available, I'm not sure if I'd use them, or not.
Hey pragma,

really cool, I've come up with a similar structure while experimenting with a PEG parser in D (see attachment) after I've read this article series on Codeproject:

http://www.codeproject.com/cpp/crafting_interpreter_p1.asp
http://www.codeproject.com/cpp/crafting_interpreter_p2.asp
http://www.codeproject.com/cpp/crafting_interpreter_p3.asp

The template system of D does an awesome job in keeping templated PEG grammars readable.

BTW: If you turn Enki into a PEG style parser I definitely throw my attempts into the dustbin :-)

Greets
KlausO
Apr 10 2007
parent reply Pragma <ericanderton yahoo.removeme.com> writes:
KlausO wrote:
 
 Hey pragma,
 
 really cool, I've come up with a similar structure while experimenting
 with a PEG parser in D (see attachment)
 after I've read this article series on
 Codeproject:
 
 http://www.codeproject.com/cpp/crafting_interpreter_p1.asp
 http://www.codeproject.com/cpp/crafting_interpreter_p2.asp
 http://www.codeproject.com/cpp/crafting_interpreter_p3.asp
 
 The template system of D does an awesome job in keeping
 templated PEG grammars readable.
Yes it does! Thanks for posting this - I didn't even know that article was there.
 BTW: If you turn Enki into a PEG style parser I definitely throw
 my attempts into the dustbin :-)
 Greets
Wow, that's one heck of an endorsement. Thanks, but don't throw anything out yet. This rendition of Enki is still a ways off though. FYI I plan on keeping Enki's internals as human-readable as possible by keeping it self-hosting. So there'll be two ways to utilize the toolkit: EBNF coding and "by hand".
 alias   Action!(
           And!(
             PlusRepeat!(EmailChar),
             Char!(' '), 
             PlusRepeat!(
               And!(
                 Or!(
                   In!(
                     Range!('a', 'z'),
                     Range!('A', 'Z'),
                     Range!('0', '9'),
                     Char!('_'),
                     Char!('%'),
                     Char!('-')
                   ),
                   Char!('.')
                 ),
                 Not!(EmailSuffix)
               )
             ),
             Char!('.'),
             EmailSuffix
           ),
           delegate void(char[] email) { writefln("<email:", email, ">"); }
         )
         Email;
I had an earlier cut that looked a lot like this. :) But there's a very subtle problem lurking in there. By making your entire grammar one monster-sized template instance, you'll run into DMD's identifier-length limit *fast*. As a result it failed when I tried to transcribe Enki's ENBF definition. That's why I wrap each rule as a CTFE-style function as it side-steps that issue rather nicely, without generating too many warts. -- - EricAnderton at yahoo
Apr 10 2007
next sibling parent reply KlausO <oberhofer users.sf.net> writes:
Pragma wrote:
 KlausO wrote:
 Hey pragma,

 really cool, I've come up with a similar structure while experimenting
 with a PEG parser in D (see attachment)
 after I've read this article series on
 Codeproject:

 http://www.codeproject.com/cpp/crafting_interpreter_p1.asp
 http://www.codeproject.com/cpp/crafting_interpreter_p2.asp
 http://www.codeproject.com/cpp/crafting_interpreter_p3.asp

 The template system of D does an awesome job in keeping
 templated PEG grammars readable.
Yes it does! Thanks for posting this - I didn't even know that article was there.
 BTW: If you turn Enki into a PEG style parser I definitely throw
 my attempts into the dustbin :-)
 Greets
Wow, that's one heck of an endorsement. Thanks, but don't throw anything out yet. This rendition of Enki is still a ways off though. FYI I plan on keeping Enki's internals as human-readable as possible by keeping it self-hosting. So there'll be two ways to utilize the toolkit: EBNF coding and "by hand".
Nice to hear that I hit your taste :-)
 
 alias   Action!(
           And!(
             PlusRepeat!(EmailChar),
             Char!(' '),             PlusRepeat!(
               And!(
                 Or!(
                   In!(
                     Range!('a', 'z'),
                     Range!('A', 'Z'),
                     Range!('0', '9'),
                     Char!('_'),
                     Char!('%'),
                     Char!('-')
                   ),
                   Char!('.')
                 ),
                 Not!(EmailSuffix)
               )
             ),
             Char!('.'),
             EmailSuffix
           ),
           delegate void(char[] email) { writefln("<email:", email, 
 ">"); }
         )
         Email;
I had an earlier cut that looked a lot like this. :) But there's a very subtle problem lurking in there. By making your entire grammar one monster-sized template instance, you'll run into DMD's identifier-length limit *fast*. As a result it failed when I tried to transcribe Enki's EBNF definition. That's why I wrap each rule as a CTFE-style function, as it side-steps that issue rather nicely, without generating too many warts.
Another issue I ran into is circular template dependencies. You could get nasty error messages like

dpeg.d(282): Error: forward reference to 'And!(Alternative,OptRepeat!(And!(WS,Char!('/'),WS,Alternative)),WS)'

Any idea how they could be resolved ?

FYI: Other good PEG references:
http://pdos.csail.mit.edu/~baford/packrat/

http://www.codeproject.com/csharp/cat.asp
very interesting had been
http://code.google.com/p/cat-language/wiki/HowTheInterpreterWorks

C++ templated parsers
http://www.codeproject.com/cpp/yard-xml-parser.asp
http://www.codeproject.com/cpp/biscuit.asp
Apr 10 2007
parent reply Pragma <ericanderton yahoo.removeme.com> writes:
KlausO wrote:
 Pragma wrote:
 KlausO wrote:

 alias   Action!(
           And!(
             PlusRepeat!(EmailChar),
             Char!(' '),             PlusRepeat!(
               And!(
                 Or!(
                   In!(
                     Range!('a', 'z'),
                     Range!('A', 'Z'),
                     Range!('0', '9'),
                     Char!('_'),
                     Char!('%'),
                     Char!('-')
                   ),
                   Char!('.')
                 ),
                 Not!(EmailSuffix)
               )
             ),
             Char!('.'),
             EmailSuffix
           ),
           delegate void(char[] email) { writefln("<email:", email, 
 ">"); }
         )
         Email;
I had an earlier cut that looked a lot like this. :) But there's a very subtle problem lurking in there. By making your entire grammar one monster-sized template instance, you'll run into DMD's identifier-length limit *fast*. As a result it failed when I tried to transcribe Enki's ENBF definition. That's why I wrap each rule as a CTFE-style function as it side-steps that issue rather nicely, without generating too many warts.
Another issue I ran into is circular template dependencies. You could get nasty error messages like dpeg.d(282): Error: forward reference to 'And!(Alternative,OptRepeat!(And!(WS,Char!('/'),WS,Alternative)),WS)' Any idea how they could be resolved ?
Yep, that one bit me too; just use the CTFE trick I mentioned. Wrapping each rule fixes this since DMD can resolve forward references to functions, but not with template instances.
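(In other words, the pattern is roughly the sketch below. The rule names are made up, and the And!/Optional!/Literal! combinators are assumed to have the same shape as in the ZeroOrMoreExpr sample earlier in the thread.)

// Each grammar rule is a plain function; the combinator template is only
// instantiated inside the body.  'Expression' is referenced before it is
// declared, which DMD accepts for functions -- the same reference buried
// in one big template identifier would trip the forward-reference error.
bool Alternative(inout char[] value, inout char[] bindings,
                 inout char[] tokens, inout char[] err)
{
    return And!(
        Expression,
        Optional!( And!( Literal!("/"), Alternative ) )
    )(value, bindings, tokens, err);
}

bool Expression(inout char[] value, inout char[] bindings,
                inout char[] tokens, inout char[] err)
{
    return Literal!("x")(value, bindings, tokens, err);
}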
 
 FYI:
 
 Other good PEG references:
 http://pdos.csail.mit.edu/~baford/packrat/
 

 http://www.codeproject.com/csharp/cat.asp
 very interesting had been
 http://code.google.com/p/cat-language/wiki/HowTheInterpreterWorks
 
 C++ templated parsers
 http://www.codeproject.com/cpp/yard-xml-parser.asp
 http://www.codeproject.com/cpp/biscuit.asp
Thanks. I can always use a little more research to lean on. -- - EricAnderton at yahoo
Apr 10 2007
parent BCS <BCS pathlink.com> writes:
 KlausO wrote:
 
 Pragma schrieb:
 I had an earlier cut that looked a lot like this. :)  But there's a 
 very subtle problem lurking in there.  By making your entire grammar 
 one monster-sized template instance, you'll run into DMD's 
 identifier-length limit *fast*.  As a result it failed when I tried 
 to transcribe Enki's ENBF definition.  That's why I wrap each rule as 
 a CTFE-style function as it side-steps that issue rather nicely, 
 without generating too many warts.
Another issue I ran into is circular template dependencies. You could get nasty error messages like
The way my dparse sidesteps this (both the id length limit and forward references) is to have the parser functions specialized only on the name of the reduction; the grammar is carried in an outer scope (the mixed-in template that is never used as an identifier). This also allows the inner template to specialize on anything that is defined.
Apr 10 2007
prev sibling parent Walter Bright <newshound1 digitalmars.com> writes:
Pragma wrote:
 By making your entire grammar one 
 monster-sized template instance, you'll run into DMD's identifier-length 
 limit *fast*.  As a result it failed when I tried to transcribe Enki's 
 ENBF definition.  That's why I wrap each rule as a CTFE-style function 
 as it side-steps that issue rather nicely, without generating too many 
 warts.
One of the motivations for CTFE was to address that exact problem.
Apr 10 2007
prev sibling next sibling parent reply Witold Baryluk <baryluk mpi.int.pl> writes:
Hello,

Very good work. I'm impressed. Actually I was developing a very
similar program, but yours is practically all I was needing.
I'm developing a linear algebra package in D, and now I'm optimising
it for vector machines, so your BLAS1-like package will be
helpful. :)

Thx.

-- 
Witold Baryluk
Apr 03 2007
parent Don Clugston <dac nospam.com.au> writes:
Witold Baryluk wrote:
 Hello,
 
 Very good work. I'm impressed. Actually I was developing a very
 similar program, but yours is practically all I was needing.
 I'm developing a linear algebra package in D, and now I'm optimising
 it for vector machines, so your BLAS1-like package will be
 helpful. :)
Do you intend to implement BLAS2 and BLAS3-like functionality? I feel that I don't know enough about cache-efficient matrix blocking techniques to be confident in presenting code. I don't know how sophisticated the techniques are in libraries like ATLAS, but the ones used in Blitz++ should be easy to compete with.

Blade is still mostly proof-of-concept rather than being industrial-strength -- there are so many possibilities for improvement, it's not well tested, and I haven't even put any effort into making the code look nice. But as Davidl said, it shows that hard-core back-end optimisation can now be done in a library at compile time.

The code is quite modularised, so it would be straightforward to add a check for 'is it SSE-able' (ie, are all the vectors floats) and 'is it SSE2-able' (are all the vectors doubles), and if so, send them off into specialised assemblers, otherwise send them to the x87 one. The postfix-ing step will involve searching for *+ combinations where fma can be used.

The ability to know the vector length in many cases can be a huge advantage for SSE code, where loop unrolling by a factor of 2 or 4 is always necessary. I haven't got that far yet. I'm pretty sure that this type of technique can blow almost everything else out of the water. <g>
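(For what it's worth, the dispatch test itself is cheap to express. The following is only a sketch, not BLADE code; it assumes the tuple holds the vector operands, and makeSSE/makeX87 are hypothetical generator names.)

// true if every operand in the tuple is a float[] -- i.e. the whole
// expression could be routed to an SSE code generator
template allFloat(V...)
{
    static if (V.length == 0)
        const bool allFloat = true;
    else
        const bool allFloat = is(typeof(V[0][0]) == float)
                              && allFloat!(V[1 .. V.length]);
}

// inside the code generator:
//   static if (allFloat!(operands))  mixin(makeSSE(...));   // hypothetical
//   else                             mixin(makeX87(...));   // hypothetical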
Apr 04 2007
prev sibling next sibling parent Davidl <Davidl 126.com> writes:
I think compilers have felt a great challenge from compile-time back-end-like power.

Apr 03 2007
prev sibling next sibling parent Jascha Wetzel <"[firstname]" mainia.de> writes:
This is really neat.
I like that this basically is a compiler extension and it's still
a) well readable
b) produces simple error messages during compilation (if any)
c) introduces non-artificial syntax for the task

c) will actually depend more on the versatility of operator overloading
than the CTFE stuff.
a) and b) are so much better than what c++ template magic gave us (as
can be observed in boost).

Now I think there's only one requirement missing to make it a perfect
compiler extension, and that is to produce debuggable code.
Actually in this case, as the extension is supposed to generate mostly
single line expression statements, it might not be that important. But
if we consider multi-line DSL statements, a way to mark generated code
as belonging to one of the DSL source lines would be great.

I think a way for a CTFE function to receive the file+line where it was called
from would do the trick. In BLADE that information could be added to the
expression string, passed through to the postfix notation, and finally be
inserted into the ASM code with #line directives.
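(On the generator side that could be as little as the sketch below. itoa and withLineInfo are made-up helpers; the caller would have to pass __FILE__ and __LINE__ in explicitly, since a CTFE function can't currently discover its call site by itself.)

// turn a non-negative int into a string at compile time
char[] itoa(int n)
{
    char[] s = "";
    do {
        s = cast(char)('0' + n % 10) ~ s;
        n /= 10;
    } while (n > 0);
    return s;
}

// prefix generated code with a #line directive pointing at the DSL call site
char[] withLineInfo(char[] generatedCode, char[] file, int line)
{
    return "#line " ~ itoa(line) ~ " \"" ~ file ~ "\"\n" ~ generatedCode;
}

// at the call site, something like:
//   mixin(withLineInfo(makeAsm(expr), __FILE__, __LINE__));   // makeAsm is hypothetical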

Apr 05 2007
prev sibling next sibling parent reply Mikola Lysenko <mclysenk mtu.edu> writes:
Excellent work!  I have to say that I've had my misgivings about mixin 
statements, but I'm glad to see that BLADE and Don's hard work have 
proven me wrong.  Compile time vector expressions are desperately 
needed, and this library goes a long way to sorting things out.  As far 
as meta-type tricks go, this is by far the coolest I've ever seen in any 
  programming language.

One very interesting application I would like to see with this technique 
would be a compile time geometric algebra library.  If no one writes one 
by this summer, I will probably do it myself.  (Actually, when I heard 
the name BLADE, my heart skipped a beat thinking it was a GA library, 
but alas it is not...)  Maybe it is my unrestrained exuberance, I would 
guess that a strong D implementation should easily crush the leading C++ 
GA lib, GAIGEN in both performance and ease of use.

For reference, here is the GAIGEN homepage:
http://www.science.uva.nl/ga/gaigen/content_gaigen.html

-Mikola Lysenko
Apr 05 2007
next sibling parent torhu <fake address.dude> writes:
Mikola Lysenko wrote:
<snip>
 One very interesting application I would like to see with this technique 
 would be a compile time geometric algebra library.  If no one writes one 
 by this summer, I will probably do it myself.  (Actually, when I heard 
 the name BLADE, my heart skipped a beat thinking it was a GA library, 
 but alas it is not...)  Maybe it is my unrestrained exuberance, I would 
 guess that a strong D implementation should easily crush the leading C++ 
 GA lib, GAIGEN in both performance and ease of use.
 
Oh my GAD? Get it? ;)
Apr 05 2007
prev sibling parent reply Bill Baxter <dnewsgroup billbaxter.com> writes:
Mikola Lysenko wrote:
 Excellent work!  I have to say that I've had my misgivings about mixin 
 statements, but I'm glad to see that BLADE and Don's hard work have 
 proven me wrong.  Compile time vector expressions are desperately 
 needed, and this library goes a long way to sorting things out.  As far 
 as meta-type tricks go, this is by far the coolest I've ever seen in any 
  programming language.
 
 One very interesting application I would like to see with this technique 
 would be a compile time geometric algebra library.  If no one writes one 
 by this summer, I will probably do it myself.  (Actually, when I heard 
 the name BLADE, my heart skipped a beat thinking it was a GA library, 
 but alas it is not...)  Maybe it is my unrestrained exuberance, I would 
 guess that a strong D implementation should easily crush the leading C++ 
 GA lib, GAIGEN in both performance and ease of use.
 
 For reference, here is the GAIGEN homepage:
 http://www.science.uva.nl/ga/gaigen/content_gaigen.html
 
 -Mikola Lysenko
Wow, that looks really sweet. Last I heard on GA was that it was good for working out the math, but too slow for real use. Now it's fast too apparently! Now I've really got to learn GA. And a D compile-time fu version would be neat.

One question: from the GAIGEN page it sounds almost like GAIGEN generates off-line a set of primitive operations that are efficient for the chosen specific GA domain (3D euclidean or whatever). If so, then that sounds pretty good to me. What's wrong with using an off-line approach? Are there some expressions that are just too difficult to optimize ahead of time without seeing the whole expression as it will actually be used?

Or is it more like GAIGEN creates the primitive operations for a particular GA like dot product, cross product, and matrix multiplies? If so then it still maybe leaves a lot of optimization opportunities on the floor. Like evaluating A*B*C*vec as ((A*B)*C)*vec vs. (A*(B*(C*vec))) can make a big difference if ABC are big matrices and vec is a small vector.

--bb
Apr 05 2007
parent Mikola Lysenko <mclysenk mtu.edu> writes:
Bill Baxter wrote:
 Mikola Lysenko wrote:
 Excellent work!  I have to say that I've had my misgivings about mixin 
 statements, but I'm glad to see that BLADE and Don's hard work have 
 proven me wrong.  Compile time vector expressions are desperately 
 needed, and this library goes a long way to sorting things out.  As 
 far as meta-type tricks go, this is by far the coolest I've ever seen 
 in any  programming language.

 One very interesting application I would like to see with this 
 technique would be a compile time geometric algebra library.  If no 
 one writes one by this summer, I will probably do it myself.  
 (Actually, when I heard the name BLADE, my heart skipped a beat 
 thinking it was a GA library, but alas it is not...)  Maybe it is my 
 unrestrained exuberance, I would guess that a strong D implementation 
 should easily crush the leading C++ GA lib, GAIGEN in both performance 
 and ease of use.

 For reference, here is the GAIGEN homepage:
 http://www.science.uva.nl/ga/gaigen/content_gaigen.html

 -Mikola Lysenko
Wow, that looks really sweet. Last I heard on GA was that it was good for working out the math, but too slow for real use. Now it's fast too apparently! Now I've really got to learn GA. And a D compile-time fu version would be neat. One question: from the GAIGEN page it sounds almost like GAIGEN generates off-line a set of primitive operations that are efficient for the chosen specific GA domain (3D euclidean or whatever). If so, then that sounds pretty good to me. What's wrong with using an off-line approach? Are there some expressions things that are just too difficult to optimize ahead of time without seeing the whole expression as it will actually used? Or is it more like GAIGEN creates the primitives operations for a particular GA like dot product, cross product, and matrix multiplies? If so then it still maybe leaves a lot of optimization opportunities on the floor. Like evaluating A*B*C*vec as ((A*B)*C)*vec vs. (A*(B*(C*vec))) can make a big difference if ABC are big matrices and vec is a small vector. --bb
As I understand it, GAIGEN uses an external program to generate the implementation, and the implementation uses template expressions to actually perform the optimizations. The process seems a bit contrived to me, but I'm sure they have their reasons. Also if you look at the performance statistics on that site, you will see that the geometric algebra implementation is actually SLOWER than the standard linear algebra (but not by much.) The main advantage of GA when programming is that the code is shorter and simpler.

Recently, GA has been getting some more widespread acceptance and commercial success. Take Geomerics for example; they're a small company that specializes in graphics and physics software based on GA-inspired algorithms. Here is their website:

http://www.geomerics.com

They've got a very impressive realtime radiosity engine that seems to have gotten some press in the last few weeks. There's also some interesting stuff in the technology section on their website.

Writing a GA library for D is definitely on my todo list (right after I get done with this semester of classes.) I will probably work on it this summer unless someone beats me to it.

-Mik
Apr 05 2007
prev sibling next sibling parent reply Dan <murpsoft hotmail.com> writes:
Don Clugston Wrote:
 this case, I thought it would be relevant to AST macro development.
 Also I posted this to get an idea if there was much interest in this 
 sort of thing -- I don't have a great deal of free time, so I want to 
 make good use of it. I only got the code working a few days ago.
 The speed of the code will likely be affected by how well DMD optimises 
 the tuple copying. I even haven't looked at what code it's actually 
 generating.
 
In essence, it's a *demonstration* that with all of D's features combined, one can write a fwicked-optimized compiler at compile time in D. He's also working towards it being Agner Fog's ideal of optimal; and is taking the right steps to do it. Agner Fog is quite respected for his work in the x86 family of assembly; and previously I haven't seen any program that can *generate* code sufficiently well to approach his optimal assembly source.
Apr 06 2007
parent Don Clugston <dac nospam.com.au> writes:
Dan wrote:
 Don Clugston Wrote:
 this case, I thought it would be relevant to AST macro development.
 Also I posted this to get an idea if there was much interest in this 
 sort of thing -- I don't have a great deal of free time, so I want to 
 make good use of it. I only got the code working a few days ago.
 The speed of the code will likely be affected by how well DMD optimises 
 the tuple copying. I even haven't looked at what code it's actually 
 generating.
In essence, it's a *demonstration* that with all of D's features combined, one can write a fwicked-optimized compiler at compile time in D.
Exactly.
 He's also working towards it being Agner Fog's ideal of optimal; and is taking
the right steps to do it.  Agner Fog is quite respected for his work in the x86
family of assembly; and previously I haven't seen any program that can
*generate* code sufficiently well to approach his optimal assembly source.
It might be because no-one has really tried. Agner told me that he has *never* been emailed by a compiler writer, even though he's tried to get their attention; I find that astonishing. Someone recently posted a link to an article by Michael Abrash about his software DirectX 3D renderer. There are some similarities to this: he's developed a basic, efficient code structure, and his renderer takes little fragments of code and inserts them into it. Working out what the code structure should be (which is what compilers generally try to do) is a fantastically difficult problem, but splicing fragments together is pretty easy. OTOH it must be pretty hard to identify patterns like this after the initial code generation step.
Apr 07 2007
prev sibling parent Dan <murpsoft hotmail.com> writes:
janderson Wrote:

 That would be nice and would help, however I think it'll take more than 
 that to convince a game company to switch to D, particularly ones using 
 XNA.
Umm... A runtime-hosted Microsoft product? For anything high-performance, that's suicide! No wonder the games of the last couple of years have come out bloated to 8Gb.
 It would have to be a startup company that is not working on the 360, perhaps a 
 Nintendo, Sony, Linux or perhaps Windows one.  I say perhaps, because if a 
 company is going to switch to anything it's probably XNA.  Now if we had 
 D for .NET, maybe the story would be slightly different.
Oh! DotNet? Another runtime-hosted Microsoft environment? Brilliant. I would hope you've suggested this *only* because programmers would want to target the XBox360. Anything running on Linux would *hugely* suffer in performance running on DotNet compared to D; as would anything on the x86-64 platform - the performance hit would still be considerable on Windows, but nonetheless tolerable on modern computers.
 I do like the idea of using D for games, I just see it as being a hard 
 sell.  Scripting and tools are probably the main gateway for D into the 
 game programming world.
Well, with Walnut 2.x en route and Blade slowly evolving... I'd say that within the next year we ought to see a stupidly easy-to-edit scripting engine along with incredibly fast vector and floating point math. If someone eventually provides runtime AST reflection, we may see SSE2 and provable security follow shortly thereafter.
 
 Once a few game companies start using it in their main project and can 
 demonstrate productivity/speed improvements then the wall may indeed 
 start to crumble.
 
 -Joel
That's probably a fair assessment.
Apr 10 2007