www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - How templates work (1)

reply Stefan Koch <uplink.coder googlemail.com> writes:
Hi,

I've previously talked about the inherit compile-time impact of 
templates.
After talking  the problem through with a few people among them 
experienced D programmers, I am beginning to see that not 
everyone is aware what a template instantiation has to do.

let's take a simple identity template
template I(alias T)
{
     alias I = T;
}

now let's make an instance
I!(int) i1;

How does the compiler know what a template resolved to?
First it has to look up the _name_ 'I' from it's current scope, 
upwards.
// (depending on the depth (nestedness) of the scope that can 
actually take a lot of time.)
After it has found the template of the name 'I' it will now clone 
the body.
creating something like
{ alias I = _formal_param_1; }
after doing this it will now substitute the formal parameter 
(also known as just parameter) with the actual parameter (also 
known as argument).
resulting in this body :
{ alias I = int }
// (depending on the number of template parameters this too can 
be costly.)
then we substitute the newly created body with the instance which 
looks like this:
{ alias I = int } i1;
Of course this cannot compile since we don't pull a symbol out of 
the body.
I've for clarity omitted the handling of what is got publicized 
as the "eponymous template trick".
Before substituting we will actually try to look for a member in 
the template body which matches the name of the template and 
reference it.
Therefore

I!(int) i1;

actually results in:

{ alias I = int }.I i1;

and that's all there is to a simple templates instance.

let's repeat this for the type long

I!(long) l1;
copy the body
{ alias I = T; }
substitute the parameter
{ alias I = long; }
look for the eponymous member.
{ alias I = long; }.I
put it in place.
{ alias I = long; }.I l1;

Simple right?

And the fundamentals are really that simple.

However:
We have not covered how names are resolved in scopes;
Nor the intricacies of how to pick the right overload for a 
template.
And which implicit conversion happen for that;
Neither have we talked about deferred resolution of templates or
which steps semantic analysis has to take.

I will describe those in future posts as soon as I actually 
understand how the overload are picked ;)

Cheers,

Stefan
May 29
next sibling parent reply Stefan Koch <uplink.coder googlemail.com> writes:
On Friday, 29 May 2020 at 11:13:17 UTC, Stefan Koch wrote:
 Hi,

 I've previously talked about the inherit compile-time impact of 
 templates.
 After talking  the problem through with a few people among them 
 experienced D programmers, I am beginning to see that not 
 everyone is aware what a template instantiation has to do.

 [...]
P.S. If found this little article interesting please leave feedback. P.P.S. Do remind me to talk about mangling as this is a rather interesting process. (It too is very simple in principle)
May 29
parent reply Ben Jones <fake fake.fake> writes:
On Friday, 29 May 2020 at 11:17:51 UTC, Stefan Koch wrote:
 P.S. If found this little article interesting please leave 
 feedback.
Interesting and understandable. It might also be nice to mention which functions/methods are responsible for the various pieces. Agree it would be good for the blog at some point
May 29
parent reply Stefan Koch <uplink.coder googlemail.com> writes:
On Friday, 29 May 2020 at 15:17:55 UTC, Ben Jones wrote:
 On Friday, 29 May 2020 at 11:17:51 UTC, Stefan Koch wrote:
 P.S. If found this little article interesting please leave 
 feedback.
Interesting and understandable. It might also be nice to mention which functions/methods are responsible for the various pieces. Agree it would be good for the blog at some point
Those details would fill a few pages I am afraid. I'd like to talk about the what has to happen rather than how it happens. So that way it's a bit like reading the summery of a horror movie rather than watching the whole thing.
May 29
parent reply Bruce Carneal <bcarneal gmail.com> writes:
On Friday, 29 May 2020 at 16:04:06 UTC, Stefan Koch wrote:
 On Friday, 29 May 2020 at 15:17:55 UTC, Ben Jones wrote:
 On Friday, 29 May 2020 at 11:17:51 UTC, Stefan Koch wrote:
 P.S. If found this little article interesting please leave 
 feedback.
Interesting and understandable. It might also be nice to mention which functions/methods are responsible for the various pieces. Agree it would be good for the blog at some point
Those details would fill a few pages I am afraid. I'd like to talk about the what has to happen rather than how it happens. So that way it's a bit like reading the summery of a horror movie rather than watching the whole thing.
I would love to see a blog on this. I would also love to see a follow on blog, if you have time, on the performance gains you're realizing in the area. Your recent posts in other threads are very exciting.
May 29
parent reply Stefan Koch <uplink.coder googlemail.com> writes:
On Friday, 29 May 2020 at 16:57:38 UTC, Bruce Carneal wrote:
 On Friday, 29 May 2020 at 16:04:06 UTC, Stefan Koch wrote:
 On Friday, 29 May 2020 at 15:17:55 UTC, Ben Jones wrote:
 [...]
Those details would fill a few pages I am afraid. I'd like to talk about the what has to happen rather than how it happens. So that way it's a bit like reading the summery of a horror movie rather than watching the whole thing.
I would love to see a blog on this. I would also love to see a follow on blog, if you have time, on the performance gains you're realizing in the area. Your recent posts in other threads are very exciting.
Any Questions or suggestions are appreciated.
May 29
next sibling parent reply Max Samukha <maxsamukha gmail.com> writes:
On Friday, 29 May 2020 at 19:17:26 UTC, Stefan Koch wrote:
 On Friday, 29 May 2020 at 16:57:38 UTC, Bruce Carneal wrote:
 On Friday, 29 May 2020 at 16:04:06 UTC, Stefan Koch wrote:
 On Friday, 29 May 2020 at 15:17:55 UTC, Ben Jones wrote:
 [...]
Those details would fill a few pages I am afraid. I'd like to talk about the what has to happen rather than how it happens. So that way it's a bit like reading the summery of a horror movie rather than watching the whole thing.
I would love to see a blog on this. I would also love to see a follow on blog, if you have time, on the performance gains you're realizing in the area. Your recent posts in other threads are very exciting.
Any Questions or suggestions are appreciated.
I!"boo" -> { alias I = "boo"; }.I? Why do we still have to alias a = Alias!"boo"?
May 29
parent Stefan Koch <uplink.coder googlemail.com> writes:
On Friday, 29 May 2020 at 21:18:42 UTC, Max Samukha wrote:
 On Friday, 29 May 2020 at 19:17:26 UTC, Stefan Koch wrote:
 On Friday, 29 May 2020 at 16:57:38 UTC, Bruce Carneal wrote:
 On Friday, 29 May 2020 at 16:04:06 UTC, Stefan Koch wrote:
 On Friday, 29 May 2020 at 15:17:55 UTC, Ben Jones wrote:
 [...]
Those details would fill a few pages I am afraid. I'd like to talk about the what has to happen rather than how it happens. So that way it's a bit like reading the summery of a horror movie rather than watching the whole thing.
I would love to see a blog on this. I would also love to see a follow on blog, if you have time, on the performance gains you're realizing in the area. Your recent posts in other threads are very exciting.
Any Questions or suggestions are appreciated.
I!"boo" -> { alias I = "boo"; }.I? Why do we still have to alias a = Alias!"boo"?
You are correct. That is what will happen and the instance of I will alias to the string "boo". I might be wrong. But I believe the reason we have to do this is a parser issue. The parser expects a type after the alias {ident} = . The template instance will parse as a type while the string "boo" does not.
May 29
prev sibling parent NaN <divide by.zero> writes:
On Friday, 29 May 2020 at 19:17:26 UTC, Stefan Koch wrote:
 On Friday, 29 May 2020 at 16:57:38 UTC, Bruce Carneal wrote:
 On Friday, 29 May 2020 at 16:04:06 UTC, Stefan Koch wrote:
 On Friday, 29 May 2020 at 15:17:55 UTC, Ben Jones wrote:
 [...]
Those details would fill a few pages I am afraid. I'd like to talk about the what has to happen rather than how it happens. So that way it's a bit like reading the summery of a horror movie rather than watching the whole thing.
I would love to see a blog on this. I would also love to see a follow on blog, if you have time, on the performance gains you're realizing in the area. Your recent posts in other threads are very exciting.
Any Questions or suggestions are appreciated.
Id also love to see a blog post, but more in depth. It was very easy to follow, just wanted more of it. Maybe a few blog posts that lead into a talk for the virtual DConf later in the year? Compiler internals is one of the most interesting topics imo.
May 29
prev sibling next sibling parent reply Mike Parker <aldacron gmail.com> writes:
On Friday, 29 May 2020 at 11:13:17 UTC, Stefan Koch wrote:

 I will describe those in future posts as soon as I actually 
 understand how the overload are picked ;)
Blog post, perhaps? :-)
May 29
next sibling parent reply Stefan Koch <uplink.coder googlemail.com> writes:
On Friday, 29 May 2020 at 11:26:49 UTC, Mike Parker wrote:
 On Friday, 29 May 2020 at 11:13:17 UTC, Stefan Koch wrote:

 I will describe those in future posts as soon as I actually 
 understand how the overload are picked ;)
Blog post, perhaps? :-)
depends, did you find the article I wrote understandable and interesting? I'd like to have a test-run here in the forums before putting it on the official blog.
May 29
next sibling parent Mike Parker <aldacron gmail.com> writes:
On Friday, 29 May 2020 at 11:33:50 UTC, Stefan Koch wrote:

 depends, did you find the article I wrote understandable and 
 interesting?
Yes, it's an interesting topic. It would need to be fleshed out into article form for the blog though. I'd be happy to work with you on that.
 I'd like to have a test-run here in the forums before putting 
 it on the official blog.
Sure. When you've output all you want to say on the topic we can see about gathering shaping it up.
May 29
prev sibling parent jmh530 <john.michael.hall gmail.com> writes:
On Friday, 29 May 2020 at 11:33:50 UTC, Stefan Koch wrote:
 [snip]
 Blog post, perhaps? :-)
depends, did you find the article I wrote understandable and interesting? I'd like to have a test-run here in the forums before putting it on the official blog.
I think it was both understandable and interesting. It might be worthwhile to wait to publish on the blog until you get the complete series so that you can be sure everything is consistent, then publish once a month or something like that.
May 29
prev sibling parent RazvanN <razvan.nitu1305 gmail.com> writes:
On Friday, 29 May 2020 at 11:26:49 UTC, Mike Parker wrote:
 On Friday, 29 May 2020 at 11:13:17 UTC, Stefan Koch wrote:

 I will describe those in future posts as soon as I actually 
 understand how the overload are picked ;)
Blog post, perhaps? :-)
I think that this should go in the documentation or in some developers` manual. It is much easier to read it somewhere than to reverse engineer it from the code.
May 29
prev sibling parent reply tsbockman <thomas.bockman gmail.com> writes:
On Friday, 29 May 2020 at 11:13:17 UTC, Stefan Koch wrote:
 However:
 We have not covered how names are resolved in scopes;
 Nor the intricacies of how to pick the right overload for a 
 template.
 And which implicit conversion happen for that;
 Neither have we talked about deferred resolution of templates or
 which steps semantic analysis has to take.
Don't forget to discuss template memoization. As I'm sure you're well aware, it has massive performance consequences, both good and bad, as well as being semantically essential to some uses of templates. Greenspun's tenth rule of programming states: "Any sufficiently complicated C or Fortran program contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of Common Lisp." D very nearly fulfills this law (a tongue-in-cheek version of the "inner-platform effect") through its template system, except that it's probably not fair to call it "informally-specified". I argue that the root causes of the terrible compile-time performance, and especially memory consumption, of D's template system are forced memoization of intermediate results, and the lack of the tail recursion optimization. Who ever heard of a pure functional language without tail recursion optimization, or *any* programming language with inescapable memoization? There is no good reason to assign a long, permanent symbol to each and every partially sorted scrap of an AliasSeq that is generated in the midst of a compile-time sort, instead of just garbage collecting them. But my understanding is currently those intermediate results just stick around forever, massively raising the asymptotic memory complexity of otherwise benign algorithms.
May 29
parent reply Stefan Koch <uplink.coder googlemail.com> writes:
On Friday, 29 May 2020 at 19:46:04 UTC, tsbockman wrote:
 On Friday, 29 May 2020 at 11:13:17 UTC, Stefan Koch wrote:
 However:
 We have not covered how names are resolved in scopes;
 Nor the intricacies of how to pick the right overload for a 
 template.
 And which implicit conversion happen for that;
 Neither have we talked about deferred resolution of templates 
 or
 which steps semantic analysis has to take.
Don't forget to discuss template memoization. As I'm sure you're well aware, it has massive performance consequences, both good and bad, as well as being semantically essential to some uses of templates. Greenspun's tenth rule of programming states: "Any sufficiently complicated C or Fortran program contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of Common Lisp." D very nearly fulfills this law (a tongue-in-cheek version of the "inner-platform effect") through its template system, except that it's probably not fair to call it "informally-specified". I argue that the root causes of the terrible compile-time performance, and especially memory consumption, of D's template system are forced memoization of intermediate results, and the lack of the tail recursion optimization. Who ever heard of a pure functional language without tail recursion optimization, or *any* programming language with inescapable memoization? There is no good reason to assign a long, permanent symbol to each and every partially sorted scrap of an AliasSeq that is generated in the midst of a compile-time sort, instead of just garbage collecting them. But my understanding is currently those intermediate results just stick around forever, massively raising the asymptotic memory complexity of otherwise benign algorithms.
Without memorization some things would be completely impractical to do. Then again ignoring the implementation in dmd right now; There is no fundamental reason why anything has to be cached in a purely functional system. since computing it again should yield the same values. As it stands the cache does only do good. It is not the slow part. Creating tons of ast nodes is however. For practical reasons dmd does not free intermediate results, that would be the case even if the template instances were not cached.
May 29
parent reply tsbockman <thomas.bockman gmail.com> writes:
On Friday, 29 May 2020 at 19:57:26 UTC, Stefan Koch wrote:
 Without memorization some things would be completely 
 impractical to do.
Yes, but does that mean that *everything* needs to be memoized? In other programming languages (and that's all template D really is, a programming language), memoization is used as needed, not *all the time*.
 Then again ignoring the implementation in dmd right now;
 There is no fundamental reason why anything has to be cached in 
 a purely functional system.
 since computing it again should yield the same values.

 As it stands the cache does only do good.
 It is not the slow part.
 Creating tons of ast nodes is however.
A single instantiation of std.meta.staticSort with a few hundred items requires well over 100 MB of memory which will not be freed until the compiler exits. It seems like it's not a problem only because it's impossible to write meta-programs that process non-trivial amounts of data, so people just don't. (I would test with more items, but the compiler crashes due to hitting the template recursion limit - which is quite silly, given that the algorithm used, merge sort, only requires O(log(n)) non-tail levels in other programming languages.)
 For practical reasons dmd does not free intermediate results, 
 that would be the case
 even if the template instances were not cached.
DMD does try to free intermediate results with the "-lowmem" option: https://dlang.org/dmd-linux.html#switch-lowmem Of course, the usefulness of this option is severely limited due to the forced memoization.
May 29
parent reply Stefan Koch <uplink.coder googlemail.com> writes:
On Friday, 29 May 2020 at 21:38:20 UTC, tsbockman wrote:
 On Friday, 29 May 2020 at 19:57:26 UTC, Stefan Koch wrote:
 [...]
Yes, but does that mean that *everything* needs to be memoized? In other programming languages (and that's all template D really is, a programming language), memoization is used as needed, not *all the time*.
 [...]
A single instantiation of std.meta.staticSort with a few hundred items requires well over 100 MB of memory which will not be freed until the compiler exits. It seems like it's not a problem only because it's impossible to write meta-programs that process non-trivial amounts of data, so people just don't. (I would test with more items, but the compiler crashes due to hitting the template recursion limit - which is quite silly, given that the algorithm used, merge sort, only requires O(log(n)) non-tail levels in other programming languages.)
 [...]
DMD does try to free intermediate results with the "-lowmem" option: https://dlang.org/dmd-linux.html#switch-lowmem Of course, the usefulness of this option is severely limited due to the forced memoization.
I haven't seen staticSort so far. I shall have a look at it. The weka fork of ldc has a dynamically configure-able recursion limit. And dmd can be trivially patched to take it as a runtime option as well. Just in case you do want to see what happens if you hit the _real_ limit ;)
May 29
parent reply tsbockman <thomas.bockman gmail.com> writes:
On Friday, 29 May 2020 at 21:53:33 UTC, Stefan Koch wrote:
 I haven't seen staticSort so far.
 I shall have a look at it.
Your type function project, newCTFE, and the ability for the compiler to discard intermediate results in CTFE would, combined, enable meta algorithms that scale reasonably, I think. The ability to mark templates (or maybe specific instantiations?) noMemo to disable memoization would be another approach. Anyway, here's some test code I wrote: import std.meta, std.random, std.stdio; immutable(int[]) randomItems(int n) pure trusted nothrow { assert(__ctfe); auto rand = Xorshift(n); int[] ret = new int[n]; for(size_t x = 0; x < n; ++x) { ret[x] = rand.front; rand.popFront(); } return cast(immutable) ret; } struct S(int _x) { enum int x = _x; } enum int X(S) = S.x; template less(A, B) { enum bool less = A.x < B.x; } alias boom = staticSort!(less, staticMap!(S, aliasSeqOf!(randomItems(384)))); void main() { writeln([ staticMap!(X, boom) ]); }
 The weka fork of ldc has a dynamically configure-able recursion 
 limit.
 And dmd can be trivially patched to take it as a runtime option 
 as well.
 Just in case you do want to see what happens if you hit the 
 _real_ limit ;)
I have actually hit the real limit by accident before; that's how I discovered that it's possible to crash my operating system just by burning too much memory in user space. It's also why I now have 32 GB of RAM on my workstation. :-) For anyone else having trouble with out-of-memory related crashes on Linux, I recommend replacing dmd in your PATH with a script like this to prevent the OS from crashing: (using ulimit -v $MAX_MEMORY_IN_KB; /path/to/dmd $ )
May 29
parent tsbockman <thomas.bockman gmail.com> writes:
On Friday, 29 May 2020 at 22:31:54 UTC, tsbockman wrote:
 For anyone else having trouble with out-of-memory related 
 crashes on Linux, I recommend replacing dmd in your PATH with a 
 script like this to prevent the OS from crashing:

     (using ulimit -v $MAX_MEMORY_IN_KB; /path/to/dmd $ )
Oops, that should be: (ulimit -v $MAX_MEMORY_IN_KB; /path/to/dmd $ )
May 29