digitalmars.D - [GSoC] RFC: Thrift project proposal (draft)


David Nadlinger <see klickverbot.at> writes:

Hi all,

I am putting together a Google Summer of Code project proposal regarding 
the Apache Thrift idea (see the ideas page[1]), which I intend to 
officially submit as soon as the application period opens. You can find 
my first draft at http://klickverbot.at/code/gsoc/thrift/.

While I would love to hear any opinions, two specific questions:

Walter, could you as the organization admin please have a look at this 
if it meets your formal expectations (the application template section 
of the Digital Mars GSoC profile is still empty)?

Andrei, as you are the one behind the original suggestion, would you 
mind having a quick glance at the proposal? Do you have any experience 
with Thrift in production use from your work at Facebook?

David


P.S.: I am notoriously bad at writing »About me« sections, but from
reading around a bit I figured a GSoC application should include one…



[1] http://www.prowiki.org/wiki4d/wiki.cgi?GSOC_2011_Ideas

Mar 24 2011

"Robert Jacques" <sandford jhu.edu> writes:

On Thu, 24 Mar 2011 19:46:39 -0400, David Nadlinger <see klickverbot.at>  
wrote:

 Hi all,

 I am putting together a Google Summer of Code project proposal regarding  
 the Apache Thrift idea (see the ideas page[1]), which I intend to  
 officially submit as soon as the application period opens. You can find  
 my first draft at http://klickverbot.at/code/gsoc/thrift/.

 While I would love to hear any opinions, two specific questions:

 Walter, could you as the organization admin please have a look at this  
 if it meets your formal expectations (the application template section  
 of the Digital Mars GSoC profile is still empty)?

 Andrei, as you are the one behind the original suggestion, would you  
 mind having a quick glance at the proposal? Do you have any experience  
 with Thrift in production use from your work at Facebook?

 David


 P.S.: I am notoriously bad at writing »About me« sections, but from
 reading around a bit I figured a GSoC application should include one…



 [1] http://www.prowiki.org/wiki4d/wiki.cgi?GSOC_2011_Ideas

First and foremost, I would strongly recommend against looking at Thrift's
internals; if you do, the project _should not_ be submitted to Phobos.  
(Thrift is Apache License 2.0 which isn't compatible with the Boost  
License). Alternatively, you could aim to get the library into etc.*, or  
simply make it a D source project. I do feel that aiming for Phobos would  
strengthen your application though.

As for the project itself, I'd agree with you that due to certain,  
well-known CTFE bugs, you probably wouldn't be able to parse anything more  
than the simplest Thrift IDL at compile time today. But one of the major  
advantages of CTFE is that there is no difference between regular D  
functions and CTFE D, so you can develop a full Thrift IDL parser/code  
generator in D and then use it as part of a build today and as an input to
a string mixin tomorrow. I think playing up D's strengths, and that you  
are coding with an eye to the future, would strengthen your application.  
Currently, your proposal sounds like a simple port of a C++ library to D.  
This may be what you intend to do, but if so, you should clarify this in
your proposal.
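
The "same function at run time and at compile time" idea can be sketched as follows. This is a minimal illustration, not Thrift code: `makeStruct` and `Point` are hypothetical names, and the generator only stands in for the output stage of a real IDL code generator.

```d
import std.stdio;

// Hypothetical generator: builds the source of a struct with int fields.
// It stands in for the output stage of an IDL code generator.
string makeStruct(string name, string[] fields)
{
    string code = "struct " ~ name ~ "\n{\n";
    foreach (field; fields)
        code ~= "    int " ~ field ~ ";\n";
    return code ~ "}\n";
}

// Compile time: the generated source is fed to a string mixin.
mixin(makeStruct("Point", ["x", "y"]));

void main()
{
    // Run time: the very same function, e.g. inside a build tool
    // that writes the generated source to a file.
    write(makeStruct("Point", ["x", "y"]));

    auto p = Point(3, 4); // Point exists because of the mixin above
    writeln(p.x + p.y);
}
```

Whether a full parser can be evaluated this way depends on how much of it the compiler's CTFE engine accepts; the sketch only assumes string concatenation works at compile time.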

Regarding your writing, it's fairly solid, though it feels a bit too  
familiar for a formal proposal of work. (Though this might just be my  
academic background talking.) Also, I noticed a tendency for in-lined  
footnotes, à la "besides further working out the details of the project,",
or "I’d expect to further improve both the code generator and the binding  
code, along with the accompanying documentation.". I'd recommend focusing  
on the big things you want to do (like contacting the D and Thrift  
communities, working on documentation and unit tests, etc) and leave out
the expected day-to-day stuff. (i.e. Put the big rocks in the jar first  
and leave the gravel, sand and water to later :  
andrew.goenardi.com/big-rocks-and-a-jar)

While I don't have the time for a mentorship, I have been working on an  
update to std.json, std.variant/algebraic as well as my own binary  
serialization library, and am willing to share code and/or talk  
serialization/de-serialization design.

Mar 24 2011

David Nadlinger <see klickverbot.at> writes:

Hello Robert,

thank you for taking the time to read my proposal.

On 3/25/11 5:48 AM, Robert Jacques wrote:
 First and foremost, I would strongly recommend against looking at
 Thrifts internals; if you do, the project _should not_ be submitted to
 Phobos. (Thrift is Apache License 2.0 which isn't compatible with the
 Boost License). Alternatively, you could aim to get the library into
 etc.*, or simply make it a D source project. I do feel that aiming for
 Phobos would strengthen your application though.

This can certainly be discussed, but I don't think including this
project in Phobos would be the best choice, at least as long as an
external »interface compiler«, i.e. a code generator, is used. I would
rather try to make it a part of the official Thrift project. This is
how Thrift support was done for other languages, and keeping the code
generator in a different project than the library it targets does not
seem like a wise thing to do.

Although I'm not a lawyer, I have been involved with D long enough to be 
aware of a large part of the issues which can originate from Phobos 
being Boost-licensed. If we decide that we want to have Thrift support 
in Phobos itself, it would, strictly speaking, become hairy with regards 
to IP anyway, because at least as far as I can see, some protocol 
details are in fact implementation-defined.

Figuring these protocol details out from the code is just what I meant 
to do anyway, I'll clarify the draft with regard to this.


 […] But one of the major advantages of CTFE is that there is no
 difference between regular D functions and CTFE D, so you can develop
 a full Thrift IDL parser/code generator in D and then use it as part
 of a build today and as an input to a string mixin tomorrow. I think
 playing up D's strengths, and that you are coding with an eye to
 the future, would strengthen your application.

To be honest, I don't think this will be possible with D CTFE in the
near future until somebody steps forward and radically improves the 
current CTFE implementation (thinking of it, this might be a nice 
project for GSoC as well).

To back my pessimism a bit: I was doing a simple CTFE implementation of 
Gaussian elimination some weeks ago. Coming up with a version DMD would 
accept for compile-time values took me something like ten minutes, 
complete with runtime unittests. However, it took two more afternoons
of debugging (and two new wrong-code Bugzilla entries) before the CTFE
results would actually match the runtime values computed by the same
piece of code.

And that was for code specifically written to be CTFE-friendly. In my 
experience, trying to reuse non-trivial pieces of normal runtime code 
not written with CTFE in mind results in even more problems – for 
example, you can't even really use std.algorithm if you want your code 
to run under CTFE at the moment.

These issues make me skeptical about whether taking a possible future 
CTFE implementation into account is worth the hassle, even more so given 
the scope of the project (the official Thrift parser is something like 
3.5 kLOC, with another 4 kLOC for the actual C++ code generator).


 Regarding your writing, it's fairly solid, though it feels a bit too
 familiar for a formal proposal of work.

Yes, I am aware that it is written in a rather colloquial style, 
decidedly too colloquial if I were to apply e.g. for a research grant. 
But as this, as far as I know, is not even going to leave the D 
community, I was not at all sure about the right level of formality. 
Thanks for the suggestion, though, as I was planning to give it a 
stylistic overhaul before the official submission anyway.


 While I don't have the time for a mentorship, I have been working on an
 update to std.json, std.variant/algebraic as well as my own binary
 serialization library, and am willing to share code and/or talk
 serialization/de-serialization design.

Thank you for the offer, I'll certainly contact you if this project 
should be approved. Also, being able to build on a solid JSON library 
will probably also be helpful for this project, as Thrift includes a 
JSON-based protocol.

David

Mar 25 2011

Don <nospam nospam.com> writes:

David Nadlinger wrote:
 Hello Robert,
 
 thank you for taking the time to read my proposal.
 
 On 3/25/11 5:48 AM, Robert Jacques wrote:
 First and foremost, I would strongly recommend against looking at
 Thrifts internals; if you do, the project _should not_ be submitted to
 Phobos. (Thrift is Apache License 2.0 which isn't compatible with the
 Boost License). Alternatively, you could aim to get the library into
 etc.*, or simply make it a D source project. I do feel that aiming for
 Phobos would strengthen your application though.

 
 This can certainly be discussed, but I don't think including this 
 project into Phobos would be the best choice – at least as long as an 
 external »interface compiler«, i.e. generator would be used –, but 
 rather trying to make it a part of the official Thrift project. This is 
 how Thrift support was done for other languages, and having the code 
 generator implementation in another project than the library it targets 
 seems not like a wise thing to do.
 
 Although I'm not a lawyer, I have been involved with D long enough to be 
 aware of a large part of the issues which can originate from Phobos 
 being Boost-licensed. If we decide that we want to have Thrift support 
 in Phobos itself, it would, strictly speaking, become hairy with regards 
 to IP anyway, because at least as far as I can see, some protocol 
 details are in fact implementation-defined.
 
 Figuring these protocol details out from the code is just what I meant 
 to do anyway, I'll clarify the draft with regard to this.
 
 
 […] But one of the major advantages of CTFE is that there is no 
 difference between regular D functions and CTFE D, so you can develop 
 a full Thrift IDL parser/code generator in D and then use it as part 
 of a build today and as an input to a string mixin tomorrow. I think
 playing up D's strengths, and that you are coding with an eye to
 the future, would strengthen your application.
 
 To be honest, I don't think this will be possible with D CTFE in the
 near future until somebody steps forward and radically improves the 
 current CTFE implementation (thinking of it, this might be a nice 
 project for GSoC as well).

I'm giving CTFE a *major* overhaul right now. I don't know if I'll be 
finished in time for the next compiler release, but definitely by the 
release after that. Most importantly, bug 1330, which is the root cause 
of almost all of the problems, will be fixed. I hope to move CTFE out
of the "experimental feature" category.

Mar 26 2011

David Nadlinger <see klickverbot.at> writes:

On 3/26/11 5:16 PM, Don wrote:
 I'm giving CTFE a *major* overhaul right now. I don't know if I'll be
 finished in time for the next compiler release, but definitely by the
 release after that. Most importantly, bug 1330, which is the root cause
 of almost all of the problems, will be fixed. I hope to move CTFE out
 the "experimental feature" category.

That's great news – do you plan to put your work in progress up at 
GitHub somewhere before the official release? I'm playing around with 
CTFE quite a bit at the moment and plan to have a stab at making the 
basic parts of std.algorithm CTFE-able soon (Steve, did you find time to 
look at the Appender issue yet?), so I'd be glad to test any improvements…

David

Mar 26 2011

Don <nospam nospam.com> writes:

David Nadlinger wrote:
 On 3/26/11 5:16 PM, Don wrote:
 I'm giving CTFE a *major* overhaul right now. I don't know if I'll be
 finished in time for the next compiler release, but definitely by the
 release after that. Most importantly, bug 1330, which is the root cause
 of almost all of the problems, will be fixed. I hope to move CTFE out
 the "experimental feature" category.

 
 That's great news – do you plan to put your work in progress up at 
 GitHub somewhere before the official release? 

Yes, definitely. All my fixes go into my fork of dmd on github.
My CTFE work is progressing quite well. Simple test cases like the one 
in bug 1330 are working (and all the existing tests still pass, of 
course). It will be a while before I publish it to github, though -- the 
code is VERY untidy, and lots of stuff isn't implemented yet.

 I'm playing around with 
 CTFE quite a bit at the moment and plan to have a stab at making the 
 basic parts of std.algorithm CTFE-able soon (Steve, did you find time to 
 look at the Appender issue yet?), so I'd be glad to test any improvements…

My changes will make a *lot* more things work in CTFE. I recommend 
against spending much time making things CTFE-able right now.

Mar 26 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 3/26/11 3:01 PM, Don wrote:
 David Nadlinger wrote:
 On 3/26/11 5:16 PM, Don wrote:
 I'm giving CTFE a *major* overhaul right now. I don't know if I'll be
 finished in time for the next compiler release, but definitely by the
 release after that. Most importantly, bug 1330, which is the root cause
 of almost all of the problems, will be fixed. I hope to move CTFE out
 the "experimental feature" category.

 That's great news – do you plan to put your work in progress up at
 GitHub somewhere before the official release?

 Yes, definitely. All my fixes go into my fork of dmd on github.
 My CTFE work is progressing quite well. Simple test cases like the one
 in bug 1330 are working (and all the existing tests still pass, of
 course). It will be a while before I publish it to github, though -- the
 code is VERY untidy, and lots of stuff isn't implemented yet.

 I'm playing around with CTFE quite a bit at the moment and plan to
 have a stab at making the basic parts of std.algorithm CTFE-able soon
 (Steve, did you find time to look at the Appender issue yet?), so I'd
 be glad to test any improvements…

 My changes will make a *lot* more things work in CTFE. I recommend
 against spending much time making things CTFE-able right now.

This is absolutely awesome. Compile-time evaluation is a key strategic 
feature of D. Thank you!

Two questions - do you plan to allow class object creation a la new 
Widget? Also, since the upcoming features will be in time for GSoC 
projects, could you write a brief documentation project describing the 
scope of your improvements?


Thanks again,

Andrei

Mar 26 2011

Don <nospam nospam.com> writes:

Andrei Alexandrescu wrote:
 On 3/26/11 3:01 PM, Don wrote:
 David Nadlinger wrote:
 On 3/26/11 5:16 PM, Don wrote:
 I'm giving CTFE a *major* overhaul right now. I don't know if I'll be
 finished in time for the next compiler release, but definitely by the
 release after that. Most importantly, bug 1330, which is the root cause
 of almost all of the problems, will be fixed. I hope to move CTFE out
 the "experimental feature" category.

 That's great news – do you plan to put your work in progress up at
 GitHub somewhere before the official release?

 Yes, definitely. All my fixes go into my fork of dmd on github.
 My CTFE work is progressing quite well. Simple test cases like the one
 in bug 1330 are working (and all the existing tests still pass, of
 course). It will be a while before I publish it to github, though -- the
 code is VERY untidy, and lots of stuff isn't implemented yet.

 I'm playing around with CTFE quite a bit at the moment and plan to
 have a stab at making the basic parts of std.algorithm CTFE-able soon
 (Steve, did you find time to look at the Appender issue yet?), so I'd
 be glad to test any improvements…

 My changes will make a *lot* more things work in CTFE. I recommend
 against spending much time making things CTFE-able right now.

 
 This is absolutely awesome. Compile-time evaluation is a key strategic 
 feature of D. Thank you!
 
 Two questions - do you plan to allow class object creation a la new 
 Widget? 

Eventually. That requires some form of class literal to be created 
inside the compiler, so it's a bit more work.

 Also, since the upcoming features will be in time for GSoC 
 projects, could you write a brief documentation project describing the 
 scope of your improvements?

My plan at this stage is just to overhaul the existing functionality (so 
that everything that currently sort-of works or seems to work, actually 
DOES work).

Mar 26 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 03/27/2011 12:25 AM, Don wrote:
 Andrei Alexandrescu wrote:
 On 3/26/11 3:01 PM, Don wrote:
 David Nadlinger wrote:
 On 3/26/11 5:16 PM, Don wrote:
 I'm giving CTFE a *major* overhaul right now. I don't know if I'll be
 finished in time for the next compiler release, but definitely by the
 release after that. Most importantly, bug 1330, which is the root
 cause
 of almost all of the problems, will be fixed. I hope to move CTFE out
 the "experimental feature" category.

 That's great news – do you plan to put your work in progress up at
 GitHub somewhere before the official release?

 Yes, definitely. All my fixes go into my fork of dmd on github.
 My CTFE work is progressing quite well. Simple test cases like the one
 in bug 1330 are working (and all the existing tests still pass, of
 course). It will be a while before I publish it to github, though -- the
 code is VERY untidy, and lots of stuff isn't implemented yet.

 I'm playing around with CTFE quite a bit at the moment and plan to
 have a stab at making the basic parts of std.algorithm CTFE-able soon
 (Steve, did you find time to look at the Appender issue yet?), so I'd
 be glad to test any improvements…

 My changes will make a *lot* more things work in CTFE. I recommend
 against spending much time making things CTFE-able right now.

 This is absolutely awesome. Compile-time evaluation is a key strategic
 feature of D. Thank you!

 Two questions - do you plan to allow class object creation a la new
 Widget?

 Eventually. That requires some form of class literal to be created
 inside the compiler, so it's a bit more work.

Sounds great. Most of the advanced CTFE applications that I'm thinking 
of involve referential data types. Right now only arrays offer that in 
CTFE space, which is quite limiting.

Andrei

Mar 27 2011

bearophile <bearophileHUGS lycos.com> writes:

Don:

 I'm giving CTFE a *major* overhaul right now.

Thank you Don, you are doing a lot for the improvement of the D compiler :-)


 Most importantly, bug 1330, which is the root cause 
 of almost all of the problems, will be fixed. I hope to move CTFE out 
 the "experimental feature" category.

That seems to be the biggest source of problems for CT code. Another CT thing
I'd like to see improved is printing (bug 3952).

Bye,
bearophile

Mar 26 2011

dsimcha <dsimcha yahoo.com> writes:

On 3/26/2011 12:16 PM, Don wrote:
 I'm giving CTFE a *major* overhaul right now. I don't know if I'll be
 finished in time for the next compiler release, but definitely by the
 release after that. Most importantly, bug 1330, which is the root cause
 of almost all of the problems, will be fixed. I hope to move CTFE out
 the "experimental feature" category.

This is great news, I'm looking forward to it.  Thanks for the hard 
work.  Out of curiosity, can you give a brief overview of what new 
things CTFE will be usable for?

Mar 26 2011

Don <nospam nospam.com> writes:

dsimcha wrote:
 On 3/26/2011 12:16 PM, Don wrote:
 I'm giving CTFE a *major* overhaul right now. I don't know if I'll be
 finished in time for the next compiler release, but definitely by the
 release after that. Most importantly, bug 1330, which is the root cause
 of almost all of the problems, will be fixed. I hope to move CTFE out
 the "experimental feature" category.

 
 This is great news, I'm looking forward to it.  Thanks for the hard 
 work.  Out of curiosity, can you give a brief overview of what new 
 things CTFE will be usable for?

The basic problem with the current implementation of CTFE is that it 
uses copy-on-write. This means that references (including dynamic 
arrays) don't work properly -- they just copy a snapshot of the thing 
they are referencing. This is bug 1330. It also means it burns up memory 
like you wouldn't believe.

I'm changing CTFE to use in-place modification. This fixes all those 
issues. But this is obviously a fairly intense change, and will take 
quite a lot of time to iron out all the corner cases. So that's all I'm 
planning on doing right now.

But once that's done, it will be straightforward to implement other 
reference types, such as classes and pointers (pointer arithmetic will 
be restricted to pointers which point to array members). Once classes 
are implemented, it's straightforward to do exceptions. So, pretty much 
everything.
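
A small check of what "references work properly" means here, written as a sketch. The function name `writeThroughAlias` is hypothetical, and this is one reading of the description above, not the reduced test case from bug 1330. With reference semantics in CTFE, both checks should agree; with snapshot copies, the compile-time one would presumably see a stale value.

```d
// Hypothetical test function: b is a second reference to a's elements.
int writeThroughAlias()
{
    int[] a = [1, 2, 3];
    int[] b = a;   // copies the reference, not the elements
    b[0] = 10;     // with reference semantics this also changes a[0]
    return a[0];
}

// Evaluated by the compiler (CTFE).
static assert(writeThroughAlias() == 10);

void main()
{
    // Evaluated at run time.
    assert(writeThroughAlias() == 10);
}
```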

I've been planning on doing this for over a year, but while Walter was 
working on 64-bit, I felt that I was the only one working on the 
showstopper wrong-code bugs and regressions, so I put this 
important-but-not-urgent stuff aside.

Mar 26 2011

dsimcha <dsimcha yahoo.com> writes:

On 3/26/2011 4:16 PM, Don wrote:
 dsimcha wrote:
 On 3/26/2011 12:16 PM, Don wrote:
 I'm giving CTFE a *major* overhaul right now. I don't know if I'll be
 finished in time for the next compiler release, but definitely by the
 release after that. Most importantly, bug 1330, which is the root cause
 of almost all of the problems, will be fixed. I hope to move CTFE out
 the "experimental feature" category.

 This is great news, I'm looking forward to it. Thanks for the hard
 work. Out of curiosity, can you give a brief overview of what new
 things CTFE will be usable for?

 The basic problem with the current implementation of CTFE is that it
 uses copy-on-write. This means that references (including dynamic
 arrays) don't work properly -- they just copy a snapshot of the thing
 they are referencing. This is bug 1330. It also means it burns up memory
 like you wouldn't believe.

Right.  IIUC there's also no way to free the memory from copies that are 
no longer referenced.  I can see where this would leak memory like a sieve.

 I'm changing CTFE to use in-place modification. This fixes all those
 issues. But this is obviously a fairly intense change, and will take
 quite a lot of time to iron out all the corner cases. So that's all I'm
 planning on doing right now.

This is a _huge_ improvement, but does it address the issue of freeing 
memory or is that beyond the scope?

 But once that's done, it will be straightforward to implement other
 reference types, such as classes and pointers (pointer arithmetic will
 be restricted to pointers which point to array members). Once classes
 are implemented, it's straightforward to do exceptions. So, pretty much
 everything.

Excellent.

 I've been planning on doing this for over a year, but while Walter was
 working on 64-bit, I felt that I was the only one working on the
 showstopper wrong-code bugs and regressions, so I put this
 important-but-not-urgent stuff aside.

Agreed.  I love the 64-bit support (I've been using it for real work and 
it's surprisingly solid) but the pace of fixing miscellaneous bugs was 
understandably glacial while it was being implemented.

Mar 26 2011

Don <nospam nospam.com> writes:

dsimcha wrote:
 On 3/26/2011 4:16 PM, Don wrote:
 dsimcha wrote:
 On 3/26/2011 12:16 PM, Don wrote:
 I'm giving CTFE a *major* overhaul right now. I don't know if I'll be
 finished in time for the next compiler release, but definitely by the
 release after that. Most importantly, bug 1330, which is the root cause
 of almost all of the problems, will be fixed. I hope to move CTFE out
 the "experimental feature" category.

 This is great news, I'm looking forward to it. Thanks for the hard
 work. Out of curiosity, can you give a brief overview of what new
 things CTFE will be usable for?

 The basic problem with the current implementation of CTFE is that it
 uses copy-on-write. This means that references (including dynamic
 arrays) don't work properly -- they just copy a snapshot of the thing
 they are referencing. This is bug 1330. It also means it burns up memory
 like you wouldn't believe.

 
 Right.  IIUC there's also no way to free the memory from copies that are 
 no longer referenced.  I can see where this would leak memory like a sieve.

That's not the big problem, actually. The issue is that x[7]=6;
duplicates x, even if x has 10K elements.
Now consider:
for(int i=0; i<x.length; ++i) x[i]=3;
// creates 100M new elements!! Should create none, or 10K at most.
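
The arithmetic behind the 100M figure can be reproduced with a run-time model. This is only a sketch that assumes every indexed write duplicates the whole array first; `elementsCopied` is a hypothetical helper, not compiler code.

```d
// Hypothetical model of "every indexed write duplicates the array".
// Returns the total number of elements copied while filling n slots.
size_t elementsCopied(size_t n)
{
    auto x = new int[n];
    size_t copied = 0;
    foreach (i; 0 .. n)
    {
        x = x.dup;          // whole-array copy before the write
        copied += x.length;
        x[i] = 3;
    }
    return copied;
}

void main()
{
    // 1_000 writes into 1_000 elements copy 1_000 * 1_000 elements.
    assert(elementsCopied(1_000) == 1_000_000);
    // For the 10K-element array above: 10_000 * 10_000 = 100_000_000.
}
```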


 I'm changing CTFE to use in-place modification. This fixes all those
 issues. But this is obviously a fairly intense change, and will take
 quite a lot of time to iron out all the corner cases. So that's all I'm
 planning on doing right now.

 
 This is a _huge_ improvement, but does it address the issue of freeing 
 memory or is that beyond the scope?

Outside the scope, but it will use an order of magnitude less memory in 
the first place, in the cases which are causing the biggest problems 
(such as the one I showed above).


 But once that's done, it will be straightforward to implement other
 reference types, such as classes and pointers (pointer arithmetic will
 be restricted to pointers which point to array members). Once classes
 are implemented, it's straightforward to do exceptions. So, pretty much
 everything.

 
 Excellent.
 
 I've been planning on doing this for over a year, but while Walter was
 working on 64-bit, I felt that I was the only one working on the
 showstopper wrong-code bugs and regressions, so I put this
 important-but-not-urgent stuff aside.

 
 Agreed.  I love the 64-bit support (I've been using it for real work and 
 it's surprisingly solid) but the pace of fixing miscellaneous bugs was 
 understandably glacial while it was being implemented.

Mar 26 2011

"Robert Jacques" <sandford jhu.edu> writes:

On Sat, 26 Mar 2011 16:57:34 -0400, Don <nospam nospam.com> wrote:
 dsimcha wrote:
 On 3/26/2011 4:16 PM, Don wrote:
 I'm changing CTFE to use in-place modification. This fixes all those
 issues. But this is obviously a fairly intense change, and will take
 quite a lot of time to iron out all the corner cases. So that's all I'm
 planning on doing right now.

  This is a _huge_ improvement, but does it address the issue of freeing  
 memory or is that beyond the scope?

 Outside the scope, but it will use an order of magnitude less memory in  
 the first place, in the cases which are causing the biggest problems  
 (such as the one I showed above).

How hard would it be for the compiler to allocate all the memory for a  
CTFE evaluation on a second heap, dup the final output and then trash the  
entire heap? Or is that how CTFE already works?

Also, thanks a bunch for working on this bug.

Mar 26 2011

spir <denis.spir gmail.com> writes:

On 03/26/2011 09:57 PM, Don wrote:
 The basic problem with the current implementation of CTFE is that it
 uses copy-on-write. This means that references (including dynamic
 arrays) don't work properly -- they just copy a snapshot of the thing
 they are referencing. This is bug 1330. It also means it burns up memory
 like you wouldn't believe.

 Right.  IIUC there's also no way to free the memory from copies that are no
 longer referenced.  I can see where this would leak memory like a sieve.

 That's not the big problem, actually. The issue is that x[7]=6;
 duplicates x, even if x has 10K elements.
 Now consider:
    for(int i=0; i<x.length; ++i) x[i]=3;
    // creates 100M new elements!! Should create none, or 10K at most.

Hello Don,

I don't understand your point. I once implemented a toy dynamic language,
using the common trick of boxed elements (à la Lisp). But I wanted to maintain
value semantics as standard. A cheap way to do that is copy on write; it is
actually cheap since simple, atomic elements are never copied (since they
cannot be changed in place), thus one just needs to trace complex elements
(array-lists & named tuples in my case):
	x := [1,2,3]	// create the array value, assign its ref
	y := x		// copy the ref, mark the value as shared
	x[1] := 0	// copy the value, reassign the ref, then change
But the new value is not shared, thus:
	x[1] := 1	// change only

So that in your loop example, at most one array copy happens (iff it was 
shared). This is as far as I know what is commonly called copy-on-write. There 
is no need to copy the value over and over again on every change if it is not 
multiply referenced, and no one does that, I guess.
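
The scheme described above (copy the ref, mark the value as shared, copy only on the first write to a shared value) can be sketched in D. The `Box` and `Value` types and the `copies` counter are hypothetical names introduced for illustration; this is not how the toy language or the compiler implements it.

```d
// Hypothetical boxed array value with a "shared" mark.
final class Box
{
    int[] items;
    bool isShared;
    this(int[] items) { this.items = items; }
}

struct Value
{
    Box box;
    static size_t copies; // counts whole-array copies, for illustration

    // y := x   copy the ref, mark the value as shared
    Value share()
    {
        box.isShared = true;
        return Value(box);
    }

    // x[i] := v   copy first only if the value is shared
    void set(size_t i, int v)
    {
        if (box.isShared)
        {
            box = new Box(box.items.dup); // fresh, unshared copy
            ++copies;
        }
        box.items[i] = v;
    }
}

void main()
{
    auto x = Value(new Box([1, 2, 3]));
    auto y = x.share();

    foreach (i; 0 .. 3)
        x.set(i, 3);                  // only the first write copies

    assert(Value.copies == 1);
    assert(x.box.items == [3, 3, 3]);
    assert(y.box.items == [1, 2, 3]); // y keeps the old value
}
```

One limitation of a plain flag: the original box stays marked as shared, so a later write through y would also copy once, even though y is by then the only holder.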

Side-Note: assignments of the form of "y := x" are really special, at least 
conceptually; but also practically when pointers or refs enter the game. I call 
them "symbol assignments" as the source is a symbol.

Denis
-- 
_________________
vita es estrany
spir.wikidot.com

Mar 27 2011

"Robert Jacques" <sandford jhu.edu> writes:

On Sun, 27 Mar 2011 08:36:39 -0400, spir <denis.spir gmail.com> wrote:

 On 03/26/2011 09:57 PM, Don wrote:
 The basic problem with the current implementation of CTFE is that it
 uses copy-on-write. This means that references (including dynamic
 arrays) don't work properly -- they just copy a snapshot of the thing
 they are referencing. This is bug 1330. It also means it burns up  
 memory
 like you wouldn't believe.

 Right.  IIUC there's also no way to free the memory from copies that  
 are no
 longer referenced.  I can see where this would leak memory like a  
 sieve.

 That's not the big problem, actually. The issue is that x[7]=6;
 duplicates x, even if x has 10K elements.
 Now consider:
    for(int i=0; i<x.length; ++i) x[i]=3;
    // creates 100M new elements!! Should create none, or 10K at most.

 Hello Don,

 I don't understand your point. I have once implemented a toy dynamic  
 language, using the common trick of boxed elements (à la Lisp). But I
 wanted to maintain value semantics as standard. A cheap way to do that
 is copy on write; it is actually cheap since simple, atomic elements
 are never copied (since they cannot be changed in place), thus one
 just needs to trace complex elements (array-lists & named tuples in my
 case):
 	x := [1,2,3]	// create the array value, assign its ref
 	y := x		// copy the ref, mark the value as shared
 	x[1] := 0	// copy the value, reassign the ref, then change
 But the new value is not shared, thus:
 	x[1] := 1	// change only

 So that in your loop example, at most one array copy happens (iff it was  
 shared). This is as far as I know what is commonly called copy-on-write.  
 There is no need to copy the value over and over again on every change  
 if it is not multiple-referenced, and no one does that, I guess.

 Side-Note: assignments of the form of "y := x" are really special, at  
 least conceptually; but also practically when pointers or refs enter the  
 game. I call them "symbol assignments" as the source is a symbol.

 Denis

Hi Denis,
What Don is explaining is not how you should implement copy-on-write,  
etc., but the actual implementation of arrays in DMD's CTFE system. Right  
now, any access to an array in CTFE causes the entire array to be  
duplicated, which is a major memory and performance issue, to say nothing  
of the fact that D arrays are supposed to have reference, not value  
semantics. I don't know how or why this behavior was ever introduced, only  
that it is awesome that Don is fixing it.
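
To make the numbers in Don's loop example concrete, here is a small D sketch that only models the cost; it is not DMD's interpreter code, and the function name and the `dupOnWrite` switch are made up for this illustration. Duplicating an array of n elements before each of n writes copies n × n elements, which for n = 10K is the 100M mentioned above.

```d
/// Counts how many elements get copied while assigning to every index
/// of an n-element array. With `dupOnWrite`, the whole array is
/// duplicated before each write, modelling the behaviour described in
/// the thread; without it, the array is modified in place.
size_t elementsCopied(size_t n, bool dupOnWrite)
{
    auto x = new int[n];
    size_t copied = 0;
    foreach (i; 0 .. n)
    {
        if (dupOnWrite)
        {
            x = x.dup;           // whole-array duplicate before the write
            copied += x.length;
        }
        x[i] = 3;
    }
    return copied;
}

unittest
{
    // 1K elements: 1K writes, each copying 1K elements
    assert(elementsCopied(1_000, true) == 1_000_000);
    // in-place modification copies nothing
    assert(elementsCopied(1_000, false) == 0);
}
```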

Mar 27 2011

Don <nospam nospam.com> writes:

Robert Jacques wrote:
 On Sun, 27 Mar 2011 08:36:39 -0400, spir <denis.spir gmail.com> wrote:
 
 On 03/26/2011 09:57 PM, Don wrote:
 The basic problem with the current implementation of CTFE is that it
 uses copy-on-write. This means that references (including dynamic
 arrays) don't work properly -- they just copy a snapshot of the thing
 they are referencing. This is bug 1330. It also means it burns up 
 memory
 like you wouldn't believe.

 Right.  IIUC there's also no way to free the memory from copies that 
 are no
 longer referenced.  I can see where this would leak memory like a 
 sieve.

 That's not the big problem, actually. The issue is that x[7]=6;
 duplicates x, even if x has 10K elements.
 Now consider:
    for(int i=0; i<x.length; ++i) x[i]=3;
    // creates 100M new elements!! Should create none, or 10K at most.

 Hello Don,

 I don't understand your point. I have once implemented a toy dynamic 
 language, using the common trick of boxed elements (à la Lisp). But I 
 wanted to maintain value semantics as standard. A cheap way to do that 
 is copy on write; it is actually cheap since simple, atomic elements 
 are never copied (since they cannot be changed in place), thus one 
 just needs to trace complex elements (array-lists & named tuples 
 in my case):
     x := [1,2,3]    // create the array value, assign its ref
     y := x        // copy the ref, mark the value as shared
     x[1] := 0    // copy the value, reassign the ref, then change
 But the new value is not shared, thus:
     x[1] := 1    // change only

 So that in your loop example, at most one array copy happens (iff it 
 was shared). This is as far as I know what is commonly called 
 copy-on-write. There is no need to copy the value over and over again 
 on every change if it is not multiple-referenced, and no one does that, 
 I guess.

 Side-Note: assignments of the form of "y := x" are really special, at 
 least conceptually; but also practically when pointers or refs enter 
 the game. I call them "symbol assignments" as the source is a symbol.

 Denis

 
 Hi Denis,
 What Don is explaining is not how you should implement copy-on-write, 
 etc., but the actual implementation of arrays in DMD's CTFE system. 

Exactly.

 Right now, any access to an array in CTFE causes the entire array to be 
 duplicated, which is a major memory and performance issue, to say 
 nothing of the fact that D arrays are supposed to have reference, not 
 value semantics. I don't know how or why this behavior was ever 
 introduced, only that it is awesome that Don is fixing it.

I believe it was a quick hack to get things working. But it needs to 
disappear.
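
For reference, the reference semantics that D arrays are supposed to have can be checked with a short sketch like the following. The function names are made up for this example, and the `static assert` lines assume a compiler in which bug 1330 is fixed; with the CTFE implementation discussed in this thread they would be expected to fail.

```d
/// Writes through a second reference and reports whether the
/// first reference sees the change.
bool writeVisibleThroughAlias()
{
    int[] x = [1, 2, 3];
    int[] y = x;      // y refers to the same elements, no copy is made
    y[0] = 42;
    return x[0] == 42;
}

/// Fills the caller's array in place through the slice parameter.
void fill(int[] a, int value)
{
    foreach (ref e; a)
        e = value;
}

bool fillIsVisibleToCaller()
{
    auto x = new int[4];
    fill(x, 3);
    return x == [3, 3, 3, 3];
}

// Run-time behaviour
unittest
{
    assert(writeVisibleThroughAlias());
    assert(fillIsVisibleToCaller());
}

// Same checks at compile time (assumption: CTFE with in-place modification)
static assert(writeVisibleThroughAlias());
static assert(fillIsVisibleToCaller());
```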

Mar 27 2011

Andrej Mitrovic <andrej.mitrovich gmail.com> writes:

I remember a few months ago I've tried using CTFE and import
expressions to load a .def file and generate at compile-time a runtime
DLL loading mechanism in a class which would load a DLL file and
create wrapper functions for DLL functions. It would also add
try{}catch{} blocks based on a naming scheme and if -debug was
enabled. But I've had some big issues with string handling at
compile-time.

I'm not sure if I was doing something wrong or if CTFE was just
inadequate at the time (I do remember having some trouble using
foreach loops and some unfriendly CTFE error messages). I'll give it
another shot soon. Anyhow, it's great seeing someone working to
improve CTFE. Thanks, Don!

Mar 27 2011

Andrej Mitrovic <andrej.mitrovich gmail.com> writes:

On 3/27/11, Andrej Mitrovic <andrej.mitrovich gmail.com> wrote:
 I remember a few months ago I've tried using CTFE and import
 expressions to load a .def file and generate at compile-time a runtime
 DLL loading mechanism in a class which would load a DLL file and
 create wrapper functions for DLL functions.

Found it. It doesn't actually load a .def file, and it wouldn't make
much sense since a def file doesn't have much except a list of symbol
names. It generates code that links function pointers to a DLL at
runtime, and creates wrapper functions which take care of calling the
C code.

First I'd create a struct with a list of function prototypes. Then I'd
just mixin() a string inside a class. The function that creates the
string to be mixed in first checks the return values of the function
prototypes, and based on that it can add code that throws on invalid
values.

Here's an example of a generated class at compile-time:
https://gist.github.com/892698
or if that doesn't display right: http://dl.dropbox.com/u/9218759/result.d

Of course much more could be done here. The generated functions could
take strings instead of char pointers and call toStringz on them when
calling a function pointer. And we could use ref instead of pointers
for other parameters.
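
The general technique (a CTFE function builds a string, which is then mixed into a class) can be sketched as follows. This is not the code from the gist: `makeWrapper`, `Library`, the `int function(int)` signature and the "negative result means failure" rule are all assumptions chosen to keep the example small and platform-independent. A real loader would declare the pointers `extern(C)` and fill them from the DLL.

```d
import std.exception : enforce;

/// CTFE-able generator: returns the source for a function pointer member
/// plus a wrapper that throws when the pointer is unset or the call
/// reports failure (assumed here: a negative return value).
string makeWrapper(string name)
{
    return
        "int function(int) " ~ name ~ "Ptr;\n" ~
        "int " ~ name ~ "(int arg)\n" ~
        "{\n" ~
        "    enforce(" ~ name ~ "Ptr !is null, \"" ~ name ~ " is not loaded\");\n" ~
        "    immutable result = " ~ name ~ "Ptr(arg);\n" ~
        "    enforce(result >= 0, \"" ~ name ~ " failed\");\n" ~
        "    return result;\n" ~
        "}\n";
}

class Library
{
    // Each mixin adds one pointer member and one checked wrapper method.
    mixin(makeWrapper("twice"));
    mixin(makeWrapper("half"));
}

unittest
{
    import std.exception : assertThrown;

    auto lib = new Library;
    assertThrown(lib.twice(1));       // pointer not set yet

    // Stand-in for a pointer obtained from a DLL
    lib.twicePtr = function int(int a) { return a * 2; };
    assert(lib.twice(21) == 42);
    assertThrown(lib.twice(-1));      // negative result is treated as failure
}
```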

Mar 29 2011

Andrej Mitrovic <andrej.mitrovich gmail.com> writes:

Ok this is the thing that really gets me with CTFE:

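import std.conv : to;
import std.stdio : writeln;

void printFields(T)(T t)
{
    enum fields = [__traits(allMembers, T)];

    foreach (string field; fields)
    {
        // fail: `field` is a loop variable, not a compile-time constant
        mixin("writeln(t." ~ to!string(field) ~ ");");
        // ok: indexing the enum with a literal is evaluated at compile time
        mixin("writeln(t." ~ to!string(fields[0]) ~ ");");
    }
}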

Even though the foreach loop itself works, `field` can't be read at
compile time, so it can't be used inside the mixin.

Once we have that working, it's heaven.
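
One possible way around this, sketched under the assumption of a reasonably recent D compiler: iterate over the `__traits(allMembers, T)` sequence directly instead of over an array built from it. The loop is then unrolled at compile time and `field` is a constant in each copy of the body. `Point` and `describeFields` are made-up names, and the sketch only handles plain data members.

```d
import std.conv : to;

struct Point
{
    int x = 1;
    int y = 2;
}

/// Collects "name=value;" for every member of `t`.
string describeFields(T)(T t)
{
    string result;
    // Looping over the compile-time sequence itself (not an array built
    // from it) unrolls the loop, so `field` can be used in the mixin.
    foreach (field; __traits(allMembers, T))
    {
        result ~= field ~ "=" ~ to!string(mixin("t." ~ field)) ~ ";";
    }
    return result;
}

unittest
{
    assert(describeFields(Point(3, 4)) == "x=3;y=4;");
}
```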

Mar 31 2011

David Nadlinger <see klickverbot.at> writes:

On 3/26/11 9:16 PM, Don wrote:
 The basic problem with the current implementation of CTFE is that it
 uses copy-on-write. This means that references (including dynamic
 arrays) don't work properly -- they just copy a snapshot of the thing
 they are referencing. This is bug 1330. It also means it burns up memory
 like you wouldn't believe.

 I'm changing CTFE to use in-place modification. This fixes all those
 issues. But this is obviously a fairly intense change, and will take
 quite a lot of time to iron out all the corner cases. So that's all I'm
 planning on doing right now.

 But once that's done, it will be straightforward to implement other
 reference types, such as classes and pointers (pointer arithmetic will
 be restricted to pointers which point to array members). Once classes
 are implemented, it's straightforward to do exceptions. So, pretty much
 everything.

First of all, let me say again that I am really looking forward to your 
changes, as I even considered having a go at solving the referencing 
issue myself for a while (but without sound knowledge of the compiler 
internals, this is an even harder thing to pull off).

Do I understand correctly that your changes wouldn't introduce some form 
of real compile-time memory management, but alleviate the need for it by 
fixing bug 1330 and related ones, thus cutting down on the ridiculous 
amount of copying going on today?

And finally – I know such questions are tough to answer –, do you have a 
rough estimate on how long it will take you to get the basic set of 
changes ready for testing? This is somewhat relevant for me, as the 
coding period for GSoC is going to start in about two months from now, 
and I think that with current DMD, doing the Thrift compiler in CTFE 
might be infeasible due to memory usage.

Thanks a lot for your work,
David

Mar 26 2011

David Nadlinger <see klickverbot.at> writes:

On 3/26/11 9:59 PM, David Nadlinger wrote:
 On 3/26/11 9:16 PM, Don wrote:
 The basic problem with the current implementation of CTFE is that it
 uses copy-on-write. This means that references (including dynamic
 arrays) don't work properly -- they just copy a snapshot of the thing
 they are referencing. This is bug 1330. It also means it burns up memory
 like you wouldn't believe.

 I'm changing CTFE to use in-place modification. This fixes all those
 issues. But this is obviously a fairly intense change, and will take
 quite a lot of time to iron out all the corner cases. So that's all I'm
 planning on doing right now.

 But once that's done, it will be straightforward to implement other
 reference types, such as classes and pointers (pointer arithmetic will
 be restricted to pointers which point to array members). Once classes
 are implemented, it's straightforward to do exceptions. So, pretty much
 everything.

 […]

 Do I understand correctly that your changes wouldn't introduce some form
 of real compile-time memory management, but alleviate the need for it by
 fixing bug 1330 and related ones, thus cutting down on the ridiculous
 amount of copying going on today?

Ah, forget that part, I wasn't aware of David's post asking the same 
question (and your answer to it) when I wrote this message.

David

Mar 26 2011

Jacob Carlborg <doob me.com> writes:

On 2011-03-26 21:16, Don wrote:
 dsimcha wrote:
 On 3/26/2011 12:16 PM, Don wrote:
 I'm giving CTFE a *major* overhaul right now. I don't know if I'll be
 finished in time for the next compiler release, but definitely by the
 release after that. Most importantly, bug 1330, which is the root cause
 of almost all of the problems, will be fixed. I hope to move CTFE out
 the "experimental feature" category.

 This is great news, I'm looking forward to it. Thanks for the hard
 work. Out of curiosity, can you give a brief overview of what new
 things CTFE will be usable for?

 The basic problem with the current implementation of CTFE is that it
 uses copy-on-write. This means that references (including dynamic
 arrays) don't work properly -- they just copy a snapshot of the thing
 they are referencing. This is bug 1330. It also means it burns up memory
 like you wouldn't believe.

 I'm changing CTFE to use in-place modification. This fixes all those
 issues. But this is obviously a fairly intense change, and will take
 quite a lot of time to iron out all the corner cases. So that's all I'm
 planning on doing right now.

 But once that's done, it will be straightforward to implement other
 reference types, such as classes and pointers (pointer arithmetic will
 be restricted to pointers which point to array members). Once classes
 are implemented, it's straightforward to do exceptions. So, pretty much
 everything.

 I've been planning on doing this for over a year, but while Walter was
 working on 64-bit, I felt that I was the only one working on the
 showstopper wrong-code bugs and regressions, so I put this
 important-but-not-urgent stuff aside.

Will the time it takes to compile heavy uses of CTFE be affected by this 
(positive or negative)?

-- 
/Jacob Carlborg

Mar 27 2011

"Robert Jacques" <sandford jhu.edu> writes:

On Sun, 27 Mar 2011 06:06:48 -0400, Jacob Carlborg <doob me.com> wrote:

 On 2011-03-26 21:16, Don wrote:
 dsimcha wrote:
 On 3/26/2011 12:16 PM, Don wrote:
 I'm giving CTFE a *major* overhaul right now. I don't know if I'll be
 finished in time for the next compiler release, but definitely by the
 release after that. Most importantly, bug 1330, which is the root  
 cause
 of almost all of the problems, will be fixed. I hope to move CTFE out
 the "experimental feature" category.

 This is great news, I'm looking forward to it. Thanks for the hard
 work. Out of curiosity, can you give a brief overview of what new
 things CTFE will be usable for?

 The basic problem with the current implementation of CTFE is that it
 uses copy-on-write. This means that references (including dynamic
 arrays) don't work properly -- they just copy a snapshot of the thing
 they are referencing. This is bug 1330. It also means it burns up memory
 like you wouldn't believe.

 I'm changing CTFE to use in-place modification. This fixes all those
 issues. But this is obviously a fairly intense change, and will take
 quite a lot of time to iron out all the corner cases. So that's all I'm
 planning on doing right now.

 But once that's done, it will be straightforward to implement other
 reference types, such as classes and pointers (pointer arithmetic will
 be restricted to pointers which point to array members). Once classes
 are implemented, it's straightforward to do exceptions. So, pretty much
 everything.

 I've been planning on doing this for over a year, but while Walter was
 working on 64-bit, I felt that I was the only one working on the
 showstopper wrong-code bugs and regressions, so I put this
 important-but-not-urgent stuff aside.

 Will the time it takes to compile heavy uses of CTFE be affected by this  
 (positive or negative)?

Any string heavy CTFE should see a major improvement in performance.

Mar 27 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On 2011-03-27 08:41, Robert Jacques wrote:
 On Sun, 27 Mar 2011 06:06:48 -0400, Jacob Carlborg <doob me.com> wrote:
 On 2011-03-26 21:16, Don wrote:
 dsimcha wrote:
 On 3/26/2011 12:16 PM, Don wrote:
 I'm giving CTFE a *major* overhaul right now. I don't know if I'll be
 finished in time for the next compiler release, but definitely by the
 release after that. Most importantly, bug 1330, which is the root
 cause
 of almost all of the problems, will be fixed. I hope to move CTFE out
 the "experimental feature" category.

 
 This is great news, I'm looking forward to it. Thanks for the hard
 work. Out of curiosity, can you give a brief overview of what new
 things CTFE will be usable for?

 
 The basic problem with the current implementation of CTFE is that it
 uses copy-on-write. This means that references (including dynamic
 arrays) don't work properly -- they just copy a snapshot of the thing
 they are referencing. This is bug 1330. It also means it burns up memory
 like you wouldn't believe.
 
 I'm changing CTFE to use in-place modification. This fixes all those
 issues. But this is obviously a fairly intense change, and will take
 quite a lot of time to iron out all the corner cases. So that's all I'm
 planning on doing right now.
 
 But once that's done, it will be straightforward to implement other
 reference types, such as classes and pointers (pointer arithmetic will
 be restricted to pointers which point to array members). Once classes
 are implemented, it's straightforward to do exceptions. So, pretty much
 everything.
 
 I've been planning on doing this for over a year, but while Walter was
 working on 64-bit, I felt that I was the only one working on the
 showstopper wrong-code bugs and regressions, so I put this
 important-but-not-urgent stuff aside.

 
 Will the time it takes to compile heavy uses of CTFE be affected by this
 (positive or negative)?

 
 Any string heavy CTFE should see a major improvement in performance.

Yeah. Considering how memory-heavy CTFE tends to be, I'd expect that such a 
massive drop in memory consumption would almost always result in a performance 
improvement. However, we could end up being surprised with how it actually 
performs, since how the performance characteristics of an application change 
as you change it can sometimes be very surprising. I would generally expect it 
to improve performance though, not harm it. And it should definitely make some 
CTFE which currently fails due to a lack of memory actually work.

- Jonathan M Davis

Mar 27 2011

Jacob Carlborg <doob me.com> writes:

On 2011-03-25 00:46, David Nadlinger wrote:
 Hi all,

 I am putting together a Google Summer of Code project proposal regarding
 the Apache Thrift idea (see the ideas page[1]), which I intend to
 officially submit as soon as the application period opens. You can find
 my first draft at http://klickverbot.at/code/gsoc/thrift/.

 While I would love to hear any opinions, two specific questions:

 Walter, could you as the organization admin please have a look at this
 if it meets your formal expectations (the application template section
 of the Digital Mars GSoC profile is still empty)?

 Andrei, as you are the one behind the original suggestion, would you
 mind having a quick glance at the proposal? Do you have any experience
 with Thrift in production use from your work at Facebook?

 David


 P.S.: I am notoriously bad at writing »About me« sections, but from
 reading around a bit I figured a GSoC application should include one…



 [1] http://www.prowiki.org/wiki4d/wiki.cgi?GSOC_2011_Ideas

Don't know if this will be any problem with the Thrift protocol, 
especially since C++ is supported, but D has very limited runtime 
reflection support, making it unnecessarily hard to implement serialization.

-- 
/Jacob Carlborg

Mar 25 2011

David Nadlinger <see klickverbot.at> writes:

On 3/25/11 3:04 PM, Jacob Carlborg wrote:
 Don't know if this will be any problem with the Thrift protocol,
 especially since C++ is supported, but D has very limited runtime
 reflection support, making it unnecessarily hard to implement serialization.

Thrift and other, similar projects (like Google's Protocol Buffers) go 
the other way round anyway – you first define the data formats and RPC 
interfaces, and then use code generated from the definition to work with 
them in your application.

David
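
As a rough illustration of the "definition first, generated code second" workflow, here is a D sketch that turns a tiny definition string into a struct at compile time. The mini format (`type name` pairs separated by `;`) is invented for this example and is not Thrift's IDL, and `pieces`, `dType` and `structFromDefinition` are hypothetical helpers, not part of any Thrift library.

```d
/// Splits `s` at `sep`, dropping empty pieces (CTFE-friendly, no Phobos).
string[] pieces(string s, char sep)
{
    string[] result;
    size_t start = 0;
    foreach (i; 0 .. s.length + 1)
    {
        if (i == s.length || s[i] == sep)
        {
            if (i > start)
                result ~= s[start .. i];
            start = i + 1;
        }
    }
    return result;
}

/// Maps a type name of the made-up format to a D type name.
string dType(string idlType)
{
    switch (idlType)
    {
        case "i32":    return "int";
        case "i64":    return "long";
        case "string": return "string";
        case "bool":   return "bool";
        default:       assert(0, "unsupported type: " ~ idlType);
    }
}

/// Builds the source code of a struct from the definition string.
string structFromDefinition(string name, string definition)
{
    string code = "struct " ~ name ~ "\n{\n";
    foreach (decl; pieces(definition, ';'))
    {
        auto parts = pieces(decl, ' ');   // e.g. ["i32", "id"]
        if (parts.length != 2)
            continue;                     // skip malformed declarations
        code ~= "    " ~ dType(parts[0]) ~ " " ~ parts[1] ~ ";\n";
    }
    return code ~ "}\n";
}

// The definition is evaluated at compile time and the struct is generated.
mixin(structFromDefinition("User", "i32 id; string name; bool active"));

unittest
{
    auto u = User(7, "don", true);
    assert(u.id == 7 && u.name == "don" && u.active);
    static assert(is(typeof(User.id) == int));
}
```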

Mar 25 2011

Jacob Carlborg <doob me.com> writes:

On 2011-03-25 15:28, David Nadlinger wrote:
 On 3/25/11 3:04 PM, Jacob Carlborg wrote:
 Don't know if this will be any problem with the Thrift protocol,
 especially since C++ is supported, but D has very limited runtime
 reflection support, making it unnecessarily hard to implement serialization.

 Thrift and other, similar projects (like Google's Protocol Buffers) go
 the other way round anyway – you first define the data formats and RPC
 interfaces, and then use code generated from the definition to work with
 them in your application.

 David

Ok, I see.

-- 
/Jacob Carlborg

Mar 25 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 3/24/11 4:46 PM, David Nadlinger wrote:
 Hi all,

 I am putting together a Google Summer of Code project proposal regarding
 the Apache Thrift idea (see the ideas page[1]), which I intend to
 officially submit as soon as the application period opens. You can find
 my first draft at http://klickverbot.at/code/gsoc/thrift/.

 While I would love to hear any opinions, two specific questions:

 Walter, could you as the organization admin please have a look at this
 if it meets your formal expectations (the application template section
 of the Digital Mars GSoC profile is still empty)?

 Andrei, as you are the one behind the original suggestion, would you
 mind having a quick glance at the proposal? Do you have any experience
 with Thrift in production use from your work at Facebook?

This is a strong proposal that I will back up. I have shared it inside 
Facebook and two fellow engineers offered to help with Thrift-related 
questions you might have.

Andrei

Mar 26 2011

David Nadlinger <see klickverbot.at> writes:

I just revised the proposal and submitted it via Google's official 
interface, so don't be confused if you can't find it on my website any 
longer.

David

Mar 28 2011
