digitalmars.D - CTFE and DI: The Crossroads of D

reply "Adam Wilson" <flyboynw gmail.com> writes:
Hello Everyone,

I am afraid that we as a community have reached an inflection point. The  
crossroads of CTFE and DI.

I recently completed my work on a patch for the DI generation system. In  
the process I solicited the feedback of the community about what should  
and should not be in a DI file. The most agreed-upon point was that all
functions that can lose their implementations should. In the community's
opinion that means only auto-functions and template-functions should
retain their implementations.
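
As a rough illustration of the rule described above, here is a minimal sketch of a module and, in comments, what its generated header might contain. The function names are hypothetical and the header shown is an assumption about the proposed behavior, not output copied from dmd -H.

```d
// Hypothetical module contents; none of these names come from druntime or Phobos.

// Ordinary function: the header only needs the signature.
int area(int w, int h) { return w * h; }

// auto function: the return type is inferred from the body, so the body must stay.
auto doubled(int x) { return x * 2; }

// Template function: it is instantiated in the importing module, so the body must stay.
T larger(T)(T a, T b) { return a > b ? a : b; }

// Under the rules described above, the generated .di file would contain roughly:
//
//   int area(int w, int h);
//   auto doubled(int x) { return x * 2; }
//   T larger(T)(T a, T b) { return a > b ? a : b; }

void main()
{
    assert(area(3, 4) == 12);
    assert(doubled(21) == 42);
    assert(larger(2, 7) == 7);
}
```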

The problem is thus: CTFE requires that any function that could
possibly be evaluated by CTFE must retain its implementation.
Unfortunately, there is simply no way for the DI generation system to know
which functions are capable of being called by CTFE and which ones
actually are.

This limitation is due to the fact that DI generation must be run before  
Semantic Analysis because said analysis may perform significant rewrites  
of the AST. There is even a large (for DMD) comment in the main function  
of DMD explaining that DI generation should not be moved from where it is  
due to the inconsistencies that could arise.

The patch I created currently fails in the autotester because the template
function dur() in druntime is called via CTFE from Phobos. Per the
community-agreed DI rules, the implementation of the
Duration constructor that is called by dur() is stripped away
and CTFE fails.
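
The failure mode can be sketched with simplified stand-ins. Span and span() below are hypothetical and only mimic the shape of Duration and dur(); the real core.time code is more involved.

```d
// Hypothetical stand-ins for core.time.Duration and dur(); simplified for illustration.
struct Span
{
    long hnsecs; // hecto-nanoseconds (100 ns units)

    // Non-template constructor: its body is what the stripping rules would remove.
    this(long hnsecs) pure nothrow @safe
    {
        this.hnsecs = hnsecs;
    }
}

// Template function, so its body survives DI generation under the proposed rules.
Span span(string units)(long length) pure nothrow @safe
{
    static if (units == "seconds")
        return Span(length * 10_000_000);
    else
        return Span(length * 10_000); // treat everything else as milliseconds
}

// `enum` forces compile-time evaluation. This works only while the constructor
// body is visible. If a header declared just `this(long hnsecs) pure nothrow @safe;`
// the compiler would have no source to interpret and the line would not compile.
enum timeout = span!"seconds"(5);
static assert(timeout.hnsecs == 50_000_000);

void main() {}
```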

We as a community need to decide how important these two features are.
Here are the pros of each feature as I see it. I encourage you to add to
this list and debate the merits of each.

Pros for DI:
Shared libraries are useless without proper header-style interfaces to the  
code.
Can reduce compile time.
Required by businesses that do not want to share the entire code-base of their
product.

Pros of CTFE:
Makes writing certain types of otherwise complicated code simple.
Very useful to systems programmers.

As I see it, the lack of DI is a major blocker to any business looking to
use D, including mine, whereas CTFE is just "nice-to-have". And I would
argue that if D wants to see any real usage above the 0.3% it got on the  
May TIOBE index, it needs serious business investment in using the  
language. My company would love to use D, but we can't because we don't  
want to release our entire code-base; hence my work on the DI generation  
patch. I would suggest to you that almost every business looking at D is  
going to find the current DI situation ... untenable.


A Potential Solution:

In my estimation there is a solution that allows both features to retain  
their full functionality and only requires minimal rewrites to Phobos.  
However, this solution CANNOT be statically enforced by the compiler,
except through a better, more explicit error message.

The core of the solution is to disallow externally dependent CTFE in all  
modules accepted into Phobos. This would also require a clear and concise  
explanation accompanying the CTFE documentation stating that calling  
external code via CTFE will most likely result in a compiler error.

The reason I am proposing this solution is that I would argue that the  
author of std.datetime chose to utilize CTFE against an external module
(in this case, the DRuntime) when Walter has (to the best of my knowledge)  
explicitly stated that DI files in the future would not contain unneeded  
implementations and that the current DI implementation was essentially a  
hack to get *something* working. Looking at the DI generation code, I can
tell you it is definitely not meant to be permanent; it is filled with
hard-coded idiosyncrasies (like its ridiculous indenting) that had to be
fixed.

In essence I am arguing that it is their usage of CTFE that is incorrect,  
and not the fault of DI generation. As such, those modules should be  
rewritten in light of said assumptions changing. It is my opinion that

A longer term solution would be to give CTFE the ability to work on  
functions without the source code, so long as it meets the other rules of
CTFE and can run the required code. However, there are some serious  
security issues that would need to be cleared up before this could work.


I apologize to the author of std.datetime for the harshness of my words  
and for singling you out and to any other Phobos module authors I may have  
implicated. I mean no ill-will towards you and I am grateful for the work  
you have done in improving Phobos.

Questions? Comments? Rants? Raves?
What do you think?

-- 
Adam Wilson
IRC: LightBender
Project Coordinator
The Horizon Project
http://www.thehorizonproject.org/
May 09 2012
next sibling parent reply deadalnix <deadalnix gmail.com> writes:
On 09/05/2012 21:27, Adam Wilson wrote:
 Questions? Comments? Rants? Raves?
 What do you think?

You miss the point of the importance of CTFE. It « just » allows us to get the fastest possible regex engine when the regex is known at compile time (the common case), for instance. This is a major feature of D.

Maybe an interpretable bytecode solution is the key. But clearly the situation is not satisfying.

Annotations could also be used to provide hints for the di generator. This feature has attracted great interest, and it is something powerful we should build on once it is in place.

Finally, the di generator should do part of the semantic work in order to work well. auto must be resolved, and non-CTFEable code could be safely removed. This would already be a major improvement.
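
For readers unfamiliar with the regex case, a minimal sketch using std.regex's ctRegex follows. The helper name looksLikeDate and the pattern are made up for illustration.

```d
import std.regex : ctRegex, matchFirst;

// Hypothetical helper, not part of Phobos.
bool looksLikeDate(string s)
{
    // ctRegex generates the matcher at compile time, which relies on CTFE,
    // so the pattern has to be a compile-time constant.
    auto datePattern = ctRegex!(`^\d{4}-\d{2}-\d{2}$`);
    return !matchFirst(s, datePattern).empty;
}

void main()
{
    assert(looksLikeDate("2012-05-09"));
    assert(!looksLikeDate("May 09 2012"));
}
```

Because the engine is built while compiling the importing module, the std.regex sources it depends on have to be visible to the compiler, which is exactly where stripped headers get in the way.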
May 09 2012
next sibling parent Jacob Carlborg <doob me.com> writes:
On 2012-05-09 21:55, deadalnix wrote:

 Finally, di generator should do a part of the semantic work to work
 fine. auto must be resolved, and non CTFEable code could be safely
 removed. This is already a major improvement.

This is what needs to be done in the long run.

--
/Jacob Carlborg
May 09 2012
prev sibling next sibling parent deadalnix <deadalnix gmail.com> writes:
On 09/05/2012 22:53, Adam Wilson wrote:
 This requires modifying significant chunks of the D semantic analysis
 engine and is a project that only a few people could pull off, I imagine
 that that list goes something like Walter, Don, Brad, and Kenji. It's
 doable, but the guys on that list have way bigger fish to fry and
 frankly, that seems like a sledgehammer solution to a needle-sized
 problem. I don't think we need to go that far right now.

 My patch leaves auto-functions intact in DI so that D can do its
 analysis, and the non-CTFEable code problem can be solved by scrubbing
 Phobos of any reliance on DRT CTFE. It's a much simpler solution. For D3
 we can tackle the big work.

This is a huge problem if you use DMD source code. I noticed that :D

It is not that big of a problem. I'm pretty confident this can be solved with a more appropriate parser/AST and tools to work on the AST.
May 09 2012
prev sibling parent Paulo Pinto <pjmlp progtools.org> writes:
On 10.05.2012 00:34, Joseph Rushton Wakeling wrote:
 On 10/05/12 00:25, H. S. Teoh wrote:
 Which is what fueled the market for hundreds (if not thousands) of JS
 obfuscators.

Well, that's kind of my point really. Is it so bad (from a proprietary point of view) to have to distribute .d rather than .di files, if you can obfuscate them?

Try to find an error in an obfuscated dump. Not fun.

--
Paulo
May 10 2012
prev sibling next sibling parent reply "Adam D. Ruppe" <destructionator gmail.com> writes:
The real WTF is we use .di files for druntime in the
first place. It is performance sensitive and open source.

We should be using the actual sources for inlining, ctfe,
etc. anyway.

Let's not torpedo the .di patch's value for just phobos.
May 09 2012
next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 05/09/2012 10:40 PM, Adam Wilson wrote:
 ... unfortunately, Phobos won't compile with the
 patch applied because of the CTFE reliance on the DRT source.

It is actually the .di reliance. Maybe you should just fix the makefile so that it does not generate .di files for druntime.
May 09 2012
next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 05/09/2012 11:09 PM, Adam Wilson wrote:
 Essentially re-architect the DRT?

Slightly re-architect the makefile.
May 09 2012
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 05/09/2012 11:13 PM, Adam Wilson wrote:
 On Wed, 09 May 2012 14:10:27 -0700, Timon Gehr <timon.gehr gmx.ch> wrote:

 On 05/09/2012 11:09 PM, Adam Wilson wrote:
 Essentially re-architect the DRT?

Slightly re-architect the makefile.

Right, but that would pretty significantly change the layout of the DRT at least the import level.

Why?

For some modules do
dmd -H ...
for some other modules do
cp module.d some/path/module.di
 Although it'd be nice to physically separate
 the support modules from the core runtime modules.

The directory layout is given by the import names.
May 09 2012
prev sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
On 05/09/2012 10:45 PM, Adam Wilson wrote:
 On Wed, 09 May 2012 13:43:12 -0700, Timon Gehr <timon.gehr gmx.ch> wrote:

 On 05/09/2012 10:40 PM, Adam Wilson wrote:
 ... unfortunately, Phobos won't compile with the
 patch applied because of the CTFE reliance on the DRT source.

It is actually the .di reliance. Maybe you should just fix the makefile so that it does not generate .di files for druntime.

It's not that easy. Phobos imports the DI's

It imports the modules.
 and its makefile would
 need to be adjusted as well. Which could very well break other things.

Slightly adjusting the makefile is certainly simpler than changing the language.
 Also, see the issues with SharedLibs. The issue of runtime commonality
 is no small deal.

It is absolutely unrealistic to do that any time soon. The runtime still regularly gets essential updates.
May 09 2012
prev sibling next sibling parent deadalnix <deadalnix gmail.com> writes:
On 09/05/2012 22:40, Adam Wilson wrote:
 On Wed, 09 May 2012 13:14:32 -0700, Steven Schveighoffer
 <schveiguy yahoo.com> wrote:

 On Wed, 09 May 2012 15:57:46 -0400, Adam D. Ruppe
 <destructionator gmail.com> wrote:

 The real WTF is we use .di files for druntime in the
 first place. It is performance sensitive and open source.

 We should be using the actual sources for inlining, ctfe,
 etc. anyway.

 Let's not torpedo the .di patch's value for just phobos.

I agree (although not generating .di files does not fix all the problems of inlining and ctfe -- there are many stubbed functions even in the .d files). In my opinion, .di generation should by default generate fully-stripped code except for templates. If you want functions to be CTFE-able, don't use auto-generated .di files to import them. -Steve

That is what my patch does, unfortunately, Phobos won't compile with the patch applied because of the CTFE reliance on the DRT source.

It doesn't make much sense to generate .di files for Phobos and druntime, IMO.
May 09 2012
prev sibling next sibling parent Artur Skawina <art.08.09 gmail.com> writes:
On 05/10/12 00:15, Adam Wilson wrote:
 On Wed, 09 May 2012 15:07:44 -0700, Adam D. Ruppe <destructionator gmail.com>
wrote:
 
 On Wednesday, 9 May 2012 at 20:41:05 UTC, Adam Wilson wrote:
 Except that there is a distinct need for the DRuntime as a shared library.

That doesn't really matter - you can deploy as a shared library and still use full source as the interface file. Hell, that's what putting implementations in the .di file does anyway!

Sure, but a lot of software developers, particularly those with money, don't want their source getting out, and in a lot of cases, there is no good reason to distribute the source. There are also a bunch of cases where you don't even want something to be CTFEable, like Walter's example on a different thread of the GC. Why would you ever want to CTFE the GC?

Until D starts to see some serious usage in business, it's never going to get out of "toy"/"hobby" language status in the eyes of the developer community at large. Few businesses want to release their source. DIs as complete source files are a non-starter for that large segment of the development world. Improving DI generation is just taking down another barrier to D usage by that group of people.

A "group of people" that wants to distribute binary closed-source libs, yet finds having to manually specify the API of their library to be a barrier? If having to write all the required declarations from scratch (instead of using some *.d -> *.di converter) is a real problem, then, umm, it's most likely not their biggest one...

artur
May 09 2012
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/9/12 3:14 PM, Steven Schveighoffer wrote:
 On Wed, 09 May 2012 15:57:46 -0400, Adam D. Ruppe
 <destructionator gmail.com> wrote:

 The real WTF is we use .di files for druntime in the
 first place. It is performance sensitive and open source.

 We should be using the actual sources for inlining, ctfe,
 etc. anyway.

 Let's not torpedo the .di patch's value for just phobos.

I agree (although not generating .di files does not fix all the problems of inlining and ctfe -- there are many stubbed functions even in the .d files). In my opinion, .di generation should by default generate fully-stripped code except for templates. If you want functions to be CTFE-able, don't use auto-generated .di files to import them. -Steve

Actually the point here is to still be able to benefit from automated .di generation while opportunistically marking certain functions as "put the body in the .di file".

@inline anyone?

Andrei
May 09 2012
next sibling parent deadalnix <deadalnix gmail.com> writes:
On 10/05/2012 05:00, Andrei Alexandrescu wrote:
 Actually the point here is to still be able to benefit of di automated
 generation while opportunistically marking certain functions as "put the
 body in the .di file".

 @inline anyone?


 Andrei

I think this logic is flawed. Removing the implementation of a function has drawbacks (it is no longer CTFEable, for instance). It should be done:
- If it doesn't make sense to include the body anyway (a non-CTFEable function, for instance).
- If the user chooses to do so (to hide source code behind an API, for instance).
- If the code is not reachable through exposed code (a private function that isn't called from any other piece of code whose source remains). This one is a "garbage collection" process.

You don't want to strip all source code by default. With the "garbage collection" trick, you only need to mark your API to get mostly everything removed. This is a much better approach.
May 10 2012
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/10/12 6:17 AM, Steven Schveighoffer wrote:
 On Wed, 09 May 2012 23:00:07 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 Actually the point here is to still be able to benefit of di automated
 generation while opportunistically marking certain functions as "put
 the body in the .di file".

If you aren't going to strip the files, I don't see the point in it.

Inlining. Andrei
May 10 2012
parent reply deadalnix <deadalnix gmail.com> writes:
On 10/05/2012 17:54, Steven Schveighoffer wrote:
 On Thu, 10 May 2012 10:47:59 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 5/10/12 6:17 AM, Steven Schveighoffer wrote:
 On Wed, 09 May 2012 23:00:07 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 Actually the point here is to still be able to benefit of di automated
 generation while opportunistically marking certain functions as "put
 the body in the .di file".

If you aren't going to strip the files, I don't see the point in it.

Inlining.

No, I mean if dmd -H isn't going to strip the files, what is the point of dmd -H? I can already copy the .d to .di and have inlining/ctfe, or simply use the .d directly. At this point, in order to get CTFE to work, you have to keep just about everything, including private imports. If we want to ensure CTFE works, dmd -H becomes a glorified cp. If we have some half-assed guess at what could be CTFE'd (which is growing by the day), then it's likely to not fit with the goals of the developer running dmd -H. -Steve

If you can CTFE, you can know what is CTFEable. If it is currently half assed, then work on it and provide a better tool.
May 10 2012
next sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
On 05/10/2012 06:04 PM, deadalnix wrote:
 If you can CTFE, you can know what is CTFEable.

CTFEability is undecidable.
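
A minimal sketch of why this is hard in practice: whether a call can be evaluated at compile time depends on which path the arguments take, not only on the function's text. The function pick is hypothetical.

```d
import core.stdc.stdlib : rand;

// Hypothetical function: one branch is plain arithmetic, the other calls a C
// function whose body is not visible to the compiler.
int pick(int x)
{
    if (x > 0)
        return x * 2;
    return rand();
}

// Forced compile-time evaluation succeeds because only the first branch runs.
enum fromCtfe = pick(21);
static assert(fromCtfe == 42);

// Uncommenting the next line is expected to fail to compile, because
// evaluating pick(0) reaches rand(), which has no source available to CTFE.
// enum broken = pick(0);

void main()
{
    assert(pick(21) == 42); // the same call also works at run time
}
```

Deciding in general whether some argument reaches the second branch means predicting what arbitrary code does, which is why a header generator can at best be conservative here.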
May 10 2012
prev sibling next sibling parent reply deadalnix <deadalnix gmail.com> writes:
On 10/05/2012 18:56, Steven Schveighoffer wrote:
 There is already a better tool -- cp. I ask again, what is the benefit
 of .di generation if it is mostly a glorified (faulty?) copy operation?

Please stop with that cp argument, this is complete bullshit.
 As Adam points out in his original post, ensuring CTFE availability may
 not be (and is likely not) why you are creating a .di file.

You want to create a di file to hide the implementation of some functionality from the users of your lib. The better approach is to mark such code accordingly.

Note that in C/C++ you maintain headers manually. It is already a big improvement.
 Plus, what isn't CTFEable today may be CTFEable tomorrow.

Good point.
May 10 2012
parent reply deadalnix <deadalnix gmail.com> writes:
On 10/05/2012 19:51, Steven Schveighoffer wrote:
 On Thu, 10 May 2012 13:27:23 -0400, deadalnix <deadalnix gmail.com> wrote:

 On 10/05/2012 18:56, Steven Schveighoffer wrote:
 There is already a better tool -- cp. I ask again, what is the benefit
 of .di generation if it is mostly a glorified (faulty?) copy operation?

Please stop with that cp argument, this is complete bullshit.

Not complete. Maybe it's somewhat of an exaggeration ;) But really, I look at the current situation that started this thread. The intention of .di header generation retaining implementation is to allow for inlining, not making CTFE available. Yet a side effect is that sometimes CTFE *is* available. Well, let's say something becomes uninlinable, and now dmd decides to remove its implementation. But another piece of code is already depending on that source to be available for CTFE! Now you have broken code inadvertently, and the only way to fix it is to hand-edit the .di file.

The di generator can remove code that isn't CTFEable (at least can be proven to not be CTFEable). It is the case in your example.
 But the compiler should stay out of the decision to strip or not based
 on optimization predictions.

The compiler should provide something by default. It is up to the user to mark the code accordingly.
 I agree the module system is way better than having an interface and
 implementation file separate. But when you actually *do* want it to be
 separate (for whatever reason), D pretty much devolves to C.

At least it is not worse.
May 10 2012
parent deadalnix <deadalnix gmail.com> writes:
On 10/05/2012 20:25, deadalnix wrote:
 On 10/05/2012 19:51, Steven Schveighoffer wrote:
 On Thu, 10 May 2012 13:27:23 -0400, deadalnix <deadalnix gmail.com>
 wrote:

 On 10/05/2012 18:56, Steven Schveighoffer wrote:
 There is already a better tool -- cp. I ask again, what is the benefit
 of .di generation if it is mostly a glorified (faulty?) copy operation?

Please stop with that cp argument, this is complete bullshit.

Not complete. Maybe it's somewhat of an exaggeration ;) But really, I look at the current situation that started this thread. The intention of .di header generation retaining implementation is to allow for inlining, not making CTFE available. Yet a side effect is that sometimes CTFE *is* available. Well, let's say something becomes uninlinable, and now dmd decides to remove its implementation. But another piece of code is already depending on that source to be available for CTFE! Now you have broken code inadvertently, and the only way to fix it is to hand-edit the .di file.

The di generator can remove code that isn't CTFEable (at least can be proven to not be CTFEable). It is the case in your example.
 But the compiler should stay out of the decision to strip or not based
 on optimization predictions.

The compiler should provide something by default. It is up to the user to mark the code accordingly.

I wanted to add that the default behavior shouldn't break anything. So it has to be conservative with CTFEable code unless told otherwise by some attribute.
May 10 2012
prev sibling next sibling parent reply David Gileadi <gileadis NSPMgmail.com> writes:
On 5/10/12 11:01 AM, Adam Wilson wrote:
 It does require some semantic information. And the solution I've seen
 most talked about here is some kind of attribute similar to @pure
 that tells the compiler to include the implementation in the DI file.

I may be off-base here, but this strikes me as a good case for a pragma. No?
May 10 2012
next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 05/10/2012 08:15 PM, Adam Wilson wrote:
 On Thu, 10 May 2012 11:10:15 -0700, David Gileadi
 <gileadis nspmgmail.com> wrote:

 On 5/10/12 11:01 AM, Adam Wilson wrote:
 It does require some semantic information. And the solution I've seen
 most talked about here is some kind of attribute similar to @pure
 that tells the compiler to include the implementation in the DI file.

I may be off-base here, but this strikes me as a good case for a pragma. No?

Well, it needs to be at a function level to be useful.

pragmas can apply to declarations. The syntax is

pragma(identifier,...) Declaration

(where Declaration can be the empty declaration, ';')

pragma(keepImplementation) void foo(){ ... }
May 10 2012
parent reply deadalnix <deadalnix gmail.com> writes:
On 10/05/2012 20:22, Timon Gehr wrote:
 On 05/10/2012 08:15 PM, Adam Wilson wrote:
 On Thu, 10 May 2012 11:10:15 -0700, David Gileadi
 <gileadis nspmgmail.com> wrote:

 On 5/10/12 11:01 AM, Adam Wilson wrote:
 It does require some semantic information. And the solution I've seen
 most talked about here is some kind of attribute similar to @pure
 that tells the compiler to include the implementation in the DI file.

I may be off-base here, but this strikes me as a good case for a pragma. No?

Well, it needs to be at a function level to be useful.

pragmas can apply to declarations. The syntax is

pragma(identifier,...) Declaration

(where Declaration can be the empty declaration, ';')

pragma(keepImplementation) void foo(){ ... }

You want to specify strip implementation, not keep implementation. Stripping the implementation may break things. Keeping it cannot. The default behavior should be on the safe side.

The DIfier can remove code by default if it knows that it isn't CTFEable or isn't worth inlining. Additional code removal can be specified by attributes.
May 10 2012
next sibling parent reply deadalnix <deadalnix gmail.com> writes:
On 10/05/2012 20:35, Adam Wilson wrote:
 The problem is that it DOES NOT know if it's CTFEable or not. No
 analysis is performed prior to DI generation!

It doesn't seem undoable.
May 10 2012
parent deadalnix <deadalnix gmail.com> writes:
On 10/05/2012 22:39, Adam Wilson wrote:
 On Thu, 10 May 2012 12:51:03 -0700, deadalnix <deadalnix gmail.com> wrote:

 On 10/05/2012 20:35, Adam Wilson wrote:
 The problem is that it DOES NOT know if it's CTFEable or not. No
 analysis is performed prior to DI generation!

It doesn't seem undoable.

It isn't, but it would require that DI generation get its own specialized form of semantic analysis, and that is a significant amount of work. I'm not saying it shouldn't be done, just that it's not a valid short-term solution. A long-term solution would be to embed a semantically analyzed form of the source into the object itself. But that's years away, even with a concerted group effort.

I wouldn't introduce a language feature as a short-term solution. This can lead to technical debt that is tedious to manage.
May 10 2012
prev sibling parent reply deadalnix <deadalnix gmail.com> writes:
On 10/05/2012 21:12, Steven Schveighoffer wrote:
 No, it's definitely keep implementation. By default, I want .di files to
 contain nothing but interface. If I wanted the source by default, I
 wouldn't be using .di files.

 Strip implementation may break things. Keeping it cannot. The default
 behavior should be on the safe side of the medal.

Current behavior is junk, there is no reason to save it.

This isn't the current behavior we are talking about here.
May 10 2012
parent reply deadalnix <deadalnix gmail.com> writes:
On 10/05/2012 21:57, Steven Schveighoffer wrote:
 Then what "breaks"? If you aren't using di generation, how can changing
 the way di generation works break your code?

I don't know in which world you live, but in mine, 100% of the projects I work on use 3rd party code. You don't have control over 3rd party code.
May 10 2012
parent deadalnix <deadalnix gmail.com> writes:
On 10/05/2012 22:16, Steven Schveighoffer wrote:
 It is not the job of the compiler or language to make up for the failure
 to run *basic tests* of your third party vendors.

It is not the job of the compiler to generate di files. This whole DI stuff has been very badly started from the beginning.
May 10 2012
prev sibling parent deadalnix <deadalnix gmail.com> writes:
On 10/05/2012 20:10, David Gileadi wrote:
 On 5/10/12 11:01 AM, Adam Wilson wrote:
 It does require some semantic information. And the solution I've seen
 most talked about here is some kind of attribute similar to @pure
 that tells the compiler to include the implementation in the DI file.

I may be off-base here, but this strikes me as a good case for a pragma. No?

Adding more features to the core language when a proper attribute system would do the trick is a poor design decision IMO.
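
As a sketch of what an attribute-based marker could look like, the following uses user-defined attributes, which D gained in releases after this thread. The keepImplementation name is hypothetical, and no compiler or DI generator recognizes it; the example only shows that such a marker can be attached and queried without a new pragma or keyword.

```d
import std.traits : hasUDA;

// Hypothetical marker attribute. No compiler or DI generator recognizes it.
struct keepImplementation {}

@keepImplementation
int twice(int x) pure nothrow @safe { return x * 2; }

int internalDetail(int x) pure nothrow @safe { return x + 1; }

// A tool with access to semantic information could query the marker like this.
static assert(hasUDA!(twice, keepImplementation));
static assert(!hasUDA!(internalDetail, keepImplementation));

// A marked function would stay CTFEable for importers because its body is kept.
enum four = twice(2);
static assert(four == 4);

void main() {}
```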
May 10 2012
prev sibling parent deadalnix <deadalnix gmail.com> writes:
On 10/05/2012 20:01, Adam Wilson wrote:
 On Thu, 10 May 2012 10:57:37 -0700, Christopher Bergqvist
 <spambox0 digitalpoetry.se> wrote:

 On Thursday, 10 May 2012 at 17:37:59 UTC, Adam Wilson wrote:
 On Thu, 10 May 2012 09:56:06 -0700, Steven Schveighoffer
 <schveiguy yahoo.com> wrote:

 On Thu, 10 May 2012 12:04:44 -0400, deadalnix <deadalnix gmail.com>
 wrote:

 On 10/05/2012 17:54, Steven Schveighoffer wrote:
 On Thu, 10 May 2012 10:47:59 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 5/10/12 6:17 AM, Steven Schveighoffer wrote:
 On Wed, 09 May 2012 23:00:07 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 Actually the point here is to still be able to benefit of di
 automated
 generation while opportunistically marking certain functions as
 "put
 the body in the .di file".

If you aren't going to strip the files, I don't see the point in it.

Inlining.

No, I mean if dmd -H isn't going to strip the files, what is the point of dmd -H? I can already copy the .d to .di and have inlining/ctfe, or simply use the .d directly. At this point, in order to get CTFE to work, you have to keep just about everything, including private imports. If we want to ensure CTFE works, dmd -H becomes a glorified cp. If we have some half-assed guess at what could be CTFE'd (which is growing by the day), then it's likely to not fit with the goals of the developer running dmd -H. -Steve

If you can CTFE, you can know what is CTFEable. If it is currently half assed, then work on it and provide a better tool.

There is already a better tool -- cp. I ask again, what is the benefit of .di generation if it is mostly a glorified (faulty?) copy operation? As Adam points out in his original post, ensuring CTFE availability may not be (and is likely not) why you are creating a .di file. Plus, what isn't CTFEable today may be CTFEable tomorrow. inlining is one thing, because that's an optimization that has a valid fallback. CTFE does not. -Steve

Exactly this. I am currently in the process of changing the DRuntime makefiles such that some of the files are not processed as DIs. This allows Phobos CTFE dependencies on the DRT to remain valid while still allowing DIs to be generated for parts where they matter, with the goal of making both a shared and static library build of the DRT. The tool I am using to accomplish this feat? cp. It works, it delivers exactly what we need, and it *is not* a broken operation like the current DI generation.

Like Steve said, most people generating DI files are not really worried about CTFE working; in fact they almost undoubtedly *know* that they are breaking CTFE, yet they choose to do it anyway. They have their reasons, and frankly, it doesn't concern us as compiler writers if those reasons don't line up with our personal moral world-view. Our job is to provide a tool that DOES WHAT PEOPLE EXPECT. Otherwise they will move on to one that does. If people expected DI generation to be a glorified (and not broken) copy operation, they would (and do) use cp.

How about: dmd -H mySource.d --keepImplementation MyClass.fooMethod ? It should be good enough for makefiles, as in the case of core.time/dur, but gets a bit hairy with overloads (append "[0]" to select specific ones?). Maybe it requires semantic information though.

It does require some semantic information. And the solution I've seen most talked about here is some kind of attribute similar to @pure that tells the compiler to include the implementation in the DI file. IMO, this is a fine solution, but the compiler cannot be involved in the decision to keep an implementation in or out based on anything other than programmer directives, because the compiler just doesn't know what's being depended on. That's how we ended up where we are today: DI files are the source with unittests and comments removed.

Frankly, the compiler shouldn't do that. Its internal structure isn't made to do such stuff and it will do a poor job here.
May 10 2012
prev sibling next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 09 May 2012 15:57:46 -0400, Adam D. Ruppe  
<destructionator gmail.com> wrote:

 The real WTF is we use .di files for druntime in the
 first place. It is performance sensitive and open source.

 We should be using the actual sources for inlining, ctfe,
 etc. anyway.

 Let's not torpedo the .di patch's value for just phobos.

I agree (although not generating .di files does not fix all the problems of inlining and ctfe -- there are many stubbed functions even in the .d files).

In my opinion, .di generation should by default generate fully-stripped code except for templates. If you want functions to be CTFE-able, don't use auto-generated .di files to import them.

-Steve
May 09 2012
parent deadalnix <deadalnix gmail.com> writes:
On 09/05/2012 22:58, Adam Wilson wrote:
 On Wed, 09 May 2012 13:51:22 -0700, Jonathan M Davis
 <jmdavisProg gmx.com> wrote:

 On Wednesday, May 09, 2012 22:46:58 foobar wrote:
 This makes sense.
 So this means the datetime example would fail to compile when
 using druntime's .di files. This should be emphasized in the
 spec/docs to minimize the chance for gotchas for users.

 We could add an exception to this rule by tagging functions with
 e.g. "export". What do you think?

export already has another meaning. It also goes against the whole idea that any function is supposed to be CTFEable without special annotations.

- Jonathan M Davis

I think an attribute like @implementation would be useful here. It could easily be used by the DI generator to keep the implementation in the DI file. You would only need to apply it to functions that you want to be CTFEable externally; internal CTFE would still work the same.

It is reversed logic. The more code you have available, the better for the compiler and the user. Code should be stripped only if there is a reason to do so (compile time, source code that shouldn't be released, etc.).
May 09 2012
prev sibling next sibling parent "foobar" <foo bar.com> writes:
On Wednesday, 9 May 2012 at 19:49:07 UTC, deadalnix wrote:
 On 09/05/2012 21:27, Adam Wilson wrote:
 Questions? Comments? Rants? Raves?
 What do you think?

You miss the point of the importance of CTFE. It « just » allow us to get the fastest possible regex engine given the regex is known at compiletime (common case) for instance. This is a major feature of D. Maybe an interpretable bytecode solution is the key. But clearly the situation is not satisfying. Annotation could also be used to provide hint for the di generator. This feature had great interest. And is something powerful we should build on, when in place. Finally, di generator should do a part of the semaintic work to work fine. auto must be resolved, and non CTFEable code could be safely removed. This is already a major improvement.

CTFE is interpreted and requires the source code. Moving this to a byte-code interpreter isn't really progress IMO. Are we to implement a subset of a JVM inside the compiler now? We already have a fast native compiler. Ideally, it should be possible to run binary obj code directly. This however requires a lot of infrastructure change and a lot of thinking to figure out how to handle cross-compiling. This is most definitely possible - but does require effort. I don't know if it will ever happen, even for a future D3 (or 4, or 5, etc..)
May 09 2012
prev sibling next sibling parent "Adam Wilson" <flyboynw gmail.com> writes:
On Wed, 09 May 2012 13:14:32 -0700, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:

 On Wed, 09 May 2012 15:57:46 -0400, Adam D. Ruppe  
 <destructionator gmail.com> wrote:

 The real WTF is we use .di files for druntime in the
 first place. It is performance sensitive and open source.

 We should be using the actual sources for inlining, ctfe,
 etc. anyway.

 Let's not torpedo the .di patch's value for just phobos.

I agree (although not generating .di files does not fix all the problems of inlining and ctfe -- there are many stubbed functions even in the .d files). In my opinion, .di generation should by default generate fully-stripped code except for templates. If you want functions to be CTFE-able, don't use auto-generated .di files to import them. -Steve

That is what my patch does. Unfortunately, Phobos won't compile with the patch applied because of the CTFE reliance on the DRT source. -- Adam Wilson IRC: LightBender Project Coordinator The Horizon Project http://www.thehorizonproject.org/
May 09 2012
prev sibling next sibling parent "Adam Wilson" <flyboynw gmail.com> writes:
On Wed, 09 May 2012 12:57:46 -0700, Adam D. Ruppe  
<destructionator gmail.com> wrote:

 The real WTF is we use .di files for druntime in the
 first place. It is performance sensitive and open source.

 We should be using the actual sources for inlining, ctfe,
 etc. anyway.

 Let's not torpedo the .di patch's value for just phobos.

Except that there is a distinct need for the DRuntime as a shared library, particularly as end-user software starts relying on multiple shared libraries. Currently, things just explode when you try to use multiple libraries that link to different versions of the DRT; a shared lib for the DRT would pretty much solve that, or at least make a solution possible. But the *only* way to make a DRT shared lib work is DI files that don't contain implementation. -- Adam Wilson IRC: LightBender Project Coordinator The Horizon Project http://www.thehorizonproject.org/
May 09 2012
prev sibling next sibling parent Jacob Carlborg <doob me.com> writes:
On 2012-05-09 21:27, Adam Wilson wrote:
 Hello Everyone,

 I am afraid that we as a community have reached an inflection point. The
 crossroads of CTFE and DI.

 I recently completed my work on a patch for the DI generation system. In
 the process I solicited the feedback of the community about what should
 and should not be in a DI file. The most agreed up point was that all
 functions that can loose their implementations should. In the
 communities opinion that means only auto-functions and
 template-functions should retain their implementations.

 The problem is thus: CTFE requires that any function that it could
 possibly evaluated by CTFE, must retain it's implementation.
 Unfortunately, there is simply no way for the DI generation system to
 know which functions are capable of being called by CTFE and which ones
 actually are.

 This limitation is due to the fact that DI generation must be run before
 Semantic Analysis because said analysis may perform significant rewrites
 of the AST. There is even a large (for DMD) comment in the main function
 of DMD explaining that DI generation should not be moved from where it
 is due to the inconsistencies that could arise.

In the long run the compiler needs to run some form of (limited) semantic analysis to be able to resolve inferred types.
 The patch I created currently fails in the autotester because the
 template function dur() in the druntime is called via CTFE from Phobos.
 Per the community agreed upon DI rules, the function implementation of
 the Duration constructor that is called by the dur() function is
 stripped away and CTFE fails.

A workaround to force a function/type to appear in DI files with its implementation is to make it into a template:

    void foo () () {}

will always show up in the DI files. -- /Jacob Carlborg
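The template workaround above can be sketched as follows. This is a minimal illustration, not code from druntime: `twice` is a hypothetical name, and the claim that the generated .di file keeps the body rests on the rule stated in this thread that template functions retain their implementations.

```d
// Sketch only: `twice` is a hypothetical name, not a druntime function.

// The empty template parameter list "()" turns this into a function template.
// A template needs its body at the point of instantiation, which is why the
// thread expects a generated .di file to keep the implementation.
int twice()(int x)
{
    return x * 2;
}

// An enum initializer must be a compile-time constant, so this call is
// evaluated through CTFE.
enum result = twice(21);
static assert(result == 42);

void main()
{
    // The same template is still callable as an ordinary function at run time.
    assert(twice(4) == 8);
}
```

Callers do not need to change anything: `twice(21)` is written the same way whether or not `twice` is a template.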
May 09 2012
prev sibling next sibling parent "Adam Wilson" <flyboynw gmail.com> writes:
On Wed, 09 May 2012 13:43:12 -0700, Timon Gehr <timon.gehr gmx.ch> wrote:

 On 05/09/2012 10:40 PM, Adam Wilson wrote:
 ... unfortunately, Phobos won't compile with the
 patch applied because of the CTFE reliance on the DRT source.

It is actually the .di reliance. Maybe you should just fix the makefile so that it does not generate .di files for druntime.

It's not that easy. Phobos imports the DI's and its makefile would need to be adjusted as well, which could very well break other things. Also, see the issues with SharedLibs. The issue of runtime commonality is no small deal. -- Adam Wilson IRC: LightBender Project Coordinator The Horizon Project http://www.thehorizonproject.org/
May 09 2012
prev sibling next sibling parent "foobar" <foo bar.com> writes:
On Wednesday, 9 May 2012 at 20:14:32 UTC, Steven Schveighoffer 
wrote:
 On Wed, 09 May 2012 15:57:46 -0400, Adam D. Ruppe 
 <destructionator gmail.com> wrote:

 The real WTF is we use .di files for druntime in the
 first place. It is performance sensitive and open source.

 We should be using the actual sources for inlining, ctfe,
 etc. anyway.

 Let's not torpedo the .di patch's value for just phobos.

I agree (although not generating .di files does not fix all the problems of inlining and ctfe -- there are many stubbed functions even in the .d files). In my opinion, .di generation should by default generate fully-stripped code except for templates. If you want functions to be CTFE-able, don't use auto-generated .di files to import them. -Steve

This makes sense. So this means the datetime example would fail to compile when using druntime's .di files. This should be emphasized in the spec/docs to minimize the chance for gotchas for users. We could add an exception to this rule by tagging functions with e.g. "export". What do you think?
May 09 2012
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Wednesday, May 09, 2012 22:46:58 foobar wrote:
 This makes sense.
 So this means the datetime example would fail to compile when
 using druntime's .di files. This should be emphasized in the
 spec/docs to minimize the chance for gotchas for users.
 
 We could add an exception to this rule by tagging functions with
 e.g. "export". What do you think?

export already has another meaning. It also goes against the whole idea that any function is supposed to be CTFEable without special annotations. - Jonathan M Davis
May 09 2012
prev sibling next sibling parent "Adam Wilson" <flyboynw gmail.com> writes:
On Wed, 09 May 2012 12:55:50 -0700, deadalnix <deadalnix gmail.com> wrote:

 On 09/05/2012 21:27, Adam Wilson wrote:
 Questions? Comments? Rants? Raves?
 What do you think?

 You miss the point of the importance of CTFE. It « just » allow us to
 get the fastest possible regex engine given the regex is known at
 compiletime (common case) for instance.

I was not trying to devalue CTFE; if anything, I just don't know the cases in which it's used. But my point was that, while CTFE allows for some very cool code, Complete Source DI's are a blocking bug for a significant chunk of the software development world.
 This is a major feature of D.

 Maybe an interpretable bytecode solution is the key. But clearly the
 situation is not satisfying.

 Annotation could also be used to provide hint for the di generator. This
 feature had great interest. And is something powerful we should build
 on, when in place.

 Finally, di generator should do a part of the semantic work to work
 fine. auto must be resolved, and non CTFEable code could be safely
 removed. This is already a major improvement.

This requires modifying significant chunks of the D semantic analysis engine and is a project that only a few people could pull off; I imagine that list goes something like Walter, Don, Brad, and Kenji. It's doable, but the guys on that list have way bigger fish to fry and, frankly, that seems like a sledgehammer solution to a needle-sized problem. I don't think we need to go that far right now. My patch leaves auto-functions intact in DI so that D can do its analysis, and the non-CTFEable code problem can be solved by scrubbing Phobos of any reliance on DRT CTFE. It's a much simpler solution. For D3 we can tackle the big work. -- Adam Wilson IRC: LightBender Project Coordinator The Horizon Project http://www.thehorizonproject.org/
May 09 2012
prev sibling next sibling parent "Adam Wilson" <flyboynw gmail.com> writes:
On Wed, 09 May 2012 13:45:04 -0700, Jacob Carlborg <doob me.com> wrote:

 On 2012-05-09 21:27, Adam Wilson wrote:
 Hello Everyone,

 I am afraid that we as a community have reached an inflection point. The
 crossroads of CTFE and DI.

 I recently completed my work on a patch for the DI generation system. In
 the process I solicited the feedback of the community about what should
 and should not be in a DI file. The most agreed up point was that all
 functions that can loose their implementations should. In the
 communities opinion that means only auto-functions and
 template-functions should retain their implementations.

 The problem is thus: CTFE requires that any function that it could
 possibly evaluated by CTFE, must retain it's implementation.
 Unfortunately, there is simply no way for the DI generation system to
 know which functions are capable of being called by CTFE and which ones
 actually are.

 This limitation is due to the fact that DI generation must be run before
 Semantic Analysis because said analysis may perform significant rewrites
 of the AST. There is even a large (for DMD) comment in the main function
 of DMD explaining that DI generation should not be moved from where it
 is due to the inconsistencies that could arise.

In the long run the compiler needs to run some form of (limited) semantic analysis to be able resolve inferred types.

It certainly does, but that's a LOT more work than we have time for at the moment.
 The patch I created currently fails in the autotester because the
 template function dur() in the druntime is called via CTFE from Phobos.
 Per the community agreed upon DI rules, the function implementation of
 the Duration constructor that is called by the dur() function is
 stripped away and CTFE fails.

A workaround to force a function/type to appear in DI files with their implementation is to make it into a template. void foo () () {} Will always show up in the DI files.

If this example works on constructors, I have no problem making the changes and opening a pull on the drt assuming it doesn't semantically alter the function. In the long run I think an implementation attribute would be nice because I could use that to direct the DI generator to include the whole implementation in the DI file, and this would be an acceptable solution for business. -- Adam Wilson IRC: LightBender Project Coordinator The Horizon Project http://www.thehorizonproject.org/
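Whether the template trick carries over to constructors can be checked with a small sketch. `Span` and `makeSpan` below are hypothetical stand-ins for Duration and dur(), not the druntime definitions, and the sketch only shows that a constructor template is usable under CTFE in present-day D. It does not confirm how the 2012 DI generator treated such a constructor.

```d
// Sketch only: `Span` and `makeSpan` are hypothetical stand-ins for
// Duration and dur(); they are not the druntime definitions.
struct Span
{
    long ticks;

    // Constructor template: the leading "()" is an empty template
    // parameter list, just as in `void foo () () {}`.
    this()(long t)
    {
        ticks = t;
    }
}

// Function template that calls the constructor, mirroring the
// dur() -> Duration constructor call chain described in the thread.
Span makeSpan()(long t)
{
    return Span(t);
}

// Evaluated through CTFE because an enum needs a compile-time constant.
enum fiveTicks = makeSpan(5);
static assert(fiveTicks.ticks == 5);

void main()
{
    // Run-time construction uses the same syntax as a plain constructor.
    auto s = Span(7);
    assert(s.ticks == 7);
}
```

Call sites read the same as with a non-template constructor, which is the property the question about semantic changes hinges on; ABI or mangling effects are outside what this sketch demonstrates.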
May 09 2012
prev sibling next sibling parent "Adam Wilson" <flyboynw gmail.com> writes:
On Wed, 09 May 2012 13:51:22 -0700, Jonathan M Davis <jmdavisProg gmx.com>  
wrote:

 On Wednesday, May 09, 2012 22:46:58 foobar wrote:
 This makes sense.
 So this means the datetime example would fail to compile when
 using druntime's .di files. This should be emphasized in the
 spec/docs to minimize the chance for gotchas for users.

 We could add an exception to this rule by tagging functions with
 e.g. "export". What do you think?

export already has another meaning. It also goes against the whole idea that any function is supposed to be CTFEable without special annotations. - Jonathan M Davis

I think an attribute like implementation would be useful here. It could easily be used by the DI generator to keep the implementation in the DI file. You would only need to apply it to functions that you want to be CTFEable externally, internal CTFE would still work the same. -- Adam Wilson IRC: LightBender Project Coordinator The Horizon Project http://www.thehorizonproject.org/
May 09 2012
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 09 May 2012 16:45:36 -0400, Adam Wilson <flyboynw gmail.com> wrote:

 On Wed, 09 May 2012 13:43:12 -0700, Timon Gehr <timon.gehr gmx.ch> wrote:

 On 05/09/2012 10:40 PM, Adam Wilson wrote:
 ... unfortunately, Phobos won't compile with the
 patch applied because of the CTFE reliance on the DRT source.

It is actually the .di reliance. Maybe you should just fix the makefile so that it does not generate .di files for druntime.

It's not that easy. Phobos imports the DI's and it's make file would need to be adjusted as well. Which could very well break other things. Also, see the issues with SharedLibs. The issue of runtime commonality is no small deal.

core.time is not exactly "runtime". It simply contains types *used* in the runtime. I don't see why it needs to be dynamic whatsoever. I think the right solution might be to split modules into "support" modules, and "runtime" modules, and only .di the runtime modules. When I think of runtime parts of core, I think of the GC, threads, and synchronization pieces (I think maybe intrinsics are there too). Everything else is there as support. -Steve
May 09 2012
prev sibling next sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
On 05/09/2012 09:27 PM, Adam Wilson wrote:
 We as a community need to decide how important these two features are.
 Here are the Pro's of each feature as I see it. I encourage you to add
 to this list and debate the merits of each.

I don't see the point. Use whatever is convenient.
 A Potential Solution:

 In my estimation there is a solution that allows both features to retain
 their full functionality and only requires minimal rewrites to Phobos.
 However this solution CANNOT be statically enforced by the compiler,
 except though a better, more explicit, error message(s).

 The core of the solution is to disallow externally dependent CTFE in all
 modules accepted into Phobos. This would also require a clear and
 concise explanation accompanying the CTFE documentation stating that
 calling external code via CTFE will most likely result in a compiler error.

 The reason I am proposing this solution is that I would argue that the
 author of std.datetime choose to utilize CTFE against an external module
 (in this case, the DRuntime) when Walter has (to the best of my
 knowledge) explicitly stated that DI files in the future would not
 contain unneeded implementations and that the current DI implementation
 was essentially a hack to get *something* working. Looking at the DI
 generation code, I can tell you, it is definitely not meant to be
 permanent, it is filled with hard-coded idiosyncrasies (like it's
 ridiculous indenting) that had to be fixed.

 In essence I am arguing that it is their usage of CTFE that is
 incorrect, and not the fault of DI generation.

Debating whose fault it is is a waste of time, because there is no fault.
 As such, those modules
 should be rewritten in light of said assumptions changing. It is my
 opinion that

Make Phobos build and submit a patch? Why would an explicit strict policy need to be stated when the auto tester will just catch all 'offending' code?
 A longer term solution would be to give CTFE the ability to work on
 functions without the source code, so long as it mets the other rules of
 CTFE and can run the required code. However, there are some serious
 security issues that would need to be cleared up before this could work.

Security issues are a minor concern compared to actually making that work.
May 09 2012
prev sibling next sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 09/05/12 22:53, Adam Wilson wrote:
 Complete Source DI's are a blocking bug for a significant chuck of the
 software development world.

Has this been a blocking issue for Python?
May 09 2012
prev sibling next sibling parent "Adam Wilson" <flyboynw gmail.com> writes:
On Wed, 09 May 2012 14:02:06 -0700, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:

 On Wed, 09 May 2012 16:45:36 -0400, Adam Wilson <flyboynw gmail.com>  
 wrote:

 On Wed, 09 May 2012 13:43:12 -0700, Timon Gehr <timon.gehr gmx.ch>  
 wrote:

 On 05/09/2012 10:40 PM, Adam Wilson wrote:
 ... unfortunately, Phobos won't compile with the
 patch applied because of the CTFE reliance on the DRT source.

It is actually the .di reliance. Maybe you should just fix the makefile so that it does not generate .di files for druntime.

It's not that easy. Phobos imports the DI's and it's make file would need to be adjusted as well. Which could very well break other things. Also, see the issues with SharedLibs. The issue of runtime commonality is no small deal.

core.time is not exactly "runtime". It's simply contains types *used* in the runtime. I don't see why it needs to be dynamic whatsoever. I think the right solution might be to split modules into "support" modules, and "runtime" modules, and only .di the runtime modules.

Essentially re-architect the DRT? I can see the argument for this, although from a consistency point of view the support modules should have .di extensions; that way the import statement is just *.di. It would certainly make it that much easier to build the DRT as a Shared Library. Is there any support for this idea from the community and from the stakeholders on the DRT?
 When I think of runtime parts of core, I think of the GC, threads, and  
 synchronization pieces (I think maybe intrinsics are there too).   
 Everything else is there as support.

 -Steve

-- Adam Wilson IRC: LightBender Project Coordinator The Horizon Project http://www.thehorizonproject.org/
May 09 2012
prev sibling next sibling parent "Adam Wilson" <flyboynw gmail.com> writes:
On Wed, 09 May 2012 14:04:17 -0700, Joseph Rushton Wakeling  
<joseph.wakeling webdrake.net> wrote:

 On 09/05/12 22:53, Adam Wilson wrote:
 Complete Source DI's are a blocking bug for a significant chuck of the
 software development world.

Has this been a blocking issue for Python?

Do companies regularly release python code to end-users? Also python is dynamic, all the issues that DI's solve around typing don't apply. -- Adam Wilson IRC: LightBender Project Coordinator The Horizon Project http://www.thehorizonproject.org/
May 09 2012
prev sibling next sibling parent "Adam Wilson" <flyboynw gmail.com> writes:
On Wed, 09 May 2012 14:10:27 -0700, Timon Gehr <timon.gehr gmx.ch> wrote:

 On 05/09/2012 11:09 PM, Adam Wilson wrote:
 Essentially re-architect the DRT?

Slightly re-architect the makefile.

Right, but that would pretty significantly change the layout of the DRT, at least at the import level. Although it'd be nice to physically separate the support modules from the core runtime modules. -- Adam Wilson IRC: LightBender Project Coordinator The Horizon Project http://www.thehorizonproject.org/
May 09 2012
prev sibling next sibling parent "Adam Wilson" <flyboynw gmail.com> writes:
On Wed, 09 May 2012 14:14:18 -0700, Timon Gehr <timon.gehr gmx.ch> wrote:

 On 05/09/2012 10:45 PM, Adam Wilson wrote:
 On Wed, 09 May 2012 13:43:12 -0700, Timon Gehr <timon.gehr gmx.ch>  
 wrote:

 On 05/09/2012 10:40 PM, Adam Wilson wrote:
 ... unfortunately, Phobos won't compile with the
 patch applied because of the CTFE reliance on the DRT source.

It is actually the .di reliance. Maybe you should just fix the makefile so that it does not generate .di files for druntime.

It's not that easy. Phobos imports the DI's

It imports the modules.

The phobos makefile imports the .di files from the drt. It just so happens that the .di files contain full implementations so it looks the same to the compiler.
 and it's make file would
 need to be adjusted as well. Which could very well break other things.

Slightly adjusting the makefile is certainly simpler than changing the language.

I completely agree. I would rather do this than change the language. At most I'd argue for an attribute that tells the DI generator to surface the implementation in the DI file.
 Also, see the issues with SharedLibs. The issue of runtime commonality
 is no small deal.

It is absolutely unrealistic to do that any time soon. The runtime still regularly gets essential updates.

For the moment yes, but lately that has been slowing down, or at least the new pull rate on the DRT seems to be slowing down. But that doesn't negate the usefulness of having the DRT as a shared lib today. For example, I could have multiple Shared Libraries that my company has written and they're all compiled against the same DRT version. A Shared Lib in this case makes sense. -- Adam Wilson IRC: LightBender Project Coordinator The Horizon Project http://www.thehorizonproject.org/
May 09 2012
prev sibling next sibling parent "Adam Wilson" <flyboynw gmail.com> writes:
On Wed, 09 May 2012 14:17:30 -0700, Timon Gehr <timon.gehr gmx.ch> wrote:

 On 05/09/2012 11:13 PM, Adam Wilson wrote:
 On Wed, 09 May 2012 14:10:27 -0700, Timon Gehr <timon.gehr gmx.ch>  
 wrote:

 On 05/09/2012 11:09 PM, Adam Wilson wrote:
 Essentially re-architect the DRT?

Slightly re-architect the makefile.

Right, but that would pretty significantly change the layout of the DRT at least the import level.

Why? For some modules do

    dmd -H ...

for some other modules do

    cp module.d some/path/module.di

Or that, I'm not perfect (and I don't always see the easiest way), and I like it, it's simple. :-) Although the Phobos imports would have to be updated as well, not a big deal I suppose.
 Although it'd be nice to physically separate
 the support modules from the core runtime modules.

The directory layout is given by the import names.

-- Adam Wilson IRC: LightBender Project Coordinator The Horizon Project http://www.thehorizonproject.org/
May 09 2012
prev sibling next sibling parent "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Wednesday, 9 May 2012 at 19:27:19 UTC, Adam Wilson wrote:
 The problem is thus: CTFE requires that any function that it  
 could possibly evaluated by CTFE, must retain it's  
 implementation. Unfortunately, there is simply no way for the  
 DI generation system to know which functions are capable of  
 being called by CTFE and which ones actually are.

 This limitation is due to the fact that DI generation must be  
 run before Semantic Analysis because said analysis may perform  
 significant rewrites of the AST. There is even a large (for  
 DMD) comment in the main function of DMD explaining that DI  
 generation should not be moved from where it is due to the  
 inconsistencies that could arise.

 The patch I created currently fails in the autotester because  
 the template function dur() in the druntime is called via CTFE  
 from Phobos. Per the community agreed upon DI rules, the  
 function implementation of the Duration constructor that is  
 called by the dur() function is stripped away and CTFE fails.

 We as a community need to decide how important these two  
 features are. Here are the Pro's of each feature as I see it. I 
  encourage you to add to this list and debate the merits of 
 each.

 Pro's for DI:
 Shared libraries are useless without proper header-style 
 interfaces to the code.
 Can reduce compile time.
 Required by business so as not to share the entire code-base of 
 their product.

 Pro's of CTFE:
 Makes writing certain types of otherwise complicated code 
 simple.
 Very useful to systems programmers.

 By my view of it, lack of DI is a major blocker to any business 
  looking to use D, including mine; and that CTFE is just  
 "nice-to-have". And I would argue that if D wants to see any  
 real usage above the 0.3% it got on the May TIOBE index, it  
 needs serious business investment in using the language. My  
 company would love to use D, but we can't because we don't want 
  to release our entire code-base; hence my work on the DI  
 generation patch. I would suggest to you that almost every  
 business looking at D is going to find the current DI situation 
  ... untenable.

Perhaps I missed something as I'm reading this. Why would this be such a big deal? As I understand it, some of this comes from the fact that D couldn't compile to libraries (if that's different now I am not sure; I haven't kept up with all the updates), so everything in Phobos is distributed as source.

If we can't compile to a callable library (static or dynamic) for a while and can't use CTFE on non-source, then the problem is more explicitly present and either needs a workaround or some type of convention.

However, IF we can compile to libraries and those compiled libraries are exported out with the .di files (I'd personally require the .di file information to also be part of the library as a public string, which you can then extract, so you can't mix up wrong versions of .di files), why is that then a problem? The binary execution code is already available and we should be able to call it through the compiler as long as the interfaces are used properly. I see this only as a partial problem, given that the compiler is written in C++ and not D.

Of course there are security issues: if a module harbored a virus and using CTFE or calling those functions unleashed it, assuming the program had permissions to do any damage... At that point the compiler would need to run with very low permissions (or its own UID) so that in those cases it could crash gracefully...

Perhaps I'm just rambling now..
May 09 2012
prev sibling next sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Wednesday, 9 May 2012 at 20:41:05 UTC, Adam Wilson wrote:
 Except that there is a distinct need for the DRuntime as a 
 shared library.

That doesn't really matter - you can deploy as a shared library and still use full source as the interface file. Hell, that's what putting implementations in the .di file does anyway!
May 09 2012
prev sibling next sibling parent "Adam Wilson" <flyboynw gmail.com> writes:
On Wed, 09 May 2012 15:07:44 -0700, Adam D. Ruppe  
<destructionator gmail.com> wrote:

 On Wednesday, 9 May 2012 at 20:41:05 UTC, Adam Wilson wrote:
 Except that there is a distinct need for the DRuntime as a shared  
 library.

That doesn't really matter - you can deploy as a shared library and still use full source as the interface file. Hell, that's what putting implementations in the .di file does anyway!

Sure, but a lot of software developers, particularly those with money, don't want their source getting out, and in a lot of cases, there is no good reason to distribute the source. There are also a bunch of cases where you don't even want something to be CTFEable, like Walter's example of the GC on a different thread. Why would you ever want to CTFE the GC? Until D starts to see some serious usage in business, it's never going to get out of "toy"/"hobby" language status in the eyes of the developer community at large. Few businesses want to release their source. DI's as a complete source file are a non-starter to that large segment of the development world. Improving DI generation is just taking down another barrier to D usage by that group of people. -- Adam Wilson IRC: LightBender Project Coordinator The Horizon Project http://www.thehorizonproject.org/
May 09 2012
prev sibling next sibling parent "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Wednesday, 9 May 2012 at 22:07:45 UTC, Adam D. Ruppe wrote:
 On Wednesday, 9 May 2012 at 20:41:05 UTC, Adam Wilson wrote:
 Except that there is a distinct need for the DRuntime as a  
 shared library.

That doesn't really matter - you can deploy as a shared library and still use full source as the interface file. Hell, that's what putting implementations in the .di file does anyway!

Wow, that's what my post was basically asking about...
May 09 2012
prev sibling next sibling parent "foobar" <foo bar.com> writes:
On Wednesday, 9 May 2012 at 22:07:45 UTC, Adam D. Ruppe wrote:
 On Wednesday, 9 May 2012 at 20:41:05 UTC, Adam Wilson wrote:
 Except that there is a distinct need for the DRuntime as a 
 shared library.

That doesn't really matter - you can deploy as a shared library and still use full source as the interface file. Hell, that's what putting implementations in the .di file does anyway!

Not if his product is closed source and for business reasons the code can't be released. That was the entire premise of using .di files in the first place.
May 09 2012
prev sibling next sibling parent "Adam Wilson" <flyboynw gmail.com> writes:
On Wed, 09 May 2012 15:03:21 -0700, Era Scarecrow <rtcvb32 yahoo.com>  
wrote:

 On Wednesday, 9 May 2012 at 19:27:19 UTC, Adam Wilson wrote:
 The problem is thus: CTFE requires that any function that it  could  
 possibly evaluated by CTFE, must retain it's  implementation.  
 Unfortunately, there is simply no way for the  DI generation system to  
 know which functions are capable of  being called by CTFE and which  
 ones actually are.

 This limitation is due to the fact that DI generation must be  run  
 before Semantic Analysis because said analysis may perform  significant  
 rewrites of the AST. There is even a large (for  DMD) comment in the  
 main function of DMD explaining that DI  generation should not be moved  
 from where it is due to the  inconsistencies that could arise.

 The patch I created currently fails in the autotester because  the  
 template function dur() in the druntime is called via CTFE  from  
 Phobos. Per the community agreed upon DI rules, the  function  
 implementation of the Duration constructor that is  called by the dur()  
 function is stripped away and CTFE fails.

 We as a community need to decide how important these two  features are.  
 Here are the Pro's of each feature as I see it. I  encourage you to add  
 to this list and debate the merits of each.

 Pro's for DI:
 Shared libraries are useless without proper header-style interfaces to  
 the code.
 Can reduce compile time.
 Required by business so as not to share the entire code-base of their  
 product.

 Pro's of CTFE:
 Makes writing certain types of otherwise complicated code simple.
 Very useful to systems programmers.

 By my view of it, lack of DI is a major blocker to any business
 looking to use D, including mine, and CTFE is just
 "nice-to-have". And I would argue that if D wants to see any real
 usage above the 0.3% it got on the May TIOBE index, it needs serious
 business investment in using the language. My company would love to
 use D, but we can't because we don't want to release our entire
 code-base; hence my work on the DI generation patch. I would suggest
 to you that almost every business looking at D is going to find the
 current DI situation ... untenable.

Perhaps I missed something as I'm reading this. Why would this be such a big deal? As I understand it, some of this comes from the fact that D couldn't compile to libraries (if that's different now I am not sure, I haven't kept up with all the updates), so everything in Phobos is distributed as source.

Theoretically D can compile Shared Libraries now. Which means that DI files are going to be more useful than ever.
   If we can't compile to a callable library (static or dynamic) for a  
 while and can't use CTFE on non-source, then the problem is more  
 explicitly present and either needs a workaround or some type of  
 convention.

CTFE cannot currently call a function without its source.
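
To make that concrete, here is a minimal sketch. The names `square` and `cube` are made up for illustration, and `cube` stands in for any function whose body a .di file has stripped; nothing here is taken from druntime or Phobos.

```d
// Body available: the CTFE interpreter can walk it.
int square(int x) { return x * x; }

// Declaration only, as a stripped .di file would present it.
// (Hypothetical function; its body would live in a compiled library.)
int cube(int x);

// Initializing an enum forces compile-time evaluation.
enum sixteen = square(4);

// Uncommenting the next line should be rejected by the compiler,
// because there is no body of cube for the interpreter to run:
// enum twentySeven = cube(3);

static assert(sixteen == 16);

void main() {}
```

The declaration of `cube` is never called at run time, so the sketch links on its own.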
   However IF we can compile to libraries and those compiled libraries
 are exported out with the .di files (I'd personally require the .di file
 information also be part of the library as a public string, so you can't
 mix up wrong versions of .di files, which you can then extract) why then
 is that a problem? The binary execution code is already available and we
 should be able to call it through the compiler as long as the interfaces
 are used properly. I see this only as a partial problem being as the
 compiler is written in C++ and not D.


   Of course there are security issues: if a module harbored a virus and
 using CTFE or calling those functions unleashed it, assuming the program
 had permissions to do any damage... At which time the compiler would
 need very low permissions (or its own UID) allowed to run so in those
 cases it could crash gracefully...

   Perhaps I'm just rambling now..

--
Adam Wilson
IRC: LightBender
Project Coordinator
The Horizon Project
http://www.thehorizonproject.org/
May 09 2012
"Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Wednesday, 9 May 2012 at 22:16:17 UTC, Adam Wilson wrote:
 On Wed, 09 May 2012 15:03:21 -0700, Era Scarecrow 
 <rtcvb32 yahoo.com> wrote:
  Why would this be such a big deal? As I understand it some of 
 this comes from D couldn't compile to libraries (if that's 
 different now I am not sure, haven't kept up with all the 
 updates) so everything in phobos is distributed as source.

Theoretically D can compile Shared Libraries now. Which means that DI files are going to be more useful than ever.
  If we can't compile to a callable library (static or dynamic) 
 for a while and can't use CTFE on non-source, then the problem 
 is more explicitly present and either needs a workaround or 
 some type of convention.

CTFE cannot currently call a function without its source.

Currently? If it can later the problem goes away...
May 09 2012
Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 09/05/12 23:10, Adam Wilson wrote:
 On Wed, 09 May 2012 14:04:17 -0700, Joseph Rushton Wakeling
 <joseph.wakeling webdrake.net> wrote:

 On 09/05/12 22:53, Adam Wilson wrote:
 Complete Source DI's are a blocking bug for a significant chunk of the
 software development world.

Has this been a blocking issue for Python?

Do companies regularly release python code to end-users?

OK, OK, you can release Python compiled to bytecode. JavaScript, then. You _have_ to pass the browser the full source. Has that stopped zillions of proprietary web applications?
May 09 2012
"Nick Sabalausky" <SeeWebsiteToContactMe semitwist.com> writes:
My take, FWIW:

1. DI is only useful for those anachronistic corporations who believe in 
code-hiding (and even then, only the ones who release libs), which 
regardless of everything else, isn't even *realistic* anyway - there's 
always reverse-engineering, and with the super-popular JS there *IS NO* 
pre-compiled form, and yet non-OSS companies *still* get by just fine 
anyway. If you're relying on the increasingly-irrelevant practice of 
code-hiding (which there is *no such thing* - only obfuscation, which is 
exactly what compiling does, it only obfuscates the source, it doesn't hide 
it), then you need to accept that there *are* going to be things you will 
*never* be able to do, period, like virtual templates (which *are* possible 
in theory if all the source is available, even if D doesn't currently allow 
it).

2. We should be seriously looking into the idea of making CTFE work by 
executing already-compiled code, a la Nemerle (but without needing the extra 
build step). There may be enough technical hurdles involved to hold this 
back for [the still-hypothetical] D3, but it should at least be a direction 
we should be seriously considering. (Unless someone can already come up with 
a deal-breaking reason now.) Actually, there are *FAR* more important things 
than this right now, like a solid ARM-tablet toolchain, so this should 
definitely just be an "on hold for now" feature.
May 09 2012
"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Wed, May 09, 2012 at 06:17:41PM -0400, Nick Sabalausky wrote:
 My take, FWIW:
 
 1. DI is only useful for those anachronistic corporations who believe
 in code-hiding (and even then, only the ones who release libs), which
 regardless of everything else, isn't even *realistic* anyway - there's
 always reverse-engineering, and with the super-popular JS there *IS
 NO* pre-compiled form, and yet non-OSS companies *still* get by just
 fine anyway. If you're relying on the increasingly-irrelevant practice
 of code-hiding (which there is *no such thing* - only obfuscation,
 which is exactly what compiling does, it only obfuscates the source,
 it doesn't hide it), then you need to accept that there *are* going to
 be things you will *never* be able to do, period, like virtual
 templates (which *are* possible in theory if all the source is
 available, even if D doesn't currently allow it).

This is why I kept proposing that .di's should have zero implementation. ZERO. No function bodies, no template bodies, NOTHING except API information. All of the implementation stuff should be compiled into some intermediate form and stuck into special sections in the object file that the compiler understands. For example, it can be a serialized AST of the corresponding source. When you import the module, the compiler automatically looks up the corresponding section in the precompiled library and gets whatever info it needs (template bodies, CTFE function bodies, whatever).

Yes such a thing can be reverse-engineered, but that is no different from distributing your binary in the first place (someone determined enough to steal your code will be able to reverse-engineer even the most sophisticated obfuscations you apply to your code -- if the CPU can run the code, it can be reverse-engineered).

It really is just a matter of deterring the casual shoulder-peekers from peeking at your "precious" code. Code in the form of ASTs stored in the compiled library should be deterrent enough -- anyone that actually bothers to reverse engineer that is determined enough that you will not be able to stop him no matter what you do anyway.
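
For context, a rough sketch of what the DI rules described in the first post keep in an interface file today. The function names are hypothetical, and the layout is only my reading of those rules, not actual DMD output. Under the zero-implementation proposal, the last two bodies would also leave the .di and live in the object file instead.

```d
// Ordinary function: only the signature is kept, the body stays in the
// compiled library. (Never called here, so this sketch links on its own.)
int area(int w, int h);

// auto function: the return type is inferred from the body, so it is kept.
auto half(int x) { return x / 2.0; }

// Template: instantiated inside the importing module, so it is kept.
T twice(T)(T x) { return x + x; }

void main()
{
    assert(half(3) == 1.5);
    assert(twice(21) == 42);
}
```

A template or auto function with no body cannot be written as plain D source, which is why the proposal needs a separate storage format at all.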
 2. We should be seriously looking into the idea of making CTFE work by
 executing already-compiled code, a la Nemerle (but without needing the
 extra build step). There may be enough technical hurdles involved to
 hold this back for [the still-hypothetical] D3, but it should at least
 be a direction we should be seriously considering. (Unless someone can
 already come up with a deal-breaking reason now.) Actually, there's
 *FAR* more important things than this right now, like a solid
 ARM-tablet toolchain, so this should definitely just be an "on hold
 for now" feature.

+1. You do have the problem of what to do in a cross-compiler, though.

T

--
The peace of mind---from knowing that viruses which exploit Microsoft system vulnerabilities cannot touch Linux---is priceless. -- Frustrated system administrator.
May 09 2012
"Nick Sabalausky" <SeeWebsiteToContactMe semitwist.com> writes:
"H. S. Teoh" <hsteoh quickfur.ath.cx> wrote in message 
news:mailman.489.1336603453.24740.digitalmars-d puremagic.com...
 On Wed, May 09, 2012 at 06:17:41PM -0400, Nick Sabalausky wrote:
 My take, FWIW:

 1. DI is only useful for those anachronistic corporations who believe
 in code-hiding (and even then, only the ones who release libs), which
 regardless of everything else, isn't even *realistic* anyway - there's
 always reverse-engineering, and with the super-popular JS there *IS
 NO* pre-compiled form, and yet non-OSS companies *still* get by just
 fine anyway. If you're relying on the increasingly-irrelevant practice
 of code-hiding (which there is *no such thing* - only obfuscation,
 which is exactly what compiling does, it only obfuscates the source,
 it doesn't hide it), then you need to accept that there *are* going to
 be things you will *never* be able to do, period, like virtual
 templates (which *are* possible in theory if all the source is
 available, even if D doesn't currently allow it).

This is why I kept proposing that .di's should have zero implementation. ZERO. No function bodies, no template bodies, NOTHING except API information. All of the implementation stuff should be compiled into some intermediate form and stuck into special sections in the object file that the compiler understands. For example, it can be a serialized AST of the corresponding source. When you import the module, the compiler automatically looks up the corresponding section in the precompiled library and gets whatever info it needs (template bodies, CTFE function bodies, whatever). Yes such a thing can be reverse-engineered, but that is no different from distributing your binary in the first place (someone determined enough to steal your code will be able to reverse-engineer even the most sophisticated obfuscations you apply to your code -- if the CPU can run the code, it can be reverse-engineered). It really is just a matter of deterring the casual shoulder-peekers from peeking at your "precious" code. Code in the form of ASTs stored in the compiled library should be deterrent enough -- anyone that actually bothers to reverse engineer that is determined enough that you will not be able to stop him no matter what you do anyway.

There's no need for all that.

The whole point here is "Compile to some obfuscated form" right? So just make/use a good code obfuscator. Done. Problem solved.

Inventing an AST storage format just to obfuscate some code is unnecessary overkill (although maybe it might have some other use). This "just use an obfuscator" approach even makes the whole DI system become totally redundant (except for binding to C code, of course).
May 09 2012
"Nick Sabalausky" <SeeWebsiteToContactMe semitwist.com> writes:
"Adam Wilson" <flyboynw gmail.com> wrote in message 
news:op.wd2prcc4707hn8 invictus.skynet.com...
 I actually agree with you, I'm just telling you what I hear from PHB's.

I was just kinda rambling anyway ;) Not directed at any particular poster.
 We need some way to export the symbols without the underlying code, it 
 makes for faster compile times and having the API handy can be useful to 
 development tools.
 However, my experience with PHB's is that as long as you don't send out 
 the actual source files but some form of sanitized header, the PHB's don't 
 really care beyond that.
 That's why I think embedding a version of the source D files that has been 
 semantically analyzed could be helpful, you can pull in the source for 
 CTFE as needed, but the only thing you have to actually ship out is the 
 library file itself, it just happens to have source files inside. In my 
 experience in the .NET world, this is good enough for the PHB's. Out of 
 sight, out of mind as they say. So what if it's trickery, we developers 
 get a benefit too: we don't have to wrangle include files.

Well, if that works for the PHBs, then it works for me (Hmm...Never thought I'd say something like that ;) )

Thinking about it more, I suppose it's debatable whether a PHB-compliant obfuscator or a lib-with-embedded-source would be easier to implement and deal with.
May 09 2012
deadalnix <deadalnix gmail.com> writes:
On 10/05/2012 16:51, H. S. Teoh wrote:
 Yeah I never understood the reasoning behind treating corporations as
 legal entities on the same level as persons. That simply makes no sense
 on so many levels it's not even funny.

I would believe that lie the day a company goes to prison. Or is sentenced to death where it is legal.
May 10 2012
"Nick Sabalausky" <SeeWebsiteToContactMe semitwist.com> writes:
"Victor Vicente de Carvalho" <victor.v.carvalho gmail.com> wrote in message 
news:tjlsibfmwcpuvkvsncnc forum.dlang.org...
 I think you're missing the point here. Many companies ship their code as 
 .so's and java .classes and .NET clr not for obfuscation, but for 
 copyright. They spent a lot of money using their employees time to 
 generate valuable code and don't want to share it in a fashion that would 
 make it easy to copy/embed/integrate with their competitor. Plain and 
 simply. And fair and reasonably, from my point of view.

And obfuscation is their method of copyright protection, so everything I said still applies.
May 10 2012
deadalnix <deadalnix gmail.com> writes:
On 10/05/2012 22:52, H. S. Teoh wrote:
 On Thu, May 10, 2012 at 03:39:47PM -0400, Nick Sabalausky wrote:
 [...]
 The only thing that can come close to being uncrackable is something
 that's so hard to use that most people wouldn't bother with it.
 Which gave me a funny thought: one way of writing code that nobody
 can steal is to write it in MALBOLGE
 (http://www.lscheffer.com/malbolge.shtml).

Geez that's insane! And I thought brainfuck was nuts.

Welcome to the weird and wonderful world of esolangs. :-) BF was just the beginning of the madness.

INTERCAL is an amusing esolang in which you have to insert adequate amounts of the PLEASE command, otherwise you risk instant program termination. Then there's Java2K, which is a probabilistic language, in the sense that all operations only have a 90% chance of actually returning the correct result -- so the challenge is how to write code in such a way that the probability of your program producing the desired output is as close to 100% as you can get (it's not possible to guarantee 100% correctness... but then, what programming language can guarantee that? :-P).

There's also Befunge and its spawn, in which the program counter is a vector rather than a counter, and program code is written on an n-dimensional grid. The neat thing about Befunge is that there's no special syntax for comments: you just write comments in-place and route your PC around the comment text. :-) (Clever programmers could, in theory, route certain program paths through comments to ensure that comments are always up-to-date with the code. :-P)

In fact, I myself wrote a Befunge-like esolang once (when I was really, really, REALLY bored). My esolang is even weirder than Befunge in that it doesn't even have a program counter. Instead, the code consists of symbols written in a 4-dimensional array, and every iteration, every element of the array is "executed". Some symbols are passive (they don't do anything when "executed") and other symbols are active: they erase themselves from the current location and write themselves into a neighbouring location in the array. Based on the symbol in the target array cell, different things happen.

The result is a bizarre physics-like simulation where active symbols (aka particles, representing binary 1 and 0) fly around in a 4D array, get reflected/duplicated by reflectors, and trigger output by simultaneously striking the output symbol in groups of 8, representing 8 bits of the output character (now you know why I chose a 4D array: in 4D, there are exactly 8 neighbours surrounding each cell). The hello world program, for example, looks like this:

..%.. ..}.. ..%.. ..... ..... ..'.. .v... .. v. ..... ..,.. ..-.. ..... %)%.. .v .. ..,.. ..... ..v.. '.... ...V. ..,.. ..... .... %(<.. .... ..%.. .. .. .+... ..V,. ..... .... .}... )!<]. .{)!< .+.{. ..... ..... .'+.. ]v-. )!<+. .{... ..].. .)!<. ..{]. .-)!< ...{. ..... ..-.. ..... ...-. ..v.. ..%.. .. .. ..%.. ..... .>))% ..+.. .^.v. .V+A. ..... ....+ % %.. }.].. %)!(% .^{V[ ..% % '.... .V^.. ..-.. ...A. ..'.. %((.. .... ..%.. .. .. ..%.. ..A.. ...,. .,... ..... .... ...]. .])!( )!<{. .[.+. ..... .... .,.'. -^]. ..)!< .^{. .]... )!<.. .[... ...-. ..... ..... .'... ..A.. ..... .. .. ..%.. ..[.. ..{.. ..... ..... ..... ...^. .A... ..... ..... %<(.. }.... %)%.. ..A. ..... .. .. .^... .. .. ..... ..... ))%.. ..].. ..%.. ..... .....

This program is written on a 5x5x5x5 array, which is represented in ASCII as a 5x5 grid of slanted 5x5 grids (thus covering all 4 dimensions). Why slanted? 'cos the perspective makes it clearer (*ahem*cough*) that they're slanting into the 3rd dimension.

Granted, though, MALBOLGE takes the cake in terms of difficulty of implementation. My esolang is just a matter of routing and synchronizing the movement of particles so that they strike the output operator at the right times to produce the desired sequence of ASCII characters. MALBOLGE, OTOH, is so insanely twisted that no amount of rationalization will ever give you a working model of mapping desired semantics to code.

T
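
The "exactly 8 neighbours in 4D" remark can be checked with a small sketch. It assumes a neighbour means a cell one step away along exactly one axis; the helper name `axisNeighbours` is made up here and has nothing to do with the esolang's real implementation, which is not shown in this thread.

```d
import std.stdio;

// Returns the cells that differ from `cell` by exactly one step
// along exactly one of the four axes.
int[4][] axisNeighbours(int[4] cell)
{
    int[4][] result;
    foreach (axis; 0 .. 4)
    {
        foreach (step; [-1, 1])
        {
            int[4] n = cell;   // static arrays copy by value
            n[axis] += step;
            result ~= n;
        }
    }
    return result;
}

void main()
{
    // Centre cell of a 5x5x5x5 grid.
    auto ns = axisNeighbours([2, 2, 2, 2]);
    writeln(ns.length); // prints 8: two directions on each of four axes
}
```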

Almost perfect. One of my favorites is French: goto++ http://www.gotopp.org/ Everything is made of goto.

INTERCAL has the funniest comment system ever: all incorrect statements are comments.
May 10 2012
"Nick Sabalausky" <SeeWebsiteToContactMe semitwist.com> writes:
From: "deadalnix" <deadalnix gmail.com>
 On 10/05/2012 22:52, H. S. Teoh wrote:
 On Thu, May 10, 2012 at 03:39:47PM -0400, Nick Sabalausky wrote:
 [...]
 The only thing that can come close to being uncrackable is something
 that's so hard to use that most people wouldn't bother with it.
 Which gave me a funny thought: one way of writing code that nobody
 can steal is to write it in MALBOLGE
 (http://www.lscheffer.com/malbolge.shtml).

Geez that's insane! And I thought brainfuck was nuts.

Welcome to the weird and wonderful world of esolangs. :-) BF was just the beginning of the madness.


I love some of the brainfuck variants, in particular fuckfuck ( http://esolangs.org/wiki/Fuckfuck sample code: https://bitbucket.org/Abscissa/duckfuck/src/ba63ca55332a/hello.ff ) and whitespace ( http://compsoc.dur.ac.uk/whitespace/ Ok, maybe not exactly a bf equivalent, but similar)
 INTERCAL is an amusing esolang in which
 you have to insert adequate amounts of the PLEASE command, otherwise you
 risk instant program termination. Then there's Java2K, which is a
 probabilistic language, in the sense that all operations only have a 90%
 chance of actually returning the correct result [...]

 There's also Befunge and its spawn, in which the program counter is a
 vector rather than a counter, and program code is written on an
 n-dimensional grid. The neat thing about Befunge is that there's no
 special syntax for comments: you just write comments in-place and route
 your PC around the comment text. :-) [...]


Heh, cool :)
 In fact, I myself wrote a Befunge-like esolang once (when I was really,
 really, REALLY bored). My esolang is even weirder than Befunge in that
 it doesn't even have a program counter. Instead, the code consists of
 symbols written in a 4-dimensional array, and every iteration, every
 element of the array is "executed". Some symbols are passive (they don't
 do anything when "executed") and other symbols are active: they erase
 themselves from the current location and write themselves into a
 neighbouring location in the array. Based on the symbol in the target
 array cell, different things happen.


That's really cool.
 [...]MALBOLGE, OTOH, is so insanely twisted that no amount of 
 rationalization
 will ever give you a working model of mapping desired semantics to code.


Yup
 Almost perfect. One of my favorites is French: goto++ 
 http://www.gotopp.org/

Heh, for me, dealing with French code/docs would add to the challenge. I know only a handful of words that aren't food-related ;)
 Everything is made of goto. INTERCAL has the funniest comment system ever: 
 all incorrect statements are comments.

Sounds like ActionScript 2. Yes. Seriously. It just ignores any statement that would normally be a runtime error (and IIRC, there's not much in the way of semantic errors either). It's a goddamn *accidental* esolang.

Come to think of it, "accidental esolang" also describes PHP. (And I'm not trying to be tongue-in-cheek, I genuinely mean that.)
May 10 2012
Paulo Pinto <pjmlp progtools.org> writes:
On 10.05.2012 01:01, Adam Wilson wrote:
 On Wed, 09 May 2012 15:55:36 -0700, Mehrdad <wfunction hotmail.com> wrote:

 I am 100% for this. It would be very .NET like. In fact I'm curious
 enough what it would take to make this work that I could see myself
 trying. My guess is that it needs a new linker with the glorious
 side-effect of dumping optlink! In that case it would mean upgrading
 the D backend to emit COFF (ELF and Mach-O already support custom
 sections), which I am fine with trying to do. Then you would add your
 AST or other intermediate representations to a custom section in the
 object file and the linker could then link it in. D would then need a
 way to extract said information. Which would not be terribly hard.
 Except that you'll have to train other compilers how to read that IR.
 Maybe we could train D to read the LLVM IR?

:O I was writing a response pretty much exactly like this (i.e. doing what .NET does), but then I dumped it, thinking it'd be dismissed as too huge of a change...

:-D This isn't the first time it's been suggested in recent forum history. I think there is a significant body of support for making D libraries single file with no import files, it solves a *TON* of issues around how to import API's. I imagine that it's much the same reason .NET went with their metadata plan. And ended up where we are suggesting to go.

And also possible in languages like Turbo Pascal, Delphi or, more recently, Go. Actually, this is one of the features I really like in Go.

--
Paulo
May 10 2012
Paulo Pinto <pjmlp progtools.org> writes:
On 10.05.2012 20:37, Adam Wilson wrote:
 On Thu, 10 May 2012 11:24:04 -0700, Paulo Pinto <pjmlp progtools.org>
 wrote:

 On 10.05.2012 01:01, Adam Wilson wrote:
 On Wed, 09 May 2012 15:55:36 -0700, Mehrdad <wfunction hotmail.com>
 wrote:

 I am 100% for this. It would be very .NET like. In fact I'm curious
 enough what it would take to make this work that I could see myself
 trying. My guess is that it needs a new linker with the glorious
 side-effect of dumping optlink! In that case it would mean upgrading
 the D backend to emit COFF (ELF and Mach-O already support custom
 sections), which I am fine with trying to do. Then you would add your
 AST or other intermediate representations to a custom section in the
 object file and the linker could then link it in. D would then need a
 way to extract said information. Which would not be terribly hard.
 Except that you'll have to train other compilers how to read that IR.
 Maybe we could train D to read the LLVM IR?

:O I was writing a response pretty much exactly like this (i.e. doing what .NET does), but then I dumped it, thinking it'd be dismissed as too huge of a change...

:-D This isn't the first time it's been suggested in recent forum history. I think there is a significant body of support for making D libraries single file with no import files, it solves a *TON* of issues around how to import API's. I imagine that it's much the same reason .NET went with their metadata plan. And ended up where we are suggesting to go.

And also possible in languages like Turbo Pascal, Delphi or more recent, Go. Actually, this is one of the features I really like in Go. -- Paulo

I am seriously considering starting this type of project given how strong the support for it is. However, I'd need help. Linkers aren't easy and the modifications that DMD will require are even worse. In the end we get a modern linker, written in D, and COFF support for DMD. At least that's how it goes in my head. I am thinking of kicking off the project proposal with a more detailed post later today.

Thanks to open source, here is some information on how Free Pascal stores the required information: http://www.freepascal.org/docs-html/prog/progap1.html#progse67.html

There is also some Oberon documentation on the ETHZ web sites, but those compilers usually spit out an extra .sym file, which could easily be embedded in the object file anyway.

For Go, I am afraid the only current information on how they store the symbol table is "read the code".

--
Paulo
May 10 2012
"Mehrdad" <wfunction hotmail.com> writes:
 I am 100% for this. It would be very .NET like. In fact I'm 
 curious enough what it would take to make this work that I 
 could see myself trying. My guess is that it needs a new linker 
 with the glorious side-effect of dumping optlink! In that case 
 it would mean upgrading the D backend to emit COFF (ELF and 
 Mach-O already support custom sections), which I am fine with 
 trying to do. Then you would add your AST or other intermediate 
 representations to a custom section in the object file and the 
 linker could then link it in. D would then need a way to 
 extract said information. Which would not be terribly hard. 
 Except that you'll have to train other compilers how to read 
 that IR. Maybe we could train D to read the LLVM IR?

:O I was writing a response pretty much exactly like this (i.e. doing what .NET does), but then I dumped it, thinking it'd be dismissed as too huge of a change...
May 09 2012
"Nick Sabalausky" <SeeWebsiteToContactMe semitwist.com> writes:
"Adam Wilson" <flyboynw gmail.com> wrote in message 
news:op.wd2beab6707hn8 apollo.hra.local...
 On Wed, 09 May 2012 15:17:41 -0700, Nick Sabalausky 
 <SeeWebsiteToContactMe semitwist.com> wrote:

 My take, FWIW:

 1. DI is only useful for those anachronistic corporations who believe in
 code-hiding (and even then, only the ones who release libs), which
 regardless of everything else, isn't even *realistic* anyway - there's
 always reverse-engineering, and with the super-popular JS there *IS NO*
 pre-compiled form, and yet non-OSS companies *still* get by just fine
 anyway. If you're relying on the increasingly-irrelevant practice of
 code-hiding (which there is *no such thing* - only obfuscation, which is
 exactly what compiling does, it only obfuscates the source, it doesn't 
 hide
 it), then you need to accept that there *are* going to be things you will
 *never* be able to do, period, like virtual templates (which *are* 
 possible
 in theory if all the source is available, even if D doesn't currently 
 allow
 it).

Anachronistic or not, MANY companies still require it. And JS is not exactly D; they attack two very different segments. And most companies don't put anything of intellectual value in JS. But I'm not here to argue the morality of the point, only that the DI generation issue will stop a lot of groups from using D.

My random ranting made it unclear, but my main point was that if a company requires their libs be distributed in binary-only form - for *whatever* reason - then they MUST accept that there will be things they *can't* do. Note that's *not* merely some policy I'm proposing that D take - it's hard, immutable reality.
May 09 2012
"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Wed, May 09, 2012 at 08:06:24PM -0400, Nick Sabalausky wrote:
 "H. S. Teoh" <hsteoh quickfur.ath.cx> wrote in message 
 news:mailman.489.1336603453.24740.digitalmars-d puremagic.com...

 This is why I kept proposing that .di's should have zero
 implementation.  ZERO. No function bodies, no template bodies,
 NOTHING except API information. All of the implementation stuff
 should be compiled into some intermediate form and stuck into
 special sections in the object file that the compiler understands.
 For example, it can be a serialized AST of the corresponding source.
 When you import the module, the compiler automatically looks up the
 corresponding section in the precompiled library and gets whatever
 info it needs (template bodies, CTFE function bodies, whatever).


 There's no need for all that.
 
 The whole point here is "Compile to some obfuscated form" right? So
 just make/use a good code obfuscator. Done. Problem solved.
 
 Inventing an AST storage format just to obfuscate some code is
 unnecessary overkill (although maybe it might have some other use).
 This "just use an obfuscator" approach even makes the whole DI system
 become totally redundant (except for binding to C code, of course).

This is an interesting idea. Probably more feasible than mine.

You don't necessarily have to throw away the DI system; some people may sleep better at night if their proprietary algorithms are in binary executable form only (though personally I think that's just self delusion, but who am I to judge?). So you just take the existing .di files, complete with all their warts and function bodies and whatever, and run an obfuscator on them. Ship the .di and your shared library as usual. Problem solved.

Plus, all of this is already possible with the current system, except for the only missing piece of a D obfuscator.

(Wait, I hear you cry. But what about the library API? How would users know how to use the library if the .di is incomprehensible? As somebody pointed out in another thread: just ship the ddocs generated from the unobfuscated source with the library. Users don't need to read the .di. Problem solved.)

T

--
IBM = I'll Buy Microsoft!
May 09 2012
"Nick Sabalausky" <SeeWebsiteToContactMe semitwist.com> writes:
"H. S. Teoh" <hsteoh quickfur.ath.cx> wrote in message 
news:mailman.510.1336610145.24740.digitalmars-d puremagic.com...
 On Wed, May 09, 2012 at 08:06:24PM -0400, Nick Sabalausky wrote:
 There's no need for all that.

 The whole point here is "Compile to some obfuscated form" right? So
 just make/use a good code obfuscator. Done. Problem solved.

 Inventing an AST storage format just to obfuscate some code is
 unnecessary overkill (although maybe it might have some other use).
 This "just use an obfuscator" approach even makes the whole DI system
 become totally redundant (except for binding to C code, of course).

This is an interesting idea. Probably more feasible than mine. You don't necessarily have to throw away the DI system; some people may sleep better at night if their proprietary algorithms are in binary executable form only (though personally I think that's just self delusion, but who am I to judge?).

An ambassador of sanity, that's who ;) It's not just your personal opinion, it's hard fact: From a reverse-engineering standpoint, executable binary form *is* nothing more than obfuscation (except perhaps if it's an *encrypted* binary form, but I've never heard of a lib that did that, heck it would require special toolchain support anyway). Believing binary libs are more secure than that is just simply incorrect, period, opinion doesn't enter into it. 2+2 *is* 4 whether you believe it or not. Life isn't looney tunes, you don't walk on air just because nobody taught you gravity. Etc. Fuck if you want to steal something, you don't even need any source - obfuscated or not. Commercial games, for example, never release *any* source. *Only* the final binaries are distributed, and even *those* are usually encrypted. And yet they *still* get pirated like crazy. So the source is locked away - fat lot of fucking good THAT did! (Ok, so it's harder to make an unauthorized modification, who the hell cares - the *original* is *already* out there getting ripped off, plus why would deviants wanna waste time modding when they can just sell bootlegs?) So source vs binary doesn't make a damn bit of difference, period - if all you have is the binary, well, to use it you just *run* it! You don't need *any* sources to use it. You just use it. The only thing that can even make any difference is encryption (which still isn't truly "secure"). And for that matter, nobody's algorithms are proprietary. Code is proprietary. 99.9999999...9999999% of algorithms are not. For example, wrapping some action in a foreach to make a batch processor and adding an option box to enable it is not a fucking proprietary algorithm no matter what the suits and the subhuman USPTO fuckwads think. Real-world example: There isn't a fucking thing proprietary in Marmalade's MKB build system (it's a stinking *build system* for fucks sake!). 
And even for any algos that are proprietary, if such algos even exist...well, why bother trying to get the source? If you've got the binary already, just *USE* it! Who cares about the damn source? If I wanted to give someone access to Marmalade's MKB build system, the fact that half of it is distributed as pyc-only would do jack shit to stop me. And obviously there's no proprietary algos in there, again, it's just a fucking build system. So there's no algorithms to steal. The *only* thing it does is make it impractical for me to work around any problems I encounter. Oh yea, and it gives Marmalade's suit-department a big collective stiffy because their mini-monkey brains are telling them they're actually *earning* their paychecks. (Corona's 50x worse though, FWIW. You don't even *have* their software, you just rent the right to send your project to them and have *them* build it for you.) Excess offtopic ranting aside, *some* things are opinion: "Red is the best color" <-- That's an opinion. "It is/isn't worthwhile to keep the source locked up." <-- Even *that's* an opinion, too, note the vague "worthwhile". But merely having different viewpoints doesn't make something opinion. Either it's opinion or it isn't. You're not stating mere opinion here - you're stating honest-to-goodness FACT: Considering well-obfuscated source less secure than compiled binary form *is* delusion, period.
 So you just take the existing .di
 files, complete with all their warts and function bodies and whatever,
 and run an obfuscator on them. Ship the .di and your shared library as
 usual.  Problem solved.

Or just skip the di entirely. It'll all just get obfuscated one way or the other, so there's not much point.
 Plus, all of this is already possible with the current system, except
 for the only missing piece of a D obfuscator.

DustMite has some obfuscation capability, although I don't know how extensive it is or whether it would be enough to make the pointy-haired suits happy. (Then again, *anything* can be enough to make a suit happy - you just have to present it in the right salesmany (read: "convoluted and full of shit") way. They'll swallow any amount of bullshit you give them, you just have to make it *sound* good. Suits don't know the difference. Fuck, most of them don't even know there *is* a difference. That's why salesmen exist - because bullshitting WORKS on suits (and on many others). In fact, most of the time, it's the *only* thing that works on suits. Bullshit is the only language those fuckers speak and understand.)
 (Wait, I hear you cry. But what about the library API? How would users
 know how to use the library if the .di is incomprehensible?  As somebody
 pointed out in another thread: just ship the ddocs generated from the
 unobfuscated source with the library. Users don't need to read the .di.
 Problem solved.)

Yea, the signatures (not even the definitions) of the symbols which make up the public interface are the only parts that must remain non-obfuscated. But of course, those *still* need to be non-obfuscated even under the old-fashioned C-style "lib + headers" approach. Don't even need any markup to signal these "don't touch" symbols to the obfuscator - just make a series of wrappers in separate files for the public API and tell the obfuscator "don't obfuscate the signatures in files x, y and z."
May 09 2012
"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Wed, May 09, 2012 at 10:24:48PM -0400, Nick Sabalausky wrote:
 "H. S. Teoh" <hsteoh quickfur.ath.cx> wrote in message 
 news:mailman.510.1336610145.24740.digitalmars-d puremagic.com...

 You don't necessarily have to throw away the DI system; some people
 may sleep better at night if their proprietary algorithms are in
 binary executable form only (though personally I think that's just
 self delusion, but who am I to judge?).

An ambassador of sanity, that's who ;) It's not just your personal opinion, it's hard fact: From a reverse-engineering standpoint, executable binary form *is* nothing more than obfuscation (except perhaps if it's an *encrypted* binary form, but I've never heard of a lib that did that, heck it would require special toolchain support anyway).

Encrypted binaries are still reverse-engineerable. Yes it may be very hard, if you use the right encryption algorithms, but the hard and cold fact is, if your CPU can run it, then it's reverse-engineerable. The only thing that can't be reverse-engineered is something that can't run on your CPU.
 Believing binary libs are more secure than that is just simply
 incorrect, period, opinion doesn't enter into it. 2+2 *is* 4 whether
 you believe it or not. Life isn't looney tunes, you don't walk on air
 just because nobody taught you gravity. Etc.

Well, it's not *that* black and white. Technically speaking, door locks are useless because somebody determined enough to break into your house will find a way (smash the lock or find a different entrance), no matter what you do. That doesn't imply that you should just forget about door locks (or doors, for that matter). A door lock isn't secure, technically speaking, but it still keeps out the petty thieves. It doesn't stop the professionals, but then 90% of thieves aren't professional. Ineffective as they are, door locks do still keep out 90% of would-be breakers into your house. So I'd say there's some value to be had there. Binary libs *aren't* secure, if you're talking about ultimate security. (If you want ultimate security, don't distribute your program. Period. Then nobody will be able to reverse engineer it. Problem solved.) But they do stop "petty thieves" from having their fun with it. And I daresay 90% of would-be code thieves aren't competent enough to know how to reverse engineer a binary anyway (since otherwise they'd just write their own code instead of stealing others' code). So it still does give some amount of protection there. Of course, there are a whole lot of other issues with binary-only distributions (read Richard Stallman's biography for poignant examples of that, etc.) -- which is why I don't believe in binary-only distributions. But that doesn't mean some people don't find value in it. [...]
 So source vs binary doesn't make a damn bit of difference, period - if
 all you have is the binary, well, to use it you just *run* it! You
 don't need *any* sources to use it. You just use it. The only thing
 that can even make any difference is encryption (which still isn't
 truly "secure").

Encryption only slows them down. It doesn't stop them if they are determined enough. And sometimes you don't *need* to break the encryption. The fact that the CPU eventually sees the decrypted code is good enough. I've personally traced encrypted bootloaders myself -- by running pieces of them in what's effectively a crude sandbox of sorts, allowing them to decrypt themselves/the subsequent stage and passing control back to me each time, thus alleviating any need of breaking the encryption in the first place. Remember, as long as it can run on your CPU, it can be reverse-engineered. You're just keeping out the petty thieves; determined professionals will break in no matter what you do.
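
A toy sketch of that tracing idea, in Python and nowhere near a real bootloader. Everything in it is made up for illustration: the one-byte XOR "cipher", the stage layout (key byte followed by encrypted payload), and the names `pack_stage`, `run_stage` and `trace`. The only point it tries to show is that the observer never attacks the cipher; it lets each stage decrypt the next one and takes control back after every hop.

```python
def xor(data: bytes, key: int) -> bytes:
    """Toy cipher (an assumption for this sketch): XOR each byte with a one-byte key."""
    return bytes(b ^ key for b in data)


def pack_stage(payload: bytes, key: int) -> bytes:
    """Build a hypothetical stage: one key byte followed by the encrypted payload."""
    return bytes([key]) + xor(payload, key)


def run_stage(stage: bytes) -> bytes:
    """Stand-in for the loader's own code: it must decrypt whatever it runs next."""
    return xor(stage[1:], stage[0])


def trace(stage: bytes, depth: int) -> list:
    """Run the stages one hop at a time, recording what each one decrypts."""
    seen = []
    for _ in range(depth):
        stage = run_stage(stage)  # the stage decrypts its successor for us
        seen.append(stage)        # control is back with the observer here
    return seen


final = b"secret routine"
image = pack_stage(pack_stage(final, 0x5A), 0xC3)  # two nested encrypted stages
stages = trace(image, depth=2)
print(stages[-1])  # b'secret routine'
```

The observer here only ever calls code that ships inside the image itself, which is the whole argument: whatever the CPU needs in order to run the thing is also available to whoever is watching the CPU.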
 And for that matter, nobody's algorithms are proprietary. Code is
 proprietary. 99.9999999...9999999% of algorithms are not. For example,
 wrapping some action in a foreach to make a batch processor and adding
 an option box to enable it is not a fucking proprietary algorithm no
 matter what the suits and the subhuman USPTO fuckwads think.
 Real-world example: There isn't a fucking thing proprietary in
 Marmalade's MKB build system (it's a stinking *build system* for fucks
 sake!).

Well, that's a different kettle o' fish. There are a lot of idiotic patents out there (blame the PTO, blame the system, blame incompetent employees, blame whatever). Personally, I hate the system, but there are companies whose livelihood depends on keeping their l'il precious algos safe under the covers. (Even if it's something known for 20 years in the industry save to the one programmer of questionable repute who re-invented it (poorly) under the auspices of said company.) I don't believe in that kind of business model, but unfortunately many people do. It's a sad fact that in this day and age, patent-squatting is a widespread practice in the IT sector. I've even heard that some investors consider patent portfolio to be an important factor in a company's value -- i.e., the more patents you hold, the more valuable you are. Yeah, life sucks. Deal with it.
 [...] *Only* thing it does is make it impractical for me to work
 around any problems I encounter.

Yeah, it doesn't solve the theft problem and screws over your customers. What else is new in the corporate world?

T

-- 
To provoke is to call someone stupid; to argue is to call each other stupid.
May 09 2012
"Nick Sabalausky" <SeeWebsiteToContactMe semitwist.com> writes:
From: "H. S. Teoh" <hsteoh quickfur.ath.cx>
 On Wed, May 09, 2012 at 10:24:48PM -0400, Nick Sabalausky wrote:

 Believing binary libs are more secure than that is just simply
 incorrect, period, opinion doesn't enter into it. 2+2 *is* 4 whether
 you believe it or not. Life isn't looney tunes, you don't walk on air
 just because nobody taught you gravity. Etc.

Well, it's not *that* black and white. Technically speaking, door locks are useless because somebody determined enough to break into your house will find a way (smash the lock or find a different entrance), no matter what you do. That doesn't imply that you should just forget about door locks (or doors, for that matter). A door lock isn't secure, technically speaking, but it still keeps out the petty thieves. It doesn't stop the professionals, but then 90% of thieves aren't professional. Ineffective as they are, door locks do still keep out 90% of would-be breakers into your house. So I'd say there's some value to be had there. Binary libs *aren't* secure, if you're talking about ultimate security. [...] Of course, there are a whole lot of other issues with binary-only distributions[...] But that doesn't mean some people don't find value in it.

Right, I just meant specifically "well-obfuscated source" vs "non-encrypted compiled binaries". (And then I started ranting and raving and rambling ;) Hey, the three R's!)
 (read Richard Stallman's biography for poignant examples
 of that, etc.) -- which is why I don't believe in binary-only
 distributions.

I haven't read it (I'm afraid I'll agree with it *so much* that it'll just piss me off too much thinking about closed source and I wouldn't be able to get anything done for the rest of the day ;) ), but I've heard a little about it, and I think I'm pretty much on the same page. Basically, binary-only hurts your customers (for various reasons I won't list here), *really* hurts them if (erm..."when") you ever go under or just lose interest in the product, and it doesn't provide you with nearly as much benefit as it would seem (the binary itself can just be passed around, most people are honest as long as you don't give them reason not to be, and the dishonest people will be dishonest no matter what you do, etc.). Actually, here's a great example of the evils of closed...well, the evils of closed *platforms* which IMNSHO are 100x worse than merely "closed source software": http://www.techdirt.com/articles/20120326/08360818246/patents-threaten-to-silence-little-girl-literally.shtml
 [...]
 So source vs binary doesn't make a damn bit of difference, period - if
 all you have is the binary, well, to use it you just *run* it! You
 don't need *any* sources to use it. You just use it. The only thing
 that can even make any difference is encryption (which still isn't
 truly "secure").

Encryption only slows them down. It doesn't stop them if they are determined enough. And sometimes you don't *need* to break the encryption. The fact that the CPU eventually sees the decrypted code is good enough. I've personally traced encrypted bootloaders myself -- by running pieces of them in what's effectively a crude sandbox of sorts, allowing them to decrypt themselves/the subsequent stage and passing control back to me each time, thus alleviating any need of breaking the encryption in the first place. Remember, as long as it can run on your CPU, it can be reverse-engineered. You're just keeping out the petty thieves; determined professionals will break in no matter what you do.

Oh right, totally agree. Like I was saying, all it can do is make *some* difference (unlike "well-obfuscated source" vs "non-encrypted compiled binaries"), and not actually be truly secure. But there are systems that are a real PITA with encryption though: For example, the RockBox project never did (last I checked) manage to crack the Zune or the particular model of Toshiba Gigabeat the Zune was derived from (the "S" I think), and a big part of that was b/c of some nasty DRM/security measures that were built into the hardware itself, unlike a normal x86 for example. So you couldn't just do some simple man-in-the-middle like you described. Of course, game systems have hardware-level DRM/security too and they always get cracked, but they're much more popular than the Zune ever was (Which is a shame, I would have considered the original Zune 1 (not the shitty second one) to be the world's most perfect music player if it weren't for the Apple-inspired truckload of DRM/lockout bullshit that was involved anytime you wanted it to communicate with a computer). Point being, consumer devices with hardware-level DRM/security fucking suck ;)
 And for that matter, nobody's algorithms are proprietary. Code is
 proprietary. 99.9999999...9999999% of algorithms are not. For example,
 wrapping some action in a foreach to make a batch processor and adding
 an option box to enable it is not a fucking proprietary algorithm no
 matter what the suits and the subhuman USPTO fuckwads think.
 Real-world example: There isn't a fucking thing proprietary in
 Marmalade's MKB build system (it's a stinking *build system* for fucks
 sake!).

Well, that's a different kettle o' fish. There are a lot of idiotic patents out there (blame the PTO, blame the system, blame incompetent employees, blame whatever). Personally, I hate the system, but there are companies whose livelihood depends on keeping their l'il precious algos safe under the covers. (Even if it's something known for 20 years in the industry save to the one programmer of questionable repute who re-invented it (poorly) under the auspices of said company.) I don't believe in that kind of business model, but unfortunately many people do. It's a sad fact that in this day and age, patent-squatting is a widespread practice in the IT sector. I've even heard that some investors consider patent portfolio to be an important factor in a company's value -- i.e., the more patents you hold, the more valuable you are.

Yea, the fact that this shit is even *allowed* to exist in a *cough*"modern"*cough* society makes my blood boil. I know corporations are legal entities, but for sanity (let alone anything as luxurious as justice) to prevail, "corporate entities" must be deemed second-class citizens, at best. Meh, I usually try not to think about it just so I can actually get things done. And then I bitch about it at every opportunity ;)
 Yeah, life sucks. Deal with it.

Life sucks. Suits and corporations suck worse.
 [...] *Only* thing it does is make it impractical for me to work
 around any problems I encounter.

Yeah, it doesn't solve the theft problem and screws over your customers. What else is new in the corporate world?

That's why I *looove* OSS. Well, that and the fact that if you want software that *just fucking works right*, period, then 9 times out of 10 the only place it exists is the OSS scene (My current theory is that's b/c OSS projects are managed by developers rather than suits...but there's my venomous rantyness seeping in again ;) ).
May 10 2012
"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Wed, May 09, 2012 at 09:18:36PM -0700, Adam Wilson wrote:
 On Wed, 09 May 2012 21:12:53 -0700, Nick Sabalausky
 <SeeWebsiteToContactMe semitwist.com> wrote:

Well, if that works for the PHBs, then it works for me (Hmm...Never
thought I'd say something like that ;) )


Beware the dark side! ;-)
Thinking about it more, I suppose it's debatable whether a
PHB-compliant obfuscator or a lib-with-embedded-source would be
easier to implement and deal with.

I'm a fan of embedded source as it's relatively easy to get from the compiler when it's time to build the output file. No extra steps required. :-)

Exactly. We can satisfy the PHBs *and* have our fun with loading ASTs from the compiler too. Sounds like a win-win situation to me.

T

-- 
Some days you win; most days you lose.
May 09 2012
"Victor Vicente de Carvalho" <victor.v.carvalho gmail.com> writes:
On Thursday, 10 May 2012 at 02:26:46 UTC, Nick Sabalausky wrote:
 "H. S. Teoh" <hsteoh quickfur.ath.cx> wrote in message
 news:mailman.510.1336610145.24740.digitalmars-d puremagic.com...
 [...]

I think you're missing the point here. Many companies ship their code as .so files, Java .class files, and .NET CLR assemblies not for obfuscation, but to protect their copyright. They spent a lot of money on their employees' time to produce valuable code, and they don't want to share it in a form that makes it easy for a competitor to copy, embed, or integrate. Plain and simple. And fair and reasonable, from my point of view. Would a motivated competitor still get the code? Yes, sure. But the vast majority wouldn't be able to, whether for technical or financial reasons.
May 10 2012
"Michaël Larouche" <michael.larouche gmail.com> writes:
On Thursday, 10 May 2012 at 12:56:00 UTC, Victor Vicente de 
Carvalho wrote:
 [...]

I think you're missing the point here. Many companies ship their code as .so files, Java .class files, and .NET CLR assemblies not for obfuscation, but to protect their copyright. They spent a lot of money on their employees' time to produce valuable code, and they don't want to share it in a form that makes it easy for a competitor to copy, embed, or integrate. Plain and simple. And fair and reasonable, from my point of view. Would a motivated competitor still get the code? Yes, sure. But the vast majority wouldn't be able to, whether for technical or financial reasons.

I agree, you can't expect a company that invested so much money and time in research and development to give its source code away to its competitors (I'm thinking of Havok Physics and other game middleware, for instance). Writing good, optimized software takes time and care. Also, support may seem superficial, but I can tell you I'm really glad we can contact middleware suppliers when we hit a bug that blocks our production; it means we can use the time spent waiting for the reply to do other things.
May 10 2012
"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, May 10, 2012 at 03:50:42AM -0400, Nick Sabalausky wrote:
[...]
 Actually, here's a great example of the evils of closed...well, the
 evils of closed *platforms* which IMNSHO are 100x worse than merely
 "closed source software":
 
 http://www.techdirt.com/articles/20120326/08360818246/patents-threaten-to-silence-little-girl-literally.shtml

That wouldn't be the first time something like this happened. It happens on a routine basis. Pharmaceutical companies do things like this all the time. They will refuse to fund research, or actively seek to hamper research (e.g. by holding patents and threatening to sue anyone who dares produce said cure), that may lead to a cure for some disease which they currently sell treatments for, or they will outright refuse to produce said cure when it's discovered -- because once there's a cure, there's no more need of treatments, so they will lose money. So they'd rather there will *never* be a cure so that people will continue buying treatments. Human life? What's that? Never heard of such a thing. More money, more wealth, at the expense of our customers! [...]
 But there are systems that are real PITA with encryption though: For
 example, the RockBox project never did (last I checked) manage to
 crack the Zune or the particular model of Toshiba Gigabeat the Zune
 was derived from (the "S" I think), and a big part of that was b/c of
 some nasty DRM/security measures that were built into the hardware
 itself, unlike a normal x86 for example. So you couldn't just do some
 simple man-in-the-middle like you described. Of course, game systems
 have hardware-level DRM/security too and they always get cracked, but
 they're much more popular than the Zune ever was (Which is a shame, I
 would have considered the original Zune 1 (not the shitty second one)
 to be the world's most perfect music player if it weren't for
 Apple-inspired truckload of DRM/lockout bullshit that was involved
 anytime you wanted it to communicate with a computer). Point being,
 consumer devices with hardware-level DRM/security fucking suck ;)

Yeah, wasn't TPM being touted as the silver bullet to viruses and hacking and stuff, like, a decade ago? And nowadays they're nowhere near as widespread as they were predicted to be? The fact is, hardware encryption makes interoperability a big bear, and today's online world is all about interoperability. It also makes things a royal pain to use, should you ever want to upgrade your system and migrate your data. The only thing that can come close to being uncrackable is something that's so hard to use that most people wouldn't bother with it. Which gave me a funny thought: one way of writing code that nobody can steal is to write it in MALBOLGE (http://www.lscheffer.com/malbolge.shtml). Then you can freely distribute source code and everything -- nobody will be able to understand how the damn thing works, and they won't be able to modify the code without totally breaking it. The only catch is, this "nobody" includes the programmer, because MALBOLGE is practically impossible to write non-trivial programs in. For example, here's the hello world program: http://www2.latech.edu/~acm/helloworld/malbolge.html This is, to date, the most complex MALBOLGE program _ever_ written. Now, say you wish to change the message from hello world to something else, say goodbye code thieves. You basically have to re-architect the whole thing from scratch. It's not a matter of changing a few characters here and there; you have to literally start over from the drawing board. The resulting program will look NOTHING like this one. Now try writing cp in MALBOLGE. The day you can do that, is the day you can retire, because it's so incredibly hard that you might as well be solving the halting problem instead. [...]
 Yea, the fact this shit is even *allowed* to exist in a
 *cough*"modern"*cough* society makes my blood boil. I know corporations
 are legal entities, but for sanity (let alone anything as luxurious as
 justice) to prevail, "corporate entities" must be deemed second-class
 citizens, at best. Meh, I usually try not to think about it just so I
 can actually get things done. And then I bitch about it at every
 opportunity ;)

Yeah I never understood the reasoning behind treating corporations as legal entities on the same level as persons. That simply makes no sense on so many levels it's not even funny. And it leads to stupidities like "the corporation" doing things that no person with any shred of conscience would be able to do and be able to live with themselves afterwards. As though the corporation has a personality of its own apart from the personalities of its constituents. Never made a shred of sense to me. T -- "Computer Science is no more about computers than astronomy is about telescopes." -- E.W. Dijkstra
May 10 2012
parent "Nick Sabalausky" <SeeWebsiteToContactMe semitwist.com> writes:
"H. S. Teoh" <hsteoh quickfur.ath.cx> wrote in message 
news:mailman.525.1336661506.24740.digitalmars-d puremagic.com...
 On Thu, May 10, 2012 at 03:50:42AM -0400, Nick Sabalausky wrote:
 [...]
 Actually, here's a great example of the evils of closed...well, the
 evils of closed *platforms* which IMNSHO are 100x worse than merely
 "closed source software":

 http://www.techdirt.com/articles/20120326/08360818246/patents-threaten-to-silence-little-girl-literally.shtml

That wouldn't be the first time something like this happened. It happens on a routine basis. Pharmaceutical companies do things like this all the time. They will refuse to fund research, or actively seek to hamper research (e.g. by holding patents with which they threaten to sue anyone who dares produce said cure), that may lead to a cure to some disease which they currently sell treatments for, or they will outright refuse to produce said cure when it's discovered -- because once there's a cure, there's no more need of treatments, so they will lose money. So they'd rather there will *never* be a cure so that people will continue buying treatments. Human life? What's that? Never heard of such a thing. More money, more wealth, at the expense of our customers!

I always suspected that.
 The only thing that can come close to being uncrackable is something
 that's so hard to use that most people wouldn't bother with it. Which
 gave me a funny thought: one way of writing code that nobody can steal
 is to write it in MALBOLGE (http://www.lscheffer.com/malbolge.shtml).

Geez that's insane! And I thought brainfuck was nuts.
May 10 2012
prev sibling next sibling parent "Richard Webb" <webby beardmouse.org.uk> writes:
On Thursday, 10 May 2012 at 14:50:22 UTC, H. S. Teoh wrote:

 This is, to date, the most complex MALBOLGE program _ever_ 
 written.

This one looks a bit more complex: http://www.99-bottles-of-beer.net/language-malbolge-995.html
May 10 2012
prev sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, May 10, 2012 at 03:39:47PM -0400, Nick Sabalausky wrote:
[...]
 The only thing that can come close to being uncrackable is something
 that's so hard to use that most people wouldn't bother with it.
 Which gave me a funny thought: one way of writing code that nobody
 can steal is to write it in MALBOLGE
 (http://www.lscheffer.com/malbolge.shtml).

Geez that's insane! And I thought brainfuck was nuts.

Welcome to the weird and wonderful world of esolangs. :-) BF was just the beginning of the madness. INTERCAL is an amusing esolang in which you have to insert adequate amounts of the PLEASE command, otherwise you risk instant program termination. Then there's Java2K, which is a probabilistic language, in the sense that all operations only have a 90% chance of actually returning the correct result -- so the challenge is how to write code in such a way that the probability of your program producing the desired output is as close to 100% as you can get (it's not possible to guarantee 100% correctness... but then, what programming language can guarantee that? :-P). There's also Befunge and its spawn, in which the program counter is a vector rather than a counter, and program code is written on an n-dimensional grid. The neat thing about Befunge is that there's no special syntax for comments: you just write comments in-place and route your PC around the comment text. :-) (Clever programmers could, in theory, route certain program paths through comments to ensure that comments are always up-to-date with the code. :-P) In fact, I myself wrote a Befunge-like esolang once (when I was really, really, REALLY bored). My esolang is even weirder than Befunge in that it doesn't even have a program counter. Instead, the code consists of symbols written in a 4-dimensional array, and every iteration, every element of the array is "executed". Some symbols are passive (they don't do anything when "executed") and other symbols are active: they erase themselves from the current location and write themselves into a neighbouring location in the array. Based on the symbol in the target array cell, different things happen. 
The result is a bizarre physics-like simulation where active symbols (aka particles, representing binary 1 and 0) fly around in a 4D array, get reflected/duplicated by reflectors, and trigger output by simultaneously striking the output symbol in groups of 8, representing 8 bits of the output character (now you know why I chose a 4D array: in 4D, there are exactly 8 neighbours surrounding each cell). The hello world program, for example, looks like this:

..%.. ..}.. ..%.. ..... ..... ..'.. .v... .. v. ..... ..,.. ..-.. ..... %)%.. .v .. ..,.. ..... ..v.. '.... ...V. ..,.. ..... .... %(<.. .... ..%.. .. .. .+... ..V,. ..... .... .}... )!<]. .{)!< .+.{. ..... ..... .'+.. ]v-. )!<+. .{... ..].. .)!<. ..{]. .-)!< ...{. ..... ..-.. ..... ...-. ..v.. ..%.. .. .. ..%.. ..... .>))% ..+.. .^.v. .V+A. ..... ....+ % %.. }.].. %)!(% .^{V[ ..% % '.... .V^.. ..-.. ...A. ..'.. %((.. .... ..%.. .. .. ..%.. ..A.. ...,. .,... ..... .... ...]. .])!( )!<{. .[.+. ..... .... .,.'. -^]. ..)!< .^{. .]... )!<.. .[... ...-. ..... ..... .'... ..A.. ..... .. .. ..%.. ..[.. ..{.. ..... ..... ..... ...^. .A... ..... ..... %<(.. }.... %)%.. ..A. ..... .. .. .^... .. .. ..... ..... ))%.. ..].. ..%.. ..... .....

This program is written on a 5x5x5x5 array, which is represented in ASCII as a 5x5 grid of slanted 5x5 grids (thus covering all 4 dimensions). Why slanted? 'cos the perspective makes it clearer (*ahem*cough*) that they're slanting into the 3rd dimension. Granted, though, MALBOLGE takes the cake in terms of difficulty of implementation. My esolang is just a matter of routing and synchronizing the movement of particles so that they strike the output operator at the right times to produce the desired sequence of ASCII characters. MALBOLGE, OTOH, is so insanely twisted that no amount of rationalization will ever give you a working model of mapping desired semantics to code. T -- Some days you win; most days you lose.
May 10 2012
prev sibling next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, May 10, 2012 at 12:18:56AM +0200, Joseph Rushton Wakeling wrote:
 On 09/05/12 23:10, Adam Wilson wrote:

Do companies regularly release python code to end-users?

OK, OK, you can release Python compiled to bytecode. JavaScript, then. You _have_ to pass the browser the full source. Has that stopped zillions of proprietary web applications?

Which is what fueled the market for hundreds (if not thousands) of JS obfuscators. T -- To provoke is to call someone stupid; to argue is to call each other stupid.
May 09 2012
prev sibling next sibling parent "Adam Wilson" <flyboynw gmail.com> writes:
On Wed, 09 May 2012 15:18:49 -0700, Era Scarecrow <rtcvb32 yahoo.com>  
wrote:

 On Wednesday, 9 May 2012 at 22:16:17 UTC, Adam Wilson wrote:
 On Wed, 09 May 2012 15:03:21 -0700, Era Scarecrow <rtcvb32 yahoo.com>  
 wrote:
  Why would this be such a big deal? As I understand it some of this  
 comes from D couldn't compile to libraries (if that's different now I  
 am not sure, haven't kept up with all the updates) so everything in  
 phobos is distributed as source.

Theoretically D can compile Shared Libraries now. Which means that DI files are going to be more useful than ever.
  If we can't compile to a callable library (static or dynamic) for a  
 while and can't use CTFE on non-source, then the problem is more  
 explicitly present and either needs a workaround or some type of  
 convention.

CTFE cannot currently call a function without its source.

Currently? If it can later the problem goes away...

There are no plans to do this at any point in the future, so waiting for it would be fruitless. -- Adam Wilson IRC: LightBender Project Coordinator The Horizon Project http://www.thehorizonproject.org/
May 09 2012
prev sibling next sibling parent "Adam Wilson" <flyboynw gmail.com> writes:
On Wed, 09 May 2012 15:17:41 -0700, Nick Sabalausky  
<SeeWebsiteToContactMe semitwist.com> wrote:

 My take, FWIW:

 1. DI is only useful for those anachronistic corporations who believe in
 code-hiding (and even then, only the ones who release libs), which
 regardless of everything else, isn't even *realistic* anyway - there's
 always reverse-engineering, and with the super-popular JS there *IS NO*
 pre-compiled form, and yet non-OSS companies *still* get by just fine
 anyway. If you're relying on the increasingly-irrelevant practice of
 code-hiding (which there is *no such thing* - only obfuscation, which is
 exactly what compiling does, it only obfuscates the source, it doesn't
 hide it), then you need to accept that there *are* going to be things you
 will *never* be able to do, period, like virtual templates (which *are*
 possible in theory if all the source is available, even if D doesn't
 currently allow it).

Anachronistic or not, MANY companies still require it. And JS is not exactly D; they attack two very different segments. And most companies don't put anything of intellectual value in JS. But I'm not here to argue the morality of the point. Only that the DI generation issue will stop a lot of groups from using D.
 2. We should be seriously looking into the idea of making CTFE work by
 executing already-compiled code, a la Nemerle (but without needing the
 extra build step). There may be enough technical hurdles involved to hold
 this back for [the still-hypothetical] D3, but it should at least be a
 direction we should be seriously considering. (Unless someone can already
 come up with a deal-breaking reason now.) Actually, there's *FAR* more
 important things than this right now, like a solid ARM-tablet toolchain,
 so this should definitely just be an "on hold for now" feature.

I concur here. There are definitely more important things than making CTFE work against object-code. -- Adam Wilson IRC: LightBender Project Coordinator The Horizon Project http://www.thehorizonproject.org/
May 09 2012
prev sibling next sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Wednesday, 9 May 2012 at 22:15:02 UTC, Adam Wilson wrote:
 Sure, but a lot of software developers, particularly those with 
 money, don't want their source getting out, and in a lot of 
 cases, there is no good reason to distribute the source.

Yeah, you're preaching to the choir... which is why I'm against changing anything for druntime's sake. Your patch for the .di files does the right thing. Don't change that just because it breaks druntime. druntime is open source, so there's no need for it to be using .di generation in the first place, whether it is a shared library or not.
May 09 2012
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Thursday, May 10, 2012 00:18:49 Era Scarecrow wrote:
 On Wednesday, 9 May 2012 at 22:16:17 UTC, Adam Wilson wrote:
 On Wed, 09 May 2012 15:03:21 -0700, Era Scarecrow
 
 <rtcvb32 yahoo.com> wrote:
 Why would this be such a big deal? As I understand it some of
 
 this comes from D couldn't compile to libraries (if that's
 different now I am not sure, haven't kept up with all the
 updates) so everything in phobos is distributed as source.

Theoretically D can compile Shared Libraries now. Which means that DI files are going to be more useful than ever.
 If we can't compile to a callable library (static or dynamic)
 
 for a while and can't use CTFE on non-source, then the problem
 is more explicitly present and either needs a workaround or
 some type of convention.

CTFE cannot currently call a function without its source.

Currently? If it can later the problem goes away...

Not going to happen for D2. CTFE would have to be completely redesigned for it to not need the full source. _If_ that were to ever happen, it would have to be in a future version of D (which may or may not ever happen but definitely won't happen soon). - Jonathan M Davis
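To make the constraint concrete, here is a minimal sketch of why CTFE needs the body. The function name `square` is made up purely for illustration; it is not taken from druntime or Phobos.

```d
import std.stdio : writeln;

// A function whose body is visible to the compiler can be interpreted by CTFE.
int square(int x)
{
    return x * x;
}

// An enum initializer must be known at compile time, so this forces CTFE.
enum nine = square(3);
static assert(nine == 9);

// If the importing code only saw a body-less declaration, as a stripped
// .di file would provide:
//
//     int square(int x);
//
// then the enum above would be rejected at compile time, because the
// interpreter has no source to run. An ordinary runtime call would still
// compile and be resolved by the linker.

void main()
{
    writeln(nine);
}
```

The same function is therefore fine to strip for runtime callers and not fine to strip for compile-time callers, and the .di generator cannot tell in advance which kind of caller will turn up.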
May 09 2012
prev sibling next sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 10/05/12 00:25, H. S. Teoh wrote:
 Which is what fueled the market for hundreds (if not thousands) of JS
 obfuscators.

Well, that's kind of my point really. Is it so bad (from a proprietary point of view) to have to distribute .d rather than .di files, if you can obfuscate them?
May 09 2012
prev sibling next sibling parent "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Wednesday, 9 May 2012 at 22:30:22 UTC, Jonathan M Davis wrote:
 On Thursday, May 10, 2012 00:18:49 Era Scarecrow wrote:
 On Wednesday, 9 May 2012 at 22:16:17 UTC, Adam Wilson wrote:
 CTFE cannot currently call a function without its source.

Currently? If it can later the problem goes away...

Not going to happen for D2. CTFE would have to be completely redesigned for it to not need the full source. _If_ that were to ever happen, it would have to be in a future version of D (which may or may not ever happen but definitely won't happen soon).

Gotcha. Probably better this way for now. I'll look forward to when that happens, but I won't be holding my breath :)
May 09 2012
prev sibling next sibling parent "Adam Wilson" <flyboynw gmail.com> writes:
On Wed, 09 May 2012 15:29:23 -0700, Adam D. Ruppe  
<destructionator gmail.com> wrote:

 On Wednesday, 9 May 2012 at 22:15:02 UTC, Adam Wilson wrote:
 Sure, but a lot of software developers, particularly those with money,  
 don't want their source getting out, and in a lot of cases, there is no  
 good reason to distribute the source.

Yeah, you're preaching to the choir... which is why I'm against changing anything for druntime's sake. Your patch for the .di files does the right thing. Don't change that just because it breaks druntime. druntime is open source, so there's no need for it to be using .di generation in the first place, whether it is a shared library or not.

Actually there is a need for a shared library DRT, and that is when there are multiple other libraries that depend on DRT (anything made with D) being linked into the same executable. Specifically, what is happening is that static libraries that are compiled with two different versions of the DRT will crash on start due to conflicts between the two codebases. This was particularly noticeable anytime the GC changed between two releases. But if both are compiled against a shared library, and the DRT API hasn't changed, then it doesn't matter which version of the DRT is used; both shared libs are dynamically linked to the same version of the DRT. This was discussed a few months ago and it was agreed that DRT needs a shared library variant. IIRC that was actually the impetus for much of the shared library work that went into 2.058. -- Adam Wilson IRC: LightBender Project Coordinator The Horizon Project http://www.thehorizonproject.org/
May 09 2012
prev sibling next sibling parent "Adam Wilson" <flyboynw gmail.com> writes:
On Wed, 09 May 2012 15:34:30 -0700, Joseph Rushton Wakeling  
<joseph.wakeling webdrake.net> wrote:

 On 10/05/12 00:25, H. S. Teoh wrote:
 Which is what fueled the market for hundreds (if not thousands) of JS
 obfuscators.

Well, that's kind of my point really. Is it so bad (from a proprietary point of view) to have to distribute .d rather than .di files, if you can obfuscate them?

In a word, yes. Obfuscation hides details, but it can't hide algorithms very well. It can just make them harder to follow. Hence why most companies don't put anything of value into JS. -- Adam Wilson IRC: LightBender Project Coordinator The Horizon Project http://www.thehorizonproject.org/
May 09 2012
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Thursday, May 10, 2012 00:29:23 Adam D. Ruppe wrote:
 On Wednesday, 9 May 2012 at 22:15:02 UTC, Adam Wilson wrote:
 Sure, but a lot of software developers, particularly those with
 money, don't want their source getting out, and in a lot of
 cases, there is no good reason to distribute the source.

Yeah, you're preaching to the choir... which is why I'm against changing anything for druntime's sake. Your patch for the .di files does the right thing. Don't change that just because it breaks druntime. druntime is open source, so there's no need for it to be using .di generation in the first place, whether it is a shared library or not.

What we probably should do is change druntime's makefile so that it generates .di files for certain files and just uses the .d files for others. And in some cases, we may want to do what we do with object.di and hand edit the .di file rather than generate it every time. Some stuff should definitely use .di files (e.g. the rt stuff), but other stuff clearly shouldn't (e.g. core.time). thread.d actually seems to already have a hand-crafted .di file. And for much of druntime, generating .di files is utterly pointless anyway, because it's just a bunch of extern(C) declarations. Regardless of the state of automatic .di generation, I think that the blind generation of .di files for pretty much all of druntime like we do now is a mistake. If it were fixed so that only specific portions used automatic .di generation or so that all of its .di files were maintained by hand, then it wouldn't matter if the automatic .di generation was overly enthusiastic in stripping out code. - Jonathan M Davis
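As a rough illustration of the extern(C) point: a module that only declares C functions has no bodies to strip in the first place, so its generated .di would presumably be the same declarations over again. The sketch below declares `strlen` by hand instead of importing the existing druntime binding, just to keep it self-contained.

```d
import std.stdio : writeln;

// A binding is only a declaration; the implementation lives in the C library.
// There is nothing here for a .di generator to remove.
extern (C) size_t strlen(const(char)* s);

void main()
{
    // D string literals are zero-terminated and convert to const(char)*.
    assert(strlen("druntime") == 8);
    writeln(strlen("druntime"));
}
```

Note that such a declaration also cannot be called through CTFE, which is consistent with the rest of the thread: no body, no compile-time evaluation.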
May 09 2012
prev sibling next sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Wednesday, 9 May 2012 at 22:41:12 UTC, Adam Wilson wrote:
 Actually there is a need for a shared library DRT

My point is though that shared library and .di are orthogonal issues here. You can use a shared library with full source files as imports. You can use a static library with no implementation .di files.
May 09 2012
prev sibling next sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Wednesday, 9 May 2012 at 22:44:01 UTC, Jonathan M Davis wrote:
 What we probably should do is change druntime's makefile so 
 that it generates .di files for certain files and just uses
 the .d files for others.

Yes, I agree. Perhaps at some point we'll want a hint for di generation on a function by function basis, but for druntime, using the .d files as the interface (when needed) is the best solution.
May 09 2012
prev sibling next sibling parent "Adam Wilson" <flyboynw gmail.com> writes:
On Wed, 09 May 2012 15:46:42 -0700, Adam D. Ruppe  
<destructionator gmail.com> wrote:

 On Wednesday, 9 May 2012 at 22:41:12 UTC, Adam Wilson wrote:
 Actually there is a need for a shared library DRT

My point is though that shared library and .di are orthogonal issues here. You can use a shared library with full source files as imports.

IIRC D compiles in the implementation code from the .d file and does not use the code in the shared library, but I may be wrong.
 You can use a static library with no implementation .di files.

-- Adam Wilson IRC: LightBender Project Coordinator The Horizon Project http://www.thehorizonproject.org/
May 09 2012
prev sibling next sibling parent "Adam Wilson" <flyboynw gmail.com> writes:
On Wed, 09 May 2012 15:43:57 -0700, H. S. Teoh <hsteoh quickfur.ath.cx>  
wrote:

 On Wed, May 09, 2012 at 06:17:41PM -0400, Nick Sabalausky wrote:
 My take, FWIW:

 1. DI is only useful for those anachronistic corporations who believe
 in code-hiding (and even then, only the ones who release libs), which
 regardless of everything else, isn't even *realistic* anyway - there's
 always reverse-engineering, and with the super-popular JS there *IS
 NO* pre-compiled form, and yet non-OSS companies *still* get by just
 fine anyway. If you're relying on the increasingly-irrelevant practice
 of code-hiding (which there is *no such thing* - only obfuscation,
 which is exactly what compiling does, it only obfuscates the source,
 it doesn't hide it), then you need to accept that there *are* going to
 be things you will *never* be able to do, period, like virtual
 templates (which *are* possible in theory if all the source is
 available, even if D doesn't currently allow it).

This is why I kept proposing that .di's should have zero implementation. ZERO. No function bodies, no template bodies, NOTHING except API information. All of the implementation stuff should be compiled into some intermediate form and stuck into special sections in the object file that the compiler understands. For example, it can be a serialized AST of the corresponding source. When you import the module, the compiler automatically looks up the corresponding section in the precompiled library and gets whatever info it needs (template bodies, CTFE function bodies, whatever). Yes such a thing can be reverse-engineered, but that is no different from distributing your binary in the first place (someone determined enough to steal your code will be able to reverse-engineer even the most sophisticated obfuscations you apply to your code -- if the CPU can run the code, it can be reverse-engineered). It really is just a matter of deterring the casual shoulder-peekers from peeking at your "precious" code. Code in the form of ASTs stored in the compiled library should be deterrent enough -- anyone that actually bothers to reverse engineer that is determined enough that you will not be able to stop him no matter what you do anyway.

I am 100% for this. It would be very .NET like. In fact I'm curious enough what it would take to make this work that I could see myself trying. My guess is that it needs a new linker with the glorious side-effect of dumping optlink! In that case it would mean upgrading the D backend to emit COFF (ELF and Mach-O already support custom sections), which I am fine with trying to do. Then you would add your AST or other intermediate representations to a custom section in the object file and the linker could then link it in. D would then need a way to extract said information. Which would not be terribly hard. Except that you'll have to train other compilers how to read that IR. Maybe we could train D to read the LLVM IR?
 2. We should be seriously looking into the idea of making CTFE work by
 executing already-compiled code, a la Nemerle (but without needing the
 extra build step). There may be enough technical hurdles involved to
 hold this back for [the still-hypothetical] D3, but it should at least
 be a direction we should be seriously considering. (Unless someone can
 already come up with a deal-breaking reason now.) Actually, there's
 *FAR* more important things than this right now, like a solid
 ARM-tablet toolchain, so this should definitely just be an "on hold
 for now" feature.

+1. You do have the problem of what to do in a cross-compiler, though. T

-- Adam Wilson IRC: LightBender Project Coordinator The Horizon Project http://www.thehorizonproject.org/
May 09 2012
prev sibling next sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Wednesday, 9 May 2012 at 22:49:55 UTC, Adam Wilson wrote:
 IIRC D compiles in the implementation code from the .d file and does 
 not use the code in the shared library, but I may be wrong.

It depends on how you pass it all to the compiler. If it finds it in the import path - not on the command line - it treats it as a simple import. You can see this by making an implementation file and compiling something that uses it.

make: foo/bar.d
compile: dmd a.d

then you'd get a bunch of linker errors for the functions you need in bar.d. So it isn't pulling the implementation there. But if you do

dmd a.d foo/bar.d

then it pulls the impl too.
May 09 2012
prev sibling next sibling parent "Adam Wilson" <flyboynw gmail.com> writes:
On Wed, 09 May 2012 15:55:36 -0700, Mehrdad <wfunction hotmail.com> wrote:

 I am 100% for this. It would be very .NET like. In fact I'm curious  
 enough what it would take to make this work that I could see myself  
 trying. My guess is that it needs a new linker with the glorious  
 side-effect of dumping optlink! In that case it would mean upgrading  
 the D backend to emit COFF (ELF and Mach-O already support custom  
 sections), which I am fine with trying to do. Then you would add your  
 AST or other intermediate representations to a custom section in the  
 object file and the linker could then link it in. D would then need a  
 way to extract said information. Which would not be terribly hard.  
 Except that you'll have to train other compilers how to read that IR.  
 Maybe we could train D to read the LLVM IR?

:O I was writing a response pretty much exactly like this (i.e. doing what .NET does), but then I dumped it, thinking it'd be dismissed as too huge of a change...

:-D This isn't the first time it's been suggested in recent forum history. I think there is a significant body of support for making D libraries single file with no import files, it solves a *TON* of issues around how to import API's. I imagine that it's much the same reason .NET went with their metadata plan. And ended up where we are suggesting to go. -- Adam Wilson IRC: LightBender Project Coordinator The Horizon Project http://www.thehorizonproject.org/
May 09 2012
prev sibling next sibling parent "Adam Wilson" <flyboynw gmail.com> writes:
On Wed, 09 May 2012 15:56:09 -0700, Artur Skawina <art.08.09 gmail.com>  
wrote:

 On 05/10/12 00:15, Adam Wilson wrote:
 On Wed, 09 May 2012 15:07:44 -0700, Adam D. Ruppe  
 <destructionator gmail.com> wrote:

 On Wednesday, 9 May 2012 at 20:41:05 UTC, Adam Wilson wrote:
 Except that there is a distinct need for the DRuntime as a shared  
 library.

That doesn't really matter - you can deploy as a shared library and still use full source as the interface file. Hell, that's what putting implementations in the .di file does anyway!

Sure, but a lot of software developers, particularly those with money, don't want their source getting out, and in a lot of cases, there is no good reason to distribute the source. There are also a bunch of cases where you don't even want something to be CTFEable, like Walter's example on a different thread of the GC. Why would you ever want to CTFE the GC? Until D starts to see some serious usage in business, it's never going to get out of "toy"/"hobby" language status in the eyes of the developer community at large. Few businesses want to release their source. DI's as a complete source file are a non-starter to that large segment of the development world. Improving DI generation is just taking down another barrier to D usage by that group of people.

A "group of people" that wants to distribute binary closed-source libs, yet finds having to manually specify the API of their library to be a barrier? If having to write all the required declarations from scratch (instead of using some *.d -> *.di converter) is a real problem, then, umm, it's most likely not their biggest one... artur

I agree, probably not the biggest one. But I've seen a lot of frustration around the fact that D offers automatic header generation, but when you actually use it, all it does is regurgitate your code. Headers mean something to people. And DI files aren't even close to matching what they are looking for. -- Adam Wilson IRC: LightBender Project Coordinator The Horizon Project http://www.thehorizonproject.org/
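For reference, a hedged sketch of what a header following the rules discussed in this thread might look like: ordinary functions lose their bodies, while template functions and auto functions keep theirs. All names here (`Circle`, `perimeter`, `twice`, `diameter`) are invented for illustration, and a `main` is included only so the sketch compiles and links as a single file; a real .di would not have one.

```d
import std.stdio : writeln;

struct Circle
{
    double radius;
    double area() const;        // body stripped: declaration only
}

double perimeter(Circle c);     // body stripped: declaration only

// Template functions keep their bodies; the importer has to instantiate them.
T twice(T)(T x)
{
    return x + x;
}

// auto functions keep their bodies; the return type is inferred from them.
auto diameter(Circle c)
{
    return c.radius * 2;
}

// Functions that kept their bodies still work at compile time.
static assert(twice(21) == 42);
static assert(is(typeof(diameter(Circle(1.0))) == double));

// A stripped function cannot be evaluated at compile time, so this line
// would not compile if uncommented:
// static assert(perimeter(Circle(1.0)) > 0);

void main()
{
    // Only functions with visible bodies are called here, so nothing is
    // left for the linker to resolve.
    writeln(twice(21));
    writeln(diameter(Circle(1.5)));
}
```

This is the shape people coming from C headers tend to expect, and it is also exactly the shape that breaks a CTFE caller of `perimeter` or `Circle.area`.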
May 09 2012
prev sibling next sibling parent "Adam Wilson" <flyboynw gmail.com> writes:
On Wed, 09 May 2012 17:12:57 -0700, Nick Sabalausky  
<SeeWebsiteToContactMe semitwist.com> wrote:

 "Adam Wilson" <flyboynw gmail.com> wrote in message
 news:op.wd2beab6707hn8 apollo.hra.local...
 On Wed, 09 May 2012 15:17:41 -0700, Nick Sabalausky
 <SeeWebsiteToContactMe semitwist.com> wrote:

 My take, FWIW:

 1. DI is only useful for those anachronistic corporations who believe in
 code-hiding (and even then, only the ones who release libs), which
 regardless of everything else, isn't even *realistic* anyway - there's
 always reverse-engineering, and with the super-popular JS there *IS NO*
 pre-compiled form, and yet non-OSS companies *still* get by just fine
 anyway. If you're relying on the increasingly-irrelevant practice of
 code-hiding (which there is *no such thing* - only obfuscation, which is
 exactly what compiling does, it only obfuscates the source, it doesn't
 hide it), then you need to accept that there *are* going to be things you
 will *never* be able to do, period, like virtual templates (which *are*
 possible in theory if all the source is available, even if D doesn't
 currently allow it).

Anachronistic or not, MANY companies still require it. And JS is not exactly D; they target very different segments. And most companies don't put anything of intellectual value in JS. But I'm not here to argue the morality of the point. Only that the DI generation issue will stop a lot of groups from using D.

My random ranting made it unclear, but my main point was that if a company requires their libs be distributed in binary-only form - for *whatever* reason - then they MUST accept that there will be things they *can't* do. Note that's *not* merely some policy I'm proposing that D take - it's hard, immutable reality.

We aren't talking about hard-binary-only form; most companies realize that's never going to happen. But distributing something in source form isn't OK either. Companies ship .NET assemblies all the time. Those aren't hard-binary, nor are they the original source, but an intermediate language (although semantically they are the source, which has given rise to a lot of obfuscators). This is somewhat about lawyers, not technical people. The best possible answer is to embed some kind of intermediate representation into the library that describes the API and implementation and can be extracted/read by other tools. But that's a ways off. -- Adam Wilson IRC: LightBender Project Coordinator The Horizon Project http://www.thehorizonproject.org/
May 09 2012
prev sibling next sibling parent "Adam Wilson" <flyboynw gmail.com> writes:
On Wed, 09 May 2012 17:06:24 -0700, Nick Sabalausky  
<SeeWebsiteToContactMe semitwist.com> wrote:

 "H. S. Teoh" <hsteoh quickfur.ath.cx> wrote in message
 news:mailman.489.1336603453.24740.digitalmars-d puremagic.com...
 On Wed, May 09, 2012 at 06:17:41PM -0400, Nick Sabalausky wrote:
 My take, FWIW:

 1. DI is only useful for those anachronistic corporations who believe
 in code-hiding (and even then, only the ones who release libs), which
 regardless of everything else, isn't even *realistic* anyway - there's
 always reverse-engineering, and with the super-popular JS there *IS
 NO* pre-compiled form, and yet non-OSS companies *still* get by just
 fine anyway. If you're relying on the increasingly-irrelevant practice
 of code-hiding (which there is *no such thing* - only obfuscation,
 which is exactly what compiling does, it only obfuscates the source,
 it doesn't hide it), then you need to accept that there *are* going to
 be things you will *never* be able to do, period, like virtual
 templates (which *are* possible in theory if all the source is
 available, even if D doesn't currently allow it).

This is why I kept proposing that .di's should have zero implementation. ZERO. No function bodies, no template bodies, NOTHING except API information. All of the implementation stuff should be compiled into some intermediate form and stuck into special sections in the object file that the compiler understands. For example, it can be a serialized AST of the corresponding source. When you import the module, the compiler automatically looks up the corresponding section in the precompiled library and gets whatever info it needs (template bodies, CTFE function bodies, whatever). Yes such a thing can be reverse-engineered, but that is no different from distributing your binary in the first place (someone determined enough to steal your code will be able to reverse-engineer even the most sophisticated obfuscations you apply to your code -- if the CPU can run the code, it can be reverse-engineered). It really is just a matter of deterring the casual shoulder-peekers from peeking at your "precious" code. Code in the form of ASTs stored in the compiled library should be deterrent enough -- anyone that actually bothers to reverse engineer that is determined enough that you will not be able to stop him no matter what you do anyway.

There's no need for all that. The whole point here is "Compile to some obfuscated form" right? So just make/use a good code obfuscator. Done. Problem solved. Inventing an AST storage format just to obfuscate some code is unnecessary overkill (although maybe it might have some other use). This "just use an obfuscator" approach even makes the whole DI system become totally redundant (except for binding to C code, of course).

I agree if that were the sole purpose it would be kind of pointless, but the idea is really to give us a system whereby the library itself contains all the necessary information to code with it. Much like the .NET Assembly strategy. -- Adam Wilson IRC: LightBender Project Coordinator The Horizon Project http://www.thehorizonproject.org/
May 09 2012
prev sibling next sibling parent "Michaël Larouche" <michael.larouche gmail.com> writes:
On Thursday, 10 May 2012 at 02:59:22 UTC, Andrei Alexandrescu 
wrote:
 On 5/9/12 3:14 PM, Steven Schveighoffer wrote:
 On Wed, 09 May 2012 15:57:46 -0400, Adam D. Ruppe
 <destructionator gmail.com> wrote:

 The real WTF is we use .di files for druntime in the
 first place. It is performance sensitive and open source.

 We should be using the actual sources for inlining, ctfe,
 etc. anyway.

 Let's not torpedo the .di patch's value for just phobos.

I agree (although not generating .di files does not fix all the problems of inlining and ctfe -- there are many stubbed functions even in the .d files). In my opinion, .di generation should by default generate fully-stripped code except for templates. If you want functions to be CTFE-able, don't use auto-generated .di files to import them. -Steve

Actually the point here is to still be able to benefit of di automated generation while opportunistically marking certain functions as "put the body in the .di file". inline anyone? Andrei

I find inline confusing; people could mistake it for a force-inline attribute. Something like compiletime would be clearer for both the tool and the user.
May 09 2012
prev sibling next sibling parent "Adam Wilson" <flyboynw gmail.com> writes:
On Wed, 09 May 2012 20:17:17 -0700, Michaël Larouche
<michael.larouche gmail.com> wrote:

 On Thursday, 10 May 2012 at 02:59:22 UTC, Andrei Alexandrescu wrote:
 On 5/9/12 3:14 PM, Steven Schveighoffer wrote:
 On Wed, 09 May 2012 15:57:46 -0400, Adam D. Ruppe
 <destructionator gmail.com> wrote:

 The real WTF is we use .di files for druntime in the
 first place. It is performance sensitive and open source.

 We should be using the actual sources for inlining, ctfe,
 etc. anyway.

 Let's not torpedo the .di patch's value for just phobos.

 I agree (although not generating .di files does not fix all the
 problems of inlining and ctfe -- there are many stubbed functions even
 in the .d files).

 In my opinion, .di generation should by default generate fully-stripped
 code except for templates. If you want functions to be CTFE-able, don't
 use auto-generated .di files to import them.

 -Steve

 Actually the point here is to still be able to benefit of di automated
 generation while opportunistically marking certain functions as "put
 the body in the .di file".

 inline anyone?

 Andrei

 I find the inline confusing, people could mistook it with a force
 inline attribute.

 Something like compiletime would be more clear for the tool and the
 user.

I had the thought to use embed: it's short, not taken, and you are embedding the function in the DI file. Another option is include, although that one could be ambiguous. -- Adam Wilson IRC: LightBender Project Coordinator The Horizon Project http://www.thehorizonproject.org/
May 09 2012
prev sibling next sibling parent "Adam Wilson" <flyboynw gmail.com> writes:
On Wed, 09 May 2012 19:24:48 -0700, Nick Sabalausky  
<SeeWebsiteToContactMe semitwist.com> wrote:

 "H. S. Teoh" <hsteoh quickfur.ath.cx> wrote in message
 news:mailman.510.1336610145.24740.digitalmars-d puremagic.com...
 On Wed, May 09, 2012 at 08:06:24PM -0400, Nick Sabalausky wrote:
 There's no need for all that.

 The whole point here is "Compile to some obfuscated form" right? So
 just make/use a good code obfuscator. Done. Problem solved.

 Inventing an AST storage format just to obfuscate some code is
 unnecessary overkill (although maybe it might have some other use).
 This "just use an obfuscator" approach even makes the whole DI system
 become totally redundant (except for binding to C code, of course).

This is an interesting idea. Probably more feasible than mine. You don't necessarily have to throw away the DI system; some people may sleep better at night if their proprietary algorithms are in binary executable form only (though personally I think that's just self delusion, but who am I to judge?).

An ambassador of sanity, that's who ;) It's not just your personal opinion, it's hard fact: From a reverse-engineering standpoint, executable binary form *is* nothing more than obfuscation (except perhaps if it's an *encrypted* binary form, but I've never heard of a lib that did that, heck it would require special toolchain support anyway). Believing binary libs are more secure than that is just simply incorrect, period, opinion doesn't enter into it. 2+2 *is* 4 whether you believe it or not. Life isn't looney tunes, you don't walk on air just because nobody taught you gravity. Etc. Fuck if you want to steal something, you don't even need any source - obfuscated or not. Commercial games, for example, never release *any* source. *Only* the final binaries are distributed, and even *those* are usually encrypted. And yet they *still* get pirated like crazy. So the source is locked away - fat lot of fucking good THAT did! (Ok, so it's harder to make an unauthorized modification, who the hell cares - the *original* is *already* out there getting ripped off, plus why would deviants wanna waste time modding when they can just sell bootlegs?) So source vs binary doesn't make a damn bit of difference, period - if all you have is the binary, well, to use it you just *run* it! You don't need *any* sources to use it. You just use it. The only thing that can even make any difference is encryption (which still isn't truly "secure"). And for that matter, nobody's algorithms are proprietary. Code is proprietary. 99.9999999...9999999% of algorithms are not. For example, wrapping some action in a foreach to make a batch processor and adding an option box to enable it is not a fucking proprietary algorithm no matter what the suits and the subhuman USPTO fuckwads think. Real-world example: There isn't a fucking thing proprietary in Marmalade's MKB build system (it's a stinking *build system* for fucks sake!). 
And even for any algos that are proprietary, if such algos even exist...well, why bother trying to get the source? If you've got the binary already, just *USE* it! Who cares about the damn source? If I wanted to give someone access to Marmalade's MKB build system, the fact that half of it is distributed in pyc-only form would do jack shit to stop me. And obviously there's no proprietary algos in there, again, it's just a fucking build system. So there's no algorithms to steal. The *only* thing it does is make it impractical for me to work around any problems I encounter. Oh yea, and it gives Marmalade's suit-department a big collective stiffy because their mini-monkey brains are telling them they're actually *earning* their paychecks. (Corona's 50x worse though, FWIW. You don't even *have* their software, you just rent the right to send your project to them and have *them* build it for you.)

Excess offtopic ranting aside, *some* things are opinion: "Red is the best color" <-- That's an opinion. "It is/isn't worthwhile to keep the source locked up." <-- Even *that's* an opinion, too, note the vague "worthwhile". But merely having different viewpoints doesn't make something opinion. Either it's opinion or it isn't. You're not stating mere opinion here - you're stating honest-to-goodness FACT: Considering well-obfuscated source less secure than compiled binary form *is* delusion, period.

I actually agree with you, I'm just telling you what I hear from PHB's.
 So you just take the existing .di
 files, complete with all their warts and function bodies and whatever,
 and run an obfuscator on them. Ship the .di and your shared library as
 usual.  Problem solved.

Or just skip the di entirely. It'll all just get obfuscated one way or the other, so there's not much point.

We need some way to export the symbols without the underlying code: it makes for faster compile times, and having the API handy can be useful to development tools. However, my experience with PHB's is that as long as you don't send out the actual source files but some form of sanitized header, the PHB's don't really care beyond that. That's why I think embedding a version of the source D files that has been semantically analyzed could be helpful. You can pull in the source for CTFE as needed, but the only thing you actually have to ship is the library file itself; it just happens to have source files inside. In my experience in the .NET world, this is good enough for the PHB's. Out of sight, out of mind, as they say. So what if it's trickery? We developers get a benefit too: we don't have to wrangle include files.
 Plus, all of this is already possible with the current system, except
 for the only missing piece of a D obfuscator.

DustMite has some obfuscation capability, although I don't know how extensive it is or whether it would be enough to make the pointy-haired suits happy. (Then again, *anything* can be enough to make a suit happy - you just have to present it in the right salesmany (read: "convoluted and full of shit") way. They'll swallow any amount of bullshit you give them, you just have to make it *sound* good. Suits don't know the difference. Fuck, most of them don't even know there *is* a difference. That's why salesmen exist - because bullshitting WORKS on suits (and on many others), in fact, most of the time, it's the *only* thing that works on suits. Bullshit is the only language those fuckers speak and understand.)
 (Wait, I hear you cry. But what about the library API? How would users
 know how to use the library if the .di is incomprehensible?  As somebody
 pointed out in another thread: just ship the ddocs generated from the
 unobfuscated source with the library. Users don't need to read the .di.
 Problem solved.)

Yea, the signatures (not even the definitions) of the symbols which make up the public interface are the only parts that must remain non-obfuscated. But of course, those *still* need to be non-obfuscated even under the old-fashioned C-style "lib + headers" approach. Don't even need any markup to signal these "don't touch" symbols to the obfuscator - just make a series of wrappers in separate files for the public API and tell the obfuscator "don't obfuscate the signatures in files x, y and z."

-- Adam Wilson IRC: LightBender Project Coordinator The Horizon Project http://www.thehorizonproject.org/
May 09 2012
prev sibling next sibling parent "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Thursday, 10 May 2012 at 03:17:20 UTC, Michaël Larouche wrote:
 On Thursday, 10 May 2012 at 02:59:22 UTC, Andrei Alexandrescu
  inline anyone?

I find the inline confusing, people could mistook it with a force inline attribute.

 Something like  compiletime would be more clear for the tool  
 and the user.

I know I have some functions that are used only at compile time, specifically ones that generate code for me when using mixins. It seems useless to me to compile and keep functions that aren't used at actual run time. Either way, compiletime, CT or CTFE would be a good label for marking a function so it is not stripped. And I agree, inline suggests it's forced. From what I've read, forcing inlining is usually self-defeating, since C/C++ compilers will automatically inline appropriate functions (during optimization) when the size/speed/code ratio is workable, regardless of the keyword hint.
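
A minimal sketch of the kind of compile-time-only helper described above, for readers who have not used string mixins. The names makeGetter and Point are hypothetical and chosen only for illustration; nothing here comes from druntime or Phobos. The point it tries to show is that the helper is never called at run time, yet its body has to be visible to the compiler wherever the mixin is expanded, which is exactly what a stripped .di would take away.

```d
import std.stdio : writeln;

// Hypothetical helper: builds the source text of a getter function.
// It is only ever evaluated by the compiler (CTFE), never at run time,
// so its body must be available to any module that mixes it in.
string makeGetter(string type, string name)
{
    return type ~ " get_" ~ name ~ "() { return " ~ name ~ "; }";
}

struct Point
{
    int x = 3;
    int y = 4;

    // mixin() requires a compile-time string, so makeGetter runs under CTFE.
    mixin(makeGetter("int", "x"));
    mixin(makeGetter("int", "y"));
}

void main()
{
    Point p;
    // The generated getters behave like ordinary member functions.
    writeln(p.get_x() + p.get_y());
}
```

If makeGetter lived in a library and the .di kept only its declaration, the two mixin lines would presumably stop compiling, even though the library binary itself would be unchanged.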
May 09 2012
prev sibling next sibling parent "Adam Wilson" <flyboynw gmail.com> writes:
On Wed, 09 May 2012 21:12:53 -0700, Nick Sabalausky  
<SeeWebsiteToContactMe semitwist.com> wrote:

 "Adam Wilson" <flyboynw gmail.com> wrote in message
 news:op.wd2prcc4707hn8 invictus.skynet.com...
 I actually agree with you, I'm just telling you what I hear from PHB's.

I was just kinda rambling anyway ;) Not directed at any particular poster.
 We need some way to export the symbols without the underlying code, it
 makes for faster compile times and having the API handy can be useful to
 development tools.
 However, my experience with PHB's is that as long as you don't send out
 the actual source files but some form of sanitized header, the PHB's
 don't really care beyond that.
 That's why I think embedding a version of the source D files that has
 been semantically analyzed could be helpful, you can pull in the source
 for CTFE as needed, but the only thing you have to actually ship out is
 the library file itself, it just happens to have source files inside. In
 my experience in the .NET world, this is good enough for the PHB's. Out
 of sight, out of mind as they say. So what if it's trickery, we
 developers get a benefit too, we don't have to wrangle include files.

Well, if that works for the PHBs, then it works for me (Hmm...Never thought I'd say something like that ;) ) Thinking about it more, I suppose it's debatable whether a PHB-compliant obfuscator or a lib-with-embedded-source would be easier to implement and deal with.

I'm a fan of embedded source as it's relatively easy to get from the compiler when it's time to build the output file. No extra steps required. :-) -- Adam Wilson IRC: LightBender Project Coordinator The Horizon Project http://www.thehorizonproject.org/
May 09 2012
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 09 May 2012 23:00:07 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 5/9/12 3:14 PM, Steven Schveighoffer wrote:
 On Wed, 09 May 2012 15:57:46 -0400, Adam D. Ruppe
 <destructionator gmail.com> wrote:

 The real WTF is we use .di files for druntime in the
 first place. It is performance sensitive and open source.

 We should be using the actual sources for inlining, ctfe,
 etc. anyway.

 Let's not torpedo the .di patch's value for just phobos.

I agree (although not generating .di files does not fix all the problems of inlining and ctfe -- there are many stubbed functions even in the .d files). In my opinion, .di generation should by default generate fully-stripped code except for templates. If you want functions to be CTFE-able, don't use auto-generated .di files to import them. -Steve

Actually the point here is to still be able to benefit of di automated generation while opportunistically marking certain functions as "put the body in the .di file".

If you aren't going to strip the files, I don't see the point in it. If you want a 'half stripped' .di file, use the plethora of shell commands to build it. The point is, dmd -H does the wrong thing, no matter which way you look at it. We have a tool to make a .di file with function bodies in it, it's called cp. dmd -H should do the thing that the shell cannot; let me worry about its granularity (i.e. I'll decide on a module basis which functions should be stripped). -Steve
May 10 2012
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 10 May 2012 10:47:59 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 5/10/12 6:17 AM, Steven Schveighoffer wrote:
 On Wed, 09 May 2012 23:00:07 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 Actually the point here is to still be able to benefit of di automated
 generation while opportunistically marking certain functions as "put
 the body in the .di file".

If you aren't going to strip the files, I don't see the point in it.

Inlining.

No, I mean if dmd -H isn't going to strip the files, what is the point of dmd -H? I can already copy the .d to .di and have inlining/ctfe, or simply use the .d directly. At this point, in order to get CTFE to work, you have to keep just about everything, including private imports. If we want to ensure CTFE works, dmd -H becomes a glorified cp. If we have some half-assed guess at what could be CTFE'd (which is growing by the day), then it's likely to not fit with the goals of the developer running dmd -H. -Steve
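
To make the trade-off concrete, here is a small sketch of why a stripped body breaks CTFE but not ordinary calls. The function square is a hypothetical stand-in, not anything from druntime, and the "stripped" form is shown only as a comment so that the file still compiles as written.

```d
// In the original module (.d) the body is present.
// `square` is a hypothetical example function.
int square(int n)
{
    return n * n;
}

// Initializing an enum forces the call to be evaluated at compile time.
enum nine = square(3);

// A fully stripped .di would carry only the declaration:
//
//     int square(int n);
//
// Run-time calls would still resolve against the compiled library at link
// time, but `enum nine = square(3);` would be rejected, because the
// compiler has no body to interpret.

void main()
{
    static assert(nine == 9);   // checked during compilation
    assert(square(4) == 16);    // checked at run time
}
```

This is also why inlining and CTFE are not symmetric: without a body the inliner simply falls back to a normal call, while CTFE has nothing to fall back on.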
May 10 2012
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 10 May 2012 12:04:44 -0400, deadalnix <deadalnix gmail.com> wrote:

 Le 10/05/2012 17:54, Steven Schveighoffer a écrit :
 On Thu, 10 May 2012 10:47:59 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 5/10/12 6:17 AM, Steven Schveighoffer wrote:
 On Wed, 09 May 2012 23:00:07 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 Actually the point here is to still be able to benefit of di automated
 generation while opportunistically marking certain functions as "put
 the body in the .di file".

 If you aren't going to strip the files, I don't see the point in it.

 Inlining.

 No, I mean if dmd -H isn't going to strip the files, what is the point
 of dmd -H? I can already copy the .d to .di and have inlining/ctfe, or
 simply use the .d directly.

 At this point, in order to get CTFE to work, you have to keep just about
 everything, including private imports. If we want to ensure CTFE works,
 dmd -H becomes a glorified cp. If we have some half-assed guess at what
 could be CTFE'd (which is growing by the day), then it's likely to not
 fit with the goals of the developer running dmd -H.

 -Steve

 If you can CTFE, you can know what is CTFEable. If it is currently half
 assed, then work on it and provide a better tool.

There is already a better tool -- cp. I ask again, what is the benefit of .di generation if it is mostly a glorified (faulty?) copy operation? As Adam points out in his original post, ensuring CTFE availability may not be (and is likely not) why you are creating a .di file. Plus, what isn't CTFEable today may be CTFEable tomorrow. inlining is one thing, because that's an optimization that has a valid fallback. CTFE does not. -Steve
May 10 2012
prev sibling next sibling parent "Adam Wilson" <flyboynw gmail.com> writes:
On Thu, 10 May 2012 09:56:06 -0700, Steven Schveighoffer
<schveiguy yahoo.com> wrote:

 On Thu, 10 May 2012 12:04:44 -0400, deadalnix <deadalnix gmail.com>
 wrote:

 Le 10/05/2012 17:54, Steven Schveighoffer a écrit :
 On Thu, 10 May 2012 10:47:59 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 5/10/12 6:17 AM, Steven Schveighoffer wrote:
 On Wed, 09 May 2012 23:00:07 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 Actually the point here is to still be able to benefit of di automated
 generation while opportunistically marking certain functions as "put
 the body in the .di file".

 If you aren't going to strip the files, I don't see the point in it.

 Inlining.

 No, I mean if dmd -H isn't going to strip the files, what is the point
 of dmd -H? I can already copy the .d to .di and have inlining/ctfe, or
 simply use the .d directly.

 At this point, in order to get CTFE to work, you have to keep just about
 everything, including private imports. If we want to ensure CTFE works,
 dmd -H becomes a glorified cp. If we have some half-assed guess at what
 could be CTFE'd (which is growing by the day), then it's likely to not
 fit with the goals of the developer running dmd -H.

 -Steve

 If you can CTFE, you can know what is CTFEable. If it is currently half
 assed, then work on it and provide a better tool.

 There is already a better tool -- cp. I ask again, what is the benefit
 of .di generation if it is mostly a glorified (faulty?) copy operation?

 As Adam points out in his original post, ensuring CTFE availability may
 not be (and is likely not) why you are creating a .di file.

 Plus, what isn't CTFEable today may be CTFEable tomorrow.

 inlining is one thing, because that's an optimization that has a valid
 fallback.  CTFE does not.

 -Steve

Exactly this. I am currently in the process of changing the DRuntime makefiles such that some of the files are not processed as DI's. This allows Phobos CTFE dependencies on the DRT to remain valid while still allowing DI's to be generated for parts where they matter, with the goal of making both a shared and static library build of the DRT. The tool I am using to accomplish this feat? cp. It works, it delivers exactly what we need, and it *is not* a broken operation like the current DI generation.

Like Steve said, most people generating DI files are not really worried about CTFE working; in fact they almost undoubtedly *know* that they are breaking CTFE, yet they choose to do it anyway. They have their reasons, and frankly, it doesn't concern us as compiler writers if those reasons don't line up with our personal moral world-view. Our job is to provide a tool that DOES WHAT PEOPLE EXPECT. Otherwise they will move on to one that does. If people expected DI generation to be a glorified (and not broken) copy operation, they would (and do) use cp. -- Adam Wilson IRC: LightBender Project Coordinator The Horizon Project http://www.thehorizonproject.org/
May 10 2012
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 10 May 2012 13:27:23 -0400, deadalnix <deadalnix gmail.com> wrote:

 Le 10/05/2012 18:56, Steven Schveighoffer a écrit :
 There is already a better tool -- cp. I ask again, what is the benefit
 of .di generation if it is mostly a glorified (faulty?) copy operation?

Please stop with that cp argument, this is complete bullshit.

Not complete. Maybe it's somewhat of an exaggeration ;)

But really, I look at the current situation that started this thread. The intention of .di header generation retaining implementation is to allow for inlining, not making CTFE available. Yet a side effect is that sometimes CTFE *is* available.

Well, let's say something becomes uninlinable, and now dmd decides to remove its implementation. But another piece of code is already depending on that source to be available for CTFE! Now you have broken code inadvertently, and the only way to fix it is to hand-edit the .di file.

I don't think the situation is "fixable" without deferring to the user. Either we defer by using compiler directives exclusively, or we defer by simply processing the entire file by stripping all the implementation out, then let the author decide what interface functions to put in those modules. Having the compiler decide makes no sense to me.

In most of the cases I've seen, dmd -H pretty much includes the whole file except comments and whitespace. But really, all it takes is it including one implementation that you *didn't* want public, and you are stuck hand-editing. It's not a very usable situation.
 As Adam points out in his original post, ensuring CTFE availability may
 not be (and is likely not) why you are creating a .di file.

 You want to create a di file to hide implementation of some
 functionality to the user of you lib. The better approach is to mark
 such code as this.

Marking code to specify whether it will be included or not is a valid solution. I would be fine with that too. But the compiler should stay out of the decision to strip or not based on optimization predictions.
 Note that in C/C++ you maintain headers manually. It is already a big
 improvement.

Let's not kid ourselves -- you have to do the same in D if you want a proper interface file. I agree the module system is way better than having an interface and implementation file separate. But when you actually *do* want it to be separate (for whatever reason), D pretty much devolves to C. -Steve
May 10 2012
prev sibling next sibling parent "Christopher Bergqvist" <spambox0 digitalpoetry.se> writes:
On Thursday, 10 May 2012 at 17:37:59 UTC, Adam Wilson wrote:
 On Thu, 10 May 2012 09:56:06 -0700, Steven Schveighoffer 
 <schveiguy yahoo.com> wrote:

 On Thu, 10 May 2012 12:04:44 -0400, deadalnix 
 <deadalnix gmail.com> wrote:

 Le 10/05/2012 17:54, Steven Schveighoffer a écrit :
 On Thu, 10 May 2012 10:47:59 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 5/10/12 6:17 AM, Steven Schveighoffer wrote:
 On Wed, 09 May 2012 23:00:07 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 Actually the point here is to still be able to benefit of 
 di automated
 generation while opportunistically marking certain 
 functions as "put
 the body in the .di file".

If you aren't going to strip the files, I don't see the point in it.

Inlining.

No, I mean if dmd -H isn't going to strip the files, what is the point of dmd -H? I can already copy the .d to .di and have inlining/ctfe, or simply use the .d directly. At this point, in order to get CTFE to work, you have to keep just about everything, including private imports. If we want to ensure CTFE works, dmd -H becomes a glorified cp. If we have some half-assed guess at what could be CTFE'd (which is growing by the day), then it's likely to not fit with the goals of the developer running dmd -H. -Steve

If you can CTFE, you can know what is CTFEable. If it is currently half assed, then work on it and provide a better tool.

There is already a better tool -- cp. I ask again, what is the benefit of .di generation if it is mostly a glorified (faulty?) copy operation? As Adam points out in his original post, ensuring CTFE availability may not be (and is likely not) why you are creating a .di file. Plus, what isn't CTFEable today may be CTFEable tomorrow. inlining is one thing, because that's an optimization that has a valid fallback. CTFE does not. -Steve

Exactly this. I am currently in the process of changing the DRuntime makefiles such that some of the files are not processed as DI's. This allows Phobos CTFE dependencies on the DRT to remain valid while still allowing DI's to be generated for parts where they matter, with the goal of making both a shared and static library build of the DRT. The tool I am using to accomplish this feat? cp. It works, it delivers exactly what we need and it's *is not* a broken operation like the current DI generation. Like Steve said, most people generating DI files are not really worried about CTFE working, in fact they almost undoubtedly *know* that they are breaking CTFE, yet they choose to do it anyways. They have their reasons, and frankly, it doesn't concern us as compiler writers if those reasons don't line up with our personal moral world-view. Our job is to provide a tool that DOES WHAT PEOPLE EXPECT. Otherwise they will move on to one that does. If people expected DI generation to be glorified (and not broken) copy operation, they would (and do) use cp.

How about: dmd -H mySource.d --keepImplementation MyClass.fooMethod ? It should be good enough for makefiles, as in the case of core.time/dur, but it gets a bit hairy with overloads (append "[0]" to select specific ones?). Maybe it requires semantic information, though.
May 10 2012
prev sibling next sibling parent "Adam Wilson" <flyboynw gmail.com> writes:
On Thu, 10 May 2012 10:57:37 -0700, Christopher Bergqvist <spambox0 digitalpoetry.se> wrote:

 On Thursday, 10 May 2012 at 17:37:59 UTC, Adam Wilson wrote:
 On Thu, 10 May 2012 09:56:06 -0700, Steven Schveighoffer <schveiguy yahoo.com> wrote:

 On Thu, 10 May 2012 12:04:44 -0400, deadalnix <deadalnix gmail.com> wrote:

 On 10/05/2012 17:54, Steven Schveighoffer wrote:
 On Thu, 10 May 2012 10:47:59 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 5/10/12 6:17 AM, Steven Schveighoffer wrote:
 On Wed, 09 May 2012 23:00:07 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 Actually the point here is to still be able to benefit of di automated
 generation while opportunistically marking certain functions as "put
 the body in the .di file".

If you aren't going to strip the files, I don't see the point in it.

Inlining.

No, I mean if dmd -H isn't going to strip the files, what is the point
 of dmd -H? I can already copy the .d to .di and have inlining/ctfe or
 simply use the .d directly.

 At this point, in order to get CTFE to work, you have to keep just about
 everything, including private imports. If we want to ensure CTFE works,
 dmd -H becomes a glorified cp. If we have some half-assed guess at what
 could be CTFE'd (which is growing by the day), then it's likely to not
 fit with the goals of the developer running dmd -H.

 -Steve

If you can CTFE, you can know what is CTFEable. If it is currently
 half assed, then work on it and provide a better tool.

There is already a better tool -- cp. I ask again, what is the
 benefit of .di generation if it is mostly a glorified (faulty?) copy
 operation?

 As Adam points out in his original post, ensuring CTFE availability
 may not be (and is likely not) why you are creating a .di file.

 Plus, what isn't CTFEable today may be CTFEable tomorrow.

 inlining is one thing, because that's an optimization that has a valid
 fallback.  CTFE does not.

 -Steve

Exactly this. I am currently in the process of changing the DRuntime
 makefiles such that some of the files are not processed as DI's. This
 allows Phobos CTFE dependencies on the DRT to remain valid while still
 allowing DI's to be generated for parts where they matter, with the
 goal of making both a shared and static library build of the DRT. The
 tool I am using to accomplish this feat? cp. It works, it delivers
 exactly what we need and it's *is not* a broken operation like the
 current DI generation.

 Like Steve said, most people generating DI files are not really worried
 about CTFE working, in fact they almost undoubtedly *know* that they
 are breaking CTFE, yet they choose to do it anyways. They have their
 reasons, and frankly, it doesn't concern us as compiler writers if
 those reasons don't line up with our personal moral world-view. Our job
 is to provide a tool that DOES WHAT PEOPLE EXPECT. Otherwise they will
 move on to one that does. If people expected DI generation to be
 glorified (and not broken) copy operation, they would (and do) use cp.

 How about:
 dmd -H mySource.d --keepImplementation MyClass.fooMethod
 ?

 It should be good enough for makefiles as in the case of core.time/dur,
 but get's a bit hairy with overloads (append "[0]" to select specific
 ones?).  Maybe it requires semantic information though.

It does require some semantic information. And the solution I've seen most talked about here is some kind of attribute similar to pure that tells the compiler to include the implementation in the DI file. IMO, this is a fine solution, but the compiler cannot be involved in the decision to keep an implementation in or out based on anything other than programmer directives because the compiler just doesn't know what's being depended on. That's how we ended up where we are today: DI files are the source with unittests and comments removed. -- Adam Wilson IRC: LightBender Project Coordinator The Horizon Project http://www.thehorizonproject.org/
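
To make the attribute idea a bit more concrete, here is a minimal sketch. Everything named in it is an assumption for illustration only: the marker `keepImplementation`, the functions `toTicks` and `helper`, and the choice of a user-defined attribute as the spelling. No such attribute is understood by `dmd -H`, and the thread does not settle on a syntax.

```d
import std.traits : hasUDA;

// Hypothetical marker type: NOT a real compiler attribute.
struct keepImplementation {}

// The author opts in: "keep this body in the generated .di".
@keepImplementation
long toTicks(long seconds) { return seconds * 10_000_000; }

// Unmarked: a stripping generator would reduce this to a bare declaration.
long helper(long x) { return x + 1; }

// The marker sits on the declaration itself, so finding it requires no
// guess about what is CTFEable. These checks only confirm it is attached.
static assert( hasUDA!(toTicks, keepImplementation));
static assert(!hasUDA!(helper, keepImplementation));

void main()
{
    assert(toTicks(2) == 20_000_000);
    assert(helper(1) == 2);
}
```

The point of the sketch is only that the decision is recorded by the programmer rather than inferred by the compiler.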
May 10 2012
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Thursday, May 10, 2012 10:37:58 Adam Wilson wrote:
 Like Steve said, most people generating DI files are not really worried
 about CTFE working, in fact they almost undoubtedly *know* that they are
 breaking CTFE, yet they choose to do it anyways.

Actually, I expect that they _don't_ know in most cases until they've actually done it and had any CTFE stuff that they do break (or nothing break if they don't use CTFE). However, ultimately, if you decide to use .di files, you _are_ choosing between having a stripped interface and having CTFE and inlinability. So, even if the programmer is not fully aware of the tradeoffs when they first attempt it, that's ultimately what they have to decide.

Honestly, I think that if you really want to be using .di files though, in most cases, you're going to have to maintain them by hand. As such, you basically have the choice between copying the .d file and then stripping it down by hand or using the tool to strip it and then adding stuff back in by hand. I really think that druntime's choice of automatically generating .di files as part of the build process is a flawed idea in the general case. But it looks like you've started the process of changing how druntime deals with that.

I would warn you however that taking the approach of just copying the implementation over for what Phobos needs for CTFE is inherently flawed and will undoubtedly break existing programs. I'd argue that for the most part, anything that currently has its implementation in druntime's .di files needs to keep it. The sole exceptions would be stuff which isn't CTFEable (e.g. the rt stuff) and stuff which already has a hand-written .di file (like object and thread). So, you should be erring on the side of not generating the .di file with your updated .di generator rather than using it as far as druntime goes.

- Jonathan M Davis
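
The trade-off between a stripped interface and CTFE can be sketched as follows. This is a minimal illustration, and `square` is a hypothetical library function rather than anything from druntime or Phobos.

```d
// Hypothetical library function. If an importer saw only the stripped
// declaration `int square(int x);` from a .di file, the enum below could
// not be evaluated, because CTFE needs the function body.
int square(int x) { return x * x; }

// An enum initializer must be computed at compile time, so this forces CTFE.
enum compileTime = square(7);
static assert(compileTime == 49);

void main()
{
    // An ordinary run-time call only needs the symbol at link time,
    // so it keeps working against a stripped interface.
    int runTime = square(7);
    assert(runTime == compileTime);
}
```

With the body present both uses compile. With only the declaration visible, the run-time call would still link against the library while the `enum` line would be rejected.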
May 10 2012
prev sibling next sibling parent "Adam Wilson" <flyboynw gmail.com> writes:
On Thu, 10 May 2012 11:10:15 -0700, David Gileadi <gileadis nspmgmail.com> wrote:

 On 5/10/12 11:01 AM, Adam Wilson wrote:
 It does require some semantic information. And the solution I've seen
 seen most talked about here is some kind of attribute similar to  pure
 that tells the compiler to include the implementation in the DI file.

I may be off-base here, but this strikes me as a good case for a pragma. No?

Well, it needs to be at a function level to be useful. -- Adam Wilson IRC: LightBender Project Coordinator The Horizon Project http://www.thehorizonproject.org/
May 10 2012
prev sibling next sibling parent "Adam Wilson" <flyboynw gmail.com> writes:
On Thu, 10 May 2012 11:04:34 -0700, Jonathan M Davis <jmdavisProg gmx.com> wrote:

 On Thursday, May 10, 2012 10:37:58 Adam Wilson wrote:
 Like Steve said, most people generating DI files are not really worried
 about CTFE working, in fact they almost undoubtedly *know* that they are
 breaking CTFE, yet they choose to do it anyways.

Actually, I expect that they _don't_ know in most cases until they've actually done it and had any CTFE stuff that they do break (or nothing break if they don't use CTFE). However, ultimately, if you decide to use .di files, you _are_ choosing between having a stripped interface and having CTFE and inlinability. So, even if the programmer is not fully aware of the tradeoffs when they first attempt it, that's ultimately what they have to decide.

Yes they will, but to be honest, I've never met a competent native language programmer who didn't understand that using something like DI files was going to be a trade-off at some level. They might not know to what degree initially, but the understanding is there. More importantly, the user is already expecting some things to not work with that model, so they won't be shocked when something actually doesn't.

 Honestly, I think that if you really want to be using .di files though,
 in most cases, you're going to have to maintain them by hand. As such,
 you basically have the choice between copying the .d file and then
 stripping it down by hand or using the tool to strip it and then adding
 stuff back in by hand. I really think that druntime's choice of
 automatically generating .di files as part of the build process is a
 flawed idea in the general case. But it looks like you've started the
 process of changing how druntime deals with that.

Well, the DI generator does a few things intelligently now so that you don't have to hand-modify as much, namely keeping implementations for auto functions and template functions while stripping the rest. I tried to balance what was needed to make it work at all with what is expected by its likely users. I think this patch will cover 95% of the DI use cases out there. And 5% modification beats the pants off of 100% modification.
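
As a rough illustration of those rules, the sketch below shows the three kinds of function side by side. The function names are hypothetical, and the comments describe what the rules stated above imply, not verified output of the patched generator.

```d
// Ordinary function: the signature says everything an importer needs,
// so a stripped .di could carry just `int area(int w, int h);`.
int area(int w, int h) { return w * h; }

// auto function: the return type is inferred from the body, so removing
// the body would leave the declaration without a return type.
auto perimeter(int w, int h) { return 2 * (w + h); }

// Template function: it is instantiated in the importing module,
// which therefore needs to see the full source.
T scale(T)(T value, T factor) { return value * factor; }

// The inferred return type is only known because the body is visible.
static assert(is(typeof(perimeter(1, 2)) == int));

void main()
{
    assert(area(3, 4) == 12);
    assert(perimeter(3, 4) == 14);
    assert(scale(2.5, 2.0) == 5.0);
}
```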
 I would warn you however that taking the approach of just copying the
 implementation over for what Phobos needs for CTFE is inherently flawed
 and will undoubtedly break existing programs. I'd argue that for the
 most part, anything that currently has its implementation in druntime's
 .di files needs to keep it. The sole exceptions would be stuff which
 isn't CTFEable (e.g. the rt stuff) and stuff which already has a
 hand-written .di file (like object and thread). So, you should be erring
 on the side of not generating the .di file with your updated .di
 generator rather than using it as far as druntime goes.

 - Jonathan M Davis

I am actually in the process of doing just that. I've removed some of the DRT files from DI generation: most of core.* and all of core.stdc.*. I am taking it slow so that I can be deliberate about my testing, precisely because the DRT is the foundation of every D program in existence and I don't want to break them. For now I've removed the things that can be pretty obviously removed. However, the goal is to make the DRT a shared library, and that means that some things will necessarily require DI files so they don't get accidentally built into the program and create a problem of incompatible versions; but mostly that stuff is, as you mentioned, the actual runtime parts. -- Adam Wilson IRC: LightBender Project Coordinator The Horizon Project http://www.thehorizonproject.org/
May 10 2012
prev sibling next sibling parent "Adam Wilson" <flyboynw gmail.com> writes:
On Thu, 10 May 2012 11:22:36 -0700, Timon Gehr <timon.gehr gmx.ch> wrote:

 On 05/10/2012 08:15 PM, Adam Wilson wrote:
 On Thu, 10 May 2012 11:10:15 -0700, David Gileadi
 <gileadis nspmgmail.com> wrote:

 On 5/10/12 11:01 AM, Adam Wilson wrote:
 It does require some semantic information. And the solution I've seen
 seen most talked about here is some kind of attribute similar to  pure
 that tells the compiler to include the implementation in the DI file.

I may be off-base here, but this strikes me as a good case for a pragma. No?

Well, it's needs to be at a function level to be useful.

pragmas can apply to declarations. The syntax is

pragma(identifier,...) Declaration

(Where Declaration can be the empty declaration, ';')

pragma(keepImplementation) void foo(){ ... }

That could work, although it's more typing than I personally want to do. It depends on how much of the pragma the DI generator actually sees though ... you'd be surprised at what it doesn't see. -- Adam Wilson IRC: LightBender Project Coordinator The Horizon Project http://www.thehorizonproject.org/
May 10 2012
prev sibling next sibling parent "Adam Wilson" <flyboynw gmail.com> writes:
On Thu, 10 May 2012 11:25:11 -0700, deadalnix <deadalnix gmail.com> wrote:

 On 10/05/2012 19:51, Steven Schveighoffer wrote:
 On Thu, 10 May 2012 13:27:23 -0400, deadalnix <deadalnix gmail.com> wrote:

 On 10/05/2012 18:56, Steven Schveighoffer wrote:
 There is already a better tool -- cp. I ask again, what is the benefit
 of .di generation if it is mostly a glorified (faulty?) copy operation?

Please stop with that cp argument, this is complete bullshit.

Not complete. Maybe it's somewhat of an exaggeration ;)

But really, I look at the current situation that started this thread.
 The intention of .di header generation retaining implementation is to
 allow for inlining, not making CTFE available. Yet a side effect is that
 sometimes CTFE *is* available.

 Well, let's say something becomes uninlinable, and now dmd decides to
 remove its implementation. But another piece of code is already
 depending on that source to be available for CTFE! Now you have broken
 code inadvertently, and the only way to fix it is to hand-edit the .di
 file.

The di generator can remove code that isn't CTFEable (at least can be
 proven to not be CTFEable). It is the case in your example.

False. No semantic analysis has been performed, therefore DMD has no idea what is and is not CTFEable.
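
As an aside on why "is this CTFEable?" is hard to answer from syntax alone, consider the sketch below. It is an illustration with a hypothetical function `checked`, not a description of DMD's internals: whether a call can be evaluated at compile time depends on the path the interpreter actually takes through the body.

```d
import core.stdc.stdio : printf;

int checked(int x)
{
    if (x < 0)
        printf("negative input\n"); // C library call: no source for CTFE to interpret
    return x * 2;
}

// Evaluated at compile time. The printf branch is never reached,
// so the interpreter has nothing to object to.
enum ok = checked(21);
static assert(ok == 42);

// Uncommenting the next line should be rejected, because evaluating it
// would have to execute printf at compile time.
// enum bad = checked(-1);

void main()
{
    assert(checked(21) == 42);
    assert(checked(-1) == -2); // fine at run time; prints the message
}
```

The same function is usable in CTFE for some arguments and not for others, so a generator that has not even run semantic analysis has little to go on.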
 But the compiler should stay out of the decision to strip or not based
 on optimization predictions.

The compiler should provide something by default. It is up to the user
 to mark the code accordingly.

This is pretty much what we are advocating with the attribute solution.

 I agree the module system is way better than having an interface and
 implementation file separate. But when you actually *do* want it to be
 separate (for whatever reason), D pretty much devolves to C.

At least it is not worse.

Does not D strive to be better than C++, much less C? I think we can do better with DI's too. -- Adam Wilson IRC: LightBender Project Coordinator The Horizon Project http://www.thehorizonproject.org/
May 10 2012
prev sibling next sibling parent "Adam Wilson" <flyboynw gmail.com> writes:
On Thu, 10 May 2012 11:32:27 -0700, deadalnix <deadalnix gmail.com> wrote:

 On 10/05/2012 20:22, Timon Gehr wrote:
 On 05/10/2012 08:15 PM, Adam Wilson wrote:
 On Thu, 10 May 2012 11:10:15 -0700, David Gileadi
 <gileadis nspmgmail.com> wrote:

 On 5/10/12 11:01 AM, Adam Wilson wrote:
 It does require some semantic information. And the solution I've seen
 seen most talked about here is some kind of attribute similar to pure
 that tells the compiler to include the implementation in the DI file.

 I may be off-base here, but this strikes me as a good case for a
 pragma. No?

Well, it's needs to be at a function level to be useful.

pragmas can apply to declarations. The syntax is

pragma(identifier,...) Declaration

(Where Declaration can be the empty declaration, ';')

pragma(keepImplementation) void foo(){ ... }

You want to specify strip implementation, not keep implementation.

Strip implementation may break things. Keeping it cannot. The default
 behavior should be on the safe side of the medal.

 The DIfier can remove code if it knows that it isn't CTFEable or don't
 worth inlining by default. Additional code removal can be specified by
 attributes.

The problem is that it DOES NOT know if it's CTFEable or not. No analysis is performed prior to DI generation! -- Adam Wilson IRC: LightBender Project Coordinator The Horizon Project http://www.thehorizonproject.org/
May 10 2012
prev sibling next sibling parent "Adam Wilson" <flyboynw gmail.com> writes:
On Thu, 10 May 2012 11:24:04 -0700, Paulo Pinto <pjmlp progtools.org> wrote:

 On 10.05.2012 01:01, Adam Wilson wrote:
 On Wed, 09 May 2012 15:55:36 -0700, Mehrdad <wfunction hotmail.com> wrote:

 I am 100% for this. It would be very .NET like. In fact I'm curious
 enough what it would take to make this work that I could see myself
 trying. My guess is that it needs a new linker with the glorious
 side-effect of dumping optlink! In that case it would mean upgrading
 the D backend to emit COFF (ELF and Mach-O already support custom
 sections), which I am fine with trying to do. Then you would add your
 AST or other intermediate representations to a custom section in the
 object file and the linker could then link it in. D would then need a
 way to extract said information. Which would not be terribly hard.
 Except that you'll have to train other compilers how to read that IR.
 Maybe we could train D to read the LLVM IR?

:O I was writing a response pretty much exactly like this (i.e. doing what .NET does), but then I dumped it, thinking it'd be dismissed as too huge of a change...

:-D This isn't the first time it's been suggested in recent forum history. I think there is a significant body of support for making D libraries single file with no import files, it solves a *TON* of issues around how to import API's. I imagine that it's much the same reason .NET went with their metadata plan. And ended up where we are suggesting to go.

And also possible in languages like Turbo Pascal, Delphi or more recent, Go. Actually, this is one of the features I really like in Go. -- Paulo

I am seriously considering starting this type of project given how strong the support for it is. However, I'd need help. Linkers aren't easy and the modifications that DMD will require are even worse. In the end we get a modern linker, written in D, and COFF support for DMD. At least that's how it goes in my head. I am thinking of kicking off the project proposal with a more detailed post later today. -- Adam Wilson IRC: LightBender Project Coordinator The Horizon Project http://www.thehorizonproject.org/
May 10 2012
prev sibling next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, May 10, 2012 at 08:26:45PM +0200, Paulo Pinto wrote:
 On 10.05.2012 00:34, Joseph Rushton Wakeling wrote:
On 10/05/12 00:25, H. S. Teoh wrote:
Which is what fueled the market for hundreds (if not thousands) of
JS obfuscators.

Well, that's kind of my point really. Is it so bad (from a proprietary point of view) to have to distribute .d rather than .di files, if you can obfuscate them?

Try to find an error on an obfuscated dump. Not fun.

Yeah, like those Javascript errors from deep within the compressed core of Yahoo UI or jQuery, which are impossible to figure out because function names, variable names, etc., are all compressed. It's much worse with deliberately obfuscated source. But it still wouldn't stop a determined code thief.

This is one of many reasons I prefer OSS. Trying to prevent code "theft" is an exercise in futility, reduces utility, and only results in debility. But if the only way for some people to be happy is to obfuscate their code, well, then they just have to live with the consequences.

Anyway, all of this just backs up my proposal that instead of obfuscation, we just store the parsed source (or publicly exposed parts of it) in the object file in some kind of intermediate form (like ASTs). Don't bother with .di's at all, just generate the ddocs and let users use that as reference; you just ship the binary shared lib. The compiler can load the ASTs as necessary whenever it needs function bodies, etc. Problem solved.

T

-- Let's not fight disease by killing the patient. -- Sean 'Shaleh' Perry
May 10 2012
prev sibling next sibling parent "Adam Wilson" <flyboynw gmail.com> writes:
On Thu, 10 May 2012 11:51:00 -0700, H. S. Teoh <hsteoh quickfur.ath.cx> wrote:

 On Thu, May 10, 2012 at 08:26:45PM +0200, Paulo Pinto wrote:
 On 10.05.2012 00:34, Joseph Rushton Wakeling wrote:
On 10/05/12 00:25, H. S. Teoh wrote:
Which is what fueled the market for hundreds (if not thousands) of
JS obfuscators.

Well, that's kind of my point really. Is it so bad (from a proprietary point of view) to have to distribute .d rather than .di files, if you can obfuscate them?

Try to find an error on an obfuscated dump. Not fun.

Yeah, like those Javascript errors from deep within the compressed core of Yahoo UI or jQuery, which are impossible to figure out because function names, variable names, etc., are all compressed. It's much worse with deliberately obfuscated source. But it still wouldn't stop a determined code thief. This is one of many reasons I prefer OSS. Trying to prevent code "theft" is an exercise in futility, reduces utility, and only results in debility. But if the only way for some people to be happy is to obfuscate their code, well, then they just have to live with the consequences. Anyway, all of this just backs up my proposal that instead of obfuscation, we just store the parsed source (or publically-exposed parts of it) in the object file in some kind of intermediate form (like ASTs). Don't bother with .di's at all, just generate the ddocs and let users use that as reference, you just ship the binary shared lib. The compiler can load the ASTs as necessary whenever it needs function bodies, etc.. Problem solved. T

I think this represents the best possible long-term strategy. However, getting D to do all that is going to take a *LOT* of effort. I've decided that I will propose just such a project later today, but I'll need help. Linkers aren't easy and the changes to DMD will be even worse. In the end though, we'll get a modular linker written in D and COFF support for DMD. But we need something in the meantime, and for now the DI patch will suffice. -- Adam Wilson IRC: LightBender Project Coordinator The Horizon Project http://www.thehorizonproject.org/
May 10 2012
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 10 May 2012 14:25:04 -0400, Adam Wilson <flyboynw gmail.com> wrote:

 On Thu, 10 May 2012 11:22:36 -0700, Timon Gehr <timon.gehr gmx.ch> wrote:

 On 05/10/2012 08:15 PM, Adam Wilson wrote:
 On Thu, 10 May 2012 11:10:15 -0700, David Gileadi
 <gileadis nspmgmail.com> wrote:

 On 5/10/12 11:01 AM, Adam Wilson wrote:
 It does require some semantic information. And the solution I've seen
 seen most talked about here is some kind of attribute similar to pure
 that tells the compiler to include the implementation in the DI file.

I may be off-base here, but this strikes me as a good case for a pragma. No?

Well, it's needs to be at a function level to be useful.

pragmas can apply to declarations. The syntax is

pragma(identifier,...) Declaration

(Where Declaration can be the empty declaration, ';')

pragma(keepImplementation) void foo(){ ... }

That could work, although it's more typing than I personally want to do. It depends on how much of the pragma the DI generator actually sees though ... you'd be surprised at what it doesn't see.

pragma == specific to compiler
attribute == language feature.

I think we should go with language feature on this one.

-Steve
May 10 2012
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 10 May 2012 14:32:27 -0400, deadalnix <deadalnix gmail.com> wrote:

 On 10/05/2012 20:22, Timon Gehr wrote:
 On 05/10/2012 08:15 PM, Adam Wilson wrote:
 On Thu, 10 May 2012 11:10:15 -0700, David Gileadi
 <gileadis nspmgmail.com> wrote:

 On 5/10/12 11:01 AM, Adam Wilson wrote:
 It does require some semantic information. And the solution I've seen
 seen most talked about here is some kind of attribute similar to pure
 that tells the compiler to include the implementation in the DI file.

 I may be off-base here, but this strikes me as a good case for a
 pragma. No?

Well, it's needs to be at a function level to be useful.

pragmas can apply to declarations. The syntax is

pragma(identifier,...) Declaration

(Where Declaration can be the empty declaration, ';')

pragma(keepImplementation) void foo(){ ... }

You want to specify strip implementation, not keep implementation.

No, it's definitely keep implementation. By default, I want .di files to contain nothing but interface. If I wanted the source by default, I wouldn't be using .di files.

 Strip implementation may break things. Keeping it cannot. The default
 behavior should be on the safe side of the medal.

Current behavior is junk, there is no reason to save it. -Steve
May 10 2012
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 10 May 2012 15:53:47 -0400, deadalnix <deadalnix gmail.com> wrote:

 On 10/05/2012 21:12, Steven Schveighoffer wrote:
 No, it's definitely keep implementation. By default, I want .di files to
 contain nothing but interface. If I wanted the source by default, I
 wouldn't be using .di files.

 Strip implementation may break things. Keeping it cannot. The default
 behavior should be on the safe side of the medal.

Current behavior is junk, there is no reason to save it.

This isn't the current behavior we talking about here.

Then what "breaks"? If you aren't using di generation, how can changing the way di generation works break your code? -Steve
May 10 2012
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 10 May 2012 16:16:39 -0400, deadalnix <deadalnix gmail.com> wrote:

 On 10/05/2012 21:57, Steven Schveighoffer wrote:
 Then what "breaks"? If you aren't using di generation, how can changing
 the way di generation works break your code?

I don't know in which word you live, but in mine, 100% of project I'm
 doing use 3rd party code.

 You don't have control on 3rd party code.

It is not the job of the compiler or language to make up for the failure to run *basic tests* of your third party vendors. -Steve
May 10 2012
prev sibling next sibling parent "Adam Wilson" <flyboynw gmail.com> writes:
On Thu, 10 May 2012 12:51:03 -0700, deadalnix <deadalnix gmail.com> wrote:

 On 10/05/2012 20:35, Adam Wilson wrote:
 The problem is that it DOES NOT know if it's CTFEable or not. No
 analysis is performed prior to DI generation!

It doesn't seems undoable.

It isn't, but it would require that DI generation got its own specialized form of semantic analysis, and that is a significant amount of work. I'm not saying it shouldn't be done, just that it's not a valid short-term solution. A long-term solution would be to embed a semantically analyzed form of the source into the object itself. But that's years away with concerted group effort. -- Adam Wilson IRC: LightBender Project Coordinator The Horizon Project http://www.thehorizonproject.org/
May 10 2012
prev sibling next sibling parent "Adam Wilson" <flyboynw gmail.com> writes:
On Thu, 10 May 2012 14:25:18 -0700, deadalnix <deadalnix gmail.com> wrote:

 On 10/05/2012 22:39, Adam Wilson wrote:
 On Thu, 10 May 2012 12:51:03 -0700, deadalnix <deadalnix gmail.com> wrote:

 On 10/05/2012 20:35, Adam Wilson wrote:
 The problem is that it DOES NOT know if it's CTFEable or not. No
 analysis is performed prior to DI generation!

It doesn't seems undoable.

It isn't, but it would require that DI generation got it's own
 specialized form of semantic analysis, and that is a significant amount
 of work. I'm not saying it shouldn't be done, just that it's not a valid
 short-term solution. A long-term solution would be to embed a
 semantically analyzed form of the source into the object itself. But
 that's years away with concerted group effort.

I wouldn't introduce a language feature for short term solution. This
 can lead to tedious technical debt to manage.

I would tend to agree with you on that. And personally I would be fine without it and let DI generation do its thing as is. It's an option though. -- Adam Wilson IRC: LightBender Project Coordinator The Horizon Project http://www.thehorizonproject.org/
May 10 2012
prev sibling parent "Adam Wilson" <flyboynw gmail.com> writes:
On Thu, 10 May 2012 09:56:06 -0700, Steven Schveighoffer <schveiguy yahoo.com> wrote:

 On Thu, 10 May 2012 12:04:44 -0400, deadalnix <deadalnix gmail.com> wrote:

 On 10/05/2012 17:54, Steven Schveighoffer wrote:
 On Thu, 10 May 2012 10:47:59 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 5/10/12 6:17 AM, Steven Schveighoffer wrote:
 On Wed, 09 May 2012 23:00:07 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 Actually the point here is to still be able to benefit of di automated
 generation while opportunistically marking certain functions as "put
 the body in the .di file".

If you aren't going to strip the files, I don't see the point in it.

Inlining.

No, I mean if dmd -H isn't going to strip the files, what is the point
 of dmd -H? I can already copy the .d to .di and have inlining/ctfe, or
 simply use the .d directly.

 At this point, in order to get CTFE to work, you have to keep just about
 everything, including private imports. If we want to ensure CTFE works,
 dmd -H becomes a glorified cp. If we have some half-assed guess at what
 could be CTFE'd (which is growing by the day), then it's likely to not
 fit with the goals of the developer running dmd -H.

 -Steve

If you can CTFE, you can know what is CTFEable. If it is currently half
 assed, then work on it and provide a better tool.

There is already a better tool -- cp. I ask again, what is the benefit
 of .di generation if it is mostly a glorified (faulty?) copy operation?

 As Adam points out in his original post, ensuring CTFE availability may
 not be (and is likely not) why you are creating a .di file.

 Plus, what isn't CTFEable today may be CTFEable tomorrow.

 inlining is one thing, because that's an optimization that has a valid
 fallback.  CTFE does not.

 -Steve

FYI, I've submitted the pull request and it is passing the autotester. -- Adam Wilson IRC: LightBender Project Coordinator The Horizon Project http://www.thehorizonproject.org/
May 11 2012