
digitalmars.D - Proposed improvements to the separate compilation model

reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
A chat in IRC revealed a couple of possible improvements to the 
development scenario in which the interface file (.di) is managed 
manually and separately from the implementation file (.d).

After discussing a few ideas with Walter, the following language 
improvement came about. Consider a very simple scenario in which we have 
files a.d, a.di, and client.d, all situated in the same directory, with 
the following contents:

// a.di
class A { private int x; int foo(); }

// a.d
import a;
int A.foo() { return x + 1; }

// client.d
import a;
void main() { (new A).foo(); }

To compile:

dmd -c a.d
dmd -c client.d
dmd client.o a.o

Currently, in order for a program with separately-implemented methods to 
work properly, there must be TWO different files for the same class, and 
the .di file and the .d file MUST specify the same exact field layout. 
Ironically, the .d file is forbidden from including the .di file, which 
makes any checks and balances impossible. Any change in the layout (e.g. 
swapping, inserting, or removing fields) is undetectable and has 
undefined behavior.
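The hazard is easy to reproduce. Below is a hedged sketch (an extra field y is added to the earlier example purely for illustration) of a drift that today's compiler cannot detect:

```d
// a.di - the interface that client.d compiles against
class A { private int x; private int y; int foo(); }

// a.d - someone later swapped x and y here, but not in a.di.
// Client code now reads A.x at the offset where the implementation
// stores y; there is no compile-time, link-time, or run-time error,
// just silently wrong values.
class A { private int y; private int x; int foo() { return x + 1; } }
```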

I see this as a source of problems going forward, and I propose the 
following changes to the language:

1. The compiler shall accept method definitions outside a class.

2. A method cannot be implemented unless it was declared with the same 
signature in the class definition.

Walter is reluctantly on board with a change in this direction, with the 
note that he'd just recommend interfaces for this kind of separation. My 
stance in this matter is that we shouldn't constrain without necessity 
the ability of programmers to organize the physical design of their 
large projects.

Please discuss here. I should mention that Walter has his hands too full 
to embark on this, so implementation should be approached by one of our 
other dmd contributors (after of course there's a shared view on the 
design).


Andrei
Jul 22 2011
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 7/22/11 5:06 PM, Andrei Alexandrescu wrote:
[snip]

I almost forgot - we should also have a means to allow a class to import 
its own incomplete declaration, as follows:

// a.di
class A { private int x; int foo(); }

// a.d
import a;
class A { private int x; int foo() { return x + 1; } }

The compiler should accept this because that's what the .di generator 
outputs. The import will be used as a means to verify that the layout 
declared in the .di file is identical to the one in the .d file.

Without this feature, it would be impossible for the compiler to ensure 
that no inadvertent change was made to one of the files.


Andrei
Jul 22 2011
parent Brad Roberts <braddr slice-2.puremagic.com> writes:
On Fri, 22 Jul 2011, Andrei Alexandrescu wrote:

 On 7/22/11 5:06 PM, Andrei Alexandrescu wrote:
 [snip]
 
 I almost forgot - we should also have a means to allow a class to import its
 own incomplete declaration, as follows:
 
 // a.di
 class A { private int x; int foo(); }
 
 // a.d
 import a;
 class A { private int x; int foo() { return x + 1; } 
}
 
 The compiler should accept this because that's what the .di generator outputs.
 The import will be used as a means to verify that the layout declared in the
 .di file is identical to the one in the .d file.
 
 Without this feature, it would be impossible for the compiler to ensure that
 no inadvertent change was made to one of the files.
 
 
 Andrei
How about another case:

// a.di
class A { private int x; int foo() { return x+1; } }

// a.d
import a;
class A { private int x; int foo() { return x+1; } }

i.e., a case where the .di file has a function body (be it due to manual
creation or automatic generation). How closely do they need to match?

Later,
Brad
Jul 22 2011
prev sibling next sibling parent bearophile <bearophileHUGS lycos.com> writes:
Andrei:

 Currently, in order for a program with separately-implemented methods to 
 work properly, there must be TWO different files for the same class, and 
 the .di file and the .d file MUST specify the same exact field layout. 
 Ironically, the .d file is forbidden from including the .di file, which 
 makes any checks and balances impossible. Any change in the layout (e.g. 
 swapping, inserting, or removing fields) is undetectable and has 
 undefined behavior.
Aren't .di files automatically generated? So isn't it the job of tools
like IDEs to keep .di files in sync with their .d files? .di files are
better left as computer-generated things.

And if you really want to be sure, why don't you add something like
this, which asks the compiler to verify that they are in sync (most
times a 128-bit hash of the .d module copied inside the .di file is
enough to be sure the .d is in sync with its .di file):

pragma(di_sync);

There's also the possibility of more fine-grained hashing, usable to
assert that the things you care about in a struct/class (like its
layout) have not changed compared to the .di file.

Bye,
bearophile
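A build-side approximation of the check bearophile sketches might look like the following. This is purely hypothetical: pragma(di_sync) does not exist, the convention of embedding the hash in the .di file's first line is invented here, and std.digest postdates this thread:

```d
// Hypothetical sketch: verify a .di file against the .d it claims to
// mirror by looking for an embedded MD5 of the .d source, e.g. a first
// line of the form "// di_sync: 9E107D9D372BB6826BD81D3542A419D6".
import std.algorithm.searching : canFind;
import std.digest.md : md5Of, toHexString;
import std.file : read;
import std.string : lineSplitter;

bool diInSync(string dPath, string diPath)
{
    // Hash the implementation file...
    auto hash = toHexString(md5Of(read(dPath)));
    // ...and look for that hash in the interface file's first line.
    auto firstLine = (cast(string) read(diPath)).lineSplitter.front;
    return firstLine.canFind(hash[]);
}
```

A build script would call diInSync for every module and fail the build on a mismatch, which is one concrete reading of the pragma's intended semantics.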
Jul 22 2011
prev sibling next sibling parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Sat, 23 Jul 2011 01:06:19 +0300, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 A chat in IRC revealed a couple of possible improvements to the  
 development scenario in which the interface file (.di) is managed  
 manually and separately from the implementation file (.d).
I might be forgetting bits of our discussions at the moment, but has it
been considered that, instead of facilitating the use of
manually-maintained .di files, we instead improve auto-generated .di
files to the point that manual maintenance is not necessary?

Some ideas:

1) As discussed, CTFE-ability (include function bodies in the .di to
allow use in CTFE)
2) Removing imports for modules containing symbols not mentioned in the
.di code (avoid pulling in dependencies of the implementation)
3) Perhaps adding opaque objects ("class X;") to the language and
emitting those as appropriate? (semi-automatic Pimpls)

Did I miss anything?

-- 
Best regards,
Vladimir mailto:vladimir thecybershadow.net
Jul 22 2011
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 7/22/11 9:29 PM, Vladimir Panteleev wrote:
 On Sat, 23 Jul 2011 01:06:19 +0300, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 A chat in IRC revealed a couple of possible improvements to the
 development scenario in which the interface file (.di) is managed
 manually and separately from the implementation file (.d).
 I might be forgetting bits of our discussions at the moment, but has it
 been considered that instead of facilitating use of manually-maintained
 .di files, to instead improve auto-generated .di files to the point
 that manual maintenance is not necessary?

 Some ideas:
 1) As discussed, ctfeable (include function body in .di to allow using
 in CTFE)
 2) Removing imports for modules containing symbols not mentioned in .di
 code (avoid pulling in dependencies of the implementation)
 3) Perhaps, adding opaque objects ("class X;") to the language and
 emitting those as appropriate? (semi-automatic Pimpls)

 Did I miss anything?
I don't think it's an either-or situation. For a variety of reasons,
some organizations want separate control of the "declaration" and
"definition" files. Inability to do so is a common criticism leveled
against Java and one of the reasons for the proliferation of XML
configuration files and dynamic loading in that language.

Andrei
Jul 22 2011
next sibling parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Sat, 23 Jul 2011 05:52:12 +0300, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 I don't think it's an either-or situation. For a variety of reasons,  
 some organizations want separate control of the "declaration" and  
 "definition" files. Inability to do so is a common criticism leveled  
 against Java and one of the reasons for the proliferation of XML  
 configuration files and dynamic loading in that language.
Now I'm curious, what are those reasons? Can we improve .di generation
to accommodate everyone, even if we'd need to add attributes or pragmas
to the language or frontend? It just seems to me like this path kills
two birds with one stone, and is less work overall than doing both.

-- 
Best regards,
Vladimir mailto:vladimir thecybershadow.net
Jul 22 2011
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 7/23/11 12:19 AM, Vladimir Panteleev wrote:
 On Sat, 23 Jul 2011 05:52:12 +0300, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 I don't think it's an either-or situation. For a variety of reasons,
 some organizations want separate control of the "declaration" and
 "definition" files. Inability to do so is a common criticism leveled
 against Java and one of the reasons for the proliferation of XML
 configuration files and dynamic loading in that language.
 Now I'm curious, what are those reasons? Can we improve .di generation
 to accommodate everyone, even if we'd need to add attributes or pragmas
 to the language or frontend? It just seems to me like this path kills
 two birds with one stone, and is less work overall than doing both.
Improving .di generation is great. Large projects may have policies
that restrict changing interface files so as to not trigger
recompilation without necessity. Such policies are difficult to
accommodate with .di files that are generated automatically.

Andrei
Jul 23 2011
parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Sat, 23 Jul 2011 17:16:28 +0300, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 7/23/11 12:19 AM, Vladimir Panteleev wrote:
 On Sat, 23 Jul 2011 05:52:12 +0300, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 I don't think it's an either-or situation. For a variety of reasons,
 some organizations want separate control of the "declaration" and
 "definition" files. Inability to do so is a common criticism leveled
 against Java and one of the reasons for the proliferation of XML
 configuration files and dynamic loading in that language.
 Now I'm curious, what are those reasons? Can we improve .di generation
 to accommodate everyone, even if we'd need to add attributes or pragmas
 to the language or frontend? It just seems to me like this path kills
 two birds with one stone, and is less work overall than doing both.

 Improving .di generation is great. Large projects may have policies
 that restrict changing interface files so as to not trigger
 recompilation without necessity. Such policies are difficult to
 accommodate with .di files that are generated automatically.
So don't change a generated .di file's mtime if the contents are
identical to the existing version on disk.

-- 
Best regards,
Vladimir mailto:vladimir thecybershadow.net
Jul 23 2011
prev sibling next sibling parent bearophile <bearophileHUGS lycos.com> writes:
Andrei:

 I don't think it's an either-or situation.
OK, but first it's better to work as much as possible on the
automatic/compiler side of things, and see how far that goes. Turning
.di files into hand-written things should be left as a very last
resort.
 For a variety of reasons, 
 some organizations want separate control of the "declaration" and 
 "definition" files.
Example: to avoid giving full source code to clients. Even Python
(which today is far more used in large and medium organizations than D)
has received similar pressures. But each of those pressures has
disadvantages, sometimes for the community of programmers, so Python
has chosen to resist some of them.
 Inability to do so is a common criticism leveled 
 against Java and one of the reasons for the proliferation of XML 
 configuration files and dynamic loading in that language.
There is JSON too today :-) Dynamic loading is a source of many
troubles; it seems hard to make it safe.

Bye,
bearophile
Jul 23 2011
prev sibling parent Kagamin <spam here.lot> writes:
Andrei Alexandrescu Wrote:

 I don't think it's an either-or situation. For a variety of reasons, 
 some organizations want separate control of the "declaration" and 
 "definition" files. Inability to do so is a common criticism leveled 
 against Java and one of the reasons for the proliferation of XML 
 configuration files and dynamic loading in that language.
can perfectly go to separate files and assemblies.
Jul 25 2011
prev sibling next sibling parent reply Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 7/23/11, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
 Currently, in order for a program with separately-implemented methods to
 work properly, there must be TWO different files for the same class
Can we get a chat log of this discussion to figure out what is trying
to be solved here?

interface IFoo
{
    int foo();
}

class Foo : IFoo
{
    private int x;

    version(one)
    {
        int foo() { return x; }
    }
    else
    version(two)
    {
        int foo() { return x++; }
    }
}

$ dmd -c foo.d -version=one

or:

class Foo : IFoo
{
    mixin(import("FooImpl.d"));
}

$ dmd -c foo.d -J./my/foo/

There are probably other ways to do this within the existing framework.

My vote goes to Walter and other contributors working on fixing
existing bugs, so we can have not a perfect but a working language.
Jul 22 2011
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 7/23/11 12:34 AM, Andrej Mitrovic wrote:
 On 7/23/11, Andrei Alexandrescu<SeeWebsiteForEmail erdani.org>  wrote:
 Currently, in order for a program with separately-implemented methods to
 work properly, there must be TWO different files for the same class
 Can we get a chat log of this discussion to figure out what is trying
 to be solved here?
It was on the phone.
 interface IFoo
 {
      int foo();
 }

 class Foo : IFoo
 {
      private int x;

      version(one)
      {
          int foo() { return x; }
      }
      else
      version(two)
      {
          int foo() { return x++; }
      }
 }

 $ dmd -c foo.d -version=one

 or:

 class Foo : IFoo
 {
      mixin(import("FooImpl.d"));
 }

 $ dmd -c foo.d -J./my/foo/

 There are probably other ways to do this within the existing framework.
Imposing one interface for each class hierarchy is an option, but not an attractive one. It's essentially requiring boilerplate.
 My vote goes to Walter and other contributors to work on fixing
 existing bugs so we can have not a perfect but a working language.
There are not many large D projects at the time being, and everybody
who has worked on one has had problems. Large projects will come, and
the language must be prepared. This _is_ important. Of course, it can
happen only if a contributor owns this. We'd be hasty to dismiss the
issue though.

Thanks,

Andrei
Jul 23 2011
parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Sat, 23 Jul 2011 17:21:26 +0300, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 There are not many D large projects at the time being, and everybody who  
 worked on one has had problems. Large projects will come, and the  
 language must be prepared. This _is_ important. Of course, it can happen  
 only of a contributor owns this. We'd be hasty to dismiss the issue  
 though.
FORCING people who want to create large D projects to manually maintain
.di files seems like a horrible solution to me. One of the reasons I
ran from C++ is that I no longer had to maintain header files. Whatever
your reasons are for manually-maintained .di files, let's not consider
them a cure-all for all large project problems.

-- 
Best regards,
Vladimir mailto:vladimir thecybershadow.net
Jul 23 2011
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 7/23/11 11:26 AM, Vladimir Panteleev wrote:
 On Sat, 23 Jul 2011 17:21:26 +0300, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 There are not many D large projects at the time being, and everybody
 who worked on one has had problems. Large projects will come, and the
 language must be prepared. This _is_ important. Of course, it can
 happen only of a contributor owns this. We'd be hasty to dismiss the
 issue though.
 FORCING people who want to create large D projects to manually
 maintain .di files seems like a horrible solution for me.
Agreed.
 One of the reasons I
 ran from C++ is due to not having to maintain header files any longer.
 Whatever your reasons are for manually-maintained .di files, let's not
 consider them a cure-all for all large project problems.
Probably this is a misunderstanding. I'm giving a realistic option, not
adding a constraint.

Andrei
Jul 23 2011
parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Sat, 23 Jul 2011 20:43:13 +0300, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 One of the reasons I
 ran from C++ is due to not having to maintain header files any longer.
 Whatever your reasons are for manually-maintained .di files, let's not
 consider them a cure-all for all large project problems.
 Probably this is a misunderstanding. I'm giving a realistic option,
 not adding a constraint.
OK. It's just that from where I'm standing, it looks like you've
already decided on a course without seriously considering the benefits
of a more general solution, or comparing the amount of work necessary
for either path.

As you've seen, I have some strong personal opinions about
manually-maintained .di files. Performing the same change twice, many
times per day, is the kind of repetitive, mind-numbing work that makes
you hate the language, the project, your job, and your boss (for not
allowing use of a better language/solution). I've done it in C, C++,
and Delphi; I don't want to do it in D too.

-- 
Best regards,
Vladimir mailto:vladimir thecybershadow.net
Jul 23 2011
parent reply Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
Isn't the biggest issue of large D projects the problems with
incremental compilation (e.g.
https://bitbucket.org/h3r3tic/xfbuild/issue/7/make-incremental-building-reliable),
optlink, and the toolchain?

I'm not sure how adding another responsibility to the programmer will
help big projects.
Which existing D projects does this proposal actually apply to? E.g.
which ones will benefit from it?
Jul 23 2011
next sibling parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Sat, 23 Jul 2011 21:53:30 +0300, Andrej Mitrovic  
<andrej.mitrovich gmail.com> wrote:

 Isn't the biggest issue of large D projects the problems with
 incremental compilation (e.g.
 https://bitbucket.org/h3r3tic/xfbuild/issue/7/make-incremental-building-reliable),
 optlink, and the toolchain?
Yes. This is a difficult problem. Due to how DMD is designed,
incremental compilation is only reliable when compiling one module at a
time. Normally, this would require DMD to parse all imports
recursively, for every invocation (thus, for every module). .di files
should greatly speed up parsing (and thus, assuming that is the major
bottleneck, make one-module-at-a-time compilation faster).

-- 
Best regards,
Vladimir mailto:vladimir thecybershadow.net
Jul 23 2011
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 7/23/11 1:53 PM, Andrej Mitrovic wrote:
 Isn't the biggest issue of large D projects the problems with
 incremental compilation (e.g.
 https://bitbucket.org/h3r3tic/xfbuild/issue/7/make-incremental-building-reliable),
 optlink, and the toolchain?
The proposed improvement would mark a step forward in the toolchain and
generally in the development of large programs. In particular, it would
provide a simple means to decouple compilation of modules used
together. It's not easy for me to figure out how people don't get that
it's a net step forward from the current situation.

Andrei
Jul 23 2011
next sibling parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Sat, 23 Jul 2011 23:16:20 +0300, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 7/23/11 1:53 PM, Andrej Mitrovic wrote:
 Isn't the biggest issue of large D projects the problems with
 incremental compilation (e.g.
 https://bitbucket.org/h3r3tic/xfbuild/issue/7/make-incremental-building-reliable),
 optlink, and the toolchain?
 The proposed improvement would mark a step forward in the toolchain
 and generally in the development of large programs. In particular, it
 would provide a simple means to decouple compilation of modules used
 together. It's not easy for me to figure how people don't get it's a
 net step forward from the current situation.
Then you don't understand what I'm ranting about. It is certainly an
improvement, but:

1) We don't have an infinity of programmer-hours. I'm saying that the
time would likely be better spent improving .di generation, which
should have a much greater overall benefit per required work unit - and
for all I can tell, you don't even want to seriously consider this
option.

2) Once manually-maintained .di files are usable, they will be used as
an excuse to shoo away people working on large projects (people
complaining about compilation speed will be told to just manually write
.di files for their 100KLoC projects).

-- 
Best regards,
Vladimir mailto:vladimir thecybershadow.net
Jul 23 2011
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 7/23/11 4:01 PM, Vladimir Panteleev wrote:
 On Sat, 23 Jul 2011 23:16:20 +0300, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 7/23/11 1:53 PM, Andrej Mitrovic wrote:
 Isn't the biggest issue of large D projects the problems with
 incremental compilation (e.g.
 https://bitbucket.org/h3r3tic/xfbuild/issue/7/make-incremental-building-reliable),

 optlink, and the toolchain?
 The proposed improvement would mark a step forward in the toolchain
 and generally in the development of large programs. In particular, it
 would provide a simple means to decouple compilation of modules used
 together. It's not easy for me to figure how people don't get it's a
 net step forward from the current situation.
Then you don't understand what I'm ranting about.
That's a bit assuming. I thought about it for a little and concluded
that I'd do well to explain the current state of affairs a bit.
Consider:

// file a.di
class A
{
    int a;
    double b;
    string c;
    void fun();
}

Say the team working on A wants to "freeze" a.di without precluding
work on A.fun(). In a large project, changing a.di would trigger a lot
of recompilations, re-uploads, the need for retests, etc., so they'd
want to have control over that. So they freeze a.di and define a.d as
follows:

// file a.d
class A
{
    int a = 42;
    double b = 43;
    string c = "44";
    void fun() { assert(a == 42 && b == 43 && c == "44"); }
}

Now the team has achieved their goal: developers can work on A.fun
without inadvertently messing up a.di. Everybody is happy. The client
code would work like this:

// file main.d
import std.stdio;
import a;

void main()
{
    auto a = new A;
    a.fun();
    writeln(a.tupleof);
}

To build and run:

dmd -c a.d
dmd -c main.d a.o
./main

The program prints "424344" as expected.

The problem with this setup is that it's extremely fragile, in ways
that are undetectable during compilation or at run time. For example,
just swapping a and b in the implementation file makes the program
print "08.96566e-31344". Similar issues occur if fields or methods are
added to or removed from one file but not the other.

In an attempt to fix this, the developers may add an "import a" to a.d,
thinking that the compiler would import a.di and verify the bodies of
the two classes for correspondence. That doesn't work - the compiler
simply ignores the import. Things can be tenuously arranged such that
the .d file and the .di file have different names, but in that case the
compiler complains about duplicate definitions.

So the programmers conclude they need to define an interface for A (and
generally for each and every hierarchy or isolated class in the
project). But the same problem occurs for structs, and there's no way
to define interfaces for structs.

Ultimately the programmers figure there's no way to keep the files
separate without establishing a build mechanism that e.g. generates
a.di from a.d, compares it against the existing a.di, and complains if
the two aren't identical. Upon such a build failure, a senior engineer
would figure out what action to take.

But wait, there's less. The programmers don't have the option of
grouping method implementations in a hierarchy by functionality (which
is common in visitation patterns - even dmd does so). They must define
one class with everything in one place, and there's no way out of that.

My understanding is that the scenarios above are of no value to you,
and that if the language accepted them you'd consider that a
degradation of the status quo. Given that the status quo includes a
fair amount of impossible-to-detect failures and tenuous mechanisms, I
disagree.

Let me also play a card I wish I hadn't - I've worked on numerous large
projects and I can tell from direct experience that the inherent
problems are... well, odd. Engineers embarked on such projects need all
the help they can get and would be willing to explore options that seem
ridiculous for projects one fraction the size. Improved .di generation
would be of great help. Enabling other options would be even better.
 It is certainly an
 improvement, but:

 1) We don't have an infinity of programmer-hours. I'm saying that the
 time would likely be better spent at improving .di generation, which
 should have a much greater overall benefit per required work unit - and
 for all I can tell, you don't even want to seriously consider this option.
Generation of .di files does not compete with the proposed feature.
 2) Once manually-maintained .di files are usable, they will be used as
 an excuse to shoo away people working on large projects (people
 complaining about compilation speed will be told to just manually write
 .di files for their 100KLoC projects).
Your ability to predict the future is much better than mine.

Andrei
Jul 23 2011
next sibling parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Sun, 24 Jul 2011 00:54:57 +0300, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 7/23/11 4:01 PM, Vladimir Panteleev wrote:
 On Sat, 23 Jul 2011 23:16:20 +0300, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 7/23/11 1:53 PM, Andrej Mitrovic wrote:
 Isn't the biggest issue of large D projects the problems with
 incremental compilation (e.g.
 https://bitbucket.org/h3r3tic/xfbuild/issue/7/make-incremental-building-reliable),

 optlink, and the toolchain?
 The proposed improvement would mark a step forward in the toolchain
 and generally in the development of large programs. In particular, it
 would provide a simple means to decouple compilation of modules used
 together. It's not easy for me to figure how people don't get it's a
 net step forward from the current situation.
Then you don't understand what I'm ranting about.
That's a bit assuming.
OK, that implies that you weren't talking about me above.
 I thought about it for a little and concluded that I'd do good to  
 explain the current state of affairs a bit.

 [snip]
There was no need to go into such great detail; I think the basics are
well understood. I was asking for arguments and counter-arguments.
 Ultimately the programmers figure there's no way to keep files separate  
 without establishing a build mechanism that e.g. generates a.di from  
 a.d, compares it against the existing a.di, and complains if the two  
 aren't identical. Upon such a build failure, a senior engineer would  
 figure out what action to take.
I was going to suggest something like this, but without creating a
dependency on a 3rd-party build tool. I mentioned in another thread how
DMD shouldn't touch the .di file's mtime. A better idea is to not
attempt to overwrite the file at all if the generated .di is identical
to the old one. With this behavior, you can simply make .di files
read-only - the compiler will bail out when it tries to save a new
version of the .di file.

Yes, this is a hack, but it's not the only solution. Aside from writing
a build tool, as you mentioned, I believe many organizations couple
automatic tests with source control, which could easily detect
check-ins that change .di files.

Yes, neither of the above is a "proper" solution. But, unless I've lost
track of something, you're trying to justify a solid amount of work on
the compiler to implement the "proper" solution, when the above
alternatives are much simpler in practice. (If you have more
counter-arguments, I'd like to hear them.)
 But wait, there's less. The programmers don't have the option of  
 grouping method implementations in a hierarchy by functionality (which  
 is common in visitation patterns - even dmd does so). They must define  
 one class with everything in one place, and there's no way out of that.
Sorry, I don't understand this part. Could you elaborate?
 My understanding is that the scenarios above are of no value to you, and  
 if the language would accept them you'd consider that a degradation of  
 the status quo.
I'm not trying to argue for my personal opinion and the way I use D. I was trying to point out that your suggestion seems less efficient in terms of benefit per work-unit for all users of D.
 Given that the status quo includes a fair amount of impossible to detect  
 failures and tenuous mechanisms, I disagree. Let me also play a card I  
 wish I hadn't - I've worked on numerous large projects and I can tell  
 from direct experience that the inherent problems are... well, odd.  
 Engineers embarked on such projects need all the help they could get and  
 would be willing to explore options that seem ridiculous for projects  
 one fraction the size. Improved .di generation would be of great help.  
 Enabling other options would be even better.
[snip]
 1) We don't have an infinity of programmer-hours. I'm saying that the
 time would likely be better spent at improving .di generation, which
 should have a much greater overall benefit per required work unit - and
 for all I can tell, you don't even want to seriously consider this  
 option.
Generation of .di files does not compete with the proposed feature.
Again, I'm not saying that this is a bad direction, just not the best
one.

[off-topic]
 2) Once manually-maintained .di files are usable, they will be used as
 an excuse to shoo away people working on large projects (people
 complaining about compilation speed will be told to just manually write
 .di files for their 100KLoC projects).
Your ability to predict future is much better than mine.
I didn't say who'll say that... It might not be you or Walter, but can
you account for all users of D on IRC, Reddit, StackOverflow, etc.?
Good enough is the enemy of better, etc.

[/off-topic]

P.S. I appreciate you taking the time for this discussion.

-- 
Best regards,
Vladimir mailto:vladimir thecybershadow.net
Jul 23 2011
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 7/23/11 5:34 PM, Vladimir Panteleev wrote:
 I was going to suggest something like this, but without creating the
 dependency on a 3rd-party build tool. I mentioned in another thread how
 DMD shouldn't touch the .di file's mtime. A better idea is to not
 attempt to overwrite the file at all if the generated .di is identical
 to the old one.
That's a must at any rate, and should be filed as an enhancement request. The compiler should generate filename.di.tmp, compare it against filename.di (if any), and then either remove the .tmp if identical or rename it forcefully to filename.di if different. That's a classic in code generation tools.
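A minimal sketch of that compare-then-rename step, assuming the generated interface text is already in hand (emitDi and its parameters are illustrative names, not dmd's actual code):

```d
import std.file : exists, read, rename, write;

void emitDi(string diPath, string generated)
{
    // Leave the existing file (and its mtime) untouched when the
    // freshly generated text is byte-for-byte identical.
    if (exists(diPath) && cast(string) read(diPath) == generated)
        return;
    // Otherwise write to a temporary and rename forcefully over the
    // old file, so readers never observe a half-written .di.
    auto tmp = diPath ~ ".tmp";
    write(tmp, generated);
    rename(tmp, diPath);
}
```

With this in place, making .di files read-only (as suggested earlier in the thread) turns an unintended interface change into a hard build failure at the rename step.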
 Yes, this is a hack, but it's not the only solution. Aside writing a
 build tool, as you mentioned, I believe many organizations include
 automatic tests coupled with source control, which could easily detect
 check-ins that change .di files.

 Yes, neither of the above are "proper" solutions. But, unless I've lost
 track of something, you're trying to justify a solid amount of work on
 the compiler to implement the "proper" solution, when the above
 alternatives are much simpler in practice. (If you have more
 counter-arguments, I'd like to hear them.)
I don't consider these improper at all. As I said, people are willing to do crazy things to keep large projects sane. The larger question here is how to solve a failure scenario, i.e. what can we offer the senior engineer who fixes the build when the .di files are no longer in sync.
 But wait, there's less. The programmers don't have the option of
 grouping method implementations in a hierarchy by functionality (which
 is common in visitation patterns - even dmd does so). They must define
 one class with everything in one place, and there's no way out of that.
Sorry, I don't understand this part. Could you elaborate?
A class hierarchy defines foo() and bar(). We want to put all foo() implementations together and all bar() implementations together. Andrei
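For reference, C++ already permits the organization Andrei describes via out-of-class member definitions (which is what the D proposal would enable); a minimal sketch, with a hierarchy and return strings that are purely illustrative:

```cpp
#include <string>

// A small hierarchy declaring foo() and bar().
struct Base {
    virtual ~Base() = default;
    virtual std::string foo() const;
    virtual std::string bar() const;
};
struct Derived : Base {
    std::string foo() const override;
    std::string bar() const override;
};

// --- all foo() implementations grouped together by functionality ---
std::string Base::foo() const    { return "Base.foo"; }
std::string Derived::foo() const { return "Derived.foo"; }

// --- all bar() implementations grouped together by functionality ---
std::string Base::bar() const    { return "Base.bar"; }
std::string Derived::bar() const { return "Derived.bar"; }
```

With in-class-only definitions (D's status quo), the implementations can only be grouped class-by-class, never operation-by-operation.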
Jul 23 2011
next sibling parent reply so <so so.so> writes:
On Sun, 24 Jul 2011 01:47:32 +0300, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 and then either remove the .tmp if identical or rename it forcefully to  
 filename.di if different. That's a classic in code generation tools.
Not sure I got it right, but how would renaming it forcefully solve this? This must be two separate processes for the compiler: issuing an error, and, if the change was intended, generating the .di file.
Jul 23 2011
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 7/23/11 6:02 PM, so wrote:
 On Sun, 24 Jul 2011 01:47:32 +0300, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 and then either remove the .tmp if identical or rename it forcefully
 to filename.di if different. That's a classic in code generation tools.
Not sure i got it right but how renaming it forcefully would solve this? This must be two separate process for the compiler, issuing an error and if it was intended then the .di file must be generated.
There'd be an error if the file were read-only. Andrei
Jul 23 2011
prev sibling parent reply Daniel Gibson <metalcaedes gmail.com> writes:
Am 24.07.2011 01:02, schrieb so:
 On Sun, 24 Jul 2011 01:47:32 +0300, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 and then either remove the .tmp if identical or rename it forcefully
 to filename.di if different. That's a classic in code generation tools.
Not sure i got it right but how renaming it forcefully would solve this? This must be two separate process for the compiler, issuing an error and if it was intended then the .di file must be generated.
Because if it's not done forcefully, there may be an error since filename.di already exists.
Jul 23 2011
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 7/23/11 6:10 PM, Daniel Gibson wrote:
 Am 24.07.2011 01:02, schrieb so:
 On Sun, 24 Jul 2011 01:47:32 +0300, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 and then either remove the .tmp if identical or rename it forcefully
 to filename.di if different. That's a classic in code generation tools.
Not sure i got it right but how renaming it forcefully would solve this? This must be two separate process for the compiler, issuing an error and if it was intended then the .di file must be generated.
Because if not done forcefully there may be an error because filename.di already exists.
Exactly right. Andrei
Jul 23 2011
prev sibling parent so <so so.so> writes:
On Sun, 24 Jul 2011 02:10:47 +0300, Daniel Gibson <metalcaedes gmail.com>  
wrote:

 Am 24.07.2011 01:02, schrieb so:
 On Sun, 24 Jul 2011 01:47:32 +0300, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 and then either remove the .tmp if identical or rename it forcefully
 to filename.di if different. That's a classic in code generation tools.
Not sure i got it right but how renaming it forcefully would solve this? This must be two separate process for the compiler, issuing an error and if it was intended then the .di file must be generated.
Because if not done forcefully there may be an error because filename.di already exists.
What I meant is: you have file.di and file.d, with file.di being your reference. You get a conflict. This conflict is most likely the result of something unintended, because file.di is your reference and you rarely change things in there. In that case, IMO, the compiler should just issue an error and let the user deal with it by explicitly regenerating file.di from file.d; when generation is requested explicitly, the compiler shouldn't care that the file was already there. (Again, it is free to compare the two results and keep the old one if they are the same.) Doing this forcefully is also a solution, but it had better be optional (a compiler flag), not the default.
Jul 23 2011
prev sibling parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Sun, 24 Jul 2011 01:47:32 +0300, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Yes, neither of the above are "proper" solutions. But, unless I've lost
 track of something, you're trying to justify a solid amount of work on
 the compiler to implement the "proper" solution, when the above
 alternatives are much simpler in practice. (If you have more
 counter-arguments, I'd like to hear them.)
I don't consider these improper at all. As I said, people are willing to do crazy things to keep large projects sane. The larger question here is how to solve a failure scenario, i.e. what can we offer the senior engineer who fixes the build when the .di files are no longer in sync.
OK. I'm not going to ask for your estimates of how probable this is to happen in practice, and whether it justifies the implementation effort of going in this direction, as I think I've fully elaborated my point already.

One thing that I should have mentioned before is one of my reasons for this argument: the design for implementing verification of manually-maintained .di files has been discussed and published by D's creators, so now someone just needs to go and implement it. However, there is no discussion or consensus about improving .di-file generation. Even choosing the name of whatever attribute gets DMD to copy function bodies to .di files would be a start.
 But wait, there's less. The programmers don't have the option of
 grouping method implementations in a hierarchy by functionality (which
 is common in visitation patterns - even dmd does so). They must define
 one class with everything in one place, and there's no way out of that.
Sorry, I don't understand this part. Could you elaborate?
A class hierarchy defines foo() and bar(). We want to put all foo() implementations together and all bar() implementations together.
Ah, so this is a new possibility created by the "method definitions outside a class" requirement. But how would this play with the module system? Would you be allowed to place method definitions in modules other than the class declaration? -- Best regards, Vladimir mailto:vladimir thecybershadow.net
Jul 23 2011
prev sibling next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Andrei:

I am not expert on this, but it doesn't look like esoteric stuff.

 Consider:
Thank you for explaining better, with more examples. This usually helps the discussion.
 Say the team working on A wants to "freeze" a.di without precluding work 
 on A.fun().
Currently in D there is no explicit & enforced way to state this desire to the compiler?
 The problem with this setup is that it's extremely fragile, in ways that 
 are undetectable during compilation or runtime.
Is it possible to invent ways to make this less fragile?
 For example, just 
 swapping a and b in the implementation file makes the program print
 "08.96566e-31344". Similar issues occur if fields or methods are added 
 or removed from one file but not the other.
I have suggested some fine-grained hashing. Compute a hash from a class definition, and later quickly compare this value with a value stored elsewhere (like automatically written in the .di file).
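A minimal sketch of bearophile's hashing idea. The function names, the choice of std::hash, and the idea of hashing the raw declaration text are all assumptions for illustration, not a concrete design (a real implementation would presumably hash the layout-relevant parts of the semantic class definition):

```cpp
#include <string>
#include <functional>

// Hypothetical sketch: hash the class declaration as recorded in the .di
// file and as found in the .d implementation, and flag a mismatch when the
// two files have drifted apart (e.g. fields swapped, added, or removed).
std::size_t layoutHash(const std::string& classDecl) {
    return std::hash<std::string>{}(classDecl);
}

bool layoutsMatch(const std::string& declInDi, const std::string& declInD) {
    return layoutHash(declInDi) == layoutHash(declInD);
}
```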
 Ultimately the programmers figure there's no way to keep files separate 
 without establishing a build mechanism that e.g. generates a.di from 
 a.d, compares it against the existing a.di, and complains if the two 
 aren't identical.
Comparing .di files looks tricky. DMD generates them deterministically, so in theory it works, but it doesn't sound like a good thing to do.
 They must define 
 one class with everything in one place, and there's no way out of that.
C# offers a way out with partial classes: http://msdn.microsoft.com/en-us/library/wa80x488%28v=vs.80%29.aspx Bye, bearophile
Jul 23 2011
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 7/23/11 5:39 PM, bearophile wrote:
 Andrei:

 I am not expert on this, but it doesn't look like esoteric stuff.

 Consider:
Thank you for explaining better, with more examples. This usually helps the discussion.
 Say the team working on A wants to "freeze" a.di without precluding
 work on A.fun().
Currently in D there is no explicit& enforced way to state this desire to the compiler?
It is as I described in the example. It works today. Fragility is the problem there.
 The problem with this setup is that it's extremely fragile, in ways
 that are undetectable during compilation or runtime.
Is it possible to invent ways to make this less fragile?
 For example, just swapping a and b in the implementation file makes
 the program print "08.96566e-31344". Similar issues occur if fields
 or methods are added or removed from one file but not the other.
I have suggested some fine-grained hashing. Compute a hash from a class definition, and later quickly compare this value with a value stored elsewhere (like automatically written in the .di file).
I discussed four options with Walter, and this was one of them. It has issues. The proposal as in this thread is the simplest and most effective I could find.
 Ultimately the programmers figure there's no way to keep files
 separate without establishing a build mechanism that e.g. generates
 a.di from a.d, compares it against the existing a.di, and complains
 if the two aren't identical.
Comparing .di files looks tricky. DMD generates them deterministically, so in theory it works, but it doesn't sound like a good thing to do.
It's commonplace with code generation tools.
 They must define one class with everything in one place, and
 there's no way out of that.
files. http://msdn.microsoft.com/en-us/library/wa80x488%28v=vs.80%29.aspx
used like that. Andrei
Jul 23 2011
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 7/23/2011 3:50 PM, Andrei Alexandrescu wrote:
 On 7/23/11 5:39 PM, bearophile wrote:
 I have suggested some fine-grained hashing. Compute a hash from a
 class definition, and later quickly compare this value with a value
 stored elsewhere (like automatically written in the .di file).
I discussed four options with Walter, and this was one of them. It has issues. The proposal as in this thread is the simplest and most effective I could find.
The only way the linker can detect mismatches is by embedding the hash into the name, i.e. more name mangling. This has serious issues:

1. The hashing cannot be reversed. Hence, the user will be faced with really, really ugly error messages from the linker that will make today's mangled names look like a marvel of clarity. Consider all the users today, who have a really hard time with things like:

undefined symbol: _foo

from the linker. Now imagine it's:

undefined symbol: _foo12345WQERTYHBVCFDERTYHGFRTYHGFTYUHGTYUHGTYUJHGTYU

They'll run screaming, and I would, too.

2. This hash will get added to all struct/class names, so there will be an explosion in the length of names the linker sees. This can make tools that deal with symbolic names in the executable (like debuggers, disassemblers, profilers, etc.) much more messy to deal with.

3. Hashes aren't perfect, they can have collisions, unless you want to go with really long ones like MD5.
Jul 23 2011
next sibling parent "Roald Ribe" <rr pogostick.net> writes:
On Sat, 23 Jul 2011 21:14:27 -0300, Walter Bright  
<newshound2 digitalmars.com> wrote:

 On 7/23/2011 3:50 PM, Andrei Alexandrescu wrote:
 On 7/23/11 5:39 PM, bearophile wrote:
 I have suggested some fine-grained hashing. Compute a hash from a
 class definition, and later quickly compare this value with a value
 stored elsewhere (like automatically written in the .di file).
I discussed four options with Walter, and this was one of them. It has issues. The proposal as in this thread is the simplest and most effective I could find.
The only way the linker can detect mismatches is by embedding the hash into the name, i.e. more name mangling. This has serious issues:

1. The hashing cannot be reversed. Hence, the user will be faced with really, really ugly error messages from the linker that will make today's mangled names look like a marvel of clarity. Consider all the users today, who have a really hard time with things like:

undefined symbol: _foo

from the linker. Now imagine it's:

undefined symbol: _foo12345WQERTYHBVCFDERTYHGFRTYHGFTYUHGTYUHGTYUJHGTYU

They'll run screaming, and I would, too.
A simplistic suggestion: this could be made better by specifying a hash-introduction character, known by or specifiable in all tools. That would give _foo^12345WQERTYHBVCFDERTYHGFRTYHGFTYUHGTYUHGTYUJHGTYU in tools not yet aware of the hash-introduction character, and just _foo in tools that have been adapted to take advantage of it. Both cases are easier to read IMHO, and the system enables easy implementation of the second case in various tools.
 2. This hash will get added to all struct/class names, so there will be  
 an explosion in the length of names the linker sees. This can make tools  
 that deal with symbolic names in the executable (like debuggers,  
 disassemblers, profilers, etc.) much more messy to deal with.
 3. Hashes aren't perfect, they can have collisions, unless you want to  
 go with really long ones like MD5.
The system above would make the length of the hash almost irrelevant, because it would simplify adapting tools to not display the symbol's hash, while also making the symbol easier to read in old tools not yet adapted. I do not know if other compiled languages have the same problem, but if they do, such a convention might be nice for them as well. Roald
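Roald's separator scheme could be sketched like this; the '^' separator character and the function name are his/hypothetical conventions, not anything an existing toolchain implements:

```cpp
#include <string>

// Hypothetical: a tool that knows the hash-introduction character ('^')
// displays only the part of the symbol before it; an older tool shows the
// full mangled name, hash and all.
std::string displayName(const std::string& symbol, bool toolKnowsSeparator) {
    std::string::size_type pos = symbol.find('^');
    if (toolKnowsSeparator && pos != std::string::npos)
        return symbol.substr(0, pos); // strip the appended hash
    return symbol;                    // legacy behavior: show everything
}
```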
Jul 24 2011
prev sibling parent reply Kagamin <spam here.lot> writes:
Walter Bright Wrote:

 The only way the linker can detect mismatches is by embedding the hash into
the 
 name, i.e. more name mangling.
It's not the only way. You can keep the current mangling scheme and store hashes in another place. If the linker doesn't support hashes, everything will just work; if it does, it can provide extra safety for this particular scenario.
Jul 25 2011
parent Kagamin <spam here.lot> writes:
Kagamin Wrote:

 Walter Bright Wrote:
 
 The only way the linker can detect mismatches is by embedding the hash into
the 
 name, i.e. more name mangling.
It's not the only way. You can keep current mangling scheme and store hashes in another place. If the linker doesn't support hashes, everything will just work, if it does, it can provide extra safety for this particular scenario.
Anyway, is it a good idea to force the client to depend on the fields of a class? Fields are implementation details and should not affect client code. This means we end up with the plain old interfaces we already have up and working.
Jul 25 2011
prev sibling parent "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Sun, 24 Jul 2011 01:39:01 +0300, bearophile <bearophileHUGS lycos.com>  
wrote:

 Andrei:

 I am not expert on this, but it doesn't look like esoteric stuff.

 Consider:
Thank you for explaining better, with more examples. This usually helps the discussion.
I retract my comment about this :)
 The problem with this setup is that it's extremely fragile, in ways that
 are undetectable during compilation or runtime.
Is it possible to invent ways to make this less fragile?
This is what Andrei's proposal attempts to solve. -- Best regards, Vladimir mailto:vladimir thecybershadow.net
Jul 23 2011
prev sibling next sibling parent so <so so.so> writes:
On Sun, 24 Jul 2011 00:54:57 +0300, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 The program prints "424344" as expected.

 The problem with this setup is that it's extremely fragile, in ways that  
 are undetectable during compilation or runtime. For example, just  
 swapping a and b in the implementation file makes the program print
 "08.96566e-31344". Similar issues occur if fields or methods are added  
 or removed from one file but not the other.
I am not a fan of .di files, but there is no reason I can think of that the compiler should allow such a thing. If there is a .di file for the module, compare it to the implementation. If the implementation differs, it is an error. If the change is intended, then the .di file must be regenerated explicitly.
 So the programmers conclude they need to define an interface for A (and  
 generally each and every hierarchy or isolated class in the project).  
 But the same problem occurs for struct, and there's no way to define  
 interfaces for structs.
I don't understand why people keep bringing this up as a solution; this is D.
 Ultimately the programmers figure there's no way to keep files separate  
 without establishing a build mechanism that e.g. generates a.di from  
 a.d, compares it against the existing a.di, and complains if the two  
 aren't identical. Upon such a build failure, a senior engineer would  
 figure out what action to take.

 But wait, there's less. The programmers don't have the option of  
 grouping method implementations in a hierarchy by functionality (which  
 is common in visitation patterns - even dmd does so). They must define  
 one class with everything in one place, and there's no way out of that.

 My understanding is that the scenarios above are of no value to you, and  
 if the language would accept them you'd consider that a degradation of  
 the status quo. Given that the status quo includes a fair amount of  
 impossible to detect failures and tenuous mechanisms, I disagree. Let me  
 also play a card I wish I hadn't - I've worked on numerous large  
 projects and I can tell from direct experience that the inherent  
 problems are... well, odd. Engineers embarked on such projects need all  
 the help they could get and would be willing to explore options that  
 seem ridiculous for projects one fraction the size. Improved .di  
 generation would be of great help. Enabling other options would be even  
 better.
Interface design, separate compilation, and library development are among the many things C++ didn't make better than C but instead took to a new low. And I don't get why no one else here considers these big issues and everyone just suggests the C++ ways; the last two times I brought up something related to this, there was only one response. I want to believe it is because of me failing to express myself :)
Jul 23 2011
prev sibling parent reply Mike Wey <mike-wey example.com> writes:
On 07/23/2011 11:54 PM, Andrei Alexandrescu wrote:
 The problem with this setup is that it's extremely fragile, in ways that
 are undetectable during compilation or runtime. For example, just
 swapping a and b in the implementation file makes the program print
 "08.96566e-31344". Similar issues occur if fields or methods are added
 or removed from one file but not the other.

 In an attempt to fix this, the developers may add an "import a" to a.d,
 thinking that the compiler would import a.di and would verify the bodies
 of the two classes for correspondence. That doesn't work - the compiler
 simply ignores the import. Things can be tenuously arranged such that
 the .d file and the .di file have different names, but in that case the
 compiler complains about duplicate definitions.
If the .di files are this fragile, the compiler should just always check that the .d file matches the .di file (if one is present); then there is no need for the extra import. -- Mike Wey
Jul 24 2011
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 7/24/11 4:31 AM, Mike Wey wrote:
 On 07/23/2011 11:54 PM, Andrei Alexandrescu wrote:
 The problem with this setup is that it's extremely fragile, in ways that
 are undetectable during compilation or runtime. For example, just
 swapping a and b in the implementation file makes the program print
 "08.96566e-31344". Similar issues occur if fields or methods are added
 or removed from one file but not the other.

 In an attempt to fix this, the developers may add an "import a" to a.d,
 thinking that the compiler would import a.di and would verify the bodies
 of the two classes for correspondence. That doesn't work - the compiler
 simply ignores the import. Things can be tenuously arranged such that
 the .d file and the .di file have different names, but in that case the
 compiler complains about duplicate definitions.
If the .di files are this fragile, the compiler should just always check if the .d file matches (if present) .di file, then there is no need for the extra import.
Trouble is, they sometimes may be in different dirs. Andrei
Jul 24 2011
prev sibling parent reply Don <nospam nospam.com> writes:
Andrei Alexandrescu wrote:
 On 7/23/11 1:53 PM, Andrej Mitrovic wrote:
 Isn't the biggest issue of large D projects the problems with
 incremental compilation (e.g.
 https://bitbucket.org/h3r3tic/xfbuild/issue/7/make-incremental
building-reliable), 

 optlink, and the toolchain?
The proposed improvement would mark a step forward in the toolchain and generally in the development of large programs. In particular, it would provide a simple means to decouple compilation of modules used together. It's not easy for me to figure how people don't get it's a net step forward from the current situation. Andrei
Personally I fear that it may be too much cost for too little benefit.

The role of .di files for information hiding is clear. But it's not _at all_ obvious to me that .di files will provide a significant improvement in compilation speed. Do we actually have profiling data that shows that parsing is the bottleneck? It's also not clear to me that it's a "simple means to decouple compilation of modules" -- it seems complicated to me.

As far as performance goes, it would seem much simpler to just cache the symbol tables (similar to precompiled headers in C++, but the idea works better in D because D has an actual module system). That would give faster compilation times than .di files, because you'd also be caching CTFE results.
Jul 23 2011
parent bearophile <bearophileHUGS lycos.com> writes:
Don:

 As far as performance goes, it would seem much simpler to just
Have Walter and Andrei seen this post by Don? Bye, bearophile
Jul 24 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Saturday 23 July 2011 07:34:43 Andrej Mitrovic wrote:
 On 7/23/11, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
 Currently, in order for a program with separately-implemented methods to
 work properly, there must be TWO different files for the same class
Can we get a chat log of this discussion to figure out what is trying to be solved here?

interface IFoo
{
    int foo();
}

class Foo : IFoo
{
    private int x;

    version(one)
    {
        int foo() { return x; }
    }
    else version(two)
    {
        int foo() { return x++; }
    }
}

$ dmd -c foo.d -version=one

or:

class Foo : IFoo
{
    mixin(import("FooImpl.d"));
}

$ dmd -c foo.d -J./my/foo/

There are probably other ways to do this within the existing framework. My vote goes to Walter and other contributors to work on fixing existing bugs so we can have not a perfect but a working language.
One of the key issues at hand is the fact that if you have a .di file and no actual implementation for a particular function, then you can't use that function in CTFE. - Jonathan M Davis
Jul 23 2011
prev sibling next sibling parent Robert Clipsham <robert octarineparrot.com> writes:
On 22/07/2011 23:06, Andrei Alexandrescu wrote:
 I see this as a source of problems going forward, and I propose the
 following changes to the language:

 1. The compiler shall accept method definitions outside a class.

 2. A method cannot be implemented unless it was declared with the same
 signature in the class definition.
So what you're proposing is akin to this in C++?

class A
{
    void foo();
};

void A::foo()
{
}

The equivalent D being s/::/./? My question is then, do we allow something like:

// a.d
class A
{
    void foo();
}

// b.d
import a;

void A.foo() {}

Which causes the module system to break down somewhat, or do we require the method to be implemented in the same module (.di or .d)? -- Robert http://octarineparrot.com/
Jul 24 2011
prev sibling next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 22 Jul 2011 18:06:19 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 A chat in IRC revealed a couple of possible improvements to the  
 development scenario in which the interface file (.di) is managed  
 manually and separately from the implementation file (.d).
I have mixed feelings on this proposal.

On one hand, one of the best parts of D over C++ is its module system. No more #ifdef __thisfile_included crap, and implementations sit right where they are defined. My concern with such an improvement is that it would encourage people (especially C++ coders) to unnecessarily split their implementation from the definition. I think there have been at least 2 or 3 people asking how to do this in D. I really like that implementation and definition are all in one file, and cannot be out of sync (a common problem with C++). Already .di files allow these problems to creep back in, but I understand it's a necessary evil.

On the other hand, the duplication of definitions and functions for inlining when you do want to use .di files is a real problem, and this would seem to address it. And as you point out, it doesn't eliminate the existing solution. Plus, D does not suffer from multiple-personality syndrome like C++ can -- for example, including a file more than once with different macro definitions to alter the impact of the include file.

As a criticism though, I don't really like this:

module std.foo;
import std.foo; // huh?

I know we don't want to add too much to the syntax, but can we make that line a bit more descriptive, and also less redundant? Suggestion:

module std.foo;
import interface; // imports the .di file that corresponds to std.foo.

I know that's keyword abuse, but it's kind of accurate ;)

-Steve
Jul 25 2011
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 7/25/11 11:19 AM, Steven Schveighoffer wrote:
 On Fri, 22 Jul 2011 18:06:19 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 A chat in IRC revealed a couple of possible improvements to the
 development scenario in which the interface file (.di) is managed
 manually and separately from the implementation file (.d).
I have mixed feelings on this proposal. On one hand, one of the best parts of D over C++ is it's module system. No more #ifdef __thisfile_included crap, and implementations just sit right where they are defined. My concern with such an improvement is that it would encourage people (especially C++ coders) to unnecessarily split their implementation from the definition. I think there have been at least 2 or 3 people ask how to do this in D. I really like that implementation and definition are all in one file, and cannot be out of sync (a common problem with C++). Already di files allow for these problems to creep back in, but I understand it's a necessary evil.
Once we accept that, we should also acknowledge its big issues and fix them. The current .di model is marred by numerous problems. Andrei
Jul 25 2011
prev sibling next sibling parent reply "Andrej Mitrovic" <andrej.mitrovich gmail.com> writes:
On Friday, 22 July 2011 at 22:06:20 UTC, Andrei Alexandrescu 
wrote:
 1. The compiler shall accept method definitions outside a class.
Sorry to bump this old thread, but after having spent some time with D I like the idea of having the ability to implement class methods outside of the class, but I would like to have this ability in the same file as the module:

module test;

class C
{
    void foo1();
    void bar1()
    {
        // N1 -- two indents
    }

    class Inner
    {
        void foo2();
        void bar2()
        {
            // N2 -- three indents
        }
    }
}

// if allowed:
void C.foo1()
{
    // 1 indent
}

void C.Inner.foo2()
{
    // 1 indent
}

The benefit? Avoiding the indentation tax. I've come to find it really annoying that hundreds of lines of code end up being indented twice or more. Changing the indentation setting is not a real solution; I like my indentation size the way it is, I just don't like being forced to indent so much because of the requirement to implement everything inline.
Mar 02 2013
parent reply Manu <turkeyman gmail.com> writes:
+1_000_000_000

Yes please!
It's near impossible to get a brief overview of a class at a glance in D!


On 3 March 2013 15:35, Andrej Mitrovic <andrej.mitrovich gmail.com> wrote:

 On Friday, 22 July 2011 at 22:06:20 UTC, Andrei Alexandrescu wrote:

 1. The compiler shall accept method definitions outside a class.
Sorry to bump this old thread, but after having spent some time with D I like the idea of having the ability to implement class methods outside of the class, but I would like to have this ability in the same file as the module: module test; class C { void foo1(); void bar1() { // N1 -- two indents } class Inner { void foo2(); void bar2() { // N2 -- three indents } } } // if allowed: void C.foo() { // 1 indent } void C.Inner.foo2() { // 1 indent } The benefit? Avoiding the indentation tax. I've come to find it really annoying that hundreds of lines of code end up being indented twice or more times. Changing the indentation setting is not a real solution, I like my indentation size the way it is, I just don't like being forced to indent too much because of the requirement to implement everything inline.
Mar 03 2013
next sibling parent reply "Rob T" <alanb ucora.com> writes:
On Monday, 4 March 2013 at 06:18:35 UTC, Manu wrote:
 +1_000_000_000

 Yes please!
 It's near impossible to get a brief overview of a class at a 
 glance in D!
I agree it's very handy to get an overview of a class or struct or module interface, however I think that the IDE should deal with this problem rather than using a duplication of source code as a solution. The relatively primitive IDE that I use allows me to at least perform function folding, which in some ways helps solve the problem, albeit in a poor way, but it hints that better tools can solve the problem rather nicely without a duplication of code. --rt
Mar 04 2013
next sibling parent reply Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 3/5/13, Rob T <alanb ucora.com> wrote:
 The relatively primitive IDE that I use allows me to at least
 perform function folding, which in some ways helps solve the
 problem, albeit in a poor way, but it hints that better tools can
 solve the problem rather nicely without a duplication of code.
Code folding is available in almost every text editor, the problem (for me) is the indenting.
Mar 04 2013
parent reply "Rob T" <alanb ucora.com> writes:
On Tuesday, 5 March 2013 at 00:17:03 UTC, Andrej Mitrovic wrote:
 On 3/5/13, Rob T <alanb ucora.com> wrote:
 The relatively primitive IDE that I use allows me to at least
 perform function folding, which in some ways helps solve the
 problem, albeit in a poor way, but it hints that better tools 
 can
 solve the problem rather nicely without a duplication of code.
Code folding is available in almost every text editor, the problem (for me) is the indenting.
In that case it would be nice to avoid most of the class name duplication; that's one of the things I really disliked about C++.

class C
{
    void foo1();
    void bar1()
    {
        // N1 -- two indents
    }
}

// This avoids all but one duplication of the parent class name.
class.C.inner
{
    void foo()
    {
        // 1 indent
    }

    void foo2()
    {
        // 1 indent
    }
}

--rt
Mar 04 2013
parent reply Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
 3/5/13, Rob T <alanb ucora.com> wrote:
 In that case it would be nice to avoid most of the class name
 duplication, that's one of the things I really disliked about C++.
I guess you meant:

class A
{
    class B;
}

class A.B
{
    // implementation
}

Ideally both styles would be supported.
Mar 04 2013
parent reply "Rob T" <alanb ucora.com> writes:
On Tuesday, 5 March 2013 at 00:41:23 UTC, Andrej Mitrovic wrote:
  3/5/13, Rob T <alanb ucora.com> wrote:
 In that case it would be nice to avoid most of the class name
 duplication, that's one of the things I really disliked about 
 C++.
I guess you meant:

class A
{
    class B;
}

class A.B
{
    // implementation
}

Ideally both styles would be supported.
Actually I had meant this instead, which reduces duplication:

class A
{

}

class A.B
{
    // implementation
}

There should be no need to declare class B inside class A, because the compiler can easily know that class A.B means that B is included inside A. It adds some more work for the compiler, but it's doable. However, I'm not so sure how useful such a thing really is if all it does is save on adding indentations for sub classes.

--rt
Mar 05 2013
parent reply Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 3/6/13, Rob T <alanb ucora.com> wrote:
 Actually I had meant this instead, which reduces duplication

 class A
 {

 }

 class A.B
 {
      // implementation
 }
I don't like this idea at all. It again makes the class unreadable because you can't tell at a glance what it contains.
Mar 06 2013
parent "Rob T" <alanb ucora.com> writes:
On Wednesday, 6 March 2013 at 13:44:39 UTC, Andrej Mitrovic wrote:
 On 3/6/13, Rob T <alanb ucora.com> wrote:
 Actually I had meant this instead, which reduces duplication

 class A
 {

 }

 class A.B
 {
      // implementation
 }
I don't like this idea at all. It again makes the class unreadable because you can't tell at a glance what it contains.
I suppose you are right, but I wouldn't want to break up a class except to separate its interface from its implementation specifics; anything not declared in the interface would be an implementation detail, which means it should not show up in the interface.

If I understood you correctly, what you want to accomplish is different from defining an interface: you want to be able to break a class into smaller parts to save on indentation?

--rt
Mar 06 2013
prev sibling parent "Dicebot" <m.strashun gmail.com> writes:
On Monday, 4 March 2013 at 23:36:18 UTC, Rob T wrote:
 On Monday, 4 March 2013 at 06:18:35 UTC, Manu wrote:
 +1_000_000_000

 Yes please!
 It's near impossible to get a brief overview of a class at a 
 glance in D!
I agree it's very handy to get an overview of a class or struct or module interface, however I think that the IDE should deal with this problem rather than using a duplication of source code as a solution.
Good presentation is not something an IDE can decide for you. For example, you generally want to leave unit tests out of an overview, but then still include some of them there as extended documentation examples. That is not something a program can decide for you.
Mar 05 2013
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Manu:

 +1_000_000_000

 Yes please!
 It's near impossible to get a brief overview of a class at a 
 glance in D!
-1 from me. It's one more step toward turning D code into a syntax soup. Bye, bearophile
Mar 04 2013
prev sibling parent reply "eles" <eles eles.com> writes:
On Monday, 4 March 2013 at 06:18:35 UTC, Manu wrote:
 +1_000_000_000

 Yes please!
 It's near impossible to get a brief overview of a class at a 
 glance in D!
Exactly for this reason, what about making this at least the recommended way, if not the only one? What is there to lose? As for what there is to win: basically, each .d file will carry its .di file (the class definition) inside it, and the latter can be easily extracted (both visually and automatically).

Just one note: please allow the private variables of a class (those that are not exposed outside the file) to be declarable outside the main definition of the class, that is, with the . syntax. This will make declaration and implementation completely independent, to the point that the class definition is an interface and nothing else.
Mar 05 2013
next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2013-03-05 13:01, eles wrote:

 Exactly for this reason, what about make this way at least the
 recommended way, if not the single one?
You want to break every single piece of D code that uses classes?
 What is to lose? As about what to win, basically each .d file will carry
 its .di file (class definition) inside it, and the latter can be easily
 extracted (both visually and automatically).

 Just one note: please allow that the private variables of a class (those
 that are not exposed outside the file) be declarable outside the main
 definition of the class, that is with the . syntax. This will completely
 make declaration and implementation independent, to the point that the
 class definition is an interface and nothing else.
The compiler will need to know the size of the class, for that it needs to know the instance variables. -- /Jacob Carlborg
Mar 05 2013
next sibling parent "Dicebot" <m.strashun gmail.com> writes:
On Tuesday, 5 March 2013 at 12:19:17 UTC, Jacob Carlborg wrote:
 The compiler will need to know the size of the class, for that 
 it needs to know the instance variables.
Quite an interesting problem to address, by the way. I am on the side that it is better addressed by providing some general means of duck-typing verification of interfaces for structs, thus eliminating the need to keep anything with private stuff in the definitions in the .di file at all.
Mar 05 2013
prev sibling parent reply "eles" <eles eles.com> writes:
On Tuesday, 5 March 2013 at 12:19:17 UTC, Jacob Carlborg wrote:
 On 2013-03-05 13:01, eles wrote:
 The compiler will need to know the size of the class, for that 
 it needs to know the instance variables.
Maybe, but private variables should have nothing to do with the behaviour of the class, including its behaviour w.r.t. the sizeof() operator and so on.

What the compiler knows and what the user cares about are different things.
Mar 05 2013
parent reply Jacob Carlborg <doob me.com> writes:
On 2013-03-05 13:48, eles wrote:

 Maybe, but private variables should have nothing to do w.r.t. the
 behaviour of the class, including its behaviour w.r.t. the sizeof()
 operator and so on.

 What compiler does know and what the user care about are different things.
True. But how does the compiler get this information if you're using a pre-compiled library and a .di file? -- /Jacob Carlborg
Mar 05 2013
parent "eles" <eles eles.com> writes:
On Tuesday, 5 March 2013 at 14:40:57 UTC, Jacob Carlborg wrote:
 On 2013-03-05 13:48, eles wrote:
 True. But how does the compiler get this information if you're 
 using a pre compiled library and a .di file?
I imagine that a similar question was asked about virtual methods...

Hide it in a special field of the pre-compiled library, for example. The runtime would maintain a "private data description" for each class, read from the module. Private variables would simply not be put into the class, but into runtime-specific data.
Mar 05 2013
prev sibling parent reply "Rob T" <alanb ucora.com> writes:
On Tuesday, 5 March 2013 at 12:01:54 UTC, eles wrote:
 On Monday, 4 March 2013 at 06:18:35 UTC, Manu wrote:
 +1_000_000_000

 Yes please!
 It's near impossible to get a brief overview of a class at a 
 glance in D!
Exactly for this reason, what about make this way at least the recommended way, if not the single one? What is to lose? As about what to win, basically each .d file will carry its .di file (class definition) inside it, and the latter can be easily extracted (both visually and automatically). Just one note: please allow that the private variables of a class (those that are not exposed outside the file) be declarable outside the main definition of the class, that is with the . syntax. This will completely make declaration and implementation independent, to the point that the class definition is an interface and nothing else.
This is what I'd like to see happen with modules, which seems to be in agreement with what you are proposing: the module source file should contain everything needed to automatically generate and maintain a .di file. There should be no need to manually maintain a .di file.

How we accomplish that goal is an implementation detail; however, I fully agree that something needs to be done, and it should be done without any need to manually maintain separate .di files.

--rt
Mar 05 2013
next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Tue, Mar 05, 2013 at 06:39:07PM +0100, Rob T wrote:
 On Tuesday, 5 March 2013 at 12:01:54 UTC, eles wrote:
On Monday, 4 March 2013 at 06:18:35 UTC, Manu wrote:
+1_000_000_000

Yes please!
It's near impossible to get a brief overview of a class at a
glance in D!
Exactly for this reason, what about make this way at least the recommended way, if not the single one? What is to lose? As about what to win, basically each .d file will carry its .di file (class definition) inside it, and the latter can be easily extracted (both visually and automatically). Just one note: please allow that the private variables of a class (those that are not exposed outside the file) be declarable outside the main definition of the class, that is with the . syntax. This will completely make declaration and implementation independent, to the point that the class definition is an interface and nothing else.
This is what I'd like to see happen with modules which seems to be in agreement with what you are proposing: The module source file should contain everything needed to automatically generate and maintain a .di file. There should no need to manually maintain a .di file.
+1. It should be the compiler's job to export the public API of a given .d module, not the programmer's. By default, it should export only public members. Optionally, we can use UDAs to override compiler defaults in certain cases, where it's necessary. Ideally, such cases will be very rare.
 How we accomplish that goal is an implementation detail, however I
 fully agree that something needs to be done and it should be done
 without any need to manually maintain separate .di files.
+1. Separate maintenance of header files in C/C++ has always been a pain and a source of inconsistencies. It's one thing when you have a small dedicated team working on the same codebase; but I have seen projects that were handed around to different people (due to the usual reasons -- turnover, company reorgs, etc.), and inevitably the headers go out of sync with the implementation files.

I've actually seen (and fixed) a case of two C functions that were declared with the same name in two different libraries. In one module, the wrong library was linked in, binding the call to the wrong function and causing strange runtime problems that no amount of staring at the source code would reveal the reason for. The #include was perfectly correct; the problem was that the declaration was detached from the implementation, so the compiler had no way to check for this kind of problem.

Not to mention the silliness of having to copy-n-paste function prototypes in two places -- in this day and age, why are we still promoting such error-prone, manual, hackish ways of coding? Such things should be automated!

D's module system is the beginning of a sane handling of such issues. Let's not go back to the bad ole split header/implementation approach. It should be the compiler's job to extract the public API of a module *automatically*, so that consistency is guaranteed. No room for human error.

T

--
The best compiler is between your ears. -- Michael Abrash
Mar 05 2013
parent "Rob T" <alanb ucora.com> writes:
On Tuesday, 5 March 2013 at 18:00:59 UTC, H. S. Teoh wrote:
[...]
 I've actually seen (and fixed) a case of two C functions that 
 were
 declared with the same name in two different libraries, and in 
 one
 module, the wrong library was linked in, thus linking the call 
 to the
 wrong function, causing strange runtime problems that no amount 
 of
 staring at the source code would reveal the reason for. The 
 #include was
 perfectly correct; but the problem was that the declaration was 
 detached
 from the implementation, so the compiler has no way to check 
 for this
 kind of problem.
I have been the victim of this sort of insanity enough times to never want to suffer through it again; that's why I'm taking the time to make a case against manual .di maintenance.

Note that one of the main points Andrei wanted to address is the desire to prevent unnecessary recompilation when the interface does not change despite the implementation changing. This could be achieved by specifying to the compiler a path where the .di files are located; the compiler can search and parse those files for matching module names (the files may be named differently than the associated module) and check whether the interface has changed -- if not, it does not overwrite with a fresh copy.

--rt
Mar 05 2013
prev sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
05-Mar-2013 21:39, Rob T пишет:
 On Tuesday, 5 March 2013 at 12:01:54 UTC, eles wrote:
 On Monday, 4 March 2013 at 06:18:35 UTC, Manu wrote:
 +1_000_000_000

 Yes please!
 It's near impossible to get a brief overview of a class at a glance
 in D!
Exactly for this reason, what about make this way at least the recommended way, if not the single one? What is to lose? As about what to win, basically each .d file will carry its .di file (class definition) inside it, and the latter can be easily extracted (both visually and automatically). Just one note: please allow that the private variables of a class (those that are not exposed outside the file) be declarable outside the main definition of the class, that is with the . syntax. This will completely make declaration and implementation independent, to the point that the class definition is an interface and nothing else.
This is what I'd like to see happen with modules which seems to be in agreement with what you are proposing: The module source file should contain everything needed to automatically generate and maintain a .di file. There should no need to manually maintain a .di file.
Leverage UDAs?
 How we accomplish that goal is an implementation detail, however I fully
 agree that something needs to be done and it should be done without any
 need to manually maintain separate .di files.

 --rt
-- Dmitry Olshansky
Mar 05 2013
parent "Rob T" <alanb ucora.com> writes:
On Tuesday, 5 March 2013 at 18:16:41 UTC, Dmitry Olshansky wrote:
 Leverage UDAs ?
Yes, that's one possible method: tagging the interface with recognized attributes that guide the generation of the .di files. If the generation is smart enough that we can do without them, even better.

The other need expressed here is to have a means to benefit from a separation of interface from implementation inside the source base. I'll admit that bundling the two together is somewhat of a pain, as it can hide away the interface. However, I wonder if this is not best served through automatic documentation; i.e., if the .di generation can figure out the interface, then why can't DDoc do it too? Then again, this may be more of a coding-convenience issue than a documentation issue, so I'm not strongly opposed to separating interface from implementation; rather, my main concern was seeing manual maintenance of .di files being taken seriously. From my POV it's akin to having to hex-edit your .o files to keep them in sync with source code changes.

If there's to be a method of interface/implementation separation, I think whatever is devised should minimize code duplication as much as possible, and the .di files should always be generated and maintained automatically.

--rt
Mar 05 2013
prev sibling next sibling parent reply "Dicebot" <m.strashun gmail.com> writes:
I really like this and never understood the reasoning behind automatically generated .di files, nor the difficulties with separating interface from implementation. If there is one thing that prevents me from saying that D's module system is 100% better than C++'s, here it is. If there is anything I would want to write by hand in the first place, it is the .di file, because it is the one intended to be read by the broader programmer audience.

Add in my proposals about protection attributes and inner linkage, and that would have been a module system I really love.

But I tend to agree that this issue is not urgent enough to prioritize in any way. Adding a "preapproved" enhancement request after convincing the D community may be enough for now.

Regarding CTFE: it is a reason not to use this layout for generic algorithm libraries and similar stuff, but there is no point in making application module code CTFE-able by accident.
Mar 03 2013
parent reply "deadalnix" <deadalnix gmail.com> writes:
On Sunday, 3 March 2013 at 10:47:34 UTC, Dicebot wrote:
 Regarding CTFE - it is a reason to not use this layout for 
 generic algorithm libraries and similar stuff, but there is no 
 point in allowing application module code for CTFE by an 
 accident.
It is the virtual-vs-final-by-default argument. You'll find very good reasons to do so, but it has drawbacks. I'm not sure if one is really better than the other.
Mar 03 2013
parent reply "Dicebot" <m.strashun gmail.com> writes:
On Sunday, 3 March 2013 at 11:03:55 UTC, deadalnix wrote:
 On Sunday, 3 March 2013 at 10:47:34 UTC, Dicebot wrote:
 Regarding CTFE - it is a reason to not use this layout for 
 generic algorithm libraries and similar stuff, but there is no 
 point in allowing application module code for CTFE by an 
 accident.
It is the virtual vs final by default argument. You'll find very good reason to do so, but it has drawback. I'm not sure if one is really better than the other.
How so? Virtual vs. final is largely about overriding (or failing to override) by accident, VMT overhead, and similar things that do make a difference. CTFE is all about either giving access to the source for evaluation or not. Where is the catch?
Mar 03 2013
parent reply "deadalnix" <deadalnix gmail.com> writes:
On Sunday, 3 March 2013 at 11:08:33 UTC, Dicebot wrote:
 On Sunday, 3 March 2013 at 11:03:55 UTC, deadalnix wrote:
 On Sunday, 3 March 2013 at 10:47:34 UTC, Dicebot wrote:
 Regarding CTFE - it is a reason to not use this layout for 
 generic algorithm libraries and similar stuff, but there is 
 no point in allowing application module code for CTFE by an 
 accident.
It is the virtual vs final by default argument. You'll find very good reason to do so, but it has drawback. I'm not sure if one is really better than the other.
How so? virtual vs final is a lot about overriding by accident (or not overriding), VMT overhead and similar things that do make difference. CTFE is all about either giving access to source for evaluation or not. Where is the catch?
That is the same debate with the same arguments. People who argue for final by default say that you can still make the function virtual afterward if needed, while people arguing the other way around say that you may not know in advance what will be overridden, and that you create unnecessary restrictions. CTFEability is exactly the same argument.

Overriding by accident is not a concern, as you have to specify override anyway.
Mar 03 2013
parent reply "Dicebot" <m.strashun gmail.com> writes:
On Sunday, 3 March 2013 at 13:19:35 UTC, deadalnix wrote:
 ...
Ah, I beg your pardon, I misunderstood you a bit.

Still, the difference I am trying to point out is that virtual vs. final does not rely on the actual method implementation at all, while a proper CTFE function needs to be written with CTFE in mind, as a different subset of the language can be used. If a function is found to be CTFE-able by accident and used as such, any change to its implementation that adds non-CTFE features (while preserving the public function interface and behavior) will break user code.
Mar 03 2013
parent "deadalnix" <deadalnix gmail.com> writes:
On Sunday, 3 March 2013 at 13:24:39 UTC, Dicebot wrote:
 On Sunday, 3 March 2013 at 13:19:35 UTC, deadalnix wrote:
 ...
Ah, beg my pardon, misunderstood you a bit. Still, difference I am trying to point is that virtual vs final does not rely on actual method implementation at all, while proper CTFE function needs to be written with CTFE in mind as different subset of language can be used. If one function will be found CTFE-able by accident and used in such, any change to its implementation that will add non-CTFE features (preserving public function interface and behavior) will break user code.
Yes, both are technically very different. But you argue for CTFEability by default, or when chosen, with almost the same arguments.

I explained my opinion on that before: CTFEability can be decoupled from the actual source code using some bytecode. Bytecode should be opaque enough to make that work.
Mar 03 2013
prev sibling next sibling parent reply "Rob T" <alanb ucora.com> writes:
On Friday, 22 July 2011 at 22:06:20 UTC, Andrei Alexandrescu 
wrote:
 A chat in IRC revealed a couple of possible improvements to the 
 development scenario in which the interface file (.di) is 
 managed manually and separately from the implementation file 
 (.d).

 After discussing a few ideas with Walter, the following 
 language improvement came about. Consider a very simple 
 scenario in which we have files a.d, a.di, and client.d, all 
 situated in the same directory, with the following contents:

 // a.di
 class A { private int x; int foo(); }

 // a.d
 import a;
 int A.foo() { return x + 1; }

 // client.d
 import a;
 void main() { (new A).foo(); }

 To compile:

 dmd -c a.d
 dmd -c client.d
 dmd client.o a.o

 Currently, in order for a program with separately-implemented 
 methods to work properly, there must be TWO different files for 
 the same class, and the .di file and the .d file MUST specify 
 the same exact field layout. Ironically, the .d file is 
 forbidden from including the .di file, which makes any checks 
 and balances impossible. Any change in the layout (e.g. 
 swapping, inserting, or removing fields) is undetectable and 
 has undefined behavior.

 I see this as a source of problems going forward, and I propose 
 the following changes to the language:

 1. The compiler shall accept method definitions outside a class.

 2. A method cannot be implemented unless it was declared with 
 the same signature in the class definition.

 Walter is reluctantly on board with a change in this direction, 
 with the note that he'd just recommend interfaces for this kind 
 of separation. My stance in this matter is that we shouldn't 
 constrain without necessity the ability of programmers to 
 organize the physical design of their large projects.

 Please discuss here. I should mention that Walter has his hands 
 too full to embark on this, so implementation should be 
 approached by one of our other dmd contributors (after of 
 course there's a shared view on the design).


 Andrei
One of the main selling points of the module system is to prevent exactly what you are proposing, so I think there must be a better solution.

The biggest disappointment I have with modules is that you cannot define an interface inside of them and must instead resort to a .di file. Manually maintaining a .di file is a bad idea for what should be obvious reasons, and the auto-generation of .di files is a failure because you cannot tell the compiler how to generate the .di file from inside the module.

For a solution, what I'd like to see is manual control, placed directly inside the D module, over what goes into an automatically generated .di file. For example, allow the programmer to specify the separation of interface from implementation directly inside the module through an improvement to how modules are defined; that is, allow the programmer to mark what is part of the interface and what is part of the implementation directly inside the module. The compiler can then auto-generate the necessary .di interface files from that information. The obvious benefit is that you get interface/implementation separation without separating the point of control: it's all done inside the module, where it should be done, and nowhere else, without a duplication of definitions. This can possibly be done through the attribute system.

There should be a way for the compiler to determine whether previously auto-generated interface files are the same or differ, to prevent overwriting them and forcing unnecessary recompilation. If the .di files are stored in specific locations away from where the compiler is dumping them, then you could add an "include" switch to tell the compiler where to look for previously generated .di files and compare them.

Sure, it's a bit of effort to implement, but it seems like a far more useful solution that is safe, scalable, and easy to maintain.
--rt

PS: While we're on the topic of interfaces (or the lack thereof), another problem point I have with D is that while you can define an interface for classes, you cannot do the same thing with structs. I don't understand why structs cannot have interfaces.
Mar 03 2013
next sibling parent "Dicebot" <m.strashun gmail.com> writes:
On Monday, 4 March 2013 at 06:06:14 UTC, Rob T wrote:
 One of the main selling points of the module system is to 
 prevent exactly what you are proposing, so I think there must 
 be a better solution.
Please explain. The main points of the module system are to avoid the header compilation-time hell and to control access. I don't see how this is relevant to the topic.
Mar 04 2013
prev sibling parent reply "Dicebot" <m.strashun gmail.com> writes:
On Monday, 4 March 2013 at 06:06:14 UTC, Rob T wrote:
 Manually maintaining a .di file is a bad idea for what should 
 be obvious reasons
No, they are not obvious at all, please explain.
Mar 04 2013
parent reply "Rob T" <alanb ucora.com> writes:
 One of the main selling points of the module system is to 
 prevent exactly what you are proposing, so I think there must 
 be a better solution.
 Please explain. Main points of module system is to avoid header 
 compilation time hell and control access. I don't how this is 
 relevant to the topic.
This is what I read when I first read about D: modules are supposed to be a way to get rid of the need for maintaining separate source files, which as you stated also has the desired side effect of getting rid of separate header files. Here's one source:

http://dlang.org/overview.html

The proposal as I read it moves us more backwards than forwards, and it seems there's a better way, as I described. If there's to be a solution, it should be one that retains a single source file. If we need a separation, it should be automated to prevent manual duplication and maintenance hell.
 On Monday, 4 March 2013 at 06:06:14 UTC, Rob T wrote:
 Manually maintaining a .di file is a bad idea for what should 
 be obvious reasons
No, they are not obvious at all, please explain.
I did not think that I would have to explain the difference between manually maintaining two separate source files containing manual duplication of code that must be kept in sync vs manually maintaining only one source file with no manual duplication of code. --rt
Mar 04 2013
parent reply "Dicebot" <m.strashun gmail.com> writes:
On Monday, 4 March 2013 at 23:24:02 UTC, Rob T wrote:
 One of the main selling points of the module system is to 
 prevent exactly what you are proposing, so I think there must 
 be a better solution.
 Please explain. Main points of module system is to avoid 
 header compilation time hell and control access. I don't how 
 this is relevant to the topic.
This is what I read when I first read about D, modules are supposed to be a way to get rid of the need for maintaining separate source files, which as you stated also has the desired side effect of getting rid of separate header files. here's one source: http://dlang.org/overview.html The proposal as I read it, moves us more backwards than forwards, and it seems there's a better way as I had described. If there's to be a solution, it should be a solution that retains one source file. If we need a separation, it should be automated to prevent manual duplication and maintenance hell.
I can find nothing on the topic of the "separation" issue at that link. In fact, I have never met a C/C++ programmer saying that having separate headers is itself a problem; it was a loved feature, if anything. The problem was the way it was designed, via the pre-processor. D fixes this by introducing real symbolic imports, and that is good, but it is completely irrelevant to the topic of separation of interface and implementation.
 On Monday, 4 March 2013 at 06:06:14 UTC, Rob T wrote:
 Manually maintaining a .di file is a bad idea for what should 
 be obvious reasons
No, they are not obvious at all, please explain.
I did not think that I would have to explain the difference between manually maintaining two separate source files containing manual duplication of code that must be kept in sync vs manually maintaining only one source file with no manual duplication of code.
There is close to zero code duplication. This proposal suggests the ability to import definitions from a module's own .di file into the main module file. That means the only code duplication is method names (full signatures for overloaded ones). That actually simplifies maintenance a lot.
Mar 05 2013
parent reply Jacob Carlborg <doob me.com> writes:
On 2013-03-05 09:48, Dicebot wrote:

 I can find nothing on the topic of "separation" issue down that link. In
 fact I have never met a C/C++ programmer saying fact of having a
 separate headers is a problem, it was a loved feature if anything.
 Problem was the way it was designed, via pre-processor. D fixes this by
 introducing real symbolic imports and that is good, but completely
 irrelevant to the topic of separation of interface and implementation.
We are all here :) -- /Jacob Carlborg
Mar 05 2013
parent reply "Dicebot" <m.strashun gmail.com> writes:
On Tuesday, 5 March 2013 at 10:41:36 UTC, Jacob Carlborg wrote:
 On 2013-03-05 09:48, Dicebot wrote:

 I can find nothing on the topic of "separation" issue down 
 that link. In
 fact I have never met a C/C++ programmer saying fact of having 
 a
 separate headers is a problem, it was a loved feature if 
 anything.
 Problem was the way it was designed, via pre-processor. D 
 fixes this by
 introducing real symbolic imports and that is good, but 
 completely
 irrelevant to the topic of separation of interface and 
 implementation.
We are all here :)
Well, that is why I am so surprised and asking for a rationale with a better explanation than "it is obvious". I am rather astonished by the overall negative reaction to a long-awaited (by me) proposal from Andrei.
Mar 05 2013
parent reply "Rob T" <alanb ucora.com> writes:
On Tuesday, 5 March 2013 at 12:20:15 UTC, Dicebot wrote:
 On Tuesday, 5 March 2013 at 10:41:36 UTC, Jacob Carlborg wrote:
 On 2013-03-05 09:48, Dicebot wrote:

 I can find nothing on the topic of "separation" issue down 
 that link. In
 fact I have never met a C/C++ programmer saying fact of 
 having a
 separate headers is a problem, it was a loved feature if 
 anything.
 Problem was the way it was designed, via pre-processor. D 
 fixes this by
 introducing real symbolic imports and that is good, but 
 completely
 irrelevant to the topic of separation of interface and 
 implementation.
We are all here :)
Well, that is why I am so surprised and asking for rationale with better explanation than "it is obvious". I am rather astonished by an overall negative reaction to a long awaited (by me) Andrei proposal.
Clearly there's a misunderstanding going on somewhere. For example, when I say "code duplication" you say "There is close to zero code duplication," but from my POV there clearly is code duplication, and it is indeed significant and completely unnecessary. Also, my understanding of what a module accomplishes is a superset of what you think a module accomplishes, i.e., I include that it solves the code duplication problem; you don't.

I agree that the current method of automated .di generation is a failure, but I think it is a failure only because the programmer has no control over specifying what goes into the .di file from within the module source, and because the automated .di generation and maintenance implementation is incomplete.

At this point in the discussion, I don't see a sound reason why automated .di generation cannot be fixed (proposed solutions haven't even been discussed in any depth), and regressing back to manual separation of code seems like a "quick fix" hack that introduces a set of serious problems that cancel out the benefits and probably make the situation worse.

You seem to support the idea of manual .di generation and maintenance, which implies to me that you think automated .di generation cannot succeed. Is this correct? If so, then I have to ask why?

--rt
Mar 05 2013
parent reply "Dicebot" <m.strashun gmail.com> writes:
On Tuesday, 5 March 2013 at 17:22:52 UTC, Rob T wrote:
 Clearly there's a misunderstanding going on somewhere. For
 example, when I say "code duplication" you say "There is close
 to zero code duplication," but from my POV there clearly is
 code duplication, and it is indeed significant and completely
 unnecessary. Also, my understanding of what a module
 accomplishes is a superset of what you think a module
 accomplishes, i.e., I include that it solves the code
 duplication problem; you don't.
Do you consider the necessity of duplicating a method signature when overriding in a descendant to be significant code duplication too? I fail to see the difference between these two cases.
 I agree that the current method of automated .di generation is 
 a failure, but I think it is a failure only because the 
 programmer has no control over specifying what goes into the 
 .di file from within the module source, and that the automated 
 .di generation and maintenance implementation is incomplete.

 At this point in the discussion, I don't see a sound reason why 
 the automated .di generation cannot be fixed (proposed 
 solutions haven't even been discussed in any depth), and 
 regressing back to manual separation of code seems like a 
 "quick fix" hack that introduces a set of serious problems that 
 cancel out the benefits and probably make the situation worse.
I do not object to better automatic .di generation; that is good for sure. But I do object to opposing the proposed simplification of going the other way around. Why do you call it "regressing" if it changes nothing in current behavior? It will considerably improve my usage scenario (start with .di, write .d after) and won't affect yours (write only .d).
 You seem to support the idea of using manual .di generation and 
 maintenance, which implies to me that you think automated .di 
 generation cannot succeed, is this correct? If so, then I have 
 to ask why?
"Success" is not the right term here, automated generation helps different development model - when you just go and write stuff you want, compile it and provide something to import to use it. Manual .di writing is all about strict separation of interface and implementation, not only logical separation, but a physical one. I find it a very solid thinking model - removing possibility to see an implementation details even by an accident and removing the temptation to alter interface few extra times by keeping it far from current context. Automatic generation won't help it because I _do want to maintain them as separate entities_. Also all maintenance issues mentioned in this thread for C have roots in header being processed in scope of importer translation unit, not actually the fact that it is separate. For example, naming clash at linking stage is not possible as all names are implicitly in modules namespace.
Mar 05 2013
parent reply "Rob T" <alanb ucora.com> writes:
On Tuesday, 5 March 2013 at 19:14:08 UTC, Dicebot wrote:
 Do you consider the necessity of duplicating a method
 signature when overriding in a descendant to be significant
 code duplication too? I fail to see the difference between
 these two cases.
I think there is a difference. For example, the compiler will complain when there's a typo, as opposed to producing obfuscated linker errors. In any case, if there *is* a sane way to prevent duplication of function overrides, then I'll support it over duplication. This is a common principle: you should *always* want to avoid unnecessary duplication, as every line of extra code you add increases the effort required to maintain bug-free code.
 I do not object to better automatic .di generation; that is
 good for sure. But I do object to opposing the proposed
 simplification of going the other way around. Why do you call
 it "regressing" if it changes nothing in current behavior? It
 will considerably improve my usage scenario (start with .di,
 write .d after) and won't affect yours (write only .d).
You'll need to explain further what you are trying to accomplish, because I sincerely cannot understand what you are arguing for. Why would anyone ever want to start off with a .di interface?
 "Success" is not the right term here, automated generation 
 helps different development model - when you just go and write 
 stuff you want, compile it and provide something to import to 
 use it. Manual .di writing is all about strict separation of 
 interface and implementation, not only logical separation, but 
 a physical one. I find it a very solid thinking model - 
 removing possibility to see an implementation details even by 
 an accident and removing the temptation to alter interface few 
 extra times by keeping it far from current context. Automatic 
 generation won't help it because I _do want to maintain them as 
 separate entities_.
Again, this goes back to the previous question: why would you ever want to maintain a separation of interface from source? I am fully aware of the need to generate a separation for a variety of reasons (source code hiding being a significant one), but I fail to see any reason why the process ever has to be a manual effort. Perhaps you can give me an example?
 Also, all the maintenance issues mentioned in this thread for
 C have their roots in the header being processed in the scope
 of the importer's translation unit, not in the fact that it
 is separate. For example, a naming clash at the linking stage
 is not possible, as all names are implicitly in the module's
 namespace.
Perhaps then, what I think is the most significant point of all is being missed. I can see a very clear reason why, in terms of source code, the interface and implementation must not be separated. Automated .di generation is a means to solve the need for separation when distributing libraries; whatever goes in the .di should not be considered part of your "source code", because it should be auto-generated from the source (e.g. your .o files are not source code, in the same way). If there's ever a real reason to manually maintain a .di, then it's the same as having to hex edit a .o file to make it work, i.e. the compiler is busted! --rt
Mar 05 2013
next sibling parent "Rob T" <alanb ucora.com> writes:
I curse the lack of an edit feature!

Correction of my main point:

------------

... whatever goes in the .di should not be considered as your 
"source code" because it should be auto generated from the source 
(eg your .o files are not source code in the same way). If 
there's ever a real reason to manually maintain a .di, then it's 
the same as having to hex edit a .o file to make it work,
ie. the compiler is busted!

------------

--rt
Mar 05 2013
prev sibling next sibling parent reply "Vladimir Panteleev" <vladimir thecybershadow.net> writes:
On Tuesday, 5 March 2013 at 19:59:13 UTC, Rob T wrote:
 Perhaps then, what I think is the most significant point of
 all is being missed. I can see a very clear reason why, in
 terms of source code, the interface and implementation must
 not be separated. Automated .di generation is a means to
 solve the need for separation when distributing libraries;
 whatever goes in the .di should not be considered part of
 your "source code", because it should be auto-generated from
 the source (e.g. your .o files are not source code, in the
 same way).
I agree with Rob T on this point. Being free from the burden of maintaining an interface file was one of the selling points of D vs. C++ for me - repeating the same changes over and over in the .cpp and .h files was a pointless distraction and a strain on my productivity, especially during prototyping.

Even though the feature, if introduced, will be optional, sooner or later it will end up in a library (likely written by a misguided programmer coming from C++), and I will have to debug it - so I don't consider "don't use it if you don't like it" a real argument.

Also, I don't think that we should consider a class declaration to be the same thing as the class interface - for the simple reason that a class declaration must also contain private fields and methods. Having to recompile half of my program just because of a signature change in a class's private method makes no sense, and is the reason why hacks like PImpl emerged.
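The PImpl workaround mentioned above can be sketched in D roughly as follows (module and member names are hypothetical, not from the thread): the importer-visible class holds only an opaque pointer, so the hidden state can change freely without altering the layout that importers compile against.

```d
// widget.d - the class importers see. Its layout is a single pointer,
// so signature or field changes in the hidden state do not force
// importers to recompile.
module widget;

class Widget
{
    private WidgetImpl* impl;          // opaque handle; layout never changes

    this() { impl = new WidgetImpl(0); }
    int poke() { return impl.bump(); } // forwards to the hidden state
}

// In a real PImpl split this struct would live in a separate module
// that clients never import; it is shown inline to keep the sketch short.
private struct WidgetImpl
{
    int count;
    this(int c) { count = c; }
    int bump() { return ++count; }
}
```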
Mar 05 2013
next sibling parent reply "Rob T" <alanb ucora.com> writes:
On Tuesday, 5 March 2013 at 20:40:41 UTC, Vladimir Panteleev 
wrote:
 Also, I don't think that we should consider that a class 
 declaration is the same thing as the class interface - for the 
 simple reason that a class declaration must also contain 
 private fields and methods. Having to recompile half of my 
 program just because of a signature change in a class's private 
 method makes no sense, and is the reason why hacks like PImpl 
 emerged.
Yes I fully agree with that point. D should be adjusted to solve the problem since it makes working on large projects much more practical. --rt
Mar 05 2013
parent "Dicebot" <m.strashun gmail.com> writes:
On Tuesday, 5 March 2013 at 20:47:11 UTC, Rob T wrote:
 On Tuesday, 5 March 2013 at 20:40:41 UTC, Vladimir Panteleev 
 wrote:
 Also, I don't think that we should consider that a class 
 declaration is the same thing as the class interface - for the 
 simple reason that a class declaration must also contain 
 private fields and methods. Having to recompile half of my 
 program just because of a signature change in a class's 
 private method makes no sense, and is the reason why hacks 
 like PImpl emerged.
Yes I fully agree with that point. D should be adjusted to solve the problem since it makes working on large projects much more practical. --rt
I certainly agree with that, too. It may not be trivial, but it is a big hole in the abstraction model of C++-like languages.
Mar 05 2013
prev sibling parent "Dicebot" <m.strashun gmail.com> writes:
On Tuesday, 5 March 2013 at 20:40:41 UTC, Vladimir Panteleev 
wrote:
 On Tuesday, 5 March 2013 at 19:59:13 UTC, Rob T wrote:
 Perhaps then, what I think is the most significant point of
 all is being missed. I can see a very clear reason why, in
 terms of source code, the interface and implementation must
 not be separated. Automated .di generation is a means to
 solve the need for separation when distributing libraries;
 whatever goes in the .di should not be considered part of
 your "source code", because it should be auto-generated from
 the source (e.g. your .o files are not source code, in the
 same way).
I agree with Rob T on this point. Being free from the burden of maintaining an interface file was one of the selling points of D vs. C++ for me - repeating the same changes over and over in the .cpp and .h files was a pointless distraction and a strain on my productivity, especially during prototyping. Even though the feature, if introduced, will be optional, sooner or later it will end up in a library (likely written by a misguided programmer coming from C++), and I will have to debug it - so I don't consider "don't use it if you don't like it" a real argument. Also, I don't think that we should consider a class declaration to be the same thing as the class interface - for the simple reason that a class declaration must also contain private fields and methods. Having to recompile half of my program just because of a signature change in a class's private method makes no sense, and is the reason why hacks like PImpl emerged.
And what issues may arise, exactly? So far, everything mentioned is specific to the C approach; everyone is speaking about possible problems without a specific example.
Mar 05 2013
prev sibling parent reply "Dicebot" <m.strashun gmail.com> writes:
On Tuesday, 5 March 2013 at 19:59:13 UTC, Rob T wrote:
 On Tuesday, 5 March 2013 at 19:14:08 UTC, Dicebot wrote:
 Do you consider the necessity of duplicating a method
 signature when overriding in a descendant to be significant
 code duplication too? I fail to see the difference between
 these two cases.
I think there is a difference. For example, the compiler will complain when there's a typo, as opposed to producing obfuscated linker errors. In any case, if there *is* a sane way to prevent duplication of function overrides, then I'll support it over duplication. This is a common principle: you should *always* want to avoid unnecessary duplication, as every line of extra code you add increases the effort required to maintain bug-free code.
As far as I understand the spirit of this proposal, the compiler should do the very same checks and complain on a mismatch between interface and implementation - because it knows it is its own interface, contrary to C. And there is no sane way to avoid it, in the same sense as with inheritance: if you want to keep entities in two places in sync, some duplication is inevitable. That is acceptable if it is minimal and not error-prone (== verified by the compiler).
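Concretely, with the proposal from the top of the thread (note: this is the *proposed* syntax, not current D), the compiler can check every out-of-class definition against the in-class declaration:

```d
// a.di - the single authoritative class definition.
class A { private int x; int foo(); }

// a.d - out-of-class method implementation (proposed syntax).
import a;
int A.foo() { return x + 1; }    // OK: declared in A with this signature
// int A.bar() { return 0; }     // error under rule 2: never declared in A
```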
 I do not object to better automatic .di generation; that is
 good for sure. But I do object to opposing the proposed
 simplification of going the other way around. Why do you call
 it "regressing" if it changes nothing in current behavior? It
 will considerably improve my usage scenario (start with .di,
 write .d after) and won't affect yours (write only .d).
You'll need to explain further what you are trying to accomplish, because I sincerely cannot understand what you are arguing for. Why would anyone ever want to start off with a .di interface?
 ...
Yeah, probably this is the root cause of the disagreement. My use case is not closed-source distribution; it is more of a design philosophy/approach for large projects. This is somewhat similar to normal interfaces - it helps abstraction to have a single place with all the information about your module that someone out there needs to know: public interface, definitions, and documentation. Not because your implementation source is closed in the licensing sense, but because the very point of abstraction is to make your implementation a black box, so that it may change in any possible way.

It is sometimes considered a good approach (which I agree with) to start writing a module from the public interface, before you actually know anything about your upcoming implementation, while your thinking is closer to that of someone who will later use the module. So you write down this single file that can be thrown at another programmer with the words: "Here. It is all you need to know about my module to use it. Don't even try to learn more." Then, once you are satisfied, the actual implementation starts.

Can it be generated? Somewhat. But as this is intended for human reading, I'll need to verify the result anyway, with full attention: to make sure the function definition order is as planned, that no extra functions are made public by accident, that the documentation formatting fits, and so on. Verifying that no extra stuff has leaked in is especially hellish work - much more than any definition duplication.

And then there is also the other way around. Let's pretend you have your public interface designed in the same .d file and then start digging into all the real stuff. It is so easy to forget what was intended to be known by the outside world and tweak definitions here and there a bit, leaking implementation details in the process. When the interface is separate, you need a clear, conscious effort to change anything about it.
Funny thing: there has been a link to a cool presentation about OOP and ADTs somewhere in this newsgroup (http://www.infoq.com/presentations/It-Is-Possible-to-Do-OOP-in-Java), and the author there actually mentions the C separation of header and translation unit as a good example of OOP, as opposed to Java. Ironically, but pretty damn seriously. tl;dr: I see no additional error-prone cases with this feature if it is implemented with proper compiler verification, and it helps maintenance and improves the learning curve of a project's source.
Mar 05 2013
parent "Rob T" <alanb ucora.com> writes:
On Tuesday, 5 March 2013 at 20:51:02 UTC, Dicebot wrote:
 As far as I understand the spirit of this proposal, the
 compiler should do the very same checks and complain on a
 mismatch between interface and implementation - because it
 knows it is its own interface, contrary to C. And there is no
 sane way to avoid it, in the same sense as with inheritance:
 if you want to keep entities in two places in sync, some
 duplication is inevitable. That is acceptable if it is
 minimal and not error-prone (== verified by the compiler).
Yes, compiler verification has to be done between the interface and the implementation. The .di file, however, should be auto-generated and not considered a part of the source code. The only verification I could think of for a .di file is to make sure that the .di is not overwritten if there's no change to the interface, so as to prevent triggering unnecessary rebuilds. Such a thing would be useful for large projects.
 You'll need to explain more what you are trying to accomplish 
 because I sincerely cannot understand what you are arguing 
 for. Why would anyone ever want to start off with a .di 
 interface?
 ...
Yeah, probably this is the root cause of the disagreement. My use case is not closed-source distribution; it is more of a design philosophy/approach for large projects.
[snip] Perhaps the idea of allowing separate interfaces for modules is what you are asking for. I don't oppose adding better interface abilities to D, and from the discussion here I think valid arguments were made in favor of that. For your use case, you should get what you want if real interfaces were added to D. So, to make a case for your specific needs, please push for adding better interfaces to D rather than supporting suggestions that we can solve interface problems through direct manual maintenance of .di files. --rt
Mar 05 2013
prev sibling parent "Jakob Bornecrantz" <wallbraker gmail.com> writes:
On Friday, 22 July 2011 at 22:06:20 UTC, Andrei Alexandrescu 
wrote:

[SNIP]

 Walter is reluctantly on board with a change in this direction, 
 with the note that he'd just recommend interfaces for this kind 
 of separation. My stance in this matter is that we shouldn't 
 constrain without necessity the ability of programmers to 
 organize the physical design of their large projects.
I agree with Walter here. I have had good success with interfaces and abstract classes (where speed dictated direct field access) for largish projects, by separating interface from implementation. This talk[1] made me rethink how to do OO with classes and interfaces, but what he says at the end is the key takeaway. I apply the approach more to the high-level architecture of the program [2] & [3, 4, 5]. Either way, the talk is really good; I wholeheartedly recommend you take the time to watch it. Cheers, Jakob. [1] http://www.infoq.com/presentations/It-Is-Possible-to-Do-OOP-in-Java [2] https://github.com/VoltLang/Volta/blob/master/src/volt/interfaces.d [3] https://github.com/Charged/Miners/blob/master/src/miners/classic/interfaces.d [4] https://github.com/Charged/Miners/blob/master/src/miners/interfaces.d [5] https://github.com/Charged/Miners/blob/master/src/charge/game/runner.d
Mar 04 2013