digitalmars.D.learn - Templating everything? One module per function/struct/class/etc,

"JR" <zorael gmail.com> writes:
Given that...

1. importing a module makes it compile the entirety of it, as 
well as whatever it may be importing in turn
2. templates are only compiled if instantiated
3. the new package.d functionality

...is there a reason *not* to make every single 
function/struct/class separate submodules in a package, and make 
*all* of those templates? Unused functionality would never be 
imported nor instantiated, and as such never be compiled, so my 
binary would only include what it actually uses.


std/stdio/package.d:
     module std.stdio;
     // still allows for importing the entirety of std.stdio

     public import std.stdio.foo;
     public import std.stdio.writefln;
     __EOF__


std/stdio/foo.d:
     module std.stdio.foo;

     void fooify(Args...)(Args args)
     if (Args.length > 0)
     {
         // ...
     }

     void fooify()()
     {
         // don't need this, won't compile this
     }
     __EOF__


std/stdio/writefln.d:
     module std.stdio.writefln;

     import std.algorithm : canFind;

     // nevermind the incompatible signature
     auto writefln(string pattern, Args...)(Args args)
     if (!pattern.canFind(PatternIdentifier.json))
     {
         // code that doesn't need std.json --> it is never imported
     }
     __EOF__


What am I missing?
May 12 2014
Jonathan M Davis via Digitalmars-d-learn writes:
On Mon, 12 May 2014 08:37:42 +0000
JR via Digitalmars-d-learn <digitalmars-d-learn puremagic.com> wrote:

 What am I missing?
Well, that would be a lot of extraneous files, which would be very messy IMHO. It also makes it much harder to share private functionality, because everything is scattered across modules - you'd be forced to use package-level access for that. It also wouldn't surprise me if it cost more to compile the code that way if you were actually using most of it (though it may very well save compilation time if you're using a relatively small number of the functions and types). So, from a purely organizational perspective, I think that it's a very bad idea, though others may think that it's a good one. And since package.d imports all of those modules anyway, separating them out into separate files didn't even help you any.

Also, templates cost more to compile, so while you may avoid having to compile some functions because you're not using them, everything that _does_ get compiled will take longer to compile. And if you templatize them in a way that would result in more template instantiations (e.g. you actually templatize the parameters rather than just giving the function an empty template parameter list), then not only will the functions have to be compiled more frequently (due to multiple instantiations), but they'll take up more space in the final binary.

Also, while D does a _much_ better job with template errors than C++ does, template-related errors in D are still far worse than with functions that aren't templated, so you're likely going to cost yourself more time debugging template-related compilation errors than you ever would gain in reduced compilation times. In addition, if a function is templatized, it's harder to use it with function prototypes, which can be a definite problem for some code.
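
As a minimal sketch of the distinctions drawn above (all names are invented for illustration and are not taken from Phobos), the following compares an ordinary function, a function with an empty template parameter list, and a function with a templatized parameter. It also shows one reading of the remark about prototypes: a template has to be instantiated before it can be used as a function pointer.

```d
// Ordinary function: compiled once, together with the module that defines it.
int twice(int x) { return 2 * x; }

// Empty template parameter list: only compiled when used, and there is
// only ever a single instantiation, twiceLazy!().
int twiceLazy()(int x) { return 2 * x; }

// Templatized parameter: one separate instantiation per argument type used.
T twiceAny(T)(T x) { return 2 * x; }

unittest
{
    // Taking the address of a plain function works directly.
    int function(int) plain = &twice;

    // A template is instantiated explicitly (here with !()) to obtain
    // a function whose address can be taken.
    int function(int) lazyOne = &twiceLazy!();

    assert(plain(3) == 6);
    assert(lazyOne(3) == 6);

    // Three distinct instantiations end up being compiled:
    // twiceAny!int, twiceAny!long and twiceAny!double.
    assert(twiceAny(3) == 6);
    assert(twiceAny(3L) == 6L);
    assert(twiceAny(1.5) == 3.0);
}
```

The unittest block can be exercised with something like `dmd -unittest -main -run file.d`, where the filename is arbitrary.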

It's also a horrible idea for libraries to have functions templatized just to be templatized, because that means that a function has to be compiled _every_ time that a program uses it rather than having it compiled once when the library is compiled (the function would still have to be parsed unless it just had its signature in a .di file, but that's still faster than full compilation - and if a .di file is used, then all that has to be parsed is the signature). So, while it's often valuable to templatize functions, templatizing them to save compilation times is questionable at best.

D already does a _very_ good job at compiling quickly. Often, the linking step costs more than the actual compilation does (though obviously, as programs grow larger, the compilation time does definitely exceed the link time). Unless you're running into problems with compilation speed, I'd strongly advise against trying to work around the compiler to speed up code compilation. Templatize functions when it makes sense to do so, but don't templatize them just in an attempt to avoid having them be compiled.

If you're looking to speed up compilation times, it makes far more sense to look at doing things like reducing how much CTFE you use and how many templates you use. CTFE in particular is ridiculously expensive thanks to it effectively just being hacked into the compiler originally. Don has been doing work to improve that, and I expect it to improve over time, but I don't know how far along he is, and I don't know that it'll ever be exactly cheap to use CTFE.

Keep in mind that lexing and parsing are the _cheap_ part of the compiler. So, importing stuff really doesn't cost you much. Already, the compiler won't fully compile all of the symbols within a module except when it's compiling that module.
Simply importing it just causes it to process the module as much as required to fully compile the module that's importing it - which generally means getting function signatures (though in the case of templates, it could mean fully compiling the functions which are used rather than just pulling in their signatures, since the template is not fully compiled with its own module but by the module that instantiates it).

So, that explanation is probably too long, but essentially what you're suggesting is likely to be _more_ expensive, not less, and IMHO, it makes code organization much messier (though that part is obviously subjective). I've never seen the insanely large number of files required in Java to be a benefit of Java, and you're essentially suggesting that we take that to the extreme and have _everything_ have its own module rather than just each class.

- Jonathan M Davis
May 12 2014
"JR" <zorael gmail.com> writes:
On Monday, 12 May 2014 at 09:16:53 UTC, Jonathan M Davis via 
Digitalmars-d-learn wrote:
 Well, that would be a lot of extraneous files, which would be very messy IMHO.
 It also makes it much harder to share private functionality, because
 everything is scattered across modules - you'd be forced to use the package for
 that. It also wouldn't surprise me if it cost more to compile the code that
 way if you were actually using most of it (though it may very well save
 compilation time if you're using a relatively small number of the functions
 and types). So, from a purely organization perspective, I think that it's a
 very bad idea, though others may think that it's a good one. And since
 package.d imports all of those modules anyway, separating them out into
 separate files didn't even help you any.
Thank you for answering. The package.d example was mostly to highlight that the current syntax and practice of importing everything in a package wouldn't change.

To be a bit more specific: I'm new to everything beyond advanced scripting, so my code has grown organically, partly due to shotgun programming and partly due to exploratory programming. In the process of figuring out what code depends on what, to allow for making pieces of it more self-contained, I realized that my current practice of gathering everything related to the same topic into one module made it all unnecessarily tangled. (--gc-sections seems to cause segfaults.)

As an example, both the homebrew string builder 'Yarn' and the simple 'string plurality(ptrdiff_t, string, string)' belong in mylib.string. But if I want to reuse only plurality in other projects, I would end up pulling in all the ancillary stuff Yarn needs to validate input, and it only segues from there. Correct?

The logical step would be to split the all-encompassing mylib.string module into submodules, and the logical followup question would be why I shouldn't do that everywhere. Knowing that templates aren't built unless concretely instantiated, the idea of templatizing everything fell into the same basket. I hope you understand my concern that I was falling into an anti-pattern.
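
The split described above could look roughly like the sketch below. Only the signature of plurality comes from the post; its body, the file layout, and the mylib.string.yarn module name are assumptions made for illustration.

```d
// Sketch of a file that could live at mylib/string/plurality.d and be declared as
//     module mylib.string.plurality;
// The declaration is left as a comment so the snippet compiles under any filename.
// The body is a guess at what plurality does; only the signature is from the post.

/// Returns `singular` when `num` is 1 or -1, otherwise `plural`.
string plurality(ptrdiff_t num, string singular, string plural) pure nothrow @nogc @safe
{
    return (num == 1 || num == -1) ? singular : plural;
}

unittest
{
    assert(plurality(1, "file", "files") == "file");
    assert(plurality(3, "file", "files") == "files");
}

// mylib/string/package.d would then re-export the pieces, so that
// `import mylib.string;` keeps working for code that wants everything:
//     module mylib.string;
//     public import mylib.string.plurality;
//     public import mylib.string.yarn;   // hypothetical home of Yarn
```

Another project that needs only this function would then write `import mylib.string.plurality;` and never import the module containing Yarn.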
May 12 2014
Ary Borenszweig <ary esperanto.org.ar> writes:
On 5/12/14, 5:37 AM, JR wrote:
 Given that...

 1. importing a module makes it compile the entirety of it, as well as
 whatever it may be importing in turn
 2. templates are only compiled if instantiated
 3. the new package.d functionality

 ...is there a reason *not* to make every single function/struct/class
 separate submodules in a package, and make *all* of those templates?
 Unused functionality would never be imported nor instantiated, and as
 such never be compiled, so my binary would only include what it actually
 uses.
Welcome to Crystal :-) In Crystal, every function and method is templated. When you compile your program, only what you use gets compiled. A simple hello world program is just 16KB. And contrary to what Jonathan M. Davis says, compiling programs is very fast. In fact, compilation times might indeed be faster, because you don't need to compile unused code. And the resulting binary is as small as possible. And the error messages are pretty good, also. But D has an entirely different philosophy, so I don't think they will like it. I also once suggested "auto" for parameters, but they didn't like it. I'm just saying this to say that yes, it's possible, and no, it doesn't hurt compilation times or error messages.
May 12 2014
"Francesco Cattoglio" <francesco.cattoglio gmail.com> writes:
On Monday, 12 May 2014 at 08:37:43 UTC, JR wrote:
 What am I missing?
Error messages!
If your code is not compiled, you can't know whether it is valid or not.

I must say that since we have unittests, this is somewhat less relevant, but still...
One nice thing would be stripping the executable of unneeded code.
One trick I've seen done in a program which compiled some scripts to an intermediate language was zeroing the parts which are unused, then using some executable compressor.
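
A small sketch of the point about uncompiled code, using made-up function names: a template body that parses but refers to a nonexistent symbol goes unnoticed until something instantiates it, which is why a unittest that touches each template helps.

```d
// The body parses fine but is semantically broken:
// noSuchFunction (a made-up name) is not declared anywhere.
auto broken()(int x)
{
    return noSuchFunction(x);
}

// The module still compiles as long as nobody instantiates broken!().
// Attempting to instantiate it is what surfaces the error:
static assert(!__traits(compiles, broken(1)));

// A valid template for comparison.
int fine()(int x) { return x + 1; }

unittest
{
    // Calling each template at least once forces its body to be checked.
    assert(fine(1) == 2);
}
```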
May 12 2014
Jacob Carlborg <doob me.com> writes:
On 12/05/14 20:58, Francesco Cattoglio wrote:

 Error messages!
 If your code is not compiled, you can't know whether it is valid or not.

 I must say that since we have unittests, this is somewhat less relevant,
 but still...
 One nice thing would be stripping the executable of unneeded code.
 One trick I've seen done in a program which compiled some scripts to an
 intermediate language was zeroing the parts which are unused, then use
 some executable compressor.
I had an issue with a module in Tango back in the D1 days. It was a templated class that worked with "char", but when I instantiated it with "wchar" it didn't compile.

-- 
/Jacob Carlborg
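
The kind of problem described here can be reproduced in modern D with a sketch like the following. This is not the Tango code; bannerBuggy and banner are hypothetical names. One instantiation type-checks while another does not, and a unittest that loops over the character types makes sure every variant is actually compiled.

```d
import std.conv : to;
import std.meta : AliasSeq;

// Latent bug: the body only type-checks when T is char, because a
// string variable does not implicitly convert to wstring or dstring.
auto bannerBuggy(T)()
{
    string src = "hello";
    immutable(T)[] text = src;
    return text;
}

// One possible fix: transcode explicitly with std.conv.to.
immutable(T)[] banner(T)()
{
    string src = "hello";
    return src.to!(immutable(T)[]);
}

static assert( __traits(compiles, bannerBuggy!char()));
static assert(!__traits(compiles, bannerBuggy!wchar()));

unittest
{
    // Instantiate with every character type so that each variant is compiled.
    foreach (T; AliasSeq!(char, wchar, dchar))
        assert(banner!T().length == 5);
}
```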
May 12 2014
"Kagamin" <spam here.lot> writes:
You can write a tool, which will construct an amalgamation build 
of your code.
May 12 2014