
digitalmars.D.learn - output minimal .di files?

reply "F i L" <witte2008 gmail.com> writes:
Given the code, test.d:

    import std.stdio;

    export void test()
    {
        writeln("Test");
    }

compiled with: # dmd -lib -H test.d
I end up with test.lib (good so far), and test.di:

    import std.stdio;

    export void test()
    {
    writeln("Test");
    }

wtf? Why is test() fully represented? I thought interface files 
were supposed to be minimal, interface-only structures (like C .h 
files). When I manually cut everything down to "export void 
test();" everything still works fine, so why is DMD spitting out 
the implementation?
Jan 15 2012
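A minimal sketch of the hand-trimmed interface file described in the post above. Only the declaration `export void test();` comes from the post; the consumer file name `main.d` and the build command in the comments are assumptions for illustration.

```d
// test.di -- hand-trimmed interface file (sketch).
// Only the declaration is kept; the compiled body lives in test.lib.
export void test();

// main.d -- hypothetical consumer, shown as a comment so that this
// block stays a single file:
//
//     import test;
//     void main() { test(); }
//
// It could then be built with something like: dmd main.d test.lib
```

The importer only needs the signature to type-check the call; the linker resolves the symbol from the library.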
next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Sunday, January 15, 2012 12:53:05 F i L wrote:
 Given the code, test.d:
 
     import std.stdio;
 
     export void test()
     {
         writeln("Test");
     }
 
 compiled with: # dmd -lib -H test.d
 I end up with test.lib (good so far), and test.di:
 
     import std.stdio;
 
     export void test()
     {
     writeln("Test");
     }
 
 wtf? why is test() fully represented? I thought interface files
 where suppose to be minimal, interface only structures (like C .h
 files). When I manually cut everything down to: "export void
 test();" everything still works fine, so why is DMD spitting out
 the implementation?

http://stackoverflow.com/questions/7720418/whats-not-in-an-interface-file

- Jonathan M Davis
Jan 15 2012
prev sibling next sibling parent "F i L" <witte2008 gmail.com> writes:
Jonathan M Davis wrote:
 http://stackoverflow.com/questions/7720418/whats-not-in-an-interface-file

I see. Thanks again, Jonathan. I know this has been said before, but these sorts of explanations really should be part of the documentation.
Jan 15 2012
prev sibling next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Sun, Jan 15, 2012 at 01:39:05PM +0100, F i L wrote:
 Jonathan M Davis wrote:
 http://stackoverflow.com/questions/7720418/whats-not-in-an-interface-file

 I see. Thanks again, Jonathan. I know this has been said before, but these sorts of explanations really should be part of the documentation.

Speaking of documentation, I notice that the std.uni documentation on d-programming-language.org is out of date (either that or it's incomplete). It's missing several important Unicode classification functions that I need; I've had to resort to reading the library source to find out what's available. (The library source does have the doc comments in place, so probably all that's needed is to regenerate the docs.)

I see that the site is maintained by Andrei, but I couldn't find his email address on his site (where he said it would be). How can I send a request for the docs to be updated?

T

--
Question authority. Don't ask why, just do it.
Jan 15 2012
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Sunday, January 15, 2012 11:16:51 H. S. Teoh wrote:
 On Sun, Jan 15, 2012 at 01:39:05PM +0100, F i L wrote:
 Jonathan M Davis wrote:
 http://stackoverflow.com/questions/7720418/whats-not-in-an-interface-file

 these sorts of explanations really should be part of the documentation.

 Speaking of documentation, I notice that the std.uni documentation on d-programming-language.org is out-of-date (either that or it's incomplete). It's missing several important unicode classification functions that I have need of; I've had to resort to reading the library source to find out what's available. (The library source does have the doc comments in place, so probably all that's needed is to re-generate the docs afresh.) I see that the site is maintained by Andrei but I couldn't find his email address on his site (where he said it would be). How can I send a request for the docs to be updated?

Put them in the normal Bugzilla: just enter a D bug with "websites" as its component.

d.puremagic.com/issues

- Jonathan M Davis
Jan 15 2012
prev sibling next sibling parent reply "Adam Wilson" <flyboynw gmail.com> writes:
On Sun, 15 Jan 2012 03:53:05 -0800, F i L <witte2008 gmail.com> wrote:

 Given the code, test.d:

     import std.stdio;

     export void test()
     {
         writeln("Test");
     }

 compiled with: # dmd -lib -H test.d
 I end up with test.lib (good so far), and test.di:

     import std.stdio;

     export void test()
     {
     writeln("Test");
     }

 wtf? why is test() fully represented? I thought interface files where  
 suppose to be minimal, interface only structures (like C .h files). When  
 I manually cut everything down to: "export void test();" everything  
 still works fine, so why is DMD spitting out the implementation?

I'm assuming that your goal is to build either static or dynamic libraries?

If that is the case, then you can assume that CTFE and inlining will not work anyway. This is an inherent limitation of libraries and not of D. What D currently does is assume that you want everything to work, so it spits out your implementation code symbol-for-symbol. The only thing I've found that D ever strips out of DI files is unittests.

I have written a patch for DMD that strips out non-template class/function implementations, with the understanding that CTFE and inlining will no longer work. Templated functions and classes retain their implementations; this is in line with the way C++ operates. Unfortunately my patch isn't well tested yet, so I haven't opened the pull request needed to get it included in the mainline DMD code. But it's available from my Git account [https://LightBender github.com/LightBender/dmd.git] if you don't mind building DMD yourself.

--
Adam Wilson
Project Coordinator
The Horizon Project
http://www.thehorizonproject.org/
Jan 16 2012
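A rough sketch of what an interface file could look like under the behaviour described in the post above (non-template bodies stripped, template bodies kept). This is not actual output of the patch; the file name `mylib.di` and the functions `add` and `twice` are hypothetical.

```d
// mylib.di -- hypothetical interface file, written by hand for illustration.

// Non-template function: only the signature remains. The body is expected
// to come from the compiled library at link time.
int add(int a, int b);

// Template function: the body is retained, because every importing module
// has to instantiate it for its own argument types.
T twice(T)(T x)
{
    return x + x;
}
```

With a file like this, a call to `add` can no longer be inlined or evaluated at compile time by the importer, while `twice` behaves exactly as it would with the full source.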
parent reply Alex Rønne Petersen <xtzgzorex gmail.com> writes:
On 16-01-2012 21:08, H. S. Teoh wrote:
 On Mon, Jan 16, 2012 at 11:38:15AM -0800, Adam Wilson wrote:
 [...]
 I would say the main reason for using .h/.di files in libraries is
 that the library designer does not want his implementation public
 viewable. And in D, unlike C/C++, .di files are pretty much exclusive
 to the concept of libraries. I'd say that, based on how many questions
 are raised about .di files, almost no one expects the current
 behavior, I certainly didn't, hence my patch. The DI generation patch
 currently implements the C++ paradigm, where templated function
 implementations are publicly viewable, but non-templated function
 implementations are not. I feel that this paradigm, being the
 currently accepted convention, is the best path for D to take.

 But if you remove function bodies from inline-able functions, then your library loses out on potential optimization by the compiler. Besides, all your templates are still world-readable, which, depending on what your library is, may pretty much comprise your entire library anyway. To *truly* have separation of API from implementation, interface files shouldn't even have templated functions. It should list ONLY public declarations, no private members, no function bodies, no template bodies, etc.. All function bodies, including inline functions, template bodies, private members, etc., should be in a binary format readable only by the compiler. One way to implement this is to store template/inline function bodies inside the precompiled object files as extra info that the compiler loads in order to be able to expand templates/inline functions, compute the size of structs/classes (because private members are not listed in the API file), and so on. How this is feasible to implement, I can't say; some platforms may not allow arbitrary data inside object files, so the compiler may not be able to store the requisite information in them.

 T

I... don't think the error messages from expanding raw object code would be very pleasant to read, if you used a template incorrectly...

--
- Alex
Jan 16 2012
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 01/16/2012 09:40 PM, H. S. Teoh wrote:
 On Mon, Jan 16, 2012 at 09:32:57PM +0100, Alex Rønne Petersen wrote:
 [...]
 I... don't think the error messages from expanding raw object code
 would be very pleasant to read, if you used a template incorrectly...

 It doesn't have to be *executable* object code; the compiler may store extra info (perhaps as debugging data?) so that it can generate nicer error messages. But like I said, this assumes the compiler is allowed to store arbitrary data inside object files, which may not be the case on some platforms.

 T

How would your proposal help hide implementation details of templates anyway? All information still needs to be stored. Anyone could write an object file -> source file compiler for template implementations.
Jan 16 2012
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Monday, January 16, 2012 00:14:02 Adam Wilson wrote:
 I'm assuming that your goal is to build either or static or dynamic
 libraries?
 
 If that is the case than you can assume that CTFE and inlining will not
 work anyways. This is an inherent limitation of libraries and not D. What
 D currently does is assume that you want everything to work, and spits out
 your implementation code symbol-for-symbol. The only thing I've found that
 D ever strips out of DI files is unittests. I have written a patch for DMD
 that strips out non-template class/function implementations with the
 understanding that CTFE and inlining will no longer work. Templated
 functions and classes retain their implementations, this is in line with
 the way C++ operates. Unfortunately my patch isn't well tested yet so I
 haven't opened the pull required to get it included into the main line DMD
 code. But it's a available from my Git account
 [https://LightBender github.com/LightBender/dmd.git] if you don't mind
 building DMD yourself.

Inlining and CTFE should work just fine as long as everything that you're trying to inline or use with CTFE is in the .di file. Sure, whatever you strip out of the .di file won't work with CTFE or inlining, but inlining and CTFE should work just fine with dynamic libraries, exactly as if you had the code in the .h file in C++. You just have to be willing to have it in the .di file.

And you _still_ get the benefits of a dynamic library, since the symbols don't get duplicated between programs which share the library. It's just that you still have to recompile everything that's in the .di file, so fewer symbols can be hidden (on Windows anyway; there is no symbol hiding in shared libraries on Linux). But you can definitely use inlining and CTFE with dynamic libraries.

- Jonathan M Davis
Jan 16 2012
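The point about CTFE needing the body can be illustrated with a small sketch. The function `square` and the constant `nine` are made up for this example and do not appear in the thread.

```d
import std.stdio;

// The body is visible here, so the compiler can evaluate calls to
// square at compile time (CTFE).
int square(int x)
{
    return x * x;
}

// Initializing an enum forces compile-time evaluation of square(3).
enum nine = square(3);
static assert(nine == 9);

// If an interface file declared only `int square(int x);`, the enum
// above would be rejected: CTFE has no source code to interpret.

void main()
{
    writeln(nine);
}
```

A plain runtime call such as `square(n)` would still compile and link against a body-less declaration; only compile-time evaluation and inlining depend on the body being present.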
prev sibling next sibling parent "Adam Wilson" <flyboynw gmail.com> writes:
On Mon, 16 Jan 2012 00:25:21 -0800, Jonathan M Davis <jmdavisProg gmx.com>  
wrote:

 On Monday, January 16, 2012 00:14:02 Adam Wilson wrote:
 I'm assuming that your goal is to build either or static or dynamic
 libraries?

 If that is the case than you can assume that CTFE and inlining will not
 work anyways. This is an inherent limitation of libraries and not D. What
 D currently does is assume that you want everything to work, and spits out
 your implementation code symbol-for-symbol. The only thing I've found that
 D ever strips out of DI files is unittests. I have written a patch for DMD
 that strips out non-template class/function implementations with the
 understanding that CTFE and inlining will no longer work. Templated
 functions and classes retain their implementations, this is in line with
 the way C++ operates. Unfortunately my patch isn't well tested yet so I
 haven't opened the pull required to get it included into the main line DMD
 code. But it's a available from my Git account
 [https://LightBender github.com/LightBender/dmd.git] if you don't mind
 building DMD yourself.

 Inlining and CTFE should work just fine as long as everything that you're trying to inline or use with CTFE is in the .di file. Sure, whatever you strip out of the .di file won't work with CTFE or inlining, but inlining and CTFE should work just fine with dynamic libraries, exactly like if you had stuff in the .h file in C++. You just have to be willing to have it in the .di file. And you _still_ get the benefits of a dynamic library, since the symbols don't get duplicated between programs which share the library. It's just that you still have to recompile everything that it's in the .di file, so less can have its symbol hidden (for Windows anyway - there is no symbol hiding in shared libraries in linux). But you can definitely using inlining and CTFE with dynamic libraries.

 - Jonathan m Davis

I would say the main reason for using .h/.di files in libraries is that the library designer does not want his implementation publicly viewable. And in D, unlike C/C++, .di files are pretty much exclusive to the concept of libraries. I'd say that, based on how many questions are raised about .di files, almost no one expects the current behavior. I certainly didn't, hence my patch.

The DI generation patch currently implements the C++ paradigm, where templated function implementations are publicly viewable but non-templated function implementations are not. I feel that this paradigm, being the currently accepted convention, is the best path for D to take.

--
Adam Wilson
Project Coordinator
The Horizon Project
http://www.thehorizonproject.org/
Jan 16 2012
prev sibling next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Mon, Jan 16, 2012 at 11:38:15AM -0800, Adam Wilson wrote:
[...]
 I would say the main reason for using .h/.di files in libraries is
 that the library designer does not want his implementation public
 viewable. And in D, unlike C/C++, .di files are pretty much exclusive
 to the concept of libraries. I'd say that, based on how many questions
 are raised about .di files, almost no one expects the current
 behavior, I certainly didn't, hence my patch. The DI generation patch
 currently implements the C++ paradigm, where templated function
 implementations are publicly viewable, but non-templated function
 implementations are not. I feel that this paradigm, being the
 currently accepted convention, is the best path for D to take.

But if you remove function bodies from inline-able functions, then your library loses out on potential optimization by the compiler. Besides, all your templates are still world-readable, which, depending on what your library is, may pretty much comprise your entire library anyway.

To *truly* have separation of API from implementation, interface files shouldn't even have templated functions. They should list ONLY public declarations: no private members, no function bodies, no template bodies, etc. All function bodies, including inline functions, template bodies, private members, etc., should be in a binary format readable only by the compiler.

One way to implement this is to store template/inline function bodies inside the precompiled object files as extra info that the compiler loads in order to be able to expand templates/inline functions, compute the size of structs/classes (because private members are not listed in the API file), and so on. How feasible this is to implement, I can't say; some platforms may not allow arbitrary data inside object files, so the compiler may not be able to store the requisite information in them.

T

--
First Rule of History: History doesn't repeat itself -- historians merely repeat each other.
Jan 16 2012
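The remark about struct sizes in the post above can be made concrete with a short sketch. The struct `Handle` and its fields are invented for illustration; the only claim is that a private member still contributes to the layout, which is why the compiler needs to know about it even if a reader of the API does not.

```d
import std.stdio;

struct Handle
{
    int id;               // part of the public API
    private long secret;  // implementation detail, still occupies storage
}

// The private field counts toward the size, so a compiler that only saw
// the public members could not compute Handle.sizeof correctly.
static assert(Handle.sizeof >= int.sizeof + long.sizeof);

void main()
{
    writeln(Handle.sizeof);
}
```

Code that declares a `Handle` by value must reserve storage for `secret` too, so this information has to reach the importing compiler somehow, whether through the interface file or through some other channel.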
prev sibling next sibling parent "Adam Wilson" <flyboynw gmail.com> writes:
On Mon, 16 Jan 2012 12:08:53 -0800, H. S. Teoh <hsteoh quickfur.ath.cx>  
wrote:

 On Mon, Jan 16, 2012 at 11:38:15AM -0800, Adam Wilson wrote:
 [...]
 I would say the main reason for using .h/.di files in libraries is
 that the library designer does not want his implementation public
 viewable. And in D, unlike C/C++, .di files are pretty much exclusive
 to the concept of libraries. I'd say that, based on how many questions
 are raised about .di files, almost no one expects the current
 behavior, I certainly didn't, hence my patch. The DI generation patch
 currently implements the C++ paradigm, where templated function
 implementations are publicly viewable, but non-templated function
 implementations are not. I feel that this paradigm, being the
 currently accepted convention, is the best path for D to take.

 But if you remove function bodies from inline-able functions, then your library loses out on potential optimization by the compiler. Besides, all your templates are still world-readable, which, depending on what your library is, may pretty much comprise your entire library anyway.

This is a VERY well known deal in the library community. Library writers expect that their functions won't be inlined and that all template functions will be public; they are quite comfortable making the trade-off. If you're making a closed-source library, you assume these things from the beginning. It's been that way since the dawn of time and shows no sign of changing, at least in the native compilation world.

The fact that DMD does NOT work with DI files as programmers coming from the C/C++ world would expect has caused more confusion about DI files than any other subject. It certainly confused me. Hence the patch. DMD needs to offer as seamless a transition as possible, and frankly this area of it stinks. No programmer coming to D expects an "include" file to include all implementations by default. This has actually been a subject of pain in a number of rants by ex-D programmers who ragequit D. It's not an include file if it's got all the implementations; it's a source file. Right now, the ONLY difference between .d and .di is that .di strips out unittests. That's not an include file by any relevant definition of the term.

 To *truly* have separation of API from implementation, interface files
 shouldn't even have templated functions. It should list ONLY public
 declarations, no private members, no function bodies, no template
 bodies, etc..  All function bodies, including inline functions, template
 bodies, private members, etc., should be in a binary format readable
 only by the compiler.

That's an API design decision and therefore best left to the library writers; it is NOT D's job to enforce that opinion.

 One way to implement this is to store template/inline function bodies
 inside the precompiled object files as extra info that the compiler
 loads in order to be able to expand templates/inline functions, compute
 the size of structs/classes (because private members are not listed in
 the API file), and so on. How this is feasible to implement, I can't
 say; some platforms may not allow arbitrary data inside object files, so
 the compiler may not be able to store the requisite information in them.


 T

Not a bad idea; it's similar in function to .NET's metadata. Unfortunately, to be useful, other linkers would have to be taught how to read that data...

--
Adam Wilson
Project Coordinator
The Horizon Project
http://www.thehorizonproject.org/
Jan 16 2012
prev sibling next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Mon, Jan 16, 2012 at 09:32:57PM +0100, Alex Rønne Petersen wrote:
[...]
 I... don't think the error messages from expanding raw object code
 would be very pleasant to read, if you used a template incorrectly...

It doesn't have to be *executable* object code; the compiler may store extra info (perhaps as debugging data?) so that it can generate nicer error messages. But like I said, this assumes the compiler is allowed to store arbitrary data inside object files, which may not be the case on some platforms.

T

--
Which is worse: ignorance or apathy? Who knows? Who cares? -- Erich Schubert
Jan 16 2012
prev sibling next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Mon, Jan 16, 2012 at 12:32:01PM -0800, Adam Wilson wrote:
 On Mon, 16 Jan 2012 12:08:53 -0800, H. S. Teoh
 <hsteoh quickfur.ath.cx> wrote:

 One way to implement this is to store template/inline function bodies
 inside the precompiled object files as extra info that the compiler
 loads in order to be able to expand templates/inline functions,
 compute the size of structs/classes (because private members are not
 listed in the API file), and so on. How this is feasible to
 implement, I can't say; some platforms may not allow arbitrary data
 inside object files, so the compiler may not be able to store the
 requisite information in them.


 Not a bad idea, it's similar in function to .NET's Metadata.
 unfortunately to be useful, other linkers would have to be taught how
 to read that data...

That depends on how you do it. If we assume, for argument's sake, that we are allowed to store arbitrary data inside an object file (say inside a debug section or something), then the compiler could for example store things like parsed function bodies, partial syntax trees, etc., that allow it to treat templates and inline functions as though they actually were embedded in the interface file.

When asked to compile a source file that imports the library, the compiler would read the object file, extract this info and use it to do whatever it needs to do (expand templates, inline functions, emit non-inlined function bodies, etc.). The generated object file can then be linked by the system's usual linker, assuming that the extra info in the library's object file is marked such that the linker simply ignores it. The linker doesn't have to know anything about it, because the compiler has already done whatever needs to be done with it when it compiled the source file that imported the library.

In fact, if a particular platform doesn't support such extra data inside object files, the compiler can simply save the data in its own internal format in another file, and the library writer just ships this file along with the human-readable API file and any precompiled library object files. As long as the customer's compiler knows to look for this file when compiling source that imports the library, it will have enough info to do what it needs to do.

T

--
Everybody talks about it, but nobody does anything about it! -- Mark Twain
Jan 16 2012
prev sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Tue, Jan 17, 2012 at 12:17:18AM +0100, Timon Gehr wrote:
 On 01/16/2012 09:40 PM, H. S. Teoh wrote:
 On Mon, Jan 16, 2012 at 09:32:57PM +0100, Alex Rønne Petersen wrote:
 [...]
 I... don't think the error messages from expanding raw object code
 would be very pleasant to read, if you used a template incorrectly...

 It doesn't have to be *executable* object code; the compiler may store extra info (perhaps as debugging data?) so that it can generate nicer error messages. But like I said, this assumes the compiler is allowed to store arbitrary data inside object files, which may not be the case on some platforms. T

 How would your proposal help hiding implementation details of templates anyway? All information still needs to be stored. Anyone could write a object file -> source file compiler for template implementations.

It depends on what you understand by "hiding implementation details". I see it from the point of view of encapsulation taken to its logical conclusion: users of the library only need to know the API of the library and nothing else. When they read the interface file, it should only contain the API and no implementation at all. In this sense, implementation details are completely hidden -- that's what encapsulation is all about.

Of course, the *compiler* (and/or linker) obviously needs to know the full implementation details, otherwise it couldn't possibly produce the final executable. So this information has to come from somewhere: object files, compiler internal representation files, what-have-you. I don't think it's possible, even in theory, to prevent reverse-engineering of these sources of compiler information. Given enough determination, *anything* can be reverse engineered.

And neither is prevention of reverse engineering the point. The point is that these are sources of *compiler* information, rather than information for library users. In that sense it's useful to separate information meant for the compiler from information meant for users of the library.

T

--
All problems are easy in retrospect.
Jan 16 2012