digitalmars.D - On attribute inference...

Marco Leise <Marco.Leise gmx.de> writes:
Currently the compiler makes sure that it can see the entire
nested call chain when performing attribute inference. So it
limits itself to function literals and templates, where the
source has to be right there for them to compile.

The remaining functions could roughly be classified as
functions that could reside in an external lib with only the
headers and no source being available.

But even so, the external functions will encode their
attributes into the mangled name and linking is impossible
without knowing them. Where do we find this information? In
the .di files, where they must have been inferred by the
header generator.

So external functions without bodies declare their attributes
_explicitly_ and they are a de-facto part of a library's API
due to the mangling.

This means that the compiler must not infer attributes on
functions which could be part of a library API and here is why:
Imagine what would happen if it inferred @nogc and you didn't
realize that. If later on you change the code to allocate
something, @nogc is magically revoked and you have yourself an
API breakage!
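
A minimal sketch of that breakage, using a template function (where
the compiler already infers attributes today). The names scale,
scaleV2 and caller are made up for illustration and do not come from
any real library; the checks run at compile time.

```d
// Template function: its attributes are inferred from the body.
int scale()(int x)
{
    return 2 * x;
}

// An explicitly @nogc caller only compiles while `scale` is inferred @nogc.
@nogc int caller(int x)
{
    return scale(x);
}

// The same function after an edit that allocates with the GC.
// Nothing in the signature changed, but @nogc is no longer inferred.
int scaleV2()(int x)
{
    auto buf = new int[](1);   // GC allocation
    buf[0] = 2 * x;
    return buf[0];
}

// `scale` can be called from @nogc code, `scaleV2` cannot.
static assert( __traits(compiles, () @nogc { auto r = scale(1); }));
static assert(!__traits(compiles, () @nogc { auto r = scaleV2(1); }));
```

Swapping scaleV2 in for scale would stop caller from compiling, even
though no declaration visible to the user changed.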

Anyone who is arguing for more attribute inference must be
aware of this. Whatever the Dlang spec says about "function
body availability" is misleading, because the real motivation
is forcing people to be explicit about their public APIs.

No case makes the distinction between "function body
availability" and "API stability" more clear than auto return.
Even though their source code is copied verbatim into the .di
files to allow the return of voldemort types (i.e. types that
are defined inside the function returning it), their
attributes are *not* inferred, because they may be part of a
public API!

Let's get into that "public API" mindset. We need to use .di
files for libraries or else the compiler will transitively
analyze every imported .d file from your project and any used
libraries. Imagine how much source code a project like
LibreOffice and all dependent libraries comprises! That does
not scale. I'd also propose a visibility level above 'public',
to tag symbols that are exported from a library. All others
are not visible in .dll/.so files and removed during .di
generation. All functions that are not part of such "exported"
symbols can then have their attributes inferred.
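
A rough sketch of how a module might look under this proposal. It
uses the existing export keyword as a stand-in for the proposed
visibility level, which is an assumption; all function names are
hypothetical, and today's compiler does not infer attributes for the
non-exported functions shown here.

```d
// Exported symbol: part of the binary API, so its attributes are
// spelled out and stay fixed however the body changes.
export int apiVersion() @safe pure nothrow @nogc
{
    return 3;
}

// Public in source but not exported: under the proposal it would be
// hidden in the .dll/.so, dropped from the generated .di file, and
// become a candidate for attribute inference.
public int scaleByVersion(int x)
{
    return x * apiVersion();
}

// Private helper: never visible outside this module.
private int clampToByte(int x)
{
    return x < 0 ? 0 : (x > 255 ? 255 : x);
}

static assert(scaleByVersion(2) == 6);
static assert(clampToByte(300) == 255);
```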

Gains:
 + Attribute inference on all private API!
 + Faster compiles: .di files short circuit recursive imports!
 + Solves bugs!: https://issues.dlang.org/show_bug.cgi?id=9816

Ok, so who writes a DIP for this? Benjamin Thaut, Martin
Nowak and David Nadlinger: http://wiki.dlang.org/DIP45

-- 
Marco
Apr 18 2016
Jonathan M Davis via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Tuesday, April 19, 2016 03:27:46 Marco Leise via Digitalmars-d wrote:
 No case makes the distinction between "function body
 availability" and "API stability" more clear than auto return.
 Even though their source code is copied verbatim into the .di
 files to allow the return of voldemort types (i.e. types that
 are defined inside the function returning it), their
 attributes are *not* inferred, because they may be part of a
 public API!
Except that unfortunately, the compiler _does_ do attribute inference for auto return functions now, because the body is guaranteed to be available. Relying on that inference in a public API that's part of a library will easily lead to code breakage when the function implementation is changed (or even when a function that it calls is changed, if that affects its attributes).

Personally, I think that adding attribute inference for auto return functions was a mistake and just encourages bad practices, but unless Walter can be convinced that it was such a bad idea that it needs to be reverted (and break whatever code that would break), we're stuck with it. Regardless, I think that it's clear that if you want a stable API, you need to infer attributes as little as possible.
 I'd also propose a visibility level above 'public',
 to tag symbols that are exported from a library. All others
 are not visible in .dll/.so files and removed during .di
 generation. All functions that are not part of such "exported"
 symbols can then have their attributes inferred.
Isn't that basically what folks like Benjamin and Martin want the export keyword to do?

- Jonathan M Davis
Apr 18 2016
Walter Bright <newshound2 digitalmars.com> writes:
On 4/18/2016 8:20 PM, Jonathan M Davis via Digitalmars-d wrote:
 Except that unfortunately, the compiler _does_ do attribute inference for
 auto return functions now, because the body is guaranteed to be available.
 Relying on that inference in a public API that's part of a library will
 easily lead to code breakage when the function implementation is changed (or
 even when a function that it calls is changed, if that affects its
 attributes).

 Personally, I think that adding attribute inference for auto return
 functions was a mistake and just encourages bad practices, but unless Walter
 can be convinced that it was such a bad idea that it needs to be reverted
 (and break whatever code that would break), we're stuck with it.
 Regardless, I think that it's clear that if you want a stable API, you need
 to infer attributes as little as possible.
Auto function attribute inference is a good idea. Auto functions really aren't conceptually different from templates. Note that the mangling of an auto function will change if its attributes change, thus meaning you should get a link failure. Attribute inference is a huge win for D; it makes the mass of attributes manageable.
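
A small sketch of the point about auto functions, not code from this thread. The two functions are hypothetical and differ only in whether the body allocates; the function type, which feeds into the mangled name, differs as a result.

```d
// Neither function carries explicit attributes; both return int.
auto answer()
{
    return 42;                 // no allocation
}

auto answerViaGC()
{
    auto a = new int[](1);     // GC allocation, so @nogc is not inferred
    a[0] = 42;
    return a[0];
}

// Inference makes `answer` usable from @nogc code, but not `answerViaGC`.
static assert( __traits(compiles, () @nogc { auto r = answer(); }));
static assert(!__traits(compiles, () @nogc { auto r = answerViaGC(); }));

// The inferred attributes are part of the function type.
static assert(!is(typeof(&answer) == typeof(&answerViaGC)));

// Print the types and mangled names at compile time for comparison.
pragma(msg, typeof(&answer).stringof, "  ", answer.mangleof);
pragma(msg, typeof(&answerViaGC).stringof, "  ", answerViaGC.mangleof);
```

If a library shipped answer and later changed its body to match answerViaGC, code compiled against the old declaration would look for the old mangled name, which is where the link failure would come from.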
Apr 19 2016
Satoshi <satoshi gshost.eu> writes:
Why can't every public/protected method be exported by default?
When I am creating an so/dll, I want to export almost everything.
Methods that I don't want to export I should mark as private or
package or package(nameOfRootPackage).
Apr 19 2016
Jonathan M Davis via Digitalmars-d <digitalmars-d puremagic.com> writes:
On Tuesday, April 19, 2016 07:58:19 Satoshi via Digitalmars-d wrote:
 Why can't every public/protected method be exported by default?
 When I am creating an so/dll, I want to export almost everything.
 Methods that I don't want to export I should mark as private or
 package or package(nameOfRootPackage).
From the standpoint of simplicity, that's definitely nicer. With C/C++ and Linux, the default is that all symbols in a shared library are visible, whereas on Windows, only those which are explicitly exported are visible, and I always found having to deal with building on Windows a royal pain - especially when what I had done just worked on Linux.

However, there are some good arguments for not just exporting everything - particularly with regards to compilation efficiency. It's not that uncommon for a library to have a public API that everyone should see and use while having a bunch of functions that are used only internally and really shouldn't be exposed to users of the library. To some extent, package can be used to hide those, but the larger the library, the harder that gets. And if a library is very deep (i.e. its functions do a lot for you), then you're fairly quickly going to end up with functionality that ideally would be hidden, and putting everything in one module or package isn't always reasonable.

As it is, Phobos has std.internal, which is theoretically not supposed to be used outside of Phobos, but it's completely importable and usable by any code using Phobos. Having code like that restricted by export could be desirable.

I expect that hiding symbols which aren't public would certainly help a great deal in avoiding having a lot of unnecessary symbols in a compiled library, but it doesn't completely solve the problem. And when discussions on export have come up, some of the folks who are particularly knowledgeable on the subject have been _very_ much in favor of using export to restrict what is and isn't visible in the generated library rather than relying on public. They'll have to chime in to provide good details though. The main issue that I recall is the desire/need to reduce the number of symbols in order to make compiling/linking more efficient.

- Jonathan M Davis
Apr 19 2016
Satoshi <satoshi gshost.eu> writes:
On Tuesday, 19 April 2016 at 08:57:21 UTC, Jonathan M Davis wrote:
 On Tuesday, April 19, 2016 07:58:19 Satoshi via Digitalmars-d 
 wrote:
 Why can't every public/protected method be exported by
 default? When I am creating an so/dll, I want to export almost
 everything. Methods that I don't want to export I should mark
 as private or package or package(nameOfRootPackage).

 From the standpoint of simplicity, that's definitely nicer. With C/C++ and Linux, the default is that all symbols in a shared library are visible, whereas on Windows, only those which are explicitly exported are visible, and I always found having to deal with building on Windows a royal pain - especially when what I had done just worked on Linux.

 However, there are some good arguments for not just exporting everything - particularly with regards to compilation efficiency. It's not that uncommon for a library to have a public API that everyone should see and use while having a bunch of functions that are used only internally and really shouldn't be exposed to users of the library. To some extent, package can be used to hide those, but the larger the library, the harder that gets. And if a library is very deep (i.e. its functions do a lot for you), then you're fairly quickly going to end up with functionality that ideally would be hidden, and putting everything in one module or package isn't always reasonable.

 As it is, Phobos has std.internal, which is theoretically not supposed to be used outside of Phobos, but it's completely importable and usable by any code using Phobos. Having code like that restricted by export could be desirable.

 I expect that hiding symbols which aren't public would certainly help a great deal in avoiding having a lot of unnecessary symbols in a compiled library, but it doesn't completely solve the problem. And when discussions on export have come up, some of the folks who are particularly knowledgeable on the subject have been _very_ much in favor of using export to restrict what is and isn't visible in the generated library rather than relying on public. They'll have to chime in to provide good details though. The main issue that I recall is the desire/need to reduce the number of symbols in order to make compiling/linking more efficient.

 - Jonathan M Davis
But what about when I create two versions of the same library (a shared and a static one)? Public symbols will be accessible through the static library but inaccessible through the shared library. A library is one big package, and marking symbols as package(MyLib) is the same as marking them as public in a shared object.

e.g. I have MyLib and TestApp:

MyLib
 - math.d
 - foo.d
 - bar/test.d
TestApp
 - main.d

or subpackages like rikarin.core; rikarin.appkit; rikarin.base; rikarin.backend; etc. I can easily create internal methods through the package(rikarin) definition. And package symbols shouldn't be exported.
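
A minimal sketch of the package(rikarin) idea described above. The module name rikarin.core.util and both function names are invented for illustration; only the package(name) syntax itself is existing D.

```d
// Hypothetical module somewhere inside the rikarin package tree.
module rikarin.core.util;

// Usable from any module under rikarin.*, but not from application code.
package(rikarin) int internalChecksum(int x)
{
    return x ^ 0x5A;
}

// Ordinary public entry point built on the internal helper.
public int checksumTwice(int x)
{
    // XOR with the same mask twice gives back the input.
    return internalChecksum(internalChecksum(x));
}

static assert(checksumTwice(7) == 7);
```

A module such as rikarin.appkit could call internalChecksum, while TestApp's main.d could only reach checksumTwice.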
Apr 19 2016
Johannes Pfau <nospam example.com> writes:
On Tue, 19 Apr 2016 01:57:21 -0700,
Jonathan M Davis via Digitalmars-d
<digitalmars-d puremagic.com> wrote:

 I expect that hiding symbols which aren't public would certainly help
 a great deal in avoiding having a lot of unnecessary symbols in a
 compiled library, but it doesn't completely solve the problem. And
 when discussions on export have come up, some of the folks who are
 particularly knowledgeable on the subject have been _very_ much in
 favor of using export to restrict what is and isn't visible in the
 generated library rather than relying on public. They'll have to
 chime in to provide good details though. The main issue that I recall
 is the desire/need to reduce the number of symbols in order to make
 compiling/linking more efficient.
 
 - Jonathan M Davis
 
Not only compiling / linking: https://gcc.gnu.org/wiki/Visibility

* It very substantially improves load times of your DSO (Dynamic Shared Object). For example, a huge C++ template-based library which was tested (the TnFOX Boost.Python bindings library) now loads in eight seconds rather than over six minutes!

* It lets the optimiser produce better code. PLT indirections (when a function call or variable access must be looked up via the Global Offset Table such as in PIC code) can be completely avoided, thus substantially avoiding pipeline stalls on modern processors and thus much faster code. Furthermore when most of the symbols are bound locally, they can be safely elided (removed) completely through the entire DSO. This gives greater latitude especially to the inliner which no longer needs to keep an entry point around "just in case".

* It reduces the size of your DSO by 5-20%. ELF's exported symbol table format is quite a space hog, giving the complete mangled symbol name which with heavy template usage can average around 1000 bytes. C++ templates spew out a huge amount of symbols and a typical C++ library can easily surpass 30,000 symbols which is around 5-6Mb! Therefore if you cut out the 60-80% of unnecessary symbols, your DSO can be megabytes smaller!

* Much lower chance of symbol collision. The old woe of two libraries internally using the same symbol for different things is finally behind us with this patch. Hallelujah!
Apr 19 2016
Jacob Carlborg <doob me.com> writes:
On 2016-04-19 10:57, Jonathan M Davis via Digitalmars-d wrote:

 However, there are some good arguments for not just exporting everything -
 particularly with regards to compilation efficiency. It's not that uncommon
 for a library to have a public API that everyone should see and use while
 having a bunch of functions that are used only internally and really
 shouldn't be exposed to users of the library. To some extent, package can be
 used to hide those, but the larger the library, the harder that gets. And if
 a library is very deep (i.e. its functions do a lot for you), then you're
 fairly quickly going to end up with functionality that ideally would be
 hidden, and putting everything in one module or package isn't always
 reasonable.
Can't the new package(name) syntax be used to solve this? Although this doesn't work for virtual methods.

-- 
/Jacob Carlborg
Apr 19 2016