
D.gnu - abi specs, multiple linkages, binary symbol information

reply Jakob Praher <jpraher yahoo.de> writes:
hi David,
hi all,

I like the D language. Since I also play with gcj (the static gcc java
compiler), which has a new ABI (additional to the c++ linkage), I was
wondering about the default D ABI:

* how classes/modules/functions/methods are mangled
* which type codes exist
* is there a way to describe any type using a type code (which is
probably needed for method overloading )

* is there a way to specify versioning in the ABI
* since D has its own linkage (as opposed to C++ linkage) I would
appreciate a less-is-more approach and a more stable ABI like that of C++


Yes, I looked at DMD, but I would be pleased to know whether there is
a written spec (the language reference is quite quiet about that).

What would be interesting is to support many different types of
ABIs/Linkages.
This could be done by "helping" the compiler to understand the ABI that
one is using.

And: as the language is specified today, is there a way to do
load-time linking?

I would be interested to link GCJ shared objects against D in a very
native form, so that one could use for instance the many java libs
already developed.

for instance

 gcj import org.apache.xalan...TransformerImpl;
 gcj import java.lang.String;

int main( char[][] args ) {
	TransformerImpl impl = new TransformerImpl( );
	....

}
....


Plus: In order for instance to export a D class to be linked with GCJ,
one clearly needs more meta information exposed in the object file, or
distilled from the D sources.

For me I'd favor the first approach, which would be interesting, since
one could link against a D object file without the need of the
corresponding D source code.

The metadata approach used by GCJ is very straightforward:

* There is a UTF8 table
* There is a Method table for each method of a class
* There are some other tables for Class Descriptors ...
* The method table contains also all the referenced methods (not only
the ones defined)
* There is a Class table for each class (which contains links to the
other tables)

	- vtable  (the class's methods)
these tables are used for the Java binary compatibility stuff:
	- otable  (offset table for referenced objects by an offset)
	- atable  (address table for referenced objects via address)
	- itable  (interface table)
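
To make the offset-table idea concrete, here is a minimal sketch in D (current D, not the 2004 dialect). It is not GCJ's actual layout: `PointV2`, `readIntField` and the table contents are hypothetical names chosen for illustration. The only point is that the compiled code indexes a table that is filled in at load time, instead of baking a field offset into the instruction stream.

```d
import std.stdio : writeln;

/// Hypothetical library class whose layout may change between versions.
struct PointV2
{
    long header;
    int x;
    int y;
}

/// Read an int field through an offset table instead of a constant offset.
/// `otable` stands in for a table the loader would patch at link/load time.
int readIntField(const(ubyte)* object, const size_t[] otable, size_t slot)
{
    return *cast(const(int)*)(object + otable[slot]);
}

void main()
{
    // The "loader" fills the table from the layout it actually finds.
    const size_t[] otable = [PointV2.x.offsetof, PointV2.y.offsetof];

    auto p = PointV2(0, 3, 4);
    auto raw = cast(const(ubyte)*)&p;

    writeln(readIntField(raw, otable, 0), " ", readIntField(raw, otable, 1)); // prints "3 4"
}
```

If the library later inserts a field before `x`, only the table contents change; the caller's code stays the same, which is the binary-compatibility property these tables are meant to give.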


Surely the simplicity of the Java type system allows for a simple
implementation of that. D would need some more meta information
(modules, functions, custom types, ...).

But it would be an interesting task, since binary compatibility in D
would then be more stable. And the interoperability between the gcj project
and D could also be interesting.

looking forward to some discussions

-- Jakob
Oct 18 2004
parent reply David Friedman <d3rdclsmail_a t_earthlink_d.t_net> writes:
Jakob Praher wrote:
 hi David,
 hi all,
 
 I like the D language. Since I also play with gcj (the static gcc java
 compiler), which has a new ABI (additional to the c++ linkage), I was
 wondering about the default D ABI:
 
 * how classes/modules/functions/methods are mangled
 * which type codes exist
 * is there a way to describe any type using a type code (which is
 probably needed for method overloading )

There is no formal spec, so your best bet is to check out mangle.c and mtype.c in the front-end source. Basically, functions and variables have the form:

    "_D" ~ <namespace mangle> ~ <type mangle>

<namespace> is formed from the package, module, aggregates, etc. down to the declaration's identifier. The following would have the same namespace mangle:

    module a; class b { class c { int i; } }
    module a.b; class c { idouble i; }

<type> encodes the type of the declaration and may contain more namespace mangling if it involves classes, etc.
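
The form described above can be sketched in a few lines of D. This is an illustration only, not the code in mangle.c: the length-prefixed encoding of each identifier and the single-letter type codes (`i` for int, `p` for idouble) are assumptions of this sketch, and `namespaceMangle` and `mangleSymbol` are hypothetical helper names.

```d
import std.array : split;
import std.conv : to;
import std.stdio : writeln;

/// Length-prefix each identifier of a dotted path: "a.b.c.i" -> "1a1b1c1i".
/// (The length-prefix encoding is an assumption of this sketch.)
string namespaceMangle(string qualifiedName)
{
    string result;
    foreach (part; qualifiedName.split("."))
        result ~= to!string(part.length) ~ part;
    return result;
}

/// "_D" ~ <namespace mangle> ~ <type mangle>
string mangleSymbol(string qualifiedName, string typeMangle)
{
    return "_D" ~ namespaceMangle(qualifiedName) ~ typeMangle;
}

void main()
{
    // module a; class b { class c { int i; } }
    writeln(mangleSymbol("a.b.c.i", "i")); // prints "_D1a1b1c1ii"

    // module a.b; class c { idouble i; }
    writeln(mangleSymbol("a.b.c.i", "p")); // prints "_D1a1b1c1ip"
}
```

Both declarations flatten to the same dotted path `a.b.c.i`, so the namespace part is identical and only the trailing type code tells the two symbols apart.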
 
 * is there a way to specifiy versioning in the ABI

I don't think there is a way to do this now.
 * since D has its own linkage (opposed to C++ linkage) I would
 appreciate a less is more approach and a more stable ABI like that of C++
 
 
 Yes I looked at DMD but I thought, it would be pleased to know there is
 a written spec (the language reference is quite quiet about that).
 
 What would be interesting is to support many different types of
 ABIs/Linkages.
 This could be done by "helping" the compiler to understand the ABI that
 one is using.
 
 And: As the language is specified today, is there a way to do a load
 time linking?
 

Unix-style shared libraries are somewhat working now. There are still some initialization and linking issues. I can't really speak to the Windows platform.
 I would be interested to link GCJ shared objects against D in a very
 native form, so that one could use for instance the many java libs
 already developed.
 
 for instance
 
  gcj import org.apache.xalan...TransformerImpl;
  gcj import java.lang.String;
 
 int main( char[][] args ) {
     TransformerImpl impl = new TransformerImpl( );
     ....
 
 }
 ....
 

 Plus: In order for instance to export a D class to be linked with GCJ,
 one clearly needs more meta information exposed in the object file, or
 distilled from the D sources.

 For me I'd favor the first approach, which would be interesting, since
 one could link against a D object file without the need of the
 corresponding D source code.

I have been thinking of doing something along these lines (with Objective C!). In order to directly use another object ABI, it would be necessary to introduce a new basic type into D. The capabilities of D and Java objects are similar, but they are not binary compatible. Consider:

    Object o = someJavaObject;
    o.toString(); // Java String (another Object) or D char[] (a two-element struct)?

If it was important to have Java objects pose as D objects (and vice versa), it would be necessary to use wrappers and/or glue code. I think this could be mostly transparent.

There are still more issues, like synchronization and garbage collection, that would need to be worked out.

An alternative method would be to implement D completely with the GCJ ABI. In this case, however, the ABI would have to be extended to D types like dynamic arrays and structs.
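
A rough sketch of what such a wrapper could look like, written in current D. Everything here is hypothetical: `JavaString`, `JavaObject`, `FakeJavaObject` and `JavaObjectWrapper` are stand-ins invented for the example, and a real bridge would call into the GCJ runtime rather than a D interface. It only shows the glue step, converting the foreign string representation into a D string inside `toString`.

```d
import std.conv : to;
import std.stdio : writeln;

/// Hypothetical stand-in for a Java string: a run of UTF-16 code units.
struct JavaString
{
    wstring units;
}

/// Hypothetical stand-in for a reference to a foreign Java object.
interface JavaObject
{
    JavaString javaToString();
}

/// Fake "Java" object so the sketch can run without a Java runtime.
class FakeJavaObject : JavaObject
{
    JavaString javaToString() { return JavaString("TransformerImpl"w); }
}

/// D-side wrapper: lets the foreign object pose as a D Object.
class JavaObjectWrapper
{
    private JavaObject peer;

    this(JavaObject peer) { this.peer = peer; }

    /// Glue code: transcode the UTF-16 Java string into a D UTF-8 string.
    override string toString()
    {
        return to!string(peer.javaToString().units);
    }
}

void main()
{
    Object o = new JavaObjectWrapper(new FakeJavaObject);
    writeln(o.toString()); // prints "TransformerImpl"
}
```

The caller only ever sees a D `Object`, which is the "mostly transparent" part; synchronization and garbage collection are not addressed by this sketch.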
 The metadata approach used by GCJ is very straigt forward:
 
 * There is an UTF8 table
 * There is a Method table for each method of a class
 * There are some other tables for Class Descriptors ...
 * The method table contains also all the referenced methods (not only
 the ones defined)
 * There is a Class table for each class (which contains links to the
 other tables)
 
     - vtable  (the class's methods)
 these tables are used for the java binary compatiblity stuff:
     - otable  (offset table for referenced objects by an offset)
     - atable  (address table for referenced objects via address)
     - itable  (interface table)
 
 
 Surely the simplicity of the java type system allows for a simple
 implementation of that. D would need some more meta information
 (modules, functions, custom types. .... )
 
 But it would be an interesting task, since then binary compatiblity in D
 would be more stable. And the interoperabilty between the gcj project
 and D could also be interesting.
 

It really would be nice to have this kind of binary compatibility as D DLLs/shared libraries become more widespread. The nice thing about it is that using the tables could be optional if you wanted to maximize performance.

David
 looking forward to some discussions
 
 -- Jakob

Oct 19 2004
parent reply Jakob Praher <jpraher yahoo.de> writes:
David Friedman wrote:
 Jakob Praher wrote:
 
 hi David,
 hi all,

 I like the D language. Since I also play with gcj (the static gcc java
 compiler), which has a new ABI (additional to the c++ linkage), I was
 wondering about the default D ABI:

 * how classes/modules/functions/methods are mangled
 * which type codes exist
 * is there a way to describe any type using a type code (which is
 probably needed for method overloading )

There is no formal spec, so your best bet is to check out mangle.c and mtype.c in the front-end source. Basically, functions and variables have the form: "_D" ~ <namespace mangle> ~ <type mangle>

~ is a concatenation right?
 
 <namepsace> is formed from the package, module, aggregates, etc. down to 
 the declaration's identifier. The following would have the same 
 namespace mangle:
 
   module a; class b { class c { int i; } }
   module a.b; class c { idouble i; }
 
 <type> encodes the type of the declaration and may contain more 
 namespace mangling if it involves classes, etc.

ok. will look into that. so you have

* packages
* modules
* classes

what is the difference between a package and a module? I have heard that modules can have initializers? Are they somewhat like static classes?

are packages ever used now?
 
 * is there a way to specifiy versioning in the ABI

I don't think there is a way to do this now.

hmm. this is probably not that easy. but on the other hand one could do
 
 * since D has its own linkage (opposed to C++ linkage) I would
 appreciate a less is more approach and a more stable ABI like that of C++


 Yes I looked at DMD but I thought, it would be pleased to know there is
 a written spec (the language reference is quite quiet about that).

 What would be interesting is to support many different types of
 ABIs/Linkages.
 This could be done by "helping" the compiler to understand the ABI that
 one is using.

 And: As the language is specified today, is there a way to do a load
 time linking?

Unix-style shared libraries are somewhat working now. There are still some initialization and linking issues. I can't really speak to the Windows platform.

 I have been thinking of doing something along these lines (with 
 Objective C!)  In order to directly use another object ABI, it would be 
 necessary to introduce a new basic type into D.  The capabilities of D 
 and Java objects are similar, but they are not binary compatible.  
 Consider:
 
   Object o = someJavaObject;
   o.toString(); // Java String (another Object) or D char[] (a 
 two-element struct) ?

 
 If it was important to have Java object pose as D objects (and 
 vice-versa), it would be necessary to use wrappers and/or glue code.  I 
 think this could be mostly transparent.
 
 There are still more issues like synchronization, and garbage collection 
 that would need to be worked out.
 
 An alternative method would be into implement D completely with the GCJ 
  ABI.  In this case, however, the ABI would have to be extended to D 
 types like dynamic arrays and structs.
 

What I'll probably do in my next spare time is define a concrete spec of the requirements for these table linkages, and then perhaps we can settle on a mangling structure that is Java-compatible but also satisfies the D requirements.

Objective C also has a kind of type mangling, and structures are mangled using {<members>}, which could be a way to define them....

    V                // void
    I                // int
    ...
    L...;            // java class
    [<type>;         // java array
    +----------------+
    {<type><type>;   // structure
    *<type>;         // pointer

or something like that ...

Value types have to be mangled too, perhaps using the structure information. This would make {I} a value type that is just an int32, and {*I;} a struct of a pointer to int32, etc.

We would of course need a way to introduce new built-in types, which could be done using the _; stuff; for instance _uint16; would mean unsigned int 16, or something like that.

I have also looked into the .NET stuff and found out that they use non-symbolic type information, but whether that's better is a question of taste ...

What would also be great is a pointer-free representation of all the exported metadata of a D compilation unit, for instance like a constant pool of items:

    Item = {
        byte type;
        int size;
    }

    DCompilationUnit = {
        byte type;
        int size;
        int majorVersion;
        int minorVersion;
        int PackageRef;
    }

    PackageDesc = {
        byte type;
        int size;
        int moduleCount;
        int ModuleRefId;
        ..
    }

    ModuleDesc = {
        byte type;
        int size;
        int classCount;
        int varCount;
    }

    ClassDesc = {
        byte type;
        int size;
    }
    ...

This could be placed in a special section of the relocatable object file, for instance the .metadata section or something like that, which can be loaded read-only (in the .text section, and can be mmapped directly).

With that information we could use an extraction tool that reads this metadata and passes it to the compiler ...

Although I have a bit of ELF and linking knowledge, I am by no means an expert. So if anyone has better ideas for laying out this stuff in object files, please let me know.

Jakob
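
As a small illustration of why a pointer-free pool is easy to consume, here is a sketch in current D that walks a byte buffer made of `{ byte type, int size }` items. The details are assumptions of the sketch, not part of any existing format: the header is packed into 5 bytes, `size` is little-endian and counts the whole item including its header, and the type values 1 and 2 are arbitrary.

```d
import std.stdio : writeln;

/// Item header as proposed: { byte type, int size }.
struct ItemHeader
{
    ubyte type;
    uint size; // assumed: total item size in bytes, header included
}

enum size_t headerBytes = 5; // 1 byte type + 4 bytes size, packed

/// Decode a little-endian 32-bit value (byte order is an assumption).
uint readU32(const(ubyte)[] b)
{
    return b[0] | (b[1] << 8) | (b[2] << 16) | (cast(uint) b[3] << 24);
}

/// Walk the pool using only sizes, no pointers or relocations needed.
ItemHeader[] walkPool(const(ubyte)[] pool)
{
    ItemHeader[] items;
    size_t pos = 0;
    while (pos + headerBytes <= pool.length)
    {
        auto h = ItemHeader(pool[pos], readU32(pool[pos + 1 .. pos + 5]));
        if (h.size < headerBytes || pos + h.size > pool.length)
            break; // malformed item: stop rather than read out of bounds
        items ~= h;
        pos += h.size;
    }
    return items;
}

void main()
{
    // Two items: type 1 with 4 payload bytes, then type 2 with no payload.
    ubyte[] pool = [1, 9, 0, 0, 0, 0xAA, 0xBB, 0xCC, 0xDD,
                    2, 5, 0, 0, 0];
    foreach (item; walkPool(pool))
        writeln("type ", item.type, ", size ", item.size);
}
```

Because every reference is an index or a size rather than an address, the same bytes can be read straight from a read-only mapping of the object file, which is what makes the mmap idea workable.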
Oct 19 2004
parent David Friedman <d3rdclsmail_a t_earthlink_d.t_net> writes:
Jakob Praher wrote:
 David Friedman wrote:
 
 Jakob Praher wrote:


 There is no formal spec, so your best bet is to check out mangle.c and 
 mtype.c in the front-end source. Basically, functions and variables 
 have the form:

 "_D" ~ <namespace mangle> ~ <type mangle>

~ is a concatenation right?

Have to use '~' for concat on a D forum ;) [snip]
 
 ok. will look into that.
 so you have
 * packages
 * modules
 * classes
 
 what is the difference between a package and a module?
     I have heard that modules can have initializers?
     Are they somewhat like static classes?
 
 are packages every used now?
 
 

Packages are just the names/directories containing modules (e.g. std.c)
 * is there a way to specifiy versioning in the ABI

I don't think there is a way to do this now.

hmm. this is probably no that easy. but on the other hand one could do
 * since D has its own linkage (opposed to C++ linkage) I would
 appreciate a less is more approach and a more stable ABI like that of 
 C++


 Yes I looked at DMD but I thought, it would be pleased to know there is
 a written spec (the language reference is quite quiet about that).

 What would be interesting is to support many different types of
 ABIs/Linkages.
 This could be done by "helping" the compiler to understand the ABI that
 one is using.

 And: As the language is specified today, is there a way to do a load
 time linking?

Unix-style shared libraries are somewhat working now. There are still some initialization and linking issues. I can't really speak to the Windows platform.

 I have been thinking of doing something along these lines (with 
 Objective C!)  In order to directly use another object ABI, it would 
 be necessary to introduce a new basic type into D.  The capabilities 
 of D and Java objects are similar, but they are not binary 
 compatible.  Consider:

   Object o = someJavaObject;
   o.toString(); // Java String (another Object) or D char[] (a 
 two-element struct) ?

 If it was important to have Java object pose as D objects (and 
 vice-versa), it would be necessary to use wrappers and/or glue code.  
 I think this could be mostly transparent.

 There are still more issues like synchronization, and garbage 
 collection that would need to be worked out.

 An alternative method would be into implement D completely with the 
 GCJ  ABI.  In this case, however, the ABI would have to be extended to 
 D types like dynamic arrays and structs.


Oct 20 2004