www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Are virtual tables really required?

reply parabolis <parabolis_member pathlink.com> writes:
Why would a language with only single inheritence need virtual tables?

Given the following code:

class Object
{  void print()
char[] toString()
uint toHash()
int cmp()
}

class DerivedEx : Object
{          void someFunc1()
override uint toHash()
override  int cmp()
void someFunc2()
}

And assuming a heap representation of a class is something like this:
(which is very close to class ClassInfo in object.d)

struct ClassInfo
{   void* superClassPtr;
uint  funcPtrTableLength;
void* funcPtrTable[0];
.
void* funcPtrTable[funcPtrTableLength - 1];
.  (other info)
}

Then the Object class on the Heap will be:

struct ClassInfo
{   void* superClassPtr = 0;
uint  funcPtrTableLength = 4;
void* funcPtrTable[0] = addrOf(print);
void* funcPtrTable[1] = addrOf(toString);
void* funcPtrTable[2] = addrOf(toHash);
void* funcPtrTable[3] = addrOf(cmp);
.  (other info)
}

and DerivedEx class will be represented in the Heap:

struct ClassInfo
{   void* superClassPtr = addrOf(Object ClassInfo);
uint  funcPtrTableLength = 6;
void* funcPtrTable[0] = addrOf(print);
void* funcPtrTable[1] = addrOf(toString);
void* funcPtrTable[2] = addrOf(DerivedEx.toHash);
void* funcPtrTable[3] = addrOf(DerivedEx.cmp);
void* funcPtrTable[4] = addrOf(someFunc1);
void* funcPtrTable[5] = addrOf(someFunc2);
.  (other info)
}

In this example the funcPtrTable for a Derived Class is first populated by the
contents of its super class. Then any function that overriddes another simply
replaces the appropriate slot in the funcPtrTable. Finally the table is expanded
to hold any other functions defined by the Derived Class. An abstract class
which does not implement a function would allocate space for it but set it to 0.
Jul 24 2004
next sibling parent Sha Chancellor <schancel pacific.net> writes:
In article <cdt5qd$lp6$1 digitaldaemon.com>,
 parabolis <parabolis_member pathlink.com> wrote:

 Why would a language with only single inheritence need virtual tables?
 
 Given the following code:
 
 class Object
 {  void print()
 char[] toString()
 uint toHash()
 int cmp()
 }
 
 class DerivedEx : Object
 {          void someFunc1()
 override uint toHash()
 override  int cmp()
 void someFunc2()
 }
 
 And assuming a heap representation of a class is something like this:
 (which is very close to class ClassInfo in object.d)
 
 struct ClassInfo
 {   void* superClassPtr;
 uint  funcPtrTableLength;
 void* funcPtrTable[0];
 .
 void* funcPtrTable[funcPtrTableLength - 1];
 .  (other info)
 }
 
 Then the Object class on the Heap will be:
 
 struct ClassInfo
 {   void* superClassPtr = 0;
 uint  funcPtrTableLength = 4;
 void* funcPtrTable[0] = addrOf(print);
 void* funcPtrTable[1] = addrOf(toString);
 void* funcPtrTable[2] = addrOf(toHash);
 void* funcPtrTable[3] = addrOf(cmp);
 .  (other info)
 }
 
 and DerivedEx class will be represented in the Heap:
 
 struct ClassInfo
 {   void* superClassPtr = addrOf(Object ClassInfo);
 uint  funcPtrTableLength = 6;
 void* funcPtrTable[0] = addrOf(print);
 void* funcPtrTable[1] = addrOf(toString);
 void* funcPtrTable[2] = addrOf(DerivedEx.toHash);
 void* funcPtrTable[3] = addrOf(DerivedEx.cmp);
 void* funcPtrTable[4] = addrOf(someFunc1);
 void* funcPtrTable[5] = addrOf(someFunc2);
 .  (other info)
 }
 
 In this example the funcPtrTable for a Derived Class is first populated by 
 the
 contents of its super class. Then any function that overriddes another simply
 replaces the appropriate slot in the funcPtrTable. Finally the table is 
 expanded
 to hold any other functions defined by the Derived Class. An abstract class
 which does not implement a function would allocate space for it but set it to 
 0.
So that it calls the correct function at runtime when the compile time type is Object and the runtime type is DerivedEx. I suppose you could make the argument that you would only need one "ClassInfo" for each class though and just have a pointer to it for any instances of the class. I'm not sure how D handles that though. But,they do certainly need virtual tables for polymorphism. I guess how many is the question.
Jul 24 2004
prev sibling next sibling parent reply Ilya Minkov <minkov cs.tum.edu> writes:
parabolis schrieb:

 Why would a language with only single inheritence need virtual tables?
And why would a language with multiple inheritance need virtual tables?
 And assuming a heap representation of a class is something like this:
 (which is very close to class ClassInfo in object.d)
Your assumption is wrong. The heap representation of a class consists of * VTable pointer * Monitor slot (used by synchonization primitives) * Fields of the class (not functions!) VTable is a function pointer table, and is stored in a program-global constant area. This saves memory requiered per class, memory requiered per process (if multiple processes are from the same executable), and initialization time per class. Derived classes replace the VTable pointer by a pointer to a compatible but extended VTable, abd extend the fields area, so a derived class is binary compatible to its superclass. The executaion time penalty of virtual functions comes mainly from the fact that a function is called by a pointer fetched at runtime, so the CPU cannot predict and pre-translate it into the microcode. Though dereferencing an extra VTable pointer does cost a portion of an extra cycle, it's nearly nothing compared to the general function call overhead when it is called by pointer, so it doesn't really matter. In the total, a separate VTable performs better than if the pointers were placed in a class.
 In this example the funcPtrTable for a Derived Class is first populated by the
 contents of its super class. Then any function that overriddes another simply
 replaces the appropriate slot in the funcPtrTable. Finally the table is
expanded
 to hold any other functions defined by the Derived Class. An abstract class
 which does not implement a function would allocate space for it but set it to
0.
What you describe is how virtual function tables work... so i don't see the point of your post, you just described how to replace virtual function tables by less efficient virtual function tables. Perhaps i misunderstood something, so please reformulate your question. I would also like to point out that ClassInfo is hidden in the VTable as well. -eye
Jul 25 2004
parent reply parabolis <parabolis_member pathlink.com> writes:
Thank you for your verbose reply. It has helped me learn quite a bit!

In article <ce08tl$26h1$1 digitaldaemon.com>, Ilya Minkov says...
parabolis schrieb:

 And assuming a heap representation of a class is something like this:
 (which is very close to class ClassInfo in object.d)
Your assumption is wrong. The heap representation of a class consists of
I think perhaps I should have been more specific. ClassInfo represents static class data as opposed to an instantiated object, which would have a reference to ClassInfo and the instance's data (including monitor). I am sorry for not making this more clear.
What you describe is how virtual function tables work... so i don't see 
the point of your post, you just described how to replace virtual 
function tables by less efficient virtual function tables.

Perhaps i misunderstood something, so please reformulate your question.
My question is whether an inheritance walk is ever a part of calling a function. I see no reason why calling a function at runtime should ever be more involved than fetching the function's address from a static table. However I am under the (incorrect?) impression that virtual function tables occasionally require a lookup in a super class. The point of my post was to figure out whether this is actually the case for single inheritance languages (like D).
 Why would a language with only single inheritence need virtual tables?
And why would a language with multiple inheritance need virtual tables?
In my DerivedEx example (from previous post) any class that extends object (which is in fact every class), the first 4 function table entries for the class are always: ---------------- table[0] print table[1] toString table[2] toHash table[3] cmp ---------------- So code in a derived class that calls print() would simply call table[0]. I believe that a language with multiple inheritance (like C++) would not be able to make any similar assumptions about the contents of the function pointer table. Consider the C++ example: ---------------- class Base1 { virtual int baseFunc1() { return 1; } } class Base2 { virtual int baseFunc2() { return 2; } } class Derived1 extends Base1 { int derivedFunc1() { baseFunc2(); } } class Derived2 extends Base2 { int derivedFunc2() { baseFunc2(); } } class Derived12 extends Base1, Base2 { int derivedFunc12() { return baseFunc1() + baseFunc2(); } } ---------------- What function is in table[0] for Derived1, Derived2 and Derived12? Isnt it impossible to tell without looking at super class info? Thanks again for the post. It really helped me understand this better.
Jul 25 2004
next sibling parent Ben Hinkle <bhinkle4 juno.com> writes:
I can't really help but I recommend browsing the code for dmd classes and
interfaces in dmd/src/dmd/class.c - it's all in there (or nearby!). That's
one of the cool things about D - most of the language semantics are right
there for the curious.

-Ben

parabolis wrote:

 Thank you for your verbose reply. It has helped me learn quite a bit!
 
 In article <ce08tl$26h1$1 digitaldaemon.com>, Ilya Minkov says...
parabolis schrieb:

 And assuming a heap representation of a class is something like this:
 (which is very close to class ClassInfo in object.d)
Your assumption is wrong. The heap representation of a class consists of
I think perhaps I should have been more specific. ClassInfo represents static class data as opposed to an instantiated object, which would have a reference to ClassInfo and the instance's data (including monitor). I am sorry for not making this more clear.
What you describe is how virtual function tables work... so i don't see
the point of your post, you just described how to replace virtual
function tables by less efficient virtual function tables.

Perhaps i misunderstood something, so please reformulate your question.
My question is whether an inheritance walk is ever a part of calling a function. I see no reason why calling a function at runtime should ever be more involved than fetching the function's address from a static table. However I am under the (incorrect?) impression that virtual function tables occasionally require a lookup in a super class. The point of my post was to figure out whether this is actually the case for single inheritance languages (like D).
 Why would a language with only single inheritence need virtual tables?
And why would a language with multiple inheritance need virtual tables?
In my DerivedEx example (from previous post) any class that extends object (which is in fact every class), the first 4 function table entries for the class are always: ---------------- table[0] print table[1] toString table[2] toHash table[3] cmp ---------------- So code in a derived class that calls print() would simply call table[0]. I believe that a language with multiple inheritance (like C++) would not be able to make any similar assumptions about the contents of the function pointer table. Consider the C++ example: ---------------- class Base1 { virtual int baseFunc1() { return 1; } } class Base2 { virtual int baseFunc2() { return 2; } } class Derived1 extends Base1 { int derivedFunc1() { baseFunc2(); } } class Derived2 extends Base2 { int derivedFunc2() { baseFunc2(); } } class Derived12 extends Base1, Base2 { int derivedFunc12() { return baseFunc1() + baseFunc2(); } } ---------------- What function is in table[0] for Derived1, Derived2 and Derived12? Isnt it impossible to tell without looking at super class info? Thanks again for the post. It really helped me understand this better.
Jul 25 2004
prev sibling parent Ilya Minkov <minkov cs.tum.edu> writes:
parabolis schrieb:

 Thank you for your verbose reply. It has helped me learn quite a bit!
Glad to be helpful.
parabolis schrieb:
I think perhaps I should have been more specific. ClassInfo represents static class data as opposed to an instantiated object, which would have a reference to ClassInfo and the instance's data (including monitor). I am sorry for not making this more clear.
OK.
 My question is whether an inheritance walk is ever a part of calling a
function.
 I see no reason why calling a function at runtime should ever be more involved
 than fetching the function's address from a static table. However I am under
the
 (incorrect?) impression that virtual function tables occasionally require a
 lookup in a super class. The point of my post was to figure out whether this is
 actually the case for single inheritance languages (like D).
No, there is no inheritance walk. Yes, VTable inherits all the entries from the superclasses, so one look-up is sufficent. The only language in which i actually recall an inheritance walk being involved is Delphi. It had 2 sorts of inheritance: virtual (working like C++ or D, using VTables), and dynamic, working in some horrendous way.
 So code in a derived class that calls print() would simply call table[0]. I
 believe that a language with multiple inheritance (like C++) would not be able
 to make any similar assumptions about the contents of the function pointer
 table. Consider the C++ example:
I recall C++ also does away with one or at most 2 look ups, but no way a complete inheritance walk. I think i read something on it, but it must be more than a year away and i wasn't that interested and i don't really remember.
 What function is in table[0] for Derived1, Derived2 and Derived12? Isnt it
 impossible to tell without looking at super class info?
There are multiple solutions to this problem... The one i'm getting from the top of my heat is multiple VTables, but if i remember correctly MSVC, DMD, and Borland do away with only one VTable... but i don't know how. I recall Walter said something about generating adjusting thunks, but i can neither recall nor find it, and i'm not good at understanding the matter. OK. i feel challenged and google for "multiple inheritance implementation". The first thing i find and seems to meet the point is here, it is even written by the Mr. C++ itself. Which doesn't mean that it corresponds in every detail to the current implementations, just that it's intended to work in a fairly similar manner from the points of view of performance etc. http://www-plan.cs.colorado.edu/diwan/class-papers/mi.pdf Microsoft hosts a blog entry from Stanley Lippman, the Architect of Visual C++ compiler, dealing with the question of multiple inheritance performance and implementation in their compiler. http://blogs.msdn.com/slippman/archive/2004/01/21/61182.aspx
 Thanks again for the post. It really helped me understand this better.
Stay cool. -eye
Jul 25 2004
prev sibling parent parabolis <parabolis_member pathlink.com> writes:
Thank you for your verbose reply. It has helped me learn quite a bit!

In article <ce08tl$26h1$1 digitaldaemon.com>, Ilya Minkov says...
parabolis schrieb:

 And assuming a heap representation of a class is something like this:
 (which is very close to class ClassInfo in object.d)
Your assumption is wrong. The heap representation of a class consists of
I think perhaps I should have been more specific. ClassInfo represents static class data as opposed to an instantiated object, which would have a reference to ClassInfo and the instance's data (including monitor). I am sorry for not making this more clear.
What you describe is how virtual function tables work... so i don't see 
the point of your post, you just described how to replace virtual 
function tables by less efficient virtual function tables.

Perhaps i misunderstood something, so please reformulate your question.
My question is whether an inheritance walk is ever a part of calling a function. I see no reason why calling a function at runtime should ever be more involved than fetching the function's address from a static table. However I am under the (incorrect?) impression that virtual function tables occasionally require a lookup in a super class. The point of my post was to figure out whether this is actually the case for single inheritance languages (like D).
 Why would a language with only single inheritence need virtual tables?
And why would a language with multiple inheritance need virtual tables?
In my DerivedEx example (from previous post) any class that extends object (which is in fact every class), the first 4 function table entries for the class are always: ---------------- table[0] print table[1] toString table[2] toHash table[3] cmp ---------------- So code in a derived class that calls print() would simply call table[0]. I believe that a language with multiple inheritance (like C++) would not be able to make any similar assumptions about the contents of the function pointer table. Consider the C++ example: ---------------- class Base1 { virtual int baseFunc1() { return 1; } } class Base2 { virtual int baseFunc2() { return 2; } } class Derived1 extends Base1 { int derivedFunc1() { baseFunc2(); } } class Derived2 extends Base2 { int derivedFunc2() { baseFunc2(); } } class Derived12 extends Base1, Base2 { int derivedFunc12() { return baseFunc1() + baseFunc2(); } } ---------------- What function is in table[0] for Derived1, Derived2 and Derived12? Isnt it impossible to tell without looking at super class info? Thanks again for the post. It really helped me understand this better.
Jul 25 2004