digitalmars.D - Suggestion : virtual member data

Steve Horne <stephenwantshornenospam100 aol.com> writes:
This suggestion was prompted by an existing thread title, but probably
has nothing to do with that thread.

In C++ development, I often find myself needing 'virtual member
variables'. By this, I mean data members that would be stored in the
virtual table.

As an example of the principle...


abstract class c_Container
{
  protected:
    abstract int  m_Max_Size;
    abstract bool m_Keys_Unique;

    abstract offsetof int[] m_Keys;
    abstract offsetof int[] m_Data;

  public:
    final int Key (int p_Index)
    {
      return m_Keys [p_Index];
    }
}

class c_Specific_Container : c_Container
{
  protected:
    final int m_Max_Size     = 16;
    final bool m_Keys_Unique = true;

    int[m_Max_Size] m_Keys_Storage;
    int[m_Max_Size] m_Data_Storage;

    final offsetof int[] m_Keys = m_Keys_Storage;
    final offsetof int[] m_Data = m_Data_Storage;
}


Two tricks are implied above...

1.  m_Max_Size and m_Keys_Unique are constants to be referenced by the
    base class, but values are only assigned by the derived class.

    These can, in principle, simply be stored in the virtual table.

2.  m_Keys and m_Data refer to instance data in the derived class. The
    specific locations in the instance cannot be known in the base
    class and cannot be pre-allocated since the sizes aren't known.

    However, offsets into the instance can be held in the virtual
    table. And with the offsets available through the virtual table,
    the compiler can generate simple code to implement the lookup so
    that m_Keys and m_Data can generally be used as if they were
    member data in the base class.
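
In today's C++, the two tricks can be approximated by hand with a
per-class table that the base class reaches through a pointer. The
following is only a rough sketch under assumptions: ContainerInfo,
offset_of, info and the other names are invented for illustration, the
offsets are computed at run time on first construction rather than by
the compiler, and the byte arithmetic relies on behaviour that
mainstream compilers provide but the C++ standard does not strictly
guarantee.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical per-class record standing in for the extra vtable slots.
struct ContainerInfo {
    int            max_size;     // constant supplied by the derived class
    bool           keys_unique;  // constant supplied by the derived class
    std::ptrdiff_t keys_offset;  // bytes from the Container subobject to the keys
    std::ptrdiff_t data_offset;  // bytes from the Container subobject to the data
};

class Container {
public:
    int  max_size() const    { return info_->max_size; }
    bool keys_unique() const { return info_->keys_unique; }

    // Non-virtual: one table lookup plus pointer arithmetic, no function call.
    int key(int index) const {
        assert(index >= 0 && index < info_->max_size);
        return at(info_->keys_offset)[index];
    }
    int data(int index) const {
        assert(index >= 0 && index < info_->max_size);
        return at(info_->data_offset)[index];
    }

protected:
    explicit Container(const ContainerInfo* info) : info_(info) {}

    // Byte distance from a Container subobject to a member of the full object.
    static std::ptrdiff_t offset_of(const Container* base, const void* member) {
        return static_cast<const char*>(member) -
               reinterpret_cast<const char*>(base);
    }

private:
    const int* at(std::ptrdiff_t offset) const {
        return reinterpret_cast<const int*>(
            reinterpret_cast<const char*>(this) + offset);
    }

    const ContainerInfo* info_;  // plays the role of the vtable pointer
};

class SpecificContainer : public Container {
public:
    SpecificContainer() : Container(info(this)) {
        // Sample contents so the lookups have something to return.
        for (int i = 0; i < kMaxSize; ++i) {
            keys_[i] = i;
            data_[i] = i * 10;
        }
    }

private:
    static constexpr int kMaxSize = 16;

    // Built once per class, the first time an instance is constructed.
    static const ContainerInfo* info(const SpecificContainer* self) {
        static const ContainerInfo table = {
            kMaxSize, true,
            offset_of(self, self->keys_), offset_of(self, self->data_)
        };
        return &table;
    }

    int keys_[kMaxSize];
    int data_[kMaxSize];
};
```

Every instance of SpecificContainer shares the one table, so the cost
per object is a single pointer, much as with a real virtual table.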

Basically, the point is to allow base classes to refer to things
more-or-less directly that will be defined in the derived class, and
additionally to ensure that the derived class really does define them
(or else it is abstract and cannot be instantiated).

The rationale for this is about avoiding inner-loop overheads in a
safe way. All of these things could be handled using virtual member
functions, but this requires access functions. These access functions
cannot be inlined when called from the base class since the compiler
cannot know which implementation to inline - the final version is
defined by the derived class.
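
For comparison, the virtual-function version of the same container
might look like the sketch below. The names (Container, keys, max_size,
sum_keys) are made up for illustration. The point is that key() itself
can be inlined, but the keys() call inside it is resolved through the
virtual table at run time whenever the compiler only knows it has a
Container.

```cpp
#include <cassert>

class Container {
public:
    virtual ~Container() {}

    // Non-virtual wrapper around a virtual accessor.
    int key(int index) const {
        assert(index >= 0);
        return keys()[index];
    }

    // Inner loop in the base class: max_size() and keys() are virtual
    // calls on every iteration unless the optimiser can devirtualise them.
    int sum_keys() const {
        int total = 0;
        for (int i = 0; i < max_size(); ++i) {
            total += key(i);
        }
        return total;
    }

protected:
    virtual int        max_size() const = 0;
    virtual const int* keys() const = 0;
};

class SpecificContainer : public Container {
public:
    SpecificContainer() {
        // Sample contents: 1, 2, ..., 16.
        for (int i = 0; i < kMaxSize; ++i) {
            keys_[i] = i + 1;
        }
    }

protected:
    int        max_size() const override { return kMaxSize; }
    const int* keys() const override     { return keys_; }

private:
    static constexpr int kMaxSize = 16;
    int keys_[kMaxSize];
};
```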

For instance, consider the trivial getter function Key which might be
called repeatedly in an inner loop. Ideally, it should be an inlined 
piece of trivial code. Needing a call to the derived class (to find
the actual location of the array) could easily be an inappropriate
overhead. As written above, there is still a virtual table lookup
overhead (and a bit of pointer arithmetic) but no function call
overhead. It should inline, and the optimiser may even be able to move
the m_Keys dereferencing (virtual table lookup of the offset, and
applying the offset to the instance pointer) out of inner loops.

This is enough of an issue that I have several examples in C++ where,
in effect, I've had to manage my own separate virtual tables for class
hierarchies. This can be very error-prone, to say the least.

Abstract virtual members would not have values (as above) but there
may be some justification to having virtual member data that has a
default value, overridable by a derived class. The above container
class might want to define a default maximum size, for instance.
Classes that included non-final definitions would not be able to
reference them at compile time, and would see the final overridden
definition at run-time.
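
A hand-written C++ approximation of default values might look like
this. It is a sketch only: ContainerDefaults, default_table,
PlainContainer and BigContainer are hypothetical names, and the values
8 and 64 are arbitrary. The base class supplies a default table, a
derived class may substitute its own, and the base class always reads
whichever table the final class installed.

```cpp
#include <cassert>

// Hypothetical table of overridable per-class values.
struct ContainerDefaults {
    int  max_size;
    bool keys_unique;
};

class Container {
public:
    // Read at run time, so the base class sees the final override.
    int  max_size() const    { return table_->max_size; }
    bool keys_unique() const { return table_->keys_unique; }

protected:
    // Defaults used by any derived class that overrides nothing.
    static const ContainerDefaults* default_table() {
        static const ContainerDefaults table = {8, false};
        return &table;
    }

    Container(const ContainerDefaults* table = default_table())
        : table_(table) {
        assert(table != nullptr);
    }

private:
    const ContainerDefaults* table_;
};

// Accepts the defaults from the base class.
class PlainContainer : public Container {
public:
    PlainContainer() {}
};

// Overrides the maximum size and keeps the default for keys_unique.
class BigContainer : public Container {
public:
    BigContainer() : Container(table()) {}

private:
    static const ContainerDefaults* table() {
        static const ContainerDefaults t = {64, default_table()->keys_unique};
        return &t;
    }
};
```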

The biggest issue would be how to declare overridable values,
considering that D takes the view that all member functions are
virtual, so the obvious keyword for C++ doesn't apply in D. Perhaps a
'vtable' storage class keyword?

As an aside, variables (as opposed to constants) held in the virtual
table might have some use, but the only things I can think of relate
to debug code and metrics. Getter and setter call overheads are mostly
irrelevant in those cases. I suspect the only kinds of 'variables'
that would be sensibly held in the virtual table are the (constant)
offsetof references to normal member data in the instance, as used for
the arrays in the above example.

Any thoughts?
Sep 06 2006
Kristian <kjkilpi gmail.com> writes:
On Wed, 06 Sep 2006 12:26:19 +0300, Steve Horne
<stephenwantshornenospam100 aol.com> wrote:

I too have found myself needing virtual member variables in C++.
Actually, I was about to suggest virtual variables (e.g. "virtual int
a;") in my earlier message "properties should be treaded as virtual
data members", but I thought properties would be enough. However,
there are different reasons to use them, as you said here. I think
they would make a fine addition to the language.

I agree that 'virtual' may not be the best choice for the storage
class keyword. 'vtable' doesn't sound quite right either, though.
(Hmm... *smile* vvar, vir, var, virvar, virv, ... heheh)

Heh, after having true virtual variables, you could really say "what
OO-languages did for functions, D does for variables!" :)
Sep 06 2006
Steve Horne <stephenwantshornenospam100 aol.com> writes:
On Wed, 06 Sep 2006 12:58:14 +0300, Kristian <kjkilpi gmail.com>
wrote:

 I agree that 'virtual' may not be the best choice for the storage class
 keyword. 'vtable' doesn't sound quite right either, though.

Yes, but there is a fine tradition to uphold here. Storage class
keywords must never make any sense!

After all, 'static' only really makes sense if you've used assembler
in the past. And even then, why isn't there a 'bss' storage class for
uninitialised static variables? And for C local variables, well,
'stack' or 'temporary' I could have understood, but 'auto'!!!!!

Actually, I've changed my mind. The new keyword should be 'magic' or
'freaky' or 'scary' or something like that!

Yes - D could be the first language to support freaky member data!

OK, OK - I've been waiting over a year for a shrink - what do you
expect.

-- 
Remove 'wants' and 'nospam' from e-mail.
Sep 06 2006
Steve Horne <stephenwantshornenospam100 aol.com> writes:
blah blah blah

-- 
Remove 'wants' and 'nospam' from e-mail.
Sep 06 2006
Kristian <kjkilpi gmail.com> writes:
On Wed, 06 Sep 2006 13:58:39 +0300, Steve Horne  
<stephenwantshornenospam100 aol.com> wrote:

 On Wed, 06 Sep 2006 12:58:14 +0300, Kristian <kjkilpi gmail.com>
 wrote:

 I agree that 'virtual' may not be the best choice for the storage class
 keyword. 'vtable' doesn't sound quite right either, though.

 Yes, but there is a fine tradition to uphold here. Storage class
 keywords must never make any sense!

 After all, 'static' only really makes sense if you've used assembler
 in the past. And even then, why isn't there a 'bss' storage class for
 uninitialised static variables? And for C local variables, well,
 'stack' or 'temporary' I could have understood, but 'auto'!!!!!

 Actually, I've changed my mind. The new keyword should be 'magic' or
 'freaky' or 'scary' or something like that!

 Yes - D could be the first language to support freaky member data!

 OK, OK - I've been waiting over a year for a shrink - what do you
 expect.

Hehheh, the 'freaky' keyword sounds right, doesn't it! ;)

Well, *if* virtual variables are implemented some day (2.0 or 3.0?
*smile*), I am sure Walter and the D community will pick a proper
keyword for them. (Hmm, I think I'll vote for
'this_is_a_virtual_variable'... ;) )
Sep 06 2006
nobody <nobody mailinator.com> writes:
Steve Horne wrote:
 Yes, but there is a fine tradition to uphold here. Storage class
 keywords must never make any sense!
 
 After all, 'static' only really makes sense if you've used assembler
 in the past. And even then, why isn't there a 'bss' storage class for
 uninitialised static variables? And for C local variables, well,
 'stack' or 'temporary' I could have understood, but 'auto'!!!!!
 
 Actually, I've changed my mind. The new keyword should be 'magic' or
 'freaky' or 'scary' or something like that!

It sounds like this is the perfect time to suggest replacing 'const'
with 'smurf'. With 'const' freed up, we can use it here.
Sep 06 2006