digitalmars.D - Unions destructors and GC precision

"bearophile" <bearophileHUGS lycos.com> writes:
Before C++11 you weren't allowed to write something like:

union U {
     int x;
     std::vector<int> v;
} myu;


because v has a non-trivial destructor.

In C++11 they have added "Unrestricted unions", already present 
in g++ since version 4.6:
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2544.pdf

D2 doesn't give you that restriction, and when a union goes out 
of scope it calls the destructors of all its fields:


import std.stdio;

struct Foo1 {
     ~this() { writeln("Foo1.dtor"); }
}
struct Foo2 {
     ~this() { writeln("Foo2.dtor"); }
}
struct Foo3 {
     ~this() { writeln("Foo3.dtor"); }
}

union U {
     Foo1 f1;
     Foo2 f2;
     Foo3 f3;
}

void main() {
     U u;
}


Output:

Foo3.dtor
Foo2.dtor
Foo1.dtor


It looks cute, but I think that's wrong, and it causes problems.

The program below crashes with a double free: f and b overlap, so 
both destructors free the same pointer. Only u.f.__dtor() should 
be called, because a union holds just one of its fields at a time:


import std.stdio, core.stdc.stdlib;

struct Foo {
     int* p;

     ~this() {
         writeln("Foo.dtor");
         if (p) free(p);
     }
}

struct Bar {
     int* p;

     ~this() {
         writeln("Bar.dtor");
         if (p) free(p);
     }
}

union U {
     Foo f;
     Bar b;
     ~this() { writeln("U.dtor"); }
}

void main() {
     U u;
     u.f.p = cast(int*)malloc(10 * int.sizeof);
}


(This code can be fixed by adding "p = null;" after the "if (p)" 
line in both Foo and Bar, but this is beside the point, because 
it means fixing the problem at the wrong level. What if I can't 
modify the source code of Foo and Bar?)


In general the compiler can't know which field's destructor to 
call. C++11 "solves" this problem by looking at the union: if one 
of its fields has a non-trivial constructor, destructor, copy or 
assignment operation, the corresponding implicitly generated 
member of the union is deleted. So you have to write those 
methods manually.

Why isn't D doing the same? It seems a simple idea. With that 
idea you are forced to write a destructor and opAssign, but only 
if one or more fields of the union have a destructor. If all 
union fields are simple, like an int or a float, then the 
compiler doesn't ask you to write the union dtor:


import std.stdio, core.stdc.stdlib;

struct Foo {
     int* p;

     ~this() {
         writeln("Foo.dtor");
         if (p) free(p);
     }
}

struct Bar {
     int* p;

     ~this() {
         writeln("Bar.dtor");
         if (p) free(p);
     }
}

struct Spam {
     bool isBar;

     union {
         Foo f;
         Bar b;

         ~this() {
             writeln("U.dtor ", isBar);
             if (isBar)
                 b.__dtor();
             else
                 f.__dtor();
         }
     }
}

void main() {
     Spam s;
     s.f.p = cast(int*)malloc(10 * int.sizeof);
}



If you don't have an easy-to-reach tag like isBar, then things 
become less easy. Probably you have to call b.__dtor() or 
f.__dtor() manually:


import std.stdio, core.stdc.stdlib;

struct Foo {
     int* p;

     ~this() {
         writeln("Foo.dtor");
         if (p) free(p);
     }
}

struct Bar {
     int* p;

     ~this() {
         writeln("Bar.dtor");
         if (p) free(p);
     }
}

struct Spam {
     bool isBar;

     union {
         Foo f;
         Bar b;

         ~this() {} // empty
     }
}

void main() {
     Spam s;
     s.f.p = cast(int*)malloc(10 * int.sizeof);
     scope(exit) s.f.__dtor();
}

------------------------------

A related problem with unions is GC precision. We want a more 
precise GC, but unions reduce the precision, because the GC 
can't tell which union field is live and so whether a given word 
holds a pointer.

To address this, some time ago I suggested adding a standard 
method named onMark() that the GC calls at run-time. It returns 
the index of the union field that is currently active. During the 
mark phase the GC calls the onMark of the union. In this example 
the union has just the f and b fields, so onMark has to return 
just 0 or 1:


class Spam {
     bool isBar;

     union {
         Foo f;
         Bar b;

         ~this() {
             writeln("U.dtor ", isBar);
             if (isBar)
                 b.__dtor();
             else
                 f.__dtor();
         }

         size_t onMark() {
             return isBar ? 1 : 0;
         }
     }
}


onMark() is required only if the union contains one or more 
fields that contain pointers.

I don't know if this idea is good enough (where to store the mark 
bits?).

Again, if a nice isBar tag is not easy to reach, things become 
more complex.

-------------------------------

Maybe there is a way to merge the two solutions, creating 
something simpler. In this design, instead of onMark, a method 
like activeField() is required; at run-time it tells which field 
of the union is currently "active". This method is called both by 
the GC at run-time and when the union goes out of scope, to know 
which field's destructor to call:


struct Spam {
     bool isBar;

     union {
         Foo f;
         Bar b;

         size_t activeField(size_t delegate() callMe=null) {
             return isBar ? 1 : 0;
         }
     }
}


So with activeField there is no need to define the dtor of the 
union.

Again there is a problem when a nice tag like isBar (or 
equivalent information) is not easy to reach.
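
For what it's worth, the dispatch can be emulated by hand in 
library code. Below is a minimal sketch, not a compiler feature: 
activeField() is the hypothetical method from this proposal (with 
a simplified signature), the fooDtors/barDtors counters and 
dtorCalls() are names made up for the demo, and the destructor of 
Spam is hand-written to do what the compiler would generate. It 
assumes the compiler does not destroy union fields on its own.

```d
import core.stdc.stdlib : free, malloc;

int fooDtors, barDtors; // demo-only counters: which destructor ran?

struct Foo {
    int* p;
    ~this() { ++fooDtors; free(p); p = null; }
}

struct Bar {
    int* p;
    ~this() { ++barDtors; free(p); p = null; }
}

struct Spam {
    bool isBar;

    union {
        Foo f;
        Bar b;
    }

    // Hypothetical hook from the proposal: index of the active field.
    size_t activeField() const { return isBar ? 1 : 0; }

    // Hand-written stand-in for the dispatch the compiler would do.
    ~this() {
        if (activeField() == 1)
            destroy(b);
        else
            destroy(f);
    }
}

// Creates one Spam with the chosen field active, lets it go out of
// scope, and returns [Foo dtor calls, Bar dtor calls].
int[2] dtorCalls(bool useBar) {
    fooDtors = barDtors = 0;
    {
        Spam s;
        s.isBar = useBar;
        auto mem = cast(int*) malloc(10 * int.sizeof);
        if (useBar) s.b.p = mem; else s.f.p = mem;
    }
    return [fooDtors, barDtors];
}

void main() {
    assert(dtorCalls(false) == [1, 0]);
    assert(dtorCalls(true) == [0, 1]);
}
```

With the tag stored next to the union, each Spam runs exactly one 
field destructor, so the memory is freed once.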

Bye,
bearophile
Aug 14 2012
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 8/14/12 3:25 PM, bearophile wrote:
 D2 doesn't give you that restriction, and when a union goes out of
 scope it calls the destructors of all its fields:

That's pretty surprising. "Major bug" doesn't begin to describe it. 
Unions should call no constructors and no destructors.

Andrei
Aug 14 2012
"bearophile" <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:

 That's pretty surprising. "Major bug" doesn't begin to describe 
 it.

If you want, I will add it to Bugzilla later. But maybe before 
that other people will want to write some other comments in this 
thread.

 Unions should call no constructors and no destructors.

But this doesn't address the GC precision problem. Some kind of 
tagging field (or equivalent information) isn't always available, 
but in many cases it is, so in many practical cases I am able to 
put something useful inside a standard method like activeField(). 
If this method is available to the GC, it's not inconceivable to 
also use it to call the right union field destructor when the 
union instance goes out of scope :-)

The precision of the GC is not a binary thing: even a not fully 
precise GC is useful, and probably more precision is better than 
less. Even if activeField() is not always usable, an increase in 
GC precision seems an improvement to me.

Bye,
bearophile
Aug 14 2012
"Simen Kjaeraas" <simen.kjaras gmail.com> writes:
On Tue, 14 Aug 2012 22:32:58 +0200, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 8/14/12 3:25 PM, bearophile wrote:
  D2 doesn't give you that restriction, and when a union goes out of
  scope it calls the destructors of all its fields:

 That's pretty surprising. "Major bug" doesn't begin to describe it. 
 Unions should call no constructors and no destructors.

That means the default case is unsafe. Should it also be an error 
(or at least a warning) for a union containing types with 
destructors or complex constructors not to have a defined 
constructor/destructor?

-- Simen
Aug 14 2012
Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday, August 15, 2012 07:02:25 Simen Kjaeraas wrote:
 On Tue, 14 Aug 2012 22:32:58 +0200, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
  On 8/14/12 3:25 PM, bearophile wrote:
   D2 doesn't give you that restriction, and when a union goes out of
   scope it calls the destructors of all its fields:

  Unions should call no constructors and no destructors.

 That means the default case is unsafe. Should it also be an error 
 (or at least a warning) for a union containing types with 
 destructors or complex constructors not to have a defined 
 constructor/destructor?

I wouldn't expect unions to be considered safe in the first place. 
You're potentially reinterpreting one type as another with them. 
And I would expect that anything in them is in the same boat as 
anything initialized to void, e.g. Type var = void;

- Jonathan M Davis
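
As a small illustration of the reinterpretation described above, 
here is a sketch with made-up names (Bits, floatBits): the union 
is void-initialized, one field is written, and the other field 
reads back the same bytes as a different type. It assumes the 
usual 32-bit IEEE 754 float.

```d
union Bits {
    float f;
    uint u;
}

// Returns the raw bit pattern of a float by reading the other
// union field.
uint floatBits(float x) {
    Bits b = void; // void-initialized: contents are garbage for now
    b.f = x;       // write through the float field
    return b.u;    // read the same bytes back as an integer
}

void main() {
    assert(floatBits(1.0f) == 0x3F80_0000); // IEEE 754 encoding of 1.0
    assert(floatBits(0.0f) == 0);
}
```

Nothing in the union records which field was written last, which 
is why the language can't check such an access for you.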
Aug 14 2012
"Paulo Pinto" <pjmlp progtools.org> writes:
On Wednesday, 15 August 2012 at 05:10:02 UTC, Jonathan M Davis 
wrote:
 On Wednesday, August 15, 2012 07:02:25 Simen Kjaeraas wrote:
  On Tue, 14 Aug 2012 22:32:58 +0200, Andrei Alexandrescu
  <SeeWebsiteForEmail erdani.org> wrote:
   On 8/14/12 3:25 PM, bearophile wrote:
    D2 doesn't give you that restriction, and when a union 
    goes out of scope it calls the destructors of all its fields:

   Unions should call no constructors and no destructors.

  That means the default case is unsafe. Should it also be an error 
  (or at least a warning) for a union containing types with 
  destructors or complex constructors not to have a defined 
  constructor/destructor?

 I wouldn't expect unions to be considered safe in the first place. 
 You're potentially reinterpreting one type as another with them. 
 And I would expect that anything in them is in the same boat as 
 anything initialized to void, e.g. Type var = void;

I second this. That is actually one of the reasons why most 
languages with a GC restrict pointer use to unsafe sections; 
otherwise the GC would be very restricted in the ways it could 
work. The same goes for unions, as you wouldn't know which 
pointer/reference is the active one without some kind of tagging.

-- Paulo
Aug 14 2012
"Simen Kjaeraas" <simen.kjaras gmail.com> writes:
On Wed, 15 Aug 2012 07:09:40 +0200, Jonathan M Davis <jmdavisProg gmx.com>  
wrote:

 On Wednesday, August 15, 2012 07:02:25 Simen Kjaeraas wrote:
  On Tue, 14 Aug 2012 22:32:58 +0200, Andrei Alexandrescu
  <SeeWebsiteForEmail erdani.org> wrote:
   On 8/14/12 3:25 PM, bearophile wrote:
    D2 doesn't give you that restriction, and when a union goes out of
    scope it calls the destructors of all its fields:

   Unions should call no constructors and no destructors.

  That means the default case is unsafe. Should it also be an error 
  (or at least a warning) for a union containing types with 
  destructors or complex constructors not to have a defined 
  constructor/destructor?

 I wouldn't expect unions to be considered safe in the first place. 
 You're potentially reinterpreting one type as another with them.

True, when the unioned types are or contain pointers. With POD 
there should be no problem.

-- Simen
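
The distinction can be sketched with @safe, using made-up names 
(Pod, Ptr, podBits, forge). As far as I know, current D compilers 
accept overlapping plain data in @safe code but reject reading a 
pointer that overlaps another field; the static assert below 
encodes that assumption.

```d
union Pod {
    int i;
    float f;
}

union Ptr {
    int* p;
    size_t n;
}

// No pointers overlap here, so @safe code may reinterpret the bytes.
@safe int podBits(float x) {
    Pod u;
    u.f = x;
    return u.i;
}

// Reading the overlapped pointer field is not accepted in @safe code.
static assert(!__traits(compiles, () @safe {
    Ptr u;
    int* q = u.p;
}));

// In @system code the same access compiles and can forge a pointer
// from an arbitrary integer.
@system int* forge(size_t n) {
    Ptr u;
    u.n = n;
    return u.p;
}

void main() {
    assert(podBits(1.0f) == 0x3F80_0000);
    assert(cast(size_t) forge(16) == 16);
}
```

The forged pointer is never dereferenced here; doing so would be 
exactly the kind of bug the @safe check is meant to prevent.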
Aug 14 2012
"bearophile" <bearophileHUGS lycos.com> writes:
Paulo Pinto:

 Same thing about unions, as you wouldn't know which 
 pointer/reference is the active one without some kind of 
 tagging.

But with a standard method like activeField the tagging doesn't 
need to be explicit.

Bye,
bearophile
Aug 15 2012