www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.bugs - [Issue 4621] New: Destructors are inherently un- safe

reply d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=4621

           Summary: Destructors are inherently un- safe
           Product: D
           Version: D2
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: normal
          Priority: P2
         Component: DMD
        AssignedTo: nobody puremagic.com
        ReportedBy: michel.fortin michelf.com



14:08:33 EDT ---
Accessing the GC heap through a member in a destructors is inherently unsafe
because the GC might have already freed that memory. So destructors in SafeD
should not be able to access the GC-heap through a member. Here is an example:

 safe:

class C {
    C other;
    ~this() {
        writeln(other.toString()); // "other" might already have been freed.
    }
}

void main() {
    C c1 = new C;
    C c2 = new C;
    c1.other = c2;
    c2.other = c1; // creating a circular reference
}

Given that the compiler has no way to know if a reference, pointer, or array
points to the GC heap or elsewhere, it might have to disallow any dereferencing
of any member and calls to functions that might dereference a member. And at
this point you can't do anything useful in a destructor, so you might just
disallow  safe destructors altogether.

Note that this applies to struct destructors too, since structs can be on the
heap (in their own memory block, part of an array, or as a member of a class).

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Aug 11 2010
next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=4621


Jonathan M Davis <jmdavisProg gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jmdavisProg gmail.com



11:36:33 PDT ---
What about structs which are on the stack? I agree that the stuff on the heap
has this problem, but structs on the stack should be fine, shouldn't they?

I'd hate for structs on the stack not be able to have destructors except in
SafeD. It would make RAII only work in SafeD, which would not be good.

I do agree that destructors on the heap should be disallowed in SafeD, but I
don't want to see structs on the heap not being allowed to have destructors in
SafeD.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Aug 11 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=4621




14:37:52 EDT ---
Perhaps a better solution for structs: have a way to distinguish between a
struct that can be put on the GC heap and one that cannot. A struct that cannot
go on the GC heap make it safe to access GC-managed members in its destructor,
and thus can have a  safe destructor.

But at the same time such a struct would be prohibited at compile time from
being part of a class, or from being allocated with "new" (either solitary or
part of an array).

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Aug 11 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=4621




11:43:00 PDT ---
Ouch. That last sentence of my needs editing. I meant to say that I don't want
to see structs on the _stack_ not being allowed to have destructors in SafeD.
But obviously you understood what I meant.

As for your suggestion, couldn't the compiler just disallow structs with
destructors from anywhere but the stack in SafeD? If you try and declare them
anywhere else, you'd get an error. There shouldn't be any need to distinguish
them otherwise. The destructor itself could do that.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Aug 11 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=4621




15:33:11 EDT ---
The problem with structs is that many structs will need a destructor because
they encapsulate a resource not managed by the GC. That destructor can be made
GC-heap safe, and thus this struct can be put in a class. For instance, a File
struct wrapping a file handle could easily be made GC-heap safe if it's
destructor just calls fclose(handle), and thus could be put in a class. Are you
willing to make this File struct unusable in SafeD? Or std.container.Array,
which is totally safe to use on the heap too?

So I think there is a need to distinguish GC-safe structs from those that
aren't. Forbidding all structs with a destructor to be put on the heap in SafeD
is going to prevent too many useful things. Obviously, the struct itself would
need a  trusted destructor.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Aug 11 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=4621




13:10:28 PDT ---
This mess is just too complicated. Sigh. Well, we want to be able to use
structs with destructors in SafeD wherever is reasonable to use them, and those
uses should be allowed. Unsafe uses should not be allowed. If attributes of
some kind are required on the structs or their destructors to make it work,
then that's the path that we should take.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Aug 11 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=4621




16:20:44 EDT ---
It could be an attribute, or it could be something else.

For instance, instead of having just destructors, we could have destructors
(~this) and finalizers (~~this). A struct with neither can go anywhere, a
struct with a destructor but no finalizer cannot go on the GC-heap,  a struct
with only a finalizer can go anywhere (the finalizer is used as the
destructor), and a struct with both can go anywhere. The finalizer cannot be
made  safe.

Doing this with structs would probably mean allowing only finalizers (~~this)
for classes, which according to my syntax suggestion would break existing code
for class destructors. Perhaps the syntax should be different.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Aug 11 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=4621


Steven Schveighoffer <schveiguy yahoo.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |schveiguy yahoo.com



05:21:10 PDT ---

 It could be an attribute, or it could be something else.
 
 For instance, instead of having just destructors, we could have destructors
 (~this) and finalizers (~~this). A struct with neither can go anywhere, a
 struct with a destructor but no finalizer cannot go on the GC-heap,  a struct
 with only a finalizer can go anywhere (the finalizer is used as the
 destructor), and a struct with both can go anywhere. The finalizer cannot be
 made  safe.
I think rather than prevent where these items should go, you should just not call the destructor when the struct/class is being destroyed by the GC. I'd say you could even prevent what the finalizer contains, but it's too limiting for the compiler to assume what type of memory a reference is referencing. My thought is simply that a finalizer is not safe, and a destructor is safe. That at least gives a path for implementation (just mark your finalizer as trusted, and it can be used in SafeD). -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Aug 12 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=4621


nfxjfg gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |nfxjfg gmail.com



*facepalm*


finalization part in gcx.d to make it safe. The only real issue is that
finalizers are not deterministic (unsolvable) and that finalizers are invoked
in an arbitrary context (solvable if you'd create a finalization worker thread
or so).

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Aug 12 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=4621




09:37:20 EDT ---

 I think rather than prevent where these items should go, you should just not
 call the destructor when the struct/class is being destroyed by the GC.
One thing is that I fear it becomes too easy to write a destructor and then forget the finalizer, which would results in leaks everywhere. So I think there has to be something to remind people that they need a finalizer when they need to dispose of some resource and the struct is allocated on the heap. But just not calling the finalizer is still a better option than calling the destructor during collection: a leak is less harmful than memory corruption. Forbidding GC-allocated structs could be useful to make some RAII patterns safer too. For instance, think of a struct representing a database transaction: it should probably live on the stack, you certainly don't want it to be garbaged collected. The compiler enforcing deterministic destruction would improve program correctness for this kind of thing.
 I'd say you could even prevent what the finalizer contains, but it's too
 limiting for the compiler to assume what type of memory a reference is
 referencing.  My thought is simply that a finalizer is not safe, and a
 destructor is safe.  That at least gives a path for implementation (just mark
 your finalizer as  trusted, and it can be used in SafeD).
Whatever it is, I'm now of the opinion that the limitation should only apply to SafeD. The limitation could be about dereferencing members and calling functions that might dereference a member, or it could be that the finalizer itself is not allowed to be safe. Allowing non-dereferencing actions in the finalizer in SafeD would still allow a few things, such as printing a "Hello I'm finalized!" message on the console or updating a global variable (such as a counter of live objects). Nothing very interesting though. But it'd also allow an empty finalizer to be safe, which might be useful if the presence of a destructor and the absence of a finalizer prevents allocation on the GC-heap. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Aug 12 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=4621




06:43:37 PDT ---


are not supposed to access/attempt to destroy GC managed resources. In fact, finalizers is to clean up non-GC resources. restrictions and guarantees as D is. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Aug 12 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=4621




09:54:51 EDT ---


Indeed. Java allows resurrection, which means that if you leak a reference to an object during the collection, the GC won't collect that object and will leave it in the state it was after the finalizer was called. This is an interesting idea, but I see two reasons it'll not work fro D. First, D doesn't emit special code notifying the GC when assigning to a pointer, so the GC would have to do a full scan again after a collection just to check if someone resurrected a memory block. The second reason is the D2 multithreading model. The GC might run on a different thread. If you resurrect a non-shared object, that object won't live on the same thread anymore but might continue to reference non-shared memory from other threads. This is in violation of the thread-safe type system. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Aug 12 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=4621




By the way, separation between finalizers and destructors has been in Tango for
ages.

There's the Object.dipose() method. This method is only called on deterministic
destruction, e.g on delete or with scope classes.

The "destructor" ~this is the finalizer and is always called, both on delete or
on collection.

(This was done so mainly for backward comnpatibility, while still satisfying
the need for knowing about deterministic destruction.)

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Aug 12 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=4621





 This is an interesting idea, but I see two reasons it'll not work fro D. First,
 D doesn't emit special code notifying the GC when assigning to a pointer, so
 the GC would have to do a full scan again after a collection just to check if
 someone resurrected a memory block.
This isn't so much of a problem if you assume objects with finalizers are rare. They can be collected in the next GC cycle.
 The second reason is the D2 multithreading model. The GC might run on a
 different thread. If you resurrect a non-shared object, that object won't live
 on the same thread anymore but might continue to reference non-shared memory
 from other threads. This is in violation of the thread-safe type system.
Good point. By definition, if the object is not shared(), the finalizer (or anything) must not run on a different thread. It doesn't matter if you access references or not. I wonder how D2 can have finalizers at all with this. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Aug 12 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=4621




10:46:23 EDT ---

 This isn't so much of a problem if you assume objects with finalizers are rare.
 They can be collected in the next GC cycle.
I'm not sure we can make this assumption. Beside, is it worth it? I mean, what is the use of a resurrected object beyond providing a little more safety? Given the multithreading model, we know finalizers can't be made safe; them being half-safer doesn't bring us much.
 Good point. By definition, if the object is not shared(), the finalizer (or
 anything) must not run on a different thread. It doesn't matter if you access
 references or not. I wonder how D2 can have finalizers at all with this.
Well, the object's memory block itself is no longer referenced by other threads (otherwise it would not be collected), so I guess as long as you only access values inside this memory block you're safe. You can probably also access non-GC memory you're the sole owner of, or non-GC shared memory. But you shouldn't access non-shared globals, or non-shared memory that someone else could have a reference to. This is starting to be really complicated, but except for the non-GC memory part it looks quite similar to the restrictions applied to methods of a synchronized class. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Aug 12 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=4621





 Beside, is it worth it? I mean, what is the use of a resurrected object beyond
 providing a little more safety? Given the multithreading model, we know
 finalizers can't be made safe; them being half-safer doesn't bring us much.
If you'd move finalizer execution to a dedicated finalizer thread, it'd be already quite safe. Then you can acquire locks to do synchronized access to your data. But it still doesn't fit in D2's typesystem.
 Well, the object's memory block itself is no longer referenced by other threads
 (otherwise it would not be collected), so I guess as long as you only access
 values inside this memory block you're safe. You can probably also access
 non-GC memory you're the sole owner of, or non-GC shared memory. But you
 shouldn't access non-shared globals, or non-shared memory that someone else
 could have a reference to. This is starting to be really complicated, but
 except for the non-GC memory part it looks quite similar to the restrictions
 applied to methods of a synchronized class.
Running on a different thread still makes a severe difference to shared or C data. C APIs usually aren't thread-safe. For some OS APIs, the caller thread makes a difference (for instance, you'd break OS provided TLS). You'll have to make an explicit exception in the language spec for finalizers to allow this. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Aug 12 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=4621




12:30:40 EDT ---

 Running on a different thread still makes a severe difference to shared or C
 data. C APIs usually aren't thread-safe. For some OS APIs, the caller thread
 makes a difference (for instance, you'd break OS provided TLS).
This depends on the meaning of shared vs. non-shared. Is a non-shared object guarantied to always exist in the same thread? Or is it only guarantied to be accessible in one thread at a time? The former would forbid objects with a unique reference to a non-shared memory block from being moved from one thread to another with no copying, so I think the later is more useful.
 You'll have to make an explicit exception in the language spec for finalizers
 to allow this.
With the "accessible in one thread at a time" model, the only additional special case with multithreading is that you can't access non-shared memory referenced by a member if there's a chance this memory might still be in referenced by the original thread. Combine this with the problem of GC-managed memory which could have already be deallocated and multithreading only complicates the case where you have manually managed memory shared between objects (such as reference counting)... ... and I think I've found such a bug in std.containers.Array (added to bug 4624). -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Aug 12 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=4621




07:27:54 EDT ---
In the event keeping a combined destructor-finalizer is the favored option,
this could be done by repurposing the to-be-deprecated "scope" qualifier.
"scope" could be applied as an attribute to structs and classes and would
prevent the the struct/class from being allocated on the GC-heap. The absence
of "scope" would make the destructor a finalizer and dereferencing a member
would be prohibited in it without some kind of cast.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Aug 31 2010
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=4621


Maxim Fomin <maxim maxim-fomin.ru> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
                 CC|                            |maxim maxim-fomin.ru
         Resolution|                            |INVALID



---
This is invalid report since  safe has nothing to do with accessing
pointers/references which turned out to be nulls. This is valid D code:

void foo(int* p)  safe // or ref
{
   *p = 0;
} 

void main()  safe
{
   foo(null);
}

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Aug 05 2013
prev sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=4621




17:13:32 EDT ---

 This is invalid report since  safe has nothing to do with accessing
 pointers/references which turned out to be nulls.
Maxim, you're the first to mention null here. I'm not sure I get your point. This issue is about accessing destructed/deallocated memory from the destructor while GC is finalizing an object (or a struct on the GC heap). This can happen if you have a circular reference for instance, or anytime multiple objects that references themselves are finalized in the same pass. The most evil thing you could do is to leak a reference to the a finalized object to the outer world, and then you have a pointer to deallocated memory lying around somewhere. There's no way to catch any of this (currently), hence why destructors are unsafe (when called from the GC). -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Aug 05 2013