www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.bugs - [Issue 10821] New: .byKey erroneously returns a null key

reply d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10821

           Summary: .byKey erroneously returns a null key
           Product: D
           Version: D2
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: critical
          Priority: P2
         Component: DMD
        AssignedTo: nobody puremagic.com
        ReportedBy: andrej.mitrovich gmail.com



15:14:26 PDT ---
-----
import std.stdio;

struct Signal
{
    void connect(void delegate(int) func)
    {
        stderr.writefln("Func connect: %s", *cast(void**)&func);
        funcs[func] = 1;
    }

    void disconnect(void delegate(int) target)
    {
        stderr.writefln("Func disconnect: %s", *cast(void**)&target);
        funcs.remove(target);
    }

    void emit(int i)
    {
        version(VERSION_OK)
        {
            foreach (func; funcs.keys)
            {
                stderr.writefln("Calling: %s", *cast(void**)&func);
                func(i);
            }
        }
        else
        version(VERSION_CRASH)
        {
            foreach (func; funcs.byKey)
            {
                stderr.writefln("Calling: %s", *cast(void**)&func);
                func(i);
            }
        }
    }

    int[void delegate(int)] funcs;
}

void main()
{
    Signal signal;

    void delegate(int) handler;
    handler =
        (int i)
        {
            signal.disconnect(handler);
        };

    signal.connect(handler);

    signal.emit(1);
    signal.emit(2);
}
-----

Ok version:

$ dmd -version=VERSION_OK -run test.d
 Func connect: 1BF2FF0
 Calling: 1BF2FF0
 Func disconnect: 1BF2FF0
Buggy version: $ dmd -version=VERSION_CRASH -run test.d
 Func connect: 482FF0
 Calling: 482FF0
 Func disconnect: 482FF0
 Calling: null
 object.Error: Access Violation
It should have never called 'null', I don't know why byKey seems to return it after the delegate was removed from the associative array. This does not seem to be a regression (tested up to 2.050). -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Aug 14 2013
next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10821




Even more strange: comment out the line that calls func(i), and suddenly the
writefln displays a non-null address! A codegen bug perhaps?

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Aug 16 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10821




Oh actually, nevermind, the delegate removed itself from the hash.

Ahhh, I see what's going on. You're modifying the AA while iterating over it.
That's very bad, because byKey is implemented using a pointer to the current
Slot. But once the delegate removes itself from the table, this pointer is
invalidated. That's the bug in your code. :)

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Aug 16 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10821




01:59:09 PDT ---

 Oh actually, nevermind, the delegate removed itself from the hash.
 
 Ahhh, I see what's going on. You're modifying the AA while iterating over it.
 That's very bad, because byKey is implemented using a pointer to the current
 Slot. But once the delegate removes itself from the table, this pointer is
 invalidated. That's the bug in your code. :)
Oh damn, I can't believe I've run into this hash issue again. And it's very hard to spot too. I hope we can make new hashes which are not unstable like this. Hows your new hashes branch going? -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Aug 17 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10821


Andrej Mitrovic <andrej.mitrovich gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |INVALID


-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Aug 17 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10821




I don't know if it's possible to make hash iteration stable across deletes.
This isn't specific to hashes either; any data structure involving pointers
will have this problem. If you have an iterator pointing to bucket B1, and then
you delete B1 from inside the delegate and then try to advance the iterator,
you will run into this problem.

Even if you use an array to store your func ptrs, say, and keep an array index
to track where you are. If you then delete the current element from inside the
delegate and move the array elements up to fill in the gap, the loop wouldn't
know about it, and would increment the index, skipping over what was supposed
to be the next element. So your iteration breaks (even though in this case you
won't get invalid pointers). It may be possible to hack around this particular
instance of the problem, say by checking if the entry at the current index has
changed, but it doesn't help in the general case, say if your delegate removes
multiple entries from the array. Either way, your iteration will be screwed up.

Basically, it's impossible to have straightforward iteration when you're
modifying the container in the middle of things. Even your approach of using
.key to get an array of keys is not reliable -- imagine if the delegate deletes
some of the subsequent keys that you're about to visit. Then you'll get
RangeErrors because the keys in the array no longer exist in the AA.

I've had to deal with code that has to iterate over an array (or hash) of
callbacks before (this was in C++ not D, but the same issues apply). I found
that the only way to keep things consistent without strange side-effects was to
keep a list of to-be-deleted entries separately, and the delegate would never
modify the array of callbacks directly, but would append itself to the
to-be-deleted list if it wants to remove itself. Then during the iteration,
anything that is found in the to-be-deleted list is skipped, and after the
iteration is finished, then loop through the to-be-deleted list and actually
delete the entries.

This kind of issue gets worse if the delegates have the possibility of
triggering the destruction of the entire container. This is not applicable to
your code, but I had to deal with this when writing a protocol stack with
callback handlers -- the stack itself doesn't handle things like logout, it's
all handled by callbacks, which makes it very tricky when one of the callbacks
may be the logout handler, which will cleanup by destructing the entire
protocol stack. Upon returning to the protocol stack code that called the
callback in the first place, all references to the stack instance are now
invalid, so basically callbacks must be called after everything else has been
processed, otherwise it will segfault if the callback happens to be the logout
handler. This code is so tricky that even years after I wrote it I still have
to go back and fix subtle bugs in it every so often.

tl;dr: modifying a container while iterating over it is tricky business, and is
best avoided if you can. :)

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Aug 17 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10821




11:00:42 PDT ---

 I found
 that the only way to keep things consistent without strange side-effects was to
 keep a list of to-be-deleted entries separately, and the delegate would never
 modify the array of callbacks directly, but would append itself to the
 to-be-deleted list if it wants to remove itself.
Hmm yeah, it gets complicated real fast. Thanks for the insight. So here's a quick safer workaround implementation: ----- struct Signal { void connect(void delegate(int) func) { if (emitting) { funcsToAdd[func] = 1; } else { stderr.writefln("Func connect: %s", *cast(void**)&func); funcs[func] = 1; } } void disconnect(void delegate(int) func) { if (emitting) { funcsToDelete[func] = 1; } else { stderr.writefln("Func disconnect: %s", *cast(void**)&func); funcs.remove(func); } } void emit(int i) { emitting = true; scope(exit) emitting = false; foreach (func; funcs.byKey) { stderr.writefln("Calling: %s", *cast(void**)&func); func(i); } foreach (func; funcsToDelete.byKey()) funcs.remove(func); funcsToDelete = null; foreach (func; funcsToAdd.byKey()) funcs[func] = 1; funcsToAdd = null; } int[void delegate(int)] funcs; int[void delegate(int)] funcsToAdd; int[void delegate(int)] funcsToDelete; bool emitting; } void main() { Signal signal; void delegate(int) handler; handler = (int i) { signal.disconnect(handler); }; signal.connect(handler); signal.emit(1); signal.emit(2); signal.emit(3); signal.emit(4); } ----- Of course this still isn't perfect, there's no telling what a signal handler really wants to do, whether it actually wants to add/remove some other handler immediately or only schedule add/removal for later. I guess the bottom line is this stuff is more complicated than I thought. I wonder if that new signals implementation recently announced handles this sort of stuff. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Aug 17 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10821





[...] 
 Of course this still isn't perfect, there's no telling what a signal handler
 really wants to do, whether it actually wants to add/remove some other handler
 immediately or only schedule add/removal for later.
I think the only safe way is to always process adds/deletes immediately after iterating over the current handlers. Adds/removes for later can use a timer mechanism. Modifying a container while iterating over it is never a good idea. :)
 I guess the bottom line is this stuff is more complicated than I thought. I
 wonder if that new signals implementation recently announced handles this sort
 of stuff.
Well, it should be audited for this, then. :) It should be fixed if it doesn't already handle this. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Aug 17 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10821




12:30:29 PDT ---

 Modifying a container while iterating over it is never a good idea.
However some containers explicitly support this, like dcollections. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Aug 17 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10821




It does not eliminate the bad behaviour though. Suppose you're iterating over a
RedBlackTree, and then add and delete some keys. Depending on whether those
keys are greater or less than the current key, they will be included / excluded
from the current iteration. So whether the key will be found in the current
iteration depends on the relative values of the current key and the key being
added/deleted, and there's no way to change this apart from scheduling
adds/deletes after the iteration is over.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Aug 17 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10821




Besides, if you're iterating a RedBlackTree and then in the middle you delete
all keys, I bet that your iteration will break in a rather ugly way. :)

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Aug 17 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10821




14:36:30 PDT ---

 It does not eliminate the bad behaviour though. Suppose you're iterating over a
 RedBlackTree, and then add and delete some keys.
I was thinking of the purge method in various dcollection containers (purge doesn't seem to be in Phobos), it's explicitly used for removal during traversal. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Aug 17 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10821




Sorry, I mistakenly thought you were talking about std.container. :-/

Anyway, looking over the dcollections code, I'm pretty impressed to realize
that foreach can take function pointers?! That's something I've never knew
before!

In any case, purge is basically a restricted form of container modification
while iterating -- it iterates over the elements and lets you decide whether or
not to remove each one. This is a more controlled form of container
modification during iteration, and is made safe by (1) (implicitly) caching the
reference to the next element and not advancing the iterator if the element
*was* elected to be removed, and (2) only allowing the current element to be
removed, not any arbitrary element. If you relax (2), say your foreach body
directly references the container and deletes random elements, then all hell
breaks loose. If you enforce (2), then you can use (1) as implementation
strategy to allow deletion during iteration.

So one possibility in the case of your signal/slot code, is to have the
delegate return a bool/flag to indicate whether to remove itself from the
table. That way, you can avoid keeping lists of what to remove after the
iteration. It won't be as nice as a foreach loop, though, since you have to
manually control where your iterator is:

    auto r = aa.byKey;
    while (!r.empty) {
        auto key = r.front;
        auto dg = aa[key];
        bool deleteDg = dg(args) == Flag.deleteMe;
        r.popFront(); // important: this must come before aa.remove(key)
        if (deleteDg == Flag.deleteMe) {
            aa.remove(key);
        }
    }

By advancing the range before the aa.remove(key) call and only allowing the
current element to be deleted, you avoid running into invalid pointers.

This doesn't address the problem with dg inserting new things into aa, though,
since the newly inserted entry may or may not get iterated over by the loop
depending on where its hash value falls relative to the current position of the
byKey range. So there's still some unpredictability there.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Aug 17 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10821




13:53:50 PDT ---
Btw, Slist has a stableLinearRemove, which Phobos documents as:

stableLinearRemove - same as linearRemove, but guarantees iterators are not
invalidated.

That sounds like this pseudocode is safe:

-----
auto range = slist[];

foreach (item; range)
{
    // do something

    slist.stableLinearRemove(item);

    // do something
}
-----

Slist are used in a signals implementation by Johannes Pfau. He uses this
stableLinearRemove in his disconnect method.

If this is not actually safe, then the docs should be more clear by what they
mean.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Sep 02 2013
prev sibling next sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10821




Well, linked lists are designed to not invalidate iterators when things are
removed. But still, it has the same issue with removing stuff while traversing
over the list. For example, say you have a list:

    A -> B -> C -> D -> E

and you have a foreach loop over the list, say foreach(e; myList) { ... }.

Now suppose you have iterated to C. So e is C. Now you remove C from the list.
Following the SList code in Phobos, it will result in this list:

    A -> C -> D -> E

which is fine. But since e is C, e.next is now null, so your foreach loop will
terminate prematurely. No iterators are invalidated, that's true; any existing
references to A, B, C, D, or E will remain valid. However, their connections
with each other may change when you modify the container. So still, changing
stuff in a container while iterating over it has quirky side-effects. You
cannot assume the iteration will go exactly from A to E unless you iterate over
a copy of references to the elements.

Let's take this further. Let's say you're in the middle of your foreach loop,
and e is C. Let's say for whatever reason you decide to remove the range B..D.
The resulting list would now be:

    A -> E

But since e is C, it is now pointing to a removed section of the list:

    B -> C -> D
         ^
         e

So e.next is D, and the next iteration of the loop will be on D (which no
longer exists in the container!). And after that, it stops, because D is no
longer connected to E. So again, your iteration has "non-intuitive" behaviour.

All hope is not lost, though. There *is* a way to guarantee "intuitive"
iteration over the container while still being able to delete elements during
the iteration. This is possible if you impose the restriction that you can only
remove the *current* element being iterated over. That is, you're not allowed
to remove arbitrary elements from the container (or add new elements while
you're at it, or modify the container in any other way). Here's how:

    Element e = list.front;
    while (e)
    {
        if (wantToRemove(e))
        {
            auto tmp = e.next;
            list.remove(e);
            e = tmp;
        }
        else
            e = e.next;
    }

Unfortunately, there is no nice syntax for this, and you have to depend on the
implementation details of the container to get it right.

One way to make it nicer is for the container to provide opApply, so that the
dirty details of what to do while removing the current element is abstracted
away:

    struct MyList
    {
        ...
        int opApply(scope int delegate(ref Element, out bool removeMe) dg)
        {
            Element e = list.front;
            while (e)
            {
                bool removeMe = false;
                int ret;
                if ((ret = dg(e, removeMe)) != 0)
                    return ret;

                if (removeMe)
                {
                    auto tmp = e.next;
                    this.remove(e);
                    e = tmp;
                }
                else
                    e = e.next;
            }
        }
    }

Then user code won't have to worry about the implementation details:

    void main() {
        MyList l = ...;
        foreach (ref e, ref removeMe; l) {
            removeMe = wantToRemove(e);
        }
    }

This is, in effect, what DCollections does.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Sep 05 2013
prev sibling parent d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=10821




Argh, I hate bugzilla not having an edit function... the first example in my
previous comment, after removing C, should have the list:

    A -> B -> D -> E

not

    A -> C -> D -> E

Sorry for the confusion.

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Sep 05 2013