digitalmars.D.bugs - [Issue 4179] New: Deleting items from an associative array iterated over
- d-bugmail puremagic.com (71/71) May 12 2010 http://d.puremagic.com/issues/show_bug.cgi?id=4179
- d-bugmail puremagic.com (10/10) May 12 2010 http://d.puremagic.com/issues/show_bug.cgi?id=4179
- d-bugmail puremagic.com (11/11) May 26 2010 http://d.puremagic.com/issues/show_bug.cgi?id=4179
- d-bugmail puremagic.com (12/12) Mar 29 2011 http://d.puremagic.com/issues/show_bug.cgi?id=4179
- d-bugmail puremagic.com (21/21) Mar 29 2011 http://d.puremagic.com/issues/show_bug.cgi?id=4179
- d-bugmail puremagic.com (9/9) Mar 29 2011 http://d.puremagic.com/issues/show_bug.cgi?id=4179
- d-bugmail puremagic.com (17/17) Mar 29 2011 http://d.puremagic.com/issues/show_bug.cgi?id=4179
- d-bugmail puremagic.com (21/21) Mar 29 2011 http://d.puremagic.com/issues/show_bug.cgi?id=4179
- d-bugmail puremagic.com (16/16) Mar 29 2011 http://d.puremagic.com/issues/show_bug.cgi?id=4179
- d-bugmail puremagic.com (9/22) Jul 20 2013 http://d.puremagic.com/issues/show_bug.cgi?id=4179
http://d.puremagic.com/issues/show_bug.cgi?id=4179 Summary: Deleting items from an associative array iterated over Product: D Version: future Platform: x86 OS/Version: Windows Status: NEW Severity: normal Priority: P2 Component: druntime AssignedTo: sean invisibleduck.org ReportedBy: bearophile_hugs eml.cc This D2 program deletes items from an associative array while it is iterated over with a foreach: import std.stdio: writeln; void main() { auto aa = [1: "one", 2: "two", 3: "three"]; int count; foreach (k, v; aa) { count++; if (v == "two") aa.remove(k); } writeln(aa); writeln("count:", count); } The output, with DMD v.2.045: [1:one,3:three] count:104 The output shows this code deletes the item correctly from the associative array, but I am not so sure it can work in general. In general this can be unsafe (buggy, bad, wrong) code. And the value of 'count' at the end is too much large, it can be a bug in the associative array iteration. ---------------------- This is an alternative version of that program: import std.stdio: writeln; void main() { auto aa = [1: "one", 2: "two", 3: "three"]; foreach (k; aa.byKey()) { if (aa[k] == "two") aa.remove(k); } writeln(aa); } When compiled with DMD v.2.045 it prints at run-time: core.exception.RangeError test(5): Range violation ---------------------- Python avoids that probable programmer error (iterating over an associative array being modified) and troubles with an esplicit safety and error message: aa = {1: "one", 2: "two", 3: "three"} for k in aa: if aa[k] == "two": del aa[k] Python 2.6.5 prints: Traceback (most recent call last): File "...\test.py", line 2, in <module> for k in aa: RuntimeError: dictionary changed size during iteration To do this they set a flag when the Python dictionary is iterated over, and reset it later. Probably there is a way to circumvent this safety protection, but in most cases it is enough to avoid bugs, especially in simple code written by novices (that is the code that more often presents this error). Introducing the same safety measure in D can be useful, especially for code written by novices, safety belts are good, but it can slow down a little the deletion of items from the associative array. To totally avoid this performance problem this safety can be disabled when the code is compiled in release mode. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
May 12 2010
http://d.puremagic.com/issues/show_bug.cgi?id=4179 bearophile_hugs eml.cc changed: What |Removed |Added ---------------------------------------------------------------------------- Severity|normal |enhancement It's an enhancement, mostly. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
May 12 2010
http://d.puremagic.com/issues/show_bug.cgi?id=4179 To implement this idea the foreach() has to change a little, in nonrelease mode it has to set and later reset a boolean flag inside the container being iterated. So this boolean value if present must have a standard name, that has to be shown in the std.container page about the standard API of the container. If implemented this idea lessens a little the need for the stable ("soft") methods. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
May 26 2010
http://d.puremagic.com/issues/show_bug.cgi?id=4179 Vladimir <thecybershadow gmail.com> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |thecybershadow gmail.com Severity|enhancement |critical --- Considering that this bug breaks safety of SafeD programs, it is definitely much more severe than an "enhancement". -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Mar 29 2011
http://d.puremagic.com/issues/show_bug.cgi?id=4179 Jonathan M Davis <jmdavisProg gmx.com> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |jmdavisProg gmx.com PDT --- SafeD refers to memory safety. The current behavior is completely memory safe. The same could happen with any container whose iterators/ranges are invalidated by the removal of an element from the container. It would be nice if removing elements from an AA while looping over it worked, but it's not at all surprising that it doesn't, and it may not be reasonable to require that it does. And it's easy enough to get around this. You just save a list of the items that you want to remove, and then you remove them after you're done iterating over the AA. There's nothing unsafe about the current behavior. It may be a bit bug-prone for the unwary, but this sort of things is normal when dealing with iterators or ranges. If it were unsafe, it wouldn't be throwing a RangeViolation but would keep trying to iterate using invalid data and do who-knows-what. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Mar 29 2011
http://d.puremagic.com/issues/show_bug.cgi?id=4179 --- Eh? Is getting a segfault because the data structure has been modified considered "safe"? The workaround you suggested doesn't work when removing happens several calls deep in the call hierarchy. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Mar 29 2011
http://d.puremagic.com/issues/show_bug.cgi?id=4179 PDT --- segfault? I didn't think that anyone was getting a segfault from this. If there's a segfault, then that's obviously bad. I'm not sure that it's technically unsafe though. That would depend on whether it could possibly corrupt memory. If it's guarantee to segfault and never corrupt memory, then I don't believe that it's technically unsafe and therefore would not be a problem for SafeD. Still, one would hope that it could do better than a segfault. I thought that it was throwing an exception though. If it's segfaulting, that would put increased pressure on either flagging it as a warning or fixing it so that it's not necessary (but that may not be reasonable if you want AA's to be reasonably efficient - I don't know how iterating over them is implemented). However, if it's a segfault with no risk of memory corruption, then it's still perfectly safe, just annoying. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Mar 29 2011
http://d.puremagic.com/issues/show_bug.cgi?id=4179 Steven Schveighoffer <schveiguy yahoo.com> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |schveiguy yahoo.com 09:47:53 PDT --- It's unsafe. The issue is that if you remove the element being iterated, it goes back to the memory pool. Here is the code that removes an element: https://github.com/D-Programming-Language/druntime/blob/master/src/rt/aaA.d#L373 Note on line 402, the unused node is free'd using gc_free. The only way I know to make it safe is letting the iteration routine remove the current element at the right time . Dcollections allows this. The only way to solve this other than that is to keep a modification id, and throw an error if the iteration discovers the array structure has been modified during iteration. A compiler error or warning will not help, because the compiler cannot know whether a called function is going to modify the AA, or if two AA references point to the same object. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Mar 29 2011
http://d.puremagic.com/issues/show_bug.cgi?id=4179 --- There are more possible ways to make it safe than this. One way is to have a list of current iterations associated with each AA. Any time an item is removed, any current iterations are also updated. The nodes of the list can be stored on the stack of aaApply or whatever function does the "foreach" for AAs. It can be as simple as one pointer in the AA structure, and a structure with two fields ("current iteration item" / "next list node") on the stack - this should allow everything to "just work" and have a small impact on performance. Alternatively, when items are removed from an AA they can be left for the GC to collect rather than deleted explicitly, and their "next" pointer can be set to a magic value indicating that they were deleted (e.g. cast(void*)1). -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Mar 29 2011
http://d.puremagic.com/issues/show_bug.cgi?id=4179There are more possible ways to make it safe than this. One way is to have a list of current iterations associated with each AA. Any time an item is removed, any current iterations are also updated. The nodes of the list can be stored on the stack of aaApply or whatever function does the "foreach" for AAs. It can be as simple as one pointer in the AA structure, and a structure with two fields ("current iteration item" / "next list node") on the stack - this should allow everything to "just work" and have a small impact on performance. Alternatively, when items are removed from an AA they can be left for the GC to collect rather than deleted explicitly, and their "next" pointer can be set to a magic value indicating that they were deleted (e.g. cast(void*)1).My suggestion was for the foreach to set a boolean flag inside the associative array, and reset it when the foreach ends. Then the AA.delete should look for this flag and throw an exception if it's set. -- Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email ------- You are receiving this mail because: -------
Jul 20 2013