www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - ArrayBoundsError for associative array operation

reply Egor Starostin <egorst gmail.com> writes:
I have a testcase (simplified from real data) when access to
associative array by existing key produces ArrayBoundsError.

Here it is:
***
import std.stdio;
void main() {
    struct SKey {
        int dep;
        char[] hv;
        uint toHash() {
            uint hkey;
            for (int i=0;i<hv.length;i++) {
                hkey = hkey*10 + (hv[i]-'0');
            }
            return hkey + dep;
        }
    }
    int[SKey] aa;
    SKey ck;

// populate aa (in that exact order)
    ck.dep=0; ck.hv="1236448822";
    aa[ck] = 1;
    ck.dep=0; ck.hv="2716102924";
    aa[ck] = 1;
    ck.dep=0; ck.hv="315901071";
    aa[ck] = 1;

// remove one key
    ck.dep=0; ck.hv="1236448822";
    aa.remove(ck);

// get an error ArrayBoundsError
// while trying to access by one of remaining keys
    ck.dep=0; ck.hv="2716102924";
    writefln(aa[ck]);
}
***

Why I get an ArrayBoundsError for existing key?
Is this a bug in D? If yes -- can somebody suggest a workaround?


--
Egor
Dec 05 2006
parent reply xs0 <xs0 xs0.com> writes:
Egor Starostin wrote:
 I have a testcase (simplified from real data) when access to
 associative array by existing key produces ArrayBoundsError.
 
 Here it is:
 ***
 [snip]
 ***
 
 Why I get an ArrayBoundsError for existing key?
 Is this a bug in D? If yes -- can somebody suggest a workaround?
 
Most likely you forgot to implement a function or two: http://www.digitalmars.com/d/arrays.html#associative Structs or unions can be used as the KeyType. For this to work, the struct or union definition must define the following member functions: * hash_t toHash() * int opEquals(S) or int opEquals(S*) * int opCmp(S) or int opCmp(S*) Note that the parameter to opCmp and opEquals can be either the struct or union type, or a pointer to the struct or untion type. xs0
Dec 05 2006
parent reply Egor Starostin <egorst gmail.com> writes:
 Most likely you forgot to implement a function or two:
Sorry that I oversimplified posted testcase (throwed away opEquals and opCmp). In my working code I have the following functions in struct definition: *** int opEquals(SKey* s) { return ((dep == s.dep) && (hv == s.hv)); } int opCmp(SKey* s) { if (dep == s.dep) { return cmp(hv,s.hv); } else { return dep - s.dep; } } *** -- Egor
Dec 05 2006
parent reply xs0 <xs0 xs0.com> writes:
Egor Starostin wrote:
 Most likely you forgot to implement a function or two:
Sorry that I oversimplified posted testcase (throwed away opEquals and opCmp). In my working code I have the following functions in struct definition: *** int opEquals(SKey* s) { return ((dep == s.dep) && (hv == s.hv)); } int opCmp(SKey* s) { if (dep == s.dep) { return cmp(hv,s.hv); } else { return dep - s.dep; } } *** -- Egor
Yes, I think it's a bug.. The reason it happens is the following: 1) all the nodes happen to fall into the same bucket in the AA 2) when that is the case, entries get placed into a binary tree 3) the ordering in the binary tree is determined by hashcodes first, opCmp second (in this case, opCmp can be ignored, as it isn't called at all) 4) after that node is deleted, the tree looks like this: root.hash = 315901071 root->right.hash = 2716102924 5) and, finally, the bug: in aaA.d around line 319, the following takes place: c = key_hash - e.hash; // snip if (c < 0) e = e.left; else e = e.right; because those two hash values are too far apart, c becomes negative (-1894765443) and the left subtree is followed instead of the right subtree, causing the key to not be found. xs0
Dec 05 2006
next sibling parent Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
xs0 wrote:
 Yes, I think it's a bug.. The reason it happens is the following:
 
 1) all the nodes happen to fall into the same bucket in the AA
 2) when that is the case, entries get placed into a binary tree
 3) the ordering in the binary tree is determined by hashcodes first, 
 opCmp second (in this case, opCmp can be ignored, as it isn't called at 
 all)
 4) after that node is deleted, the tree looks like this:
      root.hash        =  315901071
      root->right.hash = 2716102924
 5) and, finally, the bug: in aaA.d around line 319, the following takes 
 place:
 
      c = key_hash - e.hash;
      // snip
      if (c < 0)
          e = e.left;
      else
          e = e.right;
 
    because those two hash values are too far apart, c becomes negative 
 (-1894765443) and the left subtree is followed instead of the right 
 subtree, causing the key to not be found.
Well, at least the fix is easy. Comparing the hashes directly should solve the problem.
Dec 05 2006
prev sibling parent Egor Starostin <egorst gmail.com> writes:
 Yes, I think it's a bug.. The reason it happens is the following:
Thank you for the explanation. One more question: I've found out that when I explicitly call aa.rehash after every aa.remove() then problem disappears. Is it really so or it's just hides (and will reappears on different data set)? -- Egor
Dec 06 2006