digitalmars.D.learn - AA strange behavior

marc michel <marc_member pathlink.com> writes:
I found a strange behavior using AAs :

--------------------- >8 ----------------
import std.stdio;
import std.stream;

void main() {

    int[][char[]] aa;

    int i;
    char[] s;
    File f=new File("bla", FileMode.In );
    while ( ! f.eof ) {
        s= f.readLine();
        aa[s] ~= i++;
    }

    f.close;

    // no more luck with this :
    //      aa.rehash;

    foreach( char[] s, int[] i; aa ) {
        writefln( "%s  =>  %d", s, i );
    }

    writefln("\n------------------");

    // workaround :
    //  while ( aa.length > 0) {

    foreach( char[]s, int[] i; aa ) {
        aa.remove(s);
        writefln("\"%s\" => removed ",s);
    }

    // }

    writefln("\n------there's still : ----------");
    foreach( char[] s, int[] i; aa ) {
        writefln( "%s  =>  %d", s, i );
    }
    writefln("------END---------");
}
--------------------- >8 ----------------


with a "bla" file like this one for example :


--------------------- >8 ----------------
apple
orange
pear
strawberry
cuncumber
lemon
salad
tomato
blackberry
orange
lemon
tomato
potatoe
root
--------------------- >8 ----------------


result :

--------------------- >8 ----------------
C:\home\dev\d>aa
tomato  =>  [7,11]
strawberry  =>  [3]
blackberry  =>  [8]
orange  =>  [1,9]
potatoe  =>  [12]
root  =>  [13]
salad  =>  [6]
apple  =>  [0]
lemon  =>  [5,10]
cuncumber  =>  [4]
pear  =>  [2]

------------------
"tomato" => removed
"strawberry" => removed
"blackberry" => removed
"orange" => removed
"root" => removed
"salad" => removed
"apple" => removed
"lemon" => removed
"cuncumber" => removed
"pear" => removed

------there's still : ----------
potatoe  =>  [12]
------END---------

--------------------- >8 ----------------


Note: I also tried adding "aa.rehash" after filling aa, with no more luck.
The only workaround is to add a "while( aa.length > 0 )" surrounding the
foreach loop which does aa.remove().


Do I need a holiday?
Has this been discussed many times already?
Jun 12 2006
Oskar Linde <oskar.lindeREM OVEgmail.com> writes:
marc michel skrev:
 I found a strange behavior using AAs :
[snip]
 foreach( char[]s, int[] i; aa ) {
 aa.remove(s);
 writefln("\"%s\" => removed ",s);
 }
You may not delete aa elements within a foreach loop. The foreach iterator will be confused.
 The only workaround is to add a "while( aa.length > 0 ) " surrounding the
 foreach loop which does aa.remove().
There are other ways. For instance:

foreach(key;aa.keys)
    aa.remove(key);

or faster (as you seem to want to delete all elements), but ugly, hackish and undocumented:

struct BB { void *[] buckets; size_t nodes; }

...

(cast(BB*)&aa).buckets = null;
(cast(BB*)&aa).nodes = 0;

(A wish would be for the above to be implemented as aa.clear())

Or if you just want to forget about your current aa instance:

aa = null;

/Oskar
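
For anyone reading this later, here is a minimal sketch of the keys-based removal, written for a current D2 compiler rather than the 2006 compiler used in this thread (so `string` keys and `std.stdio.writeln` are assumptions about your toolchain). `removeAll` is a hypothetical helper name, not something from Phobos. The point is that `aa.keys` hands back a separate array, so the loop is not iterating over the table it is shrinking.

```d
import std.stdio;

/// Removes every entry by iterating over a snapshot of the keys.
/// `removeAll` is a hypothetical name used only for this sketch.
void removeAll(ref int[][string] aa)
{
    // aa.keys allocates a new dynamic array holding a copy of the keys,
    // so the foreach never walks the table that is being modified.
    foreach (key; aa.keys)
        aa.remove(key);
}

void main()
{
    // A small stand-in for the table built from the "bla" file.
    int[][string] aa = ["apple": [0], "orange": [1, 3], "potatoe": [2]];

    removeAll(aa);

    assert(aa.length == 0);
    writeln("entries left: ", aa.length);
}
```

The snapshot costs one extra array allocation, which is usually acceptable when you need to do per-element work (such as freeing a resource) before each removal.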
Jun 12 2006
"lanael" <no mail.never> writes:
 There are other ways. For instance:

 foreach(key;aa.keys)
 	aa.remove(key);
Ah, yes, that's it. I remember now: the "keys" property! I forgot about this one, thanks! Now I'm sure I need a holiday, because I also remember this question has already been asked :/
 or faster (as you seem to want to delete all elements), but ugly, hackish and 
 undocumented:
In fact, I have real code using this kind of AA in which I call some glDeleteTextures() functions... but thanks anyway !
Jun 12 2006
Carlos Santander <csantander619 gmail.com> writes:
Oskar Linde escribió:
 marc michel skrev:
 I found a strange behavior using AAs :
[snip]
 foreach( char[]s, int[] i; aa ) {
 aa.remove(s);
 writefln("\"%s\" => removed ",s);
 }
You may not delete aa elements within a foreach loop. The foreach iterator will be confused.
 The only workaround is to add a "while( aa.length > 0 ) " surrounding the
 foreach loop which does aa.remove().
 There are other ways. For instance:

 foreach(key;aa.keys)
     aa.remove(key);

 or faster (as you seem to want to delete all elements), but ugly,
 hackish and undocumented:

 struct BB { void *[] buckets; size_t nodes; }

 ....

 (cast(BB*)&aa).buckets = null;
 (cast(BB*)&aa).nodes = 0;

 (A wish would be for the above to be implemented as aa.clear())
I think this would be a useful addition to AAs. It has been proposed more than once, I don't know why Walter hasn't added it yet.
 Or if you just want to forget about your current aa instance:
 
 aa = null;
 
 /Oskar
-- Carlos Santander Bernal
Jun 12 2006
Georg Wrede <georg.wrede nospam.org> writes:
Carlos Santander wrote:
 Oskar Linde escribió:
 There are other ways. For instance:

 foreach(key;aa.keys)
     aa.remove(key);

 or faster (as you seem to want to delete all elements), but ugly, 
 hackish and undocumented:

 struct BB { void *[] buckets; size_t nodes; }

 ....

 (cast(BB*)&aa).buckets = null;
 (cast(BB*)&aa).nodes = 0;

 (A wish would be for the above to be implemented as aa.clear())
I think this would be a useful addition to AAs. It has been proposed more than once, I don't know why Walter hasn't added it yet.
 Or if you just want to forget about your current aa instance:

 aa = null;
Hmm. Of course reuse is good. Greenpeace Likes Reuse(tm)!

But is there really enough merit in reusing a hash, as compared with using a new one? I mean, in both cases we are effectively abandoning the buckets and the nodes to GC. -- To reuse /them/ would give savings, but I'm unable to believe it's worth the effort, or even smart at all.

What's the real cost of creating a new hash, compared with emptying the old one?

((Besides, too much needless reuse only makes code harder to understand.))

---

FWIW, if reusing hashes really does turn out more efficient (or smarter and not more error prone), and becomes the Recommended Practice, then I, too, absolutely vote for aa.clear()! And if not, we sure as heck should _not_ implement it!
Jun 13 2006
Oskar Linde <oskar.lindeREM OVEgmail.com> writes:
Georg Wrede skrev:
 
 
 Carlos Santander wrote:
 Oskar Linde escribió:
 There are other ways. For instance:

 foreach(key;aa.keys)
     aa.remove(key);

 or faster (as you seem to want to delete all elements), but ugly, 
 hackish and undocumented:

 struct BB { void *[] buckets; size_t nodes; }

 ....

 (cast(BB*)&aa).buckets = null;
 (cast(BB*)&aa).nodes = 0;

 (A wish would be for the above to be implemented as aa.clear())
I think this would be a useful addition to AAs. It has been proposed more than once, I don't know why Walter hasn't added it yet.
 Or if you just want to forget about your current aa instance:

 aa = null;
Hmm. Of course reuse is good. Greenpeace Likes Reuse(tm)! But is there really enough merit in reusing a hash, as compared with using a new one? I mean, in both cases we are effectively abandoning the buckets and the nodes to GC. -- To reuse /them/ would give savings, but I'm unable to believe it's worth the effort, or even smart at all.
The AA is a reference type. There can be many references to the same AA. aa = null will only change one reference, while the proposed aa.clear() would clear the actual AA that all references refer to. There is a significant semantic difference.
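
A small sketch of that difference, assuming a current D2 compiler where the runtime provides a built-in `clear` for associative arrays (it did not exist when this thread was written). Note that the two variables only share a table if the assignment happens after the first insertion; two null AAs do not refer to a common table.

```d
void main()
{
    int[int] table1;
    table1[1] = 1;             // first insertion allocates the table
    int[int] table2 = table1;  // second reference to the same table

    table1 = null;             // rebinds table1 only
    assert(table1.length == 0);
    assert(table2.length == 1 && table2[1] == 1);

    table1 = table2;           // share the table again
    table1.clear();            // empties the table itself
    assert(table2.length == 0);
}
```

So `aa = null` is enough when there is a single reference, and a clearing operation only matters once the table is shared.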
 What's the real cost of creating a new hash, compared with emptying the 
 old one?
Nothing significant. /Oskar
Jun 13 2006
Sean Kelly <sean f4.ca> writes:
Oskar Linde wrote:
 Georg Wrede skrev:
 Carlos Santander wrote:
 Oskar Linde escribió:
 There are other ways. For instance:

 foreach(key;aa.keys)
     aa.remove(key);

 or faster (as you seem to want to delete all elements), but ugly, 
 hackish and undocumented:

 struct BB { void *[] buckets; size_t nodes; }

 ....

 (cast(BB*)&aa).buckets = null;
 (cast(BB*)&aa).nodes = 0;

 (A wish would be for the above to be implemented as aa.clear())
I think this would be a useful addition to AAs. It has been proposed more than once, I don't know why Walter hasn't added it yet.
 Or if you just want to forget about your current aa instance:

 aa = null;
Hmm. Of course reuse is good. Greenpeace Likes Reuse(tm)! But is there really enough merit in reusing a hash, as compared with using a new one? I mean, in both cases we are effectively abandoning the buckets and the nodes to GC. -- To reuse /them/ would give savings, but I'm unable to believe it's worth the effort, or even smart at all.
The AA is a reference type. There can be many references to the same AA. aa = null will only change one reference, while the proposed aa.clear() would clear the actual AA that all references refer to. There is a significant semantic difference.
What about "delete aa"? Or was the goal to keep the buckets around and just toss the data? Sean
Jun 13 2006
Oskar Linde <oskar.lindeREM OVEgmail.com> writes:
Sean Kelly skrev:
 Oskar Linde wrote:
 Georg Wrede skrev:
 Carlos Santander wrote:
 Oskar Linde escribió:
 There are other ways. For instance:

 foreach(key;aa.keys)
     aa.remove(key);

 or faster (as you seem to want to delete all elements), but ugly, 
 hackish and undocumented:

 struct BB { void *[] buckets; size_t nodes; }

 ....

 (cast(BB*)&aa).buckets = null;
 (cast(BB*)&aa).nodes = 0;
I just realized this is wrong. I had accidentally been linking to an old version of Phobos. In its current incarnation, those two lines should be:

(*(cast(BB**)&aa)).buckets = null;
(*(cast(BB**)&aa)).nodes = 0;
 (A wish would be for the above to be implemented as aa.clear())
I think this would be a useful addition to AAs. It has been proposed more than once, I don't know why Walter hasn't added it yet.
 Or if you just want to forget about your current aa instance:

 aa = null;
Hmm. Of course reuse is good. Greenpeace Likes Reuse(tm)! But is there really enough merit in reusing a hash, as compared with using a new one? I mean, in both cases we are effectively abandoning the buckets and the nodes to GC. -- To reuse /them/ would give savings, but I'm unable to believe it's worth the effort, or even smart at all.
The AA is a reference type. There can be many references to the same AA. aa = null will only change one reference, while the proposed aa.clear() would clear the actual AA that all references refer to. There is a significant semantic difference.
What about "delete aa"? Or was the goal to keep the buckets around and just toss the data?
That could work as a syntax, but I think a .clear() method is clearer. The goal is not to keep the buckets around. Just work with multiple references to the same AA.

int[int] table1;
int[int] table2;

table1[1] = 1;
table2 = table1;
table1.clear();
assert(table2.length == 0);

/Oskar
Jun 13 2006