www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.dtl - associative-arrays ate my lunch (and my support)

reply "Kris" <someidiot earthlink.dot.dot.dot.net> writes:
So, I've been happily thinking this whole time that associative arrays (AA)
in D are rather cool; being built into the language and so on ... then today
I spent several hours tracking down what seemed to be a truly bizarre bug,
which turned out to be caused by the way D associative arrays operate.
Here's what happened:

I want to check if there's an entry in the AA; use it if so, otherwise
continue. The documentation say to use this construct for checking:

class X
{
     MyClass [char[]] aa;

    void test ()
    {
         if ("hello" in aa)
             ....
     }
}

and then (presumably) you have to subsequently retrieve the entry as
follows:

    void test ()
    {
         if ("hello" in aa)
            {
            MyClass x = aa ["hello"];
            // do something with x ...
            }
     }

Of course, you've just performed two full lookups instead of one: first you
look to see if the entry exists, then you lookup again to retrieve it. Being
the freak that I am, I don't want that overhead. So ... I proceeded with
this instead:

    void test ()
    {
         MyClass x = aa ["hello"];

         if (x)
            {
            // do something with x ...
            }
     }

Much better! Right?

Well, apparently not. Now clearly I did not assign anything to the AA in the
above code, but an entry is created regardless. That is, AA entries are
created as a /side-effect/ of doing that lookup. Now, I've used many
hash-table implementations before, but this behavior is truly a first.
Hands-up please, all those who would have expected this ...

This sneaky behavior really wasted a whole lot of effort (I can almost hear
Walter laughing in the background :-)

That aside, one is forced to do a double lookup in such cases. This is not
good at all, and I simply cannot fathom what benefit there is to such an
approach. In my particular application, the double-lookup (and, of course,
double hash-computation) is entirely unacceptable. Lookup is the most common
operation on a hash-table, by far. It has to be efficient, and this
"side-effect" totally flies in the face of reason. If it were a Phobos
module it would have been replaced long ago.

For the moment I have to assume the behavior is "by design", so fair-warning
to everyone.

(Ben? Do you have a proper hash-table I can use please? If not, I'll do an
FNV port; AA's be damned. Out! Damn spot! Out, I say!)
Jul 26 2004
next sibling parent reply Sean Kelly <sean f4.ca> writes:
In article <ce3tul$1h6o$1 digitaldaemon.com>, Kris says...
Well, apparently not. Now clearly I did not assign anything to the AA in the
above code, but an entry is created regardless. That is, AA entries are
created as a /side-effect/ of doing that lookup. Now, I've used many
hash-table implementations before, but this behavior is truly a first.
Hands-up please, all those who would have expected this ...

Not I. Though I'd still want to look at the D code that's doing this because I still don't quite believe it. What if MyClass is not default constructible?
That aside, one is forced to do a double lookup in such cases. This is not
good at all, and I simply cannot fathom what benefit there is to such an
approach. In my particular application, the double-lookup (and, of course,
double hash-computation) is entirely unacceptable. Lookup is the most common
operation on a hash-table, by far. It has to be efficient, and this
"side-effect" totally flies in the face of reason. If it were a Phobos
module it would have been replaced long ago.

So how would you propose a built in associative array handle both class and primitive types? Obviously "in" can return a bit. But what does the subscript operator return for failed lookups on primitive types? I would likely return Type.init in all cases. Besides, this is a hashtable we're talking about, not a btree. A double lookup might not be ideal but it's still far less expensive than a double tree traversal. Sean
Jul 26 2004
next sibling parent Regan Heath <regan netwin.co.nz> writes:
On Mon, 26 Jul 2004 22:44:23 +0000 (UTC), Sean Kelly <sean f4.ca> wrote:

 In article <ce3tul$1h6o$1 digitaldaemon.com>, Kris says...
 Well, apparently not. Now clearly I did not assign anything to the AA 
 in the
 above code, but an entry is created regardless. That is, AA entries are
 created as a /side-effect/ of doing that lookup. Now, I've used many
 hash-table implementations before, but this behavior is truly a first.
 Hands-up please, all those who would have expected this ...

Not I. Though I'd still want to look at the D code that's doing this because I still don't quite believe it. What if MyClass is not default constructible?
 That aside, one is forced to do a double lookup in such cases. This is 
 not
 good at all, and I simply cannot fathom what benefit there is to such an
 approach. In my particular application, the double-lookup (and, of 
 course,
 double hash-computation) is entirely unacceptable. Lookup is the most 
 common
 operation on a hash-table, by far. It has to be efficient, and this
 "side-effect" totally flies in the face of reason. If it were a Phobos
 module it would have been replaced long ago.

So how would you propose a built in associative array handle both class and primitive types? Obviously "in" can return a bit. But what does the subscript operator return for failed lookups on primitive types? I would likely return Type.init in all cases. Besides, this is a hashtable we're talking about, not a btree. A double lookup might not be ideal but it's still far less expensive than a double tree traversal.

The problem is, reference types like pointers, arrays, and classes can be null, value types like int, float, structs etc cannot, yet you want some what of saying "does not exist" for value types. One solution is to wrap them in a reference type, i.e. a class, but this IMO is un-necessary baggage, instead you want to be able to call a method on the AA, something like: int[char[]] foo; int value; if (foo.find("label",value)) { } where find looks something like bool find(K key, out V value) { ..lookup key, assign to value.. ..return true on success.. ..else false.. } This allows you to check whether it exists (the return value) and get it's value (inout value) all at once for both reference and value types. Regan. -- Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/
Jul 26 2004
prev sibling parent reply Ben Hinkle <bhinkle4 juno.com> writes:
Sean Kelly wrote:

 In article <ce3tul$1h6o$1 digitaldaemon.com>, Kris says...
Well, apparently not. Now clearly I did not assign anything to the AA in
the above code, but an entry is created regardless. That is, AA entries
are created as a /side-effect/ of doing that lookup. Now, I've used many
hash-table implementations before, but this behavior is truly a first.
Hands-up please, all those who would have expected this ...

Not I. Though I'd still want to look at the D code that's doing this because I still don't quite believe it. What if MyClass is not default constructible?

The value put into the array is null (more generally the Type.init value) so no constructor needs to be called.
That aside, one is forced to do a double lookup in such cases. This is not
good at all, and I simply cannot fathom what benefit there is to such an
approach. In my particular application, the double-lookup (and, of course,
double hash-computation) is entirely unacceptable. Lookup is the most
common operation on a hash-table, by far. It has to be efficient, and this
"side-effect" totally flies in the face of reason. If it were a Phobos
module it would have been replaced long ago.

So how would you propose a built in associative array handle both class and primitive types? Obviously "in" can return a bit. But what does the subscript operator return for failed lookups on primitive types? I would likely return Type.init in all cases. Besides, this is a hashtable we're talking about, not a btree. A double lookup might not be ideal but it's still far less expensive than a double tree traversal.

The mintl.map implementation has a private function that performs a lookup and if it fails returns in an inout parameter the parent node for the add call. This is tough to do without exposing some datatype internals so it isn't public.
 
 Sean

Jul 26 2004
next sibling parent Sean Kelly <sean f4.ca> writes:
In article <ce43vb$1j7m$1 digitaldaemon.com>, Ben Hinkle says...
Sean Kelly wrote:

 In article <ce3tul$1h6o$1 digitaldaemon.com>, Kris says...
Well, apparently not. Now clearly I did not assign anything to the AA in
the above code, but an entry is created regardless. That is, AA entries
are created as a /side-effect/ of doing that lookup. Now, I've used many
hash-table implementations before, but this behavior is truly a first.
Hands-up please, all those who would have expected this ...

Not I. Though I'd still want to look at the D code that's doing this because I still don't quite believe it. What if MyClass is not default constructible?

The value put into the array is null (more generally the Type.init value) so no constructor needs to be called.

Oops. I misread Kris' post. Yes, an entry will be created as a side effect but a lookup should return null for class value types. Sean
Jul 26 2004
prev sibling parent reply "Kris" <someidiot earthlink.dot.dot.dot.net> writes:
"Ben Hinkle"  wrote ...
 The value put into the array is null (more generally the Type.init value)

Why does it add an entry when used as an rValue? Any ideas Ben? - Kris
Jul 26 2004
parent Ben Hinkle <bhinkle4 juno.com> writes:
Kris wrote:

 "Ben Hinkle"  wrote ...
 The value put into the array is null (more generally the Type.init value)

Why does it add an entry when used as an rValue? Any ideas Ben? - Kris

The only reason I can think of is that it makes the code for AA's a little easier since the routine that gets a node (_aaGet in src/phobos/internal/aaA.d) returns a pointer to the value slot and so if the context is rvalue it assigns through the pointer and if it is lvalue it dereferences the pointer. Having the get routine know about rvalue vs lvalue would probably mean the context has to be passed to it or there would be two versions of get: _aaGetRValue and _aaGetLValue. -Ben
Jul 26 2004
prev sibling next sibling parent reply Ben Hinkle <bhinkle4 juno.com> writes:
Kris wrote:

 So, I've been happily thinking this whole time that associative arrays
 (AA) in D are rather cool; being built into the language and so on ...
 then today I spent several hours tracking down what seemed to be a truly
 bizarre bug, which turned out to be caused by the way D associative arrays
 operate. Here's what happened:
 
 I want to check if there's an entry in the AA; use it if so, otherwise
 continue. The documentation say to use this construct for checking:
 
 class X
 {
      MyClass [char[]] aa;
 
     void test ()
     {
          if ("hello" in aa)
              ....
      }
 }
 
 and then (presumably) you have to subsequently retrieve the entry as
 follows:
 
     void test ()
     {
          if ("hello" in aa)
             {
             MyClass x = aa ["hello"];
             // do something with x ...
             }
      }
 
 Of course, you've just performed two full lookups instead of one: first
 you look to see if the entry exists, then you lookup again to retrieve it.
 Being the freak that I am, I don't want that overhead. So ... I proceeded
 with this instead:
 
     void test ()
     {
          MyClass x = aa ["hello"];
 
          if (x)
             {
             // do something with x ...
             }
      }
 
 Much better! Right?
 
 Well, apparently not. Now clearly I did not assign anything to the AA in
 the above code, but an entry is created regardless. That is, AA entries
 are created as a /side-effect/ of doing that lookup. Now, I've used many
 hash-table implementations before, but this behavior is truly a first.
 Hands-up please, all those who would have expected this ...

I don't know about hash tables, but for STL maps I think the semantics are the same. The operator[] will insert if the key isn't there. For STL this makes sense since it must return a reference to the value. Since D returns the value without a reference it doesn't have to insert it into the table. So it does seem surprising that it also inserts. I can live with either way, though, since the value returned is Type.init.
 This sneaky behavior really wasted a whole lot of effort (I can almost
 hear Walter laughing in the background :-)
 
 That aside, one is forced to do a double lookup in such cases. This is not
 good at all, and I simply cannot fathom what benefit there is to such an
 approach. In my particular application, the double-lookup (and, of course,
 double hash-computation) is entirely unacceptable. Lookup is the most
 common operation on a hash-table, by far. It has to be efficient, and this
 "side-effect" totally flies in the face of reason. If it were a Phobos
 module it would have been replaced long ago.

Hash tables are O(1) at least. I tried many variations on D's AA data structure to try to speed it up across the board and couldn't get anything that performed as well for a range of applications. Things I tried mostly were around preallocating and changing the string hash function. There are undoubtably faster implementations for specific needs but all around I'd say the AA in D is as good a work-horse as one can expect.
 For the moment I have to assume the behavior is "by design", so
 fair-warning to everyone.

It is subtle if you are expecting something else.
 (Ben? Do you have a proper hash-table I can use please? If not, I'll do an
 FNV port; AA's be damned. Out! Damn spot! Out, I say!)

I don't have another hash-table in mind. What is FNV? "frickin no vay"? ;-)
Jul 26 2004
next sibling parent "Kris" <someidiot earthlink.dot.dot.dot.net> writes:
"Ben Hinkle" <bhinkle4 juno.com> wrote in message
news:ce4gf5$1nbl$1 digitaldaemon.com...
 Hash tables are O(1) at least. I tried many variations on D's AA data
 structure to try to speed it up across the board and couldn't get anything
 that performed as well for a range of applications. Things I tried mostly
 were around preallocating and changing the string hash function. There are
 undoubtably faster implementations for specific needs but all around I'd
 say the AA in D is as good a work-horse as one can expect.

Yes, I followed your post on that with interest. The problem I have is the 2*O(1) aspect. If I'm using hash strings of any reasonable size, that adds up pretty quickly. In this case, the keys can be long.
 (Ben? Do you have a proper hash-table I can use please? If not, I'll do


 FNV port; AA's be damned. Out! Damn spot! Out, I say!)

I don't have another hash-table in mind. What is FNV? "frickin no vay"?

It's worse than that <g> http://www.isthe.com/chongo/tech/comp/fnv/ Do you think Walter would add my port of this to Phobos? (chortle! chortle!)
Jul 26 2004
prev sibling parent reply "Kris" <someidiot earthlink.dot.dot.dot.net> writes:
"Ben Hinkle" wrote ...
 Kris wrote:
 Well, apparently not. Now clearly I did not assign anything to the AA in
 the above code, but an entry is created regardless. That is, AA entries
 are created as a /side-effect/ of doing that lookup. Now, I've used many
 hash-table implementations before, but this behavior is truly a first.
 Hands-up please, all those who would have expected this ...

 That aside, one is forced to do a double lookup in such cases. This is


 good at all, and I simply cannot fathom what benefit there is to such an
 approach. In my particular application, the double-lookup (and, of


 double hash-computation) is entirely unacceptable. Lookup is the most
 common operation on a hash-table, by far. It has to be efficient, and


 "side-effect" totally flies in the face of reason. If it were a Phobos
 module it would have been replaced long ago.

 For the moment I have to assume the behavior is "by design", so
 fair-warning to everyone.

It is subtle if you are expecting something else.

I'd like to just ignore this and go on, but the semantics here are just too badly broken to serve as an example of what containers should do. Since this is the unofficial "container zone", I ought to say something ... To recap; referencing an associative array as an rValue modifies the array itself. That is MyClass [char[]] aa; MyClass x = aa ["key"]; if (x) ... actually /adds/ a null entry to the array. Instead of having 0 entries in the AA, you now have 1. And all you did was nonchalantly read the thing. If you question the sanity of such side-effects, you'd be well within your rights. For example, this is akin to saying "if I read from a byte in memory, then I expect something else to be modified". Or "if I open a file and read from it, I expect the file length to change". Would you be content with that? I would hope not. Talk about confusing! This really appears to defy the very grain of fundamental read/write assumptions. Sure, there are some rather complex and domain-specific scenarios whereby you'd /perhaps/ consider doing something similar regarding side-effects, but this AA stuff is a fundamental language type, built into the language itself ... it's only a hashmap, so there's just no excuse for such, uhhh, vagaries. If you're not convinced, please ask yourself if you'd expect this code to modify the array itself: static const char[] ca = "can't touch this"; char c = ca[0]; I haven't tried it, but I would certainly hope the array isn't modified! Note that the basic semantics are identical to the AA case, and a major factor for building AA into the language was such that it would be just an array "extension". I'd call it a "stretch" instead <g> So, the reason for writing this is twofold: 1) being the "container zone", I beg and plead with all container writers to avoid this sorry approach at all cost. After all, even if you don't agree with the lunacy aspect, you can't deny the fact that a value-lookup takes twice as long as it should (first check for existence, then perform a get). 2) containers should follow the principle of "least surprise" in all aspects. Please consider that. Maybe a third reason would be to try and get the AA semantics changed ... Naaaaaa .... that would be asking for waaaay too much.
Jul 27 2004
parent reply Ben Hinkle <bhinkle4 juno.com> writes:
Kris wrote:

 "Ben Hinkle" wrote ...
 Kris wrote:
 Well, apparently not. Now clearly I did not assign anything to the AA
 in the above code, but an entry is created regardless. That is, AA
 entries are created as a /side-effect/ of doing that lookup. Now, I've
 used many hash-table implementations before, but this behavior is truly
 a first. Hands-up please, all those who would have expected this ...

 That aside, one is forced to do a double lookup in such cases. This is


 good at all, and I simply cannot fathom what benefit there is to such
 an approach. In my particular application, the double-lookup (and, of


 double hash-computation) is entirely unacceptable. Lookup is the most
 common operation on a hash-table, by far. It has to be efficient, and


 "side-effect" totally flies in the face of reason. If it were a Phobos
 module it would have been replaced long ago.

 For the moment I have to assume the behavior is "by design", so
 fair-warning to everyone.

It is subtle if you are expecting something else.

I'd like to just ignore this and go on, but the semantics here are just too badly broken to serve as an example of what containers should do. Since this is the unofficial "container zone", I ought to say something ... To recap; referencing an associative array as an rValue modifies the array itself. That is MyClass [char[]] aa; MyClass x = aa ["key"]; if (x) ... actually /adds/ a null entry to the array. Instead of having 0 entries in the AA, you now have 1. And all you did was nonchalantly read the thing. If you question the sanity of such side-effects, you'd be well within your rights. For example, this is akin to saying "if I read from a byte in memory, then I expect something else to be modified". Or "if I open a file and read from it, I expect the file length to change". Would you be content with that? I would hope not. Talk about confusing! This really appears to defy the very grain of fundamental read/write assumptions. Sure, there are some rather complex and domain-specific scenarios whereby you'd /perhaps/ consider doing something similar regarding side-effects, but this AA stuff is a fundamental language type, built into the language itself ... it's only a hashmap, so there's just no excuse for such, uhhh, vagaries. If you're not convinced, please ask yourself if you'd expect this code to modify the array itself: static const char[] ca = "can't touch this"; char c = ca[0]; I haven't tried it, but I would certainly hope the array isn't modified! Note that the basic semantics are identical to the AA case, and a major factor for building AA into the language was such that it would be just an array "extension". I'd call it a "stretch" instead <g> So, the reason for writing this is twofold: 1) being the "container zone", I beg and plead with all container writers to avoid this sorry approach at all cost. After all, even if you don't agree with the lunacy aspect, you can't deny the fact that a value-lookup takes twice as long as it should (first check for existence, then perform a get). 2) containers should follow the principle of "least surprise" in all aspects. Please consider that. Maybe a third reason would be to try and get the AA semantics changed ... Naaaaaa .... that would be asking for waaaay too much.

Here's the dilemma: if you try to get a value that isn't in the hashmap then what should be returned? No matter what you return the user can't tell if the key is/was in the hashmap or not. So in some sense the right thing to do is throw an exception when the user tries to evaluate aa["key"] and "key" isn't in the map - this solution would be the most correct in terms of principle of least astonishment since it is analogous to asking a dynamic array for an index that is out of bounds (which throws an exception). I'm half-tempted to serious propose that instead of some hack about returning Type.init and maybe or maybe not adding it to the map. That way both STL and Java users will get a surprise! eh, maybe I need another glass of wine. :-)
Jul 27 2004
next sibling parent reply Regan Heath <regan netwin.co.nz> writes:
On Tue, 27 Jul 2004 22:49:39 -0400, Ben Hinkle <bhinkle4 juno.com> wrote:

 Kris wrote:

 "Ben Hinkle" wrote ...
 Kris wrote:
 Well, apparently not. Now clearly I did not assign anything to the AA
 in the above code, but an entry is created regardless. That is, AA
 entries are created as a /side-effect/ of doing that lookup. Now, 

 used many hash-table implementations before, but this behavior is 

 a first. Hands-up please, all those who would have expected this ...

 That aside, one is forced to do a double lookup in such cases. This 


 good at all, and I simply cannot fathom what benefit there is to such
 an approach. In my particular application, the double-lookup (and, of


 double hash-computation) is entirely unacceptable. Lookup is the most
 common operation on a hash-table, by far. It has to be efficient, and


 "side-effect" totally flies in the face of reason. If it were a 

 module it would have been replaced long ago.

 For the moment I have to assume the behavior is "by design", so
 fair-warning to everyone.

It is subtle if you are expecting something else.

I'd like to just ignore this and go on, but the semantics here are just too badly broken to serve as an example of what containers should do. Since this is the unofficial "container zone", I ought to say something ... To recap; referencing an associative array as an rValue modifies the array itself. That is MyClass [char[]] aa; MyClass x = aa ["key"]; if (x) ... actually /adds/ a null entry to the array. Instead of having 0 entries in the AA, you now have 1. And all you did was nonchalantly read the thing. If you question the sanity of such side-effects, you'd be well within your rights. For example, this is akin to saying "if I read from a byte in memory, then I expect something else to be modified". Or "if I open a file and read from it, I expect the file length to change". Would you be content with that? I would hope not. Talk about confusing! This really appears to defy the very grain of fundamental read/write assumptions. Sure, there are some rather complex and domain-specific scenarios whereby you'd /perhaps/ consider doing something similar regarding side-effects, but this AA stuff is a fundamental language type, built into the language itself ... it's only a hashmap, so there's just no excuse for such, uhhh, vagaries. If you're not convinced, please ask yourself if you'd expect this code to modify the array itself: static const char[] ca = "can't touch this"; char c = ca[0]; I haven't tried it, but I would certainly hope the array isn't modified! Note that the basic semantics are identical to the AA case, and a major factor for building AA into the language was such that it would be just an array "extension". I'd call it a "stretch" instead <g> So, the reason for writing this is twofold: 1) being the "container zone", I beg and plead with all container writers to avoid this sorry approach at all cost. After all, even if you don't agree with the lunacy aspect, you can't deny the fact that a value-lookup takes twice as long as it should (first check for existence, then perform a get). 2) containers should follow the principle of "least surprise" in all aspects. Please consider that. Maybe a third reason would be to try and get the AA semantics changed ... Naaaaaa .... that would be asking for waaaay too much.

Here's the dilemma: if you try to get a value that isn't in the hashmap then what should be returned? No matter what you return the user can't tell if the key is/was in the hashmap or not. So in some sense the right thing to do is throw an exception when the user tries to evaluate aa["key"] and "key" isn't in the map - this solution would be the most correct in terms of principle of least astonishment since it is analogous to asking a dynamic array for an index that is out of bounds (which throws an exception). I'm half-tempted to serious propose that instead of some hack about returning Type.init and maybe or maybe not adding it to the map.

I think this is the best solution. In addition you can supply a method like if (aa.contains("key",value)) which returns true/false and assigns value if found. So those who do not want to catch an exception can call that one instead. Personally I think I'd always want to use the contains method instead of aa["key"]. If you dont like the name contains you can use anything else i.e. find, get, lookup
 That way both STL and Java users will get a surprise!

At least it's an immediate surprise and not one you don't realise until later, as it was for Kris with the current behaviour.
 eh, maybe I need another glass of wine. :-)

Don't we all.. Regan -- Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/
Jul 27 2004
next sibling parent reply "Kris" <someidiot earthlink.dot.dot.dot.net> writes:
"Regan Heath"  wrote ..
 I think this is the best solution. In addition you can supply a method

 if (aa.contains("key",value))

 which returns true/false and assigns value if found. So those who do not
 want to catch an exception can call that one instead. Personally I think
 I'd always want to use the contains method instead of aa["key"].

Good point Regan; Unfortunately, the "value" may be expensive to construct. If so, then you may end up constructing it for nothing (because the AA already has an instance). A mutating method/property such as this might get away with a delegate instead, to create the missing "value". Might call it createWhereMissing(), or something equally hideous <g> For a large percentage of cases, I think one can get away with excluding null as a valid "value", and returning that for failed queries/lookups. There again, the latter wouldn't necessarily work for all AA types (though it would for many other containers). Here's to that glass that Ben was suggesting!
Jul 27 2004
parent reply Regan Heath <regan netwin.co.nz> writes:
On Tue, 27 Jul 2004 21:15:44 -0700, Kris 
<someidiot earthlink.dot.dot.dot.net> wrote:

 "Regan Heath"  wrote ..
 I think this is the best solution. In addition you can supply a method

 if (aa.contains("key",value))

 which returns true/false and assigns value if found. So those who do not
 want to catch an exception can call that one instead. Personally I think
 I'd always want to use the contains method instead of aa["key"].

Good point Regan; Unfortunately, the "value" may be expensive to construct. If so, then you may end up constructing it for nothing (because the AA already has an instance).

The only case I can think of is a large struct. In which case you should probably be storing pointers to them in the AA. eg. struct Foo { ..lots.. } Foo*[char[]] aa; Foo* value; if (aa.contains("regan",value)) .. If you're not storing pointers then it's going to be slow/expensive to simply add/assign one to the aa in the first place.
 A mutating method/property such as this might get away with a
 delegate instead, to create the missing "value". Might call it
 createWhereMissing(), or something equally hideous <g>

In the instance where you want to look for something _and_ add it if it does not exist I believe a different 'find' function should be used, one that returns the insertion location eg. if (aa.contains("regan",at)) ..value exists.. else { aa.add("regan",value,at); //add value at 'at' } is a better solution.
 For a large percentage of cases, I think one can get away with excluding
 null as a valid "value", and returning that for failed queries/lookups.

Perhaps, but the fact that it does not handle basic types (without a wrapper class) makes it a poor soln IMO.
 There again, the latter wouldn't necessarily work for all AA types 
 (though it would for many other containers).
 Here's to that glass that Ben was suggesting!

Regan. -- Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/
Jul 27 2004
parent reply "Kris" <someidiot earthlink.dot.dot.dot.net> writes:
1) Oh, I misread your initial post. Sorry. So, when you say "returns
true/false and assigns a value if found" I guess you mean this is a kind of
"replace" function instead? And returns whether the operation took place? If
so, then that's very useful. Still leaves the inverse scenario open though,
where you want to retrieve the value of something from the container without
mutating (egad!) at all: a very common case. With the current syntax you
have to check first, then do the retrieval: two lookups.

2) The problem with those "insertion location" style APIs is one of
atomicity ... if something throws an exception in-between, then all bets are
off. Definately works well for some environments though.


"Regan Heath"  wrote...

 The only case I can think of is a large struct. In which case you should
 probably be storing pointers to them in the AA. eg.

 struct Foo {
    ..lots..
 }

 Foo*[char[]] aa;
 Foo* value;

 if (aa.contains("regan",value)) ..

 If you're not storing pointers then it's going to be slow/expensive to
 simply add/assign one to the aa in the first place.

Jul 27 2004
parent reply Regan Heath <regan netwin.co.nz> writes:
On Tue, 27 Jul 2004 22:03:17 -0700, Kris 
<someidiot earthlink.dot.dot.dot.net> wrote:
 1) Oh, I misread your initial post. Sorry. So, when you say "returns
 true/false and assigns a value if found" I guess you mean this is a kind 
 of
 "replace" function instead?

No, no, no.. I'm just not making myself clear methinks. Two different functions, one: bool contains(key_type key, value_type value); looks for key, returns true/false (existance of the value) and assigns value on true. The other: bool contains(key_type key, index_type at); looks for key, returns true/false (existance of the value) and gives an index to the value which you can use to: a) retrieve the value (if it exists) b) replace the value with a new one (if it exists) c) insert a value (if it does not exist)
 And returns whether the operation took place? If
 so, then that's very useful. Still leaves the inverse scenario open 
 though,
 where you want to retrieve the value of something from the container 
 without
 mutating (egad!) at all: a very common case. With the current syntax you
 have to check first, then do the retrieval: two lookups.

Not using my idea above :)
 2) The problem with those "insertion location" style APIs is one of
 atomicity ... if something throws an exception in-between, then all bets 
 are
 off. Definately works well for some environments though.

I realise this, I don't think it should ever mutate on read/lookup operations. Regan
 "Regan Heath"  wrote...

 The only case I can think of is a large struct. In which case you should
 probably be storing pointers to them in the AA. eg.

 struct Foo {
    ..lots..
 }

 Foo*[char[]] aa;
 Foo* value;

 if (aa.contains("regan",value)) ..

 If you're not storing pointers then it's going to be slow/expensive to
 simply add/assign one to the aa in the first place.


-- Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/
Jul 28 2004
parent reply Regan Heath <regan netwin.co.nz> writes:
On Thu, 29 Jul 2004 09:26:03 +1200, Regan Heath <regan netwin.co.nz> wrote:

 On Tue, 27 Jul 2004 22:03:17 -0700, Kris 
 <someidiot earthlink.dot.dot.dot.net> wrote:
 1) Oh, I misread your initial post. Sorry. So, when you say "returns
 true/false and assigns a value if found" I guess you mean this is a 
 kind of
 "replace" function instead?

No, no, no.. I'm just not making myself clear methinks. Two different functions, one: bool contains(key_type key, value_type value); looks for key, returns true/false (existance of the value) and assigns value on true.

by 'assigns' I mean _gets_ the value _from_ the aa and _assigns_ it to the out variable 'value', this is a 'get a value' function, not a 'set a value' function. It never mutates the aa. My function definition is missing the 'out' here: bool contains(key_type key, out value_type value);
 The other:

    bool contains(key_type key, index_type at);

This should be: bool contains(key_type key, out index_type at);
 looks for key, returns true/false (existance of the value) and gives an 
 index to the value which you can use to:
 a) retrieve the value (if it exists)
 b) replace the value with a new one (if it exists)
 c) insert a value (if it does not exist)

 And returns whether the operation took place? If
 so, then that's very useful. Still leaves the inverse scenario open 
 though,
 where you want to retrieve the value of something from the container 
 without
 mutating (egad!) at all: a very common case. With the current syntax you
 have to check first, then do the retrieval: two lookups.

Not using my idea above :)
 2) The problem with those "insertion location" style APIs is one of
 atomicity ... if something throws an exception in-between, then all 
 bets are
 off. Definately works well for some environments though.

I realise this, I don't think it should ever mutate on read/lookup operations. Regan
 "Regan Heath"  wrote...

 The only case I can think of is a large struct. In which case you 
 should
 probably be storing pointers to them in the AA. eg.

 struct Foo {
    ..lots..
 }

 Foo*[char[]] aa;
 Foo* value;

 if (aa.contains("regan",value)) ..

 If you're not storing pointers then it's going to be slow/expensive to
 simply add/assign one to the aa in the first place.



-- Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/
Jul 28 2004
next sibling parent "Kris" <someidiot earthlink.dot.dot.dot.net> writes:
Yep - that's where the confusion was Regan. Without the 'out' declaration, I
was reading your "assignment" phrase as assigning /into/ the AA instead.
Sorry about that: I clearly had far too much wine at the time :-)

And yes; that is a much better way to do than the strategy adopted by AA.

- Kris


"Regan Heath" <regan netwin.co.nz> wrote in message
news:opsbu95dvy5a2sq9 digitalmars.com...
 On Thu, 29 Jul 2004 09:26:03 +1200, Regan Heath <regan netwin.co.nz>

 On Tue, 27 Jul 2004 22:03:17 -0700, Kris
 <someidiot earthlink.dot.dot.dot.net> wrote:
 1) Oh, I misread your initial post. Sorry. So, when you say "returns
 true/false and assigns a value if found" I guess you mean this is a
 kind of
 "replace" function instead?

No, no, no.. I'm just not making myself clear methinks. Two different functions, one: bool contains(key_type key, value_type value); looks for key, returns true/false (existance of the value) and assigns value on true.

by 'assigns' I mean _gets_ the value _from_ the aa and _assigns_ it to the out variable 'value', this is a 'get a value' function, not a 'set a value' function. It never mutates the aa. My function definition is missing the 'out' here: bool contains(key_type key, out value_type value);
 The other:

    bool contains(key_type key, index_type at);

This should be: bool contains(key_type key, out index_type at);
 looks for key, returns true/false (existance of the value) and gives an
 index to the value which you can use to:
 a) retrieve the value (if it exists)
 b) replace the value with a new one (if it exists)
 c) insert a value (if it does not exist)

 And returns whether the operation took place? If
 so, then that's very useful. Still leaves the inverse scenario open
 though,
 where you want to retrieve the value of something from the container
 without
 mutating (egad!) at all: a very common case. With the current syntax



 have to check first, then do the retrieval: two lookups.

Not using my idea above :)
 2) The problem with those "insertion location" style APIs is one of
 atomicity ... if something throws an exception in-between, then all
 bets are
 off. Definately works well for some environments though.

I realise this, I don't think it should ever mutate on read/lookup operations. Regan
 "Regan Heath"  wrote...

 The only case I can think of is a large struct. In which case you
 should
 probably be storing pointers to them in the AA. eg.

 struct Foo {
    ..lots..
 }

 Foo*[char[]] aa;
 Foo* value;

 if (aa.contains("regan",value)) ..

 If you're not storing pointers then it's going to be slow/expensive to
 simply add/assign one to the aa in the first place.



-- Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/

Jul 28 2004
prev sibling parent reply Ben Hinkle <bhinkle4 juno.com> writes:
I've added the following function to the collection of "array helper"
functions in mintl/array.d:

/** Query if an associative array contains a given key and if
 * so set the output value and return true and otherwise return
 * false without adding the key to the map or setting the value 
 * parameter. This mechanism is faster than making two lookups:
 * one to see if the key is in the map and another to get the
 * value. Compiler-dependent.
 * \param x the array to query
 * \param key the key to look up
 * \param value the output value if any
 * \return true if the key is in the map and false otherwise
 */
template contains(K,V) {
  bool contains(V[K] x, K key, inout V value) {
    ...
  }
}
unittest {
  double[int] c;
  c[100] = 1.1;
  c[300] = 2.2;
  c[-100] = 3.3;
  double v;
  assert( contains!(int,double)(c,-100,v) );
  assert( v == 3.3 );
  assert( !contains!(int,double)(c,200,v) );
  assert( !(200 in c) );
}

I haven't added the second version of "contains" since the index_type
doesn't exist yet.
-Ben

Regan Heath wrote:

 On Thu, 29 Jul 2004 09:26:03 +1200, Regan Heath <regan netwin.co.nz>
 wrote:
 
 On Tue, 27 Jul 2004 22:03:17 -0700, Kris
 <someidiot earthlink.dot.dot.dot.net> wrote:
 1) Oh, I misread your initial post. Sorry. So, when you say "returns
 true/false and assigns a value if found" I guess you mean this is a
 kind of
 "replace" function instead?

No, no, no.. I'm just not making myself clear methinks. Two different functions, one: bool contains(key_type key, value_type value); looks for key, returns true/false (existance of the value) and assigns value on true.

by 'assigns' I mean _gets_ the value _from_ the aa and _assigns_ it to the out variable 'value', this is a 'get a value' function, not a 'set a value' function. It never mutates the aa. My function definition is missing the 'out' here: bool contains(key_type key, out value_type value);
 The other:

    bool contains(key_type key, index_type at);

This should be: bool contains(key_type key, out index_type at);
 looks for key, returns true/false (existance of the value) and gives an
 index to the value which you can use to:
 a) retrieve the value (if it exists)
 b) replace the value with a new one (if it exists)
 c) insert a value (if it does not exist)

 And returns whether the operation took place? If
 so, then that's very useful. Still leaves the inverse scenario open
 though,
 where you want to retrieve the value of something from the container
 without
 mutating (egad!) at all: a very common case. With the current syntax you
 have to check first, then do the retrieval: two lookups.

Not using my idea above :)
 2) The problem with those "insertion location" style APIs is one of
 atomicity ... if something throws an exception in-between, then all
 bets are
 off. Definately works well for some environments though.

I realise this, I don't think it should ever mutate on read/lookup operations. Regan
 "Regan Heath"  wrote...

 The only case I can think of is a large struct. In which case you
 should
 probably be storing pointers to them in the AA. eg.

 struct Foo {
    ..lots..
 }

 Foo*[char[]] aa;
 Foo* value;

 if (aa.contains("regan",value)) ..

 If you're not storing pointers then it's going to be slow/expensive to
 simply add/assign one to the aa in the first place.




Jul 29 2004
parent Regan Heath <regan netwin.co.nz> writes:
On Thu, 29 Jul 2004 05:07:16 -0400, Ben Hinkle <bhinkle4 juno.com> wrote:

 I've added the following function to the collection of "array helper"
 functions in mintl/array.d:

 /** Query if an associative array contains a given key and if
  * so set the output value and return true and otherwise return
  * false without adding the key to the map or setting the value
  * parameter. This mechanism is faster than making two lookups:
  * one to see if the key is in the map and another to get the
  * value. Compiler-dependent.
  * \param x the array to query
  * \param key the key to look up
  * \param value the output value if any
  * \return true if the key is in the map and false otherwise
  */
 template contains(K,V) {
   bool contains(V[K] x, K key, inout V value) {
     ...
   }
 }
 unittest {
   double[int] c;
   c[100] = 1.1;
   c[300] = 2.2;
   c[-100] = 3.3;
   double v;
   assert( contains!(int,double)(c,-100,v) );
   assert( v == 3.3 );
   assert( !contains!(int,double)(c,200,v) );
   assert( !(200 in c) );
 }

Perfect :)
 I haven't added the second version of "contains" since the index_type
 doesn't exist yet.

I'm not sure an index_type makes sense for an AA, it does for other container types, but for an AA isn't the key the index type, or maybe the hashed value of the key is the index type? Regan
 -Ben

 Regan Heath wrote:

 On Thu, 29 Jul 2004 09:26:03 +1200, Regan Heath <regan netwin.co.nz>
 wrote:

 On Tue, 27 Jul 2004 22:03:17 -0700, Kris
 <someidiot earthlink.dot.dot.dot.net> wrote:
 1) Oh, I misread your initial post. Sorry. So, when you say "returns
 true/false and assigns a value if found" I guess you mean this is a
 kind of
 "replace" function instead?

No, no, no.. I'm just not making myself clear methinks. Two different functions, one: bool contains(key_type key, value_type value); looks for key, returns true/false (existance of the value) and assigns value on true.

by 'assigns' I mean _gets_ the value _from_ the aa and _assigns_ it to the out variable 'value', this is a 'get a value' function, not a 'set a value' function. It never mutates the aa. My function definition is missing the 'out' here: bool contains(key_type key, out value_type value);
 The other:

    bool contains(key_type key, index_type at);

This should be: bool contains(key_type key, out index_type at);
 looks for key, returns true/false (existance of the value) and gives an
 index to the value which you can use to:
 a) retrieve the value (if it exists)
 b) replace the value with a new one (if it exists)
 c) insert a value (if it does not exist)

 And returns whether the operation took place? If
 so, then that's very useful. Still leaves the inverse scenario open
 though,
 where you want to retrieve the value of something from the container
 without
 mutating (egad!) at all: a very common case. With the current syntax 
 you
 have to check first, then do the retrieval: two lookups.

Not using my idea above :)
 2) The problem with those "insertion location" style APIs is one of
 atomicity ... if something throws an exception in-between, then all
 bets are
 off. Definately works well for some environments though.

I realise this, I don't think it should ever mutate on read/lookup operations. Regan
 "Regan Heath"  wrote...

 The only case I can think of is a large struct. In which case you
 should
 probably be storing pointers to them in the AA. eg.

 struct Foo {
    ..lots..
 }

 Foo*[char[]] aa;
 Foo* value;

 if (aa.contains("regan",value)) ..

 If you're not storing pointers then it's going to be slow/expensive 
 to
 simply add/assign one to the aa in the first place.





-- Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/
Jul 29 2004
prev sibling parent reply "Ben Hinkle" <bhinkle mathworks.com> writes:
 I think this is the best solution. In addition you can supply a method

 if (aa.contains("key",value))

 which returns true/false and assigns value if found. So those who do not
 want to catch an exception can call that one instead. Personally I think
 I'd always want to use the contains method instead of aa["key"].

 If you dont like the name contains you can use anything else i.e. find,
 get, lookup

Seems reasonable to me - though the name "contains" doesn't imply anything about setting values so maybe something like "trySet" would be better. Also I could see a "tryGet" function that looks up the key and if present sets the output variable and returns true and otherwise returns false.
 That way both STL and Java users will get a surprise!

At least it's an immediate surprise and not one you don't realise until later, as it was for Kris with the current behaviour.

very true. subtle bugs can be very frustrating.
 eh, maybe I need another glass of wine. :-)

Don't we all.. Regan -- Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/

Jul 28 2004
parent Regan Heath <regan netwin.co.nz> writes:
On Wed, 28 Jul 2004 10:17:57 -0400, Ben Hinkle <bhinkle mathworks.com> 
wrote:
 I think this is the best solution. In addition you can supply a method

 if (aa.contains("key",value))

 which returns true/false and assigns value if found. So those who do not
 want to catch an exception can call that one instead. Personally I think
 I'd always want to use the contains method instead of aa["key"].

 If you dont like the name contains you can use anything else i.e. find,
 get, lookup

Seems reasonable to me - though the name "contains" doesn't imply anything about setting values

True, I think 'get' is the best name.
 so maybe something like "trySet" would be better.

I think we should avoid the word 'set' it implies to me we're going to add something to the aa, which we're not in this case.
 Also
 I could see a "tryGet" function that looks up the key and if present sets
 the output variable and returns true and otherwise returns false.

That is the function I am proposing.. it seems I have not been making myself very clear. :) Regan
 That way both STL and Java users will get a surprise!

At least it's an immediate surprise and not one you don't realise until later, as it was for Kris with the current behaviour.

very true. subtle bugs can be very frustrating.
 eh, maybe I need another glass of wine. :-)

Don't we all.. Regan -- Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/


-- Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/
Jul 28 2004
prev sibling next sibling parent reply "Kris" <someidiot earthlink.dot.dot.dot.net> writes:
"Ben Hinkle"  wrote
 Here's the dilemma: if you try to get a value that isn't in the hashmap

 what should be returned? No matter what you return the user can't tell if
 the key is/was in the hashmap or not. So in some sense the right thing to
 do is throw an exception when the user tries to evaluate aa["key"] and
 "key" isn't in the map - this solution would be the most correct in terms
 of principle of least astonishment since it is analogous to asking a
 dynamic array for an index that is out of bounds (which throws an
 exception). I'm half-tempted to serious propose that instead of some hack
 about returning Type.init and maybe or maybe not adding it to the map.

 way both STL and Java users will get a surprise!

 eh, maybe I need another glass of wine. :-)

HaHa! Got a fairly reasonable Chateau De Armpit open here too <g> Doesn't Python throw an exception like that? Or is it Ruby? I forget ... One reasonable way out of the hole is to stipulate (as part of the API contract) that a "value" of null is illegal. That way, a null return signifies the entry does not exist. You can then happily do a get(), and test the return. That gets around the double-lookup issue (for a non-mutating lookup). A value-type of pointer, array or class would support a null return. From recollection, this is what Java does. Unfortunately, you can't do that for simple ints and chars (without wrappers), so there's the rub. OT: meant to ask if you or Mike were gonna' work on porting D.L's ConcurrentHashMap for your container-set ... I'd pay money for that :-)
Jul 27 2004
parent Sam McCall <tunah tunah.net> writes:
Kris wrote:

 HaHa! Got a fairly reasonable Chateau De Armpit open here too <g>
 
 Doesn't Python throw an exception like that? Or is it Ruby? I forget ...

return nil (which is the best single behaviour, but not all types support it in D. If you care about existant keys with nil values, call contains()). Hash.new(0) will create a hash that returns 0 for nonexistant keys. (Think sparse matrices...) Hash.new { |hash,key| # code } Passes a the hash and the specified key and executes the code, and returns the result, if the key isn't in the hash. For example: Hash.new { |hash,key| hash[key]=foo(key) } creates a hash that's a caching wrapper of foo. (The value of the last expression is implicitly returned). I like Ruby by the way :-)
 One reasonable way out of the hole is to stipulate (as part of the API
 contract) that a "value" of null is illegal. That way, a null return
 signifies the entry does not exist. You can then happily do a get(), and
 test the return. That gets around the double-lookup issue (for a
 non-mutating lookup). A value-type of pointer, array or class would support
 a null return. From recollection, this is what Java does.
 
 Unfortunately, you can't do that for simple ints and chars (without
 wrappers), so there's the rub.

complicated anyway that one more feature won't make a difference). I'm in a frivolous mood, so don't bother flaming me :-) int? value=array["test"] if(value?) printf("%d\n",value); Type? would be something like struct ?(T) { T value; bit present; /* implicit */ T opCast() { return value; } bit opQuery() { return present; } } Actually, I quite like the opQuery, or maybe foo? could be the "foo aint null" operator that's being sought. Meh. Sam
Jul 28 2004
prev sibling next sibling parent reply "Kris" <someidiot earthlink.dot.dot.dot.net> writes:
"Ben Hinkle"  wrote ...
 Here's the dilemma: if you try to get a value that isn't in the hashmap

 what should be returned? No matter what you return the user can't tell if
 the key is/was in the hashmap or not. So in some sense the right thing to
 do is throw an exception when the user tries to evaluate aa["key"] and
 "key" isn't in the map - this solution would be the most correct in terms
 of principle of least astonishment since it is analogous to asking a
 dynamic array for an index that is out of bounds (which throws an
 exception). I'm half-tempted to serious propose that instead of some hack
 about returning Type.init and maybe or maybe not adding it to the map.

 way both STL and Java users will get a surprise!

 eh, maybe I need another glass of wine. :-)

Perhaps I've had too many recently ... 8~} How about returning a simple true/false from a lookup instead? One can use an 'out' argument for the "value" if there's one present, yes? Not the most elegant solution, or perhaps the most efficient; but it's a whole lot better than two lookups, or the dreaded (cue music ...) *mutating rValue* ... bool get (K key, out V value);
Jul 27 2004
parent reply Regan Heath <regan netwin.co.nz> writes:
On Tue, 27 Jul 2004 22:40:02 -0700, Kris 
<someidiot earthlink.dot.dot.dot.net> wrote:
 "Ben Hinkle"  wrote ...
 Here's the dilemma: if you try to get a value that isn't in the hashmap

 what should be returned? No matter what you return the user can't tell 
 if
 the key is/was in the hashmap or not. So in some sense the right thing 
 to
 do is throw an exception when the user tries to evaluate aa["key"] and
 "key" isn't in the map - this solution would be the most correct in 
 terms
 of principle of least astonishment since it is analogous to asking a
 dynamic array for an index that is out of bounds (which throws an
 exception). I'm half-tempted to serious propose that instead of some 
 hack
 about returning Type.init and maybe or maybe not adding it to the map.

 way both STL and Java users will get a surprise!

 eh, maybe I need another glass of wine. :-)

Perhaps I've had too many recently ... 8~} How about returning a simple true/false from a lookup instead? One can use an 'out' argument for the "value" if there's one present, yes?

That is what I've been trying to suggest for the last 5 posts!
 Not the most
 elegant solution, or perhaps the most efficient; but it's a whole lot 
 better
 than two lookups, or the dreaded (cue music ...) *mutating rValue* ...

I think it's a very elegant solution, it handles all value types, and can be placed in an if statement like so: if (aa.get("regan",value)) { }
 bool get (K key, out V value);

Regan -- Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/
Jul 28 2004
parent reply Gold Dragon <dragonwing dragonu.net> writes:
What ever happened to using '==' to test if something exists? I don't 
know, I guess I should leave the intelligent things to the smart people. 
When I see = then I assume that you want something to equal that value 
and not test for it. I always read in books that you use '==' instead of 
  '=' to test for something. Perhaps you can overload the '==' to see if 
something exists.
Jul 29 2004
parent reply Regan Heath <regan netwin.co.nz> writes:
On Thu, 29 Jul 2004 04:32:58 -0500, Gold Dragon <dragonwing dragonu.net> 
wrote:

 What ever happened to using '==' to test if something exists? I don't 
 know, I guess I should leave the intelligent things to the smart people. 
 When I see = then I assume that you want something to equal that value 
 and not test for it. I always read in books that you use '==' instead of 
   '=' to test for something. Perhaps you can overload the '==' to see if 
 something exists.

If you do that, how do you compare two AA's with each other? I don't mean identity eg. '===' or 'is' checking they refer to the same AA, I mean compare the actual AA. I admit this is perhaps a rare requirement, but still. Regan -- Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/
Jul 29 2004
parent Gold Dragon <dragonwing dragonu.net> writes:
Regan Heath wrote:
 If you do that, how do you compare two AA's with each other?
 
 I don't mean identity eg. '===' or 'is' checking they refer to the same 
 AA, I mean compare the actual AA. I admit this is perhaps a rare 
 requirement, but still.
 
 Regan
 

I would use it for both as I do in other languages... except Java but you can't overload operators in Java now can you? I don't know, it seems that everyone is walking around in circles trying to find the way out. Maybe you can like, make a function or something and call it 'exists'. Perhaps I'm missing the point and don't understand fully what goes into Hashes but having used them from Perl, I can say that it seems kind of strange. I've also used AA from in PHP and this is what I would to test if there is a value. I would use a foreach loop to get the key, value pair.
Jul 30 2004
prev sibling parent Juanjo =?ISO-8859-15?Q?=C1lvarez?= <juanjuxNO SPAMyahoo.es> writes:
Ben Hinkle wrote:

 eh, maybe I need another glass of wine. :-)

You don't, that's exactly what Python does when you try to access and unexistent key on a dict: -------------------------
 d = {}
 d["akey"] = "avalue"
 print d["akey"]



 print d["nokey"]



File "<stdin>", line 1, in ? KeyError: 'nokey' ------------------------- So the usual idiom to access an array that could or not could have a key is: if "akey" in d: print d["akey"] A nice method of python dicts is setdefault where you can specify a default value so if the key is not in the dictionary the default value is returned: ----------------------------------
 print d.setdefault("akey", "defaultvalue")



 print d.setdefault("nokey", "defaultvalue")



Jul 29 2004
prev sibling parent reply "Matthew" <admin.hat stlsoft.dot.org> writes:
Basically, I've found the STL-way of doing things - which is what D's AAs apes
- painful 75% of the time, and useful 25%
of the time. I've no doubt the same will apply to D's AAs.

It's especially painful in STL, since find has to be tested against end(), and
then subject to (*??).second. Hateful
stuff.

Basically, there's no way to have your cake and eat it, either in C++ or D,
when overloading operators for non built-in
types (and even built-in types, so it seems!). (In my book - out in 6 weeks,
not plugged for a while - I show how
overloading the subscript operators for array types can never get full semantic
conformance with built-in arrays. <g>)

If we want to be pedantically correct, I'd suggest that we abandon the
subscript operators for associative arrays, but
the logical correlation of that is that we then abandon built-in AAs (which I
would also think is probably the best
thing). But D's not really about doing what's pedantically correct, and it's
sometimes tempting to think that it doesn't
even do things that are sensibly correct.

If/when I can ever get DTL compiled, I shall henceforth use the Map template,
which *is* able to support the distinction
between opIndexAssign (creates a new element, or updates an existing one) and
opIndex (which throws a NotFoundException
exception if the element does not exist). However, I'm still not entirely
comfortable given the fact that something so
syntactically similar has such different semantics.

:-(

Matthew


"Kris" <someidiot earthlink.dot.dot.dot.net> wrote in message
news:ce3tul$1h6o$1 digitaldaemon.com...
 So, I've been happily thinking this whole time that associative arrays (AA)
 in D are rather cool; being built into the language and so on ... then today
 I spent several hours tracking down what seemed to be a truly bizarre bug,
 which turned out to be caused by the way D associative arrays operate.
 Here's what happened:

 I want to check if there's an entry in the AA; use it if so, otherwise
 continue. The documentation say to use this construct for checking:

 class X
 {
      MyClass [char[]] aa;

     void test ()
     {
          if ("hello" in aa)
              ....
      }
 }

 and then (presumably) you have to subsequently retrieve the entry as
 follows:

     void test ()
     {
          if ("hello" in aa)
             {
             MyClass x = aa ["hello"];
             // do something with x ...
             }
      }

 Of course, you've just performed two full lookups instead of one: first you
 look to see if the entry exists, then you lookup again to retrieve it. Being
 the freak that I am, I don't want that overhead. So ... I proceeded with
 this instead:

     void test ()
     {
          MyClass x = aa ["hello"];

          if (x)
             {
             // do something with x ...
             }
      }

 Much better! Right?

 Well, apparently not. Now clearly I did not assign anything to the AA in the
 above code, but an entry is created regardless. That is, AA entries are
 created as a /side-effect/ of doing that lookup. Now, I've used many
 hash-table implementations before, but this behavior is truly a first.
 Hands-up please, all those who would have expected this ...

 This sneaky behavior really wasted a whole lot of effort (I can almost hear
 Walter laughing in the background :-)

 That aside, one is forced to do a double lookup in such cases. This is not
 good at all, and I simply cannot fathom what benefit there is to such an
 approach. In my particular application, the double-lookup (and, of course,
 double hash-computation) is entirely unacceptable. Lookup is the most common
 operation on a hash-table, by far. It has to be efficient, and this
 "side-effect" totally flies in the face of reason. If it were a Phobos
 module it would have been replaced long ago.

 For the moment I have to assume the behavior is "by design", so fair-warning
 to everyone.

 (Ben? Do you have a proper hash-table I can use please? If not, I'll do an
 FNV port; AA's be damned. Out! Damn spot! Out, I say!)

Jul 29 2004
parent reply Gold Dragon <dragonwing dragonu.net> writes:
I need to buy your damn book to see what you're talking about. Seems 
interesting but seeing as it is a Computer book it would probably cost 
an arm and a leg.
Jul 30 2004
parent reply "Matthew" <admin.hat stlsoft.dot.org> writes:
"Gold Dragon" <dragonwing dragonu.net> wrote in message
news:cectp4$2a75$2 digitaldaemon.com...
 I need to buy your damn book to see what you're talking about. Seems
 interesting but seeing as it is a Computer book it would probably cost
 an arm and a leg.

He he. I think it's going to be around US$45 (The cover thinks so, anyway: http://synesis.com.au/publishing/imperfect/cpp/impcpp_cover.png)
Jul 30 2004
parent reply Gold Dragon <dragonwing dragonu.net> writes:
 He he. I think it's going to be around US$45 (The cover thinks so, anyway:
 http://synesis.com.au/publishing/imperfect/cpp/impcpp_cover.png)

You plug your book well (damn you :) ) but $45 dollars is a little more than I'm willing to spend for a book that will most likely sit on my shelf until the second edition comes out and the errata has been fixed and added. That isn't to say I'm not going to buy it as there is a 90% chance I will.
Jul 30 2004
parent "Matthew" <admin.hat stlsoft.dot.org> writes:
"Gold Dragon" <dragonwing dragonu.net> wrote in message
news:cedbhm$2gom$1 digitaldaemon.com...
 He he. I think it's going to be around US$45 (The cover thinks so, anyway:
 http://synesis.com.au/publishing/imperfect/cpp/impcpp_cover.png)

You plug your book well (damn you :) ) but $45 dollars is a little more than I'm willing to spend for a book that will most likely sit on my shelf until the second edition comes out and the errata has been fixed and added. That isn't to say I'm not going to buy it as there is a 90% chance I will.

You can get freebies. All you've got to do is get a contract for writing with Addison-Wesley. LOL!
Jul 30 2004