digitalmars.D.dtl - associative-arrays ate my lunch (and my support)

Kris (57/57) Jul 26 2004 So, I've been happily thinking this whole time that associative arrays (...

Sean Kelly (10/22) Jul 26 2004 Not I. Though I'd still want to look at the D code that's doing this be...

Regan Heath (22/55) Jul 26 2004 The problem is, reference types like pointers, arrays, and classes can b...
Ben Hinkle (7/39) Jul 26 2004 The value put into the array is null (more generally the Type.init value...

Sean Kelly (4/19) Jul 26 2004 Oops. I misread Kris' post. Yes, an entry will be created as a side ef...
Kris (4/5) Jul 26 2004 so

Ben Hinkle (9/16) Jul 26 2004 The only reason I can think of is that it makes the code for AA's a litt...

Ben Hinkle (15/83) Jul 26 2004 I don't know about hash tables, but for STL maps I think the semantics a...

Kris (10/19) Jul 26 2004 Yes, I followed your post on that with interest. The problem I have is t...
Kris (42/59) Jul 27 2004 course,

Ben Hinkle (12/89) Jul 27 2004 Here's the dilemma: if you try to get a value that isn't in the hashmap ...

Regan Heath (14/119) Jul 27 2004 I think this is the best solution. In addition you can supply a method l...

Kris (13/18) Jul 27 2004 Good point Regan;

Regan Heath (25/46) Jul 27 2004 The only case I can think of is a large struct. In which case you should...

Kris (11/21) Jul 27 2004 1) Oh, I misread your initial post. Sorry. So, when you say "returns

Regan Heath (20/51) Jul 28 2004 No, no, no.. I'm just not making myself clear methinks.

Regan Heath (10/63) Jul 28 2004 by 'assigns' I mean _gets_ the value _from_ the aa and _assigns_ it to t...

Kris (9/86) Jul 28 2004 Yep - that's where the confusion was Regan. Without the 'out' declaratio...
Ben Hinkle (33/119) Jul 29 2004 I've added the following function to the collection of "array helper"

Regan Heath (8/130) Jul 29 2004 I'm not sure an index_type makes sense for an AA, it does for other

Ben Hinkle (6/21) Jul 28 2004 Seems reasonable to me - though the name "contains" doesn't imply anythi...

Regan Heath (10/41) Jul 28 2004 True, I think 'get' is the best name.

Kris (15/26) Jul 27 2004 That

Sam McCall (36/48) Jul 28 2004 Python throws an exception. Ruby will by default (Hash.new() or {})

Kris (9/20) Jul 27 2004 That

Regan Heath (10/38) Jul 28 2004 That is what I've been trying to suggest for the last 5 posts!

Gold Dragon (6/6) Jul 29 2004 What ever happened to using '==' to test if something exists? I don't

Regan Heath (9/15) Jul 29 2004 If you do that, how do you compare two AA's with each other?

Gold Dragon (10/18) Jul 30 2004 I would use it for both as I do in other languages... except Java but

Juanjo =?ISO-8859-15?Q?=C1lvarez?= (17/24) Jul 29 2004 You don't, that's exactly what Python does when you try to access and

Matthew (18/75) Jul 29 2004 Basically, I've found the STL-way of doing things - which is what D's AA...

Gold Dragon (3/3) Jul 30 2004 I need to buy your damn book to see what you're talking about. Seems

Matthew (3/6) Jul 30 2004 He he. I think it's going to be around US$45 (The cover thinks so, anywa...

Gold Dragon (5/7) Jul 30 2004 You plug your book well (damn you :) ) but $45 dollars is a little more

Matthew (2/9) Jul 30 2004 You can get freebies. All you've got to do is get a contract for writing...

"Kris" <someidiot earthlink.dot.dot.dot.net> writes:

So, I've been happily thinking this whole time that associative arrays (AA)
in D are rather cool; being built into the language and so on ... then today
I spent several hours tracking down what seemed to be a truly bizarre bug,
which turned out to be caused by the way D associative arrays operate.
Here's what happened:

I want to check if there's an entry in the AA; use it if so, otherwise
continue. The documentation say to use this construct for checking:

class X
{
     MyClass [char[]] aa;

    void test ()
    {
         if ("hello" in aa)
             ....
     }
}

and then (presumably) you have to subsequently retrieve the entry as
follows:

    void test ()
    {
         if ("hello" in aa)
            {
            MyClass x = aa ["hello"];
            // do something with x ...
            }
     }

Of course, you've just performed two full lookups instead of one: first you
look to see if the entry exists, then you lookup again to retrieve it. Being
the freak that I am, I don't want that overhead. So ... I proceeded with
this instead:

    void test ()
    {
         MyClass x = aa ["hello"];

         if (x)
            {
            // do something with x ...
            }
     }

Much better! Right?

Well, apparently not. Now clearly I did not assign anything to the AA in the
above code, but an entry is created regardless. That is, AA entries are
created as a /side-effect/ of doing that lookup. Now, I've used many
hash-table implementations before, but this behavior is truly a first.
Hands-up please, all those who would have expected this ...

This sneaky behavior really wasted a whole lot of effort (I can almost hear
Walter laughing in the background :-)

That aside, one is forced to do a double lookup in such cases. This is not
good at all, and I simply cannot fathom what benefit there is to such an
approach. In my particular application, the double-lookup (and, of course,
double hash-computation) is entirely unacceptable. Lookup is the most common
operation on a hash-table, by far. It has to be efficient, and this
"side-effect" totally flies in the face of reason. If it were a Phobos
module it would have been replaced long ago.

For the moment I have to assume the behavior is "by design", so fair-warning
to everyone.

(Ben? Do you have a proper hash-table I can use please? If not, I'll do an
FNV port; AA's be damned. Out! Damn spot! Out, I say!)

Jul 26 2004

Sean Kelly <sean f4.ca> writes:

In article <ce3tul$1h6o$1 digitaldaemon.com>, Kris says...
Well, apparently not. Now clearly I did not assign anything to the AA in the
above code, but an entry is created regardless. That is, AA entries are
created as a /side-effect/ of doing that lookup. Now, I've used many
hash-table implementations before, but this behavior is truly a first.
Hands-up please, all those who would have expected this ...

Not I.  Though I'd still want to look at the D code that's doing this because I
still don't quite believe it.  What if MyClass is not default constructible?

That aside, one is forced to do a double lookup in such cases. This is not
good at all, and I simply cannot fathom what benefit there is to such an
approach. In my particular application, the double-lookup (and, of course,
double hash-computation) is entirely unacceptable. Lookup is the most common
operation on a hash-table, by far. It has to be efficient, and this
"side-effect" totally flies in the face of reason. If it were a Phobos
module it would have been replaced long ago.

So how would you propose a built in associative array handle both class and
primitive types?  Obviously "in" can return a bit.  But what does the subscript
operator return for failed lookups on primitive types?  I would likely return
Type.init in all cases.  Besides, this is a hashtable we're talking about, not a
btree.  A double lookup might not be ideal but it's still far less expensive
than a double tree traversal.


Sean

Jul 26 2004

Regan Heath <regan netwin.co.nz> writes:

On Mon, 26 Jul 2004 22:44:23 +0000 (UTC), Sean Kelly <sean f4.ca> wrote:

 In article <ce3tul$1h6o$1 digitaldaemon.com>, Kris says...
 Well, apparently not. Now clearly I did not assign anything to the AA 
 in the
 above code, but an entry is created regardless. That is, AA entries are
 created as a /side-effect/ of doing that lookup. Now, I've used many
 hash-table implementations before, but this behavior is truly a first.
 Hands-up please, all those who would have expected this ...

 Not I.  Though I'd still want to look at the D code that's doing this 
 because I
 still don't quite believe it.  What if MyClass is not default 
 constructible?

 That aside, one is forced to do a double lookup in such cases. This is 
 not
 good at all, and I simply cannot fathom what benefit there is to such an
 approach. In my particular application, the double-lookup (and, of 
 course,
 double hash-computation) is entirely unacceptable. Lookup is the most 
 common
 operation on a hash-table, by far. It has to be efficient, and this
 "side-effect" totally flies in the face of reason. If it were a Phobos
 module it would have been replaced long ago.

 So how would you propose a built in associative array handle both class 
 and
 primitive types?  Obviously "in" can return a bit.  But what does the 
 subscript
 operator return for failed lookups on primitive types?  I would likely 
 return
 Type.init in all cases.  Besides, this is a hashtable we're talking 
 about, not a
 btree.  A double lookup might not be ideal but it's still far less 
 expensive
 than a double tree traversal.

The problem is, reference types like pointers, arrays, and classes can be 
null, value types like int, float, structs etc cannot, yet you want some 
what of saying "does not exist" for value types.

One solution is to wrap them in a reference type, i.e. a class, but this 
IMO is un-necessary baggage, instead you want to be able to call a method 
on the AA, something like:

   int[char[]] foo;
   int value;

   if (foo.find("label",value)) {
   }

where find looks something like

bool find(K key, out V value) {
   ..lookup key, assign to value..
   ..return true on success..
   ..else false..
}

This allows you to check whether it exists (the return value) and get it's 
value (inout value) all at once for both reference and value types.

Regan.

-- 
Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/

Jul 26 2004

Ben Hinkle <bhinkle4 juno.com> writes:

Sean Kelly wrote:

 In article <ce3tul$1h6o$1 digitaldaemon.com>, Kris says...
Well, apparently not. Now clearly I did not assign anything to the AA in
the above code, but an entry is created regardless. That is, AA entries
are created as a /side-effect/ of doing that lookup. Now, I've used many
hash-table implementations before, but this behavior is truly a first.
Hands-up please, all those who would have expected this ...

 
 Not I.  Though I'd still want to look at the D code that's doing this
 because I
 still don't quite believe it.  What if MyClass is not default
 constructible?

The value put into the array is null (more generally the Type.init value) so
no constructor needs to be called.

That aside, one is forced to do a double lookup in such cases. This is not
good at all, and I simply cannot fathom what benefit there is to such an
approach. In my particular application, the double-lookup (and, of course,
double hash-computation) is entirely unacceptable. Lookup is the most
common operation on a hash-table, by far. It has to be efficient, and this
"side-effect" totally flies in the face of reason. If it were a Phobos
module it would have been replaced long ago.

 
 So how would you propose a built in associative array handle both class
 and
 primitive types?  Obviously "in" can return a bit.  But what does the
 subscript
 operator return for failed lookups on primitive types?  I would likely
 return
 Type.init in all cases.  Besides, this is a hashtable we're talking about,
 not a
 btree.  A double lookup might not be ideal but it's still far less
 expensive than a double tree traversal.

The mintl.map implementation has a private function that performs a lookup
and if it fails returns in an inout parameter the parent node for the add
call. This is tough to do without exposing some datatype internals so it
isn't public. 

 
 Sean

Jul 26 2004

Sean Kelly <sean f4.ca> writes:

In article <ce43vb$1j7m$1 digitaldaemon.com>, Ben Hinkle says...
Sean Kelly wrote:

 In article <ce3tul$1h6o$1 digitaldaemon.com>, Kris says...
Well, apparently not. Now clearly I did not assign anything to the AA in
the above code, but an entry is created regardless. That is, AA entries
are created as a /side-effect/ of doing that lookup. Now, I've used many
hash-table implementations before, but this behavior is truly a first.
Hands-up please, all those who would have expected this ...

 
 Not I.  Though I'd still want to look at the D code that's doing this
 because I
 still don't quite believe it.  What if MyClass is not default
 constructible?

The value put into the array is null (more generally the Type.init value) so
no constructor needs to be called.

Oops.  I misread Kris' post.  Yes, an entry will be created as a side effect but
a lookup should return null for class value types.


Sean

Jul 26 2004

"Kris" <someidiot earthlink.dot.dot.dot.net> writes:

"Ben Hinkle"  wrote ...
 The value put into the array is null (more generally the Type.init value)

so

Why does it add an entry when used as an rValue? Any ideas Ben?

- Kris

Jul 26 2004

Ben Hinkle <bhinkle4 juno.com> writes:

Kris wrote:

 "Ben Hinkle"  wrote ...
 The value put into the array is null (more generally the Type.init value)

 so
 
 Why does it add an entry when used as an rValue? Any ideas Ben?
 
 - Kris

The only reason I can think of is that it makes the code for AA's a little
easier since the routine that gets a node (_aaGet in
src/phobos/internal/aaA.d) returns a pointer to the value slot and so if
the context is rvalue it assigns through the pointer and if it is lvalue it
dereferences the pointer. Having the get routine know about rvalue vs
lvalue would probably mean the context has to be passed to it or there
would be two versions of get: _aaGetRValue and _aaGetLValue.

-Ben

Jul 26 2004

Ben Hinkle <bhinkle4 juno.com> writes:

Kris wrote:

 So, I've been happily thinking this whole time that associative arrays
 (AA) in D are rather cool; being built into the language and so on ...
 then today I spent several hours tracking down what seemed to be a truly
 bizarre bug, which turned out to be caused by the way D associative arrays
 operate. Here's what happened:
 
 I want to check if there's an entry in the AA; use it if so, otherwise
 continue. The documentation say to use this construct for checking:
 
 class X
 {
      MyClass [char[]] aa;
 
     void test ()
     {
          if ("hello" in aa)
              ....
      }
 }
 
 and then (presumably) you have to subsequently retrieve the entry as
 follows:
 
     void test ()
     {
          if ("hello" in aa)
             {
             MyClass x = aa ["hello"];
             // do something with x ...
             }
      }
 
 Of course, you've just performed two full lookups instead of one: first
 you look to see if the entry exists, then you lookup again to retrieve it.
 Being the freak that I am, I don't want that overhead. So ... I proceeded
 with this instead:
 
     void test ()
     {
          MyClass x = aa ["hello"];
 
          if (x)
             {
             // do something with x ...
             }
      }
 
 Much better! Right?
 
 Well, apparently not. Now clearly I did not assign anything to the AA in
 the above code, but an entry is created regardless. That is, AA entries
 are created as a /side-effect/ of doing that lookup. Now, I've used many
 hash-table implementations before, but this behavior is truly a first.
 Hands-up please, all those who would have expected this ...

I don't know about hash tables, but for STL maps I think the semantics are
the same. The operator[] will insert if the key isn't there. For STL this
makes sense since it must return a reference to the value. Since D returns
the value without a reference it doesn't have to insert it into the table.
So it does seem surprising that it also inserts. I can live with either
way, though, since the value returned is Type.init.

 This sneaky behavior really wasted a whole lot of effort (I can almost
 hear Walter laughing in the background :-)
 
 That aside, one is forced to do a double lookup in such cases. This is not
 good at all, and I simply cannot fathom what benefit there is to such an
 approach. In my particular application, the double-lookup (and, of course,
 double hash-computation) is entirely unacceptable. Lookup is the most
 common operation on a hash-table, by far. It has to be efficient, and this
 "side-effect" totally flies in the face of reason. If it were a Phobos
 module it would have been replaced long ago.

Hash tables are O(1) at least. I tried many variations on D's AA data
structure to try to speed it up across the board and couldn't get anything
that performed as well for a range of applications. Things I tried mostly
were around preallocating and changing the string hash function. There are
undoubtably faster implementations for specific needs but all around I'd
say the AA in D is as good a work-horse as one can expect. 

 For the moment I have to assume the behavior is "by design", so
 fair-warning to everyone.

It is subtle if you are expecting something else.

 (Ben? Do you have a proper hash-table I can use please? If not, I'll do an
 FNV port; AA's be damned. Out! Damn spot! Out, I say!)

I don't have another hash-table in mind. What is FNV? "frickin no vay"? ;-)

Jul 26 2004

"Kris" <someidiot earthlink.dot.dot.dot.net> writes:

"Ben Hinkle" <bhinkle4 juno.com> wrote in message
news:ce4gf5$1nbl$1 digitaldaemon.com...
 Hash tables are O(1) at least. I tried many variations on D's AA data
 structure to try to speed it up across the board and couldn't get anything
 that performed as well for a range of applications. Things I tried mostly
 were around preallocating and changing the string hash function. There are
 undoubtably faster implementations for specific needs but all around I'd
 say the AA in D is as good a work-horse as one can expect.

Yes, I followed your post on that with interest. The problem I have is the
2*O(1) aspect. If I'm using hash strings of any reasonable size, that adds
up pretty quickly. In this case, the keys can be long.

 (Ben? Do you have a proper hash-table I can use please? If not, I'll do


an
 FNV port; AA's be damned. Out! Damn spot! Out, I say!)

 I don't have another hash-table in mind. What is FNV? "frickin no vay"?

;-)

It's worse than that <g>  http://www.isthe.com/chongo/tech/comp/fnv/

Do you think Walter would add my port of this to Phobos?
(chortle! chortle!)

Jul 26 2004

"Kris" <someidiot earthlink.dot.dot.dot.net> writes:

"Ben Hinkle" wrote ...
 Kris wrote:
 Well, apparently not. Now clearly I did not assign anything to the AA in
 the above code, but an entry is created regardless. That is, AA entries
 are created as a /side-effect/ of doing that lookup. Now, I've used many
 hash-table implementations before, but this behavior is truly a first.
 Hands-up please, all those who would have expected this ...

 That aside, one is forced to do a double lookup in such cases. This is


not
 good at all, and I simply cannot fathom what benefit there is to such an
 approach. In my particular application, the double-lookup (and, of


course,
 double hash-computation) is entirely unacceptable. Lookup is the most
 common operation on a hash-table, by far. It has to be efficient, and


this
 "side-effect" totally flies in the face of reason. If it were a Phobos
 module it would have been replaced long ago.

 For the moment I have to assume the behavior is "by design", so
 fair-warning to everyone.

 It is subtle if you are expecting something else.

I'd like to just ignore this and go on, but the semantics here are just too
badly broken to serve as an example of what containers should do. Since this
is the unofficial "container zone", I ought to say something ...

To recap; referencing an associative array as an rValue modifies the array
itself. That is

MyClass [char[]]  aa;

MyClass x = aa ["key"];
if (x)
    ...

actually /adds/ a null entry to the array. Instead of having 0 entries in
the AA, you now have 1. And all you did was nonchalantly read the thing.

If you question the sanity of such side-effects, you'd be well within your
rights. For example, this is akin to saying "if I read from a byte in
memory, then I expect something else to be modified". Or "if I open a file
and read from it, I expect the file length to change". Would you be content
with that? I would hope not. Talk about confusing!

This really appears to defy the very grain of fundamental read/write
assumptions. Sure, there are some rather complex and domain-specific
scenarios whereby you'd /perhaps/ consider doing something similar regarding
side-effects, but this AA stuff is a fundamental language type, built into
the language itself ... it's only a hashmap, so there's just no excuse for
such, uhhh, vagaries. If you're not convinced, please ask yourself if you'd
expect this code to modify the array itself:

static const char[] ca = "can't touch this";

char c = ca[0];

I haven't tried it, but I would certainly hope the array isn't modified!
Note that the basic semantics are identical to the AA case, and a major
factor for building AA into the language was such that it would be just an
array "extension". I'd call it a "stretch" instead <g>

So, the reason for writing this is twofold:

1) being the "container zone", I beg and plead with all container writers to
avoid this sorry approach at all cost. After all, even if you don't agree
with the lunacy aspect, you can't deny the fact that a value-lookup takes
twice as long as it should (first check for existence, then perform a get).

2) containers should follow the principle of "least surprise" in all
aspects. Please consider that.

Maybe a third reason would be to try and get the AA semantics changed ...
Naaaaaa .... that would be asking for waaaay too much.

Jul 27 2004

Ben Hinkle <bhinkle4 juno.com> writes:

Kris wrote:

 "Ben Hinkle" wrote ...
 Kris wrote:
 Well, apparently not. Now clearly I did not assign anything to the AA
 in the above code, but an entry is created regardless. That is, AA
 entries are created as a /side-effect/ of doing that lookup. Now, I've
 used many hash-table implementations before, but this behavior is truly
 a first. Hands-up please, all those who would have expected this ...

 That aside, one is forced to do a double lookup in such cases. This is


 not
 good at all, and I simply cannot fathom what benefit there is to such
 an approach. In my particular application, the double-lookup (and, of


 course,
 double hash-computation) is entirely unacceptable. Lookup is the most
 common operation on a hash-table, by far. It has to be efficient, and


 this
 "side-effect" totally flies in the face of reason. If it were a Phobos
 module it would have been replaced long ago.

 For the moment I have to assume the behavior is "by design", so
 fair-warning to everyone.

 It is subtle if you are expecting something else.

 
 I'd like to just ignore this and go on, but the semantics here are just
 too badly broken to serve as an example of what containers should do.
 Since this is the unofficial "container zone", I ought to say something
 ...
 
 To recap; referencing an associative array as an rValue modifies the array
 itself. That is
 
 MyClass [char[]]  aa;
 
 MyClass x = aa ["key"];
 if (x)
     ...
 
 actually /adds/ a null entry to the array. Instead of having 0 entries in
 the AA, you now have 1. And all you did was nonchalantly read the thing.
 
 If you question the sanity of such side-effects, you'd be well within your
 rights. For example, this is akin to saying "if I read from a byte in
 memory, then I expect something else to be modified". Or "if I open a file
 and read from it, I expect the file length to change". Would you be
 content with that? I would hope not. Talk about confusing!
 
 This really appears to defy the very grain of fundamental read/write
 assumptions. Sure, there are some rather complex and domain-specific
 scenarios whereby you'd /perhaps/ consider doing something similar
 regarding side-effects, but this AA stuff is a fundamental language type,
 built into the language itself ... it's only a hashmap, so there's just no
 excuse for such, uhhh, vagaries. If you're not convinced, please ask
 yourself if you'd expect this code to modify the array itself:
 
 static const char[] ca = "can't touch this";
 
 char c = ca[0];
 
 I haven't tried it, but I would certainly hope the array isn't modified!
 Note that the basic semantics are identical to the AA case, and a major
 factor for building AA into the language was such that it would be just an
 array "extension". I'd call it a "stretch" instead <g>
 
 So, the reason for writing this is twofold:
 
 1) being the "container zone", I beg and plead with all container writers
 to avoid this sorry approach at all cost. After all, even if you don't
 agree with the lunacy aspect, you can't deny the fact that a value-lookup
 takes twice as long as it should (first check for existence, then perform
 a get).
 
 2) containers should follow the principle of "least surprise" in all
 aspects. Please consider that.
 
 Maybe a third reason would be to try and get the AA semantics changed ...
 Naaaaaa .... that would be asking for waaaay too much.

Here's the dilemma: if you try to get a value that isn't in the hashmap then
what should be returned? No matter what you return the user can't tell if
the key is/was in the hashmap or not. So in some sense the right thing to
do is throw an exception when the user tries to evaluate aa["key"] and
"key" isn't in the map - this solution would be the most correct in terms
of principle of least astonishment since it is analogous to asking a
dynamic array for an index that is out of bounds (which throws an
exception). I'm half-tempted to serious propose that instead of some hack
about returning Type.init and maybe or maybe not adding it to the map. That
way both STL and Java users will get a surprise!

eh, maybe I need another glass of wine. :-)

Jul 27 2004

Regan Heath <regan netwin.co.nz> writes:

On Tue, 27 Jul 2004 22:49:39 -0400, Ben Hinkle <bhinkle4 juno.com> wrote:

 Kris wrote:

 "Ben Hinkle" wrote ...
 Kris wrote:
 Well, apparently not. Now clearly I did not assign anything to the AA
 in the above code, but an entry is created regardless. That is, AA
 entries are created as a /side-effect/ of doing that lookup. Now, 

 I've
 used many hash-table implementations before, but this behavior is 

 truly
 a first. Hands-up please, all those who would have expected this ...

 That aside, one is forced to do a double lookup in such cases. This 

 is

 not
 good at all, and I simply cannot fathom what benefit there is to such
 an approach. In my particular application, the double-lookup (and, of


 course,
 double hash-computation) is entirely unacceptable. Lookup is the most
 common operation on a hash-table, by far. It has to be efficient, and


 this
 "side-effect" totally flies in the face of reason. If it were a 

 Phobos
 module it would have been replaced long ago.

 For the moment I have to assume the behavior is "by design", so
 fair-warning to everyone.

 It is subtle if you are expecting something else.

 I'd like to just ignore this and go on, but the semantics here are just
 too badly broken to serve as an example of what containers should do.
 Since this is the unofficial "container zone", I ought to say something
 ...

 To recap; referencing an associative array as an rValue modifies the 
 array
 itself. That is

 MyClass [char[]]  aa;

 MyClass x = aa ["key"];
 if (x)
     ...

 actually /adds/ a null entry to the array. Instead of having 0 entries 
 in
 the AA, you now have 1. And all you did was nonchalantly read the thing.

 If you question the sanity of such side-effects, you'd be well within 
 your
 rights. For example, this is akin to saying "if I read from a byte in
 memory, then I expect something else to be modified". Or "if I open a 
 file
 and read from it, I expect the file length to change". Would you be
 content with that? I would hope not. Talk about confusing!

 This really appears to defy the very grain of fundamental read/write
 assumptions. Sure, there are some rather complex and domain-specific
 scenarios whereby you'd /perhaps/ consider doing something similar
 regarding side-effects, but this AA stuff is a fundamental language 
 type,
 built into the language itself ... it's only a hashmap, so there's just 
 no
 excuse for such, uhhh, vagaries. If you're not convinced, please ask
 yourself if you'd expect this code to modify the array itself:

 static const char[] ca = "can't touch this";

 char c = ca[0];

 I haven't tried it, but I would certainly hope the array isn't modified!
 Note that the basic semantics are identical to the AA case, and a major
 factor for building AA into the language was such that it would be just 
 an
 array "extension". I'd call it a "stretch" instead <g>

 So, the reason for writing this is twofold:

 1) being the "container zone", I beg and plead with all container 
 writers
 to avoid this sorry approach at all cost. After all, even if you don't
 agree with the lunacy aspect, you can't deny the fact that a 
 value-lookup
 takes twice as long as it should (first check for existence, then 
 perform
 a get).

 2) containers should follow the principle of "least surprise" in all
 aspects. Please consider that.

 Maybe a third reason would be to try and get the AA semantics changed 
 ...
 Naaaaaa .... that would be asking for waaaay too much.

 Here's the dilemma: if you try to get a value that isn't in the hashmap 
 then
 what should be returned? No matter what you return the user can't tell if
 the key is/was in the hashmap or not. So in some sense the right thing to
 do is throw an exception when the user tries to evaluate aa["key"] and
 "key" isn't in the map - this solution would be the most correct in terms
 of principle of least astonishment since it is analogous to asking a
 dynamic array for an index that is out of bounds (which throws an
 exception). I'm half-tempted to serious propose that instead of some hack
 about returning Type.init and maybe or maybe not adding it to the map.

I think this is the best solution. In addition you can supply a method like

if (aa.contains("key",value))

which returns true/false and assigns value if found. So those who do not 
want to catch an exception can call that one instead. Personally I think 
I'd always want to use the contains method instead of aa["key"].

If you dont like the name contains you can use anything else i.e. find, 
get, lookup

 That way both STL and Java users will get a surprise!

At least it's an immediate surprise and not one you don't realise until 
later, as it was for Kris with the current behaviour.

 eh, maybe I need another glass of wine. :-)

Don't we all..

Regan

-- 
Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/

Jul 27 2004

"Kris" <someidiot earthlink.dot.dot.dot.net> writes:

"Regan Heath"  wrote ..
 I think this is the best solution. In addition you can supply a method

like
 if (aa.contains("key",value))

 which returns true/false and assigns value if found. So those who do not
 want to catch an exception can call that one instead. Personally I think
 I'd always want to use the contains method instead of aa["key"].

Good point Regan;

Unfortunately, the "value" may be expensive to construct. If so, then you
may end up constructing it for nothing (because the AA already has an
instance). A mutating method/property such as this might get away with a
delegate instead, to create the missing "value". Might call it
createWhereMissing(), or something equally hideous <g>

For a large percentage of cases, I think one can get away with excluding
null as a valid "value", and returning that for failed queries/lookups.
There again, the latter wouldn't necessarily work for all AA types (though
it would for many other containers).

Here's to that glass that Ben was suggesting!

Jul 27 2004

Regan Heath <regan netwin.co.nz> writes:

On Tue, 27 Jul 2004 21:15:44 -0700, Kris 
<someidiot earthlink.dot.dot.dot.net> wrote:

 "Regan Heath"  wrote ..
 I think this is the best solution. In addition you can supply a method

 like
 if (aa.contains("key",value))

 which returns true/false and assigns value if found. So those who do not
 want to catch an exception can call that one instead. Personally I think
 I'd always want to use the contains method instead of aa["key"].

 Good point Regan;

 Unfortunately, the "value" may be expensive to construct. If so, then you
 may end up constructing it for nothing (because the AA already has an
 instance).

The only case I can think of is a large struct. In which case you should 
probably be storing pointers to them in the AA. eg.

struct Foo {
   ..lots..
}

Foo*[char[]] aa;
Foo* value;

if (aa.contains("regan",value)) ..

If you're not storing pointers then it's going to be slow/expensive to 
simply add/assign one to the aa in the first place.

 A mutating method/property such as this might get away with a
 delegate instead, to create the missing "value". Might call it
 createWhereMissing(), or something equally hideous <g>

In the instance where you want to look for something _and_ add it if it 
does not exist I believe a different 'find' function should be used, one 
that returns the insertion location eg.

if (aa.contains("regan",at)) ..value exists..
else {
   aa.add("regan",value,at);  //add value at 'at'
}

is a better solution.

 For a large percentage of cases, I think one can get away with excluding
 null as a valid "value", and returning that for failed queries/lookups.

Perhaps, but the fact that it does not handle basic types (without a 
wrapper class) makes it a poor soln IMO.

 There again, the latter wouldn't necessarily work for all AA types 
 (though it would for many other containers).
 Here's to that glass that Ben was suggesting!

Regan.

-- 
Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/

Jul 27 2004

"Kris" <someidiot earthlink.dot.dot.dot.net> writes:

1) Oh, I misread your initial post. Sorry. So, when you say "returns
true/false and assigns a value if found" I guess you mean this is a kind of
"replace" function instead? And returns whether the operation took place? If
so, then that's very useful. Still leaves the inverse scenario open though,
where you want to retrieve the value of something from the container without
mutating (egad!) at all: a very common case. With the current syntax you
have to check first, then do the retrieval: two lookups.

2) The problem with those "insertion location" style APIs is one of
atomicity ... if something throws an exception in-between, then all bets are
off. Definately works well for some environments though.


"Regan Heath"  wrote...

 The only case I can think of is a large struct. In which case you should
 probably be storing pointers to them in the AA. eg.

 struct Foo {
    ..lots..
 }

 Foo*[char[]] aa;
 Foo* value;

 if (aa.contains("regan",value)) ..

 If you're not storing pointers then it's going to be slow/expensive to
 simply add/assign one to the aa in the first place.

Jul 27 2004

Regan Heath <regan netwin.co.nz> writes:

On Tue, 27 Jul 2004 22:03:17 -0700, Kris 
<someidiot earthlink.dot.dot.dot.net> wrote:
 1) Oh, I misread your initial post. Sorry. So, when you say "returns
 true/false and assigns a value if found" I guess you mean this is a kind 
 of
 "replace" function instead?

No, no, no.. I'm just not making myself clear methinks.

Two different functions, one:

   bool contains(key_type key, value_type value);

looks for key, returns true/false (existance of the value) and assigns 
value on true.

The other:

   bool contains(key_type key, index_type at);

looks for key, returns true/false (existance of the value) and gives an 
index to the value which you can use to:
a) retrieve the value (if it exists)
b) replace the value with a new one (if it exists)
c) insert a value (if it does not exist)

 And returns whether the operation took place? If
 so, then that's very useful. Still leaves the inverse scenario open 
 though,
 where you want to retrieve the value of something from the container 
 without
 mutating (egad!) at all: a very common case. With the current syntax you
 have to check first, then do the retrieval: two lookups.

Not using my idea above :)

 2) The problem with those "insertion location" style APIs is one of
 atomicity ... if something throws an exception in-between, then all bets 
 are
 off. Definately works well for some environments though.

I realise this, I don't think it should ever mutate on read/lookup 
operations.

Regan

 "Regan Heath"  wrote...

 The only case I can think of is a large struct. In which case you should
 probably be storing pointers to them in the AA. eg.

 struct Foo {
    ..lots..
 }

 Foo*[char[]] aa;
 Foo* value;

 if (aa.contains("regan",value)) ..

 If you're not storing pointers then it's going to be slow/expensive to
 simply add/assign one to the aa in the first place.




-- 
Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/

Jul 28 2004

Regan Heath <regan netwin.co.nz> writes:

On Thu, 29 Jul 2004 09:26:03 +1200, Regan Heath <regan netwin.co.nz> wrote:

 On Tue, 27 Jul 2004 22:03:17 -0700, Kris 
 <someidiot earthlink.dot.dot.dot.net> wrote:
 1) Oh, I misread your initial post. Sorry. So, when you say "returns
 true/false and assigns a value if found" I guess you mean this is a 
 kind of
 "replace" function instead?

 No, no, no.. I'm just not making myself clear methinks.

 Two different functions, one:

    bool contains(key_type key, value_type value);

 looks for key, returns true/false (existance of the value) and assigns 
 value on true.

by 'assigns' I mean _gets_ the value _from_ the aa and _assigns_ it to the 
out variable 'value', this is a 'get a value' function, not a 'set a 
value' function. It never mutates the aa. My function definition is 
missing the 'out' here:

bool contains(key_type key, out value_type value);

 The other:

    bool contains(key_type key, index_type at);

This should be:

bool contains(key_type key, out index_type at);

 looks for key, returns true/false (existance of the value) and gives an 
 index to the value which you can use to:
 a) retrieve the value (if it exists)
 b) replace the value with a new one (if it exists)
 c) insert a value (if it does not exist)

 And returns whether the operation took place? If
 so, then that's very useful. Still leaves the inverse scenario open 
 though,
 where you want to retrieve the value of something from the container 
 without
 mutating (egad!) at all: a very common case. With the current syntax you
 have to check first, then do the retrieval: two lookups.

 Not using my idea above :)

 2) The problem with those "insertion location" style APIs is one of
 atomicity ... if something throws an exception in-between, then all 
 bets are
 off. Definately works well for some environments though.

 I realise this, I don't think it should ever mutate on read/lookup 
 operations.

 Regan

 "Regan Heath"  wrote...

 The only case I can think of is a large struct. In which case you 
 should
 probably be storing pointers to them in the AA. eg.

 struct Foo {
    ..lots..
 }

 Foo*[char[]] aa;
 Foo* value;

 if (aa.contains("regan",value)) ..

 If you're not storing pointers then it's going to be slow/expensive to
 simply add/assign one to the aa in the first place.





-- 
Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/

Jul 28 2004

"Kris" <someidiot earthlink.dot.dot.dot.net> writes:

Yep - that's where the confusion was Regan. Without the 'out' declaration, I
was reading your "assignment" phrase as assigning /into/ the AA instead.
Sorry about that: I clearly had far too much wine at the time :-)

And yes; that is a much better way to do than the strategy adopted by AA.

- Kris


"Regan Heath" <regan netwin.co.nz> wrote in message
news:opsbu95dvy5a2sq9 digitalmars.com...
 On Thu, 29 Jul 2004 09:26:03 +1200, Regan Heath <regan netwin.co.nz>

wrote:
 On Tue, 27 Jul 2004 22:03:17 -0700, Kris
 <someidiot earthlink.dot.dot.dot.net> wrote:
 1) Oh, I misread your initial post. Sorry. So, when you say "returns
 true/false and assigns a value if found" I guess you mean this is a
 kind of
 "replace" function instead?

 No, no, no.. I'm just not making myself clear methinks.

 Two different functions, one:

    bool contains(key_type key, value_type value);

 looks for key, returns true/false (existance of the value) and assigns
 value on true.

 by 'assigns' I mean _gets_ the value _from_ the aa and _assigns_ it to the
 out variable 'value', this is a 'get a value' function, not a 'set a
 value' function. It never mutates the aa. My function definition is
 missing the 'out' here:

 bool contains(key_type key, out value_type value);

 The other:

    bool contains(key_type key, index_type at);

 This should be:

 bool contains(key_type key, out index_type at);

 looks for key, returns true/false (existance of the value) and gives an
 index to the value which you can use to:
 a) retrieve the value (if it exists)
 b) replace the value with a new one (if it exists)
 c) insert a value (if it does not exist)

 And returns whether the operation took place? If
 so, then that's very useful. Still leaves the inverse scenario open
 though,
 where you want to retrieve the value of something from the container
 without
 mutating (egad!) at all: a very common case. With the current syntax



you
 have to check first, then do the retrieval: two lookups.

 Not using my idea above :)

 2) The problem with those "insertion location" style APIs is one of
 atomicity ... if something throws an exception in-between, then all
 bets are
 off. Definately works well for some environments though.

 I realise this, I don't think it should ever mutate on read/lookup
 operations.

 Regan

 "Regan Heath"  wrote...

 The only case I can think of is a large struct. In which case you
 should
 probably be storing pointers to them in the AA. eg.

 struct Foo {
    ..lots..
 }

 Foo*[char[]] aa;
 Foo* value;

 if (aa.contains("regan",value)) ..

 If you're not storing pointers then it's going to be slow/expensive to
 simply add/assign one to the aa in the first place.





 --
 Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/

Jul 28 2004

Ben Hinkle <bhinkle4 juno.com> writes:

I've added the following function to the collection of "array helper"
functions in mintl/array.d:

/** Query if an associative array contains a given key and if
 * so set the output value and return true and otherwise return
 * false without adding the key to the map or setting the value 
 * parameter. This mechanism is faster than making two lookups:
 * one to see if the key is in the map and another to get the
 * value. Compiler-dependent.
 * \param x the array to query
 * \param key the key to look up
 * \param value the output value if any
 * \return true if the key is in the map and false otherwise
 */
template contains(K,V) {
  bool contains(V[K] x, K key, inout V value) {
    ...
  }
}
unittest {
  double[int] c;
  c[100] = 1.1;
  c[300] = 2.2;
  c[-100] = 3.3;
  double v;
  assert( contains!(int,double)(c,-100,v) );
  assert( v == 3.3 );
  assert( !contains!(int,double)(c,200,v) );
  assert( !(200 in c) );
}

I haven't added the second version of "contains" since the index_type
doesn't exist yet.
-Ben

Regan Heath wrote:

 On Thu, 29 Jul 2004 09:26:03 +1200, Regan Heath <regan netwin.co.nz>
 wrote:
 
 On Tue, 27 Jul 2004 22:03:17 -0700, Kris
 <someidiot earthlink.dot.dot.dot.net> wrote:
 1) Oh, I misread your initial post. Sorry. So, when you say "returns
 true/false and assigns a value if found" I guess you mean this is a
 kind of
 "replace" function instead?

 No, no, no.. I'm just not making myself clear methinks.

 Two different functions, one:

    bool contains(key_type key, value_type value);

 looks for key, returns true/false (existance of the value) and assigns
 value on true.

 
 by 'assigns' I mean _gets_ the value _from_ the aa and _assigns_ it to the
 out variable 'value', this is a 'get a value' function, not a 'set a
 value' function. It never mutates the aa. My function definition is
 missing the 'out' here:
 
 bool contains(key_type key, out value_type value);
 
 The other:

    bool contains(key_type key, index_type at);

 
 This should be:
 
 bool contains(key_type key, out index_type at);
 
 looks for key, returns true/false (existance of the value) and gives an
 index to the value which you can use to:
 a) retrieve the value (if it exists)
 b) replace the value with a new one (if it exists)
 c) insert a value (if it does not exist)

 And returns whether the operation took place? If
 so, then that's very useful. Still leaves the inverse scenario open
 though,
 where you want to retrieve the value of something from the container
 without
 mutating (egad!) at all: a very common case. With the current syntax you
 have to check first, then do the retrieval: two lookups.

 Not using my idea above :)

 2) The problem with those "insertion location" style APIs is one of
 atomicity ... if something throws an exception in-between, then all
 bets are
 off. Definately works well for some environments though.

 I realise this, I don't think it should ever mutate on read/lookup
 operations.

 Regan

 "Regan Heath"  wrote...

 The only case I can think of is a large struct. In which case you
 should
 probably be storing pointers to them in the AA. eg.

 struct Foo {
    ..lots..
 }

 Foo*[char[]] aa;
 Foo* value;

 if (aa.contains("regan",value)) ..

 If you're not storing pointers then it's going to be slow/expensive to
 simply add/assign one to the aa in the first place.

Jul 29 2004

Regan Heath <regan netwin.co.nz> writes:

On Thu, 29 Jul 2004 05:07:16 -0400, Ben Hinkle <bhinkle4 juno.com> wrote:

 I've added the following function to the collection of "array helper"
 functions in mintl/array.d:

 /** Query if an associative array contains a given key and if
  * so set the output value and return true and otherwise return
  * false without adding the key to the map or setting the value
  * parameter. This mechanism is faster than making two lookups:
  * one to see if the key is in the map and another to get the
  * value. Compiler-dependent.
  * \param x the array to query
  * \param key the key to look up
  * \param value the output value if any
  * \return true if the key is in the map and false otherwise
  */
 template contains(K,V) {
   bool contains(V[K] x, K key, inout V value) {
     ...
   }
 }
 unittest {
   double[int] c;
   c[100] = 1.1;
   c[300] = 2.2;
   c[-100] = 3.3;
   double v;
   assert( contains!(int,double)(c,-100,v) );
   assert( v == 3.3 );
   assert( !contains!(int,double)(c,200,v) );
   assert( !(200 in c) );
 }

Perfect :)

 I haven't added the second version of "contains" since the index_type
 doesn't exist yet.

I'm not sure an index_type makes sense for an AA, it does for other 
container types, but for an AA isn't the key the index type, or maybe the 
hashed value of the key is the index type?

Regan

 -Ben

 Regan Heath wrote:

 On Thu, 29 Jul 2004 09:26:03 +1200, Regan Heath <regan netwin.co.nz>
 wrote:

 On Tue, 27 Jul 2004 22:03:17 -0700, Kris
 <someidiot earthlink.dot.dot.dot.net> wrote:
 1) Oh, I misread your initial post. Sorry. So, when you say "returns
 true/false and assigns a value if found" I guess you mean this is a
 kind of
 "replace" function instead?

 No, no, no.. I'm just not making myself clear methinks.

 Two different functions, one:

    bool contains(key_type key, value_type value);

 looks for key, returns true/false (existance of the value) and assigns
 value on true.

 by 'assigns' I mean _gets_ the value _from_ the aa and _assigns_ it to 
 the
 out variable 'value', this is a 'get a value' function, not a 'set a
 value' function. It never mutates the aa. My function definition is
 missing the 'out' here:

 bool contains(key_type key, out value_type value);

 The other:

    bool contains(key_type key, index_type at);

 This should be:

 bool contains(key_type key, out index_type at);

 looks for key, returns true/false (existance of the value) and gives an
 index to the value which you can use to:
 a) retrieve the value (if it exists)
 b) replace the value with a new one (if it exists)
 c) insert a value (if it does not exist)

 And returns whether the operation took place? If
 so, then that's very useful. Still leaves the inverse scenario open
 though,
 where you want to retrieve the value of something from the container
 without
 mutating (egad!) at all: a very common case. With the current syntax 
 you
 have to check first, then do the retrieval: two lookups.

 Not using my idea above :)

 2) The problem with those "insertion location" style APIs is one of
 atomicity ... if something throws an exception in-between, then all
 bets are
 off. Definately works well for some environments though.

 I realise this, I don't think it should ever mutate on read/lookup
 operations.

 Regan

 "Regan Heath"  wrote...

 The only case I can think of is a large struct. In which case you
 should
 probably be storing pointers to them in the AA. eg.

 struct Foo {
    ..lots..
 }

 Foo*[char[]] aa;
 Foo* value;

 if (aa.contains("regan",value)) ..

 If you're not storing pointers then it's going to be slow/expensive 
 to
 simply add/assign one to the aa in the first place.







-- 
Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/

Jul 29 2004

"Ben Hinkle" <bhinkle mathworks.com> writes:

 I think this is the best solution. In addition you can supply a method

like
 if (aa.contains("key",value))

 which returns true/false and assigns value if found. So those who do not
 want to catch an exception can call that one instead. Personally I think
 I'd always want to use the contains method instead of aa["key"].

 If you dont like the name contains you can use anything else i.e. find,
 get, lookup

Seems reasonable to me - though the name "contains" doesn't imply anything
about setting values so maybe something like "trySet" would be better. Also
I could see a "tryGet" function that looks up the key and if present sets
the output variable and returns true and otherwise returns false.

 That way both STL and Java users will get a surprise!

 At least it's an immediate surprise and not one you don't realise until
 later, as it was for Kris with the current behaviour.

very true. subtle bugs can be very frustrating.

 eh, maybe I need another glass of wine. :-)

 Don't we all..

 Regan

 -- 
 Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/

Jul 28 2004

Regan Heath <regan netwin.co.nz> writes:

On Wed, 28 Jul 2004 10:17:57 -0400, Ben Hinkle <bhinkle mathworks.com> 
wrote:
 I think this is the best solution. In addition you can supply a method

 like
 if (aa.contains("key",value))

 which returns true/false and assigns value if found. So those who do not
 want to catch an exception can call that one instead. Personally I think
 I'd always want to use the contains method instead of aa["key"].

 If you dont like the name contains you can use anything else i.e. find,
 get, lookup

 Seems reasonable to me - though the name "contains" doesn't imply 
 anything
 about setting values

True, I think 'get' is the best name.

 so maybe something like "trySet" would be better.

I think we should avoid the word 'set' it implies to me we're going to add 
something to the aa, which we're not in this case.

 Also
 I could see a "tryGet" function that looks up the key and if present sets
 the output variable and returns true and otherwise returns false.

That is the function I am proposing.. it seems I have not been making 
myself very clear. :)

Regan

 That way both STL and Java users will get a surprise!

 At least it's an immediate surprise and not one you don't realise until
 later, as it was for Kris with the current behaviour.

 very true. subtle bugs can be very frustrating.

 eh, maybe I need another glass of wine. :-)

 Don't we all..

 Regan

 --
 Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/




-- 
Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/

Jul 28 2004

"Kris" <someidiot earthlink.dot.dot.dot.net> writes:

"Ben Hinkle"  wrote
 Here's the dilemma: if you try to get a value that isn't in the hashmap

then
 what should be returned? No matter what you return the user can't tell if
 the key is/was in the hashmap or not. So in some sense the right thing to
 do is throw an exception when the user tries to evaluate aa["key"] and
 "key" isn't in the map - this solution would be the most correct in terms
 of principle of least astonishment since it is analogous to asking a
 dynamic array for an index that is out of bounds (which throws an
 exception). I'm half-tempted to serious propose that instead of some hack
 about returning Type.init and maybe or maybe not adding it to the map.

That
 way both STL and Java users will get a surprise!

 eh, maybe I need another glass of wine. :-)

HaHa! Got a fairly reasonable Chateau De Armpit open here too <g>

Doesn't Python throw an exception like that? Or is it Ruby? I forget ...

One reasonable way out of the hole is to stipulate (as part of the API
contract) that a "value" of null is illegal. That way, a null return
signifies the entry does not exist. You can then happily do a get(), and
test the return. That gets around the double-lookup issue (for a
non-mutating lookup). A value-type of pointer, array or class would support
a null return. From recollection, this is what Java does.

Unfortunately, you can't do that for simple ints and chars (without
wrappers), so there's the rub.

OT: meant to ask if you or Mike were gonna' work on porting D.L's
ConcurrentHashMap for your container-set ... I'd pay money for that :-)

Jul 27 2004

Sam McCall <tunah tunah.net> writes:

Kris wrote:

 HaHa! Got a fairly reasonable Chateau De Armpit open here too <g>
 
 Doesn't Python throw an exception like that? Or is it Ruby? I forget ...

Python throws an exception. Ruby will by default (Hash.new() or {}) 
return nil (which is the best single behaviour, but not all types 
support it in D. If you care about existant keys with nil values, call 
contains()).
Hash.new(0) will create a hash that returns 0 for nonexistant keys. 
(Think sparse matrices...)
Hash.new { |hash,key|

}
Passes a the hash and the specified key and executes the code, and 
returns the result, if the key isn't in the hash.
For example:
Hash.new { |hash,key|
	hash[key]=foo(key)
}
creates a hash that's a caching wrapper of foo. (The value of the last 
expression is implicitly returned).

I like Ruby by the way :-)

 One reasonable way out of the hole is to stipulate (as part of the API
 contract) that a "value" of null is illegal. That way, a null return
 signifies the entry does not exist. You can then happily do a get(), and
 test the return. That gets around the double-lookup issue (for a
 non-mutating lookup). A value-type of pointer, array or class would support
 a null return. From recollection, this is what Java does.
 
 Unfortunately, you can't do that for simple ints and chars (without
 wrappers), so there's the rub.

Hmm, here's a D 3.0 feature (or D08, or whatever version is sufficiently 
complicated anyway that one more feature won't make a difference). I'm 
in a frivolous mood, so don't bother flaming me :-)

int? value=array["test"]
if(value?)
	printf("%d\n",value);

Type? would be something like
struct ?(T) {
	T value;
	bit present;
	/* implicit */ T opCast() { return value; }
	bit opQuery() { return present; }
}

Actually, I quite like the opQuery, or maybe foo? could be the "foo aint 
null" operator that's being sought.

Meh.

Sam

Jul 28 2004

"Kris" <someidiot earthlink.dot.dot.dot.net> writes:

"Ben Hinkle"  wrote ...
 Here's the dilemma: if you try to get a value that isn't in the hashmap

then
 what should be returned? No matter what you return the user can't tell if
 the key is/was in the hashmap or not. So in some sense the right thing to
 do is throw an exception when the user tries to evaluate aa["key"] and
 "key" isn't in the map - this solution would be the most correct in terms
 of principle of least astonishment since it is analogous to asking a
 dynamic array for an index that is out of bounds (which throws an
 exception). I'm half-tempted to serious propose that instead of some hack
 about returning Type.init and maybe or maybe not adding it to the map.

That
 way both STL and Java users will get a surprise!

 eh, maybe I need another glass of wine. :-)

Perhaps I've had too many recently ... 8~}

How about returning a simple true/false from a lookup instead? One can use
an 'out' argument for the "value" if there's one present, yes? Not the most
elegant solution, or perhaps the most efficient; but it's a whole lot better
than two lookups, or the dreaded (cue music ...) *mutating rValue* ...

bool get (K key, out V value);

Jul 27 2004

Regan Heath <regan netwin.co.nz> writes:

On Tue, 27 Jul 2004 22:40:02 -0700, Kris 
<someidiot earthlink.dot.dot.dot.net> wrote:
 "Ben Hinkle"  wrote ...
 Here's the dilemma: if you try to get a value that isn't in the hashmap

 then
 what should be returned? No matter what you return the user can't tell 
 if
 the key is/was in the hashmap or not. So in some sense the right thing 
 to
 do is throw an exception when the user tries to evaluate aa["key"] and
 "key" isn't in the map - this solution would be the most correct in 
 terms
 of principle of least astonishment since it is analogous to asking a
 dynamic array for an index that is out of bounds (which throws an
 exception). I'm half-tempted to serious propose that instead of some 
 hack
 about returning Type.init and maybe or maybe not adding it to the map.

 That
 way both STL and Java users will get a surprise!

 eh, maybe I need another glass of wine. :-)

 Perhaps I've had too many recently ... 8~}

 How about returning a simple true/false from a lookup instead? One can 
 use
 an 'out' argument for the "value" if there's one present, yes?

That is what I've been trying to suggest for the last 5 posts!

 Not the most
 elegant solution, or perhaps the most efficient; but it's a whole lot 
 better
 than two lookups, or the dreaded (cue music ...) *mutating rValue* ...

I think it's a very elegant solution, it handles all value types, and can 
be placed in an if statement like so:

if (aa.get("regan",value)) {
}

 bool get (K key, out V value);

Regan

-- 
Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/

Jul 28 2004

Gold Dragon <dragonwing dragonu.net> writes:

What ever happened to using '==' to test if something exists? I don't 
know, I guess I should leave the intelligent things to the smart people. 
When I see = then I assume that you want something to equal that value 
and not test for it. I always read in books that you use '==' instead of 
  '=' to test for something. Perhaps you can overload the '==' to see if 
something exists.

Jul 29 2004

Regan Heath <regan netwin.co.nz> writes:

On Thu, 29 Jul 2004 04:32:58 -0500, Gold Dragon <dragonwing dragonu.net> 
wrote:

 What ever happened to using '==' to test if something exists? I don't 
 know, I guess I should leave the intelligent things to the smart people. 
 When I see = then I assume that you want something to equal that value 
 and not test for it. I always read in books that you use '==' instead of 
   '=' to test for something. Perhaps you can overload the '==' to see if 
 something exists.

If you do that, how do you compare two AA's with each other?

I don't mean identity eg. '===' or 'is' checking they refer to the same 
AA, I mean compare the actual AA. I admit this is perhaps a rare 
requirement, but still.

Regan

-- 
Using M2, Opera's revolutionary e-mail client: http://www.opera.com/m2/

Jul 29 2004

Gold Dragon <dragonwing dragonu.net> writes:

Regan Heath wrote:
 If you do that, how do you compare two AA's with each other?
 
 I don't mean identity eg. '===' or 'is' checking they refer to the same 
 AA, I mean compare the actual AA. I admit this is perhaps a rare 
 requirement, but still.
 
 Regan
 

I would use it for both as I do in other languages... except Java but 
you can't overload operators in Java now can you?

I don't know, it seems that everyone is walking around in circles trying 
to find the way out. Maybe you can like, make a function or something 
and call it 'exists'. Perhaps I'm missing the point and don't understand 
fully what goes into Hashes but having used them from Perl, I can say 
that it seems kind of strange. I've also used AA from in PHP and this is 
what I would to test if there is a value. I would use a foreach loop to 
get the key, value pair.

Jul 30 2004

Juanjo =?ISO-8859-15?Q?=C1lvarez?= <juanjuxNO SPAMyahoo.es> writes:

Ben Hinkle wrote:

 eh, maybe I need another glass of wine. :-)

You don't, that's exactly what Python does when you try to access and
unexistent key on a dict:
-------------------------
 d = {}
 d["akey"] = "avalue"
 print d["akey"]



avalue
 print d["nokey"]



Traceback (most recent call last):
  File "<stdin>", line 1, in ?
KeyError: 'nokey'
-------------------------

So the usual idiom to access an array that could or not could have a key is:

if "akey" in d:
    print d["akey"]

A nice method of python dicts is setdefault where you can specify a default
value so if the key is not in the dictionary the default value is returned:

----------------------------------
 print d.setdefault("akey", "defaultvalue")



avalue
 print d.setdefault("nokey", "defaultvalue")



defaultvalue

Jul 29 2004

"Matthew" <admin.hat stlsoft.dot.org> writes:

Basically, I've found the STL-way of doing things - which is what D's AAs apes
- painful 75% of the time, and useful 25%
of the time. I've no doubt the same will apply to D's AAs.

It's especially painful in STL, since find has to be tested against end(), and
then subject to (*??).second. Hateful
stuff.

Basically, there's no way to have your cake and eat it, either in C++ or D,
when overloading operators for non built-in
types (and even built-in types, so it seems!). (In my book - out in 6 weeks,
not plugged for a while - I show how
overloading the subscript operators for array types can never get full semantic
conformance with built-in arrays. <g>)

If we want to be pedantically correct, I'd suggest that we abandon the
subscript operators for associative arrays, but
the logical correlation of that is that we then abandon built-in AAs (which I
would also think is probably the best
thing). But D's not really about doing what's pedantically correct, and it's
sometimes tempting to think that it doesn't
even do things that are sensibly correct.

If/when I can ever get DTL compiled, I shall henceforth use the Map template,
which *is* able to support the distinction
between opIndexAssign (creates a new element, or updates an existing one) and
opIndex (which throws a NotFoundException
exception if the element does not exist). However, I'm still not entirely
comfortable given the fact that something so
syntactically similar has such different semantics.

:-(

Matthew


"Kris" <someidiot earthlink.dot.dot.dot.net> wrote in message
news:ce3tul$1h6o$1 digitaldaemon.com...
 So, I've been happily thinking this whole time that associative arrays (AA)
 in D are rather cool; being built into the language and so on ... then today
 I spent several hours tracking down what seemed to be a truly bizarre bug,
 which turned out to be caused by the way D associative arrays operate.
 Here's what happened:

 I want to check if there's an entry in the AA; use it if so, otherwise
 continue. The documentation say to use this construct for checking:

 class X
 {
      MyClass [char[]] aa;

     void test ()
     {
          if ("hello" in aa)
              ....
      }
 }

 and then (presumably) you have to subsequently retrieve the entry as
 follows:

     void test ()
     {
          if ("hello" in aa)
             {
             MyClass x = aa ["hello"];
             // do something with x ...
             }
      }

 Of course, you've just performed two full lookups instead of one: first you
 look to see if the entry exists, then you lookup again to retrieve it. Being
 the freak that I am, I don't want that overhead. So ... I proceeded with
 this instead:

     void test ()
     {
          MyClass x = aa ["hello"];

          if (x)
             {
             // do something with x ...
             }
      }

 Much better! Right?

 Well, apparently not. Now clearly I did not assign anything to the AA in the
 above code, but an entry is created regardless. That is, AA entries are
 created as a /side-effect/ of doing that lookup. Now, I've used many
 hash-table implementations before, but this behavior is truly a first.
 Hands-up please, all those who would have expected this ...

 This sneaky behavior really wasted a whole lot of effort (I can almost hear
 Walter laughing in the background :-)

 That aside, one is forced to do a double lookup in such cases. This is not
 good at all, and I simply cannot fathom what benefit there is to such an
 approach. In my particular application, the double-lookup (and, of course,
 double hash-computation) is entirely unacceptable. Lookup is the most common
 operation on a hash-table, by far. It has to be efficient, and this
 "side-effect" totally flies in the face of reason. If it were a Phobos
 module it would have been replaced long ago.

 For the moment I have to assume the behavior is "by design", so fair-warning
 to everyone.

 (Ben? Do you have a proper hash-table I can use please? If not, I'll do an
 FNV port; AA's be damned. Out! Damn spot! Out, I say!)

Jul 29 2004

Gold Dragon <dragonwing dragonu.net> writes:

I need to buy your damn book to see what you're talking about. Seems 
interesting but seeing as it is a Computer book it would probably cost 
an arm and a leg.

Jul 30 2004

"Matthew" <admin.hat stlsoft.dot.org> writes:

"Gold Dragon" <dragonwing dragonu.net> wrote in message
news:cectp4$2a75$2 digitaldaemon.com...
 I need to buy your damn book to see what you're talking about. Seems
 interesting but seeing as it is a Computer book it would probably cost
 an arm and a leg.

He he. I think it's going to be around US$45 (The cover thinks so, anyway:
http://synesis.com.au/publishing/imperfect/cpp/impcpp_cover.png)

Jul 30 2004

Gold Dragon <dragonwing dragonu.net> writes:

 He he. I think it's going to be around US$45 (The cover thinks so, anyway:
 http://synesis.com.au/publishing/imperfect/cpp/impcpp_cover.png)


You plug your book well (damn you :) ) but $45 dollars is a little more 
than I'm willing to spend for a book that will most likely sit on my 
shelf until the second edition comes out and the errata has been fixed 
and added. That isn't to say I'm not going to buy it as there is a 90% 
chance I will.

Jul 30 2004

"Matthew" <admin.hat stlsoft.dot.org> writes:

"Gold Dragon" <dragonwing dragonu.net> wrote in message
news:cedbhm$2gom$1 digitaldaemon.com...
 He he. I think it's going to be around US$45 (The cover thinks so, anyway:
 http://synesis.com.au/publishing/imperfect/cpp/impcpp_cover.png)


 You plug your book well (damn you :) ) but $45 dollars is a little more
 than I'm willing to spend for a book that will most likely sit on my
 shelf until the second edition comes out and the errata has been fixed
 and added. That isn't to say I'm not going to buy it as there is a 90%
 chance I will.

You can get freebies. All you've got to do is get a contract for writing with
Addison-Wesley. LOL!

Jul 30 2004

D Programming

C/C++ Programming

Other

digitalmars.D.dtl - associative-arrays ate my lunch (and my support)