www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - AA with complex keytype?

reply Manfred Nowak <svv1999 hotmail.com> writes:
For complex numbers comparisons other than == and != are not defined.

AA's require classes to have comparisons defined.

Why are AA's of complex numbers allowed?

-manfred
Feb 09 2007
next sibling parent reply "Andrei Alexandrescu (See Website For Email)" <SeeWebsiteForEmail erdani.org> writes:
Manfred Nowak wrote:
 For complex numbers comparisons other than == and != are not defined.
 
 AA's require classes to have comparisons defined.
 
 Why are AA's of complex numbers allowed?
Asociative arrays are unordered. All they need is a hash function, which can be defined over any data type. Andrei
Feb 09 2007
parent reply Oskar Linde <oskar.lindeREM OVEgmail.com> writes:
Andrei Alexandrescu (See Website For Email) wrote:
 Manfred Nowak wrote:
 For complex numbers comparisons other than == and != are not defined.

 AA's require classes to have comparisons defined.

 Why are AA's of complex numbers allowed?
Asociative arrays are unordered. All they need is a hash function, which can be defined over any data type.
Not the DMD associative arrays. They require full ordering. /Oskar
Feb 09 2007
parent Lionello Lunesu <lio lunesu.remove.com> writes:
Oskar Linde wrote:
 Andrei Alexandrescu (See Website For Email) wrote:
 Manfred Nowak wrote:
 For complex numbers comparisons other than == and != are not defined.

 AA's require classes to have comparisons defined.

 Why are AA's of complex numbers allowed?
Asociative arrays are unordered. All they need is a hash function, which can be defined over any data type.
Not the DMD associative arrays. They require full ordering.
Yes, ordering (opCmp) is used in case of hash collisions. L.
Feb 09 2007
prev sibling next sibling parent reply Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
Manfred Nowak wrote:
 For complex numbers comparisons other than == and != are not defined.
Indeed.
 AA's require classes to have comparisons defined.
Also correct.
 Why are AA's of complex numbers allowed?
For one thing, complex numbers aren't classes :P. Seriously: AA's use the TypeInfo.compare() method to compare. For complex types, this is implemented as lexicographical ordering (compare the real parts, and return the result unless it's equal; in that case, return the result of comparing the imaginary parts). The reason "<" and friends aren't defined for complex types is that there's no definition that makes any mathematical sense (at least, not more than other definitions). Fortunately, AA's don't require the ordering to make any mathematical sense at all, they just need some way to decide in what order to put them in a binary three if a hash collision should occur. As long as the order is consistent[1] (a < b && b < c => a < c, that sort of thing) they're happy. [1]: I'm sure there's a mathematical term for this that I can't currently remember.
Feb 09 2007
next sibling parent reply Brian Byrne <bdbyrne wisc.edu> writes:
Frits van Bommel Wrote:

 [1] (a < b && b < c => a < c, that sort of thing) 
 
 ...
 
 [1]: I'm sure there's a mathematical term for this that I can't 
 currently remember.
Transitivity! :) Brian Byrne
Feb 09 2007
parent reply Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
Brian Byrne wrote:
 Frits van Bommel Wrote:
 
 [1] (a < b && b < c => a < c, that sort of thing) 

 ...

 [1]: I'm sure there's a mathematical term for this that I can't 
 currently remember.
Transitivity! :)
"that sort of thing" == "and stuff like that", i.e. not *just* that. If I'd meant transitivity, I'd have said that (that term I know).
Feb 09 2007
parent reply "Jarrett Billingsley" <kb3ctd2 yahoo.com> writes:
"Frits van Bommel" <fvbommel REMwOVExCAPSs.nl> wrote in message 
news:eqi7ph$io7$2 digitaldaemon.com...
 "that sort of thing" == "and stuff like that", i.e. not *just* that.
 If I'd meant transitivity, I'd have said that (that term I know).
Woah, woah, woah! Calm down there, killer!
Feb 09 2007
parent Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
Jarrett Billingsley wrote:
 "Frits van Bommel" <fvbommel REMwOVExCAPSs.nl> wrote in message 
 news:eqi7ph$io7$2 digitaldaemon.com...
 "that sort of thing" == "and stuff like that", i.e. not *just* that.
 If I'd meant transitivity, I'd have said that (that term I know).
Woah, woah, woah! Calm down there, killer!
I think that came over wrong. Maybe I should have put in a smiley or something. I just wanted to clarify that point. So hard to convey tone and stuff over a text-only medium...
Feb 09 2007
prev sibling parent reply Manfred Nowak <svv1999 hotmail.com> writes:
Frits van Bommel wrote
 As long as the order is consistent[1] (a < b && b <
 c => a < c, that sort of thing) they're happy.
Thank you. This answers my question. But immediately another arises: Why are classes required to implement opCmp( Object) for AA's to function properly? Currently AA's misbehave if opCmp(Object) is not overwritten for a class, but no error is thrown. For the transitivity of an ordering it suffices for opCmp(Object) to always give "==" if the objects are equal, and give "<" if they are not. It should be easy to implement this as the default opCmp(Object).
Feb 09 2007
parent reply Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
Manfred Nowak wrote:
 Frits van Bommel wrote
 As long as the order is consistent[1] (a < b && b <
 c => a < c, that sort of thing) they're happy.
Thank you. This answers my question. But immediately another arises: Why are classes required to implement opCmp( Object) for AA's to function properly?
Because the TypeInfo for objects uses Object.opCmp (if neither is null)
 Currently AA's misbehave if opCmp(Object) is not overwritten for a 
 class, but no error is thrown.
 
 For the transitivity of an ordering it suffices for opCmp(Object) to 
 always give "==" if the objects are equal, and give "<" if they are 
 not.
By "that sort of thing", I meant "and similar properties". For instance, the order also needs to be non-symmetric: a < b => !(b < a)[1]. Your proposed default implementation doesn't satisfy that. IIRC Object used to compare addresses by default, but that was problematic because that behavior disallows moving garbage collectors.
 It should be easy to implement this as the default opCmp(Object).
But as explained above it wouldn't work correctly... [1]: Actually, since the compare return value can have three meanings (less than, equal to, greater than), that would be: a < b => b > a
Feb 09 2007
parent reply Manfred Nowak <svv1999 hotmail.com> writes:
Frits van Bommel wrote
 Your proposed default implementation doesn't satisfy that.
In fact those requirements are necessary only for binary tress built in the collision buckets. If one can live with linear lists in the collsions baskets, then opCmp(Object) is allowed to degenerate to !opEquals(Object). -manfred
Feb 09 2007
parent Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
Manfred Nowak wrote:
 Frits van Bommel wrote
 Your proposed default implementation doesn't satisfy that.
In fact those requirements are necessary only for binary tress built in the collision buckets.
Yes, but the current implementation *does* use binary trees, IIRC.
 If one can live with linear lists in the collsions baskets, then
 opCmp(Object) is allowed to degenerate to !opEquals(Object).
That might not be a great solution, especially when you need to consider that user-defined hash functions may not be very good...
Feb 09 2007
prev sibling parent reply Stewart Gordon <smjg_1998 yahoo.com> writes:
Manfred Nowak Wrote:

<snip>
 Why are classes required to implement opCmp( Object) for AA's to 
 function properly?
Apparently because Walter doesn't like the idea of providing an alternative implementation of AAs for unordered classes, but hasn't to my knowledge given a truly convincing reason. My library implementation http://pr.stewartsplace.org.uk/d/sutil/ uses only toHash and opEquals, thus works on unordered types.
 Currently AA's misbehave if opCmp(Object) is not overwritten for a 
 class, but no error is thrown.
Currently? Not under DMD 1.005 as I try it. Please supply a code example that demonstrates this behaviour.
 For the transitivity of an ordering it suffices for opCmp(Object) 
 to always give "==" if the objects are equal, and give "<" if they 
 are not.
So if x != y, then both x < y and y < x? That wouldn't make sense at all.
 It should be easy to implement this as the default opCmp(Object).
OUAT, Object.opCmp was set up to compare the memory addresses, but this behaviour was removed to prepare for copying/compacting GC, and probably partly to eliminate the confusing behaviour it created. Stewart.
Feb 09 2007
next sibling parent reply Manfred Nowak <svv1999 hotmail.com> writes:
Stewart Gordon wrote

 Currently AA's misbehave if opCmp(Object) is not overwritten for
 a class, but no error is thrown.
Currently? Not under DMD 1.005 as I try it. Please supply a code example that demonstrates this behaviour.
Following code shows under 1.005 that no compile time error is given. class C{ hash_t toHash() { return 0; } } void main(){ bool[C] map; map[ new C]= true; map[ new C]= false; } A runtime error shows up only, when a comparison is actually needed. This might be far too late. In this example the runtime error is forced by implementing a worthless toHash.
 So if x != y, then both x < y and y < x?  That wouldn't make sense
 at all. 
Enough sense for an AA: all colliding elements are put into a linear list and searched in sequence on retrieval. Without a good toHash this would natrally lead to a bad runtime behaviour. -manfred
Feb 09 2007
parent reply Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
Manfred Nowak wrote:
 So if x != y, then both x < y and y < x?  That wouldn't make sense
 at all. 
Enough sense for an AA: all colliding elements are put into a linear list and searched in sequence on retrieval. Without a good toHash this would natrally lead to a bad runtime behaviour.
If they'd be put into a linear list you wouldn't even *need* opCmp, so no opCmp would be "enough sence" as well ;). But the fact is they are put into a binary tree to keep acceptable behavior (i.e. O(log N) instead of O(N)) when the hash function sucks. Since hash functions can be user-defined, and not all users are experts at hashing, you need to consider that use case.
Feb 09 2007
parent reply Manfred Nowak <svv1999 hotmail.com> writes:
Frits van Bommel wrote

 Since hash functions can be user-defined, and not all users
 are experts at hashing, you need to consider that use case.
Currently implemented AA's need to have for a good runtime beahviour at least one of 1) or 2) where: (1a) toHash implements a good distribution and (1b) opCmp==!opEquals or better or (2a) toHash worse than described by (1a) and (2b) opCmp implements an ordering suitable for use in unbalanced binary trees There is no evidence, that users that are no experts at hashing will be experts at implementing a suitable ordering, especially if the elements do not have a natural ordering. -manfred
Feb 09 2007
parent reply Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
Manfred Nowak wrote:
 Frits van Bommel wrote
 
 Since hash functions can be user-defined, and not all users
 are experts at hashing, you need to consider that use case.
Currently implemented AA's need to have for a good runtime beahviour at least one of 1) or 2) where: (1a) toHash implements a good distribution and (1b) opCmp==!opEquals or better
I haven't inspected the current AA implementation in detail. Are you sure that an opCmp that never returns "greater" (or, equivalently, never returns "less") will always work? It may well be that the current implementation's binary trees degenerate into a linked list in that case, but it may also very well be that some elements become irretrievable.
 or
 (2a) toHash worse than described by (1a) and
 (2b) opCmp implements an ordering suitable for use in unbalanced binary 
 trees
 
 There is no evidence, that users that are no experts at hashing will be 
 experts at implementing a suitable ordering, especially if the elements 
 do not have a natural ordering.
Orderings are easier to get right than hash functions, I think. Pretty much everybody knows what an ordering looks like, but hash functions need to have a good distribution which can be a rather complicated topic. Of course, since you specified an ordering suitable for an _unbalanced_ binary tree that complicates things a bit. But I don't think good behavior in there depends much on the comparison function. It's more related to in which order you insert the elements. Which orders are good and bad _are_ of course dependent on the comparison... (do current AAs use unbalanced trees? I have no idea, I just know they use _some_ form of trees)
Feb 09 2007
parent reply Manfred Nowak <svv1999 hotmail.com> writes:
Frits van Bommel wrote

 Are
 you sure that an opCmp that never returns "greater" (or,
 equivalently, never returns "less") will always work?
Such an opCmp requires the absence of rebalancing operations. And there have to go some kudos to Walter for that being true.
 (do current AAs use unbalanced trees? I have no idea, I just know
 they use _some_ form of trees)
They are unbalanced. Look at phobos/internal/aaA.d:278 and following lines. -manfred
Feb 09 2007
next sibling parent reply Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
Manfred Nowak wrote:
 Frits van Bommel wrote
 
 Are
 you sure that an opCmp that never returns "greater" (or,
 equivalently, never returns "less") will always work?
Such an opCmp requires the absence of rebalancing operations. And there have to go some kudos to Walter for that being true.
Doesn't it also require that objects are always passed in the same parameter order to the comparison function? So when comparing two concrete objects a and b, you must always do compare(a, b), never compare(b, a). No matter which is the one you're looking for and which is the element in the tree.
Feb 09 2007
parent reply Manfred Nowak <svv1999 hotmail.com> writes:
Frits van Bommel wrote

 you must always do compare(a, b), never compare(b, a)
No. Because if you are at a node in the tree where a==b, you are done. And if you are at a node in the tree where a!=b, you will go into the direction dictated by opCmp, which is by construction of opCmp the same for all permutations. This holds for all operations on the tree. -manfred
Feb 10 2007
parent reply Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
Manfred Nowak wrote:
 Frits van Bommel wrote
 
 you must always do compare(a, b), never compare(b, a)
No. Because if you are at a node in the tree where a==b, you are done. And if you are at a node in the tree where a!=b, you will go into the direction dictated by opCmp, which is by construction of opCmp the same for all permutations. This holds for all operations on the tree.
Let's say a!=b. If compare(a,b) returns a!=b, as you proposed, that evaluates to true, which converts to integer 1, representing "greater". So it returns that a>b holds. So if b is the node in the tree, you would then continue searching in the *right* branch. If on the other hand compare(b,a) is called, by the same process it returns 1 indicating b>a holds. If b is still the tree node, you would then continue your search in the *left* branch So the implementation needs to be careful about parameter order, and I don't want to go dig into the source to find out if it is. Nor do I want to check every Phobos/Tango release I use to find out if it _still_ is. Since the API doesn't specify this, I think it would be a mistake to rely on it even if it _currently_ holds.
Feb 10 2007
parent Manfred Nowak <svv1999 hotmail.com> writes:
Frits van Bommel wrote

 So the implementation needs to be careful about parameter order,
That's right. The value stemming from the collision bucket has to be at a fixed parameter position on all calls of opCmp. But this can be easily verified by inserting and retrieving two elements under a worthless toHash.
 Since the API doesn't specify this
That's also right. But this also can be fixed easily. -manfred
Feb 11 2007
prev sibling parent reply Stewart Gordon <smjg_1998 yahoo.com> writes:
Manfred Nowak Wrote:

 Frits van Bommel wrote
<snip>
 (do current AAs use unbalanced trees?  I have no idea, I just 
 know they use _some_ form of trees)
They are unbalanced. Look at phobos/internal/aaA.d:278 and following lines.
Balancing the trees on the fly would be inefficient. But rehash ought to balance them while it's at it. Stewart.
Feb 10 2007
parent reply Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
Stewart Gordon wrote:
 Manfred Nowak Wrote:
 
 Frits van Bommel wrote
<snip>
 (do current AAs use unbalanced trees?  I have no idea, I just 
 know they use _some_ form of trees)
They are unbalanced. Look at phobos/internal/aaA.d:278 and following lines.
Balancing the trees on the fly would be inefficient. But rehash ought to balance them while it's at it.
There are several well-known methods of efficiently keeping binary trees (reasonably) balanced. Off the top of my head there are AVL trees and red-black trees, but those definitely aren't the only ones. For implementing a hash-table bucket this is probably not necessary, though. There typically shouldn't be very much elements in a bucket.
Feb 10 2007
parent reply Bill Baxter <dnewsgroup billbaxter.com> writes:
Frits van Bommel wrote:
 Stewart Gordon wrote:
 Manfred Nowak Wrote:

 Frits van Bommel wrote
<snip>
 (do current AAs use unbalanced trees?  I have no idea, I just know 
 they use _some_ form of trees)
They are unbalanced. Look at phobos/internal/aaA.d:278 and following lines.
Balancing the trees on the fly would be inefficient. But rehash ought to balance them while it's at it.
 There are several well-known methods of efficiently keeping binary trees 
 (reasonably) balanced. Off the top of my head there are AVL trees and 
 red-black trees, but those definitely aren't the only ones.
 
 For implementing a hash-table bucket this is probably not necessary, 
 though. There typically shouldn't be very much elements in a bucket.
So much for O(N lg N) performance in the case of a really bad hash function, though. You'll get O(N) if you happen to insert your values such that their hashes are strictly increasing or strictly decreasing. In theory anyway. ;-) I wouldn't be surprised though if in practice the simplicity of leaving out balancing outweighs the theoretical advantage of balanced trees in 95% of the cases. Someone who has too much time on their hands should take a look at what the code for Python dicts and Lua tables do. Both of those are reputed to be heavily tweaked for super-duper performance. --bb
Feb 10 2007
parent "Jarrett Billingsley" <kb3ctd2 yahoo.com> writes:
"Bill Baxter" <dnewsgroup billbaxter.com> wrote in message 
news:eqkmg1$hnb$1 digitaldaemon.com...

 Someone who has too much time on their hands should take a look at what 
 the code for Python dicts and Lua tables do.  Both of those are reputed to 
 be heavily tweaked for super-duper performance.
Lua tables use (according to the source) "a mix of chained scatter table with Brent's variation." The insert method comment reads: /* ** inserts a new key into a hash table; first, check whether key's main ** position is free. If not, check whether colliding node is in its main ** position or not: if it is not, move colliding node to an empty place and ** put new key in its main position; otherwise (colliding node is in its main ** position), new key goes to an empty position. */ It's basically using a variation on separate chaining. Additionally, Lua 5 tables will attempt to keep integer keys in a true array rather than in the hash, which improves performance when using tables as arrays, but that's not really necessary in a language that has true arrays like D ;)
Feb 10 2007
prev sibling parent reply Charles D Hixson <charleshixsn earthlink.net> writes:
Stewart Gordon wrote:
  <snip>

 So if x != y, then both x < y and y < x?  That wouldn't make sense at all.
 
 It should be easy to implement this as the default opCmp(Object).
OUAT, Object.opCmp was set up to compare the memory addresses, but this behaviour was removed to prepare for copying/compacting GC, and probably partly to eliminate the confusing behaviour it created. Stewart.
Actually, there are many cases where modular arithmetic to various bases is the desired behavior, and in such a system "both x < y and y < x" makes perfect sense. You can reach ANY number by repeatedly incrementing and also by repeatedly decrementing. OTOH, I'm not certain that these are worth building into a language. Ada thought so, but no other language appears to have followed their lead. (Possibly it was also done in Algol68 ... I never used it, but it contained all sorts of experimental things.)
Feb 13 2007
next sibling parent reply "Joel C. Salomon" <JoelCSalomon Gmail.com> writes:
Charles D Hixson wrote:
 Stewart Gordon wrote:
 So if x != y, then both x < y and y < x?  That wouldn't make sense at 
 all.
Actually, there are many cases where modular arithmetic to various bases is the desired behavior, and in such a system "both x < y and y < x" makes perfect sense. You can reach ANY number by repeatedly incrementing and also by repeatedly decrementing.
Mathematically speaking, a group like ℤ₃ is unordered, precisely for the reason given. If we need to impose an ordering for array purposes (I don’t understand that bit, just (mis)repeating something I°ve seen said on the subject), then give the ordering lexicographic properties. (x < y) && (y < x) should not ever return true. --Joel
Feb 14 2007
parent reply Manfred Nowak <svv1999 hotmail.com> writes:
Joel C. Salomon wrote

 Mathematically speaking, a group like ℤ₃ is unordered,
 precisely for the reason given.  If we need to impose an ordering
 for array purposes (I don’t understand that bit, just
 (mis)repeating something I°ve seen said on the subject), then
 give the ordering lexicographic properties. 
It is faster to not impose an ordering on an unordered set. -manfred
Feb 14 2007
parent reply Joel Salomon <JoelCSalomon Gmail.com> writes:
 Mathematically speaking, a group like ℤ₃ is unordered,
 precisely for the reason given.  If we need to impose an ordering
 for array purposes (I don’t understand that bit, just
 (mis)repeating something I’ve seen said on the subject), then
 give the ordering lexicographic properties. 
It is faster to not impose an ordering on an unordered set.
Do you know where D insists that “some” ordering be applied to a type, and why?
Feb 14 2007
next sibling parent Manfred Nowak <svv1999 hotmail.com> writes:
Joel Salomon wrote

 Do you know where D insists that “some” ordering be applied to
 a type, and why?
I have given an example that an ordering _operator_ is required by D in another branch of this thread: class C{ hash_t toHash() { return 0; } } void main(){ bool[C] map; map[ new C]= true; map[ new C]= false; } And this branch of the thread discusses Stewarts remark, that an ordering operator, which doesn't implement an ordering is senseless in his opinion. -manfred
Feb 14 2007
prev sibling parent Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
Joel Salomon wrote:
 Mathematically speaking, a group like ℤ₃ is unordered,
 precisely for the reason given.  If we need to impose an ordering
 for array purposes (I don’t understand that bit, just
 (mis)repeating something I’ve seen said on the subject), then
 give the ordering lexicographic properties. 
It is faster to not impose an ordering on an unordered set.
Do you know where D insists that “some” ordering be applied to a type, and why?
Associative arrays. http://www.digitalmars.com/d/arrays.html#associative They're implemented as a hash table with a binary tree per bucket. Binary trees require some ordering between the elements. This isn't the only implementation option of course, for example it could also be a linked list instead of a binary tree. Or it could not use a hash table at all and just use one big binary tree[1]. (But AFAICT the only reasonably efficient option without requiring ordering would be a hash table with a linked list per bucket) [1]: Presumably some self-balancing tree like red-black or AVL.
Feb 14 2007
prev sibling parent reply Nicolai Waniek <no.spam thank.you> writes:
Charles D Hixson wrote:
 Actually, there are many cases where modular arithmetic to various bases
 is the desired behavior, and in such a system "both x < y and y < x"
 makes perfect sense.  You can reach ANY number by repeatedly
 incrementing and also by repeatedly decrementing.
What's about this: a = 1+0j b = 0+1j how would you increment a to finnally get to b? how do you want to compare two complex numbers? this is mathematically impossible. Nicolai
Feb 14 2007
parent Manfred Nowak <svv1999 hotmail.com> writes:
Nicolai Waniek wrote

 What's about this:
 
 a = 1+0j
 b = 0+1j
 
 how would you increment a to finnally get to b? how do you want to
 compare two complex numbers? this is mathematically impossible.
It depends on how one defines "increment". Because Charles only mentions one operation, "increment" can be defined to be the normal complex multiplication. Then "increment( a, b)= b" holds in your case above. If one wants to define an ordering for all points on the unit-circle around the origin Charles is also righteous to define: (a != b) => (a < b) & (b < a) Because you can travel on the circumference of the circle in each direction until "a != b" turns out to be false. -manfred
Feb 14 2007