www.digitalmars.com         C & C++   DMDScript  

D - Hashtables from Associative Arrays

reply Benji Smith <dlanguage xxagg.com> writes:
During the coding of several different modules in the past few weeks, I've found
myself wishing that I could use the associative array capabilities of D to
create a simple hastable, without having to have anything in the array at all.

The code I've been writing has looked something like this:

bit[char[]] myHashTable;

But I'm not really using the bit values, just the hash values. So, when I add an
entry to the hash, it looks like this:

myHashTable["red"] = true;
myHashTable["blue"] = true;

But I don't like the fact that I'm wasting a bunch of memory space in the
process. In fact, I'm about to implement a data structure (a fulltext index for
a pseudo-db application) that will have hundreds of thousands of entries, and I
hate to think about tacking on an extra 32 bits to every single entry in the
hashtable.

I'd rather do something crazy, like this:

null[char[]] myHashTable;

..or this...

empty[char[]] myHashTable;

..or even crazier, like this:

[char[]] myHashTable;

Using semantics like this, it should generate a compile-time error to try to use
the assignment operator. So, this code:

myHashTable["red"] = mySomethingElse;

..would break the compilation. The only thing I'm interested in checking is
whether a particular key value exists. So, I'd run checks like this:

if ("red" in myHashTable) { ... }

Of course, I don't know what syntax you'd be able to use to put a key _into_ the
hashtable, but I could probably think of something.

Of course, I could just implement my own hashtable class that would allow this
type of behavior, but then I wouldn't be able to use the same hashtable syntax
that already exists for associative arrays in the language.

Wouldn anyone else find something like this useful?

--Benji Smith
Sep 18 2003
next sibling parent Helmut Leitner <leitner hls.via.at> writes:
Benji Smith wrote:
 During the coding of several different modules in the past few weeks, I've
found
 myself wishing that I could use the associative array capabilities of D to
 create a simple hastable, without having to have anything in the array at all.
 
 The code I've been writing has looked something like this:
 
 bit[char[]] myHashTable;
 
 But I'm not really using the bit values, just the hash values. So, when I add
an
 entry to the hash, it looks like this:
 
 myHashTable["red"] = true;
 myHashTable["blue"] = true;
 
 But I don't like the fact that I'm wasting a bunch of memory space in the
 process. In fact, I'm about to implement a data structure (a fulltext index for
 a pseudo-db application) that will have hundreds of thousands of entries, and I
 hate to think about tacking on an extra 32 bits to every single entry in the
 hashtable.
It may not be very costly. It depends on implementation detail that only Walter knows. Hashes are always costly, because you have a bucket table, you have nodes for the linked lists that start at the buckets and contain key references and values (or value references). It might be the there is place for a 32-bit value reference that is used for small values. It might be that the nodes have a flat space for references or values and because allocations are multiple of 16 this may hurt or may not hurt. So it may cost you a lot or nothing to do myHashTable[word] = word; myHashTable[word] = bit; You will have to measure it. -- Helmut Leitner leitner hls.via.at Graz, Austria www.hls-software.com
Sep 18 2003
prev sibling next sibling parent Antti =?iso-8859-1?Q?Syk=E4ri?= <jsykari gamma.hut.fi> writes:
 I'd rather do something crazy, like this:
 
 null[char[]] myHashTable;
void[char[]] is probably what you're looking for (as void is a type, and void f(int x) is a valid type as well). I don't think it works, though (and cannot currently check) -Antti
Sep 18 2003
prev sibling parent "Charles Sanders" <sanders-consulting comcast.net> writes:
Im not sure why u just wouldnt use an array here ?  Sort it and use a binary
search for  O(log n ^ 2 )


Charles


"Benji Smith" <dlanguage xxagg.com> wrote in message
news:bkchpe$2rb$1 digitaldaemon.com...
 During the coding of several different modules in the past few weeks, I've
found
 myself wishing that I could use the associative array capabilities of D to
 create a simple hastable, without having to have anything in the array at
all.
 The code I've been writing has looked something like this:

 bit[char[]] myHashTable;

 But I'm not really using the bit values, just the hash values. So, when I
add an
 entry to the hash, it looks like this:

 myHashTable["red"] = true;
 myHashTable["blue"] = true;

 But I don't like the fact that I'm wasting a bunch of memory space in the
 process. In fact, I'm about to implement a data structure (a fulltext
index for
 a pseudo-db application) that will have hundreds of thousands of entries,
and I
 hate to think about tacking on an extra 32 bits to every single entry in
the
 hashtable.

 I'd rather do something crazy, like this:

 null[char[]] myHashTable;

 ..or this...

 empty[char[]] myHashTable;

 ..or even crazier, like this:

 [char[]] myHashTable;

 Using semantics like this, it should generate a compile-time error to try
to use
 the assignment operator. So, this code:

 myHashTable["red"] = mySomethingElse;

 ..would break the compilation. The only thing I'm interested in checking
is
 whether a particular key value exists. So, I'd run checks like this:

 if ("red" in myHashTable) { ... }

 Of course, I don't know what syntax you'd be able to use to put a key
_into_ the
 hashtable, but I could probably think of something.

 Of course, I could just implement my own hashtable class that would allow
this
 type of behavior, but then I wouldn't be able to use the same hashtable
syntax
 that already exists for associative arrays in the language.

 Wouldn anyone else find something like this useful?

 --Benji Smith
Sep 18 2003