digitalmars.D.learn - built-in string hash ?
- Kevin Bailey <noone nowhere.com> Aug 28 2010
- torhu <no spam.invalid> Aug 28 2010
- bearophile <bearophileHUGS lycos.com> Aug 28 2010
- Pelle <pelle.mansson gmail.com> Aug 28 2010
- bearophile <bearophileHUGS lycos.com> Aug 28 2010
- BCS <none anon.com> Aug 28 2010
So I have a class containing two strings:
class Foo
{
string s1, s2;
...
and I'd like to add a toHash() member, but I can't find
the built-in string hash function. These don't work:
s1.toHash()
s1.toHash
s1.hash
s1.hash()
hash(s1)
yet strings can clearly be the key in a map.
I see that xml.d writes its own string hash function
but that just doesn't seem right.
Is there a way to get to the built-in ?
Aug 28 2010
On 28.08.2010 16:37, Kevin Bailey wrote:So I have a class containing two strings: class Foo { string s1, s2; ... and I'd like to add a toHash() member, but I can't find the built-in string hash function. These don't work: s1.toHash() s1.toHash s1.hash s1.hash() hash(s1) yet strings can clearly be the key in a map. I see that xml.d writes its own string hash function but that just doesn't seem right. Is there a way to get to the built-in ?
You can get to it through TypeInfo: http://www.digitalmars.com/d/2.0/phobos/object.html#getHash string a = "abc"; auto hash = typeid(a).getHash(&a);
Aug 28 2010
torhu:string a = "abc"; auto hash = typeid(a).getHash(&a);
If higher performance is necessary, you may pre-compute part of that: void main() { string a = "abc"; auto hash1 = typeid(a).getHash(&a); auto stringHash = &(typeid(a).getHash); auto hash2 = stringHash(&a); assert(hash1 == hash2); } Bye, bearophile
Aug 28 2010
On 08/28/2010 10:25 PM, bearophile wrote:torhu:string a = "abc"; auto hash = typeid(a).getHash(&a);
If higher performance is necessary, you may pre-compute part of that: void main() { string a = "abc"; auto hash1 = typeid(a).getHash(&a); auto stringHash =&(typeid(a).getHash); auto hash2 = stringHash(&a); assert(hash1 == hash2); } Bye, bearophile
I doubt that gives any performance gains. typeid(a).getHash should be a constant expression anyway, and I don't see any gains in my tiny benchmark test. Perhaps it works better if a was an Object, since typeid for objects does more.
Aug 28 2010
Pelle:I doubt that gives any performance gains. typeid(a).getHash should be a constant expression anyway, and I don't see any gains in my tiny benchmark test.
My code shows a limited time difference: import std.stdio: writeln; import std.date: getUTCtime, ticksPerSecond; void main() { enum int NLOOP = 30_000_000; string s = "and I'd like to add a toHash() member, but I can't find the built-in string hash function."; { auto t0 = getUTCtime(); auto stringHash =&(typeid(s).getHash); hash_t tot; for (int i; i < NLOOP; i++) tot += stringHash(&s); auto t1 = getUTCtime(); writeln(tot, " ", (t1 - t0) / cast(double)ticksPerSecond); } { auto t0 = getUTCtime(); hash_t tot; for (int i; i < NLOOP; i++) tot += typeid(s).getHash(&s); auto t1 = getUTCtime(); writeln(tot, " ", (t1 - t0) / cast(double)ticksPerSecond); } } And the asm shows the first loop to contain one less instruction (dmd 2.048, -O -release -inline), so the difference is small: L42: lea ECX,01Ch[ESP] mov EAX,02Ch[ESP] push ECX call EDI add ESI,EAX inc EBX cmp EBX,01C9C380h jb L42 LA7: lea EDX,01Ch[ESP] mov ECX,_D12TypeInfo_Aya6__initZ mov EAX,offset FLAT:_D12TypeInfo_Aya6__initZ push EDX call dword ptr 018h[ECX] add EDI,EAX inc EBX cmp EBX,01C9C380h jb LA7 Bye, bearophile
Aug 28 2010
Hello bearophile,Pelle:I doubt that gives any performance gains. typeid(a).getHash should be a constant expression anyway, and I don't see any gains in my tiny benchmark test.
import std.stdio: writeln; import std.date: getUTCtime, ticksPerSecond; void main() { enum int NLOOP = 30_000_000; string s = "and I'd like to add a toHash() member, but I can't find the built-in string hash function."; { auto t0 = getUTCtime(); auto stringHash =&(typeid(s).getHash); hash_t tot; for (int i; i < NLOOP; i++) tot += stringHash(&s); auto t1 = getUTCtime(); writeln(tot, " ", (t1 - t0) / cast(double)ticksPerSecond); } { auto t0 = getUTCtime(); hash_t tot; for (int i; i < NLOOP; i++) tot += typeid(s).getHash(&s); auto t1 = getUTCtime(); writeln(tot, " ", (t1 - t0) / cast(double)ticksPerSecond); } } And the asm shows the first loop to contain one less instruction (dmd 2.048, -O -release -inline), so the difference is small: L42: lea ECX,01Ch[ESP] mov EAX,02Ch[ESP] push ECX call EDI add ESI,EAX inc EBX cmp EBX,01C9C380h jb L42 LA7: lea EDX,01Ch[ESP] mov ECX,_D12TypeInfo_Aya6__initZ
DMD seems to do a bad job of dead assignment removal at the ASM level. I've seen this befor where half the asm in large blocks was dead.mov EAX,offset FLAT:_D12TypeInfo_Aya6__initZ push EDX call dword ptr 018h[ECX] add EDI,EAX inc EBX cmp EBX,01C9C380h jb LA7 Bye, bearophile
... <IXOYE><
Aug 28 2010








BCS <none anon.com>