digitalmars.D.learn - Pre-expanding alloc cell(s) / reserving space for an associative array
- Cecil Ward (13/13) Jul 09 2023 Before I posted a question about avoiding unnecessary
- Steven Schveighoffer (8/23) Jul 09 2023 No there is no such `.reserve` call for associative arrays.
- IchorDev (13/26) Jul 10 2023 From the spec it sounds as though (but good luck testing for
- H. S. Teoh (12/17) Jul 10 2023 [...]
- Steven Schveighoffer (18/33) Jul 10 2023 This used to be the case, but in the latest implementation, the AA uses
Before I posted a question about avoiding unnecessary allocs/reallocs when adding entries to an array like so uint[ dstring ] arr; when I build it up from nothing with successive insertions. The array is accessed by a key that is a dstring. I was told that I can’t use .reserve or the like on it? Is that correct? My memory fails me, powerful pain drugs. Is there an alternate method I could use ? To be honest, the number of entries is likely to be extremely modest, so this is not a huge performance issue, six entries would be considered quite a lot. The string keys are probably not very very long. But I’m learning good habits for the day when I might have a bigger such database.
Jul 09 2023
On 7/9/23 4:24 PM, Cecil Ward wrote:Before I posted a question about avoiding unnecessary allocs/reallocs when adding entries to an array like so uint[ dstring ] arr; when I build it up from nothing with successive insertions. The array is accessed by a key that is a dstring. I was told that I can’t use .reserve or the like on it? Is that correct? My memory fails me, powerful pain drugs. Is there an alternate method I could use ? To be honest, the number of entries is likely to be extremely modest, so this is not a huge performance issue, six entries would be considered quite a lot. The string keys are probably not very very long. But I’m learning good habits for the day when I might have a bigger such database.No there is no such `.reserve` call for associative arrays. It might be possible to implement, but it's quite a bit different from a normal array reserve -- an associative array allocates all its elements as individual memory blocks, so there is no place to reserve things. You would have to add e.g. a free-list, that is reserved from the allocator itself. -Steve
Jul 09 2023
On Sunday, 9 July 2023 at 20:24:24 UTC, Cecil Ward wrote:Before I posted a question about avoiding unnecessary allocs/reallocs when adding entries to an array like so uint[ dstring ] arr; when I build it up from nothing with successive insertions. The array is accessed by a key that is a dstring. I was told that I can’t use .reserve or the like on it? Is that correct? My memory fails me, powerful pain drugs. Is there an alternate method I could use ? To be honest, the number of entries is likely to be extremely modest, so this is not a huge performance issue, six entries would be considered quite a lot. The string keys are probably not very very long. But I’m learning good habits for the day when I might have a bigger such database.From the spec it sounds as though (but good luck testing for sure) that if you have (for example) 6 big dummy key-value pairs in the AA to begin with, then if you use `.clear` it "Removes all remaining keys and values from [the] associative array. The array is not rehashed after removal, __to allow for the existing storage to be reused.__" ([source](https://dlang.org/spec/hash-map.html#properties)) I'm not 100% sure this would work how you want, and it's such a hack, but it's something. D has a few user-made hash-map libraries, so it might be worth investigating those to see if they offer a pre-allocation method, too.
Jul 10 2023
On Mon, Jul 10, 2023 at 09:30:57AM +0000, IchorDev via Digitalmars-d-learn wrote: [...]From the spec it sounds as though (but good luck testing for sure) that if you have (for example) 6 big dummy key-value pairs in the AA to begin with, then if you use `.clear` it "Removes all remaining keys and values from [the] associative array. The array is not rehashed after removal, __to allow for the existing storage to be reused.__"[...] This is not an accurate understanding of what actually happens. The AA implementation consists of a primary hashtable (an array), each slot of which points to a list of buckets. Clearing the AA does not discard the hashtable, but does dispose of the buckets, so adding new keys afterwards will allocate new buckets. So the buckets used by the dummy key-value pairs do not get reused without a reallocation. T -- This is a tpyo.
Jul 10 2023
On 7/10/23 8:44 AM, H. S. Teoh wrote:On Mon, Jul 10, 2023 at 09:30:57AM +0000, IchorDev via Digitalmars-d-learn wrote: [...]This used to be the case, but in the latest implementation, the AA uses open addressing. That is, if a slot is full, it linearly searches the table for the next open slot. This cuts down significantly on cache misses, since the hash slot itself stores the hash, so the pointer only needs to be followed if the hash matches. Fun historical fact, the original AA used a tree to store colliding elements, which means that all AA keys had to support both opEquals and opCmp. BUT, you are right that the elements themselves are individual blocks of memory, and only the bucket array is reused. If you want to see how the hash table works, without having to deal with all the typeinfo acrobatics that the actual builtin AA uses, you can look at my newaa library: https://github.com/schveiguy/newaa This is the exact implementation of the AA internals, but uses compile-time introspection instead of typeinfo. -SteveFrom the spec it sounds as though (but good luck testing for sure) that if you have (for example) 6 big dummy key-value pairs in the AA to begin with, then if you use `.clear` it "Removes all remaining keys and values from [the] associative array. The array is not rehashed after removal, __to allow for the existing storage to be reused.__"[...] This is not an accurate understanding of what actually happens. The AA implementation consists of a primary hashtable (an array), each slot of which points to a list of buckets. Clearing the AA does not discard the hashtable, but does dispose of the buckets, so adding new keys afterwards will allocate new buckets. So the buckets used by the dummy key-value pairs do not get reused without a reallocation.
Jul 10 2023