www.digitalmars.com         C & C++   DMDScript  

digitalmars.D.announce - Library associative array project v0.0.1

reply Steven Schveighoffer <schveiguy gmail.com> writes:
I just spent a couple hours making a library AA solution that is binary 
compatible with druntime's builtin AA.

The benefits:

1. Proves that a library implementation is possible, also shows where 
shortcomings are.
2. Usable at compile time to make an AA that can be used at runtime.
3. Much more approachable code than the AA runtime, does not require 
"faking" a typeinfo, dealing with typeinfo in general, or deal with 
magic compiler hooks. This gives a good base to start experimenting with.

For the future:

1. Implement all the things that AAs can do (which are possible, some 
are not).
2. Look at alternatives to GC for allocation/deallocation.
3. Possible use with betterC?

Anyway, enjoy!

https://code.dlang.org/packages/newaa

-Steve
May 11 2022
next sibling parent reply templatedperson <templated person.com> writes:
On Wednesday, 11 May 2022 at 15:31:02 UTC, Steven Schveighoffer 
wrote:
 I just spent a couple hours making a library AA solution that 
 is binary compatible with druntime's builtin AA.

 The benefits:

 1. Proves that a library implementation is possible, also shows 
 where shortcomings are.
 2. Usable at compile time to make an AA that can be used at 
 runtime.
 3. Much more approachable code than the AA runtime, does not 
 require "faking" a typeinfo, dealing with typeinfo in general, 
 or deal with magic compiler hooks. This gives a good base to 
 start experimenting with.
This is great!
 2. Look at alternatives to GC for allocation/deallocation.
Maybe add ` nogc` overloads that take `std.experimental.allocator`s? That could be the best way to accomplish this.
May 11 2022
next sibling parent reply rikki cattermole <rikki cattermole.co.nz> writes:
On 12/05/2022 3:54 AM, templatedperson wrote:
 2. Look at alternatives to GC for allocation/deallocation.
Maybe add ` nogc` overloads that take `std.experimental.allocator`s? That could be the best way to accomplish this.
That won't help much. The virtual types are not nogc, and for a good reason, the default is the GC.
May 11 2022
parent reply templatedperson <templated person.com> writes:
On Wednesday, 11 May 2022 at 16:07:07 UTC, rikki cattermole wrote:
 The virtual types are not  nogc ....
By "the virtual types", do you mean `interface`s? If so, that could be solved by doing an `enum isAllocator(T) = ...` like `range`s do, right? Or, I believe Atila's `concepts` library enable people to use `interface`s with `struct`s.
 for a good reason, the default is the GC.
Good reason or not, people sometimes need to do manual memory management. I usually use the GC and like that we have it, but if you care about performance you gotta drop down to manual managements sometimes. If libraries don't let users decide what sort of memory management pattern they want to use we have two languages essentially. I'd rather all of phobos had overloads with allocator arguments, but sadly it doesn't.
May 11 2022
parent reply rikki cattermole <rikki cattermole.co.nz> writes:
On 12/05/2022 4:55 AM, templatedperson wrote:
 for a good reason, the default is the GC.
Good reason or not, people sometimes need to do manual memory management. I usually use the GC and like that we have it, but if you care about performance you gotta drop down to manual managements sometimes. If libraries don't let users decide what sort of memory management pattern they want to use we have two languages essentially. I'd rather all of phobos had overloads with allocator arguments, but sadly it doesn't.
They are classes hidden inside structs. I'm well aware of the issues, I have my own -betterC allocator library (that will hopefully be available soon-ish) that does not have these issues. ```d struct RCAllocator { private { void delegate() safe nogc pure nothrow refAdd_; void delegate() safe nogc pure nothrow refSub_; void[]delegate(size_t, TypeInfo ti = null) safe nogc pure nothrow allocate_; bool delegate(scope void[]) safe nogc pure nothrow deallocate_; bool delegate(scope ref void[], size_t) safe nogc pure nothrow reallocate_; Ternary delegate(scope void[]) safe nogc pure nothrow owns_; bool delegate() safe nogc pure nothrow deallocateAll_; bool delegate() safe nogc pure nothrow empty_; } safe nogc pure nothrow: ```
May 11 2022
parent templatedperson <templated person.com> writes:
On Wednesday, 11 May 2022 at 16:58:27 UTC, rikki cattermole wrote:
 I'm well aware of the issues, I have my own -betterC allocator 
 library (that will hopefully be available soon-ish) that does 
 not have these issues.
That's exciting to hear! I'll watch for that.
May 11 2022
prev sibling parent Steven Schveighoffer <schveiguy gmail.com> writes:
On 5/11/22 11:54 AM, templatedperson wrote:
 On Wednesday, 11 May 2022 at 15:31:02 UTC, Steven Schveighoffer wrote:
 I just spent a couple hours making a library AA solution that is 
 binary compatible with druntime's builtin AA.

 The benefits:

 1. Proves that a library implementation is possible, also shows where 
 shortcomings are.
 2. Usable at compile time to make an AA that can be used at runtime.
 3. Much more approachable code than the AA runtime, does not require 
 "faking" a typeinfo, dealing with typeinfo in general, or deal with 
 magic compiler hooks. This gives a good base to start experimenting with.
This is great!
Thanks!
 
 2. Look at alternatives to GC for allocation/deallocation.
Maybe add ` nogc` overloads that take `std.experimental.allocator`s? That could be the best way to accomplish this.
Technically, we don't need attributes, if the allocator doesn't use the GC, it will infer nogc. Adding an allocator template parameter is a definite plan. However, once you go outside the standard AA implementation, you can no longer be binary compatible. But the overall goal is to provide a base for future improvement (both in the language and as an alternative to AAs), so please, if you have more ideas, file some github issues! -Steve
May 11 2022
prev sibling next sibling parent reply ikod <igor.khasilev gmail.com> writes:
On Wednesday, 11 May 2022 at 15:31:02 UTC, Steven Schveighoffer 
wrote:
 I just spent a couple hours making a library AA solution that 
 is binary compatible with druntime's builtin AA.

 The benefits:

 For the future:

 1. Implement all the things that AAs can do (which are 
 possible, some are not).
 2. Look at alternatives to GC for allocation/deallocation.
 3. Possible use with betterC?

 Anyway, enjoy!
I don't know if this compatible with your approach, but for my hashmap I added goals like: * allow nogc and/or safe operations on hashmap if key and value types allow this * add some methods returning a copy of the value instead of a pointer to value Regards!
May 11 2022
parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 5/11/22 2:13 PM, ikod wrote:
 I don't know if this compatible with your approach, but for my hashmap I 
 added goals like:
 
 * allow  nogc and/or  safe operations on hashmap if key and value types 
 allow this
I believe everything is attributed or inferred as it should be (it is a template, so inference should happen on all the methods). I copied all the attributes from the aaA.d file, and assignment/opEquals for the value/key should be inferred based on the appropriate calls. Everything should be correct, because now the language is doing it for me, instead of having to infer this with weird wrappers (e.g. [here](https://github.com/dlang/druntime/blob/9631e0d515fa4354b7302c8e2404dd119ff201aa/src/object.d#L3443)).
 * add some methods returning a copy of the value instead of a pointer to 
 value
I return a reference to the value (via `opIndex`). Can you elaborate on why you would need it to be a copy? What is your API? -Steve
May 11 2022
parent reply ikod <igor.khasilev gmail.com> writes:
On Wednesday, 11 May 2022 at 19:15:36 UTC, Steven Schveighoffer 
wrote:

 * add some methods returning a copy of the value instead of a 
 pointer to value
I return a reference to the value (via `opIndex`). Can you elaborate on why you would need it to be a copy? What is your API? -Steve
Sorry, it is safe to return reference for your implementation, as you call `new Entry` on every insert, like standard AA. I keep entries in preallocated vector for performance reasons and every `resize` can break reference.
May 11 2022
parent Steven Schveighoffer <schveiguy gmail.com> writes:
On 5/11/22 5:41 PM, ikod wrote:
 On Wednesday, 11 May 2022 at 19:15:36 UTC, Steven Schveighoffer wrote:
 
 * add some methods returning a copy of the value instead of a pointer 
 to value
I return a reference to the value (via `opIndex`). Can you elaborate on why you would need it to be a copy? What is your API?
Sorry, it is safe to return reference for your implementation, as you call `new Entry` on every insert, like standard AA. I keep entries in preallocated vector for performance reasons and every `resize` can break reference.
Ah yes, with a non-gc allocator, it would be important to either ensure references do not get stale, or to at least identify the pitfalls. Maybe the "return a copy" version is safe, whereas the normal opIndex is system. That reminds me though! with this, I can experiment with preallocating (reserve) entries, something that is oft-requested for AA. -Steve
May 11 2022
prev sibling next sibling parent kinke <noone nowhere.com> writes:
On Wednesday, 11 May 2022 at 15:31:02 UTC, Steven Schveighoffer 
wrote:
 I just spent a couple hours making a library AA solution that 
 is binary compatible with druntime's builtin AA.

 The benefits:
 [...]
Much appreciated, thanks!
May 11 2022
prev sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Wed, May 11, 2022 at 11:31:02AM -0400, Steven Schveighoffer via
Digitalmars-d-announce wrote:
 I just spent a couple hours making a library AA solution that is
 binary compatible with druntime's builtin AA.
This is awesome! Don't have time to look at it in detail right now, but will definitely keep this in mind.
 The benefits:
 
 1. Proves that a library implementation is possible, also shows where
 shortcomings are.
What are the shortcomings that you found?
 2. Usable at compile time to make an AA that can be used at runtime.
Awesome.
 3. Much more approachable code than the AA runtime, does not require
 "faking" a typeinfo, dealing with typeinfo in general, or deal with
 magic compiler hooks. This gives a good base to start experimenting
 with.
Awesome.
 For the future:
 
 1. Implement all the things that AAs can do (which are possible, some
 are not).
Which things are not possible to do?
 2. Look at alternatives to GC for allocation/deallocation.
 3. Possible use with betterC?
[...] Just use malloc/free? Could have problems with dangling references to buckets, though, if somebody retains the pointer returned by `key in aa` expressions. Or use ref-counting of some sort. But hard to do this without changing binary compatibility. T -- "You know, maybe we don't *need* enemies." "Yeah, best friends are about all I can take." -- Calvin & Hobbes
May 12 2022
parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 5/12/22 1:15 PM, H. S. Teoh wrote:
 On Wed, May 11, 2022 at 11:31:02AM -0400, Steven Schveighoffer via
Digitalmars-d-announce wrote:
 I just spent a couple hours making a library AA solution that is
 binary compatible with druntime's builtin AA.
This is awesome! Don't have time to look at it in detail right now, but will definitely keep this in mind.
 The benefits:

 1. Proves that a library implementation is possible, also shows where
 shortcomings are.
What are the shortcomings that you found?
I mean it has the benefit of highlighting where the language doesn't provide the capability of replacing the existing AA with a library type. I literally threw this together in a few hours, so I haven't used it or tested it a lot. I know there are *some* things, but I haven't tried yet to exhaustively make it a drop-in replacement. I was surprised at how short the code is once you throw out all the TypeInfo BS that is currently in druntime.
 For the future:

 1. Implement all the things that AAs can do (which are possible, some
 are not).
Which things are not possible to do?
The whole thing how the compiler knows that an outer AA is being used to initialize an inner AA. e.g. this works currently, but is impossible to hook for a library (I think): ```d int[int][int] aa; aa[0][1] = 5; ``` There's also possible problems with qualifiers that are yet-to-be-discovered, but usually show up when the compiler is cheating.
 2. Look at alternatives to GC for allocation/deallocation.
 3. Possible use with betterC?
[...] Just use malloc/free? Could have problems with dangling references to buckets, though, if somebody retains the pointer returned by `key in aa` expressions. Or use ref-counting of some sort. But hard to do this without changing binary compatibility.
Yes, the lifetime issues are the real problem with not using the GC. Maybe you just avoid the `in` operator in those flavors? Instead you can use a match-style operation, something like: ```d hash.match(k, (ref v) { // work with v }); ``` The whole point of this "exercise" is to see what is missing, and explore what is possible. -Steve
May 12 2022
parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, May 12, 2022 at 01:46:19PM -0400, Steven Schveighoffer via
Digitalmars-d-announce wrote:
 On 5/12/22 1:15 PM, H. S. Teoh wrote:
 On Wed, May 11, 2022 at 11:31:02AM -0400, Steven Schveighoffer via
Digitalmars-d-announce wrote:
[...]
 1. Proves that a library implementation is possible, also shows
 where shortcomings are.
What are the shortcomings that you found?
[...]
 I was surprised at how short the code is once you throw out all the
 TypeInfo BS that is currently in druntime.
Yeah, the TypeInfo stuff takes up a lot of code. And is hackish and ugly.
 For the future:
 
 1. Implement all the things that AAs can do (which are possible,
 some are not).
Which things are not possible to do?
The whole thing how the compiler knows that an outer AA is being used to initialize an inner AA. e.g. this works currently, but is impossible to hook for a library (I think): ```d int[int][int] aa; aa[0][1] = 5; ```
I already saw this problem 8 years ago: https://issues.dlang.org/show_bug.cgi?id=7753 Maybe it's time for us to write a DIP for this? ;-)
 There's also possible problems with qualifiers that are
 yet-to-be-discovered, but usually show up when the compiler is
 cheating.
Ugh. AA + qualifiers == one of the dark dirty corners of D that I don't like looking at. It's a prime example of an issue that's papered over half-heartedly with a half-solution that doesn't really solve the problem: 1. In the bad ole days, AA's used to allow unqualified types for the key. This wasn't a problem with POD or immutable types, but shows up when you try to make an AA with, say, a char[] key type. You'd insert a key into the AA, then accidentally overwrite the array contents, and suddenly the key "vanishes" from the AA -- because the contents changed without the AA knowing, so it's still sitting in the old slot with the old hash value. 2. This problem was brought up, so somebody thought it was a good idea to implicitly force the key type into const. So when you declared an AA of type int[char[]] for example, it'd get implicitly converted to int[const(char[])]. But this doesn't solve anything! All the const does is ensure the AA cannot modify the key. But the user still can!! So the original problem still exists. AA keys really should be immutable, i.e., it should not be modifiable by *anyone*, not the AA code, not the caller, not anyone else. That's the only way you can guarantee the consistency of the hash values stored in the AA vs. the contents of the key. OTOH, though, you *do* want to accept const-qualified versions of the key *during lookup*. (It would be onerous to require .idup'ing a char[] into a string just so you can do a lookup in the AA, for example.) This gets a bit hairy, though: if the AA entry may not exist and may need to be created, then the key must be immutable ('cos we'll potentially be storing a reference to the key in the AA). But if it's a pure lookup without new entry creation, then const is acceptable.
 2. Look at alternatives to GC for allocation/deallocation.
 3. Possible use with betterC?
[...] Just use malloc/free? Could have problems with dangling references to buckets, though, if somebody retains the pointer returned by `key in aa` expressions. Or use ref-counting of some sort. But hard to do this without changing binary compatibility.
Yes, the lifetime issues are the real problem with not using the GC. Maybe you just avoid the `in` operator in those flavors? Instead you can use a match-style operation, something like: ```d hash.match(k, (ref v) { // work with v }); ```
[...] That's a great idea. Should be `scope ref`, though, to avoid the reference leaking out via a closure / global. ;-) T -- English is useful because it is a mess. Since English is a mess, it maps well onto the problem space, which is also a mess, which we call reality. Similarly, Perl was designed to be a mess, though in the nicest of all possible ways. -- Larry Wall
May 12 2022