www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Static associative array expressions

reply Stefan Koch <uplink.coder googlemail.com> writes:
So ... I have had some time on my hands and since have recently 
been working on the newCTFE VM and re-implemented the ABI proper 
...

I thought actually ... this is an ABI issue.
If the compiler knew about the layout of the associative array we 
could emit that layout into the binary just as we would at 
runtime.
provided of course that the hash-function is CTFE-able.

```d
import core.internal.hash;
static immutable ctAA = ["abc":"v",  "ab":"vv", "c" : "vvv", "d" 
: "vvvv"];

struct CTFEHashableString
{
     string s;
     alias s this;
     hash_t toHash()  safe pure nothrow  nogc
     {
         return hashOf(s);
     }
}

void main()
{
     auto rtAA = ["abc":"v",  "ab":"vv", "c" : "vvv", "d" : 
"vvvv"];

     import core.stdc.stdio;
     printf("ctAA.length: %zu\n", ctAA.length);
     printf("rtAA.length: %zu\n", rtAA.length);

     printf(" ----- compile time AA ------\n");
     foreach(k,v; ctAA)
     {
         printf("%s:%s\n", k.ptr, v.ptr);
     }

     printf(" ------- runttime AA --------\n");
     foreach(k,v; rtAA)
     {
         printf("%s:%s\n", k.ptr, v.ptr);
     }
}

```

This code now works on my computer ;)
Yes this means the compiler has to be changed if we change the 
runtime structure of the AA.
And there is a tiny bit of code duplication, however I do think 
that the parity of compile-time and run-time features is worth it 
...

What do you guys think?
Sep 18 2021
next sibling parent Stefan Koch <uplink.coder googlemail.com> writes:
On Sunday, 19 September 2021 at 05:43:04 UTC, Stefan Koch wrote:
 [ .... ]

 This code now works on my computer ;)
 Yes this means the compiler has to be changed if we change the 
 runtime structure of the AA.
 And there is a tiny bit of code duplication, however I do think 
 that the parity of compile-time and run-time features is worth 
 it ...

 What do you guys think?
Draft PR: https://github.com/dlang/dmd/pull/13087/files I plan on factoring the 3 stages out into separate functions. I don't want this block of code be wedged in like the bitfields ... The `todt.d` (static initializer generation) is obtuse enough as it is. It took me an entire day to get this working.
Sep 18 2021
prev sibling next sibling parent reply pineapple <meapineapple gmail.com> writes:
This could be nice. It's always been kind of annoying having to 
put AAs in static initializers and not being able to inspect them 
at compile time.
Sep 19 2021
parent reply Stefan Koch <uplink.coder googlemail.com> writes:
On Sunday, 19 September 2021 at 11:39:15 UTC, pineapple wrote:
 This could be nice. It's always been kind of annoying having to 
 put AAs in static initializers and not being able to inspect 
 them at compile time.
Be careful though. The runtime does not know if an array is in static memory or not. (Though it would be trivial to add a flag so it knows.) And Therefore when you do this: ```d static mutable_ct_createdAA = mixin(["a","b"]); ``` and you mutate that array in any way. The runtime will crash. As it tries write into memory it is not supposed to write in. That can be fixed by either: - disallowing static mutable AAs for now. (my preferred option.) - Or but adding a flag which would tell the Runtime to copy the literal into mutable memory and perform any modification in that memory. COW (copy on write) essentially.
Sep 19 2021
parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 9/19/21 10:10 AM, Stefan Koch wrote:
 On Sunday, 19 September 2021 at 11:39:15 UTC, pineapple wrote:
 This could be nice. It's always been kind of annoying having to put 
 AAs in static initializers and not being able to inspect them at 
 compile time.
Be careful though. The runtime does not know if an array is in static memory or not. (Though it would be trivial to add a flag so it knows.) And Therefore when you do this: ```d static mutable_ct_createdAA = mixin(["a","b"]); ``` and you mutate that array in any way. The runtime will crash. As it tries write into memory it is not supposed to write in.
How do statically initialized mutable dynamic arrays work? Should be the same thing. -Steve
Sep 20 2021
parent Stefan Koch <uplink.coder googlemail.com> writes:
On Monday, 20 September 2021 at 13:08:53 UTC, Steven 
Schveighoffer wrote:
 On 9/19/21 10:10 AM, Stefan Koch wrote:
 On Sunday, 19 September 2021 at 11:39:15 UTC, pineapple wrote:
 This could be nice. It's always been kind of annoying having 
 to put AAs in static initializers and not being able to 
 inspect them at compile time.
Be careful though. The runtime does not know if an array is in static memory or not. (Though it would be trivial to add a flag so it knows.) And Therefore when you do this: ```d static mutable_ct_createdAA = mixin(["a","b"]); ``` and you mutate that array in any way. The runtime will crash. As it tries write into memory it is not supposed to write in.
How do statically initialized mutable dynamic arrays work? Should be the same thing. -Steve
I just had a look to what druntime does. writing into statically allocated memory is fine as long as it's allocated writable. Which means assigning to a mutable array is fine. And the only time when an array has to reallocate is on appending in shrinking. At which point it is copied. Which is the same for statically allocated and dynamically allocated arrays. However for an AA assigning to an index. may pr may not reallocate them. which is why we would have to put a guard into the assignment code as well.
Sep 20 2021
prev sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 9/19/21 1:43 AM, Stefan Koch wrote:

 What do you guys think?
I think an easier fix is just to change the call that does the AA literals to something that can be CTFEd. Right now it's an extern(C) hook that is opaque and takes void[] and TypeInfos. Something like: ```d V[K] __aALiteral(V, K)(K[] keys, V[] values); ``` Then just lower the call to that, and if that is CTFEable, it works. -Steve
Sep 19 2021
next sibling parent reply Stefan Koch <uplink.coder googlemail.com> writes:
On Sunday, 19 September 2021 at 11:55:04 UTC, Steven 
Schveighoffer wrote:
 On 9/19/21 1:43 AM, Stefan Koch wrote:

 What do you guys think?
I think an easier fix is just to change the call that does the AA literals to something that can be CTFEd. Right now it's an extern(C) hook that is opaque and takes void[] and TypeInfos. Something like: ```d V[K] __aALiteral(V, K)(K[] keys, V[] values); ``` Then just lower the call to that, and if that is CTFEable, it works. -Steve
That puts actually more pressure on CTFE to be able to adapt to different layouts. The current fixed layout of the AA allows something like newCTFE to do less work. Also it introduces yet more templates, I cannot cache templates as effectively as I can runtime which I only have to generate once.
Sep 19 2021
parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 9/19/21 9:38 AM, Stefan Koch wrote:
 On Sunday, 19 September 2021 at 11:55:04 UTC, Steven Schveighoffer wrote:
 On 9/19/21 1:43 AM, Stefan Koch wrote:

 What do you guys think?
I think an easier fix is just to change the call that does the AA literals to something that can be CTFEd. Right now it's an extern(C) hook that is opaque and takes void[] and TypeInfos. Something like: ```d V[K] __aALiteral(V, K)(K[] keys, V[] values); ``` Then just lower the call to that, and if that is CTFEable, it works.
That puts actually more pressure on CTFE to be able to adapt to different layouts.
CTFE doesn't have to use runtime AAs internally, only when moving to code generation would it have to call this function.
 Also it introduces yet more templates, I cannot cache templates as 
 effectively as I can runtime which I only have to generate once.
Only one template per AA type. -Steve
Sep 20 2021
parent reply Stefan Koch <uplink.coder googlemail.com> writes:
On Monday, 20 September 2021 at 12:38:59 UTC, Steven 
Schveighoffer wrote:
 On 9/19/21 9:38 AM, Stefan Koch wrote:
 On Sunday, 19 September 2021 at 11:55:04 UTC, Steven 
 Schveighoffer wrote:
 On 9/19/21 1:43 AM, Stefan Koch wrote:

 What do you guys think?
I think an easier fix is just to change the call that does the AA literals to something that can be CTFEd. Right now it's an extern(C) hook that is opaque and takes void[] and TypeInfos. Something like: ```d V[K] __aALiteral(V, K)(K[] keys, V[] values); ``` Then just lower the call to that, and if that is CTFEable, it works.
That puts actually more pressure on CTFE to be able to adapt to different layouts.
CTFE doesn't have to use runtime AAs internally, only when moving to code generation would it have to call this function.
 Also it introduces yet more templates, I cannot cache 
 templates as effectively as I can runtime which I only have to 
 generate once.
Only one template per AA type. -Steve
But that would be an interpreted/or some day JITed function call that has to do the hashing of the values and place them into the right buckets. Which would be a different function per instance. Whereas currently the only thing that has to be interpreted is the toHash function. and it only has to be called once per element.
Sep 20 2021
parent Steven Schveighoffer <schveiguy gmail.com> writes:
On 9/20/21 8:54 AM, Stefan Koch wrote:
 On Monday, 20 September 2021 at 12:38:59 UTC, Steven Schveighoffer wrote:
 On 9/19/21 9:38 AM, Stefan Koch wrote:
 Also it introduces yet more templates, I cannot cache templates as 
 effectively as I can runtime which I only have to generate once.
Only one template per AA type.
But that would be an interpreted/or some day JITed function call that has to do the hashing of the values and place them into the right buckets. Which would be a different function per instance. Whereas currently the only thing that has to be interpreted is the toHash function. and it only has to be called once per element.
There exist ways to abstract out parts of templates that are the same for every instance. What *would* have to happen is that the AA code would have to either be duplicated (or mostly duplicated), or made publicly accessible to the compiler. -Steve
Sep 20 2021
prev sibling parent reply Kagamin <spam here.lot> writes:
On Sunday, 19 September 2021 at 11:55:04 UTC, Steven 
Schveighoffer wrote:
 On 9/19/21 1:43 AM, Stefan Koch wrote:

 What do you guys think?
I think an easier fix is just to change the call that does the AA literals to something that can be CTFEd. Right now it's an extern(C) hook that is opaque and takes void[] and TypeInfos. Something like: ```d V[K] __aALiteral(V, K)(K[] keys, V[] values); ``` Then just lower the call to that, and if that is CTFEable, it works. -Steve
Isn't that already kinda possible? A quick example: ```d struct StaticHashTable(TKey,TValue) { struct Pair { TKey key, TValue value } Pair[] items; inout(TValue) opIndex(in TKey key) pure inout { foreach(i;items)if(i.key==key)return i.value; assert(false); } } StaticHashTable!(TKey,TValue) make(TKey,TValue)(in TValue[TKey] aa) pure { StaticHashTable!(TKey,TValue) s; s.items.length=8; s.items[0].key="a"; s.items[0].value=aa["a"]; return s; } immutable s=make(["a":"b","c":"d"]); static assert(s["a"]=="b"); ``` That is, you can serialize a hashtable into anything with a relatively easy syntax.
Sep 19 2021
next sibling parent Stefan Koch <uplink.coder googlemail.com> writes:
On Monday, 20 September 2021 at 06:11:20 UTC, Kagamin wrote:
 On Sunday, 19 September 2021 at 11:55:04 UTC, Steven 
 Schveighoffer wrote:
 On 9/19/21 1:43 AM, Stefan Koch wrote:

 What do you guys think?
I think an easier fix is just to change the call that does the AA literals to something that can be CTFEd. Right now it's an extern(C) hook that is opaque and takes void[] and TypeInfos. Something like: ```d V[K] __aALiteral(V, K)(K[] keys, V[] values); ``` Then just lower the call to that, and if that is CTFEable, it works. -Steve
Isn't that already kinda possible? A quick example: ```d struct StaticHashTable(TKey,TValue) { struct Pair { TKey key, TValue value } Pair[] items; inout(TValue) opIndex(in TKey key) pure inout { foreach(i;items)if(i.key==key)return i.value; assert(false); } } StaticHashTable!(TKey,TValue) make(TKey,TValue)(in TValue[TKey] aa) pure { StaticHashTable!(TKey,TValue) s; s.items.length=8; s.items[0].key="a"; s.items[0].value=aa["a"]; return s; } immutable s=make(["a":"b","c":"d"]); static assert(s["a"]=="b"); ``` That is, you can serialize a hashtable into anything with a relatively easy syntax.
Yes of course you can write a library feature that mimics what we already have in as a builtin. And to make searching that efficient you can add ```d Value* opIndex (Key key) { size_t search_item_idx = hash(key) % n_items; Value* result = null; for(int search_gap = 1;; search_gap++) { if (items[search_item_idx].key == key) { result = &items[search_item_idx].value; break; } else if (items[search_item_idx].key == Key.init && items[search_item_idx].value == value.init) { break; } search_item_idx = (search_item_idx + search_gap) % n_items; } return result; } ``` .... that code is lifted straight from d-runtime. But why should the user have to have another copy of that code in their program?
Sep 20 2021
prev sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 9/20/21 2:11 AM, Kagamin wrote:
 On Sunday, 19 September 2021 at 11:55:04 UTC, Steven Schveighoffer wrote:
 On 9/19/21 1:43 AM, Stefan Koch wrote:

 What do you guys think?
I think an easier fix is just to change the call that does the AA literals to something that can be CTFEd. Right now it's an extern(C) hook that is opaque and takes void[] and TypeInfos. Something like: ```d V[K] __aALiteral(V, K)(K[] keys, V[] values); ``` Then just lower the call to that, and if that is CTFEable, it works.
Isn't that already kinda possible? A quick example: ```d struct StaticHashTable(TKey,TValue) {     struct Pair { TKey key, TValue value }     Pair[] items;     inout(TValue) opIndex(in TKey key) pure inout     {         foreach(i;items)if(i.key==key)return i.value;         assert(false);     } } StaticHashTable!(TKey,TValue) make(TKey,TValue)(in TValue[TKey] aa) pure {     StaticHashTable!(TKey,TValue) s;     s.items.length=8;     s.items[0].key="a";     s.items[0].value=aa["a"];     return s; } immutable s=make(["a":"b","c":"d"]); static assert(s["a"]=="b"); ``` That is, you can serialize a hashtable into anything with a relatively easy syntax.
A `StaticHashTable` is not a builtin AA. The idea is to allow AAs to be generated at compile time, and then moved into usable runtime AAs. One thing that is horrific, but possible, is to generate something like: ```d struct FakeAA(K, V) { // mimic the true AA impl struct FakeBucket { size_t hash; void * entry; } struct FakeAAImpl { FakeBucket[] buckets; uint used; uint deleted; TypeInfo_Struct entryTI; uint firstUsed; immutable uint keysz; immutable uint valsz; immutable uint valoff; ubyte flags; } FakeAAImpl *impl; V[K] __get() { return *(cast(V[K]*)&impl); } alias __get this; } ``` And then build that thing at compile time based on the CTFE AA. But I would still like to see the AA static initializer "just work". -Steve
Sep 20 2021
parent Stefan Koch <uplink.coder googlemail.com> writes:
On Monday, 20 September 2021 at 13:02:22 UTC, Steven 
Schveighoffer wrote:
 ```d
 struct FakeAA(K, V)
 {
     // mimic the true AA impl
     struct FakeBucket {
         size_t hash;
         void * entry;
     }
     struct FakeAAImpl
     {
         FakeBucket[] buckets;
         uint used;
         uint deleted;
         TypeInfo_Struct entryTI;
         uint firstUsed;
         immutable uint keysz;
         immutable uint valsz;
         immutable uint valoff;
         ubyte flags;
     }
     FakeAAImpl *impl;
     V[K] __get() { return *(cast(V[K]*)&impl); }
     alias __get this;
 }
 ```

 And then build that thing at compile time based on the CTFE AA.

 But I would still like to see the AA static initializer "just 
 work".

 -Steve
In fact that is exactly what my implementation of AA literals does. just that it does that internally to the "static initializer generation code" except for evaluation of the hash-function it has nothing to do with CTFE ;) And by the by that is _exactly_ how _all_ generation of static initalizers work. They build up "fake" runtime structures and hope that the structures match the layout used at runtime. Which is why cross-compilation is still a pain.
Sep 20 2021