digitalmars.D.learn - Storing interfaces as void[]

David Zhang <straivers98 gmail.com> writes:
I want to store interfaces as untyped void[], then cast them back 
to the interface at a later time. However, it appears to produce 
garbage values on get().

Is this even possible, and if so, what is happening here? The 
alternative would be a struct { CheckedPtr self; api_fns.... }

e.g.

void register(I)(I i) {
   auto mem = new void[](I.sizeof);
   memcpy(mem.ptr, cast(void*) i, I.sizeof);

   // CheckedPtr includes a hash of fullyQualifiedName
   map[i.get_name()] = CheckedPtr!I(mem.ptr);
}

I get(I)() {
   // basically cast(I) p
   return map[I.get_name()].as!I();
}
Mar 12 2021
Imperatorn <johan_forsberg_86 hotmail.com> writes:
On Friday, 12 March 2021 at 17:37:43 UTC, David  Zhang wrote:
 I want to store interfaces as untyped void[], then cast them 
 back to the interface at a later time. However, it appears to 
 produce garbage values on get().

 Is this even possible, and if so, what is happening here? The 
 alternative would be a struct { CheckedPtr self; api_fns.... }

 e.g.

 void register(I)(I i) {
   auto mem = new void[](I.sizeof);
   memcpy(mem.ptr, cast(void*) i, I.sizeof);

   // CheckedPtr includes a hash of fullyQualifiedName
   map[i.get_name()] = CheckedPtr!I(mem.ptr);
 }

 I get(I)() {
   // basically cast(I) p
   return map[I.get_name()].as!I();
 }
Have you tried using Variant or jsvar (https://code.dlang.org/packages/arsd-official%3Ajsvar)? 🤔
Mar 12 2021
David Zhang <straivers98 gmail.com> writes:
On Friday, 12 March 2021 at 17:46:22 UTC, Imperatorn wrote:
 On Friday, 12 March 2021 at 17:37:43 UTC, David  Zhang wrote:
 I want to store interfaces as untyped void[], then cast them 
 back to the interface at a later time. However, it appears to 
 produce garbage values on get().

 Is this even possible, and if so, what is happening here? The 
 alternative would be a struct { CheckedPtr self; api_fns.... }

 e.g.

 void register(I)(I i) {
   auto mem = new void[](I.sizeof);
   memcpy(mem.ptr, cast(void*) i, I.sizeof);

   // CheckedPtr includes a hash of fullyQualifiedName
   map[i.get_name()] = CheckedPtr!I(mem.ptr);
 }

 I get(I)() {
   // basically cast(I) p
   return map[I.get_name()].as!I();
 }
 Have you tried using Variant or jsvar (https://code.dlang.org/packages/arsd-official%3Ajsvar)? 🤔
It doesn't appear to support interfaces, see opAssign at line 682. It occurs to me that I.sizeof == 8 which is just enough for the vtbl, but not enough for an implementation ptr. Maybe it's a pointer to a {self, vtbl} pair? SomeClass.sizeof == 16, which is enough storage for both...
Mar 12 2021
Imperatorn <johan_forsberg_86 hotmail.com> writes:
On Friday, 12 March 2021 at 17:57:06 UTC, David  Zhang wrote:
 On Friday, 12 March 2021 at 17:46:22 UTC, Imperatorn wrote:
 On Friday, 12 March 2021 at 17:37:43 UTC, David  Zhang wrote:
 [...]
 Have you tried using Variant or jsvar (https://code.dlang.org/packages/arsd-official%3Ajsvar)? 🤔
 It doesn't appear to support interfaces, see opAssign at line 682. It occurs to me that I.sizeof == 8 which is just enough for the vtbl, but not enough for an implementation ptr. Maybe it's a pointer to a {self, vtbl} pair? SomeClass.sizeof == 16, which is enough storage for both...
Did you try Variant? Seems to work for me
Mar 12 2021
David Zhang <straivers98 gmail.com> writes:
On Friday, 12 March 2021 at 18:14:12 UTC, Imperatorn wrote:
 On Friday, 12 March 2021 at 17:57:06 UTC, David  Zhang wrote:
 On Friday, 12 March 2021 at 17:46:22 UTC, Imperatorn wrote:
 On Friday, 12 March 2021 at 17:37:43 UTC, David  Zhang wrote:
 [...]
 Have you tried using Variant or jsvar (https://code.dlang.org/packages/arsd-official%3Ajsvar)? 🤔
 It doesn't appear to support interfaces, see opAssign at line 682. It occurs to me that I.sizeof == 8 which is just enough for the vtbl, but not enough for an implementation ptr. Maybe it's a pointer to a {self, vtbl} pair? SomeClass.sizeof == 16, which is enough storage for both...
 Did you try Variant? Seems to work for me
It seems a bit overkill for this. I also want to be able to track memory usage here, and Variant doesn't appear able to store arbitrary-sized structs without untrackable allocations. The idea has merit, but doesn't give me the control I desire :)
Mar 12 2021
tsbockman <thomas.bockman gmail.com> writes:
On Friday, 12 March 2021 at 17:37:43 UTC, David  Zhang wrote:
 I want to store interfaces as untyped void[], then cast them 
 back to the interface at a later time.
Assuming these interfaces are actual D `interface`s, declared with the keyword and using the default `extern(D)` linkage, you're way over-complicating things. All `extern(D)` `class`es are sub-types of `object.Object`, and have type information accessible through the implicit virtual function table pointer `__vptr` that can be used to perform safe dynamic casts:

////////////////////////////////////////////
/* Fully qualified names can be *very* long because of templates, so it can be
wasteful to store and compare them at runtime. Let's use `TypeInfo` instead: */
Object[TypeInfo] map;

void register(I)(I i)
    if(is(I == interface)) // This wouldn't work right with value types like structs.
{
    /* The garbage collector will keep the instance referenced by `i` alive for us
    as long as necessary. So, there is no need to copy the instance unless we want
    to be able to mutate it elsewhere without affecting the instance referenced by
    `map`. */

    /* `typeid(i)` will compile, but probably doesn't do what you want: you need to
    retrieve the value using the same key you put it in with, and below that has
    to be `typeid(I)`: */
    map[typeid(I)] = cast(Object) i;
}

I get(I)()
    if(is(I == interface)) // This wouldn't work right with value types like structs.
{
    /* This will return `null` if the value isn't really a reference to an instance
    of a subtype of `I`, or if the key isn't in the map yet. If you don't care
    about the latter case, it can be shortened to just `cast(I) map[I.stringof]`. */
    auto valuePtr = (typeid(I) in map);
    return (valuePtr is null)? null : cast(I) *valuePtr;
}
////////////////////////////////////////////

Many variations on this are possible, depending on exactly what you're really trying to accomplish. This is my best guess as to what you're after, though, without seeing the context for the code you posted.
 However, it appears to produce garbage values on get().

 Is this even possible, and if so, what is happening here? The 
 alternative would be a struct { CheckedPtr self; api_fns.... }

 e.g.

 void register(I)(I i) {
   auto mem = new void[](I.sizeof);
   memcpy(mem.ptr, cast(void*) i, I.sizeof);

   // CheckedPtr includes a hash of fullyQualifiedName
   map[i.get_name()] = CheckedPtr!I(mem.ptr);
 }

 I get(I)() {
   // basically cast(I) p
   return map[I.get_name()].as!I();
 }
Your code confuses two levels of indirection (because D makes this super confusing); that is why it doesn't do what you want. `cast(void*) i` is an untyped pointer to some class instance, while `I.sizeof` is the size of *the pointer itself*, NOT the instance it points to.

`I` is *not* the type of an interface instance, it is the type of a reference to an instance of the interface. (A reference is a pointer that sometimes automatically dereferences itself.) So, `I.sizeof` is always `(void*).sizeof`, the size of a pointer.

The types of class and interface instances cannot be named or referenced in D, but `__traits(classInstanceSize, C)` will get you the size of an instance of class `C`. Getting the size of an interface instance isn't really a sensible thing to do, since the whole point of interfaces is that you don't need to know about the implementing type to work with them.

Regardless, copying or moving the memory of class instances directly is generally a bad idea, for various reasons.
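
A minimal sketch of the size distinction described above. The names `Named` and `Widget` are made up for illustration and do not come from the original code; the sketch only assumes that `.sizeof` on an interface type reports the size of a reference, as stated in the post.

```d
import std.stdio : writeln;

// Hypothetical interface, for illustration only.
interface Named { string name(); }

// Hypothetical implementing class with enough fields that an instance
// is obviously larger than a single pointer.
class Widget : Named
{
    long[4] payload;
    override string name() { return "widget"; }
}

void main()
{
    // `Named` is the type of a *reference*, so `.sizeof` is the size of a pointer.
    static assert(Named.sizeof == (void*).sizeof);

    // The instance holds hidden pointers plus the declared fields,
    // so it is larger than the reference that points at it.
    enum instanceSize = __traits(classInstanceSize, Widget);
    static assert(instanceSize >= (long[4]).sizeof + Named.sizeof);

    writeln("reference: ", Named.sizeof, " bytes, instance: ", instanceSize, " bytes");
}
```

Copying `I.sizeof` bytes starting at `cast(void*) i`, as in the original `register`, therefore copies only the first pointer-sized chunk of the object, not a usable reference to it.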
Mar 12 2021
tsbockman <thomas.bockman gmail.com> writes:
On Friday, 12 March 2021 at 18:50:26 UTC, tsbockman wrote:
     /* This will return `null` if the value isn't really a 
 reference to an instance
     of a subtype of `I`, or if the key isn't in the map yet. If 
 you don't care
     about the latter case, it can be shortened to just `cast(I) 
 map[I.stringof]`. */
Oops, that last bit should say `cast(I) map[typeid(I)]`.
Mar 12 2021
David Zhang <straivers98 gmail.com> writes:
On Friday, 12 March 2021 at 18:50:26 UTC, tsbockman wrote:
 <snip>
The idea is to implement a service locator s.t. code like this is possible:

// struct (I didn't mention this in the top post, my mistake)
auto log = Logger();
api_registry.register!Logger(log);

// class/interface
auto input = new InputReplay(recorded_input);
api_registry.register!InputStream(input);

//
// somewhere else
//

// interface
auto input = api_registry.get!InputStream();
// do something with input.

// struct*
api_registry.get!Logger().info(...);

//
// and additionally
//

auto input = new KeyboardInput(...);
api_registry.replace!InputStream(input);

 /* Fully qualified names can be *very* long because of 
 templates, so it can be wasteful to store and compare them at 
 runtime. Let's use `TypeInfo` instead: */
Aye, I'm using hashes. The idea is to support either D interfaces or structs with arbitrary content.
 `I` is *not* the type of an interface instance, it is the type 
 of a reference to an instance of the interface.
So `I i` is a reference to the instance, which itself holds a reference to the implementation, like a reference to a delegate (pointer to (ctx, fn))? Thus cast(void*) produces something like *(ctx, fn).

I don't mind that. I just want to store that reference as void[], be able to replace it with some other reference to another implementation as void[], then retrieve that as a reference intact. I don't really need to copy or move the class instances here, just be able to read, assign, and replace references to them in the same place that I read, assign, and replace void[]'s of structs.
Mar 12 2021
tsbockman <thomas.bockman gmail.com> writes:
On Friday, 12 March 2021 at 19:24:17 UTC, David  Zhang wrote:
 On Friday, 12 March 2021 at 18:50:26 UTC, tsbockman wrote:
 <snip>
 The idea is to implement a service locator s.t. code like this is possible:
 ...
 I don't really need to copy or move the class instances here, just be able to read, assign, and replace references to them in the same place that I read, assign, and replace void[]'s of structs.
Why do you think you need a `void[]` slice? I think `void*` pointers are sufficient. This handles all normal data types, as long as they are allocated on the GC heap:

////////////////////////////////
/* NOTE: It might be wiser to make this a `final class`
instead of a `struct` to get reference semantics: */
struct Registry {
    private void*[TypeInfo] map;

    /// Return true if the registry was updated, false if not.
    bool put(bool overwrite = true, Indirect)(Indirect i) @trusted
        if(is(Indirect : T*, T) // Support structs, static arrays, etc.
        || is(Indirect == interface) || is(Indirect == class))
    {
        auto key = typeid(Indirect), value = cast(void*) i;
        bool updated;
        if(value is null)
            updated = map.remove(key);
        else {
            static if(overwrite) {
                updated = true;
                map[key] = value;
            } else
                updated = (map.require(key, value) is value);
        }
        return updated;
    }
    alias put(Indirect) = put!(true, Indirect);

    bool remove(Indirect)() @trusted
        if(is(Indirect : T*, T) // Support structs, static arrays, etc.
        || is(Indirect == interface) || is(Indirect == class))
    {
        return map.remove(typeid(Indirect));
    }

    /** Returns a reference (for interfaces and classes) or a pointer
    (for everything else) if the type has been registered, and null otherwise. **/
    Indirect get(Indirect)() @trusted
        if(is(Indirect : T*, T) // Support structs, static arrays, etc.
        || is(Indirect == interface) || is(Indirect == class))
    {
        return cast(Indirect) map.get(typeid(Indirect), null);
    }
}

@safe unittest {
    static interface I {
        char c() const pure @safe nothrow @nogc;
    }
    static class C : I {
        private char _c;
        this(char c) inout pure @safe nothrow @nogc {
            this._c = c; }
        override char c() const pure @safe nothrow @nogc {
            return _c; }
    }
    static struct S {
        char c = 'S';
        this(char c) inout pure @safe nothrow @nogc {
            this.c = c; }
    }

    Registry registry;
    assert( registry.put!false(new C('1')));
    assert( registry.put!I(new C('2')));
    assert( registry.put(new S('$')));
    assert(!registry.put!false(new C('3'))); // Optionally protect existing entries.

    assert(registry.get!(I).c == '2');
    assert(registry.get!(C).c == '1');
    assert(registry.remove!I);
    assert(registry.get!I is null);
    assert(registry.get!C !is null);
    assert(registry.get!(S*).c == '$');
    assert(registry.get!(int*) is null);
}
////////////////////////////////

NOTE: If you only need one Registry instance per thread, then you probably don't need the associative array at all; instead just use a template like this:

////////////////////////////////
template registered(Indirect)
    if(is(Indirect : T*, T) // Support structs, static arrays, etc.
    || is(Indirect == interface) || is(Indirect == class))
{
    /* This uses thread local storage (TLS). Sharing across the entire
    process is possible too, but would require synchronization of the
    registered objects, not just this reference/pointer: */
    private Indirect indirect = null;

    /// Return true if the registry was updated, false if not.
    bool put(bool overwrite = true)(Indirect i) @safe {
        bool updated = (indirect !is i);
        static if(overwrite) {
            indirect = i;
        } else {
            updated &= (indirect is null);
            if(updated)
                indirect = i;
        }
        return updated;
    }

    bool remove() @safe {
        bool updated = (indirect !is null);
        indirect = null;
        return updated;
    }

    /** Returns a reference (for interfaces and classes) or a pointer
    (for everything else) if the type has been registered, and null otherwise. **/
    Indirect get() @safe {
        return indirect; }
}

@safe unittest {
    static interface I {
        char c() const pure @safe nothrow @nogc;
    }
    static class C : I {
        private char _c;
        this(char c) inout pure @safe nothrow @nogc {
            this._c = c; }
        override char c() const pure @safe nothrow @nogc {
            return _c; }
    }
    static struct S {
        char c = 'S';
        this(char c) inout pure @safe nothrow @nogc {
            this.c = c; }
    }

    assert( registered!(C).put!false(new C('1')));
    assert( registered!(I).put(new C('2')));
    assert( registered!(S*).put(new S('$')));
    assert(!registered!(C).put!false(new C('3'))); // Optionally protect existing entries.

    assert(registered!(I).get.c == '2');
    assert(registered!(C).get.c == '1');
    assert(registered!(I).remove());
    assert(registered!(I).get is null);
    assert(registered!(C).get !is null);
    assert(registered!(S*).get.c == '$');
    assert(registered!(int*).get is null);
}
////////////////////////////////

 Aye, I'm using hashes. The idea is to support either D 
 interfaces or structs with arbitrary content.
You can use TypeInfo references as the keys for the struct types, too. It's better than hashes because the TypeInfo is already being generated anyway, and is guaranteed to be unique, unlike your hashes which could theoretically collide.
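
A small sketch of `TypeInfo` keys covering both kinds of entry, under the assumption that structs are registered through pointer types as in the `Registry` above. The names `Logger`, `InputStream`, `labels`, `label` and `labelOf` are placeholders for illustration, not part of any existing API.

```d
// Hypothetical service types, for illustration only.
struct Logger { string prefix; }
interface InputStream { int read(); }

// One associative array keyed by TypeInfo serves structs (via pointer
// types) and interfaces alike; no hand-rolled name hashes are involved.
string[TypeInfo] labels;

/// Associate a label with the static type T.
void label(T)(string text)
{
    labels[typeid(T)] = text;
}

/// Look up the label for the static type T, or null if none was stored.
string labelOf(T)()
{
    return labels.get(typeid(T), null);
}

void main()
{
    label!(Logger*)("logger");
    label!InputStream("input");

    assert(labelOf!(Logger*) == "logger");
    assert(labelOf!InputStream == "input");
    assert(labelOf!(int*) is null); // never registered
}
```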
 `I` is *not* the type of an interface instance, it is the type 
 of a reference to an instance of the interface.
 So `I i` is a reference to the instance, which itself holds a reference to the implementation, like a reference to a delegate (pointer to (ctx, fn))?
No. What you are calling "the implementation" is the same thing as "the instance". `I i` is a reference into the interior of the implementing class instance, offset such that it points at the virtual function table pointer for that interface. The lowering is something like this:

////////////////////////////////
interface I { ... }
class C : I { ... }

struct C_Instance {
    void* __vtblForC;
    // ... other junk
    void* __vtblForI;
    // ... explicit and inherited class fields, if any.
}

C c = new C; // Works like `C_Instance* c = new C_Instance;`
I i = c; // Works like `void** i = (C_Instance.__vtblForI.offsetof + cast(void*) c);`
////////////////////////////////

You can prove this to yourself by inspecting the raw numerical addresses. You will find that the address used by the interface reference points to a location *inside* the instance of C:

////////////////////////////////
void main() {
    import std.stdio : writeln;

    static interface I {
        char c() @safe;
    }
    static class C : I {
        override char c() @safe {
            return 'C'; }
    }

    C c = new C;
    const cStart = cast(size_t) cast(void*) c;
    const cEnd = cStart + __traits(classInstanceSize, C);
    I i = c;
    const iStart = cast(size_t) cast(void*) i;

    writeln(cStart <= iStart && iStart < cEnd);
}
////////////////////////////////

 Thus cast(void*) produces something like *(ctx, fn).
No. See above.
Mar 12 2021
David Zhang <straivers98 gmail.com> writes:
On Friday, 12 March 2021 at 22:18:59 UTC, tsbockman wrote:
 Why do you think you need a `void[]` slice? I think `void*` 
 pointers are sufficient. This handles all normal data types, as 
 long as they are allocated on the GC heap:
I wanted to have the registry own the structs' memory, though using new makes more sense on second thought. Your example templated implementation makes so much sense, though it doesn't work in this case. Thanks for the idea though.
 Aye, I'm using hashes. The idea is to support either D 
 interfaces or structs with arbitrary content.
 You can use TypeInfo references as the keys for the struct types, too. It's better than hashes because the TypeInfo is already being generated anyway, and is guaranteed to be unique, unlike your hashes which could theoretically collide.
Makes sense. Using TypeInfo never occurred to me. I assume they are generated for COM classes as well?
 `I` is *not* the type of an interface instance, it is the 
 type of a reference to an instance of the interface.
 So `I i` is a reference to the instance, which itself holds a reference to the implementation, like a reference to a delegate (pointer to (ctx, fn))?
 No. What you are calling "the implementation" is the same thing as "the instance". `I i` is a reference into the interior of the implementing class instance, offset such that it points at the virtual function table pointer for that interface. The lowering is something like this:

 ////////////////////////////////
 interface I { ... }
 class C : I { ... }

 struct C_Instance {
     void* __vtblForC;
     // ... other junk
     void* __vtblForI;
     // ... explicit and inherited class fields, if any.
 }

 C c = new C; // Works like `C_Instance* c = new C_Instance;`
 I i = c; // Works like `void** i = (C_Instance.__vtblForI.offsetof + cast(void*) c);`
Makes sense, I always thought of them as existing in separate places. So much head-bashing, and it's over. Thanks, tsbockman, Imperatorn.
Mar 12 2021
tsbockman <thomas.bockman gmail.com> writes:
On Saturday, 13 March 2021 at 00:36:37 UTC, David  Zhang wrote:
 On Friday, 12 March 2021 at 22:18:59 UTC, tsbockman wrote:
 You can use TypeInfo references as the keys for the struct 
 types, too. It's better than hashes because the TypeInfo is 
 already being generated anyway, and is guaranteed to be 
 unique, unlike your hashes which could theoretically collide.
 Makes sense. Using TypeInfo never occurred to me. I assume they are generated for COM classes as well?
I'm not sure about that; you should test it yourself. I know that runtime type information support is incomplete for some non-`extern(D)` types - for example:
    https://issues.dlang.org/show_bug.cgi?id=21690

You can always fall back to fully qualified names for non-`extern(D)` stuff if you have to. But you should protect against hash collisions somehow, if you go that route.

Here's a hybrid approach that is immune to hash collisions and works across DLL boundaries, but can almost always verify equality with a single pointer comparison:

///////////////////////////////////////
struct TypeKey {
private:
    const(void)* ptr;
    size_t length;

    this(const(TypeInfo) typeInfo) const pure @trusted nothrow @nogc {
        ptr = cast(const(void)*) typeInfo;
        length = 0u;
    }
    this(string fqn) immutable pure @trusted nothrow {
        /* We need to allocate a block of size_t to ensure proper alignment
        for the first chunk, which is the hash code of the fqn string: */
        size_t[] chunks = new size_t[1 + (fqn.length + (size_t.sizeof - 1)) / size_t.sizeof];
        chunks[0] = hashOf(fqn);
        (cast(char*) chunks.ptr)[size_t.sizeof .. size_t.sizeof + fqn.length] = fqn;
        ptr = cast(immutable(void)*) chunks.ptr;
        length = fqn.length;
    }

    @property const(TypeInfo) typeInfo() const pure @trusted nothrow @nogc
        in(length == 0u)
    {
        return cast(const(TypeInfo)) ptr; }
    @property size_t fqnHash() const pure @trusted nothrow @nogc
        in(length != 0u)
    {
        return *cast(const(size_t)*) ptr; }
    @property string fqn() const pure @trusted nothrow @nogc
        in(length != 0u)
    {
        const fqnPtr = cast(immutable(char)*) (this.ptr + size_t.sizeof);
        return fqnPtr[0 .. length];
    }

public:
    string toString() const @safe {
        return (length == 0u)? typeInfo.toString() : fqn; }
    size_t toHash() const @safe nothrow {
        return (length == 0u)? typeInfo.toHash : fqnHash; }
    bool opEquals(TypeKey that) const @trusted {
        if(this.ptr is that.ptr)
            return true;
        if(this.length != that.length)
            return false;
        if(length == 0u)
            return (this.typeInfo == that.typeInfo);
        if(this.fqnHash != that.fqnHash)
            return false;
        return (this.fqn == that.fqn);
    }
}

template typeKeyOf(Indirect)
    if(is(Indirect : T*, T) // Support structs, static arrays, etc.
    || is(Indirect == interface) || is(Indirect == class))
{
    static if(is(Indirect : T*, T) || (__traits(getLinkage, Indirect) == "D")) {
        @property const(TypeKey) typeKeyOf() pure @safe nothrow @nogc {
            return const(TypeKey)(typeid(Indirect)); }
    } else {
        /* For FQN-based keys, ideally the whole process should share a single
        copy of the heap allocated fqn and its hash, so we'll use a global.
        No synchronization is necessary post construction, since it is immutable: */
        private immutable TypeKey masterKey;
        shared static this() {
            // With some bit-twiddling, this could be done at compile time, if needed:
            import std.traits : fullyQualifiedName;
            masterKey = immutable(TypeKey)(fullyQualifiedName!Indirect);
        }
        @property immutable(TypeKey) typeKeyOf() @safe nothrow @nogc {
            assert(masterKey.ptr !is null);
            return masterKey;
        }
    }
}

@safe unittest {
    static extern(C++) class X { }
    static extern(C++) class Y { }

    assert(typeKeyOf!(int*) == typeKeyOf!(int*));
    assert(typeKeyOf!(int*) != typeKeyOf!(float*));
    assert(typeKeyOf!X == typeKeyOf!X);
    assert(typeKeyOf!X != typeKeyOf!Y);
    assert(typeKeyOf!X != typeKeyOf!(int*));
    assert(typeKeyOf!(float*) != typeKeyOf!Y);
}
///////////////////////////////////////

 The lowering is something like this:
 ...
 Makes sense, I always thought of them as existing in separate places.
Yeah, there are multiple reasonable ways of supporting interfaces in a language, each with their own trade-offs. But, this is the one used by D at the moment.
 So much head-bashing, and it's over. Thanks, tsbockman, 
 Imperatorn.
You're very welcome!
Mar 12 2021
frame <frame86 live.com> writes:
On Friday, 12 March 2021 at 17:37:43 UTC, David  Zhang wrote:
 I want to store interfaces as untyped void[], then cast them 
 back to the interface at a later time. However, it appears to 
 produce garbage values on get().

 Is this even possible, and if so, what is happening here? The 
 alternative would be a struct { CheckedPtr self; api_fns.... }

 e.g.

 void register(I)(I i) {
   auto mem = new void[](I.sizeof);
   memcpy(mem.ptr, cast(void*) i, I.sizeof);

   // CheckedPtr includes a hash of fullyQualifiedName
   map[i.get_name()] = CheckedPtr!I(mem.ptr);
 }

 I get(I)() {
   // basically cast(I) p
   return map[I.get_name()].as!I();
 }
Maybe I don't get this right but why you don't just use the void[] as it is?

void[] mem = [i];
//...
return (cast(T[]) mem)[0];

This is how I exchange variable data between DLLs.
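
Spelled out as a complete program, the slice approach above might look like the sketch below. The helper names `pack` and `unpack` and the types `Shape`, `Triangle` and `Point` are invented for illustration; only the two casts come from the snippet.

```d
// Hypothetical types, for illustration only.
interface Shape { int sides(); }

class Triangle : Shape
{
    override int sides() { return 3; }
}

struct Point { int x, y; }

/// Copy one value of type T into a freshly allocated, untyped slice.
void[] pack(T)(T value)
{
    T[] typed = [value];
    return typed; // T[] converts implicitly to void[]
}

/// Reinterpret the slice as T[] and read the single element back.
T unpack(T)(void[] mem)
{
    assert(mem.length == T.sizeof);
    return (cast(T[]) mem)[0];
}

void main()
{
    // For an interface, the slice holds the reference, not the object itself.
    Shape shape = new Triangle;
    void[] a = pack(shape);
    assert(a.length == Shape.sizeof);
    assert(unpack!Shape(a) is shape);
    assert(unpack!Shape(a).sides() == 3);

    // For a struct, the slice holds a copy of the value.
    void[] b = pack(Point(1, 2));
    assert(unpack!Point(b) == Point(1, 2));
}
```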
Mar 13 2021
tsbockman <thomas.bockman gmail.com> writes:
On Saturday, 13 March 2021 at 15:44:53 UTC, frame wrote:
 Maybe I don't get this right but why you don't just use the 
 void[] as it is?

 void[] mem = [i];
 //...
 return (cast(T[]) mem)[0];

 This is how I exchange variable data between DLLs.
That works, and is more elegant than the OP's solution. (And, I didn't know you could do it that way, so thanks for sharing!)

However, it still adds an unnecessary second level of indirection for interfaces and classes. The simplest and most efficient solution is actually this:

auto indirect = cast(void*) i; // i is of type I.
// ...
return cast(I) indirect;

Where `is(I : T*, T) || is(I == interface) || is(I == class)`. The constraint enforces that all types should be accessed through exactly one level of indirection, allowing the same code to handle both reference types and value types with maximum efficiency.
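
As a rough, self-contained illustration of that round trip: the helper names `erase`, `restore` and `isIndirect`, and the types `Greeter`, `English` and `Counter`, are hypothetical and exist only for this sketch.

```d
// Hypothetical types, for illustration only.
interface Greeter { string greet(); }

class English : Greeter
{
    override string greet() { return "hello"; }
}

struct Counter { int value; }

/// True for pointers, interfaces and classes: exactly one level of indirection.
enum isIndirect(T) = is(T : U*, U) || is(T == interface) || is(T == class);

/// Forget the static type, keeping only the address.
void* erase(T)(T t) if (isIndirect!T) { return cast(void*) t; }

/// Reinterpret the address as T. The caller must use the same T as in erase.
T restore(T)(void* p) if (isIndirect!T) { return cast(T) p; }

void main()
{
    auto english = new English;
    Greeter greeter = english;

    void* p = erase(greeter);
    assert(restore!Greeter(p) is greeter);
    assert(restore!Greeter(p).greet() == "hello");

    // The interface reference points into the interior of the object, so the
    // erased pointer is not the address of the class instance. Restoring it
    // as English instead of Greeter would be wrong.
    assert(p !is cast(void*) english);

    // Structs go through a pointer, giving the same single indirection.
    auto counter = new Counter(7);
    void* q = erase(counter);
    assert(restore!(Counter*)(q).value == 7);
}
```

The cast back from `void*` is a plain reinterpretation with no runtime check, which is why the stored key (for example `typeid(I)`) has to identify the exact static type used when erasing.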
Mar 13 2021