digitalmars.D.learn - Determining function template from runtime type: better ideas?

cc <cc nevernet.com> writes:
Sorry for the fairly lengthy post.  I'm wondering if there are 
any suggested good practices in place for calling templated 
functions using the runtime type of an object, e.g. what 
`typeid(object)` returns.
Consider the following situation:
```d
import std.array : join;
import std.stdio : writeln;
import std.traits : FieldNameTuple;

class Person {
	string name;
	int age;
}
class Boss : Person {
	int numEmployees;
}

string serialize(T)(T obj) {
	// ... iterate over fields and serialize stuff
	return join([FieldNameTuple!T], ",");
}

void main() {
	writeln(serialize(new Person)); // name,age
	writeln(serialize(new Boss)); // numEmployees
	writeln(serialize(cast(Person) new Boss)); // name,age
}
```
Naturally, when an object is instantiated as a `Boss` but the 
variable holding it is type `Person`, the version of the template 
that gets called is `serialize!Person`.  It's especially common 
to run into this if, say, `Person` defined `Person[] friends` and 
that array was populated by various inherited subclasses that 
needed to be recognized (we also need to recursively serialize 
the members of the parent class, but that part's trivial so I'll 
skip it for now).
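
To make the mismatch concrete, here is a minimal sketch assuming the same `Person`/`Boss` classes as above. `staticTypeName` is a hypothetical helper added purely for illustration; it shows that template deduction only ever sees the declared type of the variable, while `typeid` consults the object itself at runtime:
```d
import std.stdio : writeln;

class Person { string name; int age; }
class Boss : Person { int numEmployees; }

// Hypothetical helper: T is deduced from the static (declared) type only.
string staticTypeName(T)(T obj) {
	return T.stringof;
}

void main() {
	Person p = new Boss;
	// The template is instantiated as staticTypeName!Person.
	assert(staticTypeName(p) == "Person");
	// typeid on a class reference yields the dynamic type's TypeInfo_Class.
	assert(typeid(p) == typeid(Boss));
	assert(typeid(p) != typeid(Person));
	writeln(staticTypeName(p), " vs ", typeid(p).name);
}
```
The runtime name printed by `typeid(p).name` is fully qualified, so it includes whatever module the file is compiled as.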

So, the first basic idea I came up with was using mixins in each 
serializable class:
```d
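import std.array : join;
import std.stdio : writeln;
import std.traits : BaseClassesTuple, FieldNameTuple, hasMember;

mixin template Serializable() {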
	static if (hasMember!(BaseClassesTuple!(typeof(this))[0], "serializeMe")) {
		override string serializeMe() {
			return serializeTrue(this);
		}
	} else {
		string serializeMe() {
			return serializeTrue(this);
		}
	}
}

class Person {
	string name;
	int age;
	mixin Serializable;
}
class Boss : Person {
	int numEmployees;
	mixin Serializable;
}

string serialize(T)(T obj) {
	return obj.serializeMe();
}
private string serializeTrue(T)(T obj) {
	// ...
	return join([FieldNameTuple!T], ",");
}

void main() {
	writeln(serialize(new Person)); // name,age
	writeln(serialize(new Boss)); // numEmployees
	writeln(serialize(cast(Person) new Boss)); // numEmployees
}
```
This works, but I kind of don't like using mixins for this for 
some reason.  It feels less obvious that the class itself has 
been identified as something serializable, and you never know 
what additional methods or fields the mixin might be declaring.

I'd prefer to use UDAs, so I came up with something like:
```d
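import std.array : join;
import std.stdio : writeln;
import std.traits : FieldNameTuple, fullyQualifiedName, getSymbolsByUDA;

enum Serializable;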
@Serializable class Person {
	string name;
	int age;
}
@Serializable class Boss : Person {
	int numEmployees;
}

string serialize(Object obj) {
	string runtimeTypeName = typeid(obj).name;
	static foreach (sym; getSymbolsByUDA!(test_serialize, Serializable)) {
		if (fullyQualifiedName!sym == runtimeTypeName)
			return serializeTrue(cast(sym) obj);
	}
	assert(false, "Unable to serialize type: "~runtimeTypeName);
}
private string serializeTrue(T)(T obj) {
	// ...
	return join([FieldNameTuple!T], ",");
}

void main() {
	writeln(serialize(new Person)); // name,age
	writeln(serialize(new Boss)); // numEmployees
	writeln(serialize(cast(Person) new Boss)); // numEmployees
}
```
This also works, but it has a problem.  We pass the current 
module directly to `getSymbolsByUDA` (`test_serialize.d`), but if 
we have classes spread across multiple modules, we need some way 
to iterate through those as well.  Unfortunately I couldn't find 
any trait related to iterating through all the modules compiled 
into a project.  Could an `allModules` or such thing be added?  
Or is this non-trivial to the compilation process?

So, I ultimately came up with the following.  It requires some 
instantiation at runtime to create lookup tables between real 
type templates and `typeid` values, and requires a mixin of a 
mixin since I couldn't find a way to get the module of a symbol 
as an alias (no `moduleOf!symbol` trait?), but it seems to get 
the job done (also, I finally went and added the recursive parent 
class serialization):
```d
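import std.array : join;
import std.stdio : writeln;
import std.traits : BaseClassesTuple, FieldNameTuple, fullyQualifiedName,
	getSymbolsByUDA, hasUDA, moduleName;

enum Serializable;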
@Serializable class Person {
	string name;
	int age;
}
@Serializable class Boss : Person {
	int numEmployees;
}

mixin template RegisterSerializer(alias MODULE) {
	void RegisterSerializer() {
		static foreach (SYM; getSymbolsByUDA!(MODULE, Serializable))
			static if (is(SYM == class))
				serialTypes[fullyQualifiedName!SYM] = new SerialType!SYM;
	}
}
abstract class SerialTypeBase {
	string encodeObject(Object obj);
}
final class SerialType(T) : SerialTypeBase {
	override string encodeObject(Object obj) {
		return serializeTrue(cast(T) obj);
	}
}

void[0][string] registeredModules;
SerialTypeBase[string] serialTypes;

string serialize(T)(T obj) {
	enum string MODULENAME = moduleName!T;
	if (MODULENAME !in registeredModules) {
		registeredModules.require(MODULENAME);
		mixin("mixin RegisterSerializer!"~MODULENAME~";");
		RegisterSerializer();
	}

	string runtimeTypeName = typeid(obj).name;
	if (auto st = runtimeTypeName in serialTypes) {
		return st.encodeObject(obj);
	}
	assert(false, "Unable to serialize type: "~runtimeTypeName);
}
private string serializeTrue(T)(T obj) {
	// ...
	alias PARENT = BaseClassesTuple!(T)[0];
	static if (hasUDA!(PARENT, Serializable))
		return join([serializeTrue(cast(PARENT) obj), FieldNameTuple!T], ",");
	else
		return join([FieldNameTuple!T], ",");
}

void main() {
	writeln(serialize(new Person)); // name,age
	writeln(serialize(new Boss)); // name,age,numEmployees
	writeln(serialize(cast(Person) new Boss)); // name,age,numEmployees
}
```
Now it should be able to handle any class marked as 
`@Serializable` regardless of module.  So... are there any better 
ways to do this?  I glanced through some of the other 
serialization modules in D (orange) but it didn't look like any 
of them were doing anything like this.  Did I miss something much 
simpler?  It feels like there ought to be something inherent in 
`TypeInfo` that would let me get away with all this.
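
For reference, a small sketch of what `TypeInfo_Class` does provide at runtime, assuming the same `Person`/`Boss` classes. `classChain` is a hypothetical helper for illustration. It can recover the class name and walk the base-class chain, but none of that is a compile-time type, so it cannot be used as a template argument on its own:
```d
import std.stdio : writeln;

class Person { string name; int age; }
class Boss : Person { int numEmployees; }

// Hypothetical helper: collect the runtime class names from most derived
// up to Object, using only TypeInfo_Class.name and TypeInfo_Class.base.
string[] classChain(Object obj) {
	string[] names;
	for (auto ci = typeid(obj); ci !is null; ci = ci.base)
		names ~= ci.name;
	return names;
}

void main() {
	Person p = new Boss;
	auto chain = classChain(p);
	assert(chain.length == 3);               // Boss, Person, Object
	assert(chain[$ - 1] == "object.Object"); // the root of every class
	writeln(chain);
}
```
This is why the lookup table above is still needed: something has to map each runtime name back to code that was instantiated for that type at compile time.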
Mar 22 2022
"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Tue, Mar 22, 2022 at 07:09:07AM +0000, cc via Digitalmars-d-learn wrote:
> Sorry for the fairly lengthy post.  I'm wondering if there are any
> suggested good practices in place for calling templated functions
> using the runtime type of an object, e.g. what `typeid(object)`
> returns.
[...]

Templates are instantiated at compile-time, and polymorphic types are resolved at runtime. It's not possible to do what you describe (not directly, anyway).

However, you *can* get away with something similar by using CRTP:

	class Serializable(Derived, Base = Object) : Base {
		static if (is(Base : Serializable!(Base, C), C)) {
			override void serialize(...) {
				serializeImpl();
			}
		} else {
			void serialize(...) {
				serializeImpl();
			}
		}
		private void serializeImpl(...) {
			auto target = cast(Derived) this;
			... // Now you have the derived type, serialize accordingly
		}
	}

	class Person : Serializable!Person { ... }
	class Boss : Serializable!(Boss, Person) { ... }
	... // etc.

The idea is to inject an intermediate base class above each leaf class in your class hierarchy that automates away the boilerplate of serialization code. In accordance with CRTP, you pass the class you're declaring as a template parameter into the Serializable wrapper, so that it has compile-time information about your derived class type, which it uses in serializeImpl() to serialize the derived class data directly -- no runtime typeid() is needed.

(Generally, use of typeid() is discouraged; it still exists to support legacy D code but it's usually better to use compile-time introspection instead.)

The above example is only half the equation, of course. For deserialization, you need a way of dynamically selecting the right deserialization function overload to recreate the derived type. This can be done by registering the deserializers into a global AA keyed by derived class name using static this(), which gets run at startup time:

	static Object function(ubyte[] data)[string] deserializers;

	class Serializable(Derived, Base = Object) : Base {
		...
		static this() {
			deserializers[Derived.stringof] = (ubyte[] data) {
				auto obj = new Derived;
				... // code to decode data here
				return obj;
			};
		}
		...
	}

The main deserializer then just looks up the derived class name in the input data, and calls the appropriate function in `deserializers` to recreate the derived object. (Note: this method does not depend on Object.factory, which is fraught with problems.)

T

--
Answer: Because it breaks the logical sequence of discussion. / Question: Why is top posting bad?
Mar 22 2022
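
For readers who want something compilable, the following is one possible sketch of the serialization half of the CRTP idea from the reply above, not the reply's exact code. It makes two assumptions of its own: a hypothetical non-template root class `SerializableRoot` replaces the `static if` override detection, so every layer can simply use `override`, and "serializing" just means joining field names, as in the original question.
```d
import std.array : join;
import std.stdio : writeln;
import std.traits : FieldNameTuple;

// Hypothetical non-template root, so every CRTP layer can `override`.
abstract class SerializableRoot {
	abstract string serialize();
}

// CRTP wrapper: `Derived` is the class being declared, `Base` its parent.
class Serializable(Derived, Base = SerializableRoot) : Base {
	override string serialize() {
		// Field names declared directly in Derived, known at compile time.
		string[] own = [FieldNameTuple!Derived];
		static if (is(Base == SerializableRoot))
			return own.join(",");
		else
			// Non-virtual call into the parent layer, then our own fields.
			return super.serialize() ~ "," ~ own.join(",");
	}
}

class Person : Serializable!Person {
	string name;
	int age;
}
class Boss : Serializable!(Boss, Person) {
	int numEmployees;
}

void main() {
	Person p = new Boss; // static type Person, dynamic type Boss
	assert((new Person).serialize() == "name,age");
	assert(p.serialize() == "name,age,numEmployees");
	writeln(p.serialize());
}
```
Because `serialize` is an ordinary virtual method, the call through a `Person` reference reaches the `Boss` layer without any `typeid` lookup or registration step. The cost is that each class must name itself and its parent in its base-class list.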