
digitalmars.D - CRTP + compile-time introspection + static ctors = WIN

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
Recently, I needed to extend a simple serialization system I wrote for
one of my projects to handle polymorphic objects.  It's all data-only
structs and classes, so no need for fancy heavyweight serialization
libraries.

One way to do this was to add load() and save() methods in the base
class, and override them for every derived class.  However, this would
be too much boilerplate, and prone to mistakes (forget to override a
method, and derived class data would fail to be serialized).
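
A minimal sketch of that failure mode, with hypothetical class names and a string-returning save() chosen only for illustration (none of this is from the actual project):

```d
import std.format : format;

class Base {
	int id;
	// Hand-written serialization for the base class
	string save() { return format("Base id=%s", id); }
}

class Good : Base {
	int x;
	// Every derived class must remember to repeat this by hand
	override string save() { return format("Good id=%s x=%s", id, x); }
}

class Forgot : Base {
	int y;	// no save() override: y is silently dropped
}

void main() {
	Base b = new Forgot;
	// Compiles and runs without complaint, but y never reaches the output.
	assert(b.save() == "Base id=0");
}
```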

Another solution is to use mixins to inject these methods into each
derived class. But again, too much boilerplate, and prone to forgetting
to include the mixin statement in the class.

Yesterday, I hit upon a nice solution: use CRTP (the Curiously Recurring
Template Pattern) to inject these methods into each derived class:

	class Saveable(Derived, Base) : Base {
		static if (is(Base == Object)) {
			// Top-level virtual function
			void save() { ... }
		} else {
			// Derived class override
			override void save() { ... }
		}
	}

	class Base : Saveable!(Base, Object) { ...  }

	class Derived1 : Saveable!(Derived1, Base) { ... }

	class Derived2 : Saveable!(Derived2, Base) { ... }

Since the CRTP is right on the first line of the class declaration, it's
hard to miss, and it's easy to notice when I forget to include it (as
opposed to a mixin line buried somewhere in a potentially large class
definition).

The Base parameter to Saveable lets us nicely inject overridable methods
into the class hierarchy, and also to differentiate between top-level
methods and derived class overrides.

Saveable.save() uses the template argument to introspect the derived
class and generate code to serialize its fields. It includes code to
generate a tag in the serialized output to identify what type it is.
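
A minimal sketch of how such a save() could look. The `Sink` alias, the `writeOwnFields` helper, the `Tag{field=value;}` output format, and the `Shape`/`Circle` classes are all assumptions made for illustration; the real code's output type and format are not shown in this post:

```d
import std.array : Appender, appender;
import std.format : formattedWrite;

// Hypothetical in-memory sink standing in for the real output file type.
alias Sink = Appender!string;

// Write the fields declared directly in T (.tupleof excludes inherited ones).
void writeOwnFields(T)(T obj, ref Sink sink) {
	foreach (i, field; obj.tupleof)
		sink.formattedWrite("%s=%s;", __traits(identifier, T.tupleof[i]), field);
}

class Saveable(Derived, Base) : Base {
	static if (is(Base == Object)) {
		// Top-level virtual function
		void save(ref Sink sink) {
			sink.put(Derived.stringof ~ "{");	// type tag
			writeOwnFields(cast(Derived) this, sink);
			sink.put("}");
		}
	} else {
		// Derived class override
		override void save(ref Sink sink) {
			sink.put(Derived.stringof ~ "{");	// type tag
			super.save(sink);	// nested record for inherited fields
			writeOwnFields(cast(Derived) this, sink);
			sink.put("}");
		}
	}
}

class Shape : Saveable!(Shape, Object) { int id; }
class Circle : Saveable!(Circle, Shape) { int radius; }

void main() {
	auto c = new Circle;
	c.id = 7;
	c.radius = 2;
	Shape s = c;	// save through a base class reference
	auto sink = appender!string();
	s.save(sink);
	assert(sink.data == "Circle{Shape{id=7;}radius=2;}");
}
```

Neither `Shape` nor `Circle` contains any serialization code of its own; the field list comes entirely from `.tupleof` on the `Derived` template argument.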

That takes care of the serialization half of the task.

For deserialization, there was the possibility of using Object.factory.
However, the API is clunky, and it leaves a disconnect: it returns a
plain Object, with no compile-time knowledge for reading the fields back
with the right types.

For this, static ctors come to the rescue. I expanded Saveable thus:

	alias Loader = Object function(InputFile);
	Loader[string] classLoaders;

	class Saveable(Derived, Base) : Base {
		static if (is(Base == Object)) {
			// Top-level virtual function
			void save() { ... }
		} else {
			// Derived class override
			override void save() { ... }
		}

		static this()
		{
			classLoaders[Derived.stringof] = (InputFile f) {
				auto result = new Derived;
				... // use introspection to read Derived's fields back
				return result;
			};
		}
	}

The magic here is that the static this() block is generated *once per
instantiation* of Saveable, and it has full compile-time knowledge of the
derived class. So the function literal can use compile-time
introspection to generate the deserialization code.  Then this knowledge
is translated to runtime by registering the function literal into a
global table of loaders, keyed by the class name. (For simplicity, I
used .stringof here; for larger-scale projects you probably want
.mangleof instead.)
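
A small illustration of why .stringof can collide as a table key. The `PluginA`/`PluginB` wrappers are hypothetical stand-ins for two scopes (or modules) that each define a class with the same name:

```d
struct PluginA { static class Config { int x; } }
struct PluginB { static class Config { int x; } }

// .stringof yields only the unqualified name, so both classes would
// end up under the same key in a table keyed by .stringof.
static assert(PluginA.Config.stringof == "Config");
static assert(PluginB.Config.stringof == "Config");

// .mangleof encodes the enclosing scopes, so the keys stay distinct.
static assert(PluginA.Config.mangleof != PluginB.Config.mangleof);

void main() {}
```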

And since static this() blocks are run at program startup and dynamic
library load time, this ensures that after program startup,
`classLoaders` has knowledge of all types the program will ever use.
So the deserialization code can simply look up the saved type tag in
`classLoaders`, and call the function pointer to reconstruct the object.
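
A self-contained sketch of the registration and lookup, under some loud assumptions: `Fields` (a name-to-text map) is a hypothetical stand-in for InputFile, the `load` function and the `Shape`/`Circle`/`Rect` classes are made up, save() is omitted, and only the fields declared directly in each class are restored (inherited fields are not handled here):

```d
import std.conv : to;

// Hypothetical stand-in for InputFile: field name -> text value.
alias Fields = string[string];
alias Loader = Object function(Fields);
Loader[string] classLoaders;

class Saveable(Derived, Base) : Base {
	static this() {
		// Runs once per instantiation, so each Saveable class registers itself.
		classLoaders[Derived.stringof] = (Fields f) {
			auto result = new Derived;
			// Restore the fields declared directly in Derived, converting
			// each text value to the field's static type.
			static foreach (i; 0 .. Derived.tupleof.length)
				result.tupleof[i] = f[__traits(identifier, Derived.tupleof[i])]
					.to!(typeof(Derived.tupleof[i]));
			return cast(Object) result;
		};
	}
}

class Shape : Saveable!(Shape, Object) { }
class Circle : Saveable!(Circle, Shape) { int radius; }
class Rect : Saveable!(Rect, Shape) { int width; int height; }

// Look up the saved type tag and call the registered loader.
Object load(string tag, Fields f) {
	return classLoaders[tag](f);
}

void main() {
	// No explicit registration call appears anywhere in this program.
	assert("Circle" in classLoaders);
	auto r = cast(Rect) load("Rect", ["width": "3", "height": "4"]);
	assert(r !is null && r.width == 3 && r.height == 4);
}
```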

The result: to make any class serializable, you just replace:

	class MyClass : MyBase { ... }

with

	class MyClass : Saveable!(MyClass, MyBase) { ... }

and everything else is taken care of automatically. No need for mixins,
no need for repetitious serialization boilerplate polluting every class,
no need even for runtime TypeInfo's.  This will support even class
definitions loaded via dynamic libraries -- as long as you use
Runtime.loadLibrary to ensure static ctors are run -- since the static
ctors will inject any new class loaders into `classLoaders`, thus
automatically "teaching" the deserialization code how to deserialize the
corresponding classes.

CRTP + compile-time introspection + static ctors = WIN

D rocks!!


T

-- 
"Outlook not so good." That magic 8-ball knows everything! I'll ask about
Exchange Server next. -- (Stolen from the net)
Jan 15 2021
Adam D. Ruppe <destructionator gmail.com> writes:
On Friday, 15 January 2021 at 18:31:18 UTC, H. S. Teoh wrote:
 CRTP + compile-time introspection + static ctors = WIN
Truth. I wrote a little on this a while ago too: http://dpldocs.info/this-week-in-d/Blog.Posted_2019_06_10.html#tip-of-the-week and http://dpldocs.info/this-week-in-d/Blog.Posted_2019_08_05.html#what-adam-is-working-on are both on the same topic. I like using static ctors with mixin templates too: you can define your own private variables and have them initialized. My jni.d does that for bridging.
 D rocks!!
D is bigger than a rock. D boulders.
Jan 15 2021
Imperatorn <johan_forsberg_86 hotmail.com> writes:
On Friday, 15 January 2021 at 18:34:12 UTC, Adam D. Ruppe wrote:
 On Friday, 15 January 2021 at 18:31:18 UTC, H. S. Teoh wrote:
 CRTP + compile-time introspection + static ctors = WIN
 truth I wrote a lil on this a while ago too http://dpldocs.info/this-week-in-d/Blog.Posted_2019_06_10.html#tip-of-the-week and http://dpldocs.info/this-week-in-d/Blog.Posted_2019_08_05.html#what-adam-is-working-on are both on the same topic. I like using static ctors with mixin templates too, you can define your own private vars and get them init, my jni.d does that for bridging.
 D rocks!!
 D is bigger than a rock. D boulders.
+ D Mars
Jan 16 2021
Mathias LANG <geod24 gmail.com> writes:
On Friday, 15 January 2021 at 18:31:18 UTC, H. S. Teoh wrote:
 And since static this() blocks are run at program startup and dynamic
 library load time, this ensures that after program startup,
 `classLoaders` has knowledge of all types the program will ever use.
Just don't use separate compilation or you're in for a lot of trouble: https://issues.dlang.org/show_bug.cgi?id=20641
Jan 15 2021
"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Sat, Jan 16, 2021 at 07:57:12AM +0000, Mathias LANG via Digitalmars-d wrote:
 On Friday, 15 January 2021 at 18:31:18 UTC, H. S. Teoh wrote:
 
 And since static this() blocks are run at program startup and
 dynamic library load time, this ensures that after program startup,
 `classLoaders` has knowledge of all types the program will ever use.
 Just don't use separate compilation or you're in for a lot of troubles. https://issues.dlang.org/show_bug.cgi?id=20641
Hmm, interesting. Though in this case I'm not overly concerned, since registering the same type multiple times is harmless. :-D


T

-- 
"Computer Science is no more about computers than astronomy is about telescopes." -- E.W. Dijkstra
Jan 18 2021
zjh <fqbqrr 163.com> writes:
On Friday, 15 January 2021 at 18:31:18 UTC, H. S. Teoh wrote:

Very Good.
Jan 16 2021
Jacob Carlborg <doob me.com> writes:
On 2021-01-15 19:31, H. S. Teoh wrote:

 Yesterday, I hit upon a nice solution: use CRTP (the Curiously Recurring
 Template Pattern) to inject these methods into each derived class:
 
 	class Saveable(Derived, Base) : Base {
 		static if (is(Base == Object)) {
 			// Top-level virtual function
 			void save() { ... }
 		} else {
 			// Derived class override
 			override void save() { ... }
 		}
 	}
 
 	class Base : Saveable!(Base, Object) { ...  }
 
 	class Derived1 : Saveable!(Derived1, Base) { ... }
 
 	class Derived2 : Saveable!(Derived2, Base) { ... }
 
That's an interesting idea. Although it's a bit intrusive since it requires changing what you're serializing. In my serialization library Orange [1] I solved this by registering subclasses that are going to be serialized through a base class reference [2]. If they're not serialized through a base class reference, no registration is required.

[1] https://github.com/jacob-carlborg/orange
[2] https://github.com/jacob-carlborg/orange/blob/1c4b1ab989fc36e6fae91131ba6951acf074f383/tests/BaseClass.d#L73

-- 
/Jacob Carlborg
Jan 18 2021
"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Mon, Jan 18, 2021 at 05:49:03PM +0100, Jacob Carlborg via Digitalmars-d
wrote:
 On 2021-01-15 19:31, H. S. Teoh wrote:
 
 Yesterday, I hit upon a nice solution: use CRTP (the
 Curiously Recurring Template Pattern) to inject these methods into
 each derived class:
 
 	class Saveable(Derived, Base) : Base {
 		static if (is(Base == Object)) {
 			// Top-level virtual function
 			void save() { ... }
 		} else {
 			// Derived class override
 			override void save() { ... }
 		}
 	}
 
 	class Base : Saveable!(Base, Object) { ...  }
 
 	class Derived1 : Saveable!(Derived1, Base) { ... }
 
 	class Derived2 : Saveable!(Derived2, Base) { ... }
 
 That's an interesting idea. Although it's a bit intrusive since it requires changing what you're serializing.
True, but I was looking for a maximally-automated, minimal-boilerplate solution.
 In my serialization library Orange [1] I solved this by registering
 subclasses that are going to be serialized through a base class
 reference [2]. If they're not serialized through a base class
 reference, no registration is required.
 
 [1] https://github.com/jacob-carlborg/orange
 [2] https://github.com/jacob-carlborg/orange/blob/1c4b1ab989fc36e6fae91131ba6951acf074f383/tests/BaseClass.d#L73
[...]

I also considered this approach, but rejected it because forgetting to register a derived class would result in incorrect serialization. I felt that was too fragile for my needs.

My current serialization need is quite specific in scope: I have a bunch of arrays, AA's, structs, and classes, all of which are data-only (i.e., public data fields only, no special semantics via setters/getters). A small number of types may require special serialization/deserialization treatment; for this the serialization code detects the existence of custom .save/.load methods. Other than that, serialization is automated from the root object. Since root objects are very general, and the exact combination of contained types may change as development goes on, a solution that does not require explicit registration of types is ideal.

Using my solution above, the only thing I need to check is that the derived class derives from Saveable, which can be done the first time I declare it. It's highly visible, so accidental omission can be immediately noticed. Nothing else needs to be done, as the rest of the mechanisms are fully automated from that point on.


T

-- 
In theory, there is no difference between theory and practice.
Jan 18 2021
Jacob Carlborg <doob me.com> writes:
On 2021-01-18 18:15, H. S. Teoh wrote:

 I also considered this approach, but rejected it because forgetting to
 register a derived class would result in incorrect serialization. I felt
 that was too fragile for my needs.
Orange will catch this at runtime and throw an exception.
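
A rough sketch of what explicit registration with a runtime check can look like. This is not Orange's actual API: `register`, `save`, the `savers` table, the output format, and the `Node`/`Leaf` classes are hypothetical names used only for illustration, and only the fields declared directly in the registered class are written:

```d
import std.exception : assertThrown, enforce;
import std.format : format;

alias Saver = string function(Object);
Saver[string] savers;	// hypothetical registry keyed by runtime class name

// Explicitly register T so it can be saved through a base class reference.
void register(T)() {
	savers[typeid(T).name] = (Object o) {
		auto t = cast(T) o;
		string s = T.stringof;
		foreach (i, field; t.tupleof)
			s ~= format(" %s=%s", __traits(identifier, T.tupleof[i]), field);
		return s;
	};
}

// Look up the dynamic type; an unregistered class is a runtime error.
string save(Object o) {
	auto saver = typeid(o).name in savers;
	enforce(saver !is null, "class not registered: " ~ typeid(o).name);
	return (*saver)(o);
}

class Node { int id; }
class Leaf : Node { int weight; }

void main() {
	register!Leaf();
	Node n = new Leaf;
	assert(save(n) == "Leaf weight=0");
	assertThrown(save(new Node));	// Node was never registered
}
```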
 My current serialization need is quite specific in scope: I have a bunch
 of arrays, AA's, structs, and classes, all of which are data-only (i.e.,
 public data fields only, no special semantics via setters/getters).  A
 small number of types may require special serialization/deserialization
 treatment; for this the serialization code detects the existence of
 custom .save/.load methods.  Other than that, serialization is automated
 from the root object.  Since root objects are very general, and as
 development goes on the exact combination of contained types may change,
 so a solution that does not require explicit registration of types is
 ideal.
 
 Using my solution above, the only thing I need to check is that the
 derived class derives from Saveable, which can be done the first time I
 declare it. It's highly visible, so accidental omission can be
 immediately noticed.  Nothing else needs to be done, as the rest of the
 mechanisms are fully automated from that point on.
I think both of our solutions are equally automatic. Yours requires inheriting from a specific class; mine requires registering the class.

-- 
/Jacob Carlborg
Jan 23 2021