digitalmars.D - Implementing serialisation with minimal boilerplate and template

Stefan Koch <uplink.coder googlemail.com> writes:

Good Day everyone,

A friend recently asked me if core.reflect could be used for 
serialization.

The code to do that generically is about 100 lines of code.
Since it's so little I am just going to post it here verbatim.

```D
class C { @("NoSerialize") ubyte a; uint b; @("NoSerialize") ulong c;
    this(ubyte a, uint b, ulong c)
    {
        this.a = a; this.b = b; this.c = c;
    }
}
struct S { @("NoSerialize") ubyte a; uint b; ubyte c; @("NoSerialize") ulong d; }

import core.reflect.reflect;
/// if a type is an aggregate type return the aggregate declaration
/// otherwise return null
AggregateDeclaration aggregateFromType(const Type type)
{
    AggregateDeclaration result = null;
    if (auto ts = cast(TypeStruct)type)
    {
        result = ts.sym;
    } else if (auto tc = cast(TypeClass)type)
    {
        result = tc.sym;
    }
    return result;
}

bool hasNoSerializeAttrib(const Declaration d)
{
    bool NoSerialize = false;
    foreach(attr;d.attributes)
    {
        auto se = cast(StringLiteral)attr;
        if (se && se.string_ == "NoSerialize")
        {
            NoSerialize = true;
            break;
        }
    }
    return NoSerialize;
}

/// the only template in here. only one instance per serialized root type
const(ubyte[]) serialize(T)(T value) {
    static immutable Type type = cast(immutable Type) nodeFromName("T");
    return serializeType(cast(ubyte*)&value, type);
}

const(ubyte[]) serializeType(const ubyte* ptr, const Type type)
{
    // writeShallowTypeDescriptor(Type);
    // printf("Serializing type: %s\n", type.identifier.ptr);
    if (auto sa = cast (TypeArray) type)
    {
        ulong length = sa.dim;
        const void* values = ptr;
        auto elemType = sa.nextOf;
        return serializeArray(length, ptr, elemType);
    }
    else if (auto da = cast(TypeSlice) type)
    {
        ulong length = *cast(size_t*) ptr;
        const void* values = *cast(const ubyte**)(ptr + size_t.sizeof);
        auto elemType = da.nextOf;
        return serializeArray(length, ptr, elemType);
    }
    else if (auto ts = cast(TypeStruct) type)
    {
        auto fields = ts.sym.fields;
        return serializeAggregate(ptr, fields);
    }
    else if (auto tc = cast(TypeClass) type)
    {
        auto fields = tc.sym.fields;
        return serializeAggregate(*cast(const ubyte**)ptr, fields);
    }
    else if (auto tb = cast(TypeBasic) type)
    {
        // writeTypeTag ?
        return ptr[0 .. type.size];
    }
    else assert(0, "Serialisation for " ~ type.identifier ~ " not implemented .. " ~ (cast()type).toString());
}

ubyte[] serializeArray(ulong length, const ubyte* ptr, const Type elemType)
{
    ubyte[] result;
    // writeLength(length)
    foreach(i; 0 .. length)
    {
        result ~= serializeType(ptr, elemType);
    }
    return result;
}

ubyte[] serializeAggregate(const ubyte* ptr, VariableDeclaration[] fields)
{
    ubyte[] result;
    foreach(f; fields)
    {
        if (hasNoSerializeAttrib(f))
        {
            // skip fields which are annotated with NoSerialize
            continue;
        }
        result ~= serializeType(ptr + f.offset, f.type);
    }
    return result;
}


void main()
{
    S s = S(72, 19992034, 98);
    C c = new C(72, 19992039, 98);
    auto buffer = serialize(s);
    assert(buffer.length == 5);
    assert ((buffer[0] | buffer[1] << 8 | buffer[2] << 16 | buffer[3] << 24) == 19992034);
    assert (buffer[4] == 98);
    auto buffer_c = serialize(c);
    assert(buffer_c.length == 4);
    assert ((buffer_c[0] | buffer_c[1] << 8 | buffer_c[2] << 16 | buffer_c[3] << 24) == 19992039);
    // we can see the fields annotated with @("NoSerialize") are skipped
}
```
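
The assertions in `main` reassemble `b` from the least significant byte upwards, so they appear to assume a little-endian host, since `serializeType` copies the raw memory of each basic type. As a rough illustration of the resulting layout, here is a minimal sketch in plain D (no `core.reflect`) that reads the 5-byte buffer for `S` back. `SPayload`, `readU32LE` and `decodeS` are hypothetical names made up for this example, not part of the posted code.

```d
/// Hypothetical mirror of the serialized part of `S`:
/// only `b` and `c` survive, `a` and `d` are marked NoSerialize.
struct SPayload
{
    uint b;
    ubyte c;
}

/// Reads a little-endian uint from the first four bytes of `bytes`.
uint readU32LE(const(ubyte)[] bytes)
{
    assert(bytes.length >= 4);
    return bytes[0]
        | (cast(uint) bytes[1] << 8)
        | (cast(uint) bytes[2] << 16)
        | (cast(uint) bytes[3] << 24);
}

/// Decodes the buffer layout assumed above: 4 bytes for `b`, then 1 byte for `c`.
SPayload decodeS(const(ubyte)[] buffer)
{
    assert(buffer.length == 5, "expected 4 bytes for b plus 1 byte for c");
    return SPayload(readU32LE(buffer[0 .. 4]), buffer[4]);
}

void main()
{
    // 19992034 == 0x01310DE2, stored least significant byte first
    const(ubyte)[] buffer = [0xE2, 0x0D, 0x31, 0x01, 98];
    auto payload = decodeS(buffer);
    assert(payload.b == 19992034);
    assert(payload.c == 98);
}
```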

I would like to know what you think about this.

In my next example I will show how you can modify serialization 
for library types whose source code you can't control.

However, for that to work `core.reflect` needs to be extended a 
little ;)

Cheers and have a nice day,

Stefan

Aug 15 2021

Stefan Koch <uplink.coder googlemail.com> writes:

On Sunday, 15 August 2021 at 11:18:57 UTC, Stefan Koch wrote:
 Good Day everyone,

 A friend asked me recently of core.reflect could be used for 
 serialization.
 [ ... ]

 I would like to know what you think about this

There was a bug in the code I posted.
I should have run every path before calling it a day ;)
The code for dynamic array serialization

```d
         ulong length = *cast(size_t*) ptr;
         const void* values = *cast(const ubyte**)(ptr + size_t.sizeof);
         auto elemType = da.nextOf;
         return serializeArray(length, ptr, elemType);
```

has to be

```d
         ulong length = *cast(size_t*) ptr;
         const ubyte* values = *cast(const ubyte**)(ptr + size_t.sizeof);
         auto elemType = da.nextOf;
         return serializeArray(length, values, elemType);
```
And of course serializeArray has to write the length as well for 
this to work, since without the length information you cannot 
deserialize.
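
To make the length-prefix point concrete, here is a minimal sketch in plain D that does not use `core.reflect`. The names `serializeSlice` and `deserializeSlice`, the 8-byte native-endian length prefix, and the restriction to element types without indirections are all assumptions made for this example. Note that it takes the address of each element in turn, so every element is visited rather than the first one repeatedly.

```d
/// Serializes a slice as an 8-byte length prefix followed by the raw bytes
/// of each element. Assumes T has no indirections (e.g. uint, ubyte).
ubyte[] serializeSlice(T)(const(T)[] values)
{
    ubyte[] result;
    ulong length = values.length;
    // write the length first, otherwise a reader cannot know where the array ends
    result ~= (cast(const(ubyte)*) &length)[0 .. ulong.sizeof];
    foreach (ref v; values)
    {
        // address of the current element, so the read position advances
        result ~= (cast(const(ubyte)*) &v)[0 .. T.sizeof];
    }
    return result;
}

/// Reverses serializeSlice: reads the length prefix, then copies the payload.
T[] deserializeSlice(T)(const(ubyte)[] buffer)
{
    assert(buffer.length >= ulong.sizeof, "buffer too short for length prefix");
    ulong length;
    (cast(ubyte*) &length)[0 .. ulong.sizeof] = buffer[0 .. ulong.sizeof];

    auto payload = buffer[ulong.sizeof .. $];
    assert(payload.length == length * T.sizeof, "payload size mismatch");

    auto result = new T[cast(size_t) length];
    (cast(ubyte*) result.ptr)[0 .. payload.length] = payload[];
    return result;
}

void main()
{
    uint[] values = [1, 2, 3];
    auto bytes = serializeSlice(values);
    assert(bytes.length == ulong.sizeof + 3 * uint.sizeof);
    assert(deserializeSlice!uint(bytes) == values);
}
```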

Aug 15 2021

drug <drug2004 bk.ru> writes:

15.08.2021 14:18, Stefan Koch wrote:
 Good Day everyone,
 
 A friend asked me recently of core.reflect could be used for serialization.
 
 [ ... ]
 
 I would like to know what you think about this

I'm impressed by your work. But I have one question: when will ordinary 
people like me be able to use all these amazing things you did? 
Iterating over aggregate members considering their type, attributes and 
runtime value comes up so often in my practice that I'm really interested in 
your core.reflect, but I can only use the official compiler without any 
customization.

Aug 15 2021

Stefan Koch <uplink.coder googlemail.com> writes:

On Sunday, 15 August 2021 at 12:27:44 UTC, drug wrote:
 I'm impressed by your work. But I have one question - when 
 ordinary people like me can be able to use all these amazing 
 things you did? Iterating over aggregate members considering 
 their type, attributes and runtime value is so often in my 
 practice that I'm really interesting in your core.reflect but I 
 can use only official compiler w/o any customization.

I have to write a DIP and have it approved.

I am already starting, as this one is much simpler than the type 
function project.
Essentially the changes to the core language spec are just a few 
lines.

I suspect the runtime documentation will be more challenging.
But for that at least I have already specified a data structure, 
so there is not much variability to take into account.

I cannot give a definitive timeline but I would hope for this to 
go in before 2022 is over.

Aug 15 2021

Temtaime <temtaime gmail.com> writes:

On Sunday, 15 August 2021 at 14:21:00 UTC, Stefan Koch wrote:
 On Sunday, 15 August 2021 at 12:27:44 UTC, drug wrote:
 [...]

 I have to write a DIP and have it approved.

 I am already starting as this one is much simpler than the type 
 function project.
 Essentially the changes to the core language spec is just a few 
 lines.

 I suspect the runtime documentation will be more challenging.
 But for that at least I have already specified a data-structure 
 of there is not much variability to take into account.

 I cannot give a definitive timeline but I would hope for this 
 to go in before 2022 is over.

Hello.
Take a look at 
https://github.com/Temtaime/utile/blob/main/source/utile/binary/tests.d :)
Maybe someone will find this library of mine useful.

Aug 16 2021

Bruce Carneal <bcarneal gmail.com> writes:

On Monday, 16 August 2021 at 16:53:54 UTC, Temtaime wrote:
 On Sunday, 15 August 2021 at 14:21:00 UTC, Stefan Koch wrote:
 On Sunday, 15 August 2021 at 12:27:44 UTC, drug wrote:
 [...]

 I have to write a DIP and have it approved.

 I am already starting as this one is much simpler than the 
 type function project.
 Essentially the changes to the core language spec is just a 
 few lines.

 I suspect the runtime documentation will be more challenging.
 But for that at least I have already specified a 
 data-structure of there is not much variability to take into 
 account.

 I cannot give a definitive timeline but I would hope for this 
 to go in before 2022 is over.

 Hello.
 Take a look at 
 https://github.com/Temtaime/utile/blob/main/source/utile/binary/tests.d :)
 Maybe someone will found this library of me useful

dlang sure seems to inspire run time serializers: 
https://code.dlang.org/search?q=serialization

My home brew serialization is not as sophisticated as what you've 
written, let alone what Stefan is doing at compile time.

Aug 16 2021

Steven Schveighoffer <schveiguy gmail.com> writes:

On 8/16/21 5:18 PM, Bruce Carneal wrote:
 On Monday, 16 August 2021 at 16:53:54 UTC, Temtaime wrote:
 On Sunday, 15 August 2021 at 14:21:00 UTC, Stefan Koch wrote:
 On Sunday, 15 August 2021 at 12:27:44 UTC, drug wrote:
 [...]

 I have to write a DIP and have it approved.

 I am already starting as this one is much simpler than the type 
 function project.
 Essentially the changes to the core language spec is just a few lines.

 I suspect the runtime documentation will be more challenging.
 But for that at least I have already specified a data-structure of 
 there is not much variability to take into account.

 I cannot give a definitive timeline but I would hope for this to go 
 in before 2022 is over.

 Hello.
 Take a look at 
 https://github.com/Temtaime/utile/blob/main/source/utile/binary/tests.d :) 

 Maybe someone will found this library of me useful

 
 dlang sure seems to inspire run time serializers: 
 https://code.dlang.org/search?q=serialization
 
 My home brew serialization is not as sophisticated as what you've 
 written, let alone what Stefan is doing at compile time.
 

[D is for (de)serialization](https://dconf.org/2019/talks/schveighoffer.html)

;)

There's at least one talk about serialization in almost every dconf I think.

-Steve

Aug 16 2021

russhy <russhy gmail.com> writes:

Is this runtime reflection?

Will this depend on the GC? if so does it add pressure to the GC?

We already have compile time type introspection, i don't think 
it's wise to move things to runtime; we have a poor GC, and adding 
more pressure to it is just bad

Compile time reflection already proved to be superior in heavy 
workloads


examples to follow

Also if it uses the GC, i'm not sure the "core" package is the way to go; 
it should be put in "std", or be a library imo

Aug 17 2021

12345swordy <alexanderheistermann gmail.com> writes:

On Tuesday, 17 August 2021 at 13:52:19 UTC, russhy wrote:
 Is this runtime reflection?

 Will this depend on the GC? if so does it add pressure to the 
 GC?

 We already have compile time type introspection, i don't think 
 it's wise to move things to runtime, we have a poor GC adding 
 more pressure to it is just bad

 Compile time reflection already proved to be superior in heavy 
 workloads


 examples to follow

 Also if it uses the GC, i'm not sure "core" package is the go, 
 should be put on "std", or as a library imo


The reason that they use runtime reflection in java/c# is because 
of the basic principles of OOP.

-Alex

Aug 17 2021

russhy <russhy gmail.com> writes:

On Tuesday, 17 August 2021 at 14:16:52 UTC, 12345swordy wrote:
 On Tuesday, 17 August 2021 at 13:52:19 UTC, russhy wrote:
 Is this runtime reflection?

 Will this depend on the GC? if so does it add pressure to the 
 GC?

 We already have compile time type introspection, i don't think 
 it's wise to move things to runtime, we have a poor GC adding 
 more pressure to it is just bad

 Compile time reflection already proved to be superior in heavy 
 workloads


 examples to follow

 Also if it uses the GC, i'm not sure "core" package is the go, 
 should be put on "std", or as a library imo


 The reason that they use runtime reflection in java/c# is 
 because of the basic principles of OOP.

 -Alex

As well as other atrocities such as runtime code generation / 
runtime dependencies


 It's for introspecting over code at compile time, not at 
 runtime. Stefan and I have been mulling over this for ages and 
 I think we both think it can do more than reflection as 
 currently know it at least. This let's you drink from the 
 firehose, so to speak.

Oh i see, so the goal is not what i was thinking, my bad!

I guess I will have to read the DIP to know more about it. That is 
interesting, i'm curious now

Aug 17 2021

Adam Ruppe <destructionator gmail.com> writes:

On Tuesday, 17 August 2021 at 14:52:22 UTC, russhy wrote:
 I guess will have to read the DIP to know more about it, that 
 is interesting, i'm curious now

basically it provides the same reflection we have now but as CTFE 
classes instead of compiler traits and `is` expression matching.
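
For comparison, here is a minimal sketch of how the same "skip `@("NoSerialize")` fields" serializer might look with the reflection available in the official compiler today, using `is` expressions, `__traits` and `std.traits.hasUDA`. This is not the `core.reflect` code from the first post. The struct `S` is re-declared with the `@` UDA syntax, and the restriction to structs made of scalar fields is an assumption to keep the example short.

```d
import std.traits : hasUDA;

struct S
{
    @("NoSerialize") ubyte a;
    uint b;
    ubyte c;
    @("NoSerialize") ulong d;
}

/// Serializes scalars as raw bytes and structs field by field,
/// skipping fields annotated with @("NoSerialize").
ubyte[] serialize(T)(const ref T value)
{
    ubyte[] result;
    static if (is(T == struct)) // `is` expression matching
    {
        static foreach (i; 0 .. T.tupleof.length)
        {
            static if (!hasUDA!(T.tupleof[i], "NoSerialize"))
                result ~= serialize(value.tupleof[i]);
        }
    }
    else static if (__traits(isScalar, T)) // compiler trait
    {
        result ~= (cast(const(ubyte)*) &value)[0 .. T.sizeof];
    }
    else
        static assert(0, "Serialisation for " ~ T.stringof ~ " not implemented");
    return result;
}

void main()
{
    auto s = S(72, 19992034, 98, 7);
    auto buffer = serialize(s);
    assert(buffer.length == 5); // 4 bytes for b, 1 byte for c
    assert(buffer[4] == 98);
}
```

In this style `serialize` is instantiated once per distinct type it meets (here `S`, `uint` and `ubyte`), whereas the code in the first post has a single template and walks the fields with ordinary functions.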

Aug 17 2021

12345swordy <alexanderheistermann gmail.com> writes:

On Tuesday, 17 August 2021 at 14:52:22 UTC, russhy wrote:
 On Tuesday, 17 August 2021 at 14:16:52 UTC, 12345swordy wrote:
 On Tuesday, 17 August 2021 at 13:52:19 UTC, russhy wrote:
 Is this runtime reflection?

 Will this depend on the GC? if so does it add pressure to the 
 GC?

 We already have compile time type introspection, i don't 
 think it's wise to move things to runtime, we have a poor GC 
 adding more pressure to it is just bad

 Compile time reflection already proved to be superior in 
 heavy workloads


 examples to follow

 Also if it uses the GC, i'm not sure "core" package is the 
 go, should be put on "std", or as a library imo


 The reason that they use runtime reflection in java/c# is 
 because of the basic principles of OOP.

 -Alex

 As well as other atrocities such as runtime code generation / 
 runtime dependencies

Which is not inherently an evil thing, so there is no need for 
hyperbolic language such as the word "atrocities". It's all 
about the trade-offs.

- Alex

Aug 17 2021

Alexandru Ermicioi <alexandru.ermicioi gmail.com> writes:

On Tuesday, 17 August 2021 at 14:16:52 UTC, 12345swordy wrote:

 The reason that they use runtime reflection in java/c# is 
 because of the basic principles of OOP.

 -Alex

How do the basic principles of OOP force Java implementations to use 
runtime reflection?

It would mean that D also has to use runtime reflection because 
it has OOP support.

Regards,
Alexandru.

Aug 17 2021

12345swordy <alexanderheistermann gmail.com> writes:

On Tuesday, 17 August 2021 at 18:10:48 UTC, Alexandru Ermicioi 
wrote:
 On Tuesday, 17 August 2021 at 14:16:52 UTC, 12345swordy wrote:

 The reason that they use runtime reflection in java/c# is 
 because of the basic principles of OOP.

 -Alex

 How does basic principles of oop force java implementations use 
 runtime reflection???

https://en.wikipedia.org/wiki/Inheritance_(object-oriented_programming)

You cannot obtain information such as "How many child classes 
does this class currently have" without compiling all the 
code and libraries that you use, which isn't feasible as not every 
library shares its source code.
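
As an illustration of that kind of query, here is a minimal sketch that counts derived classes at runtime in D by walking druntime's `ModuleInfo` and `TypeInfo_Class.base`. It only sees classes from modules linked into the running program that carry module info. `derivesFrom`, `countDerived` and the example classes are hypothetical names for this example.

```d
/// True if `c` has `base` somewhere above it in its inheritance chain.
bool derivesFrom(const TypeInfo_Class c, const TypeInfo_Class base)
{
    if (c is null || c.base is null)
        return false;
    return c.base is base || derivesFrom(c.base, base);
}

/// Counts classes known to the runtime that derive, directly or
/// indirectly, from `base`.
size_t countDerived(const TypeInfo_Class base)
{
    size_t count = 0;
    foreach (m; ModuleInfo)
    {
        if (m is null)
            continue;
        foreach (c; m.localClasses)
        {
            if (derivesFrom(c, base))
                ++count;
        }
    }
    return count;
}

class Animal {}
class Dog : Animal {}
class Puppy : Dog {}
class Rock {}

void main()
{
    assert(countDerived(typeid(Animal)) == 2); // Dog and Puppy
    assert(countDerived(typeid(Rock)) == 0);
}
```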

- Alex

Aug 17 2021

Alexandru Ermicioi <alexandru.ermicioi gmail.com> writes:

On Tuesday, 17 August 2021 at 20:23:48 UTC, 12345swordy wrote:
 On Tuesday, 17 August 2021 at 18:10:48 UTC, Alexandru Ermicioi 
 wrote:
 On Tuesday, 17 August 2021 at 14:16:52 UTC, 12345swordy wrote:

 The reason that they use runtime reflection in java/c# is 
 because of the basic principles of OOP.

 -Alex

 How does basic principles of oop force java implementations 
 use runtime reflection???

 https://en.wikipedia.org/wiki/Inheritance_(object-oriented_programming)

 You cannot obtain information such as "How many child classes 
 does this class currently has", without compiling every 
 code/library that you use, which isn't feasible as not every 
 library share their source code.

 - Alex

I still fail to see the relation between OOP and the use of runtime info. 
The part about source code sharing is also true for C and D libs, 
which can both ship only header files and declare a pointer to a 
struct without the struct's declaration (i.e. an opaque 
struct type, though I'm not completely sure about D).

The thing is that runtime reflection in Java is more or less 
usable compared to D (sigh), and can be used to design libs that 
rely on that data to do their job, but this doesn't mean that 
there are no options for using compile-time info to generate 
or alter compiled code. Take for example the Lombok project, the JPA 
model generator from Hibernate, or the MapStruct library, which is a 
mapper from one Java type to another (kinda close to 
serializers). All of them run at compile time to either 
generate new code or alter existing code, based on the annotation 
processor plugin feature offered by the Java compiler, not to mention 
byte code enhancement capabilities and the libs using them.


and alteration.

Regards,
Alexandru.

Aug 17 2021

Arafel <er.krali gmail.com> writes:

On 17/8/21 22:50, Alexandru Ermicioi wrote:
 On Tuesday, 17 August 2021 at 20:23:48 UTC, 12345swordy wrote:
 On Tuesday, 17 August 2021 at 18:10:48 UTC, Alexandru Ermicioi wrote:
 On Tuesday, 17 August 2021 at 14:16:52 UTC, 12345swordy wrote:

 The reason that they use runtime reflection in java/c# is because of 
 the basic principles of OOP.

 -Alex

 How does basic principles of oop force java implementations use 
 runtime reflection???

 https://en.wikipedia.org/wiki/Inheritance_(object-oriented_programming)

 You cannot obtain information such as "How many child classes does 
 this class currently has", without compiling every code/library that 
 you use, which isn't feasible as not every library share their source 
 code.

 - Alex

 
 I still fail to see relation between oop and runtime info use. The part 
 about source code sharing is also true for c and d libs, which both 
 could share only header files and have pointer to struct declared 
 without said struct declaration (i.e. Opaque struct type, not sure 
 completely about D though).
 
 The thing is, that runtime reflection in java is more or less usable 
 compared to d (sigh), and can be used to design libs that are using that 
 data to do their job, but this doesn't mean that there aren't any 
 options on using compile time info to generate or alter compiled code. 
 Take for example lombok project, jpa model generator from hibernate, or 
 mapstruct library which is a mapper from one java type to another (kinda 
 close to serializers), all of them are used at compile time to either 
 generate new code, or alter existing one, based on annotation processor 
 plugin feature offered by java compiler, not to mention byte code 
 enhancement capabilities, and libs using them.
 

 alteration.
 
 Regards,
 Alexandru.

There's at least a use case that can only be solved with runtime 
reflection: dynamically loaded code (so, dlopen or similar) where no 
source code is available at all, for example if you want to implement a 
plugin system.

For sure you can have the code "register" itself, and add some kind of 
"description" or metadata, but if you want for instance to implement a 
persistence / serialization system in such an environment, you end up 
all but implementing your own version of runtime reflection.
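
A minimal sketch of the "register yourself" approach described above, to show how quickly it starts to resemble hand-rolled runtime reflection. Everything here is hypothetical: the `registry` table, `registerType`, and the `PluginConfig` struct are made up for this example, and the raw byte copy assumes a struct without indirections.

```d
/// A type-erased serializer: takes a pointer to an instance, returns its bytes.
alias Serializer = ubyte[] function(const(void)* instance);

/// Hypothetical registry owned by the host program: type name -> serializer.
Serializer[string] registry;

/// Generates a serializer for T at compile time and registers it under T's name.
void registerType(T)()
{
    registry[T.stringof] = function ubyte[](const(void)* instance)
    {
        // raw copy of the value; assumes T has no indirections
        return (cast(const(ubyte)*) instance)[0 .. T.sizeof].dup;
    };
}

/// A type that would live in the plugin.
struct PluginConfig
{
    uint revision;
    ubyte flags;
}

// A plugin would run its registration when it is loaded,
// for example from a module constructor like this one.
static this()
{
    registerType!PluginConfig();
}

void main()
{
    auto cfg = PluginConfig(3, 1);
    // the host only needs the name and a pointer, not the static type
    auto bytes = registry["PluginConfig"](&cfg);
    assert(bytes.length == PluginConfig.sizeof);
}
```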

I don't have that much experience with that, but I think that separate 
compilation could also have issues, for instance if you want to 
distribute your program in binary form, and still allow people to 
compile plugins / extensions for it.

A workaround might be to distribute enough of the source code in the .di 
files, but I'm not sure how workable this would be.

Aug 18 2021

max haughton <maxhaton gmail.com> writes:

On Tuesday, 17 August 2021 at 13:52:19 UTC, russhy wrote:
 Is this runtime reflection?

 Will this depend on the GC? if so does it add pressure to the 
 GC?

 We already have compile time type introspection, i don't think 
 it's wise to move things to runtime, we have a poor GC adding 
 more pressure to it is just bad

 Compile time reflection already proved to be superior in heavy 
 workloads


 examples to follow

 Also if it uses the GC, i'm not sure "core" package is the go, 
 should be put on "std", or as a library imo

It's for introspecting over code at compile time, not at runtime. 
Stefan and I have been mulling over this for ages and I think we 
both think it can do more than reflection as we currently know it, at 
least. This lets you drink from the firehose, so to speak.

Aug 17 2021
