
digitalmars.D - Serialization + semantics of toString

reply aarti_pl <aarti interia.pl> writes:
Andrei Alexandrescu wrote:
 But that being said, I'd so much want to start thinking of an actual
 text serialization infrastructure. Why develop one later with the
 mention "well use that stuff for debugging only, this is the real stuff."

 Andrei
You might want to look at my serialization library for D.

I think it is worth noting because it achieves the goal: same data, completely different output. Because the output can be defined by the user in whatever way she wants, it seems that this can work exactly the way toString should work.

This is achieved with Archive classes, which do the actual formatting and are completely independent of the data being printed. The initial design is based on C++ Boost; I just extended the concept a bit and adapted it to D.

The basic interface for serialization looks like this:

auto serializer = Serializer!(TextArchive);
//It might be also e.g.:
//auto serializer = Serializer!(JsonArchive);
auto input = new TransparentClass(-21, 2.11, "text1", 128, -127);
auto output = serializer.dump(input);
assert(serializer.load!(TransparentClass)(output) == input);

For transparent classes (every field is public) you don't need any method inside the serialized class/struct.

For opaque classes it is enough to:
1. add a mixin inside:
mixin Serializable;
or
2. add a template method:
void describeUdt(T)(T arch) {
    arch.describeStaticArray(array, array.stringof);
}

That is all that is necessary to print every possible class/struct in whatever format you want.

Because of limitations of D I couldn't achieve serialization of classes from a base pointer. The reason is that template methods are not virtual.

I haven't had time to work on it recently, but if you think it's worthwhile and might eventually be included in Phobos, I would be interested in working on it further. I would definitely need some code/concept review, though.

Unfortunately the documentation is rather poor, but you can find a lot of unit tests in the examples directory.

It's Boost licensed, so no worries :-)

BR
Marcin Kuszczak
(aarti_pl)
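
The separation described in this post (archives that decide the format, data types that only describe their fields) can be sketched roughly as follows. This is not doost's code: TextArchiveSketch, JsonArchiveSketch, Point, describe, result and dump are hypothetical names invented for illustration, and only the describeUdt name is borrowed from the post. It assumes a current D2 compiler with Phobos.

```d
import std.array : join;
import std.conv : to;
import std.stdio : writeln;

// Hypothetical archive: renders fields as space-separated name=value pairs.
struct TextArchiveSketch
{
    string[] parts;

    void describe(T)(T value, string name)
    {
        parts ~= name ~ "=" ~ to!string(value);
    }

    string result() { return parts.join(" "); }
}

// Hypothetical archive: renders the same fields as a flat JSON object.
// No string escaping is done; this is a sketch, not a JSON writer.
struct JsonArchiveSketch
{
    string[] parts;

    void describe(T)(T value, string name)
    {
        static if (is(T : const(char)[]))
            parts ~= `"` ~ name ~ `":"` ~ to!string(value) ~ `"`;
        else
            parts ~= `"` ~ name ~ `":` ~ to!string(value);
    }

    string result() { return "{" ~ parts.join(",") ~ "}"; }
}

// The data type lists its fields once and knows nothing about formats.
struct Point
{
    int x;
    double y;
    string label;

    void describeUdt(A)(ref A arch)
    {
        arch.describe(x, "x");
        arch.describe(y, "y");
        arch.describe(label, "label");
    }
}

// Hypothetical helper: the archive type A alone decides the output format.
string dump(A, T)(T value)
{
    A arch;
    value.describeUdt(arch);
    return arch.result();
}

void main()
{
    auto p = Point(3, 2.5, "a");
    writeln(dump!TextArchiveSketch(p)); // x=3 y=2.5 label=a
    writeln(dump!JsonArchiveSketch(p)); // {"x":3,"y":2.5,"label":"a"}
}
```

Adding another output format in this sketch means writing one more archive type; Point itself stays untouched.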
Nov 12 2009
next sibling parent reply aarti_pl <aarti interia.pl> writes:
I forgot to include the link:

http://www.dsource.org/projects/doost/browser/trunk/examples/util/serializer/FunctionTest.d

BR
Marcin Kuszczak
(aarti_pl)

Nov 12 2009
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
aarti_pl wrote:
 I forgot to throw a link:
 
 http://www.dsource.org/projects/doost/browser/trunk/examples/util/serializer/FunctionTest.d
Cool, do you also have documentation?

Andrei
Nov 12 2009
parent reply aarti_pl <aarti_no_spam_ interia.pl> writes:
Andrei Alexandrescu Wrote:

 aarti_pl wrote:
 I forgot to throw a link:
 
 http://www.dsource.org/projects/doost/browser/trunk/examples/util/serializer/FunctionTest.d
 Cool, do you also have documentation?
 
 Andrei
Well, that's definitely the weak point of this library. :-)

You can find some information on the doost wiki:
http://www.dsource.org/projects/doost/wiki

There are also some DDOC comments in the code. It's also worth looking at the description of the Boost serialization library, as my library is based on it.

Anyway, I will try to improve the documentation a bit during the weekend.

Additionally, I would like to mention that there is also a great BinaryArchive from Bill Baxter, which I didn't mention in my first post.

---

I think the most interesting question is whether we can replace toString() with a template-based solution. So instead of:

string toString()

we would write e.g.:

void describe(T)(T archive) {
}

Some other things should be considered:
* What about virtual calls to describe? Currently template methods are not virtual, but I remember there were posts saying that it is possible in some limited way.
* Is it semantically the same solution as toString()? Or is toString() used for something other than describing the members of a class?
* If toString() is used for things other than describing class/struct fields, do we need a standard way to do this, or should it be implementation specific?
* The template-method solution seems more general, as the output can be customized differently for every kind of archive, but isn't it too much for simple uses? Maybe string output is just enough?
* Should there be some default archive available, or should it always be defined by the user?

Best Regards
Marcin Kuszczak
(aarti_pl)
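
The first bullet above (template methods are not virtual) can be illustrated with a small sketch. Everything in it is hypothetical: Archive, ListArchive, Base, Derived and describeVirtual are made-up names, not part of doost or Phobos. It shows that a template describe is picked from the static type of the reference, while an ordinary method taking an interface dispatches on the dynamic type.

```d
import std.conv : to;
import std.stdio : writeln;

// Hypothetical minimal archive interface with a fixed, string-based API.
interface Archive
{
    void put(string name, string value);
}

class ListArchive : Archive
{
    string[] lines;
    void put(string name, string value) { lines ~= name ~ ": " ~ value; }
}

class Base
{
    int id = 1;

    // Template method: never virtual, chosen from the static type.
    void describe(A)(A arch) { arch.put("id", to!string(id)); }

    // Ordinary method: virtual, chosen from the dynamic type.
    void describeVirtual(Archive arch) { arch.put("id", to!string(id)); }
}

class Derived : Base
{
    string name = "d";

    // Hides Base.describe but cannot override it.
    void describe(A)(A arch)
    {
        super.describe(arch);
        arch.put("name", name);
    }

    override void describeVirtual(Archive arch)
    {
        super.describeVirtual(arch);
        arch.put("name", name);
    }
}

void main()
{
    Base b = new Derived;

    auto viaTemplate = new ListArchive;
    b.describe(viaTemplate);        // Base.describe runs; "name" is lost
    writeln(viaTemplate.lines);

    auto viaVirtual = new ListArchive;
    b.describeVirtual(viaVirtual);  // Derived.describeVirtual runs
    writeln(viaVirtual.lines);
}
```

The virtual variant only works because the archive is narrowed to a fixed interface, which gives up some of the per-archive customization a template parameter allows; that is one possible reading of the "limited way" mentioned above.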
Nov 13 2009
next sibling parent Bill Baxter <wbaxter gmail.com> writes:
On Fri, Nov 13, 2009 at 12:13 AM, aarti_pl <aarti_no_spam_ interia.pl> wrote:
 Andrei Alexandrescu Wrote:

 Additionally I would like to mention that there is also great BinaryArchive
from Bill Baxter, which I didn't mention in my first post.
There is? Completely forgot about that.

If I recall the big wish list item I had for your serializer was robust subclass handling for things like serializing a BaseClass[] with a mix of pointers to BaseClass and DerivedClass. You need to be able to de-serialize that by saying something like unserialize!(BaseClass[]). I think at the time I tried it, your serializer didn't save enough info to know the proper derived class to load up.

--bb
Nov 13 2009
prev sibling parent aarti_pl <aarti interia.pl> writes:
I have put some more user documentation on the Doost project wiki, but it is not even half finished yet. Nevertheless, I think it should help you get started with the serializer.

I am especially interested in comments on the Storage concept. I am completely unsure whether this design is good enough. Comments can be sent privately to aarti_no_spam_[at]interia.pl or posted here on the NG.

----

After thinking a bit about toString/serialization, I came to the conclusion that these are two different things.

In my opinion the best way to proceed would be to rename the toString method to toDebugString(), as that discourages using it for anything other than debugging. The default implementation of toDebugString() should be serialization of the object to a string.

It just occurred to me that it doesn't make any sense to add special serialization code (even if it is simple) just to get a quick and dirty printout of an object's state. These are simply two different use cases.

Best Regards
Marcin Kuszczak
(aarti_pl)
Nov 15 2009
prev sibling next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 11/13/09 00:13, aarti_pl wrote:
 In case of opaque classes there is enough to:
 1. add mixin inside:
 mixin Serializable;
 or
 2. add template method:
 void describeUdt(T)(T arch) {
 arch.describeStaticArray(array, array.stringof);
 }
Or you could use arhc.typeof[i] to access/set the values (even private) of a struct/class.
Nov 13 2009
parent reply "Denis Koroskin" <2korden gmail.com> writes:
On Fri, 13 Nov 2009 17:11:54 +0300, Jacob Carlborg <doob me.com> wrote:

Or you could use arhc.typeof[i] to access/set the values (even private) of a struct/class.
You mean .tupleof? Just tested and it really works (wow!), didn't know about that. Thanks!
Nov 13 2009
parent Jacob Carlborg <doob me.com> writes:
On 11/13/09 15:17, Denis Koroskin wrote:
 On Fri, 13 Nov 2009 17:11:54 +0300, Jacob Carlborg <doob me.com> wrote:

Or you could use arhc.typeof[i] to access/set the values (even private) of a struct/class.
You mean .tupleof? Just tested and it really works (wow!), didn't know about that. Thanks!
Yes, tupleof
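
For readers who have not used .tupleof, here is a minimal sketch of the idea discussed above: iterating over the fields of an object without the type providing any describe method. Opaque and dumpFields are hypothetical names, and the sketch assumes a current D2 compiler. Note that in a single-file example everything lives in one module, where private members are accessible anyway; the claim that .tupleof also reaches private fields from another module comes from the thread, not from this code.

```d
import std.conv : to;
import std.stdio : writeln;

class Opaque
{
    private int count = 7;
    private string tag = "x";
}

// Builds "name=value, name=value" from the fields declared in T.
// For a class, .tupleof covers only the fields of T itself,
// not those inherited from base classes.
string dumpFields(T)(T obj)
{
    string result;
    foreach (i, field; obj.tupleof)
    {
        static if (i > 0)
            result ~= ", ";
        // The field name is obtained at compile time.
        result ~= __traits(identifier, T.tupleof[i]) ~ "=" ~ to!string(field);
    }
    return result;
}

void main()
{
    writeln(dumpFields(new Opaque)); // count=7, tag=x
}
```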
Nov 13 2009
prev sibling next sibling parent reply aarti_pl <aarti_no_spam_ interia.pl> writes:
Jacob Carlborg Wrote:

 On 11/13/09 00:13, aarti_pl wrote:
 In case of opaque classes there is enough to:
 1. add mixin inside:
 mixin Serializable;
 or
 2. add template method:
 void describeUdt(T)(T arch) {
 arch.describeStaticArray(array, array.stringof);
 }
Or you could use arhc.typeof[i] to access/set the values (even private) of a struct/class.
It works exactly this way. In D1 it was not possible to access private members with tupleof[], so there was a need for describe(). But even in D2 I think that describe() should stay as it gives more flexibility for user.
Nov 13 2009
parent reply Jacob Carlborg <doob me.com> writes:
On 11/13/09 16:03, aarti_pl wrote:
 Jacob Carlborg Wrote:

Or you could use arhc.typeof[i] to access/set the values (even private) of a struct/class.
It works exactly this way. In D1 it was not possible to access private members with tupleof[], so there was a need for describe(). But even in D2 I think that describe() should stay as it gives more flexibility for user.
It wasn't? When was that added? It works for me using gdc based on dmd somewhere between 1.024 and 1.030.
Nov 13 2009
parent aarti_pl <aarti interia.pl> writes:
Jacob Carlborg wrote:
 On 11/13/09 16:03, aarti_pl wrote:
 Jacob Carlborg Wrote:

Or you could use arhc.typeof[i] to access/set the values (even private) of a struct/class.
It works exactly this way. In D1 it was not possible to access private members with tupleof[], so there was a need for describe(). But even in D2 I think that describe() should stay as it gives more flexibility for user.
It wasn't? When was that added? It works for me using gdc based on dmd somewhere between 1.024 and 1.030.
I remember there were some problems, but that was already some time ago... I also remember that Andrei mentioned on the NG that tupleof[] should give access to all members regardless of their protection attributes, so I assume it was not working like that at the time.

Anyway, the possibility to define describe() should stay, as it gives flexibility. It is just no longer necessary, even when members are private.

BR
Marcin Kuszczak
Nov 13 2009
prev sibling parent reply aarti_pl <aarti_no_spam_ interia.pl> writes:
Bill Baxter Wrote:

 On Fri, Nov 13, 2009 at 12:13 AM, aarti_pl <aarti_no_spam_ interia.pl> wrote:
 Andrei Alexandrescu Wrote:

 Additionally I would like to mention that there is also great BinaryArchive
from Bill Baxter, which I didn't mention in my first post.
There is? Completely forgot about that. If I recall the big wish list item I had for your serializer was robust subclass handling for things like serializing a BaseClass[] with a mix of pointers to BaseClass and DerivedClass. You need to be able to de-serialize that by saying something like unserialize!(BaseClass[]). I think at the time I tried it, your serializer didn't save enough info to know the proper derived class to load up. --bb
This is still missing. The problem I had is that template functions are not virtual, so I cannot get the derived class that should be dumped. But I think it is doable: it just needs some more time and thinking. typeid() now gives the type of the most derived class, so maybe this is a way?
Nov 13 2009
parent Bill Baxter <wbaxter gmail.com> writes:
On Fri, Nov 13, 2009 at 7:14 AM, aarti_pl <aarti_no_spam_ interia.pl> wrote:
 Bill Baxter Wrote:

 On Fri, Nov 13, 2009 at 12:13 AM, aarti_pl <aarti_no_spam_ interia.pl> wrote:
 Andrei Alexandrescu Wrote:

 Additionally I would like to mention that there is also great BinaryArchive from Bill Baxter, which I didn't mention in my first post.
 There is? Completely forgot about that.

 If I recall the big wish list item I had for your serializer was
 robust subclass handling for things like serializing a BaseClass[]
 with a mix of pointers to BaseClass and DerivedClass. You need to be
 able to de-serialize that by saying something like
 unserialize!(BaseClass[]). I think at the time I tried it, your
 serializer didn't save enough info to know the proper derived class to
 load up.

 --bb
 This is still missing. The problem I had is that template functions are not virtual, so I can not get derived class which should be dumped. But I think it can be doable: just needs some more time and thinking.
Well, it was definitely possible, even a year ago, because Tom S.'s serializer in xf could do it, though I don't recall how. It was quite complicated. Layers and layers of ctfe code instantiating templates creating more code via ctfe, or something like that.
 Now typeid() gives type of most derived class, so maybe this is a way?
Yeh, definitely seems like that could help.

--bb
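
One way the typeid() idea could be used is sketched below. This is not how doost or the xf serializer work (the thread does not say); it only shows that typeid on a class reference yields the most derived class, and that druntime's Object.factory can recreate an instance from the recorded name. Shape, Circle, dynamicTypeNames and recreate are hypothetical names. The sketch assumes classes declared at module scope with default constructors and a runtime in which Object.factory is available, and it restores only the types, not any field data.

```d
import std.stdio : writeln;

class Shape
{
    string kind() { return "shape"; }
}

class Circle : Shape
{
    override string kind() { return "circle"; }
}

// Records the fully qualified name of each element's dynamic class.
string[] dynamicTypeNames(Shape[] items)
{
    string[] names;
    foreach (item; items)
        names ~= typeid(item).name;
    return names;
}

// Recreates default-constructed objects of the recorded classes.
Shape[] recreate(string[] names)
{
    Shape[] result;
    foreach (name; names)
    {
        auto obj = cast(Shape) Object.factory(name);
        assert(obj !is null, "cannot create an instance of " ~ name);
        result ~= obj;
    }
    return result;
}

void main()
{
    Shape[] items;
    items ~= new Shape;
    items ~= new Circle;

    auto restored = recreate(dynamicTypeNames(items));
    foreach (s; restored)
        writeln(s.kind());
}
```

A real serializer would store the recorded name next to the field data and then fill in the fields of the recreated object, which is the part that still needs a non-template entry point per class.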
Nov 13 2009