digitalmars.D - Serialization + semantics of toString

aarti_pl (41/45) Nov 12 2009 You might want to see my serialization library for D.

aarti_pl (6/66) Nov 12 2009 I forgot to throw a link:

Andrei Alexandrescu (3/6) Nov 12 2009 Cool, do you also have documentation?

aarti_pl (22/30) Nov 13 2009 Well, that's definitely weak point of this library. :-)

Bill Baxter (10/12) Nov 13 2009 There is? Completely forgot about that.
aarti_pl (21/52) Nov 15 2009 I have put some more user documentation on Doost project wiki, but it is...

Jacob Carlborg (3/50) Nov 13 2009 Or you could use arhc.typeof[i] to access/set the values (even private)

Denis Koroskin (3/46) Nov 13 2009 You mean .tupleof? Just tested and it really works (wow!), didn't know

Jacob Carlborg (2/50) Nov 13 2009 Yes, tupleof

aarti_pl (2/47) Nov 13 2009 It works exactly this way. In D1 it was not possible to access private m...

Jacob Carlborg (3/50) Nov 13 2009 It wasn't? When was that added? It works for me using gdc based on dmd

aarti_pl (9/67) Nov 13 2009 I remember there were some problems, but it was already some time ago......

aarti_pl (2/18) Nov 13 2009 This is still missing. The problem I had is that template functions are ...

Bill Baxter (12/31) Nov 13 2009 Well, it was definitely possible, even a year ago, because Tom S.'s

aarti_pl <aarti interia.pl> writes:

Andrei Alexandrescu wrote:
 But that being said, I'd so much want to start thinking of an actual
 text serialization infrastructure. Why develop one later with the
 mention "well use that stuff for debugging only, this is the real stuff."

 Andrei

You might want to see my serialization library for D.

I think it is worth noting because it achieves the goal:
same data, completely different output. Because the output can be 
defined by the user in whatever way she wants, it seems that this can work 
exactly the way toString should work.

This is achieved with Archive classes, which do the actual formatting 
and are completely independent of the data being printed. The initial 
design is based on the C++ Boost serialization library. I just extended 
the concept a bit and adapted it to D.

Basic interface for serialization is like this:

auto serializer = Serializer!(TextArchive);
//It might be also e.g.:
//auto serializer = Serializer!(JsonArchive);
auto input = new TransparentClass(-21, 2.11, "text1", 128, -127);
auto output = serializer.dump(input);
assert(serializer.load!(TransparentClass)(output) == input);

For transparent classes (every field is public) you don't need 
any method inside the serialized class/struct.

For opaque classes it is enough to either:
1. add a mixin inside:
mixin Serializable;
or
2. add a template method:
void describeUdt(T)(T arch) {
     arch.describeStaticArray(array, array.stringof);
}

This is all that is necessary to print any class/struct in 
whatever format you want.

Because of limitations of D I couldn't achieve serialization of classes 
through a base class reference. This is because template methods are not 
virtual.
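
To illustrate the limitation, here is a minimal sketch written for current D2 (not taken from the Doost sources; the class names Base and Derived and both method names are made up for the example). A template member function is resolved on the static type of the reference, while an ordinary method is dispatched on the dynamic type:

```d
class Base
{
    // Template member functions are never virtual, so they cannot be overridden.
    string describe(T)(T tag) { return "Base"; }

    // An ordinary method is virtual and dispatches on the dynamic type.
    string name() { return "Base"; }
}

class Derived : Base
{
    // This only hides Base.describe for references typed as Derived.
    string describe(T)(T tag) { return "Derived"; }

    override string name() { return "Derived"; }
}

unittest
{
    Base b = new Derived;
    assert(b.describe(0) == "Base");   // chosen from the static type Base
    assert(b.name() == "Derived");     // chosen from the dynamic type

    Derived d = new Derived;
    assert(d.describe(0) == "Derived");
}
```

The unittest block runs when the file is compiled with dmd -unittest -main. A serializer that only sees a Base reference therefore ends up in Base's template method unless it finds the concrete type some other way.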

I haven't had time to work on it recently, but if you think it's worthwhile 
and might eventually be included in Phobos, I would be interested in 
working on it further. I would definitely need some code/concept review, though.

Unfortunately the documentation is rather poor. But you can find a lot 
of unit tests in the examples directory.

It's Boost licensed so no worries :-)

BR
Marcin Kuszczak
(aarti_pl)

Nov 12 2009

aarti_pl <aarti interia.pl> writes:

I forgot to throw a link:

http://www.dsource.org/projects/doost/browser/trunk/examples/util/serializer/FunctionTest.d

BR
Marcin Kuszczak
(aarti_pl)

Nov 12 2009

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

aarti_pl wrote:
 I forgot to throw a link:
 
 http://www.dsource.org/projects/doost/browser/trunk/examples/util/serializer/FunctionTest.d

Cool, do you also have documentation?

Andrei

Nov 12 2009

aarti_pl <aarti_no_spam_ interia.pl> writes:

Andrei Alexandrescu Wrote:

 aarti_pl wrote:
 I forgot to throw a link:
 
 http://www.dsource.org/projects/doost/browser/trunk/examples/util/serializer/FunctionTest.d

 
 Cool, do you also have documentation?
 
 Andrei

Well, that's definitely a weak point of this library. :-) 

You can find some information on the doost wiki: 
http://www.dsource.org/projects/doost/wiki
There are also some DDOC comments in the code. It's also worth looking at the Boost
serialization library documentation, as my library is based on it.

Anyway, I will try to improve the documentation a bit over the weekend.

Additionally, I would like to mention that there is also a great BinaryArchive
from Bill Baxter, which I didn't mention in my first post.

---

I think the most interesting question is whether we can replace toString() with
a template-based solution. So instead of:
string toString()

we would write e.g.:
void describe(T)(T archive) {
}
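
As a rough sketch of that idea in current D2: the two archive types below, LineArchive and JsonLikeArchive, and their field method are hypothetical and are not Doost's archive classes. The struct describes its members once, and the archive passed in decides how they are formatted:

```d
import std.conv : to;

// Hypothetical archive that writes one "name: value" line per field.
struct LineArchive
{
    string output;

    void field(T)(string name, T value)
    {
        output ~= name ~ ": " ~ to!string(value) ~ "\n";
    }
}

// Hypothetical archive that writes comma-separated "name":value pairs.
struct JsonLikeArchive
{
    string output;

    void field(T)(string name, T value)
    {
        if (output.length) output ~= ",";
        output ~= `"` ~ name ~ `":` ~ to!string(value);
    }
}

struct Account
{
    private int id;
    private double balance;

    // One description of the data; the archive chooses the format.
    void describe(A)(ref A archive)
    {
        archive.field("id", id);
        archive.field("balance", balance);
    }
}

unittest
{
    auto acc = Account(7, 12.5);

    LineArchive text;
    acc.describe(text);
    assert(text.output == "id: 7\nbalance: 12.5\n");

    JsonLikeArchive json;
    acc.describe(json);
    assert(json.output == `"id":7,"balance":12.5`);
}
```

The archive is taken by ref here only because the example archives are structs; with class-based archives, passing by value as in the signature above would behave the same way.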

Some other things should be considered:
* What about virtual calls to describe? Currently template methods are not
virtual, but I remember posts saying that it is possible in some
limited way.
* Is it semantically the same as toString()? Or is toString() used for
something other than describing the members of a class? 
* If toString() is used for things other than describing class/struct fields,
do we need a standard way to do this, or should it be implementation
specific?
* The template method solution seems more general, as the output can be
customized differently for every kind of archive, but isn't it too
much for simple uses? Maybe string output is just enough?
* Should there be a default archive available, or should it always be defined
by the user?

Best Regards
Marcin Kuszczak
(aarti_pl)

Nov 13 2009

Bill Baxter <wbaxter gmail.com> writes:

On Fri, Nov 13, 2009 at 12:13 AM, aarti_pl <aarti_no_spam_ interia.pl> wrote:
 Andrei Alexandrescu Wrote:

 Additionally I would like to mention that there is also great BinaryArchive
from Bill Baxter, which I didn't mention in my first post.

There is?  Completely forgot about that.

If I recall the big wish list item I had for your serializer was
robust subclass handling for things like serializing a BaseClass[]
with a mix of pointers to BaseClass and DerivedClass.  You need to be
able to de-serialize that by saying something like
unserialize!(BaseClass[]).  I think at the time I tried it, your
serializer didn't save enough info to know the proper derived class to
load up.

--bb

Nov 13 2009

aarti_pl <aarti interia.pl> writes:

I have put some more user documentation on the Doost project wiki, but it is 
not yet half finished. Nevertheless I think it should help you get started 
with the serializer.

I am especially interested in comments on the Storage concept. I am 
completely unsure whether this design is good enough. Comments may be sent 
privately: aarti_no_spam_[at]interia.pl or here on the NG.

----

After thinking a bit about toString/serialization I came to the conclusion 
that these two are different things. In my opinion the best way to 
proceed would be to rename the toString method to:
toDebugString();
as that discourages using it for anything other than debugging. The default 
implementation of toDebugString() should serialize the object to a 
string.

It just occurred to me that it doesn't make any sense to add special 
serialization code (even if it is simple) just to get a quick and dirty 
printout of object state. These are simply two different use cases.

Best Regards
Marcin Kuszczak
(aarti_pl)

Nov 15 2009

Jacob Carlborg <doob me.com> writes:

On 11/13/09 00:13, aarti_pl wrote:
 In case of opaque classes there is enough to:
 1. add mixin inside:
 mixin Serializable;
 or
 2. add template method:
 void describeUdt(T)(T arch) {
 arch.describeStaticArray(array, array.stringof);
 }

Or you could use arhc.typeof[i] to access/set the values (even private) 
of a struct/class.

Nov 13 2009

"Denis Koroskin" <2korden gmail.com> writes:

On Fri, 13 Nov 2009 17:11:54 +0300, Jacob Carlborg <doob me.com> wrote:

 Or you could use arhc.typeof[i] to access/set the values (even private)  
 of a struct/class.

You mean .tupleof? Just tested and it really works (wow!), didn't know  
about that. Thanks!
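
For readers who have not used it, a minimal sketch of field access through .tupleof in current D2 follows. The Point struct and the dumpFields helper are made up for illustration. Note that private in D is module-level, so this single-file example would compile even with direct field access; the point discussed in the thread is that .tupleof also reaches private fields from other modules.

```d
import std.conv : to;

struct Point
{
    private int x;
    private double y;
    private string label;
}

// Builds "name=value" pairs for every field of a struct, in declaration order.
string dumpFields(T)(ref T value) if (is(T == struct))
{
    string result;
    foreach (i, field; value.tupleof)
    {
        if (i > 0) result ~= ", ";
        // __traits(identifier, ...) yields the field name at compile time.
        result ~= __traits(identifier, T.tupleof[i]) ~ "=" ~ to!string(field);
    }
    return result;
}

unittest
{
    auto p = Point(3, 1.5, "origin");
    assert(dumpFields(p) == "x=3, y=1.5, label=origin");

    // tupleof elements are lvalues, so fields can also be set by index.
    p.tupleof[0] = 42;
    assert(p.x == 42);
}
```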

Nov 13 2009

Jacob Carlborg <doob me.com> writes:

On 11/13/09 15:17, Denis Koroskin wrote:
 You mean .tupleof? Just tested and it really works (wow!), didn't know
 about that. Thanks!

Yes, tupleof

Nov 13 2009

aarti_pl <aarti_no_spam_ interia.pl> writes:

Jacob Carlborg Wrote:

 Or you could use arhc.typeof[i] to access/set the values (even private) 
 of a struct/class.

It works exactly this way. In D1 it was not possible to access private members
with tupleof[], so there was a need for describe(). But even in D2 I think that
describe() should stay, as it gives the user more flexibility.

Nov 13 2009

Jacob Carlborg <doob me.com> writes:

On 11/13/09 16:03, aarti_pl wrote:
 It works exactly this way. In D1 it was not possible to access private members
with tupleof[], so there was a need for describe(). But even in D2 I think that
describe() should stay as it gives more flexibility for user.

It wasn't? When was that added? It works for me using gdc based on dmd 
somewhere between 1.024 and 1.030.

Nov 13 2009

aarti_pl <aarti interia.pl> writes:

Jacob Carlborg wrote:
 It wasn't? When was that added? It works for me using gdc based on dmd 
 somewhere between 1.024 and 1.030.

I remember there were some problems, but that was some time ago... 
I also remember Andrei mentioning on the NG that tupleof[] should give 
access to all members regardless of their protection attributes, so I 
assume it was not working like this.

Anyway, the possibility to define describe() should stay, as it gives 
flexibility. It is just not required, even when members are private.

BR
Marcin Kuszczak

Nov 13 2009

aarti_pl <aarti_no_spam_ interia.pl> writes:

Bill Baxter Wrote:

 On Fri, Nov 13, 2009 at 12:13 AM, aarti_pl <aarti_no_spam_ interia.pl> wrote:
 Andrei Alexandrescu Wrote:

 Additionally I would like to mention that there is also great BinaryArchive
from Bill Baxter, which I didn't mention in my first post.

 
 There is?  Completely forgot about that.
 
 If I recall the big wish list item I had for your serializer was
 robust subclass handling for things like serializing a BaseClass[]
 with a mix of pointers to BaseClass and DerivedClass.  You need to be
 able to de-serialize that by saying something like
 unserialize!(BaseClass[]).  I think at the time I tried it, your
 serializer didn't save enough info to know the proper derived class to
 load up.
 
 --bb

This is still missing. The problem I had is that template functions are not
virtual, so I cannot tell which derived class should be dumped. But I think it
is doable; it just needs some more time and thought. Now typeid() gives the type
of the most derived class, so maybe this is a way?
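
A minimal sketch of how the typeid() idea could work in current D2 is shown below. It is only an illustration and not how Doost or any other serializer is implemented: the dumpers registry, the Dumper alias, and the Shape/Circle classes are all hypothetical. The run-time class name selects a non-template routine that knows the concrete class:

```d
import std.conv : to;

class Shape { int id; }
class Circle : Shape { double radius = 1.0; }

// Hypothetical registry: fully qualified class name -> routine for that class.
alias Dumper = string function(Object);
Dumper[string] dumpers;

string dumpShape(Object o)
{
    auto s = cast(Shape) o;
    return "Shape#" ~ to!string(s.id);
}

string dumpCircle(Object o)
{
    auto c = cast(Circle) o;
    return "Circle#" ~ to!string(c.id) ~ " r=" ~ to!string(c.radius);
}

string dump(Object o)
{
    // typeid on a class reference is evaluated at run time and
    // yields the TypeInfo of the most derived class.
    auto name = typeid(o).name;
    if (auto d = name in dumpers)
        return (*d)(o);
    return "unregistered: " ~ name;
}

unittest
{
    dumpers[typeid(Shape).name] = &dumpShape;
    dumpers[typeid(Circle).name] = &dumpCircle;

    auto c = new Circle;
    c.id = 2;
    c.radius = 2.5;
    Shape[] shapes = [new Shape, c];

    assert(typeid(shapes[1]) is typeid(Circle));   // dynamic type, not Shape
    assert(dump(shapes[0]) == "Shape#0");
    assert(dump(shapes[1]) == "Circle#2 r=2.5");
}
```

Writing the same class name into the output would give the loading side enough information to pick the matching derived class for each element of a BaseClass[].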

Nov 13 2009

Bill Baxter <wbaxter gmail.com> writes:

On Fri, Nov 13, 2009 at 7:14 AM, aarti_pl <aarti_no_spam_ interia.pl> wrote:
 Bill Baxter Wrote:

 On Fri, Nov 13, 2009 at 12:13 AM, aarti_pl <aarti_no_spam_ interia.pl> wrote:
 Andrei Alexandrescu Wrote:

 Additionally I would like to mention that there is also great BinaryArchive
 from Bill Baxter, which I didn't mention in my first post.

 There is? Completely forgot about that.

 If I recall the big wish list item I had for your serializer was
 robust subclass handling for things like serializing a BaseClass[]
 with a mix of pointers to BaseClass and DerivedClass. You need to be
 able to de-serialize that by saying something like
 unserialize!(BaseClass[]). I think at the time I tried it, your
 serializer didn't save enough info to know the proper derived class to
 load up.

 --bb

 This is still missing. The problem I had is that template functions are
 not virtual, so I can not get derived class which should be dumped. But I
 think it can be doable: just needs some more time and thinking.

Well, it was definitely possible, even a year ago, because Tom S.'s
serializer in xf could do it, though I don't recall how. It was quite
complicated.   Layers and layers of ctfe code instantiating templates
creating more code via ctfe, or something like that.

 Now typeid() gives type of most derived class, so maybe this is a way?

Yeh, definitely seems like that could help.

--bb

Nov 13 2009
