digitalmars.D - std.serialization: pre-voting review / discussion

Dicebot (35/35) Aug 12 2013 Stepping up to act as a Review Manager for Jacob Carlborg

Jacob Carlborg (12/16) Aug 12 2013 I don't think a pull request should be made before a module has gone

Dicebot (3/9) Aug 13 2013 That is something that requires the input from Phobos devs.

Dmitry Olshansky (7/16) Aug 13 2013 IMHO a good idea to have a non-trivial test suite to be separate so that...

Jacob Carlborg (4/8) Aug 13 2013 They are of a high level nature and not for the internal stuff.
Dicebot (4/8) Aug 13 2013 What do you think about having top-level folder with functional

Jacob Carlborg (4/6) Aug 13 2013 I think that's a good idea.

Jonathan M Davis (8/10) Aug 13 2013 In general, the tests should be right after the functions that they're t...

Dicebot (8/20) Aug 14 2013 There have been no real packages in Phobos so far. Tricky part

Dicebot (3/3) Aug 13 2013 2 everyone: should I interpret lack of activity as lack of

Dmitry Olshansky (4/6) Aug 13 2013 Give us some time darn it ;)

Tobias Pankrath (20/25) Aug 13 2013 I had no look at the code, but just opened the documentation, asking the...

Jacob Carlborg (8/11) Aug 13 2013 There's a fully working example here:

Dicebot (6/10) Aug 13 2013 Two random proposals for discussion:

Jacob Carlborg (4/6) Aug 13 2013 I'm wondering if the package.d module could be used for this, somehow.
glycerine (34/41) Aug 13 2013 I'm included to prefer the Thrift bindings over Orange since I

Dicebot (4/5) Aug 14 2013 Jacob, can you add a high-level overview which answers this

Jacob Carlborg (4/7) Aug 14 2013 Yes, I could do that.
Jacob Carlborg (5/8) Aug 14 2013 Actually, I'm not so sure about that. He wants me to do a comparison to

Dicebot (7/15) Aug 14 2013 I did not mean to answer these questions specifically - just

Jacob Carlborg (4/9) Aug 14 2013 Ok, ok, I'll do that.

Jacob Carlborg (11/40) Aug 14 2013 It seems like we have a different meaning of "essential". I have

glycerine (32/37) Aug 14 2013 Wishful thinking aside, they are competitors. The fact that you

Dicebot (7/8) Aug 14 2013 They are not. `std.serialization` does not and should not compete

glycerine (12/16) Aug 17 2013 Huh? Do you know what thrift does? Summary: Everything that

BS (13/31) Aug 17 2013 I agree built in version support is important.

David Nadlinger (7/9) Aug 18 2013 In an ideal world, Thrift could maybe be built on

Dicebot (23/41) Aug 17 2013 Yes I know what Thrift does and that is not what it is needed

David Nadlinger (10/13) Aug 18 2013 The D implementation of Thrift is actually not a binding and does

John Colvin (9/20) Aug 17 2013 Thrift is the preferred choice when choosing a library for ALL
David Nadlinger (19/31) Aug 18 2013 That's actually not true. Thrift does not serialize arbitrary

Andrej Mitrovic (11/23) Aug 18 2013 I think it would be good if we added Thrift and other test-cases, for

David Nadlinger (25/30) Aug 18 2013 The big problem with this right now is that quite frequently, you

Dicebot (3/4) Aug 18 2013 Please, don't move too far from review topic ;) It is a separate
Walter Bright (13/19) Aug 20 2013 That's exactly the problem. If these large projects are incorporated int...

Jacob Carlborg (22/29) Aug 18 2013 Orange/std.serialization is capable of serializing more types than

Jonathan M Davis (7/8) Aug 18 2013 I don't know if it's crucial or not, but I know that the Java guys didn'...

bsd (24/37) Aug 19 2013 I think this versioning idea is more important for protocol

David Nadlinger (16/29) Aug 19 2013 Seems like your memory has indeed faded a bit. ;)

David Nadlinger (13/18) Aug 19 2013 By the way, to be honest, this is also the main point that makes
bsd (2/17) Aug 19 2013 Getting old! :-)

ilya-stromberg (16/17) Aug 22 2013 Can std.serialization load data if class definition was changed?

Jacob Carlborg (5/19) Aug 22 2013 Yes. In this case it will use the name of the instance fields when

ilya-stromberg (14/19) Aug 22 2013 Great! What about more difficult cases? For example, we have:

Jacob Carlborg (6/19) Aug 22 2013 No it can't. It will throw an exception because it cannot find a "long"

ilya-stromberg (14/25) Aug 23 2013 It's a serious issue. May be it's more important than range

Dicebot (11/17) Aug 23 2013 I don't think it as an issue at all. Behavior you want can't be

Tyler Jameson Little (6/23) Aug 23 2013 What about adding delegate hooks in somewhere? These delegates

Jacob Carlborg (5/10) Aug 23 2013 std.serialization already supports delegate hooks for missing values:

Tyler Jameson Little (2/15) Aug 23 2013 Awesome!

ilya-stromberg (16/33) Aug 28 2013 Maybe you are right.

Dicebot (5/6) Aug 28 2013 There was a good proposal by Dmitry to separate sequential strict

ilya-stromberg (6/10) Aug 28 2013 The problem is not only my. Actually, I didn't use C#

Dicebot (4/5) Aug 28 2013 http://forum.dlang.org/post/kvj17t$1ash$1@digitalmars.com

Jacob Carlborg (6/20) Aug 28 2013 I don't think we should add too much of this kind of functionality.

ilya-stromberg (7/30) Sep 01 2013 Jacob, can you use clearer error messages and provide more

ilya-stromberg (4/38) Sep 01 2013 Sorry, I want to write:

Jacob Carlborg (4/7) Sep 01 2013 Yes, I could enhance the error message.

Jacob Carlborg (37/50) Aug 23 2013 Actually, my previous answer was not entirely correct. By default it

ilya-stromberg (43/48) Aug 24 2013 Great job!

Jacob Carlborg (11/29) Aug 24 2013 I actually noticed this problem when I wrote the example. First, the

ilya-stromberg (4/12) Aug 24 2013 In that case maybe we should remove "Serializable" interface? And

Jacob Carlborg (4/7) Aug 24 2013 Yes, that's what I'm planning to do.

ilya-stromberg (4/11) Aug 24 2013 Maybe we should rename methods "toData" and "fromData" to avoid

ilya-stromberg (10/17) Aug 26 2013 The name "isSerializable" is TERRIBLE:

mrd (8/12) Sep 21 2013 Is this the right way?

Jacob Carlborg (6/12) Sep 23 2013 Not necessarily. I could implement that by default it will use the field...

ilya-stromberg (22/24) Aug 13 2013 Sorry, I think that example is WRONG!

Jacob Carlborg (5/25) Aug 14 2013 How do I do that?

Dicebot (3/10) Aug 14 2013 http://dlang.org/changelog.html#documentedunittest

Jacob Carlborg (4/5) Aug 14 2013 Thanks.

Jacob Carlborg (4/8) Aug 13 2013 I have rebased now.
Tyler Jameson Little (8/8) Aug 14 2013 Serious:

Jacob Carlborg (13/21) Aug 14 2013 That's up to the archive how it chooses to implement it. But the current...

Tyler Jameson Little (10/34) Aug 14 2013 Well, std.xml needs to be replaced anyway, so it's probably not a

Jacob Carlborg (8/13) Aug 14 2013 I haven't done any measurements. It will use the GC to deserialize

Tove (22/28) Aug 14 2013 I understand the need for Orange to be backwards compatible, but

Tyler Jameson Little (3/33) Aug 14 2013 I like this a lot more. Phobos just needs to be compatible with

Jacob Carlborg (5/7) Aug 14 2013 I guess this is why we have this thread. I would like to hear comments

ilya-stromberg (6/13) Aug 14 2013 I think we should avoid mixins as much as it possible.

Jacob Carlborg (6/10) Aug 14 2013 Of course UDA's should be the primary use for this. The question is

Dicebot (5/7) Aug 14 2013 What is the point of including something into Phobos and
ilya-stromberg (7/17) Aug 14 2013 I did not use Orange at all.

Dicebot (6/11) Aug 14 2013 std.serialization is not Orange and should not be considered as

Jonathan M Davis (3/7) Aug 14 2013 Agreed.

Kapps (11/21) Aug 14 2013 I don't think it should be included. The UDAs replace it nicely,

Jacob Carlborg (5/7) Aug 14 2013 You do have a point.

Jacob Carlborg (8/27) Aug 14 2013 I don't know, it doesn't really hurt to be present. And for anyone using...

ilya-stromberg (34/53) Sep 04 2013 Jacob, can you add "@serializationName(string name)" UDA?

Jacob Carlborg (7/39) Sep 04 2013 Yes, the question is how much of these customization should be

Andrei Alexandrescu (4/9) Aug 14 2013 This seems like a major limitation. (Disclaimer: I haven't read the

Jacob Carlborg (5/7) Aug 14 2013 The data is built up as a DOM (with the XmlArchive) using std.xml. I

Andrei Alexandrescu (4/9) Aug 14 2013 I'm thinking some people may need to stream to/from large files and

Jacob Carlborg (4/6) Aug 14 2013 Yes, I understand that. But currently I'm limited by std.xml.

H. S. Teoh (9/15) Aug 14 2013 [...]

Jacob Carlborg (5/9) Aug 14 2013 Since std.xml returns the data as a string, you mean I just forward the

ilya-stromberg (4/9) Aug 14 2013 Can you use another serialization format and supports file output

Tyler Jameson Little (13/25) Aug 14 2013 That's often not possible, especially when working with an
Jacob Carlborg (7/9) Aug 15 2013 The idea of the library is that it can support multiple archive types.

ilya-stromberg (2/9) Aug 21 2013 Jacob, do you close to finish the work on the binary archive?

Jacob Carlborg (4/5) Aug 21 2013 No, I don't think so.

mrd (11/18) Sep 21 2013 I am also working on my own binary archive implementation (I just

Jacob Carlborg (10/15) Sep 23 2013 Fields are looked up by name. This is to avoid a dependency of the order...

ilya-stromberg (21/33) Aug 18 2013 Shall we fix it before accept the std.serialization?

Marek Janukowicz (16/46) Aug 18 2013 My opinion is - accept it as it is (if it's not completely broken). I

Andrej Mitrovic (3/6) Aug 18 2013 FWIW you could try out msgpack-d: https://github.com/msgpack/msgpack-d#u...

Marek Janukowicz (5/13) Aug 18 2013 That's what I ended up using, but I would be much more happy to have

Tobias Pankrath (7/19) Aug 18 2013 We should add a suitable range interface, even if it makes no

Tyler Jameson Little (7/28) Aug 18 2013 I completely agree.

Jacob Carlborg (5/6) Aug 18 2013 The XML module from Tango excepts the content being in memory as well,

Jesse Phillips (28/28) Aug 17 2013 I'd like to start off by saying I don't really know what I want

Jacob Carlborg (15/41) Aug 19 2013 I have had a brief look at Protocol Buffers and I don't see why it

ilya-stromberg (3/7) Aug 19 2013 You can find the Protocol Buffers library here, may be it helps:

Jesse Phillips (2/9) Aug 19 2013 Code has moved to https://github.com/opticron/ProtocolBuffer

Jacob Carlborg (5/6) Aug 19 2013 Does it have any utility functions that are fairly standalone to handle

Jesse Phillips (3/7) Aug 20 2013 The data conversions are handled by

Jesse Phillips (9/24) Aug 19 2013 I not familiar with the interaction of Archive and Serializer. I

Jacob Carlborg (15/22) Aug 19 2013 std.serialization basically support any type in D (except for delegates

Dicebot (24/60) Aug 18 2013 OK, time to make a short summary.

Walter Bright (3/5) Aug 18 2013 I agree. Ranges are a very big deal for D, and libraries that can concei...
ilya-stromberg (4/15) Aug 19 2013 Can we path current std.xml to add file input/output, not only
Jacob Carlborg (9/30) Aug 19 2013 I've been quite busy lately but I've tried to address the minor issues

Dicebot (8/15) Aug 19 2013 I also expect that enhancement to dlang.org to support package.d

Jacob Carlborg (5/6) Aug 19 2013 It just that I don't clearly know how the code will need to look like,

Dicebot (3/8) Aug 19 2013 Ok, I'll investigate related part of package a bit more in

Jacob Carlborg (17/19) Aug 19 2013 What I have now is something like this:

Tyler Jameson Little (27/32) Aug 19 2013 Maybe we need some kind of doc explaining the idiomatic usage of

Johannes Pfau (48/68) Aug 19 2013 Your "pipe" function is the same as std.algorithm.copy(InputRange,

Dmitry Olshansky (5/14) Aug 19 2013 +1
Tyler Jameson Little (8/77) Aug 19 2013 +1 for the first way.

ilya-stromberg (23/61) Aug 20 2013 No, you are WRONG. InputRange is MORE flexible: it can be lazy or

Johannes Pfau (13/83) Aug 20 2013 Yes, InputRange is more flexible, but it's also more difficult to

ilya-stromberg (6/22) Aug 20 2013 No, Archive have to do NOTHING. 'serialize' call must only store

Dicebot (70/70) Aug 20 2013 Ok, I was trying to avoid expressing personal opinion until now

Daniel Murphy (4/10) Aug 20 2013 I think this is very important. Simple uses should be as simple as

Tyler Jameson Little (6/20) Aug 20 2013 +1

Jacob Carlborg (4/9) Aug 20 2013 The rest of the API is need for more advanced use cases.
ilya-stromberg (10/31) Aug 20 2013 It will be great! Also, whith Uniform Function Call Syntax (UFCS)

Jacob Carlborg (4/9) Aug 20 2013 That's the plan.

Jacob Carlborg (12/27) Aug 20 2013 I have been planning to add a function like that but just haven't got

Dicebot (6/12) Aug 20 2013 Cool, as I have said it is not something critical that would

Dicebot (5/5) Aug 20 2013 P.S. Right now most important (and probably only really

Jacob Carlborg (6/11) Aug 20 2013 Yes, but now there have been quite a lot suggestions for how the range

ilya-stromberg (6/19) Aug 21 2013 Try to read the article:
Dicebot (5/8) Aug 21 2013 Sure. I have already written my opinion on this but getting

Jacob Carlborg (4/6) Aug 19 2013 I have removed all uses of "mixin annotations".

Walter Bright (27/29) Aug 20 2013 Thank you, Jacob. It looks like you've put a lot of nice work into this.

Jacob Carlborg (33/63) Aug 20 2013 Yes, I need to add some overview documentation. There's still the

Walter Bright (13/29) Aug 20 2013 Hmm. That looks then like a ddoc bug.

Jacob Carlborg (9/20) Aug 20 2013 I guess it could be called "archiver", or do you have a better suggestio...

David Nadlinger (4/5) Aug 20 2013 Almost certainly, yes. An "archive" is something you put data
Walter Bright (4/12) Aug 20 2013 I tend to think in terms of concrete examples, rather than abstract conc...

Jacob Carlborg (5/6) Aug 22 2013 Added as: http://d.puremagic.com/issues/show_bug.cgi?id=10870

Walter Bright (2/4) Aug 22 2013 Thanks

Dicebot (2/2) Aug 31 2013 Jacob, what are your current plans on this (considering recent

Jacob Carlborg (22/24) Aug 31 2013 My todo list looks like this:

Dicebot (5/24) Aug 31 2013 Great. No hurry here, there is no hard deadline for voting - I'll

Jacob Carlborg (6/9) Aug 31 2013 What I mean is that we usual have a couple of weeks for reviewing and

Dicebot (7/16) Aug 31 2013 You won't. Reviewing is not a blocking operation, if anyone wants

Dicebot (2/2) Sep 28 2013 Review summary: http://wiki.dlang.org/Review/std.serialization

"Dicebot" <public dicebot.lv> writes:

Stepping up to act as a Review Manager for Jacob Carlborg 
std.serialization

---- Input ----

Code: https://github.com/jacob-carlborg/phobos/tree/serialization

Documentation: 
https://dl.dropboxusercontent.com/u/18386187/docs/std.serialization/index.html

Previous review thread: 
http://forum.dlang.org/thread/adyanbsdsxsfdpvoozne@forum.dlang.org

---- Changes since last review ----

- Sources have been integrated into the Phobos source tree
- DDOC documentation has been provided in the form it should take 
on dlang.org
- Most utility functions/templates the code depends on have been 
inlined. Remaining `package` utility modules:
     * std.serialization.archives.xmldocument
     * std.serialization.attribute
     * std.serialization.registerwrapper

---- Information for reviewers ----

The goal of this thread is to detect whether there are any outstanding 
issues that need to be fixed before formal "yes"/"no" voting 
happens. If no critical objections arise, voting will begin 
next week.

Please take this seriously: "If you identify problems along the 
way, please note if they are minor, serious, or showstoppers." 
(http://wiki.dlang.org/Review/Process). This information will later 
be used to determine whether the library is ready for voting.

If you are a frequent Phobos contributor or core developer, 
please pay extra attention to the submission's code style and how it 
fits into the overall Phobos guidelines and structure.

-------------------------------------

Let the thread begin.

Jacob, it is probably worth creating a pull request with the latest 
rebased version of your proposal to simplify getting a quick 
overview of changes. Also, please say if there is anything you 
want/need to implement before merging.

Aug 12 2013

Jacob Carlborg <doob me.com> writes:

On 2013-08-12 15:27, Dicebot wrote:

 Jacob, it is probably worth creating a pull request with latest rebased
 version of your proposal to simplify getting a quick overview of
 changes.

I don't think a pull request should be made before a module has gone 
through the review queue and is approved. With Github it's easy to diff 
between a fork and upstream: 
https://github.com/jacob-carlborg/phobos/compare/serialization

 Also please tell if there is anything you want/need to implement before
merging.

* I have to double-check that I haven't added any improvements to Orange 
that are not present in std.serialization. But those would only be bug fixes 
with no public API change, so that shouldn't hold up the review.

* I forgot to mention that the unit tests are, somewhat controversially, 
located in std.serialization.tests

-- 
/Jacob Carlborg

Aug 12 2013

"Dicebot" <public dicebot.lv> writes:

On Monday, 12 August 2013 at 19:55:01 UTC, Jacob Carlborg wrote:
 I don't think a pull request should be made before a module has 
 gone through the review queue and is approved. With Github it's 
 easy to diff between a fork and upstream: 
 https://github.com/jacob-carlborg/phobos/compare/serialization

Agreed.

 * I forgot to add that the unit tests are, a bit controversial, 
 located in std.serialization.tests

That is something that requires the input from Phobos devs.

Aug 13 2013

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

13-Aug-2013 16:48, Dicebot wrote:
 On Monday, 12 August 2013 at 19:55:01 UTC, Jacob Carlborg wrote:
 I don't think a pull request should be made before a module has gone
 through the review queue and is approved. With Github it's easy to
 diff between a fork and upstream:
 https://github.com/jacob-carlborg/phobos/compare/serialization

 Agreed.

 * I forgot to add that the unit tests are, a bit controversial,
 located in std.serialization.tests

 That is something that requires the input from Phobos devs.

IMHO it's a good idea to keep a non-trivial test suite separate so that 
it doesn't clutter other module(s). That said, isolated tests for 
individual pieces and internal stuff are better kept together with the 
code they test.

-- 
Dmitry Olshansky

Aug 13 2013

Jacob Carlborg <doob me.com> writes:

On 2013-08-13 16:34, Dmitry Olshansky wrote:

 IMHO it's a good idea to keep a non-trivial test suite separate so that
 it doesn't clutter other module(s). That said, isolated tests for
 individual pieces and internal stuff are better kept together with the
 code they test.

They are of a high level nature and not for the internal stuff.

-- 
/Jacob Carlborg

Aug 13 2013

"Dicebot" <public dicebot.lv> writes:

On Tuesday, 13 August 2013 at 14:35:06 UTC, Dmitry Olshansky 
wrote:
 IMHO it's a good idea to keep a non-trivial test suite 
 separate so that it doesn't clutter other module(s). That 
 said, isolated tests for individual pieces and internal stuff 
 are better kept together with the code they test.

What do you think about having a top-level folder with functional 
tests for more complex packages?

Aug 13 2013

Jacob Carlborg <doob me.com> writes:

On 2013-08-13 17:09, Dicebot wrote:

 What do you think about having top-level folder with functional tests
 for a more complex packages?

I think that's a good idea.

-- 
/Jacob Carlborg

Aug 13 2013

"Jonathan M Davis" <jmdavisProg gmx.com> writes:

On Monday, August 12, 2013 21:55:00 Jacob Carlborg wrote:
 * I forgot to add that the unit tests are, a bit controversial, located
 in std.serialization.tests

In general, the tests should be right after the functions that they're testing 
and not in a separate file. I don't know that it's never appropriate to put the 
tests in a separate file, but none of Phobos does this right now, and I think 
that a very good reason is needed for doing it. I also don't know that it 
works very well with how the Posix makefiles compile and run each module's unit 
tests separately.
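
As a minimal sketch of the convention described above (a test placed right after the function it tests), assuming a hypothetical module `placement_demo` and a hypothetical helper `sumOf` that are not part of Phobos or std.serialization:

```d
module placement_demo; // hypothetical module name, for illustration only

/// Returns the sum of all elements in `values` (hypothetical helper).
int sumOf(const int[] values) pure nothrow @safe
{
    int total = 0;
    foreach (v; values)
        total += v;
    return total;
}

// The unit test directly follows the function it exercises, so a
// per-module test run picks it up together with the code.
unittest
{
    assert(sumOf([1, 2, 3]) == 6);
    assert(sumOf([-1, 1]) == 0);
}
```

Compiling this file with `dmd -unittest -main` and running the resulting executable should run the test block before the empty `main`.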

- Jonathan M Davis

Aug 13 2013

"Dicebot" <public dicebot.lv> writes:

On Wednesday, 14 August 2013 at 02:32:20 UTC, Jonathan M Davis 
wrote:
 In general, the tests should be right after the functions that 
 they're testing
 and not in a separate file. I don't know that it's never 
 appropriate to put the
 tests in a separate file, but none of Phobos does this right 
 now, and I think
 that a very good reason is needed for doing it. I also don't 
 know that it
 works very well with how the Posix makefiles compile and run 
 each module's unit
 tests separately.

 - Jonathan M Davis

There have been no real packages in Phobos so far. The tricky part 
comes when you need to test some functionality that is split 
across several modules - those are technically not _unit_ tests 
and do not belong to any specific module. I don't know what 
the right approach is here, but I doubt existing practice can be 
adopted blindly.

Aug 14 2013

"Dicebot" <public dicebot.lv> writes:

To everyone: should I interpret lack of activity as lack of 
interest in getting this into Phobos or lack of issues to comment 
on? ;)

Aug 13 2013

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

13-Aug-2013 17:15, Dicebot wrote:
 To everyone: should I interpret lack of activity as lack of interest in
 getting this into Phobos or lack of issues to comment on? ;)

Give us some time darn it ;)
-- 
Dmitry Olshansky

Aug 13 2013

Tobias Pankrath <lists pankrath.net> writes:

On 08/12/2013 03:27 PM, Dicebot wrote:
 Stepping up to act as a Review Manager for Jacob Carlborg std.serialization

 ---- Input ----

 Code: https://github.com/jacob-carlborg/phobos/tree/serialization

 Documentation:
 https://dl.dropboxusercontent.com/u/18386187/docs/std.serialization/index.html

I haven't looked at the code, but just opened the documentation, asking the 
question: "What do I need to do to serialize this graph data structure 
I have here?". The documentation does not seem to give a straight answer.

Now, that's an issue I have with almost all Phobos modules, for example 
with std.container, but I'll raise the point here: our documentation 
standards are not good enough, because all we ever have is some API 
documentation along the lines of "this is module X, it contains the symbols 
A, B, C", each with a short description.

However, good documentation (look at docs.python.org for example) needs 
to do more. The module has a purpose: it should help me 
accomplish a task. The documentation must say (preferably in a single 
location) what this task is and how this module/library may
help me accomplish it. An outline of the basic design 
decisions (for example, where does std.container mention that its structures
have reference semantics?) is often invaluable in understanding the API and 
the more detailed documentation.

I know how much work such documentation is, and I surely wouldn't vote 
against std.serialization just because of this. But it's one of the two 
biggest hindrances if you want to get started / productive with D. (IMHO)

Aug 13 2013

Jacob Carlborg <doob me.com> writes:

On 2013-08-13 15:33, Tobias Pankrath wrote:

 I had no look at the code, but just opened the documentation, asking the
 question: "What do I need to do to serialize this graph data structure,
 I have here?". The documentation does not seem to give a straight answer.

There's a fully working example here:

https://dl.dropboxusercontent.com/u/18386187/docs/std.serialization/std_serialization_serializer.html

I worked quite hard on the documentation. There are code examples here 
as well; I just don't know where to put them in Phobos:

https://github.com/jacob-carlborg/orange/wiki/_pages

-- 
/Jacob Carlborg

Aug 13 2013

"Dicebot" <public dicebot.lv> writes:

On Tuesday, 13 August 2013 at 15:04:38 UTC, Jacob Carlborg wrote:
 I worked quite hard with the documentation. There are code 
 examples here as well, I just don't know where to put them in 
 Phobos:

 https://github.com/jacob-carlborg/orange/wiki/_pages

Two random proposals for discussion:

1) Choose one or two examples that cover the most typical use cases 
and put them into the `serializer` module header.

2) Create a dedicated `examples` module which is not imported in 
`package.d` but is available in the documentation.
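
To illustrate what the first proposal could look like, here is a rough sketch of a Ddoc module header that carries a short example. Everything in it (the module name `header_demo` and the `toText`/`fromText` helpers) is a hypothetical stand-in and not the actual std.serialization API:

```d
/**
 * Hypothetical module header, for illustration only (not the real
 * std.serialization.serializer header).
 *
 * A short overview of what the module is for would go here, followed by
 * one or two typical use cases.
 *
 * Examples:
 * ---
 * auto text = toText(42);
 * assert(fromText!int(text) == 42);
 * ---
 */
module header_demo;

import std.conv : to;

/// Converts `value` to its textual form (stand-in for a real serializer).
string toText(T)(T value)
{
    return to!string(value);
}

/// Parses a value of type `T` back from `text`.
T fromText(T)(string text)
{
    return to!T(text);
}

// Keeps the header example honest by running the same statements as a test.
unittest
{
    auto text = toText(42);
    assert(fromText!int(text) == 42);
}
```

Running `dmd -D` on such a file should emit the header, including the example, at the top of the generated documentation page.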

Aug 13 2013

Jacob Carlborg <doob me.com> writes:

On 2013-08-13 17:12, Dicebot wrote:

 2) Create a devoted `examples` module which is not imported in
 `package.d` but is available in documentation.

I'm wondering if the package.d module could be used for this, somehow.

-- 
/Jacob Carlborg

Aug 13 2013

"glycerine" <noreply noreply.com> writes:

On Tuesday, 13 August 2013 at 15:12:40 UTC, Dicebot wrote:
 On Tuesday, 13 August 2013 at 15:04:38 UTC, Jacob Carlborg 
 wrote:
 I worked quite hard with the documentation. There are code 
 examples here as well, I just don't know where to put them in 
 Phobos:

 https://github.com/jacob-carlborg/orange/wiki/_pages


I'm inclined to prefer the Thrift bindings over Orange since I 
need binary compression and type safety (XML??? yikes), 
inter-language operability, and most essentially, data versioning.

Nonetheless, in order to make a realistic comparison and 
evaluation, I need much more of the theory of operation, and a 
description of the Orange design.  I appreciate that you worked 
hard on the documentation.  But most of the essential 
description is missing.

Here is an outline of serialization tradeoffs and architectural 
issues that should be discussed in the documentation.

1. Interface Definition Language (IDL): required or not? If not, 
how do you know the details of what to serialize? If not, how do you 
handle/support data versioning? If not, how do you interoperate 
with another language? If yes, which types are supported and 
what is the syntax and grammar of the IDL?

2. Is the serialized format independently de-marshallable, or is 
meta information required in addition?

3. Which transports if any, are integrated/supported?  Memory 
buffer, file descriptor, framed, zero-copy, socket, SSL support, 
JSON, etc.

4. Are service definitions supported (methods on objects or 
functions)? Are they versioned?

5. How compatible is the format with other languages?

6. How compact is the encoding?

7. How fast is it to marshal and unmarshal?  What tradeoffs were 
made?

8. Is there a debug encoding, text that is human readable?

9. To emphasize the important point of the first item again: data 
versioning: how do you upgrade your cluster when a data 
definition changes?  If your serialization format requires 
simultaneous downtime for the entire cluster instead of 
supporting incremental upgrade, I'd say your architecture is 
seriously antiquated.
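
To make the data-versioning question in items 1 and 9 concrete, here is a minimal sketch of one IDL-free approach: matching fields by name so that data written by an older version still loads. It is not taken from Orange, std.serialization, or Thrift; the `Person` struct and the `fromFields` helper are hypothetical, and the "archive" is simply assumed to be a name-to-text map:

```d
import std.conv : to;

// Hypothetical "new" version of a record: `email` was added after
// older data had already been written.
struct Person
{
    string name;
    int age;
    string email = "unknown"; // default used when old data lacks the field
}

// Fills a `T` from a name -> text map, matching fields by name.
// Fields missing from the archive keep their defaults; archive
// entries with no matching field are ignored.
T fromFields(T)(string[string] archived)
{
    T result;
    foreach (i, ref field; result.tupleof)
    {
        enum fieldName = __traits(identifier, T.tupleof[i]);
        if (auto stored = fieldName in archived)
            field = to!(typeof(field))(*stored);
    }
    return result;
}

unittest
{
    // Data written by the old version: no `email`, plus a field the
    // new version no longer knows about.
    auto old = ["name": "Ann", "age": "30", "nickname": "A"];
    auto p = fromFields!Person(old);
    assert(p.name == "Ann");
    assert(p.age == 30);
    assert(p.email == "unknown");
}
```

Name-based lookup tolerates added and removed fields, but not a field whose type changes incompatibly; in this sketch `to` would throw a `ConvException` in that case.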

Aug 13 2013

"Dicebot" <public dicebot.lv> writes:

On Tuesday, 13 August 2013 at 22:54:05 UTC, glycerine wrote:
 ...

Jacob, can you add a high-level overview which answers these 
questions? (in any place you find convenient, until a proper place 
for package-wide documentation is decided).

Aug 14 2013

Jacob Carlborg <doob me.com> writes:

On 2013-08-14 09:19, Dicebot wrote:

 Jacob, can you add a high-level overview which answers these questions?
 (in any place you find convenient, until proper place for package-wide
 documentation is decided).

Yes, I could do that.

-- 
/Jacob Carlborg

Aug 14 2013

Jacob Carlborg <doob me.com> writes:

On 2013-08-14 09:19, Dicebot wrote:

 Jacob, can you add a high-level overview which answers these questions?
 (in any place you find convenient, until proper place for package-wide
 documentation is decided).

Actually, I'm not so sure about that. He wants me to do a comparison to 
Thrift.

-- 
/Jacob Carlborg

Aug 14 2013

"Dicebot" <public dicebot.lv> writes:

On Wednesday, 14 August 2013 at 07:43:22 UTC, Jacob Carlborg 
wrote:
 On 2013-08-14 09:19, Dicebot wrote:

 Jacob, can you add a high-level overview which answers these 
 questions?
 (in any place you find convenient, until proper place for 
 package-wide
 documentation is decided).

 Actually, I'm not so sure about that. He wants me to do a 
 comparison to Thrift.

I did not mean to answer these questions specifically - just 
provide a high-level overview of what std.serialization is. 
Architecture, use case domain etc. - to avoid similar confusion 
from people exploring standard library documentation. If such 
misunderstanding has happened once, it is likely to happen again.

Aug 14 2013

Jacob Carlborg <doob me.com> writes:

On 2013-08-14 09:46, Dicebot wrote:

 I did not mean to answer these questions specifically - just provide a
 high-level overview of what std.serialization is. Architecture, use case
 domain etc. - to avoid similar confusion from people exploring standard
 library documentation. If such misunderstanding has happened once, it is
 likely to happen again.

Ok, ok, I'll do that.

-- 
/Jacob Carlborg

Aug 14 2013

Jacob Carlborg <doob me.com> writes:

On 2013-08-14 00:54, glycerine wrote:

 I'm inclined to prefer the Thrift bindings over Orange since I need
 binary compression and type safety (XML??? yikes), inter-language
 operability, and most essentially, data versioning.

 Nonetheless, in order to make a realistic comparison and evaluation, I
 need much more of the theory of operation, and a description of the
 Orange design.  I appreciate that you worked hard with the
 documentation.  But most of the essential description is missing.

It seems like we have different meanings of "essential". I have 
documented the package for what it is, not for what it's not. It's a 
package for serialization, not an RPC or network package. SSL support 
doesn't make sense; it's like asking "Does std.algorithm.map have SSL 
support?".

You seem to want me to write a comparison to Thrift in the 
documentation. You have to make the comparison yourself.

 Here is an outline of serialization tradeoffs and architectural issues
 that should be discussed in the documentation.

 1. Interface Definition Language (IDL): required or not? If not, how do
 you know the details of what to serialize? If not, how do you handle/support
 data versioning? If not, how do you interoperate with another
 language? If yes, which types are supported and what is the syntax and
 grammar of the IDL?

 2. Is the serialized format independently de-marshallable, or is meta
 information required in addition?

 3. Which transports if any, are integrated/supported?  Memory buffer,
 file descriptor, framed, zero-copy, socket, SSL support, JSON, etc.

 4. Are service definitions supported (methods on objects or functions)?
 Are they versioned?

 5. How compatible is the format with other languages?

 6. How compact is the encoding?

 7. How fast is it to marshal and unmarshal?  What tradeoffs were made?

 8. Is there a debug encoding, text that is human readable?

 9. To emphasize the important point of the first item again: data
 versioning: how do you upgrade your cluster when a data definition
 changes?  If your serialization format requires simultaneous downtime
 for the entire cluster instead of supporting incremental upgrade, I'd
 say your architecture is seriously antiquated.

Many of these questions don't even make sense, as I stated above.

-- 
/Jacob Carlborg

Aug 14 2013

"glycerine" <noreply noreply.com> writes:

On Wednesday, 14 August 2013 at 07:40:13 UTC, Jacob Carlborg 
wrote:
 I have documented the package for what it is, not for what it's 
 not. It's a package for serialization, not an RPC or network 
 package....

 You seem to want me to write a comparison to Thrift in the 
 documentation. You have to make the comparison yourself.

Wishful thinking aside, they are competitors. The fact that you 
haven't already done this comparison is unfortunate. I've already 
done that comparison, tried Orange, and found it wanting. If you 
don't want everyone else to do the same, you should answer the 
questions outlined so that it can be positioned appropriately in 
people's minds.

If you'd like examples of how to present design rationale, using 
contrast for illustration, consider the example of Stroustrup's 
presentation of C++ features in any of his several books. 
Contrasting analysis is often essential in describing the 
history, design and rationale for your work; the "related work" 
section of any technical publication is required by reviewers. 
You should provide it if you don't want your work to be dismissed 
out of hand.

Many if not most modern serialization libraries do address 
transport, and it is critical to the most common use case for 
serialization: as a developer, I want to move data between two 
different environments, be they hosts, memories, disk, process, 
thread, or different languages; so that I can store and process 
data non-locally and in a non-sequential fashion.

For Orange, you can simply say that you have no transport 
support, and perhaps describe why you don't consider it (e.g. 
what use case were you designing for?), and that will suffice to 
answer number three (3).  In addition, there are still eight 
other salient issues.

I provided this outline to assist you in describing your work. 
You'll need to be more specific about where you are confused if 
you don't understand a particular issue. Ignoring issues won't 
make them go away, it will just make others ignore and go away 
from your work.

Aug 14 2013

"Dicebot" <public dicebot.lv> writes:

On Wednesday, 14 August 2013 at 13:28:42 UTC, glycerine wrote:
 Wishful thinking aside, they are competitors.

They are not. `std.serialization` does not and should not compete 
in Thrift domain. In fact, if something like this can be found in 
proposal you should point to it and it will be discussed as a 
possible removal candidate.

Your input is valuable but please reconsider it without this 
wrong assumption.

Aug 14 2013

"glycerine" <donotreply noreply.com> writes:

On Wednesday, 14 August 2013 at 13:43:50 UTC, Dicebot wrote:
 On Wednesday, 14 August 2013 at 13:28:42 UTC, glycerine wrote:
 Wishful thinking aside, they are competitors.

 They are not. `std.serialization` does not and should not 
 compete in Thrift domain.

Huh? Do you know what thrift does? Summary: Everything that
Orange/std.serialization does and more. To the point: Thrift
provides data versioning, std.serialization does not. In my book:
end of story, game over. Thrift is the preferred choice. If you
are going to standardize something, standardize the Thrift
bindings so that the compiler doesn't introduce regressions
that break them, like happened from dmd 2.062 - present.

You don't provide any rationale for your assertion, so I can't
really respond more constructively until you do. Please
familiarize yourself with D's Thrift bindings, which work well
with dmd 2.061. Then provide a rationale for your conjecture.

Aug 17 2013

"BS" <slackovsky gmail.com> writes:

On Saturday, 17 August 2013 at 08:29:37 UTC, glycerine wrote:
 On Wednesday, 14 August 2013 at 13:43:50 UTC, Dicebot wrote:
 On Wednesday, 14 August 2013 at 13:28:42 UTC, glycerine wrote:
 Wishful thinking aside, they are competitors.

 They are not. `std.serialization` does not and should not 
 compete in Thrift domain.

 Huh? Do you know what thrift does? Summary: Everything that
 Orange/std.serialization does and more. To the point: Thrift
 provides data versioning, std.serialization does not. In my 
 book:
 end of story, game over. Thrift is the preferred choice. If you
 are going to standardize something, standardize the Thrift
 bindings so that the compiler doesn't introduce regressions
 that break them, like happened from dmd 2.062 - present.

 You don't provide any rationale for your assertion, so I can't
 really respond more constructively until you do. Please
 familiarize yourself with D's Thrift bindings, which work well
 with dmd 2.061. Then provide a rationale for your conjecture.

I agree built in version support is important.

As for your other issues you mention:

a) do one thing, do it well.
b) modular is better than monolithic.
c) std.serialization is for serialization, no more no less.
d) Thrift is for scalable cross-language services development.
(and much more http://thrift.apache.org/)

Just because Thrift can serialize classes/structs doesn't mean 
std.serialization  should support RPC services or transport of 
serialized data.

I'd rather that was left for a separate module (or two or three) 
built on top of std.serialization.

Aug 17 2013

"David Nadlinger" <code klickverbot.at> writes:

On Saturday, 17 August 2013 at 10:15:34 UTC, BS wrote:
 I'd rather that was left for a separate module (or two or 
 three) built on top of std.serialization.

In an ideal world, Thrift could maybe be built on 
std.serialization, but in the current form that's not true 
(regardless of e.g. versioning, Orange is likely not fast 
enough), and I am not sure whether this is a desirable goal in 
the first place anyway.

David

Aug 18 2013

"Dicebot" <public dicebot.lv> writes:

On Saturday, 17 August 2013 at 08:29:37 UTC, glycerine wrote:
 On Wednesday, 14 August 2013 at 13:43:50 UTC, Dicebot wrote:
 On Wednesday, 14 August 2013 at 13:28:42 UTC, glycerine wrote:
 Wishful thinking aside, they are competitors.

 They are not. `std.serialization` does not and should not 
 compete in Thrift domain.

 Huh? Do you know what thrift does? Summary: Everything that
 Orange/std.serialization does and more. To the point: Thrift
 provides data versioning, std.serialization does not. In my 
 book:
 end of story, game over. Thrift is the preferred choice. If you
 are going to standardize something, standardize the Thrift
 bindings so that the compiler doesn't introduce regressions
 that break them, like happened from dmd 2.062 - present.

 You don't provide any rationale for your assertion, so I can't
 really respond more constructively until you do. Please
 familiarize yourself with D's Thrift bindings, which work well
 with dmd 2.061. Then provide a rationale for your conjecture.

Yes, I know what Thrift does, and that is not what is needed 
here. Important things to consider:

1) Having bindings in the standard library is discouraged; we have 
Deimos for that. There is only the curl stuff, and it is considered 
a bad solution as far as I am aware.

2) Thrift covers a very wide domain of tasks - protocol 
descriptions, inter-operation between different versions, 
cross-language operation. `std.serialization` is about one simple 
task - taking care of D type reflection to load/store them in 
some way.

3) UNIX-way. The standard library must provide small self-sufficient 
components that focus on doing one job and doing it well. It is 
up to the user to combine those components to build the behavior he 
needs in the actual application. In that sense the core of 
std.serialization must be evaluated from the point of view "does 
it allow me to add feature X?" instead of "does it have feature X?"

4) There are a lot of different serialization use cases, and often 
having something like Thrift is huge overkill. A good standard 
library allows the user to choose only the functionality he actually 
needs, no more.

There is nothing wrong with your choice of Thrift - it just does 
not belong in std.serialization.

Aug 17 2013

"David Nadlinger" <code klickverbot.at> writes:

On Saturday, 17 August 2013 at 11:20:17 UTC, Dicebot wrote:
 1) Having bindings in the standard library is discouraged; we have 
 Deimos for that. There is only the curl stuff, and it is considered 
 a bad solution as far as I am aware.

The D implementation of Thrift is actually not a binding and does 
not necessarily rely on the Thrift code generator either – all 
the latter does is to generate a D struct definition for the 
types/method parameters in your .thrift file that is then handled 
at D compile-time via reflection. In fact, this even works the 
other way, allowing you to generate .thrift IDL files for 
existing D types. (And yes, in theory the code generator could be 
replaced by ImportExpressions and a CTFE parser.)

David

Aug 18 2013

"John Colvin" <john.loughran.colvin gmail.com> writes:

On Saturday, 17 August 2013 at 08:29:37 UTC, glycerine wrote:
 On Wednesday, 14 August 2013 at 13:43:50 UTC, Dicebot wrote:
 On Wednesday, 14 August 2013 at 13:28:42 UTC, glycerine wrote:
 Wishful thinking aside, they are competitors.

 They are not. `std.serialization` does not and should not 
 compete in Thrift domain.

 Huh? Do you know what thrift does? Summary: Everything that
 Orange/std.serialization does and more. To the point: Thrift
 provides data versioning, std.serialization does not. In my 
 book:
 end of story, game over. Thrift is the preferred choice.

Thrift is the preferred choice when choosing a library for ALL 
your possible serialization needs and more. However, standard 
library modules are not about including every possible 
convenience; they are about providing solid building blocks for 
creating larger frameworks.

What you're suggesting leads directly to clearly idiotic claims 
like "std.stdio sucks because it doesn't have a printRainbows 
feature".

Aug 17 2013

"David Nadlinger" <code klickverbot.at> writes:

On Saturday, 17 August 2013 at 08:29:37 UTC, glycerine wrote:
 On Wednesday, 14 August 2013 at 13:43:50 UTC, Dicebot wrote:
 On Wednesday, 14 August 2013 at 13:28:42 UTC, glycerine wrote:
 Wishful thinking aside, they are competitors.

 They are not. `std.serialization` does not and should not 
 compete in Thrift domain.

 Huh? Do you know what thrift does? Summary: Everything that
 Orange/std.serialization does and more.

That's actually not true. Thrift does not serialize arbitrary 
object graphs, or any types with indirections, for that matter. 
This is by design, it would be hard to do this efficiently in all 
target languages, and contrary to Orange, performance is the main 
focus of Thrift.

 If you
 are going to standardize something, standardize the Thrift
 bindings so that the compiler doesn't introduce regressions
 that break them, like happened from dmd 2.062 - present.

On a related note, we desperately need to do something about 
this, especially since there seems to be an increased amount of 
interest in Thrift lately. For 2.061 and the previous releases, I 
always tested every beta against Thrift, and almost invariably 
found at least one bug/regression per release. However, for 2.062 
and 2.063, I was busy with LDC (and other things) at the time and 
it seems like I forgot to run the tests.

The DMD 2.062+ error message (see 
https://issues.apache.org/jira/browse/THRIFT-2130) doesn't make 
much sense; I guess the best way of going about this would be to 
try to DustMite-reduce the problem first or to fire up DMD in gdb 
to see what exactly is tripping the recursive alias error.

David

Aug 18 2013

Andrej Mitrovic <andrej.mitrovich gmail.com> writes:

On 8/18/13, David Nadlinger <code klickverbot.at> wrote:
 On Saturday, 17 August 2013 at 08:29:37 UTC, glycerine wrote:
 If you
 are going to standardize something, standardize the Thrift
 bindings so that the compiler doesn't introduce regressions
 that break them, like happened from dmd 2.062 - present.

 On a related note, we desperately need to do something about
 this, especially since there seems to be an increased amount of
 interest in Thrift lately. For 2.061 and the previous releases, I
 always tested every beta against Thrift, and almost invariably
 found at least one bug/regression per release. However, for 2.062
 and 2.063, I was busy with LDC (and other things) at the time and
 it seems like I forgot to run the tests.

I think it would be good if we added Thrift and other test-cases, for
example from the D Templates Book, to the test machines. But since
there's a lot of code maybe the test machines should run the tests
sporadically (e.g. after every #N new commits), otherwise pull
requests would take forever to test.

Alternatively, we could at least try to test these major projects with
release candidates. Normally the project maintainers would do this
themselves, but it's easy to run out of time or just to forget to test
things, and then it's too late (well we have fixup DMD releases now so
it's not too bad).

Aug 18 2013

"David Nadlinger" <code klickverbot.at> writes:

On Sunday, 18 August 2013 at 14:52:04 UTC, Andrej Mitrovic wrote:
 Normally the project maintainers would do this
 themselves, but it's easy to run out of time or just to forget 
 to test
 things, and then it's too late (well we have fixup DMD releases 
 now so it's not too bad).

The big problem with this right now is that quite frequently, you 
run the tests and discover one regression in the beta, file it, 
fix it (or wait for it to get fixed), then run the tests again, 
discover that they still don't pass, etc.

This is not only an annoying and time-intensive job for the 
maintainer of the project (as during beta you have to pretty much 
always be on your toes for a new version to test lest Walter 
decide to make the final release), but this also increases beta 
duration.

One obvious reaction to this (as a project maintainer) would be 
to continuously track Git master and report regressions as they 
arise. However, this is also not always practical, as quite 
often, there is a regression/backwards-incompatible change early 
on in the development process that is not fixed until much later, 
so that multiple issues can still pile up unnoticed.

Having a system that regularly, automatically runs the test 
suites of several larger, well-known D projects with the results 
being readily available to the DMD/druntime/Phobos teams would 
certainly help. But it's also not ideal, since if a project 
starts to fail, the exact nature of the issue (regression in DMD 
or bug in the project, and if the former, a minimal test case) 
can often be hard to track down for somebody not already familiar 
with the code base.

David

Aug 18 2013

"Dicebot" <public dicebot.lv> writes:

On Sunday, 18 August 2013 at 16:33:51 UTC, David Nadlinger wrote:
 ...

Please, don't move too far from the review topic ;) It is a separate 
issue to discuss.

Aug 18 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 8/18/2013 9:33 AM, David Nadlinger wrote:
 Having a system that regularly, automatically runs the test suites of several
 larger, well-known D projects with the results being readily available to the
 DMD/druntime/Phobos teams would certainly help. But it's also not ideal, since
 if a project starts to fail, the exact nature of the issue (regression in DMD
 or bug in the project, and if the former, a minimal test case) can often be
 hard to track down for somebody not already familiar with the code base.

That's exactly the problem. If these large projects are incorporated into the 
autotester, who is going to isolate/fix problems arising with them?

The test suite is designed to be a collection of already-isolated issues, so 
understanding what went wrong shouldn't be too difficult. Note that already it 
is noticeably much harder to debug a phobos unit test gone awry than the other 
tests. A full blown project that nobody understands would fare far worse.

(And the other problem, of course, is the test suite is designed to be runnable 
fairly quickly. Compiling some other large project and running its test suite 
can make the autotester much less useful when the turnaround time increases.)

Putting large projects into the autotester has the implication that development 
and support of those projects has been ceded to the core dev team, i.e. who is 
responsible for it has been badly blurred.

Aug 20 2013

Jacob Carlborg <doob me.com> writes:

On 2013-08-17 10:29, glycerine wrote:

 Huh? Do you know what thrift does? Summary: Everything that
 Orange/std.serialization does and more. To the point: Thrift
 provides data versioning, std.serialization does not. In my book:
 end of story, game over. Thrift is the preferred choice. If you
 are going to standardize something, standardize the Thrift
 bindings so that the compiler doesn't introduce regressions
 that break them, like happened from dmd 2.062 - present.

Orange/std.serialization is capable of serializing more types than 
Thrift is. For example, it will correctly serialize and deserialize slices, 
pointers and so on.

It's easy to implement versioning yourself, something like:

class Foo
{
     int version_;
     int a;
     int b;

     void toData (Serializer serializer, Serializer.Data key)
     {
         serializer.serialize(a, "a");
         serializer.serialize(version_, "version_");

         if (version_ == 2)
             serializer.serialize(b, "b");
     }

     // Do the corresponding in "fromData".
}

If versioning is crucial it can be added.
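
To make the manual approach a bit more concrete, here is a minimal, self-contained sketch including the "fromData" side. It does not use the real Serializer API: `Archive` is a hypothetical stand-in (a plain associative array), and a struct is used instead of a class only to keep the example short.

```d
import std.conv : to;

// Hypothetical stand-in for an archive: field name -> textual value.
alias Archive = string[string];

struct Foo
{
    int version_ = 2;
    int a;
    int b; // added in version 2

    Archive toData() const
    {
        Archive data;
        data["version_"] = version_.to!string;
        data["a"] = a.to!string;

        // Only version 2 and later know about "b".
        if (version_ >= 2)
            data["b"] = b.to!string;
        return data;
    }

    static Foo fromData(const Archive data)
    {
        Foo foo;
        foo.version_ = data["version_"].to!int;
        foo.a = data["a"].to!int;

        // A version 1 archive has no "b"; keep the default value.
        if (foo.version_ >= 2)
            foo.b = data["b"].to!int;
        return foo;
    }
}

unittest
{
    // An archive written by "version 1" code still loads.
    auto old = Foo(1, 10, 0).toData();
    assert("b" !in old);
    auto loaded = Foo.fromData(old);
    assert(loaded.a == 10 && loaded.b == 0);
}
```

The point is only that the version number travels with the data and the reading side branches on it; how the number is stored is up to the archive.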

-- 
/Jacob Carlborg

Aug 18 2013

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Sunday, August 18, 2013 21:45:59 Jacob Carlborg wrote:
 If versioning is crucial it can be added.

I don't know if it's crucial or not, but I know that the Java guys didn't have 
it initially but ended up adding it later, which would imply that they ran 
into problems that made them decide that it should be there. I'd certainly be 
inclined to think that it's better to have it, and it's probably easier to add 
it before it's merged than later. But I don't know how crucial it is.

- Jonathan M Davis

Aug 18 2013

"bsd" <slackovsky gmail.com> writes:

On Sunday, 18 August 2013 at 20:33:01 UTC, Jonathan M Davis wrote:
 On Sunday, August 18, 2013 21:45:59 Jacob Carlborg wrote:
 If versioning is crucial it can be added.

 I don't know if it's crucial or not, but I know that the Java 
 guys didn't have
 it initially but ended up adding it later, which would imply 
 that they ran
 into problems that made them decide that it should be there. 
 I'd certainly be
 inclined to think that it's better to have it, and it's 
 probably easier to add
 it before it's merged than later. But I don't know how crucial 
 it is.

 - Jonathan M Davis

I think this versioning idea is more important for protocol 
buffers, msgpack, Thrift-like libraries that use a separate IDL 
schema and IDL-compiled code. std.serialization uses the D code 
itself to serialize so the version is practically dictated by the 
user. It may as well be manually handled....as long as it 
throws/returns error and doesn't crash if one tries to 
deserialize an archive type into a different/modified D type.

From memory the Protocol Buffers versioning is to ensure schema 
generated code and library are compatible. You get compile errors 
if you try to compile IDL generated code with a newer version of 
the library. Similarly you get runtime errors if you deserialize 
data that was serialized with an older version of the library. 
This is all from memory so I could be wrong...

Orange seems/feels more like the BOOST.serialization to me but 
much better. It's D for a start and allows custom archive types. 
In BOOST, the library stores a version number in the archive for 
each class serialized. This number defaults to 0 but can be set 
by the user via a #define.

http://www.boost.org/doc/libs/1_54_0/libs/serialization/doc/tutorial.html#versioning

I think adding it later can be done without breaking existing 
API, if it is deemed necessary. It just needs to default to 0 or 
something similar when missing from an archive and ensure it 
won't clash with any fields in existing archives.

Aug 19 2013

"David Nadlinger" <code klickverbot.at> writes:

On Monday, 19 August 2013 at 14:47:15 UTC, bsd wrote:
 I think this versioning idea is more important for protocol 
 buffers, msgpack, Thrift-like libraries that use a separate IDL 
 schema and IDL-compiled code. std.serialization uses the D code 
 itself to serialize so the version is practically dictated by 
 the user. It may as well be manually handled....as long as it 
 throws/returns error and doesn't crash if one tries to 
 deserialize an archive type into a different/modified D type.

 From memory the Protocol Buffers versioning is to ensure schema 
 generated code and library are compatible. You get compile 
 errors if you try to compile IDL generated code with a newer 
 version of the library. Similarly you get runtime errors if you 
 deserialize data that was serialized with an older version of 
 the library. This is all from memory so I could be wrong...

Seems like your memory has indeed faded a bit. ;)

Versioning is an integral idea of formats like Protobuf and 
Thrift. For example, see the "A bit of history" section right on 
the doc overview page. [1] You might also want to read through 
the (rather dated) Thrift whitepaper to get an idea about what 
the design constraints for it were. [2]

The main point is that when you have deployed services at the 
scale Google or Facebook work with, you can't just upgrade all 
involved parties simultaneously on a schema change. So, having to 
support multiple versions running along each other is pretty much 
a given, and the best way to deal with that is to build it right 
into your protocols.

David


[1] https://developers.google.com/protocol-buffers/docs/overview
[2] http://thrift.apache.org/static/files/thrift-20070401.pdf

Aug 19 2013

"David Nadlinger" <code klickverbot.at> writes:

On Monday, 19 August 2013 at 19:47:32 UTC, David Nadlinger wrote:
 Versioning is an integral idea of formats like Protobuf and 
 Thrift. For example, see the "A bit of history" section right 
 on the doc overview page. [1] You might also want to read 
 through the (rather dated) Thrift whitepaper to get an idea 
 about what the design constraints for it were. [2]

By the way, to be honest, this is also the main point that makes 
me feel uneasy about including Orbit in Phobos at this point: 
Sure, it has been around for some time, but as far as I can tell, 
not a lot of people are using it right now, and what seems to be 
entirely missing from the docs is a clear design rationale, 
outlining its goals and explaining how Orbit compares to 
well-known existing solutions.

It seems to me that a large part of the discussion in this thread 
can be attributed to that fact, i.e. a lack of 
understanding/agreement what the module is supposed to be in the 
first place.

David

Aug 19 2013

"bsd" <slackovsky gmail.com> writes:

 Seems like your memory has indeed faded a bit. ;)

 Versioning is an integral idea of formats like Protobuf and 
 Thrift. For example, see the "A bit of history" section right 
 on the doc overview page. [1] You might also want to read 
 through the (rather dated) Thrift whitepaper to get an idea 
 about what the design constraints for it were. [2]

 The main point is that when you have deployed services at the 
 scale Google or Facebook work with, you can't just upgrade all 
 involved parties simultaneously on a schema change. So, having 
 to support multiple versions running along each other is pretty 
 much a given, and the best way to deal with that is to build it 
 right into your protocols.

 David


 [1] https://developers.google.com/protocol-buffers/docs/overview
 [2] http://thrift.apache.org/static/files/thrift-20070401.pdf

Getting old! :-)

Thanks for the heads up.

Aug 19 2013

"ilya-stromberg" <ilya-stromberg-2009 yandex.ru> writes:

On Sunday, 18 August 2013 at 19:46:00 UTC, Jacob Carlborg wrote:
 If versioning is crucial it can be added.

Can std.serialization load data if class definition was changed?

For example, we have class "Foo":

class Foo
{
     int a;
     int b;
}

and we serialize it in some file. After that class "Foo" was 
changed:

class Foo
{
     int b;
     int a;
}

Can std.serialization load data from old file to the new class?

Aug 22 2013

Jacob Carlborg <doob me.com> writes:

On 2013-08-22 13:57, ilya-stromberg wrote:

 Can std.serialization load data if class definition was changed?

 For example, we have class "Foo":

 class Foo
 {
      int a;
      int b;
 }

 and we serialize it in some file. After that class "Foo" was changed:

 class Foo
 {
      int b;
      int a;
 }

 Can std.serialization load data from old file to the new class?

Yes. In this case it will use the name of the instance fields when 
searching for values in the archive.
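
Roughly, and only as a sketch of the idea (this is not the actual implementation; `Archive`, `store` and `load` are hypothetical names, and structs are used to keep it short), looking values up by field name means the declaration order does not matter:

```d
import std.conv : to;

// Hypothetical archive: field name -> textual value.
alias Archive = string[string];

// Store every field under its own name.
Archive store(T)(const T value)
{
    Archive data;
    foreach (i, field; value.tupleof)
        data[__traits(identifier, T.tupleof[i])] = field.to!string;
    return data;
}

// Load every field by searching for its name, not its position.
T load(T)(const Archive data)
{
    T result;
    foreach (i, ref field; result.tupleof)
        field = data[__traits(identifier, T.tupleof[i])].to!(typeof(field));
    return result;
}

struct FooV1 { int a; int b; }
struct FooV2 { int b; int a; } // same fields, different order

unittest
{
    auto data = store(FooV1(1, 2));
    auto foo = load!FooV2(data);
    assert(foo.a == 1 && foo.b == 2);
}
```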

-- 
/Jacob Carlborg

Aug 22 2013

"ilya-stromberg" <ilya-stromberg-2009 yandex.ru> writes:

On Thursday, 22 August 2013 at 13:13:48 UTC, Jacob Carlborg wrote:
 On 2013-08-22 13:57, ilya-stromberg wrote:

 Can std.serialization load data if class definition was 
 changed?

 Yes. In this case it will use the name of the instance fields 
 when searching for values in the archive.

Great! What about more difficult cases? For example, we have:

class Foo
{
    int a;
    int b;
}

After changes we have new class:

class Foo
{
    long b;
}

Can std.serialization load data to new class from old file? It 
should ignore "a" and convert "b" from int to long.

Aug 22 2013

Jacob Carlborg <doob me.com> writes:

On 2013-08-22 21:30, ilya-stromberg wrote:

 Great! What about more difficult cases? For example, we have:

 class Foo
 {
     int a;
     int b;
 }

 After changes we have new class:

 class Foo
 {
     long b;
 }

 Can std.serialization load data to new class from old file? It should
 ignore "a" and convert "b" from int to long.

No it can't. It will throw an exception because it cannot find a "long" 
element:

Could not find an element "long" with the attribute "key" with the value "b"

-- 
/Jacob Carlborg

Aug 22 2013

"ilya-stromberg" <ilya-stromberg-2009 yandex.ru> writes:

On Thursday, 22 August 2013 at 19:53:53 UTC, Jacob Carlborg wrote:
 On 2013-08-22 21:30, ilya-stromberg wrote:

 What about more difficult cases?

 No it can't. It will throw an exception because it cannot find 
 a "long" element:

 Could not find an element "long" with the attribute "key" with 
 the value "b"

It's a serious issue. Maybe it's more important than range 
support. For example, I have to change a class (bug fixing, new 
features, etc.), but it stays compatible with the previous version 
(example: it's always possible to convert "int" to "long"). In 
that case I can't use std.serialization and have to write my own 
solution (for example, save data in a csv file).

The easiest way to fix it is to store an Interface Definition of 
the serialized data (it should be generated automatically). For 
example, we can use XML Schema for the Xml Archive. With an 
Interface Definition we can find changes and try to convert data 
to the new format.

Note that glycerine also drew your attention to this point:
http://forum.dlang.org/post/kftlfwcyughhghewqogm forum.dlang.org

 1. Interface Definition Language (IDL): required or not? If 
 not, how do you know the details of what to serialize? If not, how 
 do you handle/support data versioning? If not, how do you 
 interoperate without another language? If yes, which types are 
 supported and what is the syntax and grammar of the IDL?

Ideas?

Aug 23 2013

"Dicebot" <public dicebot.lv> writes:

On Friday, 23 August 2013 at 13:34:04 UTC, ilya-stromberg wrote:
 It's a serious issue. Maybe it's more important than range 
 support. For example, I have to change a class (bug fixing, new 
 features, etc.), but it stays compatible with the previous version 
 (example: it's always possible to convert "int" to "long"). In 
 that case I can't use std.serialization and have to write my own 
 solution (for example, save data in a csv file).

I don't think it is an issue at all. The behavior you want can't be 
defined in a generic way, at least not without a lot of UDA help or 
a similar declarative approach. In other words, the fact that those 
two classes are interchangeable in the context of 
serialization exists only in the mind of the programmer, not in the 
D type system.

More than that, such behavior is seriously out of line with D 
being a strongly typed language. I think the functionality you want 
belongs in a more specialized module, not generic 
std.serialization - maybe even a format-specific one.

Aug 23 2013

"Tyler Jameson Little" <beatgammit gmail.com> writes:

On Friday, 23 August 2013 at 13:39:47 UTC, Dicebot wrote:
 On Friday, 23 August 2013 at 13:34:04 UTC, ilya-stromberg wrote:
 It's a serious issue. Maybe it's more important than range 
 support. For example, I have to change a class (bug fixing, new 
 features, etc.), but it stays compatible with the previous version 
 (example: it's always possible to convert "int" to "long"). In 
 that case I can't use std.serialization and have to write my own 
 solution (for example, save data in a csv file).

 I don't think it is an issue at all. The behavior you want can't be 
 defined in a generic way, at least not without a lot of UDA help 
 or a similar declarative approach. In other words, the fact that 
 those two classes are interchangeable in the context of 
 serialization exists only in the mind of the programmer, not in 
 the D type system.

 More than that, such behavior is seriously out of line with 
 D being a strongly typed language. I think the functionality you 
 want belongs in a more specialized module, not generic 
 std.serialization - maybe even a format-specific one.

What about adding delegate hooks in somewhere? These delegates 
would be called on errors like invalid type or missing field.

I'm not saying this needs to be there in order to release, but 
would this be a direction we'd like to go eventually? I've seen 
similar approaches elsewhere (e.g. Node.js's HTTP parser).

Aug 23 2013

Jacob Carlborg <doob me.com> writes:

On 2013-08-23 16:39, Tyler Jameson Little wrote:

 What about adding delegate hooks in somewhere? These delegates would be
 called on errors like invalid type or missing field.

 I'm not saying this needs to be there in order to release, but would
 this be a direction we'd like to go eventually? I've seen similar
 approaches elsewhere (e.g. Node.js's HTTP parser).

std.serialization already supports delegate hooks for missing values:

https://dl.dropboxusercontent.com/u/18386187/docs/std.serialization/std_serialization_serializer.html#.Serializer.errorCallback
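
For readers who do not want to follow the link, the general shape of such a hook can be sketched as below. This is not the real `errorCallback` signature (see the documentation above for that); `Loader`, `MissingHandler` and `read` are hypothetical names used only for illustration.

```d
import std.conv : to;

// Hypothetical archive: field name -> textual value.
alias Archive = string[string];

// Hypothetical hook type: called with the key that could not be found.
alias MissingHandler = void delegate(string key);

struct Loader
{
    Archive data;
    MissingHandler onMissing;

    // Read a value by key; on a missing key, notify the hook and
    // fall back to a default instead of throwing.
    T read(T)(string key, T fallback = T.init)
    {
        if (auto p = key in data)
            return (*p).to!T;
        if (onMissing !is null)
            onMissing(key);
        return fallback;
    }
}

unittest
{
    string[] missing;
    auto loader = Loader(["a": "1"], (string key) { missing ~= key; });

    assert(loader.read!int("a") == 1);
    assert(loader.read!int("b") == 0); // not in the archive
    assert(missing == ["b"]);          // the hook was told about it
}
```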

-- 
/Jacob Carlborg

Aug 23 2013

"Tyler Jameson Little" <beatgammit gmail.com> writes:

On Friday, 23 August 2013 at 20:29:40 UTC, Jacob Carlborg wrote:
 On 2013-08-23 16:39, Tyler Jameson Little wrote:

 What about adding delegate hooks in somewhere? These delegates 
 would be
 called on errors like invalid type or missing field.

 I'm not saying this needs to be there in order to release, but 
 would
 this be a direction we'd like to go eventually? I've seen 
 similar
 approaches elsewhere (e.g. Node.js's HTTP parser).

 std.serialization already supports delegate hooks for missing 
 values:

 https://dl.dropboxusercontent.com/u/18386187/docs/std.serialization/std_serialization_serializer.html#.Serializer.errorCallback

Awesome!

Aug 23 2013

"ilya-stromberg" <ilya-stromberg-2009 yandex.ru> writes:

On Friday, 23 August 2013 at 13:39:47 UTC, Dicebot wrote:
 On Friday, 23 August 2013 at 13:34:04 UTC, ilya-stromberg wrote:
 It's a serious issue. Maybe it's more important than range 
 support. For example, I have to change a class (bug fixing, new 
 features, etc.), but it stays compatible with the previous version 
 (example: it's always possible to convert "int" to "long"). In 
 that case I can't use std.serialization and have to write my own 
 solution (for example, save data in a csv file).

 I don't think it is an issue at all. The behavior you want can't be 
 defined in a generic way, at least not without a lot of UDA help 
 or a similar declarative approach. In other words, the fact that 
 those two classes are interchangeable in the context of 
 serialization exists only in the mind of the programmer, not in 
 the D type system.

 More than that, such behavior is seriously out of line with 
 D being a strongly typed language. I think the functionality you 
 want belongs in a more specialized module, not generic 
 std.serialization - maybe even a format-specific one.

Maybe you are right.
But I think it's not so difficult to implement, at least for 
simple cases.
We can follow simple rules, for example like this:

Does element "b" exist in the archive? - Yes.
Does element "b" have type "long"? - No, the type is "int".
Can we convert type "int" to "long"? - Yes, load element "b" into a 
temporary variable and convert it to "long":

int _b = 4;
long b = to!long(_b);

Is it difficult to implement?
Also, we can provide a few deserialize models: strict (like 
current behavior) and smart (like example above). May be even 3 
levels: strict, implicit conversions (like int to long) and 
explicit conversions (like long to int).
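
The three levels could be sketched roughly as below. This is only an illustration: `ConversionMode` and `loadAs` are hypothetical names, not part of std.serialization or Orange, and a real implementation would work on archive elements rather than on plain values.

```d
import std.conv : to;

/// Hypothetical policy names; not part of std.serialization or Orange.
enum ConversionMode { strict, implicit, explicit }

/// Converts an archived value of type From to the field type To,
/// or throws if the chosen mode does not allow the conversion.
To loadAs(To, From)(From archived, ConversionMode mode)
{
    static if (is(From == To))
    {
        return archived; // identical types are accepted in every mode
    }
    else
    {
        static if (is(From : To))
        {
            // implicit conversion such as int -> long
            if (mode != ConversionMode.strict)
                return archived;
        }
        else static if (is(typeof(to!To(archived))))
        {
            // explicit conversion such as long -> int; std.conv.to
            // throws if the value does not fit into the target type
            if (mode == ConversionMode.explicit)
                return to!To(archived);
        }

        throw new Exception("type mismatch: archive holds " ~ From.stringof
            ~ ", field expects " ~ To.stringof);
    }
}
```

With this sketch, `loadAs!long(4, ConversionMode.implicit)` accepts the old "int" value, while the same call with `ConversionMode.strict` throws, which mirrors the current behavior.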

Aug 28 2013

"Dicebot" <public dicebot.lv> writes:

On Wednesday, 28 August 2013 at 16:02:09 UTC, ilya-stromberg 
wrote:
 ...

There was a good proposal by Dmitry to separate sequential strict
serialization from random-access serialization as two distinct entities. I
like it, and I think it can also solve your problem.

Aug 28 2013

"ilya-stromberg" <ilya-stromberg-2009 yandex.ru> writes:

On Wednesday, 28 August 2013 at 16:10:03 UTC, Dicebot wrote:
 There was a good proposal by Dmitry to separate sequential
 strict serialization from random-access serialization as two distinct
 entities. I like it, and I think it can also solve your
 problem.


serialization due to this problem - any minimal code change breaks
all previously serialized data. But I used .NET 1; maybe the
current version solves this.
Can you post the link, please?

Aug 28 2013

"Dicebot" <public dicebot.lv> writes:

On Wednesday, 28 August 2013 at 16:19:20 UTC, ilya-stromberg 
wrote:
 Can you post the link, please?

http://forum.dlang.org/post/kvj17t$1ash$1 digitalmars.com

(Rigid vs Flexible part)

Aug 28 2013

Jacob Carlborg <doob me.com> writes:

On 2013-08-28 18:02, ilya-stromberg wrote:

 Maybe you are right.
 But I think it's not so difficult to implement, at least for simple cases.
 We can follow simple rules, for example like this:

 Does element "b" exist in the archive? - Yes.
 Does element "b" have type "long"? - No, the type is "int".
 Can we convert type "int" to "long"? - Yes, load element "b" into a temporary
 variable and convert it to "long":

 int _b = 4;
 long b = to!long(_b);

 Is it difficult to implement?
 Also, we can provide a few deserialization modes: strict (like the current
 behavior) and smart (like the example above). Maybe even 3 levels: strict,
 implicit conversions (like int to long) and explicit conversions (like
 long to int).

I don't think we should add too much of this kind of functionality. 
There's a reason for why it supports custom serialization. This is a 
perfect example.

-- 
/Jacob Carlborg

Aug 28 2013

"ilya-stromberg" <ilya-stromberg-2009 yandex.ru> writes:

On Thursday, 22 August 2013 at 19:53:53 UTC, Jacob Carlborg wrote:
 On 2013-08-22 21:30, ilya-stromberg wrote:

 Great! What about more difficult cases? For example, we have:

 class Foo
 {
    int a;
    int b;
 }

 After changes we have new class:

 class Foo
 {
    long b;
 }

 Can std.serialization load data to new class from old file? It 
 should
 ignore "a" and convert "b" from int to long.

 No it can't. It will throw an exception because it cannot find 
 a "long" element:

 Could not find an element "long" with the attribute "key" with 
 the value "b"

Jacob, can you use clearer error messages and provide more
information in them?
You can print the full class/struct name (via
std.traits.fullyQualifiedName) and the field name and type:
information can not be found in the archive

Please pay attention to this.

Sep 01 2013

"ilya-stromberg" <ilya-stromberg-2009 yandex.ru> writes:

On Sunday, 1 September 2013 at 08:33:51 UTC, ilya-stromberg wrote:
 On Thursday, 22 August 2013 at 19:53:53 UTC, Jacob Carlborg 
 wrote:
 On 2013-08-22 21:30, ilya-stromberg wrote:

 Great! What about more difficult cases? For example, we have:

 class Foo
 {
   int a;
   int b;
 }

 After changes we have new class:

 class Foo
 {
   long b;
 }

 Can std.serialization load data to new class from old file? 
 It should
 ignore "a" and convert "b" from int to long.

 No it can't. It will throw an exception because it cannot find 
 a "long" element:

 Could not find an element "long" with the attribute "key" with 
 the value "b"

 Jacob, can you use clearer error messages and provide more
 information in them?
 You can print the full class/struct name (via
 std.traits.fullyQualifiedName) and the field name and type:
 information can not be found in the archive

 Please pay attention to this.

Sorry, I meant to write:

Could not deserialize the field "b" with type "long" of class
"Foo": information can not be found in the archive.

Sep 01 2013

Jacob Carlborg <doob me.com> writes:

On 2013-09-01 10:35, ilya-stromberg wrote:

 Sorry, I meant to write:

 Could not deserialize the field "b" with type "long" of class "Foo":
 information can not be found in the archive.

Yes, I could enhance the error message.

-- 
/Jacob Carlborg

Sep 01 2013

Jacob Carlborg <doob me.com> writes:

On 2013-08-22 21:30, ilya-stromberg wrote:

 Great! What about more difficult cases? For example, we have:

 class Foo
 {
     int a;
     int b;
 }

 After changes we have new class:

 class Foo
 {
     long b;
 }

 Can std.serialization load data to new class from old file? It should
 ignore "a" and convert "b" from int to long.

Actually, my previous answer was not entirely correct. By default it 
will throw an exception. But you can implement the above using custom 
serialization (here using Orange) :

module main;

import orange.serialization._;
import orange.serialization.archives._;

import std.stdio;

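class Foo : Serializable
{
     long b;

     // Required by the Serializable interface; left empty because
     // this example only deserializes.
     void toData (Serializer serializer, Serializer.Data key)
     {
     }

     // Read the archived "b" as an int; it converts implicitly to long.
     // The archived "a" is never requested, so it is ignored.
     void fromData (Serializer serializer, Serializer.Data key)
     {
         b = serializer.deserialize!(int)("b");
     }
}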

void main ()
{
     auto archive = new XmlArchive!(char);
     auto serializer = new Serializer(archive);

     auto data = `<?xml version="1.0" encoding="UTF-8"?>
     <archive version="1.0.0" type="org.dsource.orange.xml">
         <data>
             <object runtimeType="main.Foo" type="main.Foo" key="0" id="0">
                 <int key="a" id="1">3</int>
                 <int key="b" id="2">4</int>
             </object>
         </data>
     </archive>`;

     auto f = serializer.deserialize!(Foo)(cast(immutable(void)[]) data);
     assert(f.b == 4);
}

-- 
/Jacob Carlborg

Aug 23 2013

"ilya-stromberg" <ilya-stromberg-2009 yandex.ru> writes:

On Friday, 23 August 2013 at 20:28:10 UTC, Jacob Carlborg wrote:
 On 2013-08-22 21:30, ilya-stromberg wrote:

 What about more difficult cases?

 Actually, my previous answer was not entirely correct. By 
 default it will throw an exception. But you can implement the 
 above using custom serialization (here using Orange) :

Great job!
A little question. For example, I would like to load data from a
previous format and store the current version in the default
std.serialization format. So I don't want to implement "toData"
at all. Is that possible? Or can I call the default serialization
method? Something like this:

class Foo : Serializable
{
     long b;

     //I don't want to implement this
     void toData (Serializer serializer, Serializer.Data key)
     {
         serializer.serialize(this);
     }

     void fromData (Serializer serializer, Serializer.Data key)
     {
         b = serializer.deserialize!(int)("b");
     }
}

Also, please add this example to the documentation; it could be
useful for many users.

Note that we can split the Serializable interface into two interfaces:

interface ToSerializable
{
     void toData(Serializer serializer, Serializer.Data key);
}

interface FromSerializable
{
     void fromData(Serializer serializer, Serializer.Data key);
}

interface Serializable : ToSerializable, FromSerializable
{
}

class Foo : FromSerializable
{
     long b;

     void fromData (Serializer serializer, Serializer.Data key)
     {
         b = serializer.deserialize!(int)("b");
     }

     // toData does not have to be implemented
}

Aug 24 2013

Jacob Carlborg <doob me.com> writes:

On 2013-08-24 14:45, ilya-stromberg wrote:

 Great job!
 A little question. For example, I would like to load data from a previous
 format and store the current version in the default std.serialization format.
 So I don't want to implement "toData" at all. Is that possible? Or can I
 call the default serialization method? Something like this:

 class Foo : Serializable
 {
      long b;

      //I don't want to implement this
      void toData (Serializer serializer, Serializer.Data key)
      {
          serializer.serialize(this);
      }

      void fromData (Serializer serializer, Serializer.Data key)
      {
          b = serializer.deserialize!(int)("b");
      }
 }

I actually noticed this problem when I wrote the example. First, the
interface Serializable is not actually necessary, because this is
checked with a template at compile time; it's possible to use
these methods for structs as well. Second, instead of checking for both
"toData" and "fromData" when serializing and deserializing, it should
only check for "toData" when serializing and only for "fromData" when
deserializing.

I'll add this to my todo list.
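
A minimal sketch of what such separate compile-time checks might look like. The names `hasToData`, `hasFromData` and `StubSerializer` are assumptions made for illustration; they are not the templates actually used by Orange or std.serialization.

```d
/// Stand-in for the real Serializer type, only here to make the sketch compile.
struct StubSerializer {}

/// Hypothetical traits: each hook is detected on its own.
enum hasToData(T) = __traits(hasMember, T, "toData");
enum hasFromData(T) = __traits(hasMember, T, "fromData");

/// A struct that customizes deserialization only; no interface is involved.
struct Legacy
{
    long b;

    void fromData(ref StubSerializer serializer, string key)
    {
        b = 4; // a real hook would read the value from the serializer
    }
}

static assert(!hasToData!Legacy);  // serializing would take the default path
static assert(hasFromData!Legacy); // deserializing would call fromData
```

Because the checks are plain `__traits(hasMember, ...)` tests, they work for structs and classes alike, which is the reason an interface is not needed.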

-- 
/Jacob Carlborg

Aug 24 2013

"ilya-stromberg" <ilya-stromberg-2009 yandex.ru> writes:

On Saturday, 24 August 2013 at 17:47:35 UTC, Jacob Carlborg wrote:
 I actually noticed this problem when I wrote the example.
 First, the interface Serializable is not actually necessary,
 because this is checked with a template at compile
 time; it's possible to use these methods for structs as well.
 Second, instead of checking for both "toData" and "fromData"
 when serializing and deserializing, it should only check for
 "toData" when serializing and only for "fromData" when
 deserializing.

In that case maybe we should remove the "Serializable" interface? And
just specify that the user must implement "toData" or "fromData" for
custom serializing or deserializing. Is it possible?

Aug 24 2013

Jacob Carlborg <doob me.com> writes:

On 2013-08-24 21:26, ilya-stromberg wrote:

 In that case maybe we should remove the "Serializable" interface? And just
 specify that the user must implement "toData" or "fromData" for custom
 serializing or deserializing. Is it possible?

Yes, that's what I'm planning to do.

-- 
/Jacob Carlborg

Aug 24 2013

"ilya-stromberg" <ilya-stromberg-2009 yandex.ru> writes:

On Saturday, 24 August 2013 at 19:32:13 UTC, Jacob Carlborg wrote:
 On 2013-08-24 21:26, ilya-stromberg wrote:

 In that case maybe we should remove the "Serializable" interface?
 And just specify that the user must implement "toData" or "fromData"
 for custom serializing or deserializing. Is it possible?

 Yes, that's what I'm planning to do.

Maybe we should rename methods "toData" and "fromData" to avoid 
name collisions? For example, we can use serializeToData and 
deserializeFromData, it will be clearer.

Aug 24 2013

"ilya-stromberg" <ilya-stromberg-2009 yandex.ru> writes:

On Saturday, 24 August 2013 at 17:47:35 UTC, Jacob Carlborg wrote:
 First, the interface Serializable is not actually necessary,
 because this is checked with a template at compile
 time; it's possible to use these methods for structs as well.
 Second, instead of checking for both "toData" and "fromData"
 when serializing and deserializing, it should only check for
 "toData" when serializing and only for "fromData" when
 deserializing.

The name "isSerializable" is TERRIBLE:
https://dl.dropboxusercontent.com/u/18386187/docs/std.serialization/std_serialization_serializable.html#.isSerializable
It only checks whether the functions "toData" and "fromData" exist in a
class or struct. But std.serialization can serialize almost any
data, so please rename the template to something like
"hasCustomSerialization".

The real "isSerializable" must check whether it's possible to
serialize and should look like this:
enum isSerializable(T) = serializer.serialize(T);
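
The one-liner above is pseudocode. One way to express the same idea in compilable D is `__traits(compiles, ...)`. The sketch below assumes a stand-in `DemoSerializer` whose `serialize` accepts only scalars and structs; the real Serializer and its constraints are different.

```d
/// Stand-in for the real serializer; it accepts only scalars and structs here.
struct DemoSerializer
{
    void serialize(T)(T value)
        if (is(T == struct) || __traits(isScalar, T))
    {
    }
}

/// True when T defines both custom hooks; this is what the reviewed
/// isSerializable template checks, under the name proposed in the post.
enum hasCustomSerialization(T) =
    __traits(hasMember, T, "toData") && __traits(hasMember, T, "fromData");

/// True when a call to serialize would compile for a value of type T.
enum isSerializable(T) =
    __traits(compiles, DemoSerializer.init.serialize(T.init));

struct Point { int x; int y; }

static assert(isSerializable!int);
static assert(isSerializable!Point);
static assert(!hasCustomSerialization!Point);
static assert(!isSerializable!(int[string])); // rejected by the stub's constraint
```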

Aug 26 2013

"mrd" <denis.feklushkin gmail.com> writes:

On Thursday, 22 August 2013 at 13:13:48 UTC, Jacob Carlborg wrote:
 On 2013-08-22 13:57, ilya-stromberg wrote:

 Can std.serialization load data from old file to the new class?

 Yes. In this case it will use the name of the instance fields 
 when searching for values in the archive.

Is this the right way?

There are special formats (Protocol Buffers, for example) for a
binary format that can be changed over time without breaking old
code.

But isn't this redundant for normal serialization?
Besides, searching by name is slower than other methods (field
numbers, for example).

Sep 21 2013

Jacob Carlborg <doob me.com> writes:

On 2013-09-21 15:13, mrd wrote:

 Is this the right way?

 There are special formats (Protocol Buffers, for example) for a binary
 format that can be changed over time without breaking old code.

 But isn't this redundant for normal serialization?
 Besides, searching by name is slower than other methods (field
 numbers, for example).

Not necessarily. I could implement it so that by default it uses the field
number, and if the names don't match it could fall back to a lookup by name.

I would like to avoid having a dependency on the order of the fields.
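
A rough sketch of that lookup strategy, assuming a hypothetical flat list of archived entries (`Entry` and `findField` are made-up names, not part of the library):

```d
/// Hypothetical in-memory archive entry: a field name plus its stored text.
struct Entry
{
    string name;
    string value;
}

/// Looks at position `index` first and falls back to a search by name.
/// Returns null when the field is absent.
const(Entry)* findField(const(Entry)[] entries, size_t index, string name)
{
    if (index < entries.length && entries[index].name == name)
        return &entries[index]; // fast path: the field is where it is expected

    foreach (ref entry; entries)
        if (entry.name == name)
            return &entry; // slow path: the field moved, search by name

    return null;
}
```

The name comparison on the fast path keeps the result independent of field order: a reordered class still deserializes correctly, it just takes the slower route.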

-- 
/Jacob Carlborg

Sep 23 2013

"ilya-stromberg" <ilya-stromberg-2009 yandex.ru> writes:

On Tuesday, 13 August 2013 at 15:04:38 UTC, Jacob Carlborg wrote:
 There's a fully working example here:

 https://dl.dropboxusercontent.com/u/18386187/docs/std.serialization/std_serialization_serializer.html

Sorry, I think that example is WRONG!
Your example:

void main ()
{
     auto archive = new XmlArchive!();
     auto serializer = new Serializer;

     auto foo = new Foo;
     foo.a = 3;

     serializer.serialize(foo);
     auto foo2 = serializer.deserialize!(Foo)(archive.untypedData);

     writeln(foo2.a); // prints "3"
     assert(foo.a == foo2.a);
}

As far as I can see, it should be:

     auto archive = new XmlArchive!();
     auto serializer = new Serializer(archive);

Please fix the example or correct me.

Note that the current Ddoc version supports generating
documentation from unittests - it lets us make sure that the
documentation is correct. Shall we require that examples are
generated from unittests?

Aug 13 2013

Jacob Carlborg <doob me.com> writes:

On 2013-08-14 08:29, ilya-stromberg wrote:

 Sorry, I think that example is WRONG!
 Your example:

 void main ()
 {
      auto archive = new XmlArchive!();
      auto serializer = new Serializer;

      auto foo = new Foo;
      foo.a = 3;

      serializer.serialize(foo);
      auto foo2 = serializer.deserialize!(Foo)(archive.untypedData);

      writeln(foo2.a); // prints "3"
      assert(foo.a == foo2.a);
 }

 As far as I can see, it should be:

      auto archive = new XmlArchive!();
      auto serializer = new Serializer(archive);

 Please fix the example or correct me.

Right, good catch.

 Note that the current Ddoc version supports generating documentation from
 unittests - it lets us make sure that the documentation is correct. Shall
 we require that examples are generated from unittests?

How do I do that?

-- 
/Jacob Carlborg

Aug 14 2013

"Dicebot" <public dicebot.lv> writes:

On Wednesday, 14 August 2013 at 07:29:48 UTC, Jacob Carlborg 
wrote:
 Note that the current Ddoc version supports generating
 documentation from unittests - it lets us make sure that the
 documentation is correct. Shall we require that examples are
 generated from unittests?

 How do I do that?

http://dlang.org/changelog.html#documentedunittest
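
For reference, a documented unittest is a unittest block that has its own doc comment and directly follows a documented declaration. A minimal sketch, with `add` being a made-up function used only for illustration:

```d
/// Returns the sum of two integers.
int add(int a, int b)
{
    return a + b;
}

/// The body of this unittest is rendered by Ddoc as an example for `add`,
/// and it is also executed when the module is compiled with -unittest.
unittest
{
    assert(add(2, 3) == 5);
}
```

Since the example in the generated documentation is the same code that the test run executes, a mistake such as constructing the Serializer without its archive would be caught at test time.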

Aug 14 2013

Jacob Carlborg <doob me.com> writes:

On 2013-08-14 09:32, Dicebot wrote:

 http://dlang.org/changelog.html#documentedunittest

Thanks.

-- 
/Jacob Carlborg

Aug 14 2013

Jacob Carlborg <doob me.com> writes:

On 2013-08-12 15:27, Dicebot wrote:

 Jacob, it is probably worth creating a pull request with latest rebased
 version of your proposal to simplify getting a quick overview of
 changes. Also please tell if there is anything you want/need to
 implement before merging.

I have rebased now.

-- 
/Jacob Carlborg

Aug 13 2013

"Tyler Jameson Little" <beatgammit gmail.com> writes:

Serious:

- doesn't use ranges
   - does this store the entire serialized output in memory?
   - I would like to serialize to a range (file?) and deserialize from
a range (file?)

Minor

- Indentation messed up in Serializable example
- Typo: NonSerialized example should read NonSerialized!(b)

Aug 14 2013

Jacob Carlborg <doob me.com> writes:

On 2013-08-14 10:19, Tyler Jameson Little wrote:
 Serious:

 - doesn't use ranges
    - does this store the entire serialized output in memory?

That's up to the archive and how it chooses to implement it. But the current
XmlArchive does so, yes. It becomes quite limited because of std.xml.

    - I would like to serialize to a range (file?) and deserialize from a
 range (file?)

The serialized data is returned as an array, so that is compatible with 
the range interface, it just won't be lazy.

The input data used for deserializing is expected to be a void[]; I don't
think that's compatible with the range interface.

 Minor

 - Indentation messed up in Serializable example

Right, I'll fix that.

 - Typo: NonSerialized example should read NonSerialized!(b)

No, it's not a typo. If you read the documentation you'll see that:

"If no fields or "this" is specified, it indicates that the whole 
class/struct should not be (de)serialized."

-- 
/Jacob Carlborg

Aug 14 2013

"Tyler Jameson Little" <beatgammit gmail.com> writes:

On Wednesday, 14 August 2013 at 08:48:23 UTC, Jacob Carlborg 
wrote:
 On 2013-08-14 10:19, Tyler Jameson Little wrote:
 Serious:

 - doesn't use ranges
   - does this store the entire serialized output in memory?

 That's up to the archive and how it chooses to implement it. But
 the current XmlArchive does so, yes. It becomes quite limited
 because of std.xml.

Well, std.xml needs to be replaced anyway, so it's probably not a 
good limitation to have. It may take some work to replace it 
correctly though...

   - I would like to serialize to a range (file?) and deserialize
 from a range (file?)

 The serialized data is returned as an array, so that is 
 compatible with the range interface, it just won't be lazy.

 The input data used for deserializing is expected to be a void[];
 I don't think that's compatible with the range interface.

I'm mostly interested in reducing memory. If I'm (de)serializing 
a large object or lots of objects, this could become an issue.

Related question: Have you looked at how much this relies on the 
GC?

 Minor

 - Indentation messed up in Serializable example

 Right, I'll fix that.

 - Typo: NonSerialized example should read NonSerialized!(b)

 No, it's not a typo. If you read the documentation you'll see 
 that:

 "If no fields or "this" is specified, it indicates that the 
 whole class/struct should not be (de)serialized."

Ah, missed that.

Aug 14 2013

Jacob Carlborg <doob me.com> writes:

On 2013-08-14 11:15, Tyler Jameson Little wrote:

 Well, std.xml needs to be replaced anyway, so it's probably not a good
 limitation to have. It may take some work to replace it correctly though...

No, but should std.serialization be on hold until std.xml is replaced?

 I'm mostly interested in reducing memory. If I'm (de)serializing a large
 object or lots of objects, this could become an issue.

 Related question: Have you looked at how much this relies on the GC?

I haven't done any measurements. It will use the GC to deserialize 
values that are normally heap allocated, that is: arrays, associative 
arrays, objects, strings and so on. In addition to that, a pointer to 
each deserialized value is stored in an associative array.

-- 
/Jacob Carlborg

Aug 14 2013

"Tove" <tove fransson.se> writes:

On Wednesday, 14 August 2013 at 08:48:23 UTC, Jacob Carlborg 
wrote:
 On 2013-08-14 10:19, Tyler Jameson Little wrote:
 - Typo: NonSerialized example should read NonSerialized!(b)

 No, it's not a typo. If you read the documentation you'll see 
 that:

 "If no fields or "this" is specified, it indicates that the 
 whole class/struct should not be (de)serialized."

I understand the need for Orange to be backwards compatible, but
for std.serialization, why isn't the old-style mixin simply
removed in favor of the UDA?

Furthermore, for "template NonSerialized(Fields...)" there is an
example, while for the new style "struct nonSerialized;" there
isn't!

I find the new style both more intuitive and more DRY, since the
identifier is not duplicated as in "int b; mixin NonSerialized!(b)":

@nonSerialized struct Foo
{
     int a;
     int b;
     int c;
}

struct Bar
{
     int a;
     int b;
     @nonSerialized int c;
}
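
As a hedged illustration of how a serializer could honor such an attribute, the sketch below uses `std.traits.hasUDA` to skip marked fields. The `nonSerialized` struct here is a local stand-in and `serializedFieldNames` is a hypothetical helper; neither is the library's actual implementation.

```d
import std.traits : hasUDA;

/// Local stand-in for the attribute discussed in the thread.
struct nonSerialized {}

struct Bar
{
    int a;
    int b;
    @nonSerialized int c;
}

/// Marking the whole type works the same way.
@nonSerialized struct Skipped
{
    int a;
}

/// Collects the names of the fields that a serializer would visit.
string[] serializedFieldNames(T)()
{
    string[] names;

    // The loop is unrolled at compile time, one iteration per field.
    foreach (i, FieldType; typeof(T.tupleof))
    {
        static if (!hasUDA!(T.tupleof[i], nonSerialized))
            names ~= __traits(identifier, T.tupleof[i]);
    }

    return names;
}
```

Here `serializedFieldNames!Bar()` yields the names of `a` and `b` only, and `hasUDA!(Skipped, nonSerialized)` is true for the type-level form.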

Aug 14 2013

"Tyler Jameson Little" <beatgammit gmail.com> writes:

On Wednesday, 14 August 2013 at 09:17:44 UTC, Tove wrote:
 On Wednesday, 14 August 2013 at 08:48:23 UTC, Jacob Carlborg 
 wrote:
 On 2013-08-14 10:19, Tyler Jameson Little wrote:
 - Typo: NonSerialized example should read NonSerialized!(b)

 No, it's not a typo. If you read the documentation you'll see 
 that:

 "If no fields or "this" is specified, it indicates that the 
 whole class/struct should not be (de)serialized."

 I understand the need for Orange to be backwards compatible, 
 but for std.serialization, why isn't the old-style mixin simply 
 removed in favor of the UDA.

 Furthermore for "template NonSerialized(Fields...)" there is an 
 example, while for the new style "struct nonSerialized;" there 
 isn't!

 I find the new style both more intuitive and more DRY, since the
 identifier is not duplicated as in "int b; mixin NonSerialized!(b)":

 @nonSerialized struct Foo
 {
     int a;
     int b;
     int c;
 }

 struct Bar
 {
     int a;
     int b;
     @nonSerialized int c;
 }

I like this a lot more. Phobos just needs to be compatible with 
the current release, so backwards compat is a non-issue here.

Aug 14 2013

Jacob Carlborg <doob me.com> writes:

On 2013-08-14 11:25, Tyler Jameson Little wrote:

 I like this a lot more. Phobos just needs to be compatible with the
 current release, so backwards compat is a non-issue here.

I guess this is why we have this thread. I would like to hear comments 
from a couple of others as well about this before deciding.

-- 
/Jacob Carlborg

Aug 14 2013

"ilya-stromberg" <ilya-stromberg-2009 yandex.ru> writes:

On Wednesday, 14 August 2013 at 09:28:20 UTC, Jacob Carlborg 
wrote:
 On 2013-08-14 11:25, Tyler Jameson Little wrote:

 I like this a lot more. Phobos just needs to be compatible 
 with the
 current release, so backwards compat is a non-issue here.

 I guess this is why we have this thread. I would like to hear 
 comments from a couple of others as well about this before 
 deciding.

I think we should avoid mixins as much as possible.
The @nonSerialized UDA looks much better, so I think we should use it.
Of course, we can keep template NonSerialized(Fields...) for
backwards compatibility with Orange and, maybe, deprecate it.

Aug 14 2013

Jacob Carlborg <doob me.com> writes:

On 2013-08-14 16:15, ilya-stromberg wrote:

 I think we should avoid mixins as much as possible.
 The @nonSerialized UDA looks much better, so I think we should use it.
 Of course, we can keep template NonSerialized(Fields...) for backwards
 compatibility with Orange and, maybe, deprecate it.

Of course UDA's should be the primary use for this. The question is 
should NonSerialized be included at all? Should it be included and 
deprecated or should it just be included?

-- 
/Jacob Carlborg

Aug 14 2013

"Dicebot" <public dicebot.lv> writes:

On Wednesday, 14 August 2013 at 14:34:58 UTC, Jacob Carlborg 
wrote:
 Should it be included and deprecated or should it just be 
 included?

What is the point of including something into Phobos and 
immediately deprecating it? It is first inclusion, so there are 
no backwards compatibility concerns.

Aug 14 2013

"ilya-stromberg" <ilya-stromberg-2009 yandex.ru> writes:

On Wednesday, 14 August 2013 at 14:34:58 UTC, Jacob Carlborg 
wrote:
 On 2013-08-14 16:15, ilya-stromberg wrote:

 I think we should avoid mixins as much as possible.
 The @nonSerialized UDA looks much better, so I think we should use
 it.
 Of course, we can keep template NonSerialized(Fields...) for
 backwards compatibility with Orange and, maybe, deprecate it.

 Of course UDA's should be the primary use for this. The 
 question is should NonSerialized be included at all? Should it 
 be included and deprecated or should it just be included?

I have not used Orange at all.
But the worst thing for a library is losing backwards
compatibility. So, if Orange uses NonSerialized, we have to
include it. We can just add a documentation notice that
NonSerialized will be deprecated after, for example, 6 months.

Aug 14 2013

"Dicebot" <public dicebot.lv> writes:

On Wednesday, 14 August 2013 at 14:50:45 UTC, ilya-stromberg 
wrote:
 I have not used Orange at all.
 But the worst thing for a library is losing backwards
 compatibility. So, if Orange uses NonSerialized, we have to
 include it. We can just add a documentation notice that
 NonSerialized will be deprecated after, for example, 6 months.

std.serialization is not Orange and should not be considered as 
one. Once it is included into Phobos it is a brand new library 
and must be treated as such, no matter what origin it has. Users 
of Orange expect compatibility from Orange, not Phobos.

Aug 14 2013

"Jonathan M Davis" <jmdavisProg gmx.com> writes:

On Wednesday, August 14, 2013 16:54:59 Dicebot wrote:
 std.serialization is not Orange and should not be considered as
 one. Once it is included into Phobos it is a brand new library
 and must be treated as such, no matter what origin it has. Users
 of Orange expect compatibility from Orange, not Phobos.

Agreed.

- Jonathan M Davis

Aug 14 2013

"Kapps" <opantm2+spam gmail.com> writes:

On Wednesday, 14 August 2013 at 14:34:58 UTC, Jacob Carlborg 
wrote:
 On 2013-08-14 16:15, ilya-stromberg wrote:

 I think we should avoid mixins as much as possible.
 The @nonSerialized UDA looks much better, so I think we should use
 it.
 Of course, we can keep template NonSerialized(Fields...) for
 backwards compatibility with Orange and, maybe, deprecate it.

 Of course UDA's should be the primary use for this. The 
 question is should NonSerialized be included at all? Should it 
 be included and deprecated or should it just be included?

I don't think it should be included. The UDAs replace it nicely, 
and though std.serialization would be essentially Orange it's 
still a different library. Some breaking changes are to be 
expected, and in this case I think worth-while. Having multiple 
ways of specifying options such as non-serialized is confusing. 
If you had UDAs available to you from the start, would you have 
included the mixins? If not, why include them now? This is 
essentially a fresh start, one inside Phobos rather than a 
separate library.

Aug 14 2013

Jacob Carlborg <doob me.com> writes:

On 2013-08-14 17:08, Kapps wrote:

 If you had UDAs available to you from the start, would you have included the
mixins?

No.

 If not, why include them now?

You do have a point.

-- 
/Jacob Carlborg

Aug 14 2013

Jacob Carlborg <doob me.com> writes:

On 2013-08-14 11:17, Tove wrote:

 I understand the need for Orange to be backwards compatible, but for
 std.serialization, why isn't the old-style mixin simply removed in favor
 of the UDA.

I don't know, it doesn't really hurt to be present. And for anyone using 
Orange they only need to change the imports to have it work with 
std.serialization.

 Furthermore for "template NonSerialized(Fields...)" there is an example,
 while for the new style "struct nonSerialized;" there isn't!

Good point, I'll add an example.

 I find the new style both more intuitive and more DRY, since the
 identifier is not duplicated as in "int b; mixin NonSerialized!(b)":

 @nonSerialized struct Foo
 {
      int a;
      int b;
      int c;
 }

 struct Bar
 {
      int a;
      int b;
      @nonSerialized int c;
 }

Absolutely.

-- 
/Jacob Carlborg

Aug 14 2013

"ilya-stromberg" <ilya-stromberg-2009 yandex.ru> writes:

On Wednesday, 14 August 2013 at 09:26:55 UTC, Jacob Carlborg 
wrote:
 On 2013-08-14 11:17, Tove wrote:
 I find the new style both more intuitive and more DRY, since the
 identifier is not duplicated as in "int b; mixin NonSerialized!(b)":

 @nonSerialized struct Foo
 {
     int a;
     int b;
     int c;
 }

 struct Bar
 {
     int a;
     int b;
     @nonSerialized int c;
 }

 Absolutely.

Jacob, can you add a "@serializationName(string name)" UDA?
I saw the custom serialization example from documentation:
https://dl.dropboxusercontent.com/u/18386187/docs/std.serialization/std_serialization_serializable.html#.Serializable

class Foo : Serializable
{
     int a;

     void toData (Serializer serializer, Serializer.Data key)
     {
         serializer.serialize(a, "b");
     }

     void fromData (Serializer serializer, Serializer.Data key)
     {
         a = serializer.deserialize!(int)("b");
     }
}

With the "@serializationName(string name)" attribute, the example should
look like this:

class Foo
{
     @serializationName("b")
     int a;
}

Or for the class/struct name:

@serializationName("Bar")
class Foo
{
     int a;
}

I think it's easier to use than custom serialization. And the
"@nonSerialized" UDA is used for the same purpose - to simplify
serialization customization.

Is it possible to implement?
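
A sketch of how such an attribute could be read at compile time with `std.traits.getUDAs`. The `serializationName` struct and the `archiveKey` helper are assumptions for illustration only; the attribute did not exist in the reviewed module.

```d
import std.traits : getUDAs;

/// Hypothetical attribute carrying the name to use in the archive.
struct serializationName
{
    string name;
}

class Foo
{
    @serializationName("b") int a;
    int c;
}

/// Returns the archive key for field number `i` of T: the attribute's
/// name when one is attached, otherwise the field's own identifier.
string archiveKey(T, size_t i)()
{
    alias attrs = getUDAs!(T.tupleof[i], serializationName);

    static if (attrs.length > 0)
        return attrs[0].name;
    else
        return __traits(identifier, T.tupleof[i]);
}
```

With this, field `a` is stored under the key "b" while field `c` keeps its own name, without any toData/fromData boilerplate.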

Sep 04 2013

Jacob Carlborg <doob me.com> writes:

On 2013-09-04 14:37, ilya-stromberg wrote:

 Jacob, can you add a "@serializationName(string name)" UDA?
 I saw the custom serialization example from documentation:
 https://dl.dropboxusercontent.com/u/18386187/docs/std.serialization/std_serialization_serializable.html#.Serializable


 class Foo : Serializable
 {
      int a;

      void toData (Serializer serializer, Serializer.Data key)
      {
          serializer.serialize(a, "b");
      }

      void fromData (Serializer serializer, Serializer.Data key)
      {
          a = serializer.deserialize!(int)("b");
      }
 }

 With the "@serializationName(string name)" attribute, the example should look
 like this:

 class Foo
 {
      @serializationName("b")
      int a;
 }

 Or for the class/struct name:

 @serializationName("Bar")
 class Foo
 {
      int a;
 }

 I think it's easier to use than custom serialization. And the
 "@nonSerialized" UDA is used for the same purpose - to simplify serialization
 customization.

@nonSerialized is already available, at the bottom of the link you posted.

 Is it possible to implement?

Yes, the question is how much of these customization should be 
supported. It's easy to add at a later time if I don't add it from the 
beginning.

-- 
/Jacob Carlborg

Sep 04 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 8/14/13 1:48 AM, Jacob Carlborg wrote:
 On 2013-08-14 10:19, Tyler Jameson Little wrote:
    - I would like to serialize to a range (file?) and deserialize from a
 range (file?)

 The serialized data is returned as an array, so that is compatible with
 the range interface, it just won't be lazy.

This seems like a major limitation. (Disclaimer: I haven't read the 
documentation yet.)

Andrei

Aug 14 2013

Jacob Carlborg <doob me.com> writes:

On 2013-08-14 18:25, Andrei Alexandrescu wrote:

 This seems like a major limitation. (Disclaimer: I haven't read the
 documentation yet.)

The data is built up as a DOM (with the XmlArchive) using std.xml. How
should I get a range out of that?

-- 
/Jacob Carlborg

Aug 14 2013

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 8/14/13 12:10 PM, Jacob Carlborg wrote:
 On 2013-08-14 18:25, Andrei Alexandrescu wrote:

 This seems like a major limitation. (Disclaimer: I haven't read the
 documentation yet.)

 The data is built up as a DOM (with the XmlArchive) using std.xml. How
 should I get a range out of that?

I'm thinking some people may need to stream to/from large files and 
would find the requirement of in-core representation limiting.

Andrei

Aug 14 2013

Jacob Carlborg <doob me.com> writes:

On 2013-08-14 21:11, Andrei Alexandrescu wrote:

 I'm thinking some people may need to stream to/from large files and
 would find the requirement of in-core representation limiting.

Yes, I understand that. But currently I'm limited by std.xml.

-- 
/Jacob Carlborg

Aug 14 2013

"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:

On Wed, Aug 14, 2013 at 09:23:50PM +0200, Jacob Carlborg wrote:
 On 2013-08-14 21:11, Andrei Alexandrescu wrote:
 
I'm thinking some people may need to stream to/from large files and
would find the requirement of in-core representation limiting.

 
 Yes, I understand that. But currently I'm limited by std.xml.

[...]

Would it be possible to present a range *interface*, that currently just
converts to array (or aliases to array if it's already an array), so
that in the future, when std.xml is revamped to use ranges, existing
code will automatically reap the benefits?
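
A minimal sketch of that idea, assuming a hypothetical `ArchiveData` wrapper type (the name and layout are made up here, not taken from std.serialization). Callers only see the input range primitives, so the eager storage behind them could later be swapped for a lazy one:

```d
import std.range.primitives : isInputRange;

// Hypothetical wrapper: today it simply holds the whole serialized string,
// but callers only ever use empty/front/popFront.
struct ArchiveData
{
    private string buffer; // eager storage for now

    @property bool empty() const { return buffer.length == 0; }
    @property char front() const { return buffer[0]; }
    void popFront() { buffer = buffer[1 .. $]; }
}

static assert(isInputRange!ArchiveData);

void main()
{
    import std.stdio : write;

    auto data = ArchiveData("<archive/>");
    foreach (c; data) // would keep working if ArchiveData became lazy
        write(c);
}
```

Code written against `ArchiveData` never touches the underlying array directly, which is what would allow the implementation to change without breaking users.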


T

-- 
Today's society is one of specialization: as you grow, you learn more and more
about less and less. Eventually, you know everything about nothing.

Aug 14 2013

Jacob Carlborg <doob me.com> writes:

On 2013-08-14 21:40, H. S. Teoh wrote:

 Would it be possible to present a range *interface*, that currently just
 converts to array (or aliases to array if it's already an array), so
 that in the future, when std.xml is revamped to use ranges, existing
 code will automatically reap the benefits?

Since std.xml returns the data as a string, you mean I just forward the 
range interface to this string?

-- 
/Jacob Carlborg

Aug 14 2013

"ilya-stromberg" <ilya-stromberg-2009 yandex.ru> writes:

On Wednesday, 14 August 2013 at 19:23:51 UTC, Jacob Carlborg 
wrote:
 On 2013-08-14 21:11, Andrei Alexandrescu wrote:

 I'm thinking some people may need to stream to/from large 
 files and
 would find the requirement of in-core representation limiting.

 Yes, I understand that. But currently I'm limited by std.xml.

Can you use another serialization format and support file output
for it? For example, can you use JSON, BSON or a binary format?

Aug 14 2013

"Tyler Jameson Little" <beatgammit gmail.com> writes:

On Wednesday, 14 August 2013 at 19:55:52 UTC, ilya-stromberg 
wrote:
 On Wednesday, 14 August 2013 at 19:23:51 UTC, Jacob Carlborg 
 wrote:
 On 2013-08-14 21:11, Andrei Alexandrescu wrote:

 I'm thinking some people may need to stream to/from large 
 files and
 would find the requirement of in-core representation limiting.

 Yes, I understand that. But currently I'm limited by std.xml.

 Can you use another serialization format and support file
 output for it? For example, can you use JSON, BSON or a binary
 format?

That's often not possible, especially when working with an 
external API.

When working with large files, it's much better to read the file 
in chunks so you can be processing the data while the platters 
are seeking. This isn't as big of a problem with SSDs, but you 
still have to wait for the OS. RAM usage is also an issue, but 
for me it's less of an issue than waiting for I/O.

Even if rotating media were to be phased out, there's still the 
problem of streaming data over a network.

std.xml will be replaced, but it shouldn't require breaking code 
to fix std.serialize.

Aug 14 2013

Jacob Carlborg <doob me.com> writes:

On 2013-08-14 21:55, ilya-stromberg wrote:

 Can you use another serialization format and support file output for
 it? For example, can you use JSON, BSON or a binary format?

The idea of the library is that it can support multiple archive types. 
Currently only XML is implemented. I have been working on a binary 
archive for a while but I haven't finished it yet.

The actual format has nothing to do with file output.

-- 
/Jacob Carlborg

Aug 15 2013

"ilya-stromberg" <ilya-stromberg-2009 yandex.ru> writes:

On Thursday, 15 August 2013 at 07:07:13 UTC, Jacob Carlborg wrote:
 On 2013-08-14 21:55, ilya-stromberg wrote:

 Can you use another serialization format and support file
 output for it? For example, can you use JSON, BSON or a binary format?

 The idea of the library is that it can support multiple archive 
 types. Currently only XML is implemented. I have been working 
 on a binary archive for a while but I haven't finished it yet.

Jacob, are you close to finishing the work on the binary archive?

Aug 21 2013

Jacob Carlborg <doob me.com> writes:

On 2013-08-21 16:39, ilya-stromberg wrote:

 Jacob, are you close to finishing the work on the binary archive?

No, I don't think so.

-- 
/Jacob Carlborg

Aug 21 2013

"mrd" <denis.feklushkin gmail.com> writes:

On Thursday, 15 August 2013 at 07:07:13 UTC, Jacob Carlborg wrote:
 On 2013-08-14 21:55, ilya-stromberg wrote:

 Can you use another serialization format and support file
 output for it? For example, can you use JSON, BSON or a binary format?

 The idea of the library is that it can support multiple archive 
 types. Currently only XML is implemented. I have been working 
 on a binary archive for a while but I haven't finished it yet.

I am also working on my own binary archive implementation (I just 
want to quickly get a serialization of integral types, arrays and 
structs into binary) and I have a question:

What is the purpose of the "keys"?
If they are really needed, could they be made optional so that they
can be disabled?
(I also see that they force a lot of duplicate functions to be
added.)

I guess that if this succeeds, the binary format could be made as
compact (and possibly as fast) as Protocol Buffers.

Sep 21 2013

Jacob Carlborg <doob me.com> writes:

On 2013-09-21 14:48, mrd wrote:

 What is the purpose of the "keys"?

Fields are looked up by name. This is to avoid a dependency on the order
of the fields. I guess I can look up by field order instead and fall back
to a name lookup if a name doesn't match.
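
To illustrate the difference between the two lookup strategies, here is a small sketch. The helpers `fillByName` and `fillByOrder` and the string-based "archive" are invented for this example and are not how std.serialization stores data:

```d
import std.conv : to;

struct Foo
{
    int a;
    string b;
}

// Fills `value` from a name -> text table, so field order is irrelevant.
void fillByName(T)(ref T value, string[string] archived)
{
    foreach (i, ref field; value.tupleof)
    {
        enum name = __traits(identifier, T.tupleof[i]);
        if (auto text = name in archived)
            field = to!(typeof(field))(*text);
    }
}

// Fills `value` positionally; archive and type must agree on field order.
void fillByOrder(T)(ref T value, string[] archived)
{
    foreach (i, ref field; value.tupleof)
        field = to!(typeof(field))(archived[i]);
}

void main()
{
    Foo byName, byOrder;
    fillByName(byName, ["b": "hello", "a": "3"]); // keys in any order
    fillByOrder(byOrder, ["3", "hello"]);         // must match declaration order
    assert(byName.a == 3 && byName.b == "hello");
    assert(byOrder.a == 3 && byOrder.b == "hello");
}
```

The positional variant needs no keys in the archive, which is where the size saving would come from, at the cost of breaking when fields are reordered.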

 If they are really needed, could they be made optional so that they can
 be disabled? (I also see that they force a lot of duplicate functions to
 be added.)

Yeah, I guess so.

 I guess that if this succeeds, the binary format could be made as compact
 (and possibly as fast) as Protocol Buffers.

I'm working on a binary archive as well. It ignores the name and looks up
by field order instead. It assumes that the archived data and the
class/struct have the same field order.

-- 
/Jacob Carlborg

Sep 23 2013

"ilya-stromberg" <ilya-stromberg-2009 yandex.ru> writes:

On Wednesday, 14 August 2013 at 16:25:21 UTC, Andrei Alexandrescu 
wrote:
 On 8/14/13 1:48 AM, Jacob Carlborg wrote:
 On 2013-08-14 10:19, Tyler Jameson Little wrote:
   - I would like to serialize to a range (file?) and deserialize
 from a range (file?)

 The serialized data is returned as an array, so that is 
 compatible with
 the range interface, it just won't be lazy.

 This seems like a major limitation. (Disclaimer: I haven't read 
 the documentation yet.)

 Andrei

Shall we fix it before accepting std.serialization?

For example, if I have 10GB of data and 16GB of RAM, I
can't use std.serialization. It saves all my data into a string
in memory, so I don't have enough memory left to save the
data to a file. It's currently limited by std.xml.

On the other hand, std.serialization can help in many other cases if
I have enough memory to store a copy of my data.

As I see it, we have a few options:
- accept std.serialization as is. If users can't use
std.serialization due to memory limitations, they should find another
way.
- hold std.serialization until we have a new std.xml module
with support for range/file input/output. Users should use Orange
if they need std.serialization right now.
- hold std.serialization until we have a binary archive for
serialization with support for range/file input/output. Users
should use Orange if they need std.serialization right now.
- use another XML library, for example the one from Tango.

Ideas?

Aug 18 2013

Marek Janukowicz <marek janukowicz.net> writes:

ilya-stromberg wrote:
 The serialized data is returned as an array, so that is
 compatible with
 the range interface, it just won't be lazy.

 This seems like a major limitation. (Disclaimer: I haven't read
 the documentation yet.)

 Andrei

 
 Shall we fix it before accepting std.serialization?
 
 For example, if I have 10GB of data and 16GB of RAM, I
 can't use std.serialization. It saves all my data into a string
 in memory, so I don't have enough memory left to save the
 data to a file. It's currently limited by std.xml.
 
 On the other hand, std.serialization can help in many other cases if
 I have enough memory to store a copy of my data.
 
 As I see it, we have a few options:
 - accept std.serialization as is. If users can't use
 std.serialization due to memory limitations, they should find another
 way.
 - hold std.serialization until we have a new std.xml module
 with support for range/file input/output. Users should use Orange
 if they need std.serialization right now.
 - hold std.serialization until we have a binary archive for
 serialization with support for range/file input/output. Users
 should use Orange if they need std.serialization right now.
 - use another XML library, for example the one from Tango.

My opinion is - accept it as it is (if it's not completely broken). I
recently needed some way to serialize a data structure (in order to save the
state of the app and restore it later) and was quite disappointed that there is
nothing like that in Phobos. Although XML is not necessarily well suited to
my particular use case, it's still better than nothing.

A binary archive would be a great plus, but allow me to point out that the current
state of affairs (std.serialization being in a pre-accepted state for a long
time AFAIK) is probably the worst state we could have - on the one hand I
would not use third-party libs, because std.serialization is just around the
corner, on the other I don't have std.serialization distributed with the
compiler yet. Also, a binary archive is an extension, not a change, so I don't
see any reason why it could not be added later (because it would be backward
compatible).

-- 
Marek Janukowicz

Aug 18 2013

Andrej Mitrovic <andrej.mitrovich gmail.com> writes:

On 8/18/13, Marek Janukowicz <marek janukowicz.net> wrote:
 I recently needed some way to serialize a data structure (in order to save the
 state of the app and restore it later) and was quite disappointed that there is
 nothing like that in Phobos.

FWIW you could try out msgpack-d: https://github.com/msgpack/msgpack-d#usage

It's a very tiny and fast library.

Aug 18 2013

Marek Janukowicz <marek janukowicz.net> writes:

Andrej Mitrovic wrote:
 I recently needed some way to serialize a data structure (in order to
 save the state of the app and restore it later) and was quite
 disappointed that there is nothing like that in Phobos.

 
 FWIW you could try out msgpack-d:
 https://github.com/msgpack/msgpack-d#usage
 
 It's a very tiny and fast library.

That's what I ended up using, but I would be much happier to have
something like this in Phobos.

-- 
Marek Janukowicz

Aug 18 2013

"Tobias Pankrath" <tobias pankrath.net> writes:

On Sunday, 18 August 2013 at 08:38:53 UTC, ilya-stromberg wrote:
 As I see it, we have a few options:
 - accept std.serialization as is. If users can't use
 std.serialization due to memory limitations, they should find
 another way.
 - hold std.serialization until we have a new std.xml module
 with support for range/file input/output. Users should use
 Orange if they need std.serialization right now.
 - hold std.serialization until we have a binary archive for
 serialization with support for range/file input/output. Users
 should use Orange if they need std.serialization right now.
 - use another XML library, for example the one from Tango.

 Ideas?

We should add a suitable range interface, even if it makes no
sense with the current std.xml, and include std.serialization now. For
many use cases it will be sufficient, and the improvements can
come when std.xml2 comes. Holding back std.serialization will
only mean that we won't see any new backends from users; it would
be quite unfair to Jacob and may put off other contributors.

Aug 18 2013

"Tyler Jameson Little" <beatgammit gmail.com> writes:

On Sunday, 18 August 2013 at 14:24:38 UTC, Tobias Pankrath wrote:
 On Sunday, 18 August 2013 at 08:38:53 UTC, ilya-stromberg wrote:
 As I see it, we have a few options:
 - accept std.serialization as is. If users can't use
 std.serialization due to memory limitations, they should find
 another way.
 - hold std.serialization until we have a new std.xml module
 with support for range/file input/output. Users should use
 Orange if they need std.serialization right now.
 - hold std.serialization until we have a binary archive for
 serialization with support for range/file input/output. Users
 should use Orange if they need std.serialization right now.
 - use another XML library, for example the one from Tango.

 Ideas?

 We should add a suitable range interface, even if it makes no
 sense with the current std.xml, and include std.serialization now.
 For many use cases it will be sufficient, and the improvements
 can come when std.xml2 comes. Holding back std.serialization
 will only mean that we won't see any new backends from users; it
 would be quite unfair to Jacob and may put off other
 contributors.

I completely agree.

I'm the one that brought it up, and I mostly brought it up so the 
API doesn't have to change once std.xml is fixed. I don't think 
changing the return type to a range will be too difficult or 
memory expensive.

Also, since slices *are* ranges, shouldn't this just work?
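
As a quick check of that claim, the following sketch uses only the standard range traits. It shows that built-in slices already satisfy the range concepts, with the caveat that a slice is still fully in memory, so this gives the interface but not laziness:

```d
import std.range.primitives : isInputRange, isRandomAccessRange, ElementType;

// Built-in slices satisfy the range concepts out of the box.
static assert(isInputRange!string);
static assert(isRandomAccessRange!(ubyte[]));

// Narrow strings are iterated by code point (auto-decoding), so the element
// type seen through the range interface is dchar rather than char.
static assert(is(ElementType!string == dchar));

void main()
{
    import std.algorithm.searching : count;

    // Any range algorithm accepts the slice directly.
    string xml = "<a><b/></a>";
    assert(xml.count('<') == 3);
}
```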

Aug 18 2013

Jacob Carlborg <doob me.com> writes:

On 2013-08-18 10:38, ilya-stromberg wrote:

 - use another xml library, for example from Tango.

The XML module from Tango expects the content to be in memory as well,
at least the Document module.

-- 
/Jacob Carlborg

Aug 18 2013

"Jesse Phillips" <Jesse.K.Phillips+D gmail.com> writes:

I'd like to start off by saying I don't really know what I want 

serializer and dealt with Protocol Buffers.

This library looks to be providing a means to serialize any D 
data structure. It deals with pointers/class/struct/arrays... It 
is export format agnostic, while currently only XML is available, 
allowing for export to JSON or some binary form. Afterwards the 
data can return to the program through deserialization.

This is a use-case I don't think I've needed. Though I do see the 
value in it and would expect Phobos to provide such functionality.

What I'm not finding in this library is a way to support a 3rd 
party protocol. Such as those used in Thrift or Protocol Buffers. 
These specify some aspects of data layout, for example in 
Protocol Buffers arrays of primitives can be laid out in two 
forms [ID][value][ID][value] or [ID][Length][value][value].

Thrift and Protocol Buffers use code generation to create the 
language data type, and at least for Protocol Buffers a method 
contains all the logic for deserializing a collection of bytes, 
and one for serializing. I'm not seeing how std.serialize would 
make this easier or more usable.

When looking at the Archive module, I see that all the specific 
types get their own void function. I'm unclear on where these 
functions are supposed to archive to, and container types take a 
delegate which I suppose is a means for the archiver to place 
output around the field data.

In conclusion, I don't feel like I've said very much. I don't 
think std.serialize is applicable to Protocol Buffers or Thrift, 
and I don't know what benefit there would be if it was.

Aug 17 2013

Jacob Carlborg <doob me.com> writes:

On 2013-08-18 01:31, Jesse Phillips wrote:
 I'd like to start off by saying I don't really know what I want from a

 dealt with Protocol Buffers.

 This library looks to be providing a means to serialize any D data
 structure. It deals with pointers/class/struct/arrays... It is export
 format agnostic, while currently only XML is available, allowing for
 export to JSON or some binary form. Afterwards the data can return to
 the program through deserialization.

 This is a use-case I don't think I've needed. Though I do see the value
 in it and would expect Phobos to provide such functionality.

 What I'm not finding in this library is a way to support a 3rd party
 protocol. Such as those used in Thrift or Protocol Buffers. These
 specify some aspects of data layout, for example in Protocol Buffers
 arrays of primitives can be laid out in two forms [ID][value][ID][value]
 or [ID][Length][value][value].

I have had a brief look at Protocol Buffers and I don't see why it 
wouldn't work as an archive. I would probably need to implement a 
Protocol Buffers archive type to see what the limitations of 
std.serialization and Protocol Buffers are.

 Thrift and Protocol Buffers use code generation to create the language
 data type, and at least for Protocol Buffers a method contains all the
 logic for deserializing a collection of bytes, and one for serializing.
 I'm not seeing how std.serialize would make this easier or more usable.

If a Thrift or Protocol Buffers archive would be used with 
std.serialization I'm thinking that one would skip that step and have 
the data types defined directly in D.

 When looking at the Archive module, I see that all the specific types
 get their own void function. I'm unclear on where these functions are
 supposed to archive to

The archive holds the data. When the serialization is complete the data 
can be accessed using "archive.data".

, and container types take a delegate which I
 suppose is a means for the archiver to place output around the field data.

Yes, exactly. It lets the archive know where a structured type begins 
and ends.
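
A rough sketch of how such a delegate-based method can bracket the field data. `TinyXmlArchive` and its method names are invented for illustration and only loosely follow the description above, not the real Archive interface:

```d
import std.array : Appender;
import std.conv : to;

// Hypothetical miniature archive for illustration only.
final class TinyXmlArchive
{
    private Appender!string buffer;

    // The delegate archives the fields; the archive brackets the result,
    // which is how it learns where the structured type begins and ends.
    void archiveStruct(string name, void delegate() archiveFields)
    {
        buffer.put("<" ~ name ~ ">");
        archiveFields();
        buffer.put("</" ~ name ~ ">");
    }

    void archiveInt(string key, int value)
    {
        buffer.put("<int key=\"" ~ key ~ "\">" ~ value.to!string ~ "</int>");
    }

    // The accumulated output, similar in spirit to "archive.data".
    @property string data() { return buffer.data; }
}

void main()
{
    auto archive = new TinyXmlArchive;
    archive.archiveStruct("Foo", { archive.archiveInt("a", 3); });
    assert(archive.data == `<Foo><int key="a">3</int></Foo>`);
}
```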

 In conclusion, I don't feel like I've said very much. I don't think
 std.serialize is applicable to Protocol Buffers or Thrift, and I don't
 know what benefit there would be if it was.


-- 
/Jacob Carlborg

Aug 19 2013

"ilya-stromberg" <ilya-stromberg-2009 yandex.ru> writes:

On Monday, 19 August 2013 at 12:49:48 UTC, Jacob Carlborg wrote:
 I have had a brief look at Protocol Buffers and I don't see why 
 it wouldn't work as an archive. I would probably need to 
 implement a Protocol Buffers archive type to see what the 
 limitations of std.serialization and Protocol Buffers are.

You can find the Protocol Buffers library here, maybe it helps:
https://256.makerslocal.org/wiki/index.php/ProtocolBuffer

Aug 19 2013

"Jesse Phillips" <Jesse.K.Phillips+D gmail.com> writes:

On Monday, 19 August 2013 at 13:17:48 UTC, ilya-stromberg wrote:
 On Monday, 19 August 2013 at 12:49:48 UTC, Jacob Carlborg wrote:
 I have had a brief look at Protocol Buffers and I don't see 
 why it wouldn't work as an archive. I would probably need to 
 implement a Protocol Buffers archive type to see what the 
 limitations of std.serialization and Protocol Buffers are.

 You can find the Protocol Buffers library here, maybe it helps:
 https://256.makerslocal.org/wiki/index.php/ProtocolBuffer

Code has moved to https://github.com/opticron/ProtocolBuffer

Aug 19 2013

Jacob Carlborg <doob me.com> writes:

On 2013-08-19 17:41, Jesse Phillips wrote:

 Code has moved to https://github.com/opticron/ProtocolBuffer

Does it have any utility functions that are fairly standalone to handle 
the basic types, i.e. int, string, float and so on?

-- 
/Jacob Carlborg

Aug 19 2013

"Jesse Phillips" <Jesse.K.Phillips+D gmail.com> writes:

On Monday, 19 August 2013 at 16:29:54 UTC, Jacob Carlborg wrote:
 On 2013-08-19 17:41, Jesse Phillips wrote:

 Code has moved to https://github.com/opticron/ProtocolBuffer

 Does it have any utility functions that are fairly standalone 
 to handle the basic types, i.e. int, string, float and so on?

The data conversions are handled by
https://github.com/opticron/ProtocolBuffer/blob/master/conversion/pbbinary.d

Aug 20 2013

"Jesse Phillips" <Jesse.K.Phillips+D gmail.com> writes:

On Monday, 19 August 2013 at 12:49:48 UTC, Jacob Carlborg wrote:
 I have had a brief look at Protocol Buffers and I don't see why 
 it wouldn't work as an archive. I would probably need to 
 implement a Protocol Buffers archive type to see what the 
 limitations of std.serialization and Protocol Buffers are.

I'm not familiar with the interaction of Archive and Serializer. I
was overwhelmed by the number of functions I'd have to implement 
(or in my case ignore) and ultimately I didn't know what my 
serialized data would look like.

I think it is possible to output a binary format which uses the 
same translation as Protocol Buffers, but I wouldn't expect it to 
resemble a message.

 Thrift and Protocol Buffers use code generation to create the 
 language
 data type, and at least for Protocol Buffers a method contains 
 all the
 logic for deserializing a collection of bytes, and one for 
 serializing.
 I'm not seeing how std.serialize would make this easier or 
 more usable.

 If a Thrift or Protocol Buffers archive would be used with 
 std.serialization I'm thinking that one would skip that step 
 and have the data types defined directly in D.

I'll see if I can push my way through creating an Archive type.

Aug 19 2013

Jacob Carlborg <doob me.com> writes:

On 2013-08-19 17:40, Jesse Phillips wrote:

 I'm not familiar with the interaction of Archive and Serializer. I was
 overwhelmed by the number of functions I'd have to implement (or in my
 case ignore) and ultimately I didn't know what my serialized data would
 look like.

std.serialization basically supports any type in D (except for delegates
and function pointers). If a particular method doesn't make sense to 
implement for a given archive, just implement a dummy function to 
satisfy the interface. The documentation for Archive says so:

"When implementing a new archive type, if any of these methods do not 
make sense for that particular implementation just implement an empty 
method and return T.init, if the method returns a value."

If something breaks due to this please let me know.
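
A sketch of what such dummy methods could look like. The `MiniArchive` interface below is a heavily reduced, hypothetical stand-in and does not reproduce the real Archive signatures:

```d
// Hypothetical, heavily reduced archive interface for illustration only.
interface MiniArchive
{
    void archiveInt(int value, string key);
    void archivePointer(string key, void delegate() dg);
    int unarchiveInt(string key);
    void* unarchivePointer(string key);
}

final class FlatArchive : MiniArchive
{
    private int[string] ints;

    void archiveInt(int value, string key) { ints[key] = value; }
    int unarchiveInt(string key) { return ints.get(key, int.init); }

    // Pointers make no sense for this format: empty body on the way in...
    void archivePointer(string key, void delegate() dg) { }

    // ...and T.init on the way out (null is the .init value of a pointer).
    void* unarchivePointer(string key) { return null; }
}

void main()
{
    auto archive = new FlatArchive;
    archive.archiveInt(3, "a");
    archive.archivePointer("p", () { });
    assert(archive.unarchiveInt("a") == 3);
    assert(archive.unarchivePointer("p") is null);
}
```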

 I think it is possible to output a binary format which uses the same
 translation as Protocol Buffers, but I wouldn't expect it to resemble a
 message.

In the binary archive I'm working on I have chosen to ignore some parts 
of the implicit contract between the serializer and the archive. For 
example, I'm not planning to support slices, pointers to fields and 
similar complex features.

-- 
/Jacob Carlborg

Aug 19 2013

"Dicebot" <public dicebot.lv> writes:

On Monday, 12 August 2013 at 13:27:45 UTC, Dicebot wrote:
 Stepping up to act as a Review Manager for Jacob Carlborg 
 std.serialization

 ---- Input ----

 Code: 
 https://github.com/jacob-carlborg/phobos/tree/serialization

 Documentation: 
 https://dl.dropboxusercontent.com/u/18386187/docs/std.serialization/index.html

 Previous review thread: 
 http://forum.dlang.org/thread/adyanbsdsxsfdpvoozne forum.dlang.org

 ---- Changes since last review ----

 - Sources have been integrated into the Phobos source tree
 - DDOC documentation has been provided in a form it should look 
 like on dlang.org
 - Most utility functions/template code depends on have been 
 inlined. Remaining `package` utility modules:
     * std.serialization.archives.xmldocument
     * std.serialization.attribute
     * std.serialization.registerwrapper

 ---- Information for reviewers ----

 The goal of this thread is to detect if there are any outstanding
 issues that need to be fixed before formal "yes"/"no" voting
 happens. If no critical objections arise, voting will
 begin next week.

 Please take this seriously: "If you identify problems along the 
 way, please note if they are minor, serious, or showstoppers." 
 (http://wiki.dlang.org/Review/Process). This information later 
 will be used to determine if library is ready for voting.

 If there are any frequent Phobos contributors / core developers 
 please pay extra attention to submission code style and fitting 
 into overall Phobos guidelines and structure.

 -------------------------------------

 Let the thread begin.

 Jacob, it is probably worth creating a pull request with latest 
 rebased version of your proposal to simplify getting a quick 
 overview of changes. Also please tell if there is anything you 
 want/need to implement before merging.

OK, time to make a short summary.

Several issues / improvement possibilities have been
mentioned. I don't think they prevent voting, and it is up to
Jacob to decide what he wants to incorporate from them.

However, there are two things that do matter in my opinion -
the pre-UDA part of the API and uncertainty about the range-based lazy
approach. The important thing here is that while the library can be
included with plenty of features lacking, we can't really afford
to break its API only a few releases later just to add/remove these
features.

So as a review manager, I think voting should be delayed until 
API is ready to address lazy range-based work model. No actual 
implementation is required but

1) it should be possible to do it later without breaking user code
2) library should not make an assumption about implementation 
being lazy or eager

That is my understanding based on current knowledge of Phobos 
modules, please correct me if I am wrong.

Jacob, please tell me if you have any objections or, if this
decision sounds reasonable, just contact me via e-mail when you
find std.serialization suitable for final voting. I think it
is pretty clear that the package itself is considered useful and
welcome in Phobos.

Aug 18 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 8/18/2013 11:26 AM, Dicebot wrote:
 So as a review manager, I think voting should be delayed until API is ready to
 address lazy range-based work model.

I agree. Ranges are a very big deal for D, and libraries that can conceivably 
support it must do so.

Aug 18 2013

"ilya-stromberg" <ilya-stromberg-2009 yandex.ru> writes:

On Sunday, 18 August 2013 at 18:26:55 UTC, Dicebot wrote:
 On Monday, 12 August 2013 at 13:27:45 UTC, Dicebot wrote:
 Stepping up to act as a Review Manager for Jacob Carlborg 
 std.serialization

 So as a review manager, I think voting should be delayed until 
 API is ready to address lazy range-based work model. No actual 
 implementation is required but

 1) it should be possible to do it later without breaking user 
 code
 2) library should not make an assumption about implementation 
 being lazy or eager

Can we patch the current std.xml to add file input/output, not only
memory input/output? It could help to serialize big data arrays
directly to a file.

Aug 19 2013

Jacob Carlborg <doob me.com> writes:

On 2013-08-18 20:26, Dicebot wrote:

 OK, time to make a short summary.

 Several issues / improvement possibilities have been mentioned. I
 don't think they prevent voting, and it is up to Jacob to decide what he
 wants to incorporate from them.

I've been quite busy lately but I've tried to address the minor issues
with regard to the documentation. I've hit a new problem in the process:

http://forum.dlang.org/thread/kujcns$1quo$1 digitalmars.com

 However, there are two things that do matter in my opinion - the pre-UDA
 part of the API and uncertainty about the range-based lazy approach. The important
 thing here is that while the library can be included with plenty of features
 lacking, we can't really afford to break its API only a few releases later
 just to add/remove these features.

What do you mean by "pre-UDA part of API"?

I think it will be fairly easy to add support for ranges, at least for 
the output. I'll see what I can do.

 So as a review manager, I think voting should be delayed until API is
 ready to address lazy range-based work model. No actual implementation
 is required but

 1) it should be possible to do it later without breaking user code
 2) library should not make an assumption about implementation being lazy
 or eager

 That is my understanding based on current knowledge of Phobos modules,
 please correct me if I am wrong.

 Jacob, please tell me if you have any objections or, if this decision
 sounds reasonable, just contact me via e-mail when you find
 std.serialization suitable for final voting. I think it is pretty clear
 that the package itself is considered useful and welcome in Phobos.


-- 
/Jacob Carlborg

Aug 19 2013

"Dicebot" <public dicebot.lv> writes:

On Monday, 19 August 2013 at 12:57:56 UTC, Jacob Carlborg wrote:
 I've been quite busy lately but I've tried to address the minor
 issues with regard to the documentation. I've hit a new problem in
 the process:

 http://forum.dlang.org/thread/kujcns$1quo$1 digitalmars.com

I also expect that an enhancement to dlang.org to support package.d
documentation will probably be needed at some point to get
proper examples. Such issues can be worked on during the actual merge
process and are not worth blocking voting.

 What do you mean by "pre-UDA part of API"?

This thread: 
http://forum.dlang.org/post/xqklcesoguxujifijadp forum.dlang.org

 I think it will be fairly easy to add support for ranges, at 
 least for the output. I'll see what I can do.

Great! Are there any difficulties with the input?

Aug 19 2013

Jacob Carlborg <doob me.com> writes:

On 2013-08-19 15:03, Dicebot wrote:

 Great! Are there any difficulties with the input?

It's just that I don't clearly know what the code will need to look like,
and I'm not particularly familiar with implementing range-based code.

-- 
/Jacob Carlborg

Aug 19 2013

"Dicebot" <public dicebot.lv> writes:

On Monday, 19 August 2013 at 13:31:27 UTC, Jacob Carlborg wrote:
 On 2013-08-19 15:03, Dicebot wrote:

 Great! Are there any difficulties with the input?

 It's just that I don't clearly know what the code will need to
 look like, and I'm not particularly familiar with implementing
 range-based code.

Ok, I'll investigate the related part of the package in a bit more
detail during this week and see if I can suggest something.

Aug 19 2013

Jacob Carlborg <doob me.com> writes:

On 2013-08-19 15:47, Dicebot wrote:

 Ok, I'll investigate the related part of the package in a bit more detail
 during this week and see if I can suggest something.

What I have now is something like this:

auto foo = new Foo;
foo.a = 3;

auto archive = new XmlArchive!(string); // string is the range type
auto serializer = new Serializer(archive);

serializer.serialize(foo);
auto data = archive.data; // returns a range, typed as XmlArchiveData

The problem now is that the range type is "string", so I can't set the 
data using any other range type:

archive.data = data;

Results in:

Error: cannot implicitly convert expression (range) of type 
XmlArchiveData to string

How can I handle that?

-- 
/Jacob Carlborg

Aug 19 2013

"Tyler Jameson Little" <beatgammit gmail.com> writes:

On Monday, 19 August 2013 at 13:31:27 UTC, Jacob Carlborg wrote:
 On 2013-08-19 15:03, Dicebot wrote:

 Great! Are there any difficulties with the input?

 It's just that I don't clearly know what the code will need to
 look like, and I'm not particularly familiar with implementing
 range-based code.

Maybe we need some kind of doc explaining the idiomatic usage of 
ranges?

Personally, I'd like to do something like this:

     auto archive = new XmlArchive!(char); // create an XML archive
     auto serializer = new Serializer(archive); // create the serializer
     serializer.serialize(foo);

     pipe(archive.out, someFile);

Where pipe would read from the left and write to the right. My 
idea for an implementation is through using take():

     void pipe(R) (R input, File output) // isInputRange(R)...
     {
         while (!input.empty) {
             // if Serializer has no data cached, goes through one step
             // and returns what it has
             auto arr = input.take(BUF_SIZE);
             input.popFrontN(arr.length);
             output.write(arr);
         }
     }

For now, I'd be happy for the serializer to process all data in
serialize(), but change the behavior later to step through the
computation when calling take().

I don't know if this helps, and others are very likely to have 
better ideas.

Aug 19 2013

Johannes Pfau <nospam example.com> writes:

Am Mon, 19 Aug 2013 16:21:44 +0200
schrieb "Tyler Jameson Little" <beatgammit gmail.com>:

 On Monday, 19 August 2013 at 13:31:27 UTC, Jacob Carlborg wrote:
 On 2013-08-19 15:03, Dicebot wrote:

 Great! Are there any difficulties with the input?

 It just that I don't clearly know how the code will need to 
 look like, and I'm not particular familiar with implementing 
 range based code.

 
 Maybe we need some kind of doc explaining the idiomatic usage of 
 ranges?
 
 Personally, I'd like to do something like this:
 
      auto archive = new XmlArchive!(char); // create an XML archive
      auto serializer = new Serializer(archive); // create the serializer
      serializer.serialize(foo);
 
      pipe(archive.out, someFile);

Your "pipe" function is the same as std.algorithm.copy(InputRange,
OutputRange) or std.range.put(OutputRange, InputRange);
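
A minimal sketch of that equivalence, using plain arrays as stand-ins for the archive data and the target (the variable names are illustrative only, not part of any proposed API):

```d
import std.algorithm : copy;

void main()
{
    // Stand-in for serialized bytes; a real archive would supply this range.
    ubyte[] serialized = [0x3C, 0x61, 0x2F, 0x3E];

    // A preallocated array acts as the output range.
    auto buffer = new ubyte[](16);

    // copy drains the input range into the output range and returns
    // the part of the target that is still unfilled.
    auto rest = serialized.copy(buffer);

    assert(buffer[0 .. serialized.length] == serialized);
    assert(rest.length == buffer.length - serialized.length);
}
```

Since copy already does the draining, a hand-written pipe() only adds value if it has to do something extra, such as chunking by buffer size.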



An important question regarding ranges for std.serialization is whether
we want it to work as an InputRange or if it should _take_ an
OutputRange. So the question is

-----------------
auto archive = new Archive();
Serializer(archive).serialize(object);
//Archive takes OutputRange, writes to it
archive.writeTo(OutputRange);

vs

auto archive = new Archive();
Serializer(archive).serialize(object);
//Archive implements InputRange for ubyte[]
foreach(ubyte[] data; archive) {}
-----------------

I'd use the first approach as it should be simpler to implement. The
second approach would be useful if the ubyte[] elements were processed
via other ranges (map, take, ...). But as binary data is usually
not processed in this way but just stored to disk or sent over network
(basically streaming operations) the first approach should be fine.

The first approach has the additional benefit that we can easily do
streaming like this:
----------------
auto archive = new Archive(OutputRange);
//Immediately write the data to the output range
Serializer(archive).serialize([1,2,3]);
----------------

This is difficult to implement with the second approach as you somehow
have to interleave calls to serialize and reads to the InputRange
interface:
------------
Serializer(archive).serialize(1);
foreach(data; archive) {stdout.write(data);}
Serializer(archive).serialize(2);
foreach(data; archive) {stdout.write(data);}
------------
And it's still less efficient than approach 1 as it has to keep an
internal buffer.

Another point is that "serialize" in the above example could be
renamed to "put". This way Serializer would itself be an OutputRange
which allows stuff like [1,2,3,4,5].stride(2).take(2).copy(archive);

Then serialize could also accept InputRanges to allow this:
archive.serialize([1,2,3,4,5].stride(2).take(2));
However, this use case is already covered by using copy so it would just
be for convenience.
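
To make the "Serializer as OutputRange" idea concrete, here is a rough sketch. TextSerializer is a made-up stand-in, not the std.serialization Serializer, and the XML-like output format is only an assumption for the demo:

```d
import std.algorithm : copy;
import std.conv : to;
import std.range : isOutputRange, stride, take;

// Hypothetical sink-style serializer; not the std.serialization API.
final class TextSerializer
{
    string output;

    // Having put(int) is what makes this type an output range of int.
    void put(int value)
    {
        output ~= "<int>" ~ value.to!string ~ "</int>";
    }
}

static assert(isOutputRange!(TextSerializer, int));

void main()
{
    auto serializer = new TextSerializer;

    // stride(2) selects 1, 3, 5; take(2) keeps 1 and 3.
    [1, 2, 3, 4, 5].stride(2).take(2).copy(serializer);

    assert(serializer.output == "<int>1</int><int>3</int>");
}
```

A class is used here so that copy, which takes its target by value, still writes into the same object.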

Aug 19 2013

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

19-Aug-2013 22:05, Johannes Pfau пишет:
 Am Mon, 19 Aug 2013 16:21:44 +0200
 schrieb "Tyler Jameson Little" <beatgammit gmail.com>:


 Another point is that "serialize" in the above example could be
 renamed to "put". This way Serializer would itself be an OutputRange
 which allows stuff like [1,2,3,4,5].stride(2).take(2).copy(archive);

+1
I totally expect serializer to be a sink.

 Then serialize could also accept InputRanges to allow this:
 archive.serialize([1,2,3,4,5].stride(2).take(2));
 However, this use case is already covered by using copy so it would just
 be for convenience.


-- 
Dmitry Olshansky

Aug 19 2013

"Tyler Jameson Little" <beatgammit gmail.com> writes:

On Monday, 19 August 2013 at 18:06:00 UTC, Johannes Pfau wrote:
 Am Mon, 19 Aug 2013 16:21:44 +0200
 schrieb "Tyler Jameson Little" <beatgammit gmail.com>:

 On Monday, 19 August 2013 at 13:31:27 UTC, Jacob Carlborg 
 wrote:
 On 2013-08-19 15:03, Dicebot wrote:

 Great! Are there any difficulties with the input?

 It just that I don't clearly know how the code will need to 
 look like, and I'm not particular familiar with implementing 
 range based code.

 
 Maybe we need some kind of doc explaining the idiomatic usage 
 of ranges?
 
 Personally, I'd like to do something like this:
 
      auto archive = new XmlArchive!(char); // create an XML archive
      auto serializer = new Serializer(archive); // create the serializer
      serializer.serialize(foo);
 
      pipe(archive.out, someFile);

 Your "pipe" function is the same as 
 std.algorithm.copy(InputRange,
 OutputRange) or std.range.put(OutputRange, InputRange);

Right, for some reason I couldn't find it... Moot point though.

 An important question regarding ranges for std.serialization is 
 whether
 we want it to work as an InputRange or if it should _take_ an
 OutputRange. So the question is

 -----------------
 auto archive = new Archive();
 Serializer(archive).serialize(object);
 //Archive takes OutputRange, writes to it
 archive.writeTo(OutputRange);

 vs

 auto archive = new Archive();
 Serializer(archive).serialize(object);
 //Archive implements InputRange for ubyte[]
 foreach(ubyte[] data; archive) {}
 -----------------

 I'd use the first approach as it should be simpler to 
 implement. The
 second approach would be useful if the ubyte[] elements were 
 processed
 via other ranges (map, take, ...). But as binary data is usually
 not processed in this way but just stored to disk or sent over 
 network
 (basically streaming operations) the first approach should be 
 fine.

+1 for the first way.

 The first approach has the additional benefit that we can 
 easily do
 streaming like this:
 ----------------
 auto archive = new Archive(OutputRange);
 //Immediately write the data to the output range
 Serializer(archive).serialize([1,2,3]);
 ----------------

This can make a nice one-liner for the general case:

Serializer(new Archive(OutputRange)).serialize(...);

 Another point is that "serialize" in the above example could be
 renamed to "put". This way Serializer would itself be an 
 OutputRange
 which allows stuff like 
 [1,2,3,4,5].stride(2).take(2).copy(archive);

 Then serialize could also accept InputRanges to allow this:
 archive.serialize([1,2,3,4,5].stride(2).take(2));
 However, this use case is already covered by using copy so it 
 would just
 be for convenience.

This is nice, but I think I like serialize() better. I also don't 
think serializing a range is its primary purpose, so it doesn't 
make a lot of sense to optimize for the uncommon case.

Aug 19 2013

"ilya-stromberg" <ilya-stromberg-2009 yandex.ru> writes:

On Tuesday, 20 August 2013 at 03:42:48 UTC, Tyler Jameson Little 
wrote:
 On Monday, 19 August 2013 at 18:06:00 UTC, Johannes Pfau wrote:
 An important question regarding ranges for std.serialization 
 is whether
 we want it to work as an InputRange or if it should _take_ an
 OutputRange. So the question is

 -----------------
 auto archive = new Archive();
 Serializer(archive).serialize(object);
 //Archive takes OutputRange, writes to it
 archive.writeTo(OutputRange);

 vs

 auto archive = new Archive();
 Serializer(archive).serialize(object);
 //Archive implements InputRange for ubyte[]
 foreach(ubyte[] data; archive) {}
 -----------------

 I'd use the first approach as it should be simpler to 
 implement. The
 second approach would be useful if the ubyte[] elements were 
 processed
 via other ranges (map, take, ...). But as binary data is 
 usually
 not processed in this way but just stored to disk or sent over 
 network
 (basically streaming operations) the first approach should be 
 fine.

 +1 for the first way.

No, you are WRONG. InputRange is MORE flexible: it can be lazy or 
eager. OutputRange is only eager. As we know, lazy ranges are 
required wherever possible:

On Sunday, 18 August 2013 at 18:26:55 UTC, Dicebot wrote:
 So as a review manager, I think voting should be delayed until 
 API is ready to address lazy range-based work model. No actual 
 implementation is required but

 1) it should be possible to do it later without breaking user 
 code
 2) library should not make an assumption about implementation 
 being lazy or eager

We can use InputRange like this:

import std.file;
auto archive = new Archive();
Serializer(archive).serialize(object);
//Archive implements InputRange for ubyte[]
write("file", archive);

Another benefit: we can process InputRange. For example, if we 
have
ZipRange zip(InputRange)
function, it's easy to compress data:
write("file", zip(archive));

Another example: we would like to change output xml file and 
filter some data (because we already have it). Or we would like 
to transform output xml to the html web page. No problems:

XmlRange transformXml(InputRange);
write("file", transformXml(archive));

Ideas?

Aug 20 2013

Johannes Pfau <nospam example.com> writes:

Am Tue, 20 Aug 2013 10:40:57 +0200
schrieb "ilya-stromberg" <ilya-stromberg-2009 yandex.ru>:

 On Tuesday, 20 August 2013 at 03:42:48 UTC, Tyler Jameson Little 
 wrote:
 On Monday, 19 August 2013 at 18:06:00 UTC, Johannes Pfau wrote:
 An important question regarding ranges for std.serialization 
 is whether
 we want it to work as an InputRange or if it should _take_ an
 OutputRange. So the question is

 -----------------
 auto archive = new Archive();
 Serializer(archive).serialize(object);
 //Archive takes OutputRange, writes to it
 archive.writeTo(OutputRange);

 vs

 auto archive = new Archive();
 Serializer(archive).serialize(object);
 //Archive implements InputRange for ubyte[]
 foreach(ubyte[] data; archive) {}
 -----------------

 I'd use the first approach as it should be simpler to 
 implement. The
 second approach would be useful if the ubyte[] elements were 
 processed
 via other ranges (map, take, ...). But as binary data is 
 usually
 not processed in this way but just stored to disk or sent over 
 network
 (basically streaming operations) the first approach should be 
 fine.

 +1 for the first way.

 
 No, you are WRONG. InputRange is MORE flexible: it can be lazy or 
 eager. OutputRange is only eager. As we know, lazy ranges are 
 required wherever possible:
 
 On Sunday, 18 August 2013 at 18:26:55 UTC, Dicebot wrote:
 So as a review manager, I think voting should be delayed until 
 API is ready to address lazy range-based work model. No actual 
 implementation is required but

 1) it should be possible to do it later without breaking user 
 code
 2) library should not make an assumption about implementation 
 being lazy or eager

 
 We can use InputRange like this:
 
 import std.file;
 auto archive = new Archive();
 Serializer(archive).serialize(object);
 //Archive implements InputRange for ubyte[]
 write("file", archive);

Yes, InputRange is more flexible, but it's also more difficult to
implement and less efficient:
What happens between the 'serialize' and the 'write' call? Archive
has to cache the data, either the original object or the final
produced data in an ubyte[] buffer. 

 
 Another benefit: we can process InputRange. For example, if we 
 have
 ZipRange zip(InputRange)
 function, it's easy to compress data:
 write("file", zip(archive));
 
 Another example: we would like to change output xml file and 
 filter some data (because we already have it). Or we would like 
 to transform output xml to the html web page. No problems:

Filtering is easier with an InputRange. "Zip-Streams" on the other
hand should be OutputRanges and therefore work fine with both
approaches.

 XmlRange transformXml(InputRange);
 write("file", transformXml(archive));
 
 Ideas?

The question is whether there are real-world examples where this is
useful. You have to weigh the utility of this approach against its more
complicated and less efficient implementation.

Aug 20 2013

"ilya-stromberg" <ilya-stromberg-2009 yandex.ru> writes:

On Tuesday, 20 August 2013 at 10:51:25 UTC, Johannes Pfau wrote:
 Am Tue, 20 Aug 2013 10:40:57 +0200
 schrieb "ilya-stromberg" <ilya-stromberg-2009 yandex.ru>:
 We can use InputRange like this:
 
 import std.file;
 auto archive = new Archive();
 Serializer(archive).serialize(object);
 //Archive implements InputRange for ubyte[]
 write("file", archive);

 Yes, InputRange is more flexible, but it's also more difficult 
 to
 implement and less efficient:
 What happens between the 'serialize' and the 'write' call? 
 Archive
 has to cache the data, either the original object or the final
 produced data in an ubyte[] buffer.

No, Archive has to do NOTHING. The 'serialize' call must only 
store a pointer to the object; without this requirement we can't 
have a lazy range. Serialization starts after the 'write' call, 
and ArchiveInputRange has to store the current serialization 
state (like Serializer in the current implementation).
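
A small sketch of what such a lazy range could look like. LazyIntArchive is a hypothetical name and only handles an int[]; the point is that the work happens in front, not when the range is created:

```d
import std.algorithm : equal, map;
import std.conv : to;
import std.range : isInputRange;

// Hypothetical lazy archive over int values; not the proposed API.
struct LazyIntArchive
{
    private const(int)[] pending; // values that are not serialized yet

    @property bool empty() const { return pending.length == 0; }

    // Serialization of one value happens here, on demand.
    @property string front() const
    {
        return "<int>" ~ pending[0].to!string ~ "</int>";
    }

    void popFront() { pending = pending[1 .. $]; }
}

static assert(isInputRange!LazyIntArchive);

void main()
{
    int[] values = [1, 2, 3];
    auto archive = LazyIntArchive(values);

    // Nothing has been serialized yet; composing ranges is still free.
    auto sizes = archive.map!(chunk => chunk.length);

    assert(archive.equal(["<int>1</int>", "<int>2</int>", "<int>3</int>"]));
    assert(sizes.equal([12, 12, 12]));
}
```

The only state kept between calls is the slice of values still to be processed, which plays the role of the "current serialization state".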

Aug 20 2013

"Dicebot" <public dicebot.lv> writes:

Ok, I was trying to avoid expressing personal opinion until now 
and mostly keep track of comments of others but now that I have 
started reading docs/sources in details, will step down from 
review manager role for a moment and do some very subjective 
reviewing :)

-----------------------

Hot topic first. Ranges. As far as I can see, it is not about 
"let's stick a range API wherever possible because it is the way 
Phobos does things". The key here is to recognize the use cases 
that are likely to require a range-based interface and focus on 
them.

As far as I can see, there are two important places where a 
range-based API can be helpful: providing values to serialize and 
providing raw data to deserialize, along with matching Archiver 
changes.

The former is relatively trivial: "serialize" should have an 
overload that accepts an InputRange of monotyped values and 
returns a ForwardRange as a result, which serializes the values 
one by one, lazily. The same goes for the archiver.

The latter is a bit more interesting. It would be cool if, 
instead of accepting a raw data chunk that matches the 
deserialized object's size, serializer.deserialize could accept 
an InputRange that provides a sequence of arbitrary chunks of raw 
data and use it to construct values on a per-request basis, 
lazily. This will require maintaining a buffer that keeps the 
unconsumed remainder of the last chunk, and making some decisions 
about behavior when hitting "empty()" before getting enough data 
to deserialize an object.

But it is not something you should care about right now, because 
only the actual function/method signatures are needed, with 
static asserts inside; the actual implementation can be added 
later by anyone willing to spend the time.
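
For illustration only, a sketch of the buffering described for the deserialization side. IntDecoder and decodeInts are hypothetical names, and the wire format (4-byte little-endian ints) is an assumption picked to keep the example short:

```d
import std.algorithm : equal;
import std.bitmanip : littleEndianToNative;
import std.range : ElementType, isInputRange;

// Hypothetical lazy decoder: yields one int per 4 little-endian bytes.
struct IntDecoder(R)
if (isInputRange!R && is(ElementType!R : const(ubyte)[]))
{
    private R chunks;
    private ubyte[] buffer; // unconsumed remainder of earlier chunks
    private int current;
    private bool done;

    this(R chunks)
    {
        this.chunks = chunks;
        popFront(); // prime the first value
    }

    @property bool empty() const { return done; }
    @property int front() const { return current; }

    void popFront()
    {
        // Pull chunks only until one whole value is available.
        while (buffer.length < int.sizeof && !chunks.empty)
        {
            buffer ~= chunks.front;
            chunks.popFront();
        }
        if (buffer.length < int.sizeof)
        {
            // Input ended early; this sketch drops a trailing partial value.
            done = true;
            return;
        }
        ubyte[int.sizeof] raw = buffer[0 .. int.sizeof];
        current = littleEndianToNative!int(raw);
        buffer = buffer[int.sizeof .. $];
    }
}

auto decodeInts(R)(R chunks) { return IntDecoder!R(chunks); }

void main()
{
    // Chunk boundaries deliberately do not line up with value boundaries.
    ubyte[][] chunks = [[1, 0], [0, 0, 2], [0, 0, 0, 9]];
    assert(decodeInts(chunks).equal([1, 2]));
}
```

What to do when the input runs out mid-value (drop, throw, or report) is exactly the kind of decision mentioned above; dropping is just the simplest choice for a sketch.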

-----------------------

Now about my personal feeling about std.serialization as a 
potential user. The core functionality I'd like to see in such a 
module is the ability to dump the state of a D data type into 
arbitrary formats in a robust way that requires minimal 
interference from user code. Something like what is done with 
toJSON/fromJSON in the vibe.d API, but more generic when it comes 
to output formats and more robust when it comes to the data 
hierarchies to load/store.

Judging by the examples and documentation, this is exactly what 
std.serialization does, and I like it. It lacks some better output 
(Archiver) choices, but that is more Phobos' fault.

What I really don't like is the excessive number of objects in 
the API. For example, I have found no reason why I need to create 
a serializer object to simply dump a struct's state. It is both 
boilerplate and runtime overhead I can't justify. The only state 
the serializer has is the archiver, and that is simply a 
collection of methods on its own. I would prefer to be able to do 
something like `auto data = serialize!XmlArchiver(value);`
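
Roughly what such a wrapper could look like on top of an object-based API. The XmlArchiver and Serializer below are toy stand-ins written for this sketch (int only), not the classes under review:

```d
import std.conv : to;

// Hypothetical stand-ins; the names mirror the thread, not a real API.
final class XmlArchiver
{
    string data;
    void archive(int value) { data ~= "<int>" ~ value.to!string ~ "</int>"; }
}

final class Serializer(Archiver)
{
    private Archiver archiver;
    this(Archiver archiver) { this.archiver = archiver; }
    void serialize(int value) { archiver.archive(value); }
}

// Convenience wrapper: builds the objects, returns only the result.
string serialize(Archiver, T)(T value)
{
    auto archiver = new Archiver;
    auto serializer = new Serializer!Archiver(archiver);
    serializer.serialize(value);
    return archiver.data;
}

void main()
{
    auto data = serialize!XmlArchiver(3);
    assert(data == "<int>3</int>");
}
```

The objects still exist, but the simple case no longer has to spell them out.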

That is not something that would make me vote against inclusion 
(I think it is much needed anyway), but it might discourage me 
from using this part of Phobos and push me toward some NIH 
syndrome.

I have found the documentation complete enough to get a basic 
understanding personally, but one thing that has caused some 
frustration is that the docs don't make a clear distinction 
between the minimal stuff and the extra features. For example, 
there is 
https://dl.dropboxusercontent.com/u/18386187/docs/std.serialization/std_serialization_serializable.html 
- my guess is that it is only used if the user wants to override 
the default serialization method for an aggregate type. But the 
documentation for it is written in such a manner that it gives 
the impression that it is absolutely required.

-----------------------

The last thing is not really relevant here but is more about a 
general documentation problem. This may be the first package that 
makes use of the new "package.d" system, and it shows that we need 
some way to provide package-wide documentation to keep things 
clear. I guess for DDOC itself, generating output from package.d 
is nothing special, but what about dlang.org? How hard will it be 
to update a documentation page to support its own block for 
package roots?

Aug 20 2013

"Daniel Murphy" <yebblies nospamgmail.com> writes:

"Dicebot" <public dicebot.lv> wrote in message 
news:luhuyerzmkebcltxhgjj forum.dlang.org...
 What I really don't like is excessive amount of object in the API. For 
 example, I have found no reason why I need to create serializer object to 
 simply dump a struct state. It is both boilerplate and runtime overhead I 
 can't justify. Only state serializer has is archiver - and it is simply 
 collection of methods on its own. I prefer to be able to do something like 
 `auto data = serialize!XmlArchiver(value);`

I think this is very important.  Simple uses should be as simple as 
possible.

Aug 20 2013

"Tyler Jameson Little" <beatgammit gmail.com> writes:

On Tuesday, 20 August 2013 at 13:44:01 UTC, Daniel Murphy wrote:
 "Dicebot" <public dicebot.lv> wrote in message
 news:luhuyerzmkebcltxhgjj forum.dlang.org...
 What I really don't like is excessive amount of object in the 
 API. For example, I have found no reason why I need to create 
 serializer object to simply dump a struct state. It is both 
 boilerplate and runtime overhead I can't justify. Only state 
 serializer has is archiver - and it is simply collection of 
 methods on its own. I prefer to be able to do something like 
 `auto data = serialize!XmlArchiver(value);`

 I think this is very important.  Simple uses should be as 
 simple as
 possible.

+1

This would enhance the 1-liner: write("file", 
serialize!XmlArchiver(InputRange));

We could even make nearly everything private except an 
isArchiver() template and serialize!().

Aug 20 2013

Jacob Carlborg <doob me.com> writes:

On 2013-08-20 17:07, Tyler Jameson Little wrote:

 +1

 This would enhance the 1-liner: write("file",
 serialize!XmlArchiver(InputRange));

 We could even make nearly everything private except an isArchiver()
 template and serialize!().

The rest of the API is needed for more advanced use cases.

-- 
/Jacob Carlborg

Aug 20 2013

"ilya-stromberg" <ilya-stromberg-2009 yandex.ru> writes:

On Tuesday, 20 August 2013 at 15:07:39 UTC, Tyler Jameson Little 
wrote:
 On Tuesday, 20 August 2013 at 13:44:01 UTC, Daniel Murphy wrote:
 "Dicebot" <public dicebot.lv> wrote in message
 news:luhuyerzmkebcltxhgjj forum.dlang.org...
 What I really don't like is excessive amount of object in the 
 API. For example, I have found no reason why I need to create 
 serializer object to simply dump a struct state. It is both 
 boilerplate and runtime overhead I can't justify. Only state 
 serializer has is archiver - and it is simply collection of 
 methods on its own. I prefer to be able to do something like 
 `auto data = serialize!XmlArchiver(value);`

 I think this is very important.  Simple uses should be as 
 simple as
 possible.

 +1

 This would enhance the 1-liner: write("file", 
 serialize!XmlArchiver(InputRange));

 We could even make nearly everything private except an 
 isArchiver() template and serialize!().

That would be great! Also, with Uniform Function Call Syntax (UFCS) 
it can be even better:
InputRange.serialize!XmlArchiver.zip.save("file");

Also, we can provide a default Archiver type, for example 
XmlArchiver or BinaryArchiver:
auto serialize(Archiver = BinaryArchiver, R)(R InputRange);
//Use default Archiver type
InputRange.serialize.zip.save("file");
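
A compilable sketch of the default-archiver idea. TextArchiver and XmlArchiver here are invented one-function stand-ins, and the zip/save steps from the lines above are left out because they do not exist:

```d
import std.algorithm : map;
import std.array : join;
import std.conv : to;
import std.range : isInputRange;

// Hypothetical archivers: each only knows how to format one value.
struct TextArchiver
{
    static string archive(int value) { return value.to!string ~ ";"; }
}

struct XmlArchiver
{
    static string archive(int value)
    {
        return "<int>" ~ value.to!string ~ "</int>";
    }
}

// Archiver defaults to TextArchiver; R is deduced from the argument.
auto serialize(Archiver = TextArchiver, R)(R values)
if (isInputRange!R)
{
    return values.map!(v => Archiver.archive(v));
}

void main()
{
    auto values = [1, 2];

    // Default archiver, UFCS style.
    assert(values.serialize.join == "1;2;");

    // Explicit archiver.
    assert(values.serialize!XmlArchiver.join == "<int>1</int><int>2</int>");
}
```

Putting the defaulted parameter first is the same trick std.algorithm.sort uses for its predicate.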

Aug 20 2013

Jacob Carlborg <doob me.com> writes:

On 2013-08-21 08:45, ilya-stromberg wrote:

 Also, we can provide a default Archiver type, for example XmlArchiver or
 BinaryArchiver:
 auto serialize(Archiver = BinaryArchiver, R)(R InputRange);
 //Use default Archiver type
 InputRange.serialize.zip.save("file");

That's the plan.

-- 
/Jacob Carlborg

Aug 20 2013

Jacob Carlborg <doob me.com> writes:

On 2013-08-20 15:12, Dicebot wrote:

 What I really don't like is excessive amount of object in the API. For
 example, I have found no reason why I need to create serializer object
 to simply dump a struct state. It is both boilerplate and runtime
 overhead I can't justify. Only state serializer has is archiver - and it
 is simply collection of methods on its own. I prefer to be able to do
 something like `auto data = serialize!XmlArchiver(value);`

I have been planning to add a function like that but just haven't gotten 
around to doing it. This is just a convenience function that is easy to add.

Some reasons for having an object oriented API are:

* The serializer does have state. It stores information about what's 
serialized and keeps track of things such as making sure that an object 
is not stored more than once in the archive.

* When doing custom serialization the serializer is passed to the 
methods: https://github.com/jacob-carlborg/orange/wiki/Custom-Serialization
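
To illustrate the first point, a toy sketch of the kind of state involved. This is not the actual Serializer; Node, the id table and the output format are made up for the example:

```d
import std.conv : to;

class Node
{
    string name;
    Node next;
    this(string name) { this.name = name; }
}

// Hypothetical serializer state: maps each object already written to an id.
final class Serializer
{
    private size_t[Node] ids;
    string output;

    void serialize(Node node)
    {
        if (node is null)
            return;
        if (auto known = node in ids)
        {
            // Seen before: emit a reference instead of the data again.
            output ~= "<ref id=\"" ~ (*known).to!string ~ "\"/>";
            return;
        }
        immutable id = ids.length;
        ids[node] = id;
        output ~= "<node id=\"" ~ id.to!string ~ "\" name=\"" ~ node.name ~ "\">";
        serialize(node.next);
        output ~= "</node>";
    }
}

void main()
{
    auto a = new Node("a");
    auto b = new Node("b");
    a.next = b;
    b.next = a; // cycle: without the id table this would recurse forever

    auto serializer = new Serializer;
    serializer.serialize(a);
    assert(serializer.output ==
        `<node id="0" name="a"><node id="1" name="b"><ref id="0"/></node></node>`);
}
```

A stateless free function cannot do this on its own, which is why a convenience wrapper would still create a serializer internally.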

 I have found documentation complete enough to get a basic understanding
 personally but one thing that has caused some frustration is that docs
 don't make clear distinction between minimal stuff and extra features.
 For example, there is
 https://dl.dropboxusercontent.com/u/18386187/docs/std.serialization/std_serialization_serializable.html
 - my guess that it is only used if user wants to override default
 serialization method for an aggregate type. But documentation for it is
 written in such manner that it gives an impression that it is absolutely
 required.

Ok, I can try and clarify that.

-- 
/Jacob Carlborg

Aug 20 2013

"Dicebot" <public dicebot.lv> writes:

On Tuesday, 20 August 2013 at 19:34:54 UTC, Jacob Carlborg wrote:
 I have been planning to add a function like that but just 
 haven't got around doing it. This is just a convenience 
 function that is easy to add.

Cool, as I have said it is not something critical that would 
impact my voting, just personal preferences.

 * The serializer does have state. It stores information about 
 what's serialized and keep track that an object is not stored 
 more than once in the archive and similar things.

Ah, makes sense. Well I guess then this is the power for 
robustness in data structure support and nothing can be done but 
hide it behind convenience wrappers. Sad but true :)

Aug 20 2013

"Dicebot" <public dicebot.lv> writes:

P.S. Right now most important (and probably only really 
important) thing is range API. I think it is worth focusing on it 
and getting through the voting stage - actual merge can happen at 
any time you / Phobos devs are satisfied with implementation 
state, it does not require major community attention.

Aug 20 2013

Jacob Carlborg <doob me.com> writes:

On 2013-08-21 02:09, Dicebot wrote:
 P.S. Right now most important (and probably only really important) thing
 is range API. I think it is worth focusing on it and getting through the
 voting stage - actual merge can happen at any time you / Phobos devs are
 satisfied with implementation state, it does not require major community
 attention.

Yes, but by now there have been so many suggestions for how the range 
API should look that I'm even more confused. I think I'll start a 
new thread for this.

-- 
/Jacob Carlborg

Aug 20 2013

"ilya-stromberg" <ilya-stromberg-2009 yandex.ru> writes:

On Wednesday, 21 August 2013 at 06:55:56 UTC, Jacob Carlborg 
wrote:
 On 2013-08-21 02:09, Dicebot wrote:
 P.S. Right now most important (and probably only really 
 important) thing
 is range API. I think it is worth focusing on it and getting 
 through the
 voting stage - actual merge can happen at any time you / 
 Phobos devs are
 satisfied with implementation state, it does not require major 
 community
 attention.

 Yes, but now there have been quite a lot suggestions for how 
 the range API should look like that I'm even more confused. 
 I'll think a start a new thread for this.

Try to read the article:
http://wiki.dlang.org/Component_programming_with_ranges
It has got a lot of range examples, including std.range, 
std.algorithm and creation of new ranges.

Aug 21 2013

"Dicebot" <public dicebot.lv> writes:

On Wednesday, 21 August 2013 at 06:55:56 UTC, Jacob Carlborg 
wrote:
 Yes, but now there have been quite a lot suggestions for how 
 the range API should look like that I'm even more confused. 
 I'll think a start a new thread for this.

Sure. I have already written my opinion on this, but getting the 
attention / opinion of some Phobos developers on this topic would 
be valuable.

Aug 21 2013

Jacob Carlborg <doob me.com> writes:

On 2013-08-19 15:03, Dicebot wrote:

 This thread:
 http://forum.dlang.org/post/xqklcesoguxujifijadp forum.dlang.org

I have removed all uses of "mixin annotations".

-- 
/Jacob Carlborg

Aug 19 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 8/12/2013 6:27 AM, Dicebot wrote:
 Documentation:
 https://dl.dropboxusercontent.com/u/18386187/docs/std.serialization/index.html

Thank you, Jacob. It looks like you've put a lot of nice work into this.

I've perused the documentation, and all I can think of is "What's a cubit?"

http://www.youtube.com/watch?v=so9o3_daDZw

I.e. there are 9 documentation pages of details. There's no obvious place to 
start, no overview, no explanation of what serialization is for and why I might 
want to use it and what's great about this implementation. At least none that I 
could find. Also needs some non-trivial canonical example code.

Something that answers who what where when why and how would be immensely
useful.



Some nits:

https://dl.dropboxusercontent.com/u/18386187/docs/std.serialization/std_serialization_serializationexception.html

Something went horribly wrong here:
----------------
Parameters:
Exception exception the exception exception to wrap
----------------

https://dl.dropboxusercontent.com/u/18386187/docs/std.serialization/std_serialization_registerwrapper.html

Lacks an illuminating example.

https://dl.dropboxusercontent.com/u/18386187/docs/std.serialization/std_serialization_serializer.html

When would I use a struct Array or a struct Slice?

https://dl.dropboxusercontent.com/u/18386187/docs/std.serialization/std_serialization_attribute.html

struct attribute should be capitalized. When would I use an attribute? Does
this have anything to do with User Defined Attributes? Need a canonical example.

https://dl.dropboxusercontent.com/u/18386187/docs/std.serialization/std_serialization_archives_archive.html

Aren't interfaces already abstract? I.e. abstract is redundant. The 
documentation defines an archive more or less as an archive. I still don't know 
what an archive is. (E.g. a zip file is an archive - can this create zip files?)

Aug 20 2013

Jacob Carlborg <doob me.com> writes:

On 2013-08-20 10:01, Walter Bright wrote:

 Thank you, Jacob. It looks like you've put a lot of nice work into this.

 I've perused the documentation, and all I can think of is "What's a cubit?"

 http://www.youtube.com/watch?v=so9o3_daDZw

 I.e. there are 9 documentation pages of details. There's no obvious
 place to start, no overview, no explanation of what serialization is for
 and why I might want to use it and what's great about this
 implementation. At least none that I could find. Also needs some
 non-trivial canonical example code.

 Something that answers who what where when why and how would be
 immensely useful.

Yes, I need to add some overview documentation. There's still the 
problem of finding the overview.

 Some nits:

 https://dl.dropboxusercontent.com/u/18386187/docs/std.serialization/std_serialization_serializationexception.html


 Something went horribly wrong here:
 ----------------
 Parameters:
 Exception exception the exception exception to wrap
 ----------------

Hehe, yeah :)

 https://dl.dropboxusercontent.com/u/18386187/docs/std.serialization/std_serialization_registerwrapper.html


 Lacks an illuminating example.

Those don't need to be ddoc comments at all. The whole module is 
declared "package". It would be really nice if ddoc could automatically 
hide anything that wasn't public or protected but still generate the 
documentation for package and private.

 https://dl.dropboxusercontent.com/u/18386187/docs/std.serialization/std_serialization_serializer.html


 When would I use a struct Array or a struct Slice?

Same as above. I'll see if they really have to be public.

 https://dl.dropboxusercontent.com/u/18386187/docs/std.serialization/std_serialization_attribute.html


 struct attribute should be capitalized. When would I use an attribute?
 Does this have anything to do with User Defined Attributes? Need a
 canonical example.

Same as above.

I have used lower case because I don't consider this a struct, even 
though technically it is. This is an attribute (UDA) and I think 
attributes should be lower case. Or rather, it's supposed to be used on 
types to indicate that they are UDAs:

 @attribute struct foo {}

The reason for this is that I'm a bit disappointed in the implementation 
of UDA's in D. I would have liked to have some kind of entity that I can 
point to and say "this is an attribute". Currently all random values and 
types can be used as an UDA, I don't like that.
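
A short sketch of how such a marker could be used and checked, written with D's `@` UDA syntax. The nonSerialized and Foo names are invented for the example, and `attribute` is redeclared locally rather than imported:

```d
import std.traits : hasUDA;

// Marker in the spirit of the `attribute` struct discussed here.
struct attribute {}

// Hypothetical UDA type, itself tagged with the marker.
@attribute struct nonSerialized {}

struct Foo
{
    int a;
    @nonSerialized int cache;
}

// The marker can be queried like any other UDA.
static assert(hasUDA!(nonSerialized, attribute));
static assert(hasUDA!(Foo.cache, nonSerialized));
static assert(!hasUDA!(Foo.a, nonSerialized));

void main() {}
```

The compiler does not enforce anything here; the marker is documentation that a library could choose to check.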

It's the same idea as having the "interface" and "abstract" keywords. 
It's possible to do without them, as in C++, but I think it's a lot 
better to have them.

 https://dl.dropboxusercontent.com/u/18386187/docs/std.serialization/std_serialization_archives_archive.html


 Aren't interfaces already abstract? I.e. abstract is redundant.

I have no idea why "abstract" is added there. The definition looks like 
this:

https://github.com/jacob-carlborg/phobos/blob/serialization/std/serialization/archives/archive.d#L88

 The documentation defines an archive more or less as an archive. I still
 don't know what an archive is.

"The archive is the backend in the serialization process."

And

"The archive is responsible for archiving primitive types in the format 
chosen by the archive implementation. The archive ensures that all types 
are properly archived in a format that can be later unarchived."

 (E.g. a zip file is an archive - can this create zip files?)

Theoretically one can create an archive that serializes to a zip file, 
yes. Or rather the format used by zip. An archive shouldn't write to disk.

-- 
/Jacob Carlborg

Aug 20 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 8/20/2013 6:28 AM, Jacob Carlborg wrote:
 That doesn't need to be ddoc comments at all. The whole module is declared
 "package". It would be really nice if ddoc could automatically hide anything
 that wasn't public or protected but still generate the documentation for
 package and private.

You can hide comments from ddoc by not starting them with /** but with /*
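
For example (function names invented for the illustration):

```d
/**
 * Public entry point. Ddoc picks up this comment because it opens
 * with two asterisks.
 */
int answer() { return helper() + 1; }

/*
 * Internal helper. This is an ordinary comment, so Ddoc generates
 * no documentation entry from it.
 */
private int helper() { return 41; }

void main()
{
    assert(answer() == 42);
}
```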

 I have no idea why "abstract" is added there. The definition looks like this:

 https://github.com/jacob-carlborg/phobos/blob/serialization/std/serialization/archives/archive.d#L88

Hmm. That looks then like a ddoc bug.


 The documentation defines an archive more or less as an archive. I still
 don't know what an archive is.

 "The archive is the backend in the serialization process."

Doesn't make sense to me. I would think the archive would be what is created, 
not the creator.

 And

 "The archive is responsible for archiving primitive types in the format chosen
 by the archive implementation. The archive ensures that all types are properly
 archived in a format that can be later unarchived."

What confuses me here is the conflation between the archiveR and the resulting 
archive, i.e. "an archiver creates an archive". Saying "archive creates the 
archive" is a bit of a disastrous conflation of the terms, as it makes the 
documentation a constant source of confusion.


 (E.g. a zip file is an archive - can this create zip files?)

 Theoretically one can create an archive that serializes to a zip file, yes. Or
 rather the format used by zip. An archive shouldn't write to disk.

Some exposition of this is necessary, along with some comments along the lines 
that the package provides a generic archiving interface, and a couple of 
implementations X and Y of that interface, and that other implementations such 
as Z, the zip archiver, are possible.
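
As a rough sketch of what such a concrete example could look like: one generic interface plus two small implementations. Every name below (Archiver, XmlLikeArchiver, KeyValueArchiver, serializePoint) is hypothetical and chosen for illustration; this is not the API of the proposed module.

```d
import std.conv : to;

// Hypothetical interface: the archiver is the thing that does the work.
interface Archiver
{
    void archive(int value, string key);

    // The resulting archive, kept in memory rather than written to disk.
    string data();
}

// Implementation "X": a toy XML-like format.
final class XmlLikeArchiver : Archiver
{
    private string buffer;

    void archive(int value, string key)
    {
        buffer ~= "<" ~ key ~ ">" ~ value.to!string ~ "</" ~ key ~ ">";
    }

    string data() { return buffer; }
}

// Implementation "Y": a toy key=value format.
final class KeyValueArchiver : Archiver
{
    private string buffer;

    void archive(int value, string key)
    {
        buffer ~= key ~ "=" ~ value.to!string ~ "\n";
    }

    string data() { return buffer; }
}

// Stand-in for the serializer: it only talks to the interface,
// so any other archiver (say, one emitting the zip format) would fit.
string serializePoint(Archiver archiver, int x, int y)
{
    archiver.archive(x, "x");
    archiver.archive(y, "y");
    return archiver.data();
}

unittest
{
    assert(serializePoint(new XmlLikeArchiver, 1, 2) == "<x>1</x><y>2</y>");
    assert(serializePoint(new KeyValueArchiver, 1, 2) == "x=1\ny=2\n");
}
```

The point of the sketch is the naming: the archiver produces the archive (here just a string), which keeps the two terms apart.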

Aug 20 2013

Jacob Carlborg <doob me.com> writes:

On 2013-08-20 20:04, Walter Bright wrote:

 You can hide comments from ddoc by not starting them with /** but with /*

Yeah, I know that.

 Doesn't make sense to me. I would think the archive would be what is
 created, not the creator.

I guess it could be called "archiver", or do you have a better suggestion?

 What confuses me here is the conflation between the archiveR and the
 resulting archive, i.e. "an archiver creates an archive". Saying
 "archive creates the archive" is a bit of a disastrous conflation of the
 terms, as it makes the documentation a constant source of confusion.

Would calling it "archiver" or some other name be better?

 Some exposition of this is necessary, along with some comments along the
 lines that the package provides a generic archiving interface, and a
 couple of implementations X and Y of that interface, and that other
 implementations such as Z, the zip archiver, are possible.

I don't understand what's so confusing.

"This is the interface all archive implementations need to implement to 
be able to be used as an archive with the serializer".

-- 
/Jacob Carlborg

Aug 20 2013

"David Nadlinger" <code klickverbot.at> writes:

On Tuesday, 20 August 2013 at 20:06:12 UTC, Jacob Carlborg wrote:
 Would calling it "archiver" or some other name be better?

Almost certainly, yes. An "archive" is something you put data 
into, not something that puts data somewhere else. ;)

David

Aug 20 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 8/20/2013 1:06 PM, Jacob Carlborg wrote:
 I guess it could be called "archiver", or do you have a better suggestion?

That sounds perfect.


 Some exposition of this is necessary, along with some comments along the
 lines that the package provides a generic archiving interface, and a
 couple of implementations X and Y of that interface, and that other
 implementations such as Z, the zip archiver, are possible.

 I don't understand what's so confusing.

 "This is the interface all archive implementations need to implement to be able
 to be used as an archive with the serializer".

I tend to think in terms of concrete examples, rather than abstract concepts. 
Hence my suggestion.

Aug 20 2013

Jacob Carlborg <doob me.com> writes:

On 2013-08-20 20:04, Walter Bright wrote:

 Hmm. That looks then like a ddoc bug.

Added as: http://d.puremagic.com/issues/show_bug.cgi?id=10870
Found this as well: http://d.puremagic.com/issues/show_bug.cgi?id=10869

-- 
/Jacob Carlborg

Aug 22 2013

Walter Bright <newshound2 digitalmars.com> writes:

On 8/22/2013 9:31 AM, Jacob Carlborg wrote:
 Added as: http://d.puremagic.com/issues/show_bug.cgi?id=10870
 Found this as well: http://d.puremagic.com/issues/show_bug.cgi?id=10869

Thanks

Aug 22 2013

"Dicebot" <public dicebot.lv> writes:

Jacob, what are your current plans for this (considering the recent 
range API discussion thread)?

Aug 31 2013

Jacob Carlborg <doob me.com> writes:

On 2013-08-31 14:11, Dicebot wrote:
 Jacob, what are your current plans for this (considering the recent range API
 discussion thread)?

My todo list looks like this:

- write overview documentation
- improve the documentation for std.serialization.serializable to 
indicate it's not required
- implement a convenience function for serializing
- implement a convenience function for serializing to a file
- remove Serializeable
- check only for "toData" when serializing
- check only for "fromData" when deserializing
- split Serializer into two parts
- make the parts structs
- possibly provide class wrappers
- split Archive in two parts
- add range interface to Serializer and Archive
- rename all archives to archivers
- replace ddoc comments with regular comments for all package protected 
symbols

Although I'm guessing I won't be able to finish it in time for voting. 
How much time is left anyway?
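
To illustrate the "check only for toData/fromData" items in the list above, here is a minimal sketch using compile-time introspection. Only the hook names toData and fromData come from the list. The helper templates, the example types and the simplified hook signatures are assumptions made for illustration, not the signatures used by the proposed module.

```d
import std.conv : to;

// True when T declares a member with the given hook name.
enum hasToData(T) = __traits(hasMember, T, "toData");
enum hasFromData(T) = __traits(hasMember, T, "fromData");

// A type that customizes its own (de)serialization.
struct Custom
{
    int value;

    string toData() const { return "custom:" ~ value.to!string; }

    void fromData(string data)
    {
        value = data["custom:".length .. $].to!int;
    }
}

// A type without hooks; no marker interface or mixin is needed.
struct Plain
{
    int value;
}

string serialize(T)(T value)
{
    static if (hasToData!T)
    {
        // The type supplies its own representation.
        return value.toData();
    }
    else
    {
        // Generic fallback: walk the fields at compile time.
        string result;
        foreach (i, field; value.tupleof)
            result ~= __traits(identifier, T.tupleof[i]) ~ "=" ~ field.to!string ~ ";";
        return result;
    }
}

unittest
{
    static assert(hasToData!Custom && hasFromData!Custom);
    static assert(!hasToData!Plain);

    assert(serialize(Custom(7)) == "custom:7");
    assert(serialize(Plain(7)) == "value=7;");

    Custom c;
    c.fromData("custom:7");
    assert(c.value == 7);
}
```

With a check like this, serializing and deserializing can be customized independently, since each direction looks only for its own hook.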

-- 
/Jacob Carlborg

Aug 31 2013

"Dicebot" <public dicebot.lv> writes:

On Saturday, 31 August 2013 at 17:58:57 UTC, Jacob Carlborg wrote:
 My todo list looks like this:

 - write overview documentation
 - improve the documentation for std.serialization.serializable 
 to indicate it's not required
 - implement a convenience function for serializing
 - implement a convenience function for serializing to a file
 - remove Serializeable
 - check only for "toData" when serializing
 - check only for "fromData" when deserializing
 - split Serializer into two parts
 - make the parts structs
 - possibly provide class wrappers
 - split Archive in two parts
 - add range interface to Serializer and Archive
 - rename all archives to archivers
 - replace ddoc comments with regular comments for all package 
 protected symbols

 Although I'm guessing I won't be able to finish it in time for 
 voting. How much time is left anyway?

Great. No hurry here, there is no hard deadline for voting - I'll 
put it on pause until you are ready. Just doing some personal 
bookkeeping. No pressure, just write me an e-mail when ready for 
next stage.

Aug 31 2013

Jacob Carlborg <doob me.com> writes:

On 2013-08-31 20:51, Dicebot wrote:

 Great. No hurry here, there is no hard deadline for voting - I'll put it
 on pause until you are ready. Just doing some personal bookkeeping. No
 pressure, just write me an e-mail when ready for next stage.

What I mean is that we usually have a couple of weeks for reviewing and 
then about one week for voting. I don't want to put the whole review 
queue on hold.

-- 
/Jacob Carlborg

Aug 31 2013

"Dicebot" <public dicebot.lv> writes:

On Saturday, 31 August 2013 at 19:37:34 UTC, Jacob Carlborg wrote:
 On 2013-08-31 20:51, Dicebot wrote:

 Great. No hurry here, there is no hard deadline for voting - I'll put it
 on pause until you are ready. Just doing some personal bookkeeping. No
 pressure, just write me an e-mail when ready for next stage.

 What I mean is that we usually have a couple of weeks for 
 reviewing and then about one week for voting. I don't want to 
 put the whole review queue on hold.

You won't. Reviewing is not a blocking operation; if anyone wants 
to act as a review manager for some other contribution, nothing 
prevents them from doing so right now. I have simply marked 
`std.serialization` as "Incorporating review comments" in the wiki 
and, given no new comments, this round of review can be considered 
finished.

Aug 31 2013

"Dicebot" <public dicebot.lv> writes:

Review summary: http://wiki.dlang.org/Review/std.serialization
( please review :) )

Sep 28 2013
