digitalmars.D - toString or not toString

Paul D. Anderson (16/16) Aug 30 2011 Can someone clarify for me the status and/or direction of string formatt...

Jonathan M Davis (27/64) Aug 30 2011 At this point, it's toString with no parameters. Don's completely out in...

Don (10/51) Aug 30 2011 Try to write BigFloat such that:

Jonathan M Davis (11/71) Aug 31 2011 The thing is that all most people care about is converting the object to...

Don (10/80) Aug 31 2011 That is simply not true. When you print a floating-point number, you

Jonathan M Davis (19/112) Aug 31 2011 Actually, I pretty much never do. I can understand wanting to, and I agr...

Marco Leise (9/10) Aug 31 2011 Let's look at Java again. They have BigDecimal, so how does that work?

Timon Gehr (14/104) Aug 31 2011 (3) data can be output with standard formatting in case one does not

Timon Gehr (6/8) Aug 31 2011 I have never understood the rationale behind deprecating toString once

Don (10/19) Sep 01 2011 Code bloat. Every struct contains string toString().

Timon Gehr (18/38) Sep 01 2011 I was just suggesting to keep the existing support for toString() inside...

Steven Schveighoffer (5/49) Sep 01 2011 Why? It's trivial to write writeTo if you have already written toString...

Timon Gehr (4/55) Sep 01 2011 I see. =). Still, the signature of writeTo is about as large as my

Steven Schveighoffer (11/74) Sep 02 2011 Here, use this:

Timon Gehr (13/89) Sep 02 2011 I don't agree those are minor, because this is going into the standard

Steven Schveighoffer (10/27) Sep 02 2011 This is probably impossible. Just for the object case alone, writeTo ne...

Timon Gehr (12/42) Sep 02 2011 Object could, in theory, just use delegate(const(char)[]). But I agree

Steven Schveighoffer (20/52) Sep 02 2011 Yes and no. There is a kludgy "interface" that all structs provide. It...

Timon Gehr (10/65) Sep 02 2011 I did know that there was some RTTI for the inefficient built-in sort,

Timon Gehr (2/5) Sep 02 2011 And the GC that needs to call destructors obv.
Steven Schveighoffer (20/42) Sep 03 2011 Actually, I think yes, that is the main unpleasantness. I wasn't about ...

Simen Kjaeraas (6/8) Sep 04 2011 This would require compiler magic, not just library features. Templates

travert phare.normalesup.org (Christophe) (3/12) Sep 03 2011 I disagree. void delegate(const(char)[]) means something, whereas Sink

Andrei Alexandrescu (3/15) Sep 03 2011 Even in the library "Sink" is too vague to be useful as a top-level symb...

Timon Gehr (3/20) Sep 03 2011 I am quite sure that useful is the same as short/too vague in this case....

Andrei Alexandrescu (4/27) Sep 03 2011 There are vastly better names than Sink. TextSink, TextWriter,

Timon Gehr (7/35) Sep 05 2011 'string' is quite obscure/vague too, if you don't know what it is. The

Andrei Alexandrescu (6/45) Sep 05 2011 I strongly disagree. In many programming languages, "string" denotes the...

Timon Gehr (5/53) Sep 05 2011 Sure, but that was not a valid argument when the term was introduced.

Sean Kelly (5/18) Sep 05 2011 It would be really great if the new toString call could be compatible =

kennytm (16/63) Sep 01 2011 to!string of array can support multiple arguments:

Timon Gehr (5/68) Sep 01 2011 This runs in ~66% of the time of Steve's formattedWrite solution.

Don (3/77) Sep 01 2011 If you're concerned about speed, the writeTo method is much quicker,

Timon Gehr (8/93) Sep 02 2011 allocating a new string on the heap always requires heap activity. I was...

Steven Schveighoffer (13/24) Sep 02 2011 Simple appending is slow. There are better ways to do it. For example,...

Timon Gehr (5/29) Sep 02 2011 Point taken. Is there anything else that stops writeTo from being

Steven Schveighoffer (13/50) Sep 02 2011 Two things:

Jacob Carlborg (6/20) Sep 01 2011 Note that to!string and/or write(f)(ln) could be implemented to inspect

Andrej Mitrovic (5/8) Sep 02 2011 http://codepad.org/1PZY7YTX

Jacob Carlborg (5/13) Sep 02 2011 If this is implemented in std.conv.to, how would that add any more

Lars T. Kyllingstad (12/56) Sep 01 2011 Actually, std.complex.Complex also has toString(sink, format), and Don

bearophile (6/7) Aug 30 2011 From a practical point of view, a good starting point is to review (and ...
Paul D. Anderson (9/35) Aug 31 2011 So, IIUC, toString has its faults but it has deep-rooted user expectatio...

Timon Gehr (4/37) Aug 31 2011 I think your approach is what std.bigint should do too. Great!

kenji hara (5/11) Sep 02 2011 I have posted pull request to fix BigInt's formatting with writef(ln)

bearophile (4/7) Sep 02 2011 You are doing good work! I hope to see your patches in the final release...
Timon Gehr (2/13) Sep 02 2011 Thank you very much! That is really useful.
Paul D. Anderson (20/33) Sep 03 2011 There are problems with opCmp as well. The "<" and ">" operators won't c...

Jonathan M Davis (6/48) Sep 03 2011 It's all part of http://d.puremagic.com/issues/show_bug.cgi?id=3659

Andrei Alexandrescu (15/17) Sep 03 2011 [snip]

Steven Schveighoffer (23/40) Sep 03 2011 toString is a means of communicating the state of an object to a person ...

Jacob Carlborg (4/51) Sep 04 2011 This sounds more like something for a serialization library.
travert phare.normalesup.org (Christophe) (82/84) Sep 06 2011 What happens if the buffer data get exhausted ? The function calling

Steven Schveighoffer (13/45) Sep 06 2011 This is probably clearer if you read the documentation for readUntil:

travert phare.normalesup.org (Christophe) (22/50) Sep 06 2011 Wouldn't your life be easier if you could ? :P

Steven Schveighoffer (31/77) Sep 06 2011 d =

travert phare.normalesup.org (Christophe) (66/66) Sep 06 2011 I've had a look at readUntil API, and it's not completely clear. Is the

Steven Schveighoffer (56/110) Sep 08 2011 The start is an index at which new data was added. The deal is, the

travert phare.normalesup.org (Christophe) (18/27) Sep 08 2011 I see. Are you not concerned by the fact that with this API, the input

Steven Schveighoffer (20/46) Sep 08 2011 No. The buffer then becomes that much bigger, and less likely to be =

Andrei Alexandrescu (6/15) Sep 06 2011 This won't work for cases such as "parse digits until a non-digit is

travert phare.normalesup.org (Christophe) (22/41) Sep 06 2011 It does, since the characters are only discarded at the next call to

Jacob Carlborg (5/22) Sep 04 2011 I see no reason to deprecate toString. toString could just call writeTo
Marco Leise (18/23) Sep 04 2011 Doesn't that come down to using a serialization API like Orange?
Robert Jacques (7/24) Sep 04 2011 I'd like to point out that parsing a format string for every object/vari...

Steven Schveighoffer (28/69) Sep 06 2011 Hm... I haven't delved (yet) into the specifics of how std.format works,...

kenji hara (7/9) Sep 04 2011 I think const void toString(scope void delegate(const(char)[]) sink,

Timon Gehr (2/12) Sep 04 2011 Yes, but imho the function name does not document really well what it do...
Andrei Alexandrescu (3/13) Sep 04 2011 Works for me. Walter?

Walter Bright (2/16) Sep 04 2011 It'll break every D program.

Andrei Alexandrescu (8/25) Sep 05 2011 Probably you and I have a different thing in mind. I'm thinking of

Timon Gehr (2/27) Sep 05 2011 I think writeTo is a suitable name.

David Nadlinger (5/10) Sep 04 2011 Requiring only a scoped delegate certainly makes sense, but I'm not too

Sean Kelly (10/28) Sep 05 2011 caused by a blind toString() that creates a whole new string without any...

Marco Leise (6/33) Sep 05 2011 I'm highly skeptical to say the least :). I know there are languages tha...

Jacob Carlborg (5/13) Sep 06 2011 If we ever get a serialization package in Phobos, Orange for example:

Marco Leise (14/27) Sep 06 2011 Ok I get the picture, but the details are vague.

Jacob Carlborg (11/24) Sep 07 2011 If the pointer points to a value that have been or later will be

kenji hara (41/41) Sep 04 2011 I have already posted some pull requests around formatting.

Timon Gehr (6/47) Sep 04 2011 appender is slower than direct appending unless you are dealing with

kenji hara (4/9) Sep 04 2011 I thought the same issue, so I'm working to fix it. Please wait for a wh...

Paul D. Anderson <paul.d.removethis.anderson comcast.andthis.net> writes:

Can someone clarify for me the status and/or direction of string formatting in
D? 

We've got:

1. toString, the object method with no parameters.
2. toString(sink, format)
3. to!string()
4. format
5. writef/writefln
6. write/writeln

I realize these exist for various reasons: some (1, 3) are simple (unformatted)
conversions, while others (2, 4-6) are designed to provide configurable formatting.
The problem is that they are inconsistent with each other.

Using std.bigint as an example: 1, 3, 4 and 6 don't work, or don't work as
expected (to me at least). Option 1 prints 'BigInt'; 3 and 4 are compile errors.

I know bigint is a controversial example because Don has strong feelings

or the other but I need to know what to implement in my arbitrary-precision
floating point module. This obviously relies heavily on bigint.

So, is there a transition underway in the language (or just Phobos) from
toString, writeln and format, to toString(sink,format) and writefln?

Or is this just a divergence of views, both of which are acceptable and we'll
have to get used to choosing one or the other?

Or am I just mistaken in believing there is any significant conflict?

I apologize if this has already been hashed out in the past and, if so, I would
appreciate someone pointing me to that discussion. (Or just the results of the
discussion.)

Paul

Aug 30 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Tuesday, August 30, 2011 20:59:06 Paul D. Anderson wrote:
 Can someone clarify for me the status and/or direction of string formatting
 in D?
 
 [snip]

At this point, it's toString with no parameters. Don's completely out in left 
field with regards to how things currently work. I believe that BigInt is the 
_only_ example of toString(sink, format).

to!string is what you use when converting generic stuff to a string, and is 
probably better to use than calling toString directly. format is used when
formatting strings separately from printing. write and writeln are for printing
strings, and writef and writefln are for printing strings using formatting. I 
don't understand why there would be any confusion over the printing functions. 
If you want an automatic newline, then you pick one that ends in ln, and if 
you want formatting, then you pick one that ends in f (fln for both). The 
printing functions are not going to change at this point, and neither is 
format. They're for different purposes.
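As a rough illustration of that division of labour, here is a minimal sketch written against a current Phobos. The variable names are made up for the example, and nothing here depends on BigInt.

```d
import std.conv : to;
import std.format : format;
import std.stdio : write, writeln, writef, writefln;

void main()
{
    int n = 42;
    double x = 3.14159;

    string s = to!string(n);          // generic conversion, no format string
    string t = format("%.2f", x);     // builds a formatted string, prints nothing

    write(s, " ");                    // prints its arguments, no newline
    writeln(t);                       // prints its arguments, then a newline
    writef("%d in hex is %x", n, n);  // formatted printing, no newline
    writefln(" and x is %.3f", x);    // formatted printing, then a newline
}
```

The "f" variants take a format string as their first argument; the others simply convert each argument the way to!string would.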

Now, what may change is toString on objects. In part due to Don's stance on 
the matter, there has been some discussion of creating a new function to 
replace toString called writeTo, which would be similar to toString(sink, 
format). It would integrate with std.conv.to, format, and the printing 
functions. And if you wanted to convert something to a string, you'd use 
to!string rather than calling writeTo directly. The DIP for it is here:

http://www.prowiki.org/wiki4d/wiki.cgi?LanguageDevel/DIPs/DIP9

Unfortunately, however, the proposal seems to have gone nowhere thus far. Until
it does, pretty much every object is just going to use toString without
parameters, and the problems with BigInt's toString remain. However, if the
proposal actually gets implemented, the issue should be able to be
sorted out. Objects would have writeTo and toString would presumably be
deprecated.

- Jonathan M Davis

Aug 30 2011

Don <nospam nospam.com> writes:

On 31.08.2011 04:41, Jonathan M Davis wrote:
 At this point, it's toString with no parameters. Don's completely out in left
 field with regards to how things currently work. I believe that BigInt is the
 _only_ example of toString(sink, format).

Try to write BigFloat such that:
BigFloat f = 2.3e69;
writefln("%f %g", f, f);
will work. It's just not possible.

toString with no parameters does not work, and CANNOT work. It just can't.
I implemented something quickly which actually works.
But, it's just a stop-gap measure until this black hole in the language 
gets fixed. We really need the format string to be exposed in a digested 
manner.
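
To make the BigFloat point concrete, here is a minimal sketch of a user-defined type that receives the parsed format specifier and writes its output through a sink. It assumes the delegate-based toString overload that later versions of std.format recognise, which postdates the state of Phobos discussed in this thread. Ratio is a hypothetical stand-in for BigFloat, not a Phobos type.

```d
import std.format : FormatSpec, format, formatValue;
import std.stdio : writefln;

// Hypothetical stand-in for BigFloat: an exact fraction.
struct Ratio
{
    long num, den;

    // std.format passes the already-parsed specifier in fmt.
    void toString(scope void delegate(const(char)[]) sink,
                  scope const ref FormatSpec!char fmt) const
    {
        if (fmt.spec == 's')
        {
            // Default formatting: emit the fraction piece by piece.
            formatValue(sink, num, fmt);
            sink("/");
            formatValue(sink, den, fmt);
        }
        else
        {
            // %f, %g, %e: reuse the built-in floating-point formatter.
            formatValue(sink, cast(double) num / den, fmt);
        }
    }
}

void main()
{
    auto r = Ratio(1, 4);
    assert(format("%s", r) == "1/4");
    assert(format("%.3f %g", r, r) == "0.250 0.25");
    writefln("%f %g", r, r);
}
```

With only a parameterless toString, the type would never see the 'f' or the 'g', which is the limitation described here. In this sketch, specifiers that make no sense for a double are simply handed to formatValue, which is expected to reject them at run time.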

Aug 30 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Wednesday, August 31, 2011 08:53:29 Don wrote:
 Try to write BigFloat such that:
 BigFloat f = 2.3e69;
 writefln("%f %g", f, f);
 will work. It's just not possible.
 
 toString with no parameters does not work, and CANNOT work. It just can't.
 I implemented something quickly which actually works.
 But, it's just a stop-gap measure until this black hole in the language
 gets fixed. We really need the format string to be exposed in a digested
 manner.

The thing is that all most people care about is converting the object to a 
string. They don't care about %d vs %x or %f vs %g. They just want it to be 
converted to a string. So, the lack of a toString is a major impediment.

Now, I can definitely see why you would want to have toString/writeTo work with 
a format string, and I think that ultimately writeTo is probably a good 
solution (though it does seem to be a bit overkill for the average situation), 
but the truth of the matter is that while your concerns are perfectly valid, 
most people wouldn't even think of the issues that you're seeing with BigInt 
or BigFloat. They just want to convert them to a string as decimal values.

- Jonathan M Davis

Aug 31 2011

Don <nospam nospam.com> writes:

On 31.08.2011 09:03, Jonathan M Davis wrote:
 The thing is that all most people care about is converting the object to a
 string. They don't care about %d vs %x or %f vs %g. They just want it to be
 converted to a string. So, the lack of a toString is a major impediment.

That is simply not true. When you print a floating-point number, you 
almost ALWAYS use a format string.

 Now, I can definitely see why you would want to have toString/writeTo work with
 a format string, and I think that ultimately writeTo is probably a good
 solution (though it does seem to be a bit overkill for the average situation),
 but the truth of the matter is that while your concerns are perfectly valid,
 most people wouldn't even think of the issues that you're seeing with BigInt
 or BigFloat. They just want to convert them to a string as decimal values.

Compare with C++ iostreams. There are two fundamental features:
(1) the formatting to be used is specified;
(2) components are output piece-by-piece.

The inability to do either of these things is not a minor limitation of 
string toString(). They are absolutely fundamental.
Now, once you have the full functionality, you can think about how to 
simplify the common, trivial cases. But you cannot argue the other way.

Aug 31 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Wednesday, August 31, 2011 10:51:14 Don wrote:
 The thing is that all most people care about is converting the object to
 a string. They don't care about %d vs %x or %f vs %g. They just want it
 to be converted to a string. So, the lack of a toString is a major
 impediment.

 That is simply not true. When you print a floating-point number, you
 almost ALWAYS use a format string.

Actually, I pretty much never do. I can understand wanting to, and I agree 
that it would be useful to do so with BigFloat, but I disagree that that's 
what you _always_ want. If anything, I typically avoid using anything other 
than %s in format strings. Sometimes, you need to be more specific, but for 
what I do, at least, it's rarely useful.

 Now, I can definitely see why you would want to have toString/writeTo
 work with a format string, and I think that ultimately writeTo is
 probably a good solution (though it does seem to be a bit overkill for
 the average situation), but the truth of the matter is that while your
 concerns are perfectly valid, most people wouldn't even think of the
 issues that you're seeing with BigInt or BigFloat. They just want to
 convert them to a string as decimal values.

 Compare with C++ iostreams. There are two fundamental features:
 (1) the formatting to be used is specified;
 (2) components are output piece-by-piece.
 
 The inability to do either of these things is not a minor limitation of
 string toString(). They are absolutely fundamental.
 Now, once you have the full functionality, you can think about how to
 simplify the common, trivial cases. But you cannot argue the other way.

Java has a toString which doesn't take any arguments, and it works fine. I've 
never had a problem with it, and the only real issue that I've had with 
toString in D is the fact that you can't use any attributes such as const or 

this issue, it never even occurred to me that there was one. So, while I agree
with you that we should find a good solution for this issue (and writeTo may 
very well be it), I disagree that this is generally a big deal. It really does 
feel to me like you're blowing this issue out of proportion. Maybe you just 
deal with numeric stuff way more than I do, but I've _never_ felt the lack of a 
format string with toString.

Regardless, while I don't think that this is a big issue, I have no problem 
with us reworking toString/writeTo so that the issue is fixed.

- Jonathan M Davis

Aug 31 2011

"Marco Leise" <Marco.Leise gmx.de> writes:

Am 31.08.2011, 11:13 Uhr, schrieb Jonathan M Davis <jmdavisProg gmx.com>:

 Java has a toString which doesn't take any arguments, and it works fine.

Let's look at Java again. They have BigDecimal, so how does that work?
First of all they added convenience functions akin to "toString()" to  
BigDecimal: "toEngineeringString()" and "toPlainString()".
Their number formatting class has a method with a long list of cases that  
distinguishes between the different Number classes including wrapper  
objects around primitive types and BigDecimal. So they practically have a  
sealed list of numerical types that are specially formatted, while for the  
rest "doubleValue()" is called.

Aug 31 2011

Timon Gehr <timon.gehr gmx.ch> writes:

On 08/31/2011 10:51 AM, Don wrote:
 Compare with C++ iostreams. There are two fundamental features:
 (1) the formatting to be used is specified;
 (2) components are output piece-by-piece.

 The inability to do either of these things is not a minor limitation of
 string toString(). They are absolutely fundamental.
 Now, once you have the full functionality, you can think about how to
 simplify the common, trivial cases. But you cannot argue the other way.

(3) data can be output with standard formatting in case one does not 
care. I claim that is used most of the time.

The inability to use to!string(bigint) or writeln(bigint) is not a minor 
limitation either. What stops bigint from having an overload

string toString(){string r; toString((const(char)[] x){r~=x;},"d");return r;}

that would actually work with the rest of current Phobos? The current 
design has never been any more than an annoyance to me. Most generic 
code that wants to work with BigInt has to specifically test for it. 
That is unacceptable. It is very good to have the possibility to specify 
formatting but

1. It should not be the only way to do things.
2. The method should really not be called toString, but writeTo.

Aug 31 2011

Timon Gehr <timon.gehr gmx.ch> writes:

On 08/31/2011 04:41 AM, Jonathan M Davis wrote:
 Objects would have writeTo and toString would presumably be
 deprecated.

I have never understood the rationale behind deprecating toString once 
we have writeTo. Why should it be deprecated? toString is great in case 
you just want to quickly and easily convert something to a string, and 
later, if formatting or more efficient output etc. is needed, the method 
can transparently be replaced by writeTo.

Aug 31 2011

Don <nospam nospam.com> writes:

On 31.08.2011 14:35, Timon Gehr wrote:
 On 08/31/2011 04:41 AM, Jonathan M Davis wrote:
 Objects would have writeTo and toString would presumably be
 deprecated.

 I have never understood the rationale behind deprecating toString once
 we have writeTo. Why should it be deprecated?

Code bloat. Every struct contains string toString().
Quite unnecessarily, since it can always be synthesized from the more 
complete version.

toString is great in case
 you just want to quickly and easily convert something to a string, and
 later, if formatting or more efficient output etc. is needed, the method
 can transparently be replaced by writeTo.

BTW, you do realize that code using writeTo is shorter in most cases? 
The reason is that it can omit all the calls to format().
Pretty much the only time toString is simpler is when it is a 
single call to format().
It's really only the signature that is more complicated.

Sep 01 2011

Timon Gehr <timon.gehr gmx.ch> writes:

On 09/01/2011 09:41 PM, Don wrote:
 On 31.08.2011 14:35, Timon Gehr wrote:
 On 08/31/2011 04:41 AM, Jonathan M Davis wrote:
 Objects would have writeTo and toString would presumably be
 deprecated.

 I have never understood the rationale behind deprecating toString once
 we have writeTo. Why should it be deprecated?

 Code bloat. Every struct contains string toString().
 Quite unnecessarily, since it can always be synthesized from the more
 complete version.

I was just suggesting to keep the existing support for toString() inside 
to, format etc. Of course, all the structs in Phobos should probably 
completely migrate to writeTo.

 toString is great in case
 you just want to quickly and easily convert something to a string, and
 later, if formatting or more efficient output etc. is needed, the method
 can transparently be replaced by writeTo.

 BTW, you do realize that code using writeTo is shorter in most cases?
 The reason is, that it can omit all the calls to format().
 Pretty much the only time when toString is simpler, is when it is a
 single call to format().
 It's only really the signature which is more complicated.

I am not convinced:

struct S{
     int x,y,z;
     void writeTo(void delegate(const(char)[]) sink, string format = null){
         sink("(");
         .writeTo(x,sink,"d"); // still no UFCS
         sink(", ");
         .writeTo(y,sink,"d");
         sink(", ");
         .writeTo(z,sink,"d");
         sink(")");
     }

     string toString(){return "("~join(map!(to!string)([x,y,z]),", ")~")";}
}

Sep 01 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Thu, 01 Sep 2011 16:26:55 -0400, Timon Gehr <timon.gehr gmx.ch> wrote:

 On 09/01/2011 09:41 PM, Don wrote:
 On 31.08.2011 14:35, Timon Gehr wrote:
 On 08/31/2011 04:41 AM, Jonathan M Davis wrote:
 Objects would have writeTo and toString would presumably be
 deprecated.

 I have never understood the rationale behind deprecating toString once
 we have writeTo. Why should it be deprecated?

 Code bloat. Every struct contains string toString().
 Quite unnecessarily, since it can always be synthesized from the more
 complete version.

 I was just suggesting to keep the existing support for toString() inside  
 to, format etc. Of course, all the structs in Phobos should probably  
 completely migrate to writeTo.

Why?  It's trivial to write writeTo if you have already written toString.   
There is a clear deprecation path in the DIP.

 toString is great in case
 you just want to quickly and easily convert something to a string, and
 later, if formatting or more efficient output etc. is needed, the  
 method
 can transparently be replaced by writeTo.

 BTW, you do realize that code using writeTo is shorter in most cases?
 The reason is, that it can omit all the calls to format().
 Pretty much the only time when toString is simpler, is when it is a
 single call to format().
 It's only really the signature which is more complicated.

 I am not convinced:

 struct S{
      int x,y,z;
      void writeTo(void delegate(const(char)[]) sink, string format = null){
          sink("(");
          .writeTo(x,sink,"d"); // still no UFCS
          sink(", ");
          .writeTo(y,sink,"d");
          sink(", ");
          .writeTo(z,sink,"d");
          sink(")");
      }

      string toString(){return "("~join(map!(to!string)([x,y,z]),", ")~")";}
 }

Um... formattedWrite(sink, "(%d, %d, %d)", x, y, z);

-Steve
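
For reference, a sketch of how that one-liner fits into the struct from the quoted example. It assumes the writeTo signature discussed in this thread (writeTo is the proposed name, not an existing Phobos hook) and uses std.format.formattedWrite with the delegate as the output sink:

```d
import std.format : formattedWrite;

struct S
{
    int x, y, z;

    // writeTo is the name proposed in this thread; the format
    // parameter is ignored in this sketch.
    void writeTo(scope void delegate(const(char)[]) sink, string format = null) const
    {
        // formattedWrite pushes each formatted piece into the sink,
        // so no intermediate string is built here.
        formattedWrite(sink, "(%d, %d, %d)", x, y, z);
    }
}
```

The body is a single call, so the remaining cost compared to toString() is the longer signature.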

Sep 01 2011

Timon Gehr <timon.gehr gmx.ch> writes:

On 09/01/2011 10:57 PM, Steven Schveighoffer wrote:
 On Thu, 01 Sep 2011 16:26:55 -0400, Timon Gehr <timon.gehr gmx.ch> wrote:

 On 09/01/2011 09:41 PM, Don wrote:
 On 31.08.2011 14:35, Timon Gehr wrote:
 On 08/31/2011 04:41 AM, Jonathan M Davis wrote:
 Objects would have writeTo and toString would presumably be
 deprecated.

 I have never understood the rationale behind deprecating toString once
 we have writeTo. Why should it be deprecated?

 Code bloat. Every struct contains string toString().
 Quite unnecessarily, since it can always be synthesized from the more
 complete version.

 I was just suggesting to keep the existing support for toString()
 inside to, format etc. Of course, all the structs in Phobos should
 probably completely migrate to writeTo.

 Why? It's trivial to write writeTo if you have already written toString.
 There is a clear deprecation path in the DIP.

Exactly that is my point, it is trivial and tedious.

 toString is great in case
 you just want to quickly and easily convert something to a string, and
 later, if formatting or more efficient output etc. is needed, the
 method
 can transparently be replaced by writeTo.

 BTW, you do realize that code using writeTo is shorter in most cases?
 The reason is, that it can omit all the calls to format().
 Pretty much the only time when toString is simpler, is when it is a
 single call to format().
 It's only really the signature which is more complicated.

 I am not convinced:

 struct S{
 int x,y,z;
 void writeTo(void delegate(const(char)[]) sink, string format = null){
 sink("(");
 .writeTo(x,sink,"d"); // still no UFCS
 sink(", ");
 .writeTo(y,sink,"d");
 sink(", ");
 .writeTo(z,sink,"d");
 sink(")");
 }

 string toString(){return "("~join(map!(to!string)([x,y,z]),", ")~")";}
 }

 Um... formattedWrite(sink, "(%d, %d, %d)", x, y, z);

 -Steve

I see. =). Still, the signature of writeTo is about as large as my 
entire toString function.

Sep 01 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Thu, 01 Sep 2011 17:09:30 -0400, Timon Gehr <timon.gehr gmx.ch> wrote:

 On 09/01/2011 10:57 PM, Steven Schveighoffer wrote:
 On Thu, 01 Sep 2011 16:26:55 -0400, Timon Gehr <timon.gehr gmx.ch>  
 wrote:

 On 09/01/2011 09:41 PM, Don wrote:
 On 31.08.2011 14:35, Timon Gehr wrote:
 On 08/31/2011 04:41 AM, Jonathan M Davis wrote:
 Objects would have writeTo and toString would presumably be
 deprecated.

 I have never understood the rationale behind deprecating toString  
 once
 we have writeTo. Why should it be deprecated?

 Code bloat. Every struct contains string toString().
 Quite unnecessarily, since it can always be synthesized from the more
 complete version.

 I was just suggesting to keep the existing support for toString()
 inside to, format etc. Of course, all the structs in Phobos should
 probably completely migrate to writeTo.

 Why? It's trivial to write writeTo if you have already written toString.
 There is a clear deprecation path in the DIP.

 Exactly that is my point, it is trivial and tedious.

Here, use this:

const writeToImpl = "void writeTo(void delegate(const(char)[]) sink,  
string format = null) { sink(this.toString()); }";

// Put this line in all your classes/structs where you don't feel like
// writing a proper writeTo.
mixin(writeToImpl);
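
A sketch of the mixin in use, assuming a hypothetical struct Point that already has a toString(). The string is declared as an enum here so that it is guaranteed to be a compile-time constant; otherwise it is the same text as above:

```d
import std.conv : to;

// Same mixin text as in the post, declared as a compile-time constant.
enum writeToImpl = "void writeTo(void delegate(const(char)[]) sink, "
    ~ "string format = null) { sink(this.toString()); }";

// Hypothetical type that only has the old-style toString().
struct Point
{
    int x, y;

    string toString()
    {
        return "(" ~ to!string(x) ~ ", " ~ to!string(y) ~ ")";
    }

    // Generates a writeTo that forwards the whole string to the sink.
    mixin(writeToImpl);
}
```

This keeps the convenience of toString() but gains none of the efficiency of a real writeTo, since the full string is still allocated before it reaches the sink.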

 toString is great in case
 you just want to quickly and easily convert something to a string,  
 and
 later, if formatting or more efficient output etc. is needed, the
 method
 can transparently be replaced by writeTo.

 BTW, you do realize that code using writeTo is shorter in most cases?
 The reason is, that it can omit all the calls to format().
 Pretty much the only time when toString is simpler, is when it is a
 single call to format().
 It's only really the signature which is more complicated.

 I am not convinced:

 struct S{
 int x,y,z;
 void writeTo(void delegate(const(char)[]) sink, string format = null){
 sink("(");
 .writeTo(x,sink,"d"); // still no UFCS
 sink(", ");
 .writeTo(y,sink,"d");
 sink(", ");
 .writeTo(z,sink,"d");
 sink(")");
 }

 string toString(){return "("~join(map!(to!string)([x,y,z]),", ")~")";}
 }

 Um... formattedWrite(sink, "(%d, %d, %d)", x, y, z);

 -Steve

 I see. =). Still, the signature of writeTo is about as large as my  
 entire toString function.

The sink type could be aliased.  But this is really getting into minor  
issues :)  The amount of power and performance you get by switching to  
writeTo is well worth the extra parameters.

-Steve

Sep 02 2011

Timon Gehr <timon.gehr gmx.ch> writes:

On 09/02/2011 03:47 PM, Steven Schveighoffer wrote:
 On Thu, 01 Sep 2011 17:09:30 -0400, Timon Gehr <timon.gehr gmx.ch> wrote:

 On 09/01/2011 10:57 PM, Steven Schveighoffer wrote:
 On Thu, 01 Sep 2011 16:26:55 -0400, Timon Gehr <timon.gehr gmx.ch>
 wrote:

 On 09/01/2011 09:41 PM, Don wrote:
 On 31.08.2011 14:35, Timon Gehr wrote:
 On 08/31/2011 04:41 AM, Jonathan M Davis wrote:
 Objects would have writeTo and toString would presumably be
 deprecated.

 I have never understood the rationale behind deprecating toString
 once
 we have writeTo. Why should it be deprecated?

 Code bloat. Every struct contains string toString().
 Quite unnecessarily, since it can always be synthesized from the more
 complete version.

 I was just suggesting to keep the existing support for toString()
 inside to, format etc. Of course, all the structs in Phobos should
 probably completely migrate to writeTo.

 Why? It's trivial to write writeTo if you have already written toString.
 There is a clear deprecation path in the DIP.

 Exactly that is my point, it is trivial and tedious.

 Here, use this:

 const writeToImpl = "void writeTo(void delegate(const(char)[]) sink,
 string format = null) { sink(this.toString()); }";

 // put this line in all your classes/structs where you don't feel like
 // writing a proper writeTo.
 mixin(writeToImpl);

 toString is great in case
 you just want to quickly and easily convert something to a string,
 and
 later, if formatting or more efficient output etc. is needed, the
 method
 can transparently be replaced by writeTo.

 BTW, you do realize that code using writeTo is shorter in most cases?
 The reason is, that it can omit all the calls to format().
 Pretty much the only time when toString is simpler, is when it is a
 single call to format().
 It's only really the signature which is more complicated.

 I am not convinced:

 struct S{
 int x,y,z;
 void writeTo(void delegate(const(char)[]) sink, string format = null){
 sink("(");
 .writeTo(x,sink,"d"); // still no UFCS
 sink(", ");
 .writeTo(y,sink,"d");
 sink(", ");
 .writeTo(z,sink,"d");
 sink(")");
 }

 string toString(){return "("~join(map!(to!string)([x,y,z]),", ")~")";}
 }

 Um... formattedWrite(sink, "(%d, %d, %d)", x, y, z);

 -Steve

 I see. =). Still, the signature of writeTo is about as large as my
 entire toString function.

 The sink type could be aliased. But this is really getting into minor
 issues :) The amount of power and performance you get by switching to
 writeTo is well worth the extra parameters.

I don't agree those are minor, because this is going into the standard 
library and should respect all use cases.

Basically, what should be done is:

1. provide an alias void delegate(const(char)[]) Sink; This should be in 
std.conv; or std.format;, because nobody wants to add it to every single 
module and if there is a standard way to handle it, no maintenance 
programmer will be confused by alias.
2. the format parameter should be completely optional in the signature.

Because then, writeTo wins not only at the efficiency and flexibility 
part, but also on the 'pleasant to write' part.

void writeTo(Sink s){ ... }
string toString(){ return ... }
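
A sketch of what the two signatures could look like side by side, assuming the alias is called Sink as proposed and using a hypothetical struct Pair. Neither the alias nor writeTo exists in Phobos; both are declared locally here:

```d
import std.conv : to;

// The alias proposed above, declared locally for this sketch.
alias Sink = void delegate(const(char)[]);

// Hypothetical struct with a format-free writeTo.
struct Pair
{
    int a, b;

    // Short signature: no format parameter at all.
    void writeTo(Sink s)
    {
        s("<");
        s(to!string(a));
        s(", ");
        s(to!string(b));
        s(">");
    }

    // toString() built on top of writeTo for callers that want a string.
    string toString()
    {
        string r;
        writeTo((const(char)[] x) { r ~= x; });
        return r;
    }
}
```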

Sep 02 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Fri, 02 Sep 2011 12:04:08 -0400, Timon Gehr <timon.gehr gmx.ch> wrote:

 On 09/02/2011 03:47 PM, Steven Schveighoffer wrote:

 The sink type could be aliased. But this is really getting into minor
 issues :) The amount of power and performance you get by switching to
 writeTo is well worth the extra parameters.

 I don't agree those are minor, because this is going into the standard  
 library and should respect all use cases.

 Basically, what should be done is:

 1. provide an alias void delegate(const(char)[]) Sink; This should be in  
 std.conv; or std.format;, because nobody wants to add it to every single  
 module and if there is a standard way to handle it, no maintenance  
 programmer will be confused by alias.

It needs to go into object.di, because Object needs it.

 2. the format parameter should be completely optional in the signature.

This is probably impossible.  Just for the object case alone, writeTo needs  
to be declared in Object, which means you'd have to override it with the  
same parameters.

It's one of the reasons the sink has to stick with one char width.

 Because then, writeTo wins not only at the efficiency and flexibility  
 part, but also on the 'pleasant to write' part.

 void writeTo(Sink s){ ... }
 string toString(){ return ... }

I think this works if you want to ignore the format string:

void writeTo(Sink s, string) {...}

Probably the best we can get.

-Steve

Sep 02 2011

Timon Gehr <timon.gehr gmx.ch> writes:

On 09/02/2011 06:15 PM, Steven Schveighoffer wrote:
 On Fri, 02 Sep 2011 12:04:08 -0400, Timon Gehr <timon.gehr gmx.ch> wrote:

 On 09/02/2011 03:47 PM, Steven Schveighoffer wrote:

 The sink type could be aliased. But this is really getting into minor
 issues :) The amount of power and performance you get by switching to
 writeTo is well worth the extra parameters.

 I don't agree those are minor, because this is going into the standard
 library and should respect all use cases.

 Basically, what should be done is:

 1. provide an alias void delegate(const(char)[]) Sink; This should be
 in std.conv; or std.format;, because nobody wants to add it to every
 single module and if there is a standard way to handle it, no
 maintenance programmer will be confused by alias.

 it needs to go into object.di, because Object needs it.

Object could, in theory, just use delegate(const(char)[]). But I agree 
that putting it in object.di would be the cleanest solution.

 2. the format parameter should be completely optional in the signature.

 This is probably impossible. Just for the object case alone, writeTo
 needs to be declared in Object, which means you'd have to override it
 with the same parameters.

Oh, yes, for classes it cannot work. But structs are more flexible.

 It's one of the reasons the sink has to stick with one char width.

Probably the library code should still make use of structs or classes 
that provide the appropriate overloads. If somebody is in desperate need 
of having, say, a dchar sink for their classes, they could then define 
their own root class.

 Because then, writeTo wins not only at the efficiency and flexibility
 part, but also on the 'pleasant to write' part.

 void writeTo(Sink s){ ... }
 string toString(){ return ... }

 I think this works if you want to ignore the format string:

 void writeTo(Sink s, string) {...}

 Probably the best we can get.

For classes the best we can get is
override void writeTo(Sink s, string) {...}

Because override adds quite some bloat anyways, the additional ignored 
string argument is not a big issue. But structs are more flexible than that.
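
A sketch of the class case, assuming a hypothetical root class Printable that declares writeTo (Object itself has no such method; that is what the DIP would add). The override has to repeat both parameters, but the unused one can stay unnamed:

```d
import std.conv : to;

alias Sink = void delegate(const(char)[]);

// Hypothetical root class standing in for an Object that declares writeTo.
class Printable
{
    void writeTo(Sink s, string format)
    {
        s("Printable");
    }
}

class Celsius : Printable
{
    int degrees;

    this(int degrees) { this.degrees = degrees; }

    // The format parameter must stay in the signature to match the base
    // method, but it can be left unnamed when it is ignored.
    override void writeTo(Sink s, string)
    {
        s(to!string(degrees));
        s(" C");
    }
}
```

Because the call goes through the virtual table, the signature is fixed by the base class, which is why dropping the parameter is only an option for structs.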

Sep 02 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Fri, 02 Sep 2011 13:17:02 -0400, Timon Gehr <timon.gehr gmx.ch> wrote:

 On 09/02/2011 06:15 PM, Steven Schveighoffer wrote:
 On Fri, 02 Sep 2011 12:04:08 -0400, Timon Gehr <timon.gehr gmx.ch>  
 wrote:
 2. the format parameter should be completely optional in the signature.

 This is probably impossible. Just for the object case alone, writeTo
 needs to be declared in Object, which means you'd have to override it
 with the same parameters.

 Oh, yes, for classes it cannot work. But structs are more flexible.

Yes and no.  There is a kludgy "interface" that all structs provide.  Its  
value is somewhat suspect, but it allows some RTTI for structs.  For  
example the xtoString member of the TypeInfo_Struct.

It's arguable that the value of this interface is very low -- currently it  
enables things like the builtin sort property on arrays (which I think  
should be abolished ASAP), and allows AA's current implementation (which  
does not use templates).

 It's one of the reasons the sink has to stick with one char width.

 Probably the library code should still make use of structs or classes  
 that provide the appropriate overloads. If somebody is in desperate need  
 of having, say, a dchar sink for their classes, they could then define  
 their own root class.

It's actually probably a benefit to stick with char:

1. That's the default output width for streams
2. It's the default width for what most people consider strings (in fact,  
the string type).
3. It's pretty simple to convert char[] to wchar[] or dchar[], without  
incurring much penalty.

I think the library might be able to, in the future, deal with templated  
writeTo, but there are many things that would need changing.
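
A sketch of point 3, assuming a hypothetical struct Name with a char-only writeTo and a hypothetical helper toWide. The caller collects the UTF-8 output and transcodes it once with std.conv.to, so the type itself never has to know about wchar:

```d
import std.conv : to;

// Hypothetical type that only knows how to write char data.
struct Name
{
    string text;

    void writeTo(void delegate(const(char)[]) sink, string format = null)
    {
        sink(text);
    }
}

// Caller-side translation: collect the UTF-8 pieces, then transcode once.
wstring toWide(T)(ref T value)
{
    char[] buf;
    value.writeTo((const(char)[] s) { buf ~= s; }, null);
    return to!wstring(buf);
}
```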

 Because then, writeTo wins not only at the efficiency and flexibility
 part, but also on the 'pleasant to write' part.

 void writeTo(Sink s){ ... }
 string toString(){ return ... }

 I think this works if you want to ignore the format string:

 void writeTo(Sink s, string) {...}

 Probably the best we can get.

 For classes the best we can get is
 override void writeTo(Sink s, string) {...}

 Because override adds quite some bloat anyways, the additional ignored  
 string argument is not a big issue. But structs are more flexible than  
 that.

Yes, I wouldn't be sorry to see the special treatment of certain struct  
functions go away (i.e. the kludgy "interface" mentioned above).  In that  
case, making the format part optional is fine for structs.

-Steve

Sep 02 2011

Timon Gehr <timon.gehr gmx.ch> writes:

On 09/02/2011 07:46 PM, Steven Schveighoffer wrote:
 On Fri, 02 Sep 2011 13:17:02 -0400, Timon Gehr <timon.gehr gmx.ch> wrote:

 On 09/02/2011 06:15 PM, Steven Schveighoffer wrote:
 On Fri, 02 Sep 2011 12:04:08 -0400, Timon Gehr <timon.gehr gmx.ch>
 wrote:
 2. the format parameter should be completely optional in the signature.

 This is probably impossible. Just for the object case alone, writeTo
 needs to be declared in Object, which means you'd have to override it
 with the same parameters.

 Oh, yes, for classes it cannot work. But structs are more flexible.

 Yes and no. There is a kludgy "interface" that all structs provide. Its
 value is somewhat suspect, but it allows some RTTI for structs. For
 example the xtoString member of the TypeInfo_Struct.

 It's arguable that the value of this interface is very low -- currently
 it enables things like the builtin sort property on arrays (which I
 think should be abolished ASAP), and allows AA's current implementation
 (which does not use templates).

I did know that there was some RTTI for the inefficient built-in sort, 
but I did not know that xtoString is in that interface. So basically, 
rethinking struct RTTI and changing the compiler to reflect that is the 
main thing that makes the DIP unpleasant to implement?

 It's one of the reasons the sink has to stick with one char width.

 Probably the library code should still make use of structs or classes
 that provide the appropriate overloads. If somebody is in desperate
 need of having, say, a dchar sink for their classes, they could then
 define their own root class.

 It's actually probably a benefit to stick with char:

 1. That's the default output width for streams
 2. It's the default width for what most people consider strings (in
 fact, the string type).
 3. It's pretty simple to convert char[] to wchar[] or dchar[], without
 incurring much penalty.

 I think the library might be able to, in the future, deal with templated
 writeTo, but there are many things that would need changing.

I guess 'properly' supporting wchar and dchar is not a high priority 
anyways.

 Because then, writeTo wins not only at the efficiency and flexibility
 part, but also on the 'pleasant to write' part.

 void writeTo(Sink s){ ... }
 string toString(){ return ... }

 I think this works if you want to ignore the format string:

 void writeTo(Sink s, string) {...}

 Probably the best we can get.

 For classes the best we can get is
 override void writeTo(Sink s, string) {...}

 Because override adds quite some bloat anyways, the additional ignored
 string argument is not a big issue. But structs are more flexible than
 that.

 Yes, I wouldn't be sorry to see the special treatment of certain struct
 functions go away (i.e. the kludgy "interface" mentioned above). In that
 case, making the format part optional is fine for structs.

I would be very happy to see struct RTTI go away, together with built-in 
sort. Are there other features that rely on struct RTTI or is it only 
built-in sort and AAs?

Sep 02 2011

Timon Gehr <timon.gehr gmx.ch> writes:

On 09/03/2011 01:38 AM, Timon Gehr wrote:
 I would be very happy to see struct RTTI go away, together with built-in
 sort. Are there other features that rely on struct RTTI or is it only
 built-in sort and AAs?

And the GC that needs to call destructors obv.

Sep 02 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Fri, 02 Sep 2011 19:38:23 -0400, Timon Gehr <timon.gehr gmx.ch> wrote:

 On 09/02/2011 07:46 PM, Steven Schveighoffer wrote:
 It's arguable that the value of this interface is very low -- currently
 it enables things like the builtin sort property on arrays (which I
 think should be abolished ASAP), and allows AA's current implementation
 (which does not use templates).

 I did know that there was some RTTI for the inefficient built-in sort,  
 but I did not know that xtoString is in that interface. So basically,  
 rethinking struct RTTI and changing the compiler to reflect that is the  
 main thing that makes the DIP unpleasant to implement?

Actually, I think yes, that is the main unpleasantness.  I wasn't about to  
suggest in the DIP that we should abolish even part of the RTTI interface  
for structs, it seems outside the scope.

But now that I think about it, xtoString is probably not used anywhere  
anymore.  I think it used to be used in write* functions when they were  
not templates.  AFAIK, no TypeInfo functions use xtoString, you have to  
call it directly.  The xopCmp, xopEquals, and xtoHash functions all are  
wrapped by TypeInfo virtual methods.

I'll start a new thread to talk about this.  This might make the DIP much  
easier to implement.

 It's actually probably a benefit to stick with char:

 1. That's the default output width for streams
 2. It's the default width for what most people consider strings (in
 fact, the string type).
 3. It's pretty simple to convert char[] to wchar[] or dchar[], without
 incurring much penalty.

 I think the library might be able to, in the future, deal with templated
 writeTo, but there are many things that would need changing.

 I guess 'properly' supporting wchar and dchar is not a high priority
 anyways.

Well, is it more prudent for every printable type to provide a char[],  
wchar[], and dchar[] version of writeTo, or for the things that call  
writeTo to provide translations from char[] to wchar[] and dchar[]?  In  
other words, should to!wstring(T) fail if T.writeTo(void  
delegate(const(wchar)[]) sink, wstring format) is not implemented?

It might be that char[] is used, unless the wchar[] or dchar[] version  
exists, and then it's used.  But I think setting the minimum to providing  
char[] makes type-implementor's job easier.

-Steve
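
A sketch of the "use char[] unless the wchar[] version exists" idea, done with compile-time introspection on the caller's side. All names here (CharSink, WCharSink, NarrowOnly, Both, toWide) are hypothetical, and this only works for statically known types, not through Object:

```d
import std.conv : to;

alias CharSink  = void delegate(const(char)[]);
alias WCharSink = void delegate(const(wchar)[]);

// Implements only the required char version.
struct NarrowOnly
{
    void writeTo(CharSink sink, string format) { sink("narrow"); }
}

// Additionally offers an optional wchar version.
struct Both
{
    void writeTo(CharSink sink, string format) { sink("narrow"); }
    void writeTo(WCharSink sink, wstring format) { sink("wide"w); }
}

wstring toWide(T)(ref T value)
{
    // Prefer the wchar overload when the type provides one.
    static if (is(typeof(value.writeTo(WCharSink.init, wstring.init))))
    {
        wstring result;
        value.writeTo((const(wchar)[] s) { result ~= s; }, wstring.init);
        return result;
    }
    else
    {
        // Fall back to the char version and transcode the result.
        char[] buf;
        value.writeTo((const(char)[] s) { buf ~= s; }, string.init);
        return to!wstring(buf);
    }
}
```

The type implementor only has to supply the char version; the wide overload is an optional optimization that the caller picks up when present.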

Sep 03 2011

"Simen Kjaeraas" <simen.kjaras gmail.com> writes:

On Fri, 02 Sep 2011 19:46:28 +0200, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:

 I think the library might be able to, in the future, deal with templated  
 writeTo, but there are many things that would need changing.

This would require compiler magic, not just library features. Templates
need to be instantiated, which the library can only do explicitly.

-- 
   Simen

Sep 04 2011

travert phare.normalesup.org (Christophe) writes:

 1. provide an alias void delegate(const(char)[]) Sink; This should 
 be in std.conv; or std.format;, because nobody wants to add it to 
 every single module and if there is a standard way to handle it, no 
 maintenance programmer will be confused by alias.

 it needs to go into object.di, because Object needs it.

 Object could, in theory, just use delegate(const(char)[]). But I agree 
 that putting it in object.di would be the cleanest solution.

I disagree. void delegate(const(char)[]) means something, whereas Sink 
is rather obscure. Providing the alias in the library seems fine, but 
providing it in the language is too much IMO.

Sep 03 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 9/3/11 5:41 AM, Christophe wrote:
 1. provide an alias void delegate(const(char)[]) Sink; This should
 be in std.conv; or std.format;, because nobody wants to add it to
 every single module and if there is a standard way to handle it, no
 maintenance programmer will be confused by alias.

 it needs to go into object.di, because Object needs it.

 Object could, in theory, just use delegate(const(char)[]). But I agree
 that putting it in object.di would be the cleanest solution.

 I disagree. void delegate(const(char)[]) means something, whereas Sink
 is rather obscure. Providing the alias in the library seems fine, but
 providing it in the language is too much IMO.

Even in the library "Sink" is too vague to be useful as a top-level symbol.

Andrei

Sep 03 2011

Timon Gehr <timon.gehr gmx.ch> writes:

On 09/03/2011 07:21 PM, Andrei Alexandrescu wrote:
 On 9/3/11 5:41 AM, Christophe wrote:
 1. provide an alias void delegate(const(char)[]) Sink; This should
 be in std.conv; or std.format;, because nobody wants to add it to
 every single module and if there is a standard way to handle it, no
 maintenance programmer will be confused by alias.

 it needs to go into object.di, because Object needs it.

 Object could, in theory, just use delegate(const(char)[]). But I agree
 that putting it in object.di would be the cleanest solution.

 I disagree. void delegate(const(char)[]) means something, whereas Sink
 is rather obscure.

 Providing the alias in the library seems fine, but
 providing it in the language is too much IMO.

 Even in the library "Sink" is too vague to be useful as a top-level symbol.

 Andrei

I am quite sure that useful is the same as short/too vague in this case. 
Are you suggesting not to add an alias at all?

Sep 03 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 9/3/11 4:40 PM, Timon Gehr wrote:
 On 09/03/2011 07:21 PM, Andrei Alexandrescu wrote:
 On 9/3/11 5:41 AM, Christophe wrote:
 1. provide an alias void delegate(const(char)[]) Sink; This should
 be in std.conv; or std.format;, because nobody wants to add it to
 every single module and if there is a standard way to handle it, no
 maintenance programmer will be confused by alias.

 it needs to go into object.di, because Object needs it.

 Object could, in theory, just use delegate(const(char)[]). But I agree
 that putting it in object.di would be the cleanest solution.

 I disagree. void delegate(const(char)[]) means something, whereas Sink
 is rather obscure.

 Providing the alias in the library seems fine, but
 providing it in the language is too much IMO.

 Even in the library "Sink" is too vague to be useful as a top-level
 symbol.

 Andrei

 I am quite sure that useful is the same as short/too vague in this case.
 Are you suggesting not to add an alias at all?

There are vastly better names than Sink. TextSink, TextWriter, 
StringWriter, StringSink (heh), StringStreamer, ...


Andrei

Sep 03 2011

Timon Gehr <timon.gehr gmx.ch> writes:

On 09/04/2011 05:46 AM, Andrei Alexandrescu wrote:
 On 9/3/11 4:40 PM, Timon Gehr wrote:
 On 09/03/2011 07:21 PM, Andrei Alexandrescu wrote:
 On 9/3/11 5:41 AM, Christophe wrote:
 1. provide an alias void delegate(const(char)[]) Sink; This should
 be in std.conv; or std.format;, because nobody wants to add it to
 every single module and if there is a standard way to handle it, no
 maintenance programmer will be confused by alias.

 it needs to go into object.di, because Object needs it.

 Object could, in theory, just use delegate(const(char)[]). But I agree
 that putting it in object.di would be the cleanest solution.

 I disagree. void delegate(const(char)[]) means something, whereas Sink
 is rather obscure.

 Providing the alias in the library seems fine, but
 providing it in the language is too much IMO.

 Even in the library "Sink" is too vague to be useful as a top-level
 symbol.

 Andrei

 I am quite sure that useful is the same as short/too vague in this case.
 Are you suggesting not to add an alias at all?

 There are vastly better names than Sink. TextSink, TextWriter,
 StringWriter, StringSink (heh), StringStreamer, ...


 Andrei

'string' is quite obscure/vague too, if you don't know what it is. The 
'string' alias in object.di should probably be renamed to 
TailImmutableDynamicCharArray. :o)

imho better = shorter, because that is the whole point of providing an 
alias. 'Sink' stops being vague as soon as people know what it is. That 
should be quite early.

Sep 05 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 9/5/11 8:38 AM, Timon Gehr wrote:
 On 09/04/2011 05:46 AM, Andrei Alexandrescu wrote:
 On 9/3/11 4:40 PM, Timon Gehr wrote:
 On 09/03/2011 07:21 PM, Andrei Alexandrescu wrote:
 On 9/3/11 5:41 AM, Christophe wrote:
 1. provide an alias void delegate(const(char)[]) Sink; This should
 be in std.conv; or std.format;, because nobody wants to add it to
 every single module and if there is a standard way to handle it, no
 maintenance programmer will be confused by alias.

 it needs to go into object.di, because Object needs it.

 Object could, in theory, just use delegate(const(char)[]). But I
 agree
 that putting it in object.di would be the cleanest solution.

 I disagree. void delegate(const(char)[]) means something, whereas Sink
 is rather obscure.

 Providing the alias in the library seems fine, but
 providing it in the language is too much IMO.

 Even in the library "Sink" is too vague to be useful as a top-level
 symbol.

 Andrei

 I am quite sure that useful is the same as short/too vague in this case.
 Are you suggesting not to add an alias at all?

 There are vastly better names than Sink. TextSink, TextWriter,
 StringWriter, StringSink (heh), StringStreamer, ...


 Andrei

 'string' is quite obscure/vague too, if you don't know what it is. The
 'string' alias in object.di should probably be renamed to
 TailImmutableDynamicCharArray. :o)

I strongly disagree. In many programming languages, "string" denotes the 
default abstraction for text representation. "Sink" has nowhere near 
that brand power.

 imho better = shorter, because that is the whole point of providing an
 alias. 'Sink' stops being vague as soon as people know what it is. That
 should be quite early.

Again I completely disagree with all these three statements, sorry.


Andrei

Sep 05 2011

Timon Gehr <timon.gehr gmx.ch> writes:

On 09/05/2011 03:29 PM, Andrei Alexandrescu wrote:
 On 9/5/11 8:38 AM, Timon Gehr wrote:
 On 09/04/2011 05:46 AM, Andrei Alexandrescu wrote:
 On 9/3/11 4:40 PM, Timon Gehr wrote:
 On 09/03/2011 07:21 PM, Andrei Alexandrescu wrote:
 On 9/3/11 5:41 AM, Christophe wrote:
 1. provide an alias void delegate(const(char)[]) Sink; This should
 be in std.conv; or std.format;, because nobody wants to add it to
 every single module and if there is a standard way to handle
 it, no
 maintenance programmer will be confused by alias.

 it needs to go into object.di, because Object needs it.

 Object could, in theory, just use delegate(const(char)[]). But I
 agree
 that putting it in object.di would be the cleanest solution.

 I disagree. void delegate(const(char)[]) means something, whereas
 Sink
 is rather obscure.

 Providing the alias in the library seems fine, but
 providing it in the language is too much IMO.

 Even in the library "Sink" is too vague to be useful as a top-level
 symbol.

 Andrei

 I am quite sure that useful is the same as short/too vague in this
 case.
 Are you suggesting not to add an alias at all?

 There are vastly better names than Sink. TextSink, TextWriter,
 StringWriter, StringSink (heh), StringStreamer, ...


 Andrei

 'string' is quite obscure/vague too, if you don't know what it is. The
 'string' alias in object.di should probably be renamed to
 TailImmutableDynamicCharArray. :o)

 I strongly disagree. In many programming languages, "string" denotes the
 default abstraction for text representation. "Sink" has nowhere near
 that brand power.

Sure, but that was not a valid argument when the term was introduced.

BTW: http://www.google.com/search?channel=fs&q=string&um=1&tbm=isch

 imho better = shorter, because that is the whole point of providing an
 alias. 'Sink' stops being vague as soon as people know what it is. That
 should be quite early.

 Again I completely disagree with all these three statements, sorry.

Why would you want to have an alias if not to relieve people from 
writing cumbersome boilerplate?
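
To make the trade-off concrete, here is a minimal sketch of the same method signature written with and without an alias. The name TextSink is borrowed from the list of alternatives quoted above and Point is a made-up type; neither reflects a decision reached in this thread.

```d
// Hypothetical alias; the thread settles on neither a name nor a module for it.
alias TextSink = void delegate(const(char)[]);

struct Point
{
    int x, y;

    // Signature with the delegate type spelled out in full.
    void describeLong(scope void delegate(const(char)[]) sink) const
    {
        sink("Point");
    }

    // The same signature written through the alias.
    void describeShort(scope TextSink sink) const
    {
        sink("Point");
    }
}
```

Both methods accept exactly the same delegates, so the alias only changes what a reader sees in the signature.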

Sep 05 2011

Sean Kelly <sean invisibleduck.org> writes:

On Sep 3, 2011, at 2:41 AM, Christophe wrote:

 1. provide an alias void delegate(const(char)[]) Sink; This should
 be in std.conv; or std.format;, because nobody wants to add it to
 every single module and if there is a standard way to handle it, no
 maintenance programmer will be confused by alias.

 it needs to go into object.di, because Object needs it.

 Object could, in theory, just use delegate(const(char)[]). But I agree
 that putting it in object.di would be the cleanest solution.

 I disagree. void delegate(const(char)[]) means something, whereas Sink
 is rather obscure. Providing the alias in the library seems fine, but
 providing it in the language is too much IMO.

It would be really great if the new toString call could be compatible
with whatever serialization mechanism is added.  This probably wouldn't
allow the use of format strings though.

Sep 05 2011

kennytm <kennytm gmail.com> writes:

Timon Gehr <timon.gehr gmx.ch> wrote:
 On 09/01/2011 09:41 PM, Don wrote:
 On 31.08.2011 14:35, Timon Gehr wrote:
 On 08/31/2011 04:41 AM, Jonathan M Davis wrote:
 Objects would have writeTo and toString would presumably be
 deprecated.
 

 
 I have never understood the rationale behind deprecating toString once
 we have writeTo. Why should it be deprecated?

 
 Code bloat. Every struct contains string toString().
 Quite unnecessarily, since it can always be synthesized from the more
 complete version.

 
 I was just suggesting to keep the existing support for toString() inside
 to, format etc. Of course, all the structs in Phobos should probably
 completely migrate to writeTo.
 
 
 toString is great in case
 you just want to quickly and easily convert something to a string, and
 later, if formatting or more efficient output etc. is needed, the method
 can transparently be replaced by writeTo.

 
 BTW, you do realize that code using writeTo is shorter in most cases?
 The reason is, that it can omit all the calls to format().
 Pretty much the only time when toString is simpler, is when it is a
 single call to format().
 It's only really the signature which is more complicated.
 

 
 I am not convinced:
 
 struct S{
     int x,y,z;
     void writeTo(void delegate(const(char)[]) sink, string format = null){
         sink("(");
         .writeTo(x,sink,"d"); // still no UFCS
         sink(", ");
         .writeTo(y,sink,"d");
         sink(", ");
         .writeTo(z,sink,"d");
         sink(")");
     }
 
     string toString(){return "("~join(map!(to!string)([x,y,z]),", ")~")";}
 }

to!string of array can support multiple arguments:

string toString() {
    return to!string([x, y, z], "(", ", ", ")");
}

and I believe writeTo could be made to accept extra arguments too:

struct S {
void writeTo(SomeType sink, const char[] format = null) {
    [x, y, z].writeTo(sink, format, "(", ", ", ")");
}
...
}

void writeTo(T)(T[] arr, SomeType sink, const char[] format = null, const
char[] open = "[", etc) {
  ...
}
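
For comparison, a sketch of the same "(x, y, z)" output using the compound range specifier of std.format. This assumes a Phobos release that supports %( ... %); it is not one of the signatures proposed in this thread, and the struct is only a stand-in for the S used above.

```d
import std.format : format;

struct S
{
    int x, y, z;

    string toString() const
    {
        // %( ... %) applies the inner %d to every element of the array;
        // the text between %d and %) is used as the separator.
        // Note that the array literal itself allocates.
        return format("(%(%d, %))", [x, y, z]);
    }
}
```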

Sep 01 2011

Timon Gehr <timon.gehr gmx.ch> writes:

On 09/01/2011 11:15 PM, kennytm wrote:
 Timon Gehr<timon.gehr gmx.ch>  wrote:
 On 09/01/2011 09:41 PM, Don wrote:
 On 31.08.2011 14:35, Timon Gehr wrote:
 On 08/31/2011 04:41 AM, Jonathan M Davis wrote:
 Objects would have writeTo and toString would presumably be
 deprecated.

 I have never understood the rationale behind deprecating toString once
 we have writeTo. Why should it be deprecated?

 Code bloat. Every struct contains string toString().
 Quite unnecessarily, since it can always be synthesized from the more
 complete version.

 I was just suggesting to keep the existing support for toString() inside
 to, format etc. Of course, all the structs in Phobos should probably
 completely migrate to writeTo.

 toString is great in case
 you just want to quickly and easily convert something to a string, and
 later, if formatting or more efficient output etc. is needed, the method
 can transparently be replaced by writeTo.

 BTW, you do realize that code using writeTo is shorter in most cases?
 The reason is, that it can omit all the calls to format().
 Pretty much the only time when toString is simpler, is when it is a
 single call to format().
 It's only really the signature which is more complicated.

 I am not convinced:

 struct S{
      int x,y,z;
      void writeTo(void delegate(const(char)[]) sink, string format = null){
          sink("(");
          .writeTo(x,sink,"d"); // still no UFCS
          sink(", ");
          .writeTo(y,sink,"d");
          sink(", ");
          .writeTo(z,sink,"d");
          sink(")");
      }

      string toString(){return "("~join(map!(to!string)([x,y,z]),", ")~")";}
 }

 to!string of array can support multiple arguments:

 string toString() {
      return to!string([x, y, z], "(", ", ", ")");
 }

This runs in ~66% of the time of Steve's formattedWrite solution.
(if the delegate just appends to some string variable)


 and I believe writeTo could be made to accept extra arguments too:

 struct S {
 void writeTo(SomeType sink, const char[] format = null) {
      [x, y, z].writeTo(sink, format, "(", ", ", ")");
 }
 ...
 }

 void writeTo(T)(T[] arr, SomeType sink, const char[] format = null, const
 char[] open = "[", etc) {
    ...
 }

ok, getting better. But still, I think to!string should remain able to
use toString if available. (in this case, a 33% speed advantage!)

Sep 01 2011

Don <nospam nospam.com> writes:

On 01.09.2011 23:35, Timon Gehr wrote:
 On 09/01/2011 11:15 PM, kennytm wrote:
 Timon Gehr<timon.gehr gmx.ch> wrote:
 On 09/01/2011 09:41 PM, Don wrote:
 On 31.08.2011 14:35, Timon Gehr wrote:
 On 08/31/2011 04:41 AM, Jonathan M Davis wrote:
 Objects would have writeTo and toString would presumably be
 deprecated.

 I have never understood the rationale behind deprecating toString once
 we have writeTo. Why should it be deprecated?

 Code bloat. Every struct contains string toString().
 Quite unnecessarily, since it can always be synthesized from the more
 complete version.

 I was just suggesting to keep the existing support for toString() inside
 to, format etc. Of course, all the structs in Phobos should probably
 completely migrate to writeTo.

 toString is great in case
 you just want to quickly and easily convert something to a string, and
 later, if formatting or more efficient output etc. is needed, the
 method
 can transparently be replaced by writeTo.

 BTW, you do realize that code using writeTo is shorter in most cases?
 The reason is, that it can omit all the calls to format().
 Pretty much the only time when toString is simpler, is when it is a
 single call to format().
 It's only really the signature which is more complicated.

 I am not convinced:

 struct S{
 int x,y,z;
 void writeTo(void delegate(const(char)[]) sink, string format = null){
 sink("(");
 .writeTo(x,sink,"d"); // still no UFCS
 sink(", ");
 .writeTo(y,sink,"d");
 sink(", ");
 .writeTo(z,sink,"d");
 sink(")");
 }

 string toString(){return "("~join(map!(to!string)([x,y,z]),", ")~")";}
 }

 to!string of array can support multiple arguments:

 string toString() {
 return to!string([x, y, z], "(", ", ", ")");
 }

 This runs in ~66% of the time of Steve's formattedWrite solution.
 (if the delegate just appends to some string variable)


 and I believe writeTo could be made to accept extra arguments too:

 struct S {
 void writeTo(SomeType sink, const char[] format = null) {
 [x, y, z].writeTo(sink, format, "(", ", ", ")");
 }
 ...
 }

 void writeTo(T)(T[] arr, SomeType sink, const char[] format = null, const
 char[] open = "[", etc) {
 ...
 }

 ok, getting better. But still, I think to!string should remain to be
 able to use toString if available. (in this case, a 33% speed advantage!)

If you're concerned about speed, the writeTo method is much quicker, 
since it doesn't require any heap activity at all.
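
A rough sketch of what "no heap activity" can look like: the sink copies each piece into a caller-supplied buffer, which may live on the stack. The helper names putInt and renderInto are invented for this illustration, and the format argument of the proposed writeTo is ignored here.

```d
// Emit the decimal digits of an int through the sink, using a small stack buffer.
void putInt(scope void delegate(const(char)[]) sink, int value)
{
    char[12] tmp;                       // enough for "-2147483648"
    size_t pos = tmp.length;
    // Work on the magnitude as unsigned so that int.min is handled too.
    uint u = value < 0 ? 0u - cast(uint) value : cast(uint) value;
    do
    {
        tmp[--pos] = cast(char)('0' + u % 10);
        u /= 10;
    } while (u != 0);
    if (value < 0)
        tmp[--pos] = '-';
    sink(tmp[pos .. $]);
}

struct S
{
    int x, y, z;

    // Shape of the proposed writeTo; the format string is unused in this sketch.
    void writeTo(scope void delegate(const(char)[]) sink, string format = null) const
    {
        sink("(");
        putInt(sink, x);
        sink(", ");
        putInt(sink, y);
        sink(", ");
        putInt(sink, z);
        sink(")");
    }
}

// Render into buf and return the filled part; a range error is raised
// if buf is too small.
char[] renderInto(ref const S s, char[] buf)
{
    size_t used = 0;
    s.writeTo((const(char)[] piece) {
        buf[used .. used + piece.length] = piece[];
        used += piece.length;
    });
    return buf[0 .. used];
}
```

Whether this beats a toString that builds one string depends on what the caller does with the result; the sketch only shows that the output path itself does not need to build an intermediate string.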

Sep 01 2011

Timon Gehr <timon.gehr gmx.ch> writes:

On 09/02/2011 03:29 AM, Don wrote:
 On 01.09.2011 23:35, Timon Gehr wrote:
 On 09/01/2011 11:15 PM, kennytm wrote:
 Timon Gehr<timon.gehr gmx.ch> wrote:
 On 09/01/2011 09:41 PM, Don wrote:
 On 31.08.2011 14:35, Timon Gehr wrote:
 On 08/31/2011 04:41 AM, Jonathan M Davis wrote:
 Objects would have writeTo and toString would presumably be
 deprecated.

 I have never understood the rationale behind deprecating toString
 once
 we have writeTo. Why should it be deprecated?

 Code bloat. Every struct contains string toString().
 Quite unnecessarily, since it can always be synthesized from the more
 complete version.

 I was just suggesting to keep the existing support for toString()
 inside
 to, format etc. Of course, all the structs in Phobos should probably
 completely migrate to writeTo.

 toString is great in case
 you just want to quickly and easily convert something to a string,
 and
 later, if formatting or more efficient output etc. is needed, the
 method
 can transparently be replaced by writeTo.

 BTW, you do realize that code using writeTo is shorter in most cases?
 The reason is, that it can omit all the calls to format().
 Pretty much the only time when toString is simpler, is when it is a
 single call to format().
 It's only really the signature which is more complicated.

 I am not convinced:

 struct S{
 int x,y,z;
 void writeTo(void delegate(const(char)[]) sink, string format = null){
 sink("(");
 .writeTo(x,sink,"d"); // still no UFCS
 sink(", ");
 .writeTo(y,sink,"d");
 sink(", ");
 .writeTo(z,sink,"d");
 sink(")");
 }

 string toString(){return "("~join(map!(to!string)([x,y,z]),", ")~")";}
 }

 to!string of array can support multiple arguments:

 string toString() {
 return to!string([x, y, z], "(", ", ", ")");
 }

 This runs in ~66% of the time of Steve's formattedWrite solution.
 (if the delegate just appends to some string variable)


 and I believe writeTo could be made to accept extra arguments too:

 struct S {
 void writeTo(SomeType sink, const char[] format = null) {
 [x, y, z].writeTo(sink, format, "(", ", ", ")");
 }
 ...
 }

 void writeTo(T)(T[] arr, SomeType sink, const char[] format = null,
 const
 char[] open = "[", etc) {
 ...
 }

 ok, getting better. But still, I think to!string should remain to be
 able to use toString if available. (in this case, a 33% speed advantage!)

 If you're concerned about speed, the writeTo method is much quicker,
 since it doesn't require any heap activity at all.

allocating a new string on the heap always requires heap activity. I was 
benchmarking to!string with toString and with what would probably be the 
solution for writeTo, and toString was quicker.

writeTo can support a variety of other use cases where it is much 
quicker of course, and I consider it a very worthy addition. I just 
think that the support for data types with a toString method should not 
just disappear.

Sep 02 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Fri, 02 Sep 2011 06:17:53 -0400, Timon Gehr <timon.gehr gmx.ch> wrote:

 On 09/02/2011 03:29 AM, Don wrote:

 If you're concerned about speed, the writeTo method is much quicker,
 since it doesn't require any heap activity at all.

 allocating a new string on the heap always requires heap activity. I was  
 benchmarking to!string with toString and with what would probably be the  
 solution for writeTo, and toString was quicker.

Simple appending is slow.  There are better ways to do it.  For example,  
use Appender.

writeTo should be faster than toString for most cases, on the principle that
it can generally avoid *any* heap allocations.  Ideally, to!string should  
use a stack-allocated buffer and idup it to get the final string.  Of  
course, toString for simple cases, like printing one integer, can be  
optimized in a toString method better than writeTo.
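
As an illustration of the Appender suggestion, a string can be synthesized from a sink-based method along these lines. toStringVia and Tag are hypothetical names, and writeTo has the shape of the proposal rather than any shipped API.

```d
import std.array : appender;

// Hypothetical helper: build a string from any type that offers the
// proposed writeTo(sink, format) method.
string toStringVia(T)(auto ref const T value)
{
    auto app = appender!string();
    value.writeTo((const(char)[] piece) { app.put(piece); });
    return app.data;
}

struct Tag
{
    string key, value;

    // The pieces are handed to the sink as they are; no concatenation happens here.
    void writeTo(scope void delegate(const(char)[]) sink, string format = null) const
    {
        sink(key);
        sink("=");
        sink(value);
    }
}
```

The only string built is the final one inside the helper; a caller that just wants to print can pass its own sink and skip even that.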

 writeTo can support a variety of other use cases where it is much  
 quicker of course, and I consider it a very worthy addition. I just  
 think that the support for data types with a toString method should not  
 just disappear.

There are very few (if any) use cases for toString that don't involve  
showing the result, for which allocating a string is typically wasted  
cycles/space.  toString is not a serializable form of an object, I don't  
see why it should be encouraged.

-Steve

Sep 02 2011

Timon Gehr <timon.gehr gmx.ch> writes:

On 09/02/2011 03:59 PM, Steven Schveighoffer wrote:
 On Fri, 02 Sep 2011 06:17:53 -0400, Timon Gehr <timon.gehr gmx.ch> wrote:

 On 09/02/2011 03:29 AM, Don wrote:

 If you're concerned about speed, the writeTo method is much quicker,
 since it doesn't require any heap activity at all.

 allocating a new string on the heap always requires heap activity. I
 was benchmarking to!string with toString and with what would probably
 be the solution for writeTo, and toString was quicker.

 Simple appending is slow. There are better ways to do it. For example,
 use Appender.

Appender does not help in this case, I have tested that.


 writeTo should be faster than toString for most cases, on the principle that
 it can generally avoid *any* heap allocations. Ideally, to!string should
 use a stack-allocated buffer and idup it to get the final string.

 Of course, toString for simple cases, like printing one integer, can be
 optimized in a toString method better than writeTo.

 writeTo can support a variety of other use cases where it is much
 quicker of course, and I consider it a very worthy addition. I just
 think that the support for data types with a toString method should
 not just disappear.

 There are very few (if any) use cases for toString that don't involve
 showing the result, for which allocating a string is typically wasted
 cycles/space. toString is not a serializable form of an object, I don't
 see why it should be encouraged.

Point taken. Is there anything else that stops writeTo from being 
implemented? The DIP has been around for quite some time now. (and 
having only toString is certainly not optimal.)

Sep 02 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Fri, 02 Sep 2011 11:46:19 -0400, Timon Gehr <timon.gehr gmx.ch> wrote:

 On 09/02/2011 03:59 PM, Steven Schveighoffer wrote:
 On Fri, 02 Sep 2011 06:17:53 -0400, Timon Gehr <timon.gehr gmx.ch>  
 wrote:

 On 09/02/2011 03:29 AM, Don wrote:

 If you're concerned about speed, the writeTo method is much quicker,
 since it doesn't require any heap activity at all.

 allocating a new string on the heap always requires heap activity. I
 was benchmarking to!string with toString and with what would probably
 be the solution for writeTo, and toString was quicker.

 Simple appending is slow. There are better ways to do it. For example,
 use Appender.

 Appender does not help in this case, I have tested that.

Two things:

  - Make sure you give appender a stack-allocated buffer, otherwise you  
incur a penalty of allocating the appending buffer on the heap
  - Appender currently allocates its implementation on the heap.  We need a  
stack-based version to make this work the best.  I think there is a bug  
report somewhere, where someone created a patch (not github'd) that  
provides a start.

 writeTo should be faster than toString for most cases, on the principle that
 it can generally avoid *any* heap allocations. Ideally, to!string should
 use a stack-allocated buffer and idup it to get the final string.

 Of course, toString for simple cases, like printing one integer, can be
 optimized in a toString method better than writeTo.

 writeTo can support a variety of other use cases where it is much
 quicker of course, and I consider it a very worthy addition. I just
 think that the support for data types with a toString method should
 not just disappear.

 There are very few (if any) use cases for toString that don't involve
 showing the result, for which allocating a string is typically wasted
 cycles/space. toString is not a serializable form of an object, I don't
 see why it should be encouraged.

 Point taken. Is there anything else that stops writeTo from being  
 implemented? The DIP has been around for quite some time now. (and  
 having only toString is certainly not optimal.)

No, someone just has to do it.  I think the DIP is fairly complete.  I  
might try my hand at it in the coming months if someone else doesn't.

I don't have the knowledge to do the compiler pieces yet, but I should be  
able to get pretty far without that.

-Steve

Sep 02 2011

Jacob Carlborg <doob me.com> writes:

On 2011-09-01 22:26, Timon Gehr wrote:
 I am not convinced:

 struct S{
 int x,y,z;
 void writeTo(void delegate(const(char)[]) sink, string format = null){
 sink("(");
 .writeTo(x,sink,"d"); // still no UFCS
 sink(", ");
 .writeTo(y,sink,"d");
 sink(", ");
 .writeTo(z,sink,"d");
 sink(")");
 }

 string toString(){return "("~join(map!(to!string)([x,y,z]),", ")~")";}
 }

Note that to!string and/or write(f)(ln) could be implemented to inspect 
the fields and just print them in some standard format. This would allow 
you to skip implementing toString/writeTo in simple cases like the above.
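
A minimal sketch of that idea, assuming a generic helper (here called fieldsToString, a made-up name) that walks the fields with .tupleof. The output format is chosen arbitrarily to match the example above.

```d
import std.conv : to;

// Hypothetical fallback: render any struct as "(field1, field2, ...)".
string fieldsToString(T)(auto ref const T value)
if (is(T == struct))
{
    string result = "(";
    // Iterating over .tupleof is unrolled at compile time, one step per field.
    foreach (i, field; value.tupleof)
    {
        static if (i != 0)
            result ~= ", ";
        result ~= to!string(field);
    }
    result ~= ")";
    return result;
}

struct Vec3
{
    int x, y, z;
}
```

Each distinct struct type produces its own instantiation of the helper, which is the kind of cost raised in the reply that follows.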

-- 
/Jacob Carlborg

Sep 01 2011

Andrej Mitrovic <andrej.mitrovich gmail.com> writes:

On 9/2/11, Jacob Carlborg <doob me.com> wrote:
 Note that to!string and/or write(f)(ln) could be implemented to inspect
 the fields and just print them in some standard format. This would allow
 you to skip implementing toString/writeTo in simple cases like the above.

http://codepad.org/1PZY7YTX

But I'm pretty sure this suffers from template instantiation bloat. I
had a similar template like this (a bit more complex though) and the
compilation speed slowed down considerably on every instantiation.

Sep 02 2011

Jacob Carlborg <doob me.com> writes:

On 2011-09-02 19:16, Andrej Mitrovic wrote:
 On 9/2/11, Jacob Carlborg<doob me.com>  wrote:
 Note that to!string and/or write(f)(ln) could be implemented to inspect
 the fields and just print them in some standard format. This would allow
 you to skip implementing toString/writeTo in simple cases like the above.

 http://codepad.org/1PZY7YTX

 But I'm pretty sure this suffers from template instantiation bloat. I
 had a similar template like this (a bit more complex though) and the
 compilation speed slowed down considerably on every instantiation.

If this is implemented in std.conv.to, how would that add any more 
template bloat?

-- 
/Jacob Carlborg

Sep 02 2011

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Tue, 30 Aug 2011 19:41:37 -0700, Jonathan M Davis wrote:

 On Tuesday, August 30, 2011 20:59:06 Paul D. Anderson wrote:
 Can someone clarify for me the status and/or direction of string
 formatting in D?
 
 We've got:
 
 1. toString, the object method with no parameters. 2. toString(sink,
 format)
 3. to!string()
 4. format
 5. writef/writefln
 6. write/writeln
 
 I realize these exist for various reasons, some (1,3) are simple
 (unformatted) conversions, others (2,4-6) are designed to provide
 configurable formatting. The problem is that they are inconsistent with
 each other.
 
 Using std.bigint as an example: 1, 3, 4 and 6 don't work, or don't work
 as expected (to me at least). 1. prints 'BigInt', 3 and 4 are compile
 errors.
 
 I know bigint is a controversial example because Don has strong

 an opinion one way or the other but I need to know what to implement in
 my arbitrary-precision floating point module. This obviously relies
 heavily on bigint.
 
 So, is there a transition underway in the language (or just Phobos)
 from toString, writeln and format, to toString(sink,format) and
 writefln?
 
 Or is this just a divergence of views, both of which are acceptable and
 we'll have to get used to choosing one or the other?
 
 Or am I just mistaken in believing there is any significant conflict?
 
 I apologize if this has already been hashed out in the past and, if so,
 I would appreciate someone pointing me to that discussion. (Or just the
 results of the discussion.)

 
 At this point, it's toString with no parameters. Don's completely out in
 left field with regards to how things currently work. I believe that
 BigInt is the _only_ example of toString(sink, format).

Actually, std.complex.Complex also has toString(sink, format), and Don 
even fixed the write* functions so that they work with both Complex and 
BigInt.  The following works as you'd expect:

    BigInt i = "1234567890";
    writeln(i);

    auto z = complex(123.4, 5678.9);
    writefln("%.10e", z);

There is only one important missing piece here:  std.conv.to!string 
should be implemented to call toString with an appropriate sink and "%s" 
as the format string.
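
For a user-defined type, the sink-based form can be sketched as follows. Celsius is an invented example; the FormatSpec-taking signature is, as far as I know, one that current std.format recognizes, and the sketch assumes that to!string is routed through the same machinery.

```d
import std.conv : to;
import std.format : FormatSpec, format, formatValue;

struct Celsius
{
    int degrees;

    // Sink-based toString that forwards the caller's format spec to the field.
    void toString(scope void delegate(const(char)[]) sink,
                  scope const ref FormatSpec!char fmt) const
    {
        formatValue(sink, degrees, fmt);
        sink(" C");
    }
}
```

With this in place, format("%s", Celsius(21)) and to!string(Celsius(21)) should both go through the sink, and a width such as "%5d" is applied to the number only.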

-Lars

Sep 01 2011

bearophile <bearophileHUGS lycos.com> writes:

Paul D. Anderson:

 Can someone clarify for me the status and/or direction of string formatting in
D?

From a practical point of view, a good starting point is to review (and
eventually fix) and put this into DMD 2.055:
https://github.com/D-Programming-Language/phobos/pull/126

(It doesn't solve the problems with BigInt, and I don't know if it solves bug
6529. But it solves most problems I see with D textual output).

Bye,
bearophile

Aug 30 2011

Paul D. Anderson <paul.d.removethis.anderson comcast.andthis.net> writes:

Paul D. Anderson Wrote:

 Can someone clarify for me the status and/or direction of string formatting in
D? 
 
 We've got:
 
 1. toString, the object method with no parameters.
 2. toString(sink, format)
 3. to!string()
 4. format
 5. writef/writefln
 6. write/writeln
 
 I realize these exist for various reasons, some (1,3) are simple (unformatted)
conversions, others (2,4-6) are designed to provide configurable formatting.
The problem is that they are inconsistent with each other.
 
 Using std.bigint as an example: 1, 3, 4 and 6 don't work, or don't work as
expected (to me at least). 1. prints 'BigInt', 3 and 4 are compile errors.
 
 I know bigint is a controversial example because Don has strong feelings

or the other but I need to know what to implement in my arbitrary-precision
floating point module. This obviously relies heavily on bigint.
 
 So, is there a transition underway in the language (or just Phobos) from
toString, writeln and format, to toString(sink,format) and writefln?
 
 Or is this just a divergence of views, both of which are acceptable and we'll
have to get used to choosing one or the other?
 
 Or am I just mistaken in believing there is any significant conflict?
 
 I apologize if this has already been hashed out in the past and, if so, I
would appreciate someone pointing me to that discussion. (Or just the results
of the discussion.)
 
 Paul

So, IIUC, toString has its faults but it has deep-rooted user expectations,
while toString(sink, format) [or writeTo(sink, format)] is a better
implementation, but the current state of development doesn't have a lot of
support for it. 

With respect to the Java implementation: they provide the two number-to-string
functions called out in the specification, i.e., toScientificString and
toEngineeringString and use the toScientificString method as the default
toString. One of the big advantages of doing this is that the read and write
routines are complementary -- writing out a number and reading it back in
results in not just the same value, but the same internal representation. (This
is one of the goals of the specification.)

Based on this, my proposal for the BigDecimal type is to provide similar
functionality -- the two functions listed above, with the first being called by
the toString function. In addition the toString(sink, format) function will be
provided, and/or whatever it takes to work with format and writef.

I poked around a little in the std.stdio and std.format source code and I see
that stdio.writef calls std.format.formattedWrite, so anything that works for
the one should work for the other.
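
A rough sketch of that layout, using an invented Dec type with a placeholder notation. The format produced by toSciString below is NOT the scientific-string format of the specification; it only stands in for it so that the forwarding is visible.

```d
import std.format : format;

struct Dec
{
    long coefficient;
    int exponent;

    // Placeholder notation only, e.g. "123E-2"; a real implementation
    // would follow the specification's to-scientific-string rules.
    string toSciString() const
    {
        return format("%dE%+d", coefficient, exponent);
    }

    // Plain toString simply forwards to the default named conversion.
    string toString() const
    {
        return toSciString();
    }

    // Sink-based overload for format and writef; it reuses the same text.
    void toString(scope void delegate(const(char)[]) sink) const
    {
        sink(toSciString());
    }
}
```

Because both overloads produce the same text, it does not matter which one the formatting functions pick.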

I don't know what is required to make to!string work but that is a discussion
for another day.

Please advise if I've misunderstood.

Thanks, 

Paul

Aug 31 2011

Timon Gehr <timon.gehr gmx.ch> writes:

On 08/31/2011 11:00 PM, Paul D. Anderson wrote:
 Paul D. Anderson Wrote:

 Can someone clarify for me the status and/or direction of string formatting in
D?

 We've got:

 1. toString, the object method with no parameters.
 2. toString(sink, format)
 3. to!string()
 4. format
 5. writef/writefln
 6. write/writeln

 I realize these exist for various reasons, some (1,3) are simple (unformatted)
conversions, others (2,4-6) are designed to provide configurable formatting.
The problem is that they are inconsistent with each other.

 Using std.bigint as an example: 1, 3, 4 and 6 don't work, or don't work as
expected (to me at least). 1. prints 'BigInt', 3 and 4 are compile errors.

 I know bigint is a controversial example because Don has strong feelings

or the other but I need to know what to implement in my arbitrary-precision
floating point module. This obviously relies heavily on bigint.

 So, is there a transition underway in the language (or just Phobos) from
toString, writeln and format, to toString(sink,format) and writefln?

 Or is this just a divergence of views, both of which are acceptable and we'll
have to get used to choosing one or the other?

 Or am I just mistaken in believing there is any significant conflict?

 I apologize if this has already been hashed out in the past and, if so, I
would appreciate someone pointing me to that discussion. (Or just the results
of the discussion.)

 Paul

 So, IIUC, toString has its faults but it has deep-rooted user expectations,
while toString(sink, format) [or writeTo(sink, format)] is a better
implementation, but the current state of development doesn't have a lot of
support for it.

 With respect to the Java implementation: they provide the two number-to-string
functions called out in the specification, i.e., toScientificString and
toEngineeringString and use the toScientificString method as the default
toString. One of the big advantages of doing this is that the read and write
routines are complementary -- writing out a number and reading it back in
results in not just the same value, but the same internal representation. (This
is one of the goals of the specification.)

 Based on this, my proposal for the BigDecimal type is to provide similar
functionality -- the two functions listed above, with the first being called by
the toString function. In addition the toString(sink, format) function will be
provided, and/or whatever it takes to work with format and writef.

 I poked around a little in the std.stdio and std.format source code and I see
that stdio.writef calls std.format.formattedWrite, so anything that works for
the one should work for the other.

 I don't know what is required to make to!string work but that is a discussion
for another day.

 Please advise if I've misunderstood.

I think your approach is what std.bigint should do too. Great!
to!string will already work, because you provide the toString() member 
function.

Aug 31 2011

kenji hara <k.hara.pg gmail.com> writes:

2011/8/31 Jonathan M Davis <jmdavisProg gmx.com>:
 Unfortunately however, the proposal seems to have gone nowhere thus far. Until
 it does, pretty much every object is just going to use toString without
 parameters, and the problems with BigInt's toString remain. However, if the
 proposal actually gets implemented, then the issue should then be able to be
 sorted out. Objects would have writeTo and toString would presumably be
 deprecated.

I have posted a pull request to fix BigInt's formatting with writef(ln)
<- formattedWrite().
https://github.com/D-Programming-Language/phobos/pull/230

Kenji Hara

Sep 02 2011

bearophile <bearophileHUGS lycos.com> writes:

Kenji Hara:

 I have posted pull request to fix BigInt's formatting with writef(ln)
 <- formattedWrite().
 https://github.com/D-Programming-Language/phobos/pull/230

You are doing good work! I hope to see your patches in the final release of DMD
2.055!

Bye,
bearophile

Sep 02 2011

Timon Gehr <timon.gehr gmx.ch> writes:

On 09/02/2011 11:15 PM, kenji hara wrote:
 2011/8/31 Jonathan M Davis<jmdavisProg gmx.com>:
 Unfortunately however, the proposal seems to have gone nowhere thus far. Until
 it does, pretty much every object is just going to use toString without
 parameters, and the problems with BigInt's toString remain. However, if the
 proposal actually gets implemented, then the issue should then be able to be
 sorted out. Objects would have writeTo and toString would presumably be
 deprecated.

 I have posted pull request to fix BigInt's formatting with writef(ln)
 <- formattedWrite().
 https://github.com/D-Programming-Language/phobos/pull/230

 Kenji Hara

Thank you very much! That is really useful.

Sep 02 2011

Paul D. Anderson <paul.d.removethis.anderson comcast.andthis.net> writes:

kenji hara Wrote:

 2011/8/31 Jonathan M Davis <jmdavisProg gmx.com>:
 Unfortunately however, the proposal seems to have gone nowhere thus far. Until
 it does, pretty much every object is just going to use toString without
 parameters, and the problems with BigInt's toString remain. However, if the
 proposal actually gets implemented, then the issue should then be able to be
 sorted out. Objects would have writeTo and toString would presumably be
 deprecated.

 
 I have posted pull request to fix BigInt's formatting with writef(ln)
 <- formattedWrite().
 https://github.com/D-Programming-Language/phobos/pull/230
 
 Kenji Hara

There are problems with opCmp as well. The "<" and ">" operators won't compile
if either argument is a const BigInt, so a lot of otherwise unnecessary copying
is required.

You can see this in the following functions, which I've had to add to my
BigDecimal package:

private BigInt abs(const BigInt num) {
    BigInt big = copy(num);
    return big < BigInt(0) ? -big : big;
}

private BigInt copy(const BigInt num) {
    BigInt big = cast(BigInt)num;
    return big;
}

private int sgn(const BigInt num) {
    BigInt zero = BigInt(0);
    BigInt big = copy(num);
    if (big < zero) return -1;
    if (big > zero) return 1;
    return 0;
}

(I'd be happy to learn there's a better way to implement these.)

Paul

Sep 03 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Saturday, September 03, 2011 17:03:33 Paul D. Anderson wrote:
 kenji hara Wrote:
 2011/8/31 Jonathan M Davis <jmdavisProg gmx.com>:
 Unfortunately however, the proposal seems to have gone nowhere thus
 far. Until it does, pretty much every object is just going to use
 toString without parameters, and the problems with BigInt's
 toString remain. However, if the proposal actually gets
 implemented, then the issue should then be able to be sorted out.
 Objects would have writeTo and toString would presumably be
 deprecated.

 
 I have posted a pull request to fix BigInt's formatting with writef(ln)
 <- formattedWrite().
 https://github.com/D-Programming-Language/phobos/pull/230
 
 Kenji Hara

 
 There are problems with opCmp as well. The "<" and ">" operators won't
 compile if either argument is a const BigInt, so a lot of otherwise
 unnecessary copying is required.
 
 You can see this in the following functions, which I've had to add to my
 BigDecimal package:
 
 private BigInt abs(const BigInt num) {
     BigInt big = copy(num);
     return big < BigInt(0) ? -big : big;
 }
 
 private BigInt copy(const BigInt num) {
     BigInt big = cast(BigInt)num;
     return big;
 }
 
 private int sgn(const BigInt num) {
     BigInt zero = BigInt(0);
     BigInt big = copy(num);
     if (big < zero) return -1;
     if (big > zero) return 1;
     return 0;
 }
 
 (I'd be happy to learn there's a better way to implement these.)

It's all part of http://d.puremagic.com/issues/show_bug.cgi?id=3659

The compiler is too strict on the signature of various struct functions (e.g. 
opEquals, opCmp, and toString), and const and immutable aren't dealt with very 
well.
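
For illustration, here is a minimal sketch of what const-correct comparison looks like on a user-defined struct, assuming a compiler that accepts a const-qualified opCmp. Num is a hypothetical stand-in for BigInt, not the Phobos type, and sgn here is only an example of a helper that needs no defensive copy.

```d
struct Num
{
    long value;

    // Both the receiver (trailing const) and the argument are const,
    // so `<` and `>` work on const and immutable operands alike.
    int opCmp(const Num rhs) const
    {
        return value < rhs.value ? -1 : (value > rhs.value ? 1 : 0);
    }
}

// Sign of a const value, computed without copying it into a mutable one.
int sgn(const Num n)
{
    const zero = Num(0);
    if (n < zero) return -1;
    if (n > zero) return 1;
    return 0;
}

void main()
{
    const a = Num(-5);
    immutable b = Num(7);
    assert(sgn(a) == -1);
    assert(sgn(b) == 1);
    assert(sgn(Num(0)) == 0);
}
```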

- Jonathan M Davis

Sep 03 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 8/30/11 8:59 PM, Paul D. Anderson wrote:
 Can someone clarify for me the status and/or direction of string
 formatting in D?

[snip]

I agree there are major inefficiencies and composability problems caused 
by a blind toString() that creates a whole new string without any 
assistance. So we need to fix that.

There are suggestions to add this method to Object:

void writeTo(void delegate(const(char)[]) sink, string format = null);

Then, the suggestion goes, whether or not we deprecate toString, in the 
short term it should be implemented in terms of writeTo.
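
As a rough sketch of that suggestion (not an actual Phobos API), a type could hand its text to the sink piece by piece and build toString on top of it. Point is a hypothetical type, the format parameter is accepted but ignored here, and to!string is used only to keep the example short.

```d
import std.array : appender;
import std.conv : to;

struct Point
{
    int x, y;

    // Hypothetical writeTo with the proposed signature: each piece of text is
    // handed to sink instead of being concatenated into one new string.
    void writeTo(scope void delegate(const(char)[]) sink, string format = null) const
    {
        sink("(");
        sink(to!string(x));
        sink(", ");
        sink(to!string(y));
        sink(")");
    }

    // toString expressed in terms of writeTo, collecting the pieces.
    string toString() const
    {
        auto buf = appender!string();
        writeTo((const(char)[] piece) { buf.put(piece); });
        return buf.data;
    }
}

void main()
{
    assert(Point(1, 2).toString() == "(1, 2)");

    // A caller can also consume the pieces without building any string.
    size_t total;
    Point(10, 20).writeTo((const(char)[] piece) { total += piece.length; });
    assert(total == 8); // length of "(10, 20)"
}
```

The caller decides where the characters go (a file, a buffer, a counter), which is the composability point made above.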

There are a few questions raised by this proposal:

1. Okay, this takes care of streaming text. How about streaming in 
binary format?

2. Since we have a relatively involved "output to text" routine, how 
about an "input from text" routine? If writeTo is there, where is readFrom?


Andrei

Sep 03 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Sun, 04 Sep 2011 00:06:47 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 8/30/11 8:59 PM, Paul D. Anderson wrote:
 Can someone clarify for me the status and/or direction of string
 formatting in D?

 [snip]

 I agree there are major inefficiencies and composability problems caused  
 by a blind toString() that creates a whole new string without any  
 assistance. So we need to fix that.

 There are suggestions to add this method to Object:

 void writeTo(void delegate(const(char)[]) sink, string format = null);

 Then, the suggestion goes, whether or not we deprecate toString, in the  
 short term it should be implemented in terms of writeTo.

 There are a few questions raised by this proposal:

 1. Okay, this takes care of streaming text. How about streaming in  
 binary format?

toString is a means of communicating the state of an object to a person  
reading a screen.  That's it.  It's not meant to be a serializer function.

So I guess the answer is, because people cannot read binary (well, some  
can, but that's just showing off).

 2. Since we have a relatively involved "output to text" routine, how  
 about an "input from text" routine? If writeTo is there, where is  
 readFrom?

It's a good point.  But there is no current means to do this.  AFAIK,  
readf only works on primitives, right?  Note that the DIP was born out of  
frustration with the inefficiency of toString.  There was no frustration  
at the inefficiency of, um.. parseString? because it didn't exist.

I don't feel that lack of readFrom necessarily precludes writeTo, because  
printing objects for debugging is a well-used and frequently needed  
thing.  However, parsing objects from text is not as frequently needed or  
common.  But I also feel that a proposal for readFrom is not precluded by  
DIP9, and in fact, it's probably very logical to derive such a proposal  
 from DIP9.  I think they can be implemented separately.

If I could be so bold as to suggest tying in with my recently revealed  
stdio overhaul:

size_t readFrom(const(char)[] data, size_t start); // same as readUntil delegate

readf calls (with a possible translation to char[] data):

input.readUntil(&obj.readFrom);

-Steve

Sep 03 2011

Jacob Carlborg <doob me.com> writes:

On 2011-09-04 06:24, Steven Schveighoffer wrote:
 On Sun, 04 Sep 2011 00:06:47 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 8/30/11 8:59 PM, Paul D. Anderson wrote:
 Can someone clarify for me the status and/or direction of string
 formatting in D?

 [snip]

 I agree there are major inefficiencies and composability problems
 caused by a blind toString() that creates a whole new string without
 any assistance. So we need to fix that.

 There are suggestions to add this method to Object:

 void writeTo(void delegate(const(char)[]) sink, string format = null);

 Then, the suggestion goes, whether or not we deprecate toString, in
 the short term it should be implemented in terms of writeTo.

 There are a few questions raised by this proposal:

 1. Okay, this takes care of streaming text. How about streaming in
 binary format?

 toString is a means of communicating the state of an object to a person
 reading a screen. That's it. It's not meant to be a serializer function.

 So I guess the answer is, because people cannot read binary (well, some
 can, but that's just showing off).

 2. Since we have a relatively involved "output to text" routine, how
 about an "input from text" routine? If writeTo is there, where is
 readFrom?

 It's a good point. But there is no current means to do this. AFAIK,
 readf only works on primitives, right? Note that the DIP was born out of
 frustration with the inefficiency of toString. There was no frustration
 at the inefficiency of, um.. parseString? because it didn't exist.

 I don't feel that lack of readFrom necessarily precludes writeTo,
 because printing objects for debugging is a well-used and frequently
 needed thing. However, parsing objects from text is not as frequently
 needed or common. But I also feel that a proposal for readFrom is not
 precluded by DIP9, and in fact, it's probably very logical to derive
 such a proposal from DIP9. I think they can be implemented separately.

 If I could be so bold as to suggest tying in with my recently revealed
 stdio overhaul:

 size_t readFrom(const(char)[] data, size_t start); // same as readUntil
 delegate

 readf calls (with a possible translation to char[] data):

 input.readUntil(&obj.readFrom);

 -Steve

This sounds more like something for a serialization library.

-- 
/Jacob Carlborg

Sep 04 2011

travert phare.normalesup.org (Christophe) writes:

 size_t readFrom(const(char)[] data, size_t start); // same as 
 readUntil delegate

What happens if the buffer data get exhausted ? The function calling 
readFrom has no way to know how many characters to put into data to 
allow the read.
What is the point of start ?

We could use a delegate to return new characters:

void readFrom(const(char)[] delegate(size_t) stream,
              in char[] format = null);

-format is the usual format specifier.
-stream is a delegate that takes a size_t argument, discards as many 
characters from its internal buffer, and returns data to read from.
The returned data has any length, but must be empty only when the end of 
all the data to be read is reached. stream may overwrite previously 
returned data.

Examples of suitable delegates for stream:
| const(char)[] delegate(size_t) myStringStream(string str)
| {
|   return (size_t n) { str = str[n..$]; return str; };
| }

| const(char)[] delegate(size_t) myFileStream(File file, size_t size)
| {
|   char[] chunk = new char[size];
|   int i=0;
|   chunk = file.rawRead(chunk); //  Bug?: file is read in binary mode...
|   return (size_t n)
|     {
|       i += n;
|       if (i>=chunk.length) chunk = file.rawRead(chunk);
|       return chunk[i..$];
|     };
| }

The readFrom method should look like this:
| // read data from buffer until a whitespace is found and put it in 
| // string s
| void readFrom(ref string s, const(char)[] delegate(size_t) buffer,
|               in char[] format = null)
| {
|   s = "";
|   size_t r = 0;         // number of characters read from buf
|   auto buf = buffer(0); // ask for some data to read.
|   // readFrom can throw a ReadException:
|   if (!buf.length) { throw new ReadException(); }
|
|   while (!(buf[r] == ' ' || buf[r] == '\t' || buf[r] == '\n'))
|     {
|       ++r;
|       if (r == buf.length)
|         {
|           s ~= buf;
|           buf = buffer(r); // consume this chunk and ask for more data
|           r = 0;
|           if (!buf.length)  // end of stream.
|             return;
|         }
|     }
|   s ~= buf[0..r];
|   buffer(r); // do not forget to tell the stream how much you read
|              // from it
|   return;
| }
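
To make the protocol concrete, here is a self-contained sketch of the stream delegate idea, under the assumption that stream(n) discards n characters of the data it last returned and then returns what is left. stringStream, isBlank and readWord are illustrative names only. Note that the delimiter stays in the stream, because only the characters reported as read are discarded.

```d
// Stream delegate in the style proposed above: stream(n) discards n characters
// of the data it last returned, then returns what is left (empty at the end).
const(char)[] delegate(size_t) stringStream(string str)
{
    return delegate const(char)[] (size_t n) { str = str[n .. $]; return str; };
}

bool isBlank(char c) { return c == ' ' || c == '\t' || c == '\n'; }

// Reads one whitespace-delimited word and leaves the delimiter in the stream.
string readWord(const(char)[] delegate(size_t) stream)
{
    string word;
    auto buf = stream(0);              // look at the data, consume nothing
    while (buf.length)
    {
        size_t r = 0;
        while (r < buf.length && !isBlank(buf[r]))
            ++r;
        word ~= buf[0 .. r].idup;
        if (r < buf.length)            // stopped on whitespace
        {
            stream(r);                 // consume the word only
            break;
        }
        buf = stream(r);               // buffer used up: ask for more
    }
    return word;
}

void main()
{
    auto stream = stringStream("hello world");
    assert(readWord(stream) == "hello");
    assert(stream(0) == " world");     // the space was not discarded
    stream(1);                         // skip the space explicitly
    assert(readWord(stream) == "world");
    assert(stream(0).length == 0);     // end of data
}
```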


But implementation of readFrom will be made easier by the following 
functions:

void read(T...)(const(char)[] delegate(size_t), ref T);
void readf(T...)(const(char)[] delegate(size_t),
                 in char[] format, ref T);

Example:
| struct Point
| {
|   int[3] data;
|   
|   void readFrom(const(char)[] delegate(size_t) stream,
|                 in char[] format = null)
|   {
|     readf(stream, "[%s, %s, %s]", data[0], data[1], data[2]);
|   }
| }

Note: One could make a similar signature for writeTo to be more 
consistent. I have no idea if this should be more efficient than the 
currently proposed writeTo.

void writeTo(char[] delegate(size_t) stream, in char[] format = null);

Note: I replaced "string format=null" by "in char[] format = null" to be 
consistent with current stdio.readf


What are your thoughts about this ?

-- 
Christophe Travert

Sep 06 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Tue, 06 Sep 2011 08:18:15 -0400, Christophe  
<travert phare.normalesup.org> wrote:

 size_t readFrom(const(char)[] data, size_t start); // same as
 readUntil delegate

 What happens if the buffer data get exhausted ? The function calling
 readFrom has no way to know how many characters to put into data to
 allow the read.
 What is the point of start ?

This is probably clearer if you read the documentation for readUntil:

http://erdani.com/d/new-stdio/phobos-prerelease/std_stdio.html#readUntil

I don't know if it's a good idea to tie a possible readFrom to an  
unreleased (and quite frankly, not much liked) proposal, but it was just a  
thought.

 We could use a delegate to return new characters:

 void readFrom(const(char)[] delegate(size_t) stream,
               in char[] format = null);

 -format is the usual format specifier.
 -stream is a delegate that takes a size_t argument, discards as many
 characters from its internal buffer, and returns data to read from.
 The returned data has any length, but must be empty only when the end of
 all the data to be read is reached. stream may overwrite previously
 returned data.

So essentially, stream "peeks" at buffered data, and also discards data  
you deem "consumed"?  Note that with the current stdio package, you can  
only peek at one character.

 Examples of suitable delegates for stream:
 | const(char)[] delegate(size_t) myStringStream(string str)
 | {
 |   return (size_t n) { str = str[n..$]; return str; };
 | }

 | const(char)[] delegate(size_t) myFileStream(File file, size_t size)
 | {
 |   char[] chunk = new char[size];
 |   int i=0;
 |   chunk = file.rawRead(chunk); //  Bug?: file is read in binary mode...
 |   return (size_t n)
 |     {
 |       i += n;
 |       if (i>=chunk.length) chunk = file.rawRead(chunk);
 |       return chunk[i..$];
 |     };
 | }

This doesn't work.  What happens to unconsumed data from chunk?  You can  
only put one char back on the stream.

-Steve

Sep 06 2011

travert phare.normalesup.org (Christophe) writes:

"Steven Schveighoffer", in message (digitalmars.D:143998), wrote:
 void readFrom(const(char)[] delegate(size_t) stream,
               in char[] format = null);

 -format is the usual format specifier.
 -stream is a delegate that takes a size_t argument, discards as many
 characters from its internal buffer, and returns data to read from.
 The returned data has any length, but must be empty only when the end of
 all the data to be read is reached. stream may overwrite previously
 returned data.

 
 So essentially, stream "peeks" at buffered data, and also discards data  
 you deem "consumed"?  Note that with the current stdio package, you can  
 only peek at one character.

Wouldn't your life be easier if you could? :P
Well, I thought there would be some internal buffer in the read functions
of stdio, and in scanf (although it is not accessible). Maybe that's why
it is so slow. Anyway, stream is allowed to return a one-character
const(char)[], although it might not be optimal at all.

If the "stream" comes from stdin, either chars are peeked one by one and 
no changes are to be made to stdin, or all stdin functions must use the 
"stream" or at least the same buffer.
The same can be said to std.stdio.File, if we want to make all File 
instances compatible with this way of reading.
The readFrom API could be changed to use peek/get delegate instead of 
stream, but wouldn't that be such a loss of power ?

 
 | const(char)[] delegate(size_t) myFileStream(File file, size_t size)
 | {
 |   char[] chunk = new char[size];
 |   int i=0;
 |   chunk = file.rawRead(chunk); //  Bug?: file is read in binary mode...
 |   return (size_t n)
 |     {
 |       i += n;
 |       if (i>=chunk.length) chunk = file.rawRead(chunk);
 |       return chunk[i..$];
 |     };
 | }

 
 This doesn't work.  What happens to unconsumed data from chunk?  You can  
 only put one char back on the stream. 

Nothing should be read from the file you put in myFileStream except by the
stream itself. Why put characters back then?
Anyway, this is just a (bad) example of what is legal for a stream
parameter. You probably don't want to allocate a File instance on the
heap like I just did.

-- 
Christophe

Sep 06 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Tue, 06 Sep 2011 10:07:23 -0400, Christophe
<travert phare.normalesup.org> wrote:

 "Steven Schveighoffer", in message (digitalmars.D:143998), wrote:
 void readFrom(const(char)[] delegate(size_t) stream,
               in char[] format = null);

 -format is the usual format specifier.
 -stream is a delegate that takes a size_t argument, discards as many
 characters from its internal buffer, and returns data to read from.
 The returned data has any length, but must be empty only when the end of
 all the data to be read is reached. stream may overwrite previously
 returned data.

 So essentially, stream "peeks" at buffered data, and also discards data
 you deem "consumed"?  Note that with the current stdio package, you can
 only peek at one character.

 Wouldn't your life be easier if you could ? :P

I'd love to, which is why I wrote the revamped stdio ;)

 Well, I thought there would be some internal buffer in the read functions
 of stdio, and in scanf (although it is not accessible). Maybe that's why
 it is so slow. Anyway, stream is allowed to return a one-character
 const(char)[], although it might not be optimal at all.

Yes, if you look at the input range given to formattedRead in
std.stdio.File, it's a one-char-at-a-time range.

It works by calling fgetc, then immediately putting it back using fungetc.

 If the "stream" comes from stdin, either chars are peeked one by one and
 no changes are to be made to stdin, or all stdin functions must use the
 "stream" or at least the same buffer.
 The same can be said to std.stdio.File, if we want to make all File
 instances compatible with this way of reading.
 The readFrom API could be changed to use peek/get delegate instead of
 stream, but wouldn't that be such a loss of power ?

That means double-buffering.  So FILE * will be buffering the data for
you, then you will also buffer the data in File so you can have access to
it.

Plus, that makes File incompatible with C functions (i.e. fscanf) since
those functions will be unaware of your "unconsumed" buffer.

 | const(char)[] delegate(size_t) myFileStream(File file, size_t size)
 | {
 |   char[] chunk = new char[size];
 |   int i=0;
 |   chunk = file.rawRead(chunk); //  Bug?: file is read in binary mode...
 |   return (size_t n)
 |     {
 |       i += n;
 |       if (i>=chunk.length) chunk = file.rawRead(chunk);
 |       return chunk[i..$];
 |     };
 | }

 This doesn't work.  What happens to unconsumed data from chunk?  You can
 only put one char back on the stream.

 Nothing should be read from the file you put in myFileStream except by the
 stream itself. Why put characters back then?

rawRead removes the characters from the stream.

no API exists that allows you to peek at more than one character for
FILE *.  This is part of the problem of why I've been working on a new
stdio -- there is no good direct buffer access.

-Steve

Sep 06 2011

travert phare.normalesup.org (Christophe) writes:

I've had a look at readUntil API, and it's not completely clear. Is the 
delegate supposed to remember what it has read and interpreted so far, 
or does it have to start from scratch each time ? Where could I see an 
implementation of a delegate suitable for readUntil ?

Basically, in both your and my API, a stream is giving some more
characters to a readFrom method, as long as it asks for more. What I am 
not sure is if readFrom is supposed to build the read object like in my 
API , or if it is supposed to be built after with the string returned by 
readUntil.

I think the main difference is that your API is written from the stream 
point of view, whereas my API is written from the point of view of the 
object being read, which will make implementing readFrom easier for
users, who will not have to worry about their delegate being called
multiple times.

If I have more time, I may look deeper into Phobos stdin and your stdin
proposal, but I'm not sure I can afford that...
 
In the mean time, I hope I gave you nice ideas to improve your own 
proposal. Here are some more...

I will sum up the different ways to deal with buffering; they apply to
both your readUntil API and my proposed API:

1/ _use only peek_
-the API is written to peek only one character at a time. You 
definitely lose the possibility for a stream to give a char[] directly 
to the parsing function, even for streams that are not files...

2/ _use c for low level stdin_
-the default stream derived from stdin or from a file peeks only one 
character at a time. Everything works fine with c functions.
-you can still explicitly create a stream object from a File to make 
double buffering and return several characters, but that makes the File 
no longer suitable for c functions, since some unread buffer can be 
hidden in the object performing the streaming operations.

3/ _hack into c functions_
-the default stream hacks into FILE* to use its own internal
buffer. This may not be easy to implement, but should be feasible by a 
system programmer, shouldn't it ?

4/ _WTH, d should not rely on c functions to do all low level jobs_
-the default stream peeks several characters. c functions are broken.
-you can still rewrite c-like functions. For example, scanf could be the 
same as readf, but would support 0-terminated strings, and be 
implemented as a c-style variadic function (avoiding multiple template 
instantiation which makes the generated code so big Walter refuses to
use it).
-if you need, you can still instantiate a FILE* that will never be seen
by the d library, and that will work fine with c functions.

5/ _variation on 2 and 4_
- File are still compatible with current Phobos API, and the default 
streaming mode for files only peeks one character at a time.
- Some new struct can perform file operations in a d-like way that is
incompatible with c function. However, no accessible File object is ever 
created for this structure, so no one will mix c and d read/write 
function.


for it to work, everybody is happy and can start implementing readFrom
without breaking any old code (as long as no other changes are made in

changes will occur at the library level, so it should not break code

forward to make D a language that does not rely on C anymore. That may or
may not be desirable. Some code will have to change, even if my

everything you want the D way, while keeping old File working.

One last point: any comments about using writeTo with my "stream" API 
like readFrom ?

-- 
Christophe Travert

Sep 06 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Tue, 06 Sep 2011 19:46:35 -0400, Christophe  
<travert phare.normalesup.org> wrote:

 I've had a look at readUntil API, and it's not completely clear. Is the
 delegate supposed to remember what it has read and interpreted so far,
 or does it have to start from scratch each time ?

The start is an index at which new data was added.  The deal is, the  
stream continually appends more data to the array until the delegate is  
satisfied.  The start helps keep some context of "how much of this haven't  
I seen before?"

So depending on how your delegate is implemented, you can avoid reading  
anything before start if you wish.  However, you still have to take into  
account the data prior to start when returning how much data was processed.
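
A small sketch of that contract, under stated assumptions: feed is a hypothetical stand-in for the stream side (the real readUntil lives in the unreleased stdio rewrite and is not reproduced here), and the delegate returns size_t.max to ask for more data, otherwise the number of elements processed counted from the beginning of the buffer.

```d
// Hypothetical stand-in for the stream side of readUntil: it keeps appending
// chunks to one buffer and calls dg(buffer, start), where start is the index
// at which the new data begins, until dg stops returning size_t.max.
size_t feed(string[] chunks,
            scope size_t delegate(const(char)[] data, size_t start) dg)
{
    char[] buffer;
    foreach (chunk; chunks)
    {
        immutable start = buffer.length;
        buffer ~= chunk;
        immutable used = dg(buffer, start);
        if (used != size_t.max)
            return used;               // the delegate is satisfied
    }
    return 0;                          // data ran out before dg was satisfied
}

void main()
{
    // Consumes one line, newline included. Only the new data is scanned,
    // but the returned count is measured from the start of the buffer.
    size_t untilNewline(const(char)[] data, size_t start)
    {
        foreach (i; start .. data.length)
            if (data[i] == '\n')
                return i + 1;
        return size_t.max;             // not enough data yet
    }

    assert(feed(["abc", "de\nfg"], &untilNewline) == 6);
    assert(feed(["no newline"], &untilNewline) == 0);
}
```

A delegate like this keeps no state between calls, which is why start is enough for it. A parser that must remember what it has already interpreted is the harder case described next.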

I can see where this scheme has its downsides for parsing that needs to  
keep state.  It might be aggravating or even impossible to do this when  
your delegate has to exit when not enough data is present.

However, the buffer default size is something like 10 pages.  So the
likelihood that you have to return "get me more data" is pretty low, and
even if it is, restarting the parsing would be a rare occurrence.

So I agree, a delegate *callable* by the "readFrom" function would be
preferable and easier to deal with than using readUntil.

 Where could I see an
 implementation of a delegate suitable for readUntil ?

In the source code for the revamped stdio.  Here is a byChunk range which  
uses it:

https://github.com/schveiguy/phobos/blob/ceb4ec43057d18d42371128a614e81dbec45a5f6/std/stdio.d#L1665

 Basically, in both your and my API, a stream is giving some more
 characters to a readFrom method, as long as it asks for more. What I am
 not sure is if readFrom is supposed to build the read object like in my
 API , or if it is supposed to be built after with the string returned by
 readUntil.

It should be processed while the delegate is called for checking if  
readUntil should be stopped.  In other words, the data returned by  
readUntil will be ignored.

 I think the main difference is that your API is written from the stream
 point of view, whereas my API is written from the point of view of the
 object being read, which will make implementing readFrom easier for
 users, who will not have to worry about their delegate being called
 multiple times.

 If I have more time, I may look deeper into Phobos stdin and your stdin
 proposal, but I'm not sure I can afford that...
 In the mean time, I hope I gave you nice ideas to improve your own
 proposal. Here are some more...

Yes, I'm thinking that readFrom, instead of being a readUntil delegate
itself, should probably just accept a DInput (or whatever it gets renamed
to).  Then it has the choice of running the show, or just using readUntil.

 I will sum up the different ways to deal with buffering and any one of
 your API for readUntil, and my proposed API:

 1/ _use only peek_
 -the API is written to peek only one character at a time. You
 definitely lose the possibility for a stream to give a char[] directly
 to the parsing function, even for streams that are not files...

I plan in the next iteration of my revamped stdio to implement a peek  
function.  It's actually pretty simple to implement in terms of readUntil:

const(ubyte)[] peek(size_t nbytes)
{
    const(ubyte)[] retval;
    size_t stopCond(const(ubyte)[] data, size_t start)
    {
        retval = data;
        if(data.length == start)
           return 0; // EOF
        return data.length >= nbytes ? 0 : size_t.max;
    }

    readUntil(&stopCond);
    return retval.length > nbytes ? retval[0..nbytes] : retval;
}

 2/ _use c for low level stdin_
 -the default stream derived from stdin or from a file peeks only one
 character at a time. Everything works fine with c functions.
 -you can still explicitly create a stream object from a File to make
 double buffering and return several characters, but that makes the File
 no longer suitable for c functions, since some unread buffer can be
 hidden in the object performing the streaming operations.

If you are going this route, I think you're better off to use a rewritten  
buffering scheme.  You've already lost the only reason to use C stdio to  
begin with -- compatibility with C functions.

 3/ _hack into c functions_
 -the default stream hacks into FILE* to use its own internal
 buffer. This may not be easy to implement, but should be feasible by a
 system programmer, shouldn't it ?

Yes and no.  There are issues:

- What if the implementation is opaque?
- What if you run out of buffer?
- What if the implementation is open-source, but uses static functions?

There are also other issues with FILE * not related to this discussion  
which make it a good idea to avoid.

 4/ _WTH, d should not rely on c functions to do all low level jobs_
 -the default stream peeks several characters. c functions are broken.
 -you can still rewrite c-like functions. For example, scanf could be the
 same as readf, but would support 0-terminated strings, and be
 implemented as a c-style variadic function (avoiding multiple template
 instantiation which makes the generated code so big Walter refuses to
 use it).
 -if you need, you can still instantiate a FILE* that will never be seen
 by the d library, and that will work fine with c functions.

 5/ _variation on 2 and 4_
 - File are still compatible with current Phobos API, and the default
 streaming mode for files only peeks one character at a time.
 - Some new struct can perform file operations in a d-like way that is
 incompatible with c function. However, no accessible File object is ever
 created for this structure, so no one will mix c and d read/write
 function.

This is somewhat what my new strategy is.  Except File will seamlessly  
support both the existing phobos implementation and my new  
implementation.  I'll be outlining how it works once I've settled on the  
API (and I'll probably have implementation ready too).

 One last point: any comments about using writeTo with my "stream" API
 like readFrom ?

I think this is what writeTo (as proposed) already does.

-Steve

Sep 08 2011

travert phare.normalesup.org (Christophe) writes:

"Steven Schveighoffer", in message (digitalmars.D:144156), wrote:
 Where could I see an
 implementation of a delegate suitable for readUntil ?

 
 In the source code for the revamped stdio.  Here is a byChunk range which  
 uses it:

I see. Are you not concerned by the fact that with this API, the input
stream has to perform heap allocation when its internal buffer is full,
because the delegate could always ask for some more characters? That
prevents the possibility of making a reading mechanism that does not
allocate anything on the stack.

 One last point: any comments about using writeTo with my "stream" API
 like readFrom ?

 
 I think this is what writeTo (as proposed) already does.

the proposed writeTo is:
writeTo(void delegate(const(char)[]) sink, in char[] format).
Here, the writeTo method writes the character in its own buffer, then 
gives it to sink.

A writeTo with a "stream API" would be:
writeTo(char[] delegate(size_t) stream, in char[] format).
Here stream provides a buffer, and writeTo has to use this buffer, then 
tell how much buffer it used to stream.

I'm not sure mine is better, I'm just asking.

-- 
Christophe

Sep 08 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Thu, 08 Sep 2011 16:19:28 -0400, Christophe
<travert phare.normalesup.org> wrote:

 "Steven Schveighoffer", in message (digitalmars.D:144156), wrote:
 Where could I see an
 implementation of a delegate suitable for readUntil ?

 In the source code for the revamped stdio.  Here is a byChunk range which
 uses it:

 I see. Are you not concerned by the fact that with this API, the input
 stream has to perform heap allocation when its internal buffer is full,
 because the delegate could always ask for some more characters? That
 prevents the possibility of making a reading mechanism that does not
 allocate anything on the stack.

No.  The buffer then becomes that much bigger, and less likely to be
increased.  In other words, the buffer "adjusts" itself to the largest
size needed, then becomes stable.  This is over the lifetime of the input
stream, not just for this parse.

 One last point: any comments about using writeTo with my "stream" API
 like readFrom ?

 I think this is what writeTo (as proposed) already does.

 the proposed writeTo is:
 writeTo(void delegate(const(char)[]) sink, in char[] format).
 Here, the writeTo method writes the character in its own buffer, then
 gives it to sink.

 A writeTo with a "stream API" would be:
 writeTo(char[] delegate(size_t) stream, in char[] format).
 Here stream provides a buffer, and writeTo has to use this buffer, then
 tell how much buffer it used to stream.

 I'm not sure mine is better, I'm just asking.

Oh, ok.  I don't know how the performance would differ.  It's an
interesting proposition, since it has the potential to save on copying
data.

However, it does require that the stream allocate and maintain a
heap-allocated buffer.  There are some cases where such buffering is
overkill.  The writeTo method for a type might have a much better idea of
how much data is required (or even an upper limit) and can allocate all
the buffer required on the stack.

-Steve

Sep 08 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 9/6/11 7:18 AM, Christophe wrote:
 size_t readFrom(const(char)[] data, size_t start); // same as
 readUntil delegate

 What happens if the buffer data get exhausted ? The function calling
 readFrom has no way to know how many characters to put into data to
 allow the read.
 What is the point of start ?

 We could use a delegate to return new characters:

 void readFrom(const(char)[] delegate(size_t) stream,
                in char[] format = null);

This won't work for cases such as "parse digits until a non-digit is 
found, but don't discard that non-digit".

Reading is considerably more difficult than writing. I think it's fair 
to leave it to more sophisticated APIs than one delegate.


Andrei

Sep 06 2011

travert phare.normalesup.org (Christophe) writes:

Andrei Alexandrescu, in message (digitalmars.D:144012), wrote:
 On 9/6/11 7:18 AM, Christophe wrote:
 size_t readFrom(const(char)[] data, size_t start); // same as
 readUntil delegate

 What happens if the buffer data get exhausted ? The function calling
 readFrom has no way to know how many characters to put into data to
 allow the read.
 What is the point of start ?

 We could use a delegate to return new characters:

 void readFrom(const(char)[] delegate(size_t) stream,
                in char[] format = null);

 
 This won't work for cases such as "parse digits until a non-digit is 
 found, but don't discard that non-digit".

It does, since the characters are only discarded at the next call to 
stream(n), according to the value n. See my answer to Steve.

I first considered a slightly more complicated API:

void readFrom(const(char)[] delegate() stream, void delegate(size_t) nread)
{
  auto buf = stream();
  size_t n = 0;
  // .. do things and count the number n of characters read
  nread(n);
}

but:

void readFrom(const(char)[] delegate(size_t) stream)
{
  auto buf = stream(0);
  size_t n = 0;
  // .. do things and count the number n of characters read
  stream(n);
}

works about as well, and is IMO simpler.
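
The "parse digits, keep the non-digit" case can be sketched under this protocol as follows. This is only an illustration: `StringSource` and `readUint` are hypothetical names, the whole input is assumed to be in memory already, and refilling an exhausted buffer is deliberately left out.

```d
import std.ascii : isDigit;

// Hypothetical in-memory source following the protocol described above:
// stream(n) first discards n characters, then returns what is left.
struct StringSource
{
    const(char)[] data;

    const(char)[] stream(size_t consumed)
    {
        data = data[consumed .. $];
        return data;
    }
}

// Parses leading decimal digits and tells the stream how many it used.
uint readUint(scope const(char)[] delegate(size_t) stream)
{
    auto buf = stream(0);          // peek without consuming anything
    uint value = 0;
    size_t n = 0;
    while (n < buf.length && isDigit(buf[n]))
    {
        value = value * 10 + (buf[n] - '0');
        ++n;
    }
    stream(n);                     // consume only the digits
    return value;
}

unittest
{
    auto src = StringSource("123abc");
    assert(readUint(&src.stream) == 123);
    assert(src.data == "abc");     // the non-digit is still available
}
```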

 Reading is considerably more difficult than writing. I think it's fair 
 to leave it to more sophisticated APIs than one delegate.

Maybe.

-- 
Christophe Travert

Sep 06 2011

Jacob Carlborg <doob me.com> writes:

On 2011-09-04 06:06, Andrei Alexandrescu wrote:
 On 8/30/11 8:59 PM, Paul D. Anderson wrote:
 Can someone clarify for me the status and/or direction of string
 formatting in D?

 [snip]

 I agree there are major inefficiencies and composability problems caused
 by a blind toString() that creates a whole new string without any
 assistance. So we need to fix that.

 There are suggestions to add this method to Object:

 void writeTo(void delegate(const(char)[]) sink, string format = null);

 Then, the suggestion goes, whether or not we deprecate toString, in the
 short term it should be implemented in terms of writeTo.

I see no reason to deprecate toString. toString could just call writeTo 
and do some standard formatting.
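
A rough sketch of that arrangement, assuming the writeTo signature quoted above. `Temperature` is a hypothetical type and this writeTo ignores its format argument for brevity; `appender` and `formattedWrite` are existing Phobos functions.

```d
import std.array : appender;
import std.format : formattedWrite;

alias Sink = void delegate(const(char)[]);

class Temperature
{
    double degrees;
    this(double degrees) { this.degrees = degrees; }

    // Hypothetical writeTo: streams the text to the sink piece by piece.
    void writeTo(scope Sink sink, string format = null) const
    {
        formattedWrite(sink, "%.1f C", degrees);
    }

    // toString stays available as a thin wrapper that collects the
    // pieces emitted by writeTo into one string.
    override string toString() const
    {
        auto app = appender!string();
        writeTo((const(char)[] s) { app.put(s); });
        return app.data;
    }
}

unittest
{
    assert(new Temperature(21.456).toString() == "21.5 C");
}
```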

 There are a few questions raised by this proposal:

 1. Okay, this takes care of streaming text. How about streaming in
 binary format?

 2. Since we have a relatively involved "output to text" routine, how
 about an "input from text" routine? If writeTo is there, where is readFrom?


 Andrei


-- 
/Jacob Carlborg

Sep 04 2011

"Marco Leise" <Marco.Leise gmx.de> writes:

On 04.09.2011 at 06:06, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:

 1. Okay, this takes care of streaming text. How about streaming in  
 binary format?

Doesn't that come down to using a serialization API like Orange?
- The text format protocols I used all worked with primitive types and  
have their own structuring syntax (xml, json, proprietary formats)
- If I wanted to save an object in binary, I'd need the serialization  
library to take care of internal pointers as well. Again it needs some  
higher level logic and introspection to get the whole pointer graph safely  
into a binary blob.
- When working with MPEG-2 data, I could have needed some help to convert  
 from file endian-ness to host endian-ness. I was using Delphi there.
What I want to say is that I know these two use cases for writeTo() in  
binary form: Either it is complex 1:1 serialization of D objects and  
structs or it is for reading and writing portable file formats that often  
need data conversion even for primitive types.

 2. Since we have a relatively involved "output to text" routine, how  
 about an "input from text" routine? If writeTo is there, where is  
 readFrom?

Are you thinking of replicating C++ istream >> functionality here that  
works with friend functions to augment istream with routines to read  
complex data types?

Sep 04 2011

"Robert Jacques" <sandford jhu.edu> writes:

On Sun, 04 Sep 2011 00:06:47 -0400, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:

 On 8/30/11 8:59 PM, Paul D. Anderson wrote:
 Can someone clarify for me the status and/or direction of string
 formatting in D?

 [snip]

 I agree there are major inefficiencies and composability problems caused
 by a blind toString() that creates a whole new string without any
 assistance. So we need to fix that.

 There are suggestions to add this method to Object:

 void writeTo(void delegate(const(char)[]) sink, string format = null);

 Then, the suggestion goes, whether or not we deprecate toString, in the
 short term it should be implemented in terms of writeTo.

 There are a few questions raised by this proposal:

 1. Okay, this takes care of streaming text. How about streaming in
 binary format?

 2. Since we have a relatively involved "output to text" routine, how
 about an "input from text" routine? If writeTo is there, where is readFrom?


 Andrei

I'd like to point out that parsing a format string for every object/variable is
very inefficient. I'd recommend having the virtual writeTo function accept
FormatSpec, like the formatValue routines, and then make the writeTo which
takes a format string be final. i.e.:

void writeTo(void delegate(const(char)[]) sink, ref FormatSpec!char format);

final void writeTo(void delegate(const(char)[]) sink, string format = null) {
	auto spec = FormatSpec!char(format);
	writeTo(sink, spec);
}
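
For illustration, here is how that pair of overloads might look in a compilable form. `Celsius` is a hypothetical class. The sketch parses the string with `singleSpec` from std.format, which accepts exactly one specifier, so the default is "%s" rather than null; that is a simplification of the proposal, not part of it.

```d
import std.format : FormatSpec, formatValue, singleSpec;

alias Sink = void delegate(const(char)[]);

class Celsius
{
    double degrees;
    this(double degrees) { this.degrees = degrees; }

    // Virtual overload: receives an already-parsed spec.
    void writeTo(scope Sink sink, ref const FormatSpec!char spec) const
    {
        formatValue(sink, degrees, spec);
        sink(" C");
    }

    // Final convenience overload: parses the string once, then forwards.
    final void writeTo(scope Sink sink, string format = "%s") const
    {
        auto spec = singleSpec(format);
        writeTo(sink, spec);
    }
}

unittest
{
    char[] result;
    auto c = new Celsius(21.456);
    c.writeTo((const(char)[] s) { result ~= s; }, "%.1f");
    assert(result == "21.5 C");
}
```

A container that prints many elements can parse the spec once and call the virtual overload directly for each element.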

Sep 04 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Sun, 04 Sep 2011 13:51:52 -0400, Robert Jacques <sandford jhu.edu>  
wrote:

 On Sun, 04 Sep 2011 00:06:47 -0400, Andrei Alexandrescu  
 <SeeWebsiteForEmail erdani.org> wrote:

 On 8/30/11 8:59 PM, Paul D. Anderson wrote:
 Can someone clarify for me the status and/or direction of string
 formatting in D?

 [snip]

 I agree there are major inefficiencies and composability problems caused
 by a blind toString() that creates a whole new string without any
 assistance. So we need to fix that.

 There are suggestions to add this method to Object:

 void writeTo(void delegate(const(char)[]) sink, string format = null);

 Then, the suggestion goes, whether or not we deprecate toString, in the
 short term it should be implemented in terms of writeTo.

 There are a few questions raised by this proposal:

 1. Okay, this takes care of streaming text. How about streaming in
 binary format?

 2. Since we have a relatively involved "output to text" routine, how
 about an "input from text" routine? If writeTo is there, where is  
 readFrom?


 Andrei

 I'd like to point out that parsing a format string for every  
 object/variable is very inefficient. I'd recommend having the virtual  
 writeTo function accept FormatSpec, like the formatValue routines, and  
 then make the writeTo which takes a format string be final. i.e.:

 void writeTo(void delegate(const(char)[]) sink, ref FormatSpec!(Char)  
 format);

 void writeTo(void delegate(const(char)[]) sink, string format = null)  
 final {
 	auto spec = FormatSpec!char(format);
 	writeTo(sink, spec);
 }

Hm... I haven't delved (yet) into the specifics of how std.format works,  
but it seems like it uses this notion.

One thing became apparent from a reply to Timon early on in this
thread.  Say you have a struct like this:

struct S
{
    int x, y, z;
    void writeTo(scope Sink s, const(char)[] format = null)
    {
       formattedWrite(s, "(%d,%d,%d)", x, y, z);
    }
}

How would you, say, format the output as hexadecimal?  I'd expect you'd just
pass "%x" into writeTo, but formattedWrite would have to be split into 3
calls, unless you wanted to heap-allocate a new string.  I suppose you
could allocate a stack buffer to hold the whole format string, but it gets
a bit tenuous.

I don't even know if FormatSpec would fix this.  We may need a more  
capable formatting facility, which can do loops, or some new way to do the  
formatting.  Actually, can the format string be a range?  I guess not  
since that would require making writeTo a template.  Unless that range is  
some sort of processor that you always use (like FormatSpec, but able to  
add extra functionality, like looping).
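
One way the hexadecimal case could be handled with a parsed spec is sketched below. It assumes the FormatSpec-taking variant of writeTo suggested earlier and does split the output into one call per field, but without building a new format string. `Sink` is a hypothetical alias, and `formatValue` and `singleSpec` are existing std.format functions.

```d
import std.format : FormatSpec, formatValue, singleSpec;

alias Sink = void delegate(const(char)[]);

struct S
{
    int x, y, z;

    // One parsed spec is reused for every field, so no combined format
    // string has to be assembled on the heap or on the stack.
    void writeTo(scope Sink sink, ref const FormatSpec!char spec) const
    {
        sink("(");
        formatValue(sink, x, spec);
        sink(",");
        formatValue(sink, y, spec);
        sink(",");
        formatValue(sink, z, spec);
        sink(")");
    }
}

unittest
{
    char[] result;
    void append(const(char)[] s) { result ~= s; }

    auto hex = singleSpec("%x");
    S(10, 255, 16).writeTo(&append, hex);
    assert(result == "(a,ff,10)");
}
```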

One issue with your idea is that all derived classes will have to alias in  
the overload.

-Steve

Sep 06 2011

kenji hara <k.hara.pg gmail.com> writes:

2011/9/4 Andrei Alexandrescu <SeeWebsiteForEmail erdani.org>:
 There are suggestions to add this method to Object:

 void writeTo(void delegate(const(char)[]) sink, string format = null);

I think const void toString(scope void delegate(const(char)[]) sink,
string format = null); is better than that, even if it is
different from DIP9.
That signature is already used in std.bigint and std.complex, and std.format
already supports it.

Kenji Hara

Sep 04 2011

Timon Gehr <timon.gehr gmx.ch> writes:

On 09/05/2011 04:20 AM, kenji hara wrote:
 2011/9/4 Andrei Alexandrescu<SeeWebsiteForEmail erdani.org>:
 There are suggestions to add this method to Object:

 void writeTo(void delegate(const(char)[]) sink, string format = null);

 I think const void toString(scope void delegate(const(char)[]) sink,
 string format = null); is more better than that, even if it is
 different from DIP9.
 That is already used in std.bigint, std.complex, and std.format
 already support it.

 Kenji Hara

Yes, but imho the function name does not document really well what it does.

Sep 04 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 9/4/11 22:20 EDT, kenji hara wrote:
 2011/9/4 Andrei Alexandrescu<SeeWebsiteForEmail erdani.org>:
 There are suggestions to add this method to Object:

 void writeTo(void delegate(const(char)[]) sink, string format = null);

 I think const void toString(scope void delegate(const(char)[]) sink,
 string format = null); is more better than that, even if it is
 different from DIP9.
 That is already used in std.bigint, std.complex, and std.format
 already support it.

 Kenji Hara

Works for me. Walter?

Andrei

Sep 04 2011

Walter Bright <newshound2 digitalmars.com> writes:

On 9/4/2011 7:34 PM, Andrei Alexandrescu wrote:
 On 9/4/11 22:20 EDT, kenji hara wrote:
 2011/9/4 Andrei Alexandrescu<SeeWebsiteForEmail erdani.org>:
 There are suggestions to add this method to Object:

 void writeTo(void delegate(const(char)[]) sink, string format = null);

 I think const void toString(scope void delegate(const(char)[]) sink,
 string format = null); is more better than that, even if it is
 different from DIP9.
 That is already used in std.bigint, std.complex, and std.format
 already support it.

 Kenji Hara

 Works for me. Walter?

It'll break every D program.

Sep 04 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 9/5/11 12:22 AM, Walter Bright wrote:
 On 9/4/2011 7:34 PM, Andrei Alexandrescu wrote:
 On 9/4/11 22:20 EDT, kenji hara wrote:
 2011/9/4 Andrei Alexandrescu<SeeWebsiteForEmail erdani.org>:
 There are suggestions to add this method to Object:

 void writeTo(void delegate(const(char)[]) sink, string format = null);

 I think const void toString(scope void delegate(const(char)[]) sink,
 string format = null); is more better than that, even if it is
 different from DIP9.
 That is already used in std.bigint, std.complex, and std.format
 already support it.

 Kenji Hara

 Works for me. Walter?

 It'll break every D program.

Probably you and I have a different thing in mind. I'm thinking of 
adding that alongside the existing toString.

Thinking more about it, I fear that ascribing the two overloads the same 
name will cause problems when e.g. a class overrides one overload thus 
hiding the other.
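
A small sketch of the hiding concern and the usual workaround. The method name `describe` is hypothetical and stands in for toString, so the example does not depend on what Object declares. As far as I understand D's lookup rules, without the alias line the sink overload would not be reachable through a `Derived` reference.

```d
alias Sink = void delegate(const(char)[]);

class Base
{
    string describe() const { return "Base"; }
    void describe(scope Sink sink) const { sink(describe()); }
}

class Derived : Base
{
    // Re-introduce every Base.describe overload into Derived's overload set.
    alias describe = Base.describe;

    // Override only the string-returning overload.
    override string describe() const { return "Derived"; }
}

unittest
{
    auto d = new Derived;
    char[] result;
    d.describe((const(char)[] s) { result ~= s; });
    assert(result == "Derived");
    assert(d.describe() == "Derived");
}
```

Every derived class that overrides just one overload would need such an alias, which is the cost a distinct name avoids.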

So we should look for a different name. Which?


Andrei

Sep 05 2011

Timon Gehr <timon.gehr gmx.ch> writes:

On 09/05/2011 03:33 PM, Andrei Alexandrescu wrote:
 On 9/5/11 12:22 AM, Walter Bright wrote:
 On 9/4/2011 7:34 PM, Andrei Alexandrescu wrote:
 On 9/4/11 22:20 EDT, kenji hara wrote:
 2011/9/4 Andrei Alexandrescu<SeeWebsiteForEmail erdani.org>:
 There are suggestions to add this method to Object:

 void writeTo(void delegate(const(char)[]) sink, string format = null);

 I think const void toString(scope void delegate(const(char)[]) sink,
 string format = null); is more better than that, even if it is
 different from DIP9.
 That is already used in std.bigint, std.complex, and std.format
 already support it.

 Kenji Hara

 Works for me. Walter?

 It'll break every D program.

 Probably you and I have a different thing in mind. I'm thinking of
 adding that alongside the existing toString.

 Thinking more about it, I fear that ascribing the two overloads the same
 name will cause problems when e.g. a class overrides one overload thus
 hiding the other.

 So we should look for a different name. Which?

I think writeTo is a suitable name.

Sep 05 2011

David Nadlinger <see klickverbot.at> writes:

On 9/5/11 4:20 AM, kenji hara wrote:
 I think const void toString(scope void delegate(const(char)[]) sink,
 string format = null); is more better than that, even if it is
 different from DIP9.
 That is already used in std.bigint, std.complex, and std.format
 already support it.

Requiring only a scoped delegate certainly makes sense, but I'm not too 
sure about the name – to me, toString() suggests a function returning a 
string, not void.

David

Sep 04 2011

Sean Kelly <sean invisibleduck.org> writes:

On Sep 3, 2011, at 9:06 PM, Andrei Alexandrescu wrote:

 On 8/30/11 8:59 PM, Paul D. Anderson wrote:
 Can someone clarify for me the status and/or direction of string
 formatting in D?

 [snip]

 I agree there are major inefficiencies and composability problems
 caused by a blind toString() that creates a whole new string without any
 assistance. So we need to fix that.

 There are suggestions to add this method to Object:

 void writeTo(void delegate(const(char)[]) sink, string format = null);

 Then, the suggestion goes, whether or not we deprecate toString, in
 the short term it should be implemented in terms of writeTo.

 There are a few questions raised by this proposal:

 1. Okay, this takes care of streaming text. How about streaming in
 binary format?

 2. Since we have a relatively involved "output to text" routine, how
 about an "input from text" routine? If writeTo is there, where is
 readFrom?

Right.  Which is why I've suggested in the past that we may want to use
the serialization calls for toString.

Sep 05 2011

"Marco Leise" <Marco.Leise gmx.de> writes:

On 05.09.2011 at 19:51, Sean Kelly <sean invisibleduck.org> wrote:

 On Sep 3, 2011, at 9:06 PM, Andrei Alexandrescu wrote:

 On 8/30/11 8:59 PM, Paul D. Anderson wrote:
 Can someone clarify for me the status and/or direction of string
 formatting in D?

 [snip]

 I agree there are major inefficiencies and composability problems  
 caused by a blind toString() that creates a whole new string without  
 any assistance. So we need to fix that.

 There are suggestions to add this method to Object:

 void writeTo(void delegate(const(char)[]) sink, string format = null);

 Then, the suggestion goes, whether or not we deprecate toString, in the  
 short term it should be implemented in terms of writeTo.

 There are a few questions raised by this proposal:

 1. Okay, this takes care of streaming text. How about streaming in  
 binary format?

 2. Since we have a relatively involved "output to text" routine, how  
 about an "input from text" routine? If writeTo is there, where is  
 readFrom?

 Right.  Which is why I've suggested in the past that we may want to use  
 the serialization calls for toString.

I'm highly skeptical to say the least :). I know there are languages that  
serialize solely through text representations of the data, like  
JavaScript, but I've yet to see this mix work in a systems language. What  
serialization calls do you refer to?

- Marco

Sep 05 2011

Jacob Carlborg <doob me.com> writes:

On 2011-09-05 20:37, Marco Leise wrote:
 On 05.09.2011 at 19:51, Sean Kelly <sean invisibleduck.org> wrote:
 Right. Which is why I've suggested in the past that we may want to use
 the serialization calls for toString.

 I'm highly skeptical to say the least :). I know there are languages
 that serialize solely through text representations of the data, like
 JavaScript, but I've yet to see this mix work in a systems language.
 What serialization calls do you refer to?

 - Marco

If we ever get a serialization package in Phobos, Orange for example: 
http://www.dsource.org/projects/orange

-- 
/Jacob Carlborg

Sep 06 2011

"Marco Leise" <Marco.Leise gmx.de> writes:

On 06.09.2011 at 11:12, Jacob Carlborg <doob me.com> wrote:

 On 2011-09-05 20:37, Marco Leise wrote:
 On 05.09.2011 at 19:51, Sean Kelly <sean invisibleduck.org> wrote:
 Right. Which is why I've suggested in the past that we may want to use
 the serialization calls for toString.

 I'm highly skeptical to say the least :). I know there are languages
 that serialize solely through text representations of the data, like
 JavaScript, but I've yet to see this mix work in a systems language.
 What serialization calls do you refer to?

 - Marco

 If we ever get a serialization package in Phobos, Orange for example:  
 http://www.dsource.org/projects/orange

Ok I get the picture, but the details are vague.

- How are pointers printed? As a hex value or as the data they point to  
(flat toString vs. deep toString). A serialization API typically follows  
class references and pointers.

- What do you do with classes that in Java don't inherit the Serializable  
interface. Thread.toString() for example should - in my eyes - print the  
thread id or pointer and the thread name if available, maybe also the  
thread group.

And that's why I keep repeating that toString() is different from  
serialization. It can _assist_ if you know you just want to print all  
members of a struct in their default representation (which is what you  
often want), but not replace it. Maybe that is what Sean meant to say, but  
I wanted to clarify that.

Sep 06 2011

Jacob Carlborg <doob me.com> writes:

On 2011-09-06 18:15, Marco Leise wrote:
 Ok I get the picture, but the details are vague.

 - How are pointers printed? As a hex value or as the data they point to
 (flat toString vs. deep toString). A serialization API typically follows
 class references and pointers.

If the pointer points to a value that has been or later will be
serialized as well, it will just print it as a reference. If the pointed-to
value is not serialized, it will print the pointed-to data.

 - What do you do with classes that in Java don't inherit the
 Serializable interface. Thread.toString() for example should - in my
 eyes - print the thread id or pointer and the thread name if available,
 maybe also the thread group.

In my Orange "Serializable" isn't needed. It will try to serialize 
everything unless otherwise told, i.e. there's a NonSerialized mixin. 
But for Thread.toString() you would most likely not use the 
serialization library.

 And that's why I keep repeating that toString() is different from
 serialization. It can _assist_ if you know you just want to print all
 members of a struct in their default representation (which is what you
 often want), but not replace it. Maybe that is what Sean meant to say,
 but I wanted to clarify that.

I think that's what Sean is trying to say.

-- 
/Jacob Carlborg

Sep 07 2011

kenji hara <k.hara.pg gmail.com> writes:

I have already posted some pull requests around formatting.

After merging them, we can use const void toString(scope void
delegate(const(char)[]) sink, ...) everywhere.

If you need custom formatting for a class or struct, you can define
a toString that takes a sink.

class UserClass {
    string name;
    double value;

    // version taking a sink and a format string
    const void toString(scope void delegate(const(char)[]) sink, string formatStr) {
        formattedWrite(sink, "{%s %s}", name, value);
    }
    // version taking a sink and a FormatSpec!char, more efficient than the above
    const void toString(scope void delegate(const(char)[]) sink, FormatSpec!char f) {
        std.range.put(sink, '{');
        std.range.put(sink, name);
        std.range.put(sink, ' ');
        // Pass the spec through to the formatting of the 'value' field,
        // which then supports %s, %g, %a ...
        formatValue(sink, value, f);
        std.range.put(sink, '}');
    }
}

And if you need heap-allocated formatting, you can write it as follows:

auto obj = new UserClass("name", 1.0);
assert(std.conv.to!string(obj) == "{name 1.0}");  // uses Appender + formatValue internally
assert(std.string.format("%s", obj) == "{name 1.0}");  // ditto

And if you really need to format into a stack-allocated buffer:

char[20] buf;
// When buf.length is insufficient, FormatError is thrown
char[] result = std.string.sformat(buf[], "%s", obj);
assert(result == "{name 1.0}");
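
For readers who want to try this against a current Phobos, a self-contained variant might look like the sketch below. It is not the code from the pull requests: the constructor is added so that `new UserClass(...)` compiles, the spec is taken by `const ref`, and the value 1.5 is used because, to my knowledge, "%s" prints the double 1.0 as "1".

```d
import std.format : FormatSpec, format, formatValue, sformat;

class UserClass
{
    string name;
    double value;

    this(string name, double value)
    {
        this.name = name;
        this.value = value;
    }

    // Sink-taking toString; std.format detects and calls this overload.
    void toString(scope void delegate(const(char)[]) sink,
                  scope const ref FormatSpec!char f) const
    {
        sink("{");
        sink(name);
        sink(" ");
        formatValue(sink, value, f);   // the spec applies to `value` only
        sink("}");
    }
}

unittest
{
    auto obj = new UserClass("name", 1.5);
    assert(format("%s", obj) == "{name 1.5}");
    assert(format("%.2f", obj) == "{name 1.50}");

    // Formatting into a stack-allocated buffer.
    char[20] buf;
    assert(sformat(buf[], "%s", obj) == "{name 1.5}");
}
```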

Kenji Hara

Sep 04 2011

Timon Gehr <timon.gehr gmx.ch> writes:

On 09/05/2011 04:30 AM, kenji hara wrote:
 I have already posted some pull requests around formatting.







 After merging them, we can use const void toString(scope void
 delegate(const(char)[]) sink, ...) at all.

Great. Thank you.

 If you need custom formatting with class/struct, you can define
 toString taking sink.

 class UserClass {
      string name;
      double value;

      // version taking a sink and a format string
      const void toString(scope void delegate(const(char)[]) sink, string formatStr) {
          formattedWrite(sink, "{%s %s}", name, value);
      }
      // version taking a sink and a FormatSpec!char, more efficient than the above
      const void toString(scope void delegate(const(char)[]) sink, FormatSpec!char f) {
          std.range.put(sink, '{');
          std.range.put(sink, name);
          std.range.put(sink, ' ');
          // Pass the spec through to the formatting of the 'value' field,
          // which then supports %s, %g, %a ...
          formatValue(sink, value, f);
          std.range.put(sink, '}');
      }
 }

 And if you need heap-allocated formatting, you can write it as follows:

 auto obj = new UserClass("name", 1.0);
 assert(std.conv.to!string(obj) == "{name 1.0}");  // uses Appender + formatValue internally

appender is slower than direct appending unless you are dealing with 
quite long arrays. I am not sure it is a good fit for this case, because 
the strings returned are usually quite short.

 assert(std.string.format("%s", obj) == "{name 1.0}");  // ditto

 , and if you really need formatting into stack-allocated buffer:

 char[20] buf;
 // When buf.length is insufficient, FormatError is thrown
 char[] result = std.string.sformat(buf[], "%s", obj);

I think throwing an Error might be overkill, an Exception should suffice.

 assert(result == "{name 1.0}");

 Kenji Hara

Sep 04 2011

kenji hara <k.hara.pg gmail.com> writes:

2011/9/5 Timon Gehr <timon.gehr gmx.ch>:
 On 09/05/2011 04:30 AM, kenji hara wrote:
 char[20] buf;
 // When buf.length is insufficient, FormatError is thrown
 char[] result = std.string.sformat(buf[], "%s", obj);

 I think throwing an Error might be overkill, an Exception should suffice.

I had the same concern, so I'm working on a fix. Please wait a while.

Kenji

Sep 04 2011
