
digitalmars.D - toString or not toString

Paul D. Anderson <paul.d.removethis.anderson comcast.andthis.net> writes:
Can someone clarify for me the status and/or direction of string formatting in
D? 

We've got:

1. toString, the object method with no parameters.
2. toString(sink, format)
3. to!string()
4. format
5. writef/writefln
6. write/writeln

I realize these exist for various reasons: some (1, 3) are simple (unformatted)
conversions, while others (2, 4-6) are designed to provide configurable
formatting. The problem is that they are inconsistent with each other.

Using std.bigint as an example: 1, 3, 4 and 6 don't work, or don't work as
expected (to me at least). 1 prints 'BigInt'; 3 and 4 are compile errors.

I know bigint is a controversial example because Don has strong feelings
against 1 and favors 2. (See bug #5231). I don't really have an opinion one way
or the other but I need to know what to implement in my arbitrary-precision
floating point module. This obviously relies heavily on bigint.

So, is there a transition underway in the language (or just Phobos) from
toString, writeln and format, to toString(sink,format) and writefln?

Or is this just a divergence of views, both of which are acceptable and we'll
have to get used to choosing one or the other?

Or am I just mistaken in believing there is any significant conflict?

I apologize if this has already been hashed out in the past and, if so, I would
appreciate someone pointing me to that discussion. (Or just the results of the
discussion.)

Paul
Aug 30 2011
Jonathan M Davis <jmdavisProg gmx.com> writes:

At this point, it's toString with no parameters. Don's completely out in left field with regards to how things currently work. I believe that BigInt is the _only_ example of toString(sink, format).

to!string is what you use when converting generic stuff to a string, and is probably better to use than calling toString directly. format is used when formatting strings separately from printing. write and writeln are for printing strings, and writef and writefln are for printing strings using formatting.

I don't understand why there would be any confusion over the printing functions. If you want an automatic newline, then you pick one that ends in ln, and if you want formatting, then you pick one that ends in f (fln for both). The printing functions are not going to change at this point, and neither is format. They're for different purposes.

Now, what may change is toString on objects. In part due to Don's stance on the matter, there has been some discussion of creating a new function to replace toString called writeTo, which would be similar to toString(sink, format). It would integrate with std.conv.to, format, and the printing functions. And if you wanted to convert something to a string, you'd use to!string rather than calling writeTo directly. The DIP for it is here:

http://www.prowiki.org/wiki4d/wiki.cgi?LanguageDevel/DIPs/DIP9

Unfortunately, the proposal seems to have gone nowhere thus far. Until it does, pretty much every object is just going to use toString without parameters, and the problems with BigInt's toString remain. However, if the proposal actually gets implemented, then the issue should be able to be sorted out. Objects would have writeTo and toString would presumably be deprecated.

- Jonathan M Davis
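
For readers skimming the thread, here is a minimal sketch of how the conversion and printing functions described above divide the work. The helper names plainText and hexText are hypothetical and exist only for this illustration; the library calls are the std.conv, std.format and std.stdio functions named in the post.

```d
import std.conv : to;
import std.format : format;
import std.stdio : write, writeln, writef, writefln;

// Hypothetical helpers, named only for this sketch.
string plainText(int n) { return to!string(n); }    // unformatted conversion
string hexText(int n)   { return format("%x", n); } // formatted, but not printed

void main()
{
    int n = 255;
    assert(plainText(n) == "255");
    assert(hexText(n) == "ff");

    write(n, " ");      // default formatting, no newline
    writeln(n);         // default formatting, newline appended
    writef("%x ", n);   // format string, no newline
    writefln("%5d", n); // format string, newline appended
}
```

The "f" suffix selects a format string and the "ln" suffix selects the trailing newline, which is the naming rule the post describes.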
Aug 30 2011
Don <nospam nospam.com> writes:
On 31.08.2011 04:41, Jonathan M Davis wrote:

At this point, it's toString with no parameters. Don's completely out in left field with regards to how things currently work. I believe that BigInt is the _only_ example of toString(sink, format).

Try to write BigFloat such that:

    BigFloat f = 2.3e69;
    writefln("%f %g", f, f);

will work. It's just not possible. toString with no parameters does not work, and CANNOT work. It just can't.

I implemented something quickly which actually works. But, it's just a stop-gap measure until this black hole in the language gets fixed. We really need the format string to be exposed in a digested manner.
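
To make the point concrete, here is a minimal sketch of a type that sees the caller's format specifier. It assumes a recent Phobos, where std.format recognizes a toString overload taking a sink delegate plus a parsed FormatSpec; this is not necessarily the interface BigInt had at the time of this thread. Approx is a hypothetical stand-in that only wraps a double, so the sketch forwards the spec instead of producing its own digits.

```d
import std.format : FormatSpec, format, formatValue;

// Hypothetical stand-in for a BigFloat-like type; it only wraps a double.
struct Approx
{
    double value;

    // Sink-based conversion: std.format hands over the parsed specifier
    // (%f, %g, %e, width, precision), so the type can honor it.
    void toString(scope void delegate(const(char)[]) sink,
                  scope const ref FormatSpec!char fmt) const
    {
        // A real arbitrary-precision type would generate its own digits
        // according to fmt.spec and fmt.precision; here we just forward.
        formatValue(sink, value, fmt);
    }
}

void main()
{
    auto f = Approx(2.5);

    // Each specifier reaches toString, so the output matches a plain double.
    assert(format("%f %g", f, f) == format("%f %g", 2.5, 2.5));
    assert(format("%.2f", f) == "2.50");
}
```

A parameterless string toString() has no way to distinguish the "%f" call from the "%g" call, which is the limitation described above.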
Aug 30 2011
Don <nospam nospam.com> writes:
On 31.08.2011 09:03, Jonathan M Davis wrote:

The thing is that all most people care about is converting the object to a string. They don't care about %d vs %x or %f vs %g. They just want it to be converted to a string. So, the lack of a toString is a major impediment.

That is simply not true. When you print a floating-point number, you almost ALWAYS use a format string.

 Now, I can definitely see why you would want to have toString/writeTo work with
 a format string, and I think that ultimately writeTo is probably a good
 solution (though it does seem to be a bit overkill for the average situation),
 but the truth of the matter is that while your concerns are perfectly valid,
 most people wouldn't even think of the issues that you're seeing with BigInt
 or BigFloat. They just want to convert them to a string as decimal values.

Compare with C++ iostreams. There are two fundamental features: (1) the formatting to be used is specified; (2) components are output piece-by-piece. The inability to do either of these things is not a minor limitation of string toString(). They are absolutely fundamental. Now, once you have the full functionality, you can think about how to simplify the common, trivial cases. But you cannot argue the other way.
Aug 31 2011
Timon Gehr <timon.gehr gmx.ch> writes:
On 08/31/2011 10:51 AM, Don wrote:

Compare with C++ iostreams. There are two fundamental features: (1) the formatting to be used is specified; (2) components are output piece-by-piece. The inability to do either of these things is not a minor limitation of string toString(). They are absolutely fundamental. Now, once you have the full functionality, you can think about how to simplify the common, trivial cases. But you cannot argue the other way.

(3) data can be output with standard formatting in case one does not care. I claim that is used most of the time.

The inability to use to!string(bigint) or writeln(bigint) is not a minor limitation either. What stops bigint from having an overload

    string toString() {
        string r;
        toString((const(char)[] x) { r ~= x; }, "d");
        return r;
    }

that would actually work with the rest of current Phobos? The current design has never been any more than an annoyance to me. Most generic code that wants to work with BigInt has to specifically test for it. That is unacceptable.

It is very good to have the possibility to specify formatting but
1. It should not be the only way to do things.
2. The method should really not be called toString, but writeTo.
Aug 31 2011
Timon Gehr <timon.gehr gmx.ch> writes:
On 08/31/2011 04:41 AM, Jonathan M Davis wrote:
 Objects would have writeTo and toString would presumably be
 deprecated.

I have never understood the rationale behind deprecating toString once we have writeTo. Why should it be deprecated? toString is great in case you just want to quickly and easily convert something to a string, and later, if formatting or more efficient output etc. is needed, the method can transparently be replaced by writeTo.
Aug 31 2011
Don <nospam nospam.com> writes:
On 31.08.2011 14:35, Timon Gehr wrote:
 On 08/31/2011 04:41 AM, Jonathan M Davis wrote:
 Objects would have writeTo and toString would presumably be
 deprecated.

I have never understood the rationale behind deprecating toString once we have writeTo. Why should it be deprecated?

Code bloat. Every struct contains string toString(). Quite unnecessarily, since it can always be synthesized from the more complete version.

 toString is great in case
 you just want to quickly and easily convert something to a string, and
 later, if formatting or more efficient output etc. is needed, the method
 can transparently be replaced by writeTo.

BTW, you do realize that code using writeTo is shorter in most cases? The reason is, that it can omit all the calls to format(). Pretty much the only time when toString is simpler, is when it is a single call to format(). It's only really the signature which is more complicated.
Sep 01 2011
Timon Gehr <timon.gehr gmx.ch> writes:
On 09/01/2011 09:41 PM, Don wrote:
 On 31.08.2011 14:35, Timon Gehr wrote:
 On 08/31/2011 04:41 AM, Jonathan M Davis wrote:
 Objects would have writeTo and toString would presumably be
 deprecated.

I have never understood the rationale behind deprecating toString once we have writeTo. Why should it be deprecated?

Code bloat. Every struct contains string toString(). Quite unnecessarily, since it can always be synthesized from the more complete version.

I was just suggesting to keep the existing support for toString() inside to, format etc. Of course, all the structs in Phobos should probably completely migrate to writeTo.

 toString is great in case
 you just want to quickly and easily convert something to a string, and
 later, if formatting or more efficient output etc. is needed, the method
 can transparently be replaced by writeTo.

BTW, you do realize that code using writeTo is shorter in most cases? The reason is, that it can omit all the calls to format(). Pretty much the only time when toString is simpler, is when it is a single call to format(). It's only really the signature which is more complicated.

I am not convinced:

    struct S {
        int x, y, z;

        void writeTo(void delegate(const(char)[]) sink, string format = null) {
            sink("(");
            .writeTo(x, sink, "d"); // still no UFCS
            sink(", ");
            .writeTo(y, sink, "d");
            sink(", ");
            .writeTo(z, sink, "d");
            sink(")");
        }

        string toString() {
            return "(" ~ join(map!(to!string)([x, y, z]), ", ") ~ ")";
        }
    }
Sep 01 2011
Timon Gehr <timon.gehr gmx.ch> writes:
On 09/01/2011 10:57 PM, Steven Schveighoffer wrote:
 On Thu, 01 Sep 2011 16:26:55 -0400, Timon Gehr <timon.gehr gmx.ch> wrote:

 On 09/01/2011 09:41 PM, Don wrote:
 On 31.08.2011 14:35, Timon Gehr wrote:
 On 08/31/2011 04:41 AM, Jonathan M Davis wrote:
 Objects would have writeTo and toString would presumably be
 deprecated.

I have never understood the rationale behind deprecating toString once we have writeTo. Why should it be deprecated?

Code bloat. Every struct contains string toString(). Quite unnecessarily, since it can always be synthesized from the more complete version.

I was just suggesting to keep the existing support for toString() inside to, format etc. Of course, all the structs in Phobos should probably completely migrate to writeTo.

Why? It's trivial to write writeTo if you have already written toString. There is a clear deprecation path in the DIP.

Exactly that is my point, it is trivial and tedious.

Um...

    formattedWrite(sink, "(%d, %d, %d)", x, y, z);

-Steve

I see. =). Still, the signature of writeTo is about as large as my entire toString function.
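
For comparison, here is a minimal runnable sketch of Steve's one-liner applied to the struct S from this exchange. writeTo was only a proposal, so the sketch assumes a recent Phobos and spells the method as the sink-accepting toString overload that std.format recognizes there; the format parameter is simply left out.

```d
import std.conv : to;
import std.format : format, formattedWrite;

struct S
{
    int x, y, z;

    // Sink-based conversion without a format parameter; the body is
    // Steve's single formattedWrite call.
    void toString(scope void delegate(const(char)[]) sink) const
    {
        formattedWrite(sink, "(%d, %d, %d)", x, y, z);
    }
}

void main()
{
    auto s = S(1, 2, 3);

    // to!string and format both route through the sink overload.
    assert(to!string(s) == "(1, 2, 3)");
    assert(format("[%s]", s) == "[(1, 2, 3)]");
}
```

The body is one line; what remains longer than the string-returning version is the signature, which is the trade-off being argued here.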
Sep 01 2011
Timon Gehr <timon.gehr gmx.ch> writes:
On 09/02/2011 03:47 PM, Steven Schveighoffer wrote:

Here, use this:

    const writeToImpl = "void writeTo(void delegate(const(char)[]) sink, string format = null) { sink(this.toString()); }";

    // put this line in all your classes/structs where you don't feel like writing a proper writeTo.
    mixin(writeToImpl);

The sink type could be aliased. But this is really getting into minor issues :) The amount of power and performance you get by switching to writeTo is well worth the extra parameters.

I don't agree those are minor, because this is going into the standard library and should respect all use cases. Basically, what should be done is:

1. provide an alias void delegate(const(char)[]) Sink; This should be in std.conv or std.format, because nobody wants to add it to every single module and if there is a standard way to handle it, no maintenance programmer will be confused by the alias.

2. the format parameter should be completely optional in the signature.

Because then, writeTo wins not only at the efficiency and flexibility part, but also on the 'pleasant to write' part.

    void writeTo(Sink s){ ... }
    string toString(){ return ... }
Sep 02 2011
Timon Gehr <timon.gehr gmx.ch> writes:
On 09/02/2011 06:15 PM, Steven Schveighoffer wrote:
 On Fri, 02 Sep 2011 12:04:08 -0400, Timon Gehr <timon.gehr gmx.ch> wrote:

 On 09/02/2011 03:47 PM, Steven Schveighoffer wrote:

 The sink type could be aliased. But this is really getting into minor
 issues :) The amount of power and performance you get by switching to
 writeTo is well worth the extra parameters.

I don't agree those are minor, because this is going into the standard library and should respect all use cases. Basically, what should be done is: 1. provide an alias void delegate(const(char)[]) Sink; This should be in std.conv; or std.format;, because nobody wants to add it to every single module and if there is a standard way to handle it, no maintenance programmer will be confused by alias.

It needs to go into object.di, because Object needs it.

Object could, in theory, just use delegate(const(char)[]). But I agree that putting it in object.di would be the cleanest solution.

 2. the format parameter should be completely optional in the signature.

This is probably impossible. Just for the object case alone, writeTo needs to be declared in Object, which means you'd have to override it with the same parameters.

Oh, yes, for classes it cannot work. But structs are more flexible.

 It's one of the reasons the sink has to stick with one char width.

Probably the library code should still make use of structs or classes that provide the appropriate overloads. If somebody is in desperate need of having, say, a dchar sink for their classes, they could then define their own root class.

 Because then, writeTo wins not only at the efficiency and flexibility
 part, but also on the 'pleasant to write' part.

 void writeTo(Sink s){ ... }
 string toString(){ return ... }

I think this works if you want to ignore the format string:

    void writeTo(Sink s, string) {...}

Probably the best we can get.

For classes the best we can get is

    override void writeTo(Sink s, string) {...}

Because override adds quite some bloat anyways, the additional ignored string argument is not a big issue. But structs are more flexible than that.
Sep 02 2011
Timon Gehr <timon.gehr gmx.ch> writes:
On 09/02/2011 07:46 PM, Steven Schveighoffer wrote:
 On Fri, 02 Sep 2011 13:17:02 -0400, Timon Gehr <timon.gehr gmx.ch> wrote:

 On 09/02/2011 06:15 PM, Steven Schveighoffer wrote:
 On Fri, 02 Sep 2011 12:04:08 -0400, Timon Gehr <timon.gehr gmx.ch>
 wrote:
 2. the format parameter should be completely optional in the signature.

This is probably impossible. Just for the object case alone, writeTo needs to be declared in Object, which means you'd have to override it with the same parameters.

Oh, yes, for classes it cannot work. But structs are more flexible.

Yes and no. There is a kludgy "interface" that all structs provide. Its value is somewhat suspect, but it allows some RTTI for structs. For example the xtoString member of the TypeInfo_Struct. It's arguable that the value of this interface is very low -- currently it enables things like the builtin sort property on arrays (which I think should be abolished ASAP), and allows AA's current implementation (which does not use templates).

I did know that there was some RTTI for the inefficient built-in sort, but I did not know that xtoString is in that interface. So basically, rethinking struct RTTI and changing the compiler to reflect that is the main thing that makes the DIP unpleasant to implement?

 It's one of the reasons the sink has to stick with one char width.

Probably the library code should still make use of structs or classes that provide the appropriate overloads. If somebody is in desperate need of having, say, a dchar sink for their classes, they could then define their own root class.

It's actually probably a benefit to stick with char:

1. That's the default output width for streams
2. It's the default width for what most people consider strings (in fact, the string type).
3. It's pretty simple to convert char[] to wchar[] or dchar[], without incurring much penalty.

I think the library might be able to, in the future, deal with templated writeTo, but there are many things that would need changing.
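
As an illustration of point 3, here is a minimal sketch of widening text that a char-based sink has collected, using std.conv.to. The helper names toUtf16 and toUtf32 are hypothetical; the sketch says nothing about the cost of the conversion, only that it is a single call.

```d
import std.conv : to;

// Hypothetical helpers: widen text that a char-based sink has collected.
wstring toUtf16(const(char)[] s) { return to!wstring(s); }
dstring toUtf32(const(char)[] s) { return to!dstring(s); }

void main()
{
    // Five code points; U+00F6 and U+00DF each take two UTF-8 code units.
    const(char)[] collected = "gr\u00F6\u00DFe";
    assert(collected.length == 7);

    assert(toUtf16(collected) == "gr\u00F6\u00DFe"w);
    assert(toUtf32(collected).length == 5); // one code unit per code point
}
```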

I guess 'properly' supporting wchar and dchar is not a high priority anyways.

 Because then, writeTo wins not only at the efficiency and flexibility
 part, but also on the 'pleasant to write' part.

 void writeTo(Sink s){ ... }
 string toString(){ return ... }

I think this works if you want to ignore the format string: void writeTo(Sink s, string) {...} Probably the best we can get.

For classes the best we can get is override void writeTo(Sink s, string) {...} Because override adds quite some bloat anyways, the additional ignored string argument is not a big issue. But structs are more flexible than that.

Yes, I wouldn't be sorry to see the special treatment of certain struct functions go away (i.e. the kludgy "interface" mentioned above). In that case, making the format part optional is fine for structs.

I would be very happy to see struct RTTI go away, together with built-in sort. Are there other features that rely on struct RTTI or is it only built-in sort and AAs?
Sep 02 2011
Timon Gehr <timon.gehr gmx.ch> writes:
On 09/03/2011 01:38 AM, Timon Gehr wrote:
 I would be very happy to see struct RTTI go away, together with built-in
 sort. Are there other features that rely on struct RTTI or is it only
 built-in sort and AAs?

And the GC that needs to call destructors obv.
Sep 02 2011
travert phare.normalesup.org (Christophe) writes:
 1. provide an alias void delegate(const(char)[]) Sink; This should 
 be in std.conv; or std.format;, because nobody wants to add it to 
 every single module and if there is a standard way to handle it, no 
 maintenance programmer will be confused by alias.

it needs to go into object.di, because Object needs it.

Object could, in theory, just use delegate(const(char)[]). But I agree that putting it in object.di would be the cleanest solution.

I disagree. void delegate(const(char)[]) means something, whereas Sink is rather obscure. Providing the alias in the library seems fine, but providing it in the language is too much IMO.
Sep 03 2011
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/3/11 5:41 AM, Christophe wrote:

I disagree. void delegate(const(char)[]) means something, whereas Sink is rather obscure. Providing the alias in the library seems fine, but providing it in the language is too much IMO.

Even in the library "Sink" is too vague to be useful as a top-level symbol.

Andrei
Sep 03 2011
Timon Gehr <timon.gehr gmx.ch> writes:
On 09/03/2011 07:21 PM, Andrei Alexandrescu wrote:

Even in the library "Sink" is too vague to be useful as a top-level symbol. Andrei

I am quite sure that useful is the same as short/too vague in this case. Are you suggesting not to add an alias at all?
Sep 03 2011
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/3/11 4:40 PM, Timon Gehr wrote:

I am quite sure that useful is the same as short/too vague in this case. Are you suggesting not to add an alias at all?

There are vastly better names than Sink. TextSink, TextWriter, StringWriter, StringSink (heh), StringStreamer, ... Andrei
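
A minimal sketch of the alias being debated, assuming the name TextSink (one of the candidates listed above; nothing in this thread confirms that any such alias exists in object.di or Phobos). The Version struct is purely illustrative.

```d
import std.conv : to;

// Hypothetical name: the thread floats Sink, TextSink, TextWriter and others.
alias TextSink = void delegate(const(char)[]);

struct Version
{
    int major, minor;

    // Sink-based conversion: each piece is handed to the caller as it is produced.
    void toString(scope TextSink sink) const
    {
        sink(major.to!string);
        sink(".");
        sink(minor.to!string);
    }
}

// Build with: dmd -unittest -main -run file.d
unittest
{
    string collected;
    Version(2, 55).toString((const(char)[] piece) { collected ~= piece; });
    assert(collected == "2.55");
}
```

Whatever the alias ends up being called, the signature it abbreviates stays the same, so code written against the long form keeps compiling.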
Sep 03 2011
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 09/04/2011 05:46 AM, Andrei Alexandrescu wrote:
 On 9/3/11 4:40 PM, Timon Gehr wrote:
 On 09/03/2011 07:21 PM, Andrei Alexandrescu wrote:
 On 9/3/11 5:41 AM, Christophe wrote:
 1. provide an alias void delegate(const(char)[]) Sink; This should
 be in std.conv; or std.format;, because nobody wants to add it to
 every single module and if there is a standard way to handle it, no
 maintenance programmer will be confused by alias.

it needs to go into object.di, because Object needs it.

Object could, in theory, just use delegate(const(char)[]). But I agree that putting it in object.di would be the cleanest solution.

I disagree. void delegate(const(char)[]) means something, whereas Sink is rather obscure.

 providing it in the language is too much IMO.

Even in the library "Sink" is too vague to be useful as a top-level symbol. Andrei

I am quite sure that useful is the same as short/too vague in this case. Are you suggesting not to add an alias at all?

There are vastly better names than Sink. TextSink, TextWriter, StringWriter, StringSink (heh), StringStreamer, ... Andrei

'string' is quite obscure/vague too, if you don't know what it is. The 'string' alias in object.di should probably be renamed to TailImmutableDynamicCharArray. :o) imho better = shorter, because that is the whole point of providing an alias. 'Sink' stops being vague as soon as people know what it is. That should be quite early.
Sep 05 2011
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/5/11 8:38 AM, Timon Gehr wrote:
 On 09/04/2011 05:46 AM, Andrei Alexandrescu wrote:
 On 9/3/11 4:40 PM, Timon Gehr wrote:
 On 09/03/2011 07:21 PM, Andrei Alexandrescu wrote:
 On 9/3/11 5:41 AM, Christophe wrote:
 1. provide an alias void delegate(const(char)[]) Sink; This should
 be in std.conv; or std.format;, because nobody wants to add it to
 every single module and if there is a standard way to handle it, no
 maintenance programmer will be confused by alias.

it needs to go into object.di, because Object needs it.

Object could, in theory, just use delegate(const(char)[]). But I agree that putting it in object.di would be the cleanest solution.

I disagree. void delegate(const(char)[]) means something, whereas Sink is rather obscure.

 providing it in the language is too much IMO.

Even in the library "Sink" is too vague to be useful as a top-level symbol. Andrei

I am quite sure that useful is the same as short/too vague in this case. Are you suggesting not to add an alias at all?

There are vastly better names than Sink. TextSink, TextWriter, StringWriter, StringSink (heh), StringStreamer, ... Andrei

'string' is quite obscure/vague too, if you don't know what it is. The 'string' alias in object.di should probably be renamed to TailImmutableDynamicCharArray. :o)

I strongly disagree. In many programming languages, "string" denotes the default abstraction for text representation. "Sink" has nowhere near that brand power.
 imho better = shorter, because that is the whole point of providing an
 alias. 'Sink' stops being vague as soon as people know what it is. That
 should be quite early.

Again I completely disagree with all these three statements, sorry. Andrei
Sep 05 2011
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 09/05/2011 03:29 PM, Andrei Alexandrescu wrote:
 On 9/5/11 8:38 AM, Timon Gehr wrote:
 On 09/04/2011 05:46 AM, Andrei Alexandrescu wrote:
 On 9/3/11 4:40 PM, Timon Gehr wrote:
 On 09/03/2011 07:21 PM, Andrei Alexandrescu wrote:
 On 9/3/11 5:41 AM, Christophe wrote:
 1. provide an alias void delegate(const(char)[]) Sink; This should
 be in std.conv; or std.format;, because nobody wants to add it to
 every single module and if there is a standard way to handle
 it, no
 maintenance programmer will be confused by alias.

it needs to go into object.di, because Object needs it.

Object could, in theory, just use delegate(const(char)[]). But I agree that putting it in object.di would be the cleanest solution.

I disagree. void delegate(const(char)[]) means something, whereas Sink is rather obscure.

 providing it in the language is too much IMO.

Even in the library "Sink" is too vague to be useful as a top-level symbol. Andrei

I am quite sure that useful is the same as short/too vague in this case. Are you suggesting not to add an alias at all?

There are vastly better names than Sink. TextSink, TextWriter, StringWriter, StringSink (heh), StringStreamer, ... Andrei

'string' is quite obscure/vague too, if you don't know what it is. The 'string' alias in object.di should probably be renamed to TailImmutableDynamicCharArray. :o)

I strongly disagree. In many programming languages, "string" denotes the default abstraction for text representation. "Sink" has nowhere near that brand power.

Sure, but that was not a valid argument when the term was introduced. BTW: http://www.google.com/search?channel=fs&q=string&um=1&tbm=isch
 imho better = shorter, because that is the whole point of providing an
 alias. 'Sink' stops being vague as soon as people know what it is. That
 should be quite early.

Again I completely disagree with all these three statements, sorry.

Why would you want to have an alias if not to relieve people from writing cumbersome boilerplate?
Sep 05 2011
prev sibling parent Sean Kelly <sean invisibleduck.org> writes:
On Sep 3, 2011, at 2:41 AM, Christophe wrote:

 1. provide an alias void delegate(const(char)[]) Sink; This should
 be in std.conv; or std.format;, because nobody wants to add it to
 every single module and if there is a standard way to handle it, no
 maintenance programmer will be confused by alias.

it needs to go into object.di, because Object needs it.

Object could, in theory, just use delegate(const(char)[]). But I agree that putting it in object.di would be the cleanest solution.

I disagree. void delegate(const(char)[]) means something, whereas Sink is rather obscure. Providing the alias in the library seems fine, but providing it in the language is too much IMO.

It would be really great if the new toString call could be compatible with whatever serialization mechanism is added. This probably wouldn't allow the use of format strings though.
Sep 05 2011
prev sibling next sibling parent reply kennytm <kennytm gmail.com> writes:
Timon Gehr <timon.gehr gmx.ch> wrote:
 On 09/01/2011 09:41 PM, Don wrote:
 On 31.08.2011 14:35, Timon Gehr wrote:
 On 08/31/2011 04:41 AM, Jonathan M Davis wrote:
 Objects would have writeTo and toString would presumably be
 deprecated.
 

I have never understood the rationale behind deprecating toString once we have writeTo. Why should it be deprecated?

Code bloat. Every struct contains string toString(). Quite unnecessarily, since it can always be synthesized from the more complete version.

I was just suggesting to keep the existing support for toString() inside to, format etc. Of course, all the structs in Phobos should probably completely migrate to writeTo.
 
 toString is great in case
 you just want to quickly and easily convert something to a string, and
 later, if formatting or more efficient output etc. is needed, the method
 can transparently be replaced by writeTo.

BTW, you do realize that code using writeTo is shorter in most cases? The reason is, that it can omit all the calls to format(). Pretty much the only time when toString is simpler, is when it is a single call to format(). It's only really the signature which is more complicated.

I am not convinced:

struct S{
    int x,y,z;
    void writeTo(void delegate(const(char)[]) sink, string format = null){
        sink("(");
        .writeTo(x,sink,"d"); // still no UFCS
        sink(", ");
        .writeTo(y,sink,"d");
        sink(", ");
        .writeTo(z,sink,"d");
        sink(")");
    }

    string toString(){return "("~join(map!(to!string)([x,y,z]),", ")~")";}
}

to!string of array can support multiple arguments:

string toString() {
    return to!string([x, y, z], "(", ", ", ")");
}

and I believe writeTo could be made to accept extra arguments too:

struct S {
    void writeTo(SomeType sink, const char[] format = null) {
        [x, y, z].writeTo(sink, format, "(", ", ", ")");
    }
    ...
}

void writeTo(T)(T[] arr, SomeType sink, const char[] format = null, const char[] open = "[", etc) {
    ...
}
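
For comparison, current Phobos can produce the same bracket-and-separator output through the compound specifier %( ... %) of std.format. This is only a sketch of that alternative, not the to!string overload described above; the struct S mirrors the one in the quoted example.

```d
import std.format : format;

struct S
{
    int x, y, z;

    string toString() const
    {
        // %( ... %) formats every element of the range with %s; the ", "
        // that trails the last specifier inside the group is the separator.
        return format("(%(%s, %))", [x, y, z]);
    }
}

// Build with: dmd -unittest -main -run file.d
unittest
{
    assert(S(1, 2, 3).toString() == "(1, 2, 3)");
}
```

The opening and closing brackets are ordinary literal text in the format string, so they need no extra parameters.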
Sep 01 2011
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 09/01/2011 11:15 PM, kennytm wrote:
 Timon Gehr<timon.gehr gmx.ch>  wrote:
 On 09/01/2011 09:41 PM, Don wrote:
 On 31.08.2011 14:35, Timon Gehr wrote:
 On 08/31/2011 04:41 AM, Jonathan M Davis wrote:
 Objects would have writeTo and toString would presumably be
 deprecated.

I have never understood the rationale behind deprecating toString once we have writeTo. Why should it be deprecated?

Code bloat. Every struct contains string toString(). Quite unnecessarily, since it can always be synthesized from the more complete version.

I was just suggesting to keep the existing support for toString() inside to, format etc. Of course, all the structs in Phobos should probably completely migrate to writeTo.
 toString is great in case
 you just want to quickly and easily convert something to a string, and
 later, if formatting or more efficient output etc. is needed, the method
 can transparently be replaced by writeTo.

BTW, you do realize that code using writeTo is shorter in most cases? The reason is, that it can omit all the calls to format(). Pretty much the only time when toString is simpler, is when it is a single call to format(). It's only really the signature which is more complicated.

I am not convinced:

struct S{
    int x,y,z;
    void writeTo(void delegate(const(char)[]) sink, string format = null){
        sink("(");
        .writeTo(x,sink,"d"); // still no UFCS
        sink(", ");
        .writeTo(y,sink,"d");
        sink(", ");
        .writeTo(z,sink,"d");
        sink(")");
    }

    string toString(){return "("~join(map!(to!string)([x,y,z]),", ")~")";}
}

to!string of array can support multiple arguments: string toString() { return to!string([x, y, z], "(", ", ", ")"); }

This runs in ~66% of the time of Steve's formattedWrite solution. (if the delegate just appends to some string variable)
 and I believe writeTo could be made to accept extra arguments too:

 struct S {
 void writeTo(SomeType sink, const char[] format = null) {
      [x, y, z].writeTo(sink, format, "(", ", ", ")");
 }
 ...
 }

 void writeTo(T)(T[] arr, SomeType sink, const char[] format = null, const
 char[] open = "[", etc) {
    ...
 }

ok, getting better. But still, I think to!string should remain to be able to use toString if available. (in this case, a 33% speed advantage!)
Sep 01 2011
parent reply Don <nospam nospam.com> writes:
On 01.09.2011 23:35, Timon Gehr wrote:
 On 09/01/2011 11:15 PM, kennytm wrote:
 Timon Gehr<timon.gehr gmx.ch> wrote:
 On 09/01/2011 09:41 PM, Don wrote:
 On 31.08.2011 14:35, Timon Gehr wrote:
 On 08/31/2011 04:41 AM, Jonathan M Davis wrote:
 Objects would have writeTo and toString would presumably be
 deprecated.

I have never understood the rationale behind deprecating toString once we have writeTo. Why should it be deprecated?

Code bloat. Every struct contains string toString(). Quite unnecessarily, since it can always be synthesized from the more complete version.

I was just suggesting to keep the existing support for toString() inside to, format etc. Of course, all the structs in Phobos should probably completely migrate to writeTo.
 toString is great in case
 you just want to quickly and easily convert something to a string, and
 later, if formatting or more efficient output etc. is needed, the
 method
 can transparently be replaced by writeTo.

BTW, you do realize that code using writeTo is shorter in most cases? The reason is, that it can omit all the calls to format(). Pretty much the only time when toString is simpler, is when it is a single call to format(). It's only really the signature which is more complicated.

I am not convinced:

struct S{
    int x,y,z;
    void writeTo(void delegate(const(char)[]) sink, string format = null){
        sink("(");
        .writeTo(x,sink,"d"); // still no UFCS
        sink(", ");
        .writeTo(y,sink,"d");
        sink(", ");
        .writeTo(z,sink,"d");
        sink(")");
    }

    string toString(){return "("~join(map!(to!string)([x,y,z]),", ")~")";}
}

to!string of array can support multiple arguments: string toString() { return to!string([x, y, z], "(", ", ", ")"); }

This runs in ~66% of the time of Steve's formattedWrite solution. (if the delegate just appends to some string variable)
 and I believe writeTo could be made to accept extra arguments too:

 struct S {
 void writeTo(SomeType sink, const char[] format = null) {
 [x, y, z].writeTo(sink, format, "(", ", ", ")");
 }
 ...
 }

 void writeTo(T)(T[] arr, SomeType sink, const char[] format = null, const
 char[] open = "[", etc) {
 ...
 }

ok, getting better. But still, I think to!string should remain to be able to use toString if available. (in this case, a 33% speed advantage!)

If you're concerned about speed, the writeTo method is much quicker, since it doesn't require any heap activity at all.
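
A minimal sketch of the no-heap claim, assuming the sink-taking method is spelled toString (the thread's writeTo name was only a proposal) and using a hypothetical Point type. The caller owns a fixed stack buffer, and the sink copies each piece into it, so no result string is allocated.

```d
import std.format : formattedWrite;

struct Point
{
    int x, y, z;

    // The type only describes its output; the caller decides where it goes.
    void toString(scope void delegate(const(char)[]) sink) const
    {
        formattedWrite(sink, "(%d, %d, %d)", x, y, z);
    }
}

// Build with: dmd -unittest -main -run file.d
unittest
{
    char[64] buffer;   // lives on the stack; assumed large enough for this output
    size_t used;

    // Sink that copies each piece into the fixed buffer.
    void intoBuffer(const(char)[] piece)
    {
        buffer[used .. used + piece.length] = piece[];
        used += piece.length;
    }

    Point(1, 2, 3).toString(&intoBuffer);
    assert(buffer[0 .. used] == "(1, 2, 3)");
}
```

The same method can feed a file, a socket, or a growing string just by swapping the delegate.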
Sep 01 2011
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 09/02/2011 03:29 AM, Don wrote:
 On 01.09.2011 23:35, Timon Gehr wrote:
 On 09/01/2011 11:15 PM, kennytm wrote:
 Timon Gehr<timon.gehr gmx.ch> wrote:
 On 09/01/2011 09:41 PM, Don wrote:
 On 31.08.2011 14:35, Timon Gehr wrote:
 On 08/31/2011 04:41 AM, Jonathan M Davis wrote:
 Objects would have writeTo and toString would presumably be
 deprecated.

I have never understood the rationale behind deprecating toString once we have writeTo. Why should it be deprecated?

Code bloat. Every struct contains string toString(). Quite unnecessarily, since it can always be synthesized from the more complete version.

I was just suggesting to keep the existing support for toString() inside to, format etc. Of course, all the structs in Phobos should probably completely migrate to writeTo.
 toString is great in case
 you just want to quickly and easily convert something to a string,
 and
 later, if formatting or more efficient output etc. is needed, the
 method
 can transparently be replaced by writeTo.

BTW, you do realize that code using writeTo is shorter in most cases? The reason is, that it can omit all the calls to format(). Pretty much the only time when toString is simpler, is when it is a single call to format(). It's only really the signature which is more complicated.

I am not convinced:

struct S{
    int x,y,z;
    void writeTo(void delegate(const(char)[]) sink, string format = null){
        sink("(");
        .writeTo(x,sink,"d"); // still no UFCS
        sink(", ");
        .writeTo(y,sink,"d");
        sink(", ");
        .writeTo(z,sink,"d");
        sink(")");
    }

    string toString(){return "("~join(map!(to!string)([x,y,z]),", ")~")";}
}

to!string of array can support multiple arguments: string toString() { return to!string([x, y, z], "(", ", ", ")"); }

This runs in ~66% of the time of Steve's formattedWrite solution. (if the delegate just appends to some string variable)
 and I believe writeTo could be made to accept extra arguments too:

 struct S {
 void writeTo(SomeType sink, const char[] format = null) {
 [x, y, z].writeTo(sink, format, "(", ", ", ")");
 }
 ...
 }

 void writeTo(T)(T[] arr, SomeType sink, const char[] format = null,
 const
 char[] open = "[", etc) {
 ...
 }

ok, getting better. But still, I think to!string should remain to be able to use toString if available. (in this case, a 33% speed advantage!)

If you're concerned about speed, the writeTo method is much quicker, since it doesn't require any heap activity at all.

allocating a new string on the heap always requires heap activity. I was benchmarking to!string with toString and with what would probably be the solution for writeTo, and toString was quicker. writeTo can support a variety of other use cases where it is much quicker of course, and I consider it a very worthy addition. I just think that the support for data types with a toString method should not just disappear.
Sep 02 2011
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 09/02/2011 03:59 PM, Steven Schveighoffer wrote:
 On Fri, 02 Sep 2011 06:17:53 -0400, Timon Gehr <timon.gehr gmx.ch> wrote:

 On 09/02/2011 03:29 AM, Don wrote:

 If you're concerned about speed, the writeTo method is much quicker,
 since it doesn't require any heap activity at all.

allocating a new string on the heap always requires heap activity. I was benchmarking to!string with toString and with what would probably be the solution for writeTo, and toString was quicker.

Simple appending is slow. There are better ways to do it. For example, use Appender.

Appender does not help in this case, I have tested that.
 writeTo should be faster than toString for most cases, on the principle that
 it can generally avoid *any* heap allocations. Ideally, to!string should
 use a stack-allocated buffer and idup it to get the final string.

 Of course, toString for simple cases, like printing one integer, can be
 optimized in a toString method better than writeTo.

 writeTo can support a variety of other use cases where it is much
 quicker of course, and I consider it a very worthy addition. I just
 think that the support for data types with a toString method should
 not just disappear.

There are very few (if any) use cases for toString that don't involve showing the result, for which allocating a string is typically wasted cycles/space. toString is not a serializable form of an object, I don't see why it should be encouraged.

Point taken. Is there anything else that stops writeTo from being implemented? The DIP has been around for quite some time now. (and having only toString is certainly not optimal.)
Sep 02 2011
prev sibling next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2011-09-01 22:26, Timon Gehr wrote:
 I am not convinced:

 struct S{
 int x,y,z;
 void writeTo(void delegate(const(char)[]) sink, string format = null){
 sink("(");
 .writeTo(x,sink,"d"); // still no UFCS
 sink(", ");
 .writeTo(y,sink,"d");
 sink(", ");
 .writeTo(z,sink,"d");
 sink(")");
 }

 string toString(){return "("~join(map!(to!string)([x,y,z]),", ")~")";}
 }

Note that to!string and/or write(f)(ln) could be implemented to inspect the fields and just print them in some standard format. This would allow you to skip implementing toString/writeTo in simple cases like the above. -- /Jacob Carlborg
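
The field-inspection idea can be sketched with .tupleof. The helper name fieldsToString is hypothetical and not a Phobos function; it just shows that the fields of a plain struct are reachable without a hand-written toString or writeTo.

```d
import std.conv : to;

// Hypothetical helper, not part of Phobos: prints the fields of any struct.
string fieldsToString(T)(T value)
if (is(T == struct))
{
    string result = "(";
    // Iterating over .tupleof is unrolled at compile time, one step per field.
    foreach (i, field; value.tupleof)
    {
        static if (i > 0)
            result ~= ", ";
        result ~= to!string(field);
    }
    return result ~ ")";
}

struct S { int x, y, z; }

// Build with: dmd -unittest -main -run file.d
unittest
{
    assert(fieldsToString(S(1, 2, 3)) == "(1, 2, 3)");
}
```

Current Phobos already applies a default of this kind: a struct with no toString is formatted as its type name followed by its fields, for example S(1, 2, 3).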
Sep 01 2011
parent Jacob Carlborg <doob me.com> writes:
On 2011-09-02 19:16, Andrej Mitrovic wrote:
 On 9/2/11, Jacob Carlborg<doob me.com>  wrote:
 Note that to!string and/or write(f)(ln) could be implemented to inspect
 the fields and just print them in some standard format. This would allow
 you to skip implementing toString/writeTo in simple cases like the above.

http://codepad.org/1PZY7YTX But I'm pretty sure this suffers from template instantiation bloat. I had a similar template like this (a bit more complex though) and the compilation speed slowed down considerably on every instantiation.

If this is implemented in std.conv.to, how would that add any more template bloat? -- /Jacob Carlborg
Sep 02 2011
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 02 Sep 2011 06:17:53 -0400, Timon Gehr <timon.gehr gmx.ch> wrote:

 On 09/02/2011 03:29 AM, Don wrote:

 If you're concerned about speed, the writeTo method is much quicker,
 since it doesn't require any heap activity at all.

allocating a new string on the heap always requires heap activity. I was benchmarking to!string with toString and with what would probably be the solution for writeTo, and toString was quicker.

Simple appending is slow. There are better ways to do it. For example, use Appender.

writeTo should be faster than toString for most cases, on the principle that it can generally avoid *any* heap allocations. Ideally, to!string should use a stack-allocated buffer and idup it to get the final string.

Of course, toString for simple cases, like printing one integer, can be optimized in a toString method better than writeTo.
 writeTo can support a variety of other use cases where it is much  
 quicker of course, and I consider it a very worthy addition. I just  
 think that the support for data types with a toString method should not  
 just disappear.

There are very few (if any) use cases for toString that don't involve showing the result, for which allocating a string is typically wasted cycles/space. toString is not a serializable form of an object, I don't see why it should be encouraged. -Steve
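
A sketch of the Appender suggestion, combined with the point made earlier in the thread that a string-returning toString can always be synthesized from the sink version. The Point type is hypothetical; the sink-based method is the primary one and the parameterless overload is a thin wrapper.

```d
import std.array : appender;
import std.format : formattedWrite;

struct Point
{
    int x, y;

    // Primary, sink-based conversion.
    void toString(scope void delegate(const(char)[]) sink) const
    {
        formattedWrite(sink, "(%d, %d)", x, y);
    }

    // Convenience overload built on top of the sink version.
    string toString() const
    {
        auto buffer = appender!string();
        toString((const(char)[] piece) { buffer.put(piece); });
        return buffer.data;
    }
}

// Build with: dmd -unittest -main -run file.d
unittest
{
    assert(Point(3, 4).toString() == "(3, 4)");
}
```

Only the wrapper allocates, and it does so once through the Appender rather than once per concatenation.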
Sep 02 2011
prev sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 02 Sep 2011 11:46:19 -0400, Timon Gehr <timon.gehr gmx.ch> wrote:

 On 09/02/2011 03:59 PM, Steven Schveighoffer wrote:
 On Fri, 02 Sep 2011 06:17:53 -0400, Timon Gehr <timon.gehr gmx.ch>  
 wrote:

 On 09/02/2011 03:29 AM, Don wrote:

 If you're concerned about speed, the writeTo method is much quicker,
 since it doesn't require any heap activity at all.

allocating a new string on the heap always requires heap activity. I was benchmarking to!string with toString and with what would probably be the solution for writeTo, and toString was quicker.

Simple appending is slow. There are better ways to do it. For example, use Appender.

Appender does not help in this case, I have tested that.

Two things:

- Make sure you give appender a stack-allocated buffer, otherwise you incur a penalty of allocating the appending buffer on the heap.
- Appender currently allocates its implementation on the heap. We need a stack-based version to make this work the best. I think there is a bug report somewhere, where someone created a patch (not github'd) that provides a start.
 writeTo should be faster than toString for most cases, on the principle that
 it can generally avoid *any* heap allocations. Ideally, to!string should
 use a stack-allocated buffer and idup it to get the final string.

 Of course, toString for simple cases, like printing one integer, can be
 optimized in a toString method better than writeTo.

 writeTo can support a variety of other use cases where it is much
 quicker of course, and I consider it a very worthy addition. I just
 think that the support for data types with a toString method should
 not just disappear.

There are very few (if any) use cases for toString that don't involve showing the result, for which allocating a string is typically wasted cycles/space. toString is not a serializable form of an object, I don't see why it should be encouraged.

Point taken. Is there anything else that stops writeTo from being implemented? The DIP has been around for quite some time now. (and having only toString is certainly not optimal.)

No, someone just has to do it. I think the DIP is fairly complete. I might try my hand at it in the coming months if someone else doesn't. I don't have the knowledge to do the compiler pieces yet, but I should be able to get pretty far without that. -Steve
Sep 02 2011
prev sibling next sibling parent bearophile <bearophileHUGS lycos.com> writes:
Paul D. Anderson:

 Can someone clarify for me the status and/or direction of string formatting in
D?

From a practical point of view, a good starting point is to review (and eventually fix) and put this into DMD 2.055: https://github.com/D-Programming-Language/phobos/pull/126 (It doesn't solve the problems with BigInt, and I don't know if it solves bug 6529. But it solves most problems I see with D textual output). Bye, bearophile
Aug 30 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday, August 31, 2011 08:53:29 Don wrote:
 On 31.08.2011 04:41, Jonathan M Davis wrote:
 On Tuesday, August 30, 2011 20:59:06 Paul D. Anderson wrote:
 Can someone clarify for me the status and/or direction of string
 formatting in D?
 
 We've got:
 
 1. toString, the object method with no parameters.
 2. toString(sink, format)
 3. to!String()
 4. format
 5. writef/writefln
 6. write/writeln
 
 I realize these exist for various reasons, some (1,3) are simple
 (unformatted) conversions, others (2,4-6) are designed to provide
 configurable formatting. The problem is that they are inconsistent
 with
 each other.
 
 Using std.bigint as an example: 1, 3, 4 and 6 don't work, or don't
 work as expected (to me at least). 1. prints 'BigInt', 3 and 4 are
 compile errors.
 
 I know bigint is a controversial example because Don has strong
 feelings
 against 1 and favors 2. (See bug #5231). I don't really have an
 opinion one way or the other but I need to know what to implement in
 my
 arbitrary-precision floating point module. This obviously relies
 heavily on bigint.
 
 So, is there a transition underway in the language (or just Phobos)
 from
 toString, writeln and format, to toString(sink,format) and writefln?
 
 Or is this just a divergence of views, both of which are acceptable
 and
 we'll have to get used to choosing one or the other?
 
 Or am I just mistaken in believing there is any significant conflict?
 
 I apologize if this has already been hashed out in the past and, if
 so, I would appreciate someone pointing me to that discussion. (Or
 just the results of the discussion.)

At this point, it's toString with no parameters. Don's completely out in left field with regards to how things currently work. I believe that BigInt is the _only_ example of toString(sink, format).

Try to write BigFloat such that: BigFloat f = 2.3e69; writefln("%f %g", f, f); will work. It's just not possible. toString with no parameters does not work, and CANNOT work. It just can't. I implemented something quickly which actually works. But, it's just a stop-gap measure until this black hole in the language gets fixed. We really need the format string to be exposed in a digested manner.

The thing is that all most people care about is converting the object to a string. They don't care about %d vs %x or %f vs %g. They just want it to be converted to a string. So, the lack of a toString is a major impediment. Now, I can definitely see why you would want to have toString/writeTo work with a format string, and I think that ultimately writeTo is probably a good solution (though it does seem to be a bit overkill for the average situation), but the truth of the matter is that while your concerns are perfectly valid, most people wouldn't even think of the issues that you're seeing with BigInt or BigFloat. They just want to convert them to a string as decimal values. - Jonathan M Davis
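
The requirement that the format specifier reach the type can be sketched with the FormatSpec overload that std.format recognizes in current Phobos. Celsius is a hypothetical stand-in for a numeric type such as BigFloat; it forwards the caller's specifier to its payload, which a parameterless toString has no way to do.

```d
import std.format : FormatSpec, format, formatValue;

// Hypothetical stand-in for a numeric type such as BigFloat.
struct Celsius
{
    double degrees;

    void toString(scope void delegate(const(char)[]) sink,
                  FormatSpec!char spec) const
    {
        // Reuse the caller's specifier (%f, %e, %g, width, precision)
        // for the underlying value, then append the unit.
        formatValue(sink, degrees, spec);
        sink(" C");
    }
}

// Build with: dmd -unittest -main -run file.d
unittest
{
    auto t = Celsius(21.5);
    assert(format("%.1f", t) == "21.5 C");
    assert(format("%.2e", t) == "2.15e+01 C");
}
```

Callers that only ever write %s are unaffected, since the same method handles that specifier too.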
Aug 31 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday, August 31, 2011 10:51:14 Don wrote:
 On 31.08.2011 09:03, Jonathan M Davis wrote:
 On Wednesday, August 31, 2011 08:53:29 Don wrote:
 On 31.08.2011 04:41, Jonathan M Davis wrote:
 On Tuesday, August 30, 2011 20:59:06 Paul D. Anderson wrote:
 Can someone clarify for me the status and/or direction of string
 formatting in D?
 
 We've got:
 
 1. toString, the object method with no parameters.
 2. toString(sink, format)
 3. to!String()
 4. format
 5. writef/writefln
 6. write/writeln
 
 I realize these exist for various reasons, some (1,3) are simple
 (unformatted) conversions, others (2,4-6) are designed to provide
 configurable formatting. The problem is that they are inconsistent
 with
 each other.
 
 Using std.bigint as an example: 1, 3, 4 and 6 don't work, or don't
 work as expected (to me at least). 1. prints 'BigInt', 3 and 4 are
 compile errors.
 
 I know bigint is a controversial example because Don has strong
 feelings
 against 1 and favors 2. (See bug #5231). I don't really have an
 opinion one way or the other but I need to know what to implement
 in
 my
 arbitrary-precision floating point module. This obviously relies
 heavily on bigint.
 
 So, is there a transition underway in the language (or just
 Phobos)
 from
 toString, writeln and format, to toString(sink,format) and
 writefln?
 
 Or is this just a divergence of views, both of which are
 acceptable
 and
 we'll have to get used to choosing one or the other?
 
 Or am I just mistaken in believing there is any significant
 conflict?
 
 I apologize if this has already been hashed out in the past and,
 if
 so, I would appreciate someone pointing me to that discussion. (Or
 just the results of the discussion.)

At this point, it's toString with no parameters. Don's completely out in left field with regards to how things currently work. I believe that BigInt is the _only_ example of toString(sink, format).

Try to write BigFloat such that: BigFloat f = 2.3e69; writefln("%f %g", f, f); will work. It's just not possible. toString with no parameters does not work, and CANNOT work. It just can't. I implemented something quickly which actually works. But, it's just a stop-gap measure until this black hole in the language gets fixed. We really need the format string to be exposed in a digested manner.

The thing is that all most people care about is converting the object to a string. They don't care about %d vs %x or %f vs %g. They just want it to be converted to a string. So, the lack of a toString is a major impediment.

almost ALWAYS use a format string.

Actually, I pretty much never do. I can understand wanting to, and I agree that it would be useful to do so with BigFloat, but I disagree that that's what you _always_ want. If anything, I typically avoid using anything other than %s in format strings. Sometimes, you need to be more specific, but for what I do, at least, it's rarely useful.
 Now, I can definitely see why you would want to have toString/writeTo
 work with a format string, and I think that ultimately writeTo is
 probably a good solution (though it does seem to be a bit overkill for
 the average situation), but the truth of the matter is that while your
 concerns are perfectly valid, most people wouldn't even think of the
 issues that you're seeing with BigInt or BigFloat. They just want to
 convert them to a string as decimal values.

(1) the formatting to be used is specified; (2) components are output piece-by-piece. The inability to do either of these things is not a minor limitation of string toString(). They are absolutely fundamental. Now, once you have the full functionality, you can think about how to simplify the common, trivial cases. But you cannot argue the other way.

Java has a toString which doesn't take any arguments, and it works fine. I've never had a problem with it, and the only real issue that I've had with toString in D is the fact that you can't use any attributes such as const or pure on a struct's toString thanks to bug #3659. In fact, before you brought up this issue, it never even occurred to me that there was one. So, while I agree with you that we should find a good solution for this issue (and writeTo may very well be it), I disagree that this is generally a big deal. It really does feel to me like you're blowing this issue out of proportion. Maybe you just deal with numeric stuff way more than I do, but I've _never_ felt the lack of a format string with toString. Regardless, while I don't think that this is a big issue, I have no problem with us reworking toString/writeTo so that the issue is fixed. - Jonathan M Davis
Aug 31 2011
prev sibling next sibling parent "Marco Leise" <Marco.Leise gmx.de> writes:
Am 31.08.2011, 11:13 Uhr, schrieb Jonathan M Davis <jmdavisProg gmx.com>:

 Java has a toString which doesn't take any arguments, and it works fine.

Let's look at Java again. They have BigDecimal, so how does that work? First of all they added convenience functions akin to "toString()" to BigDecimal: "toEngineeringString()" and "toPlainString()". Their number formatting class has a method with a long list of cases that distinguishes between the different Number classes including wrapper objects around primitive types and BigDecimal. So they practically have a sealed list of numerical types that are specially formatted, while for the rest "doubleValue()" is called.
Aug 31 2011
prev sibling next sibling parent reply Paul D. Anderson <paul.d.removethis.anderson comcast.andthis.net> writes:
Paul D. Anderson Wrote:

 Can someone clarify for me the status and/or direction of string formatting in
D? 
 
 We've got:
 
 1. toString, the object method with no parameters.
 2. toString(sink, format)
 3. to!String()
 4. format
 5. writef/writefln
 6. write/writeln
 
 I realize these exist for various reasons, some (1,3) are simple (unformatted)
conversions, others (2,4-6) are designed to provide configurable formatting.
The problem is that they are inconsistent with each other.
 
 Using std.bigint as an example: 1, 3, 4 and 6 don't work, or don't work as
expected (to me at least). 1. prints 'BigInt', 3 and 4 are compile errors.
 
 I know bigint is a controversial example because Don has strong feelings
against 1 and favors 2. (See bug #5231). I don't really have an opinion one way
or the other but I need to know what to implement in my arbitrary-precision
floating point module. This obviously relies heavily on bigint.
 
 So, is there a transition underway in the language (or just Phobos) from
toString, writeln and format, to toString(sink,format) and writefln?
 
 Or is this just a divergence of views, both of which are acceptable and we'll
have to get used to choosing one or the other?
 
 Or am I just mistaken in believing there is any significant conflict?
 
 I apologize if this has already been hashed out in the past and, if so, I
would appreciate someone pointing me to that discussion. (Or just the results
of the discussion.)
 
 Paul

So, IIUC, toString has its faults but it has deep-rooted user expectations, while toString(sink, format) [or writeTo(sink, format)] is a better implementation, but the current state of development doesn't have a lot of support for it.

With respect to the Java implementation: they provide the two number-to-string functions called out in the specification, i.e., toScientificString and toEngineeringString and use the toScientificString method as the default toString. One of the big advantages of doing this is that the read and write routines are complementary -- writing out a number and reading it back in results in not just the same value, but the same internal representation. (This is one of the goals of the specification.)

Based on this, my proposal for the BigDecimal type is to provide similar functionality -- the two functions listed above, with the first being called by the toString function. In addition the toString(sink, format) function will be provided, and/or whatever it takes to work with format and writef.

I poked around a little in the std.stdio and std.format source code and I see that stdio.writef calls std.format.formattedWrite, so anything that works for the one should work for the other. I don't know what is required to make to!string work but that is a discussion for another day.

Please advise if I've misunderstood.

Thanks,

Paul
Aug 31 2011
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 08/31/2011 11:00 PM, Paul D. Anderson wrote:
 Paul D. Anderson Wrote:

 Can someone clarify for me the status and/or direction of string formatting in
D?

 We've got:

 1. toString, the object method with no parameters.
 2. toString(sink, format)
 3. to!String()
 4. format
 5. writef/writefln
 6. write/writeln

 I realize these exist for various reasons, some (1,3) are simple (unformatted)
conversions, others (2,4-6) are designed to provide configurable formatting.
The problem is that they are inconsistent with each other.

 Using std.bigint as an example: 1, 3, 4 and 6 don't work, or don't work as
expected (to me at least). 1. prints 'BigInt', 3 and 4 are compile errors.

 I know bigint is a controversial example because Don has strong feelings
against 1 and favors 2. (See bug #5231). I don't really have an opinion one way
or the other but I need to know what to implement in my arbitrary-precision
floating point module. This obviously relies heavily on bigint.

 So, is there a transition underway in the language (or just Phobos) from
toString, writeln and format, to toString(sink,format) and writefln?

 Or is this just a divergence of views, both of which are acceptable and we'll
have to get used to choosing one or the other?

 Or am I just mistaken in believing there is any significant conflict?

 I apologize if this has already been hashed out in the past and, if so, I
would appreciate someone pointing me to that discussion. (Or just the results
of the discussion.)

 Paul

So, IIUC, toString has its faults but it has deep-rooted user expectations, while toString(sink, format) [or writeTo(sink, format)] is a better implementation, but the current state of development doesn't have a lot of support for it. With respect to the Java implementation: they provide the two number-to-string functions called out in the specification, i.e., toScientificString and toEngineeringString and use the toScientificString method as the default toString. One of the big advantages of doing this is that the read and write routines are complementary -- writing out a number and reading it back in results in not just the same value, but the same internal representation. (This is one of the goals of the specification.) Based on this, my proposal for the BigDecimal type is to provide similar functionality -- the two functions listed above, with the first being called by the toString function. In addition the toString(sink, format) function will be provided, and/or whatever it takes to work with format and writef. I poked around a little in the std.stdio and std.format source code and I see that stdio.writef calls std.format.formattedWrite, so anything that works for the one should work for the other. I don't know what is required to make to!string work but that is a discussion for another day. Please advise if I've misunderstood.

I think your approach is what std.bigint should do too. Great! to!string will already work, because you provide the toString() member function.
Aug 31 2011
prev sibling next sibling parent "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
On Tue, 30 Aug 2011 19:41:37 -0700, Jonathan M Davis wrote:

 On Tuesday, August 30, 2011 20:59:06 Paul D. Anderson wrote:
 Can someone clarify for me the status and/or direction of string
 formatting in D?
 
 We've got:
 
 1. toString, the object method with no parameters. 2. toString(sink,
 format)
 3. to!String()
 4. format
 5. writef/writefln
 6. write/writeln
 
 I realize these exist for various reasons, some (1,3) are simple
 (unformatted) conversions, others (2,4-6) are designed to provide
 configurable formatting. The problem is that they are inconsistent with
 each other.
 
 Using std.bigint as an example: 1, 3, 4 and 6 don't work, or don't work
 as expected (to me at least). 1. prints 'BigInt', 3 and 4 are compile
 errors.
 
 I know bigint is a controversial example because Don has strong
 feelings against 1 and favors 2. (See bug #5231). I don't really have
 an opinion one way or the other but I need to know what to implement in
 my arbitrary-precision floating point module. This obviously relies
 heavily on bigint.
 
 So, is there a transition underway in the language (or just Phobos)
 from toString, writeln and format, to toString(sink,format) and
 writefln?
 
 Or is this just a divergence of views, both of which are acceptable and
 we'll have to get used to choosing one or the other?
 
 Or am I just mistaken in believing there is any significant conflict?
 
 I apologize if this has already been hashed out in the past and, if so,
 I would appreciate someone pointing me to that discussion. (Or just the
 results of the discussion.)

At this point, it's toString with no parameters. Don's completely out in left field with regards to how things currently work. I believe that BigInt is the _only_ example of toString(sink, format).

Actually, std.complex.Complex also has toString(sink, format), and Don even fixed the write* functions so that they work with both Complex and BigInt. The following works as you'd expect:

    BigInt i = "1234567890";
    writeln(i);

    auto z = complex(123.4, 5678.9);
    writefln("%.10e", z);

There is only one important missing piece here: std.conv.to!string should be implemented to call toString with an appropriate sink and "%s" as the format string.

-Lars
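The sink-plus-format style described above can be sketched on a small user type. This is only an illustration: the Celsius struct is hypothetical, and the code assumes a recent Phobos in which std.format recognizes a toString taking a sink delegate and a FormatSpec, and in which to!string goes through the same path.

```d
import std.conv : to;
import std.format : format, formatValue, FormatSpec;

// Hypothetical example type, not part of Phobos.
struct Celsius
{
    double degrees;

    // Sink-based toString: output is handed to `sink` piece by piece,
    // and the caller's format specifier is forwarded to the number.
    void toString(scope void delegate(const(char)[]) sink,
                  scope const ref FormatSpec!char fmt) const
    {
        formatValue(sink, degrees, fmt);
        sink(" C");
    }
}

void main()
{
    import std.stdio : writefln, writeln;

    auto t = Celsius(21.5);
    writeln(t);           // formatted with the default "%s"
    writefln("%.2f", t);  // the precision reaches the double inside

    assert(format("%.2f", t) == "21.50 C");
    assert(to!string(t) == "21.5 C");
}
```

No intermediate string is built for the number or the suffix; each piece goes straight to whatever the sink writes to.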
Sep 01 2011
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 01 Sep 2011 16:26:55 -0400, Timon Gehr <timon.gehr gmx.ch> wrote:

 On 09/01/2011 09:41 PM, Don wrote:
 On 31.08.2011 14:35, Timon Gehr wrote:
 On 08/31/2011 04:41 AM, Jonathan M Davis wrote:
 Objects would have writeTo and toString would presumably be
 deprecated.

I have never understood the rationale behind deprecating toString once we have writeTo. Why should it be deprecated?

Code bloat. Every struct contains string toString(). Quite unnecessarily, since it can always be synthesized from the more complete version.

I was just suggesting to keep the existing support for toString() inside to, format etc. Of course, all the structs in Phobos should probably completely migrate to writeTo.

Why? It's trivial to write writeTo if you have already written toString. There is a clear deprecation path in the DIP.
 toString is great in case
 you just want to quickly and easily convert something to a string, and
 later, if formatting or more efficient output etc. is needed, the  
 method
 can transparently be replaced by writeTo.

BTW, you do realize that code using writeTo is shorter in most cases? The reason is, that it can omit all the calls to format(). Pretty much the only time when toString is simpler, is when it is a single call to format(). It's only really the signature which is more complicated.

I am not convinced:

    struct S{
        int x,y,z;
        void writeTo(void delegate(const(char)[]) sink, string format = null){
            sink("(");
            .writeTo(x,sink,"d"); // still no UFCS
            sink(", ");
            .writeTo(y,sink,"d");
            sink(", ");
            .writeTo(z,sink,"d");
            sink(")");
        }
        string toString(){return "("~join(map!(to!string)([x,y,z]),", ")~")";}
    }

Um...

    formattedWrite(sink, "(%d, %d, %d)", x, y, z);

-Steve
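For comparison, here is a minimal sketch of the one-line version as a complete program. The method is named toString rather than writeTo on the assumption that the sink overload of toString is the hook a current std.format looks for; the struct itself is the S from the discussion.

```d
import std.format : format, formattedWrite;

struct S
{
    int x, y, z;

    // A single formattedWrite call replaces the piecewise sink calls.
    void toString(scope void delegate(const(char)[]) sink) const
    {
        formattedWrite(sink, "(%d, %d, %d)", x, y, z);
    }
}

void main()
{
    assert(format("%s", S(1, 2, 3)) == "(1, 2, 3)");
}
```

The body is no longer than the join/map version of toString(); only the signature is longer.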
Sep 01 2011
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 01 Sep 2011 17:09:30 -0400, Timon Gehr <timon.gehr gmx.ch> wrote:

 On 09/01/2011 10:57 PM, Steven Schveighoffer wrote:
 On Thu, 01 Sep 2011 16:26:55 -0400, Timon Gehr <timon.gehr gmx.ch>  
 wrote:

 On 09/01/2011 09:41 PM, Don wrote:
 On 31.08.2011 14:35, Timon Gehr wrote:
 On 08/31/2011 04:41 AM, Jonathan M Davis wrote:
 Objects would have writeTo and toString would presumably be
 deprecated.

I have never understood the rationale behind deprecating toString once we have writeTo. Why should it be deprecated?

Code bloat. Every struct contains string toString(). Quite unnecessarily, since it can always be synthesized from the more complete version.

I was just suggesting to keep the existing support for toString() inside to, format etc. Of course, all the structs in Phobos should probably completely migrate to writeTo.

Why? It's trivial to write writeTo if you have already written toString. There is a clear deprecation path in the DIP.

Exactly that is my point, it is trivial and tedious.

Here, use this:

    const writeToImpl = "void writeTo(void delegate(const(char)[]) sink, string format = null) { sink(this.toString()); }";

    // put this line in all your classes/structs where you don't feel like
    // writing a proper writeTo.
    mixin(writeToImpl);
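The same forwarding trick can also be sketched with a mixin template instead of a string mixin. The names WriteToViaToString and Name are hypothetical, chosen only for this illustration.

```d
// Hypothetical mixin: gives any type with toString() a forwarding writeTo.
mixin template WriteToViaToString()
{
    void writeTo(scope void delegate(const(char)[]) sink, string format = null)
    {
        // No streaming benefit here: the whole string is built first.
        sink(this.toString());
    }
}

// Hypothetical example type.
struct Name
{
    string first, last;

    string toString() { return first ~ " " ~ last; }

    mixin WriteToViaToString;
}

void main()
{
    string got;
    Name("Ada", "Lovelace").writeTo((const(char)[] piece) { got ~= piece; });
    assert(got == "Ada Lovelace");
}
```

This keeps the quick toString() for simple types while still letting callers treat everything uniformly through writeTo.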
 toString is great in case
 you just want to quickly and easily convert something to a string,  
 and
 later, if formatting or more efficient output etc. is needed, the
 method
 can transparently be replaced by writeTo.

BTW, you do realize that code using writeTo is shorter in most cases? The reason is, that it can omit all the calls to format(). Pretty much the only time when toString is simpler, is when it is a single call to format(). It's only really the signature which is more complicated.

I am not convinced:

    struct S{
        int x,y,z;
        void writeTo(void delegate(const(char)[]) sink, string format = null){
            sink("(");
            .writeTo(x,sink,"d"); // still no UFCS
            sink(", ");
            .writeTo(y,sink,"d");
            sink(", ");
            .writeTo(z,sink,"d");
            sink(")");
        }
        string toString(){return "("~join(map!(to!string)([x,y,z]),", ")~")";}
    }

Um...

    formattedWrite(sink, "(%d, %d, %d)", x, y, z);

-Steve

I see. =). Still, the signature of writeTo is about as large as my entire toString function.

The sink type could be aliased. But this is really getting into minor issues :) The amount of power and performance you get by switching to writeTo is well worth the extra parameters. -Steve
Sep 02 2011
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 02 Sep 2011 12:04:08 -0400, Timon Gehr <timon.gehr gmx.ch> wrote:

 On 09/02/2011 03:47 PM, Steven Schveighoffer wrote:

 The sink type could be aliased. But this is really getting into minor
 issues :) The amount of power and performance you get by switching to
 writeTo is well worth the extra parameters.

I don't agree those are minor, because this is going into the standard library and should respect all use cases. Basically, what should be done is: 1. provide an alias void delegate(const(char)[]) Sink; This should be in std.conv or std.format, because nobody wants to add it to every single module, and if there is a standard way to handle it, no maintenance programmer will be confused by the alias.

it needs to go into object.di, because Object needs it.
 2. the format parameter should be completely optional in the signature.

This is probably impossible. Just for the object case alone, writeTo needs to be declared in Object, which means you'd have to override it with the same parameters. It's one of the reasons the sink has to stick with one char width.
 Because then, writeTo wins not only at the efficiency and flexibility  
 part, but also on the 'pleasant to write' part.

 void writeTo(Sink s){ ... }
 string toString(){ return ... }

I think this works if you want to ignore the format string: void writeTo(Sink s, string) {...} Probably the best we can get. -Steve
Sep 02 2011
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 9/2/11, Jacob Carlborg <doob me.com> wrote:
 Note that to!string and/or write(f)(ln) could be implemented to inspect
 the fields and just print them in some standard format. This would allow
 you to skip implementing toString/writeTo in simple cases like the above.

http://codepad.org/1PZY7YTX But I'm pretty sure this suffers from template instantiation bloat. I had a similar template like this (a bit more complex though) and the compilation speed slowed down considerably on every instantiation.
Sep 02 2011
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 02 Sep 2011 13:17:02 -0400, Timon Gehr <timon.gehr gmx.ch> wrote:

 On 09/02/2011 06:15 PM, Steven Schveighoffer wrote:
 On Fri, 02 Sep 2011 12:04:08 -0400, Timon Gehr <timon.gehr gmx.ch>  
 wrote:
 2. the format parameter should be completely optional in the signature.

This is probably impossible. Just for the object case alone, writeTo needs to be declared in Object, which means you'd have to override it with the same parameters.

Oh, yes, for classes it cannot work. But structs are more flexible.

Yes and no. There is a kludgy "interface" that all structs provide. Its value is somewhat suspect, but it allows some RTTI for structs. For example the xtoString member of the TypeInfo_Struct. It's arguable that the value of this interface is very low -- currently it enables things like the builtin sort property on arrays (which I think should be abolished ASAP), and allows AA's current implementation (which does not use templates).
 It's one of the reasons the sink has to stick with one char width.

Probably the library code should still make use of structs or classes that provide the appropriate overloads. If somebody is in desperate need of having, say, a dchar sink for their classes, they could then define an own root class.

It's actually probably a benefit to stick with char:

1. That's the default output width for streams
2. It's the default width for what most people consider strings (in fact, the string type).
3. It's pretty simple to convert char[] to wchar[] or dchar[], without incurring much penalty.

I think the library might be able to, in the future, deal with templated writeTo, but there are many things that would need changing.
 Because then, writeTo wins not only at the efficiency and flexibility
 part, but also on the 'pleasant to write' part.

 void writeTo(Sink s){ ... }
 string toString(){ return ... }

I think this works if you want to ignore the format string: void writeTo(Sink s, string) {...} Probably the best we can get.

For classes the best we can get is override void writeTo(Sink s, string) {...} Because override adds quite some bloat anyways, the additional ignored string argument is not a big issue. But structs are more flexible than that.

Yes, I wouldn't be sorry to see the special treatment of certain struct functions go away (i.e. the kludgy "interface" mentioned above). In that case, making the format part optional is fine for structs. -Steve
Sep 02 2011
prev sibling next sibling parent reply kenji hara <k.hara.pg gmail.com> writes:
2011/8/31 Jonathan M Davis <jmdavisProg gmx.com>:
 Unfortunately however, the proposal seems to have gone nowhere thus far. Until
 it does, pretty much every object is just going to use toString without
 parameters, and the problems with BigInt's toString remain. However, if the
 proposal actually gets implemented, then the issue should then be able to be
 sorted out. Objects would have writeTo and toString would presumably be
 deprecated.

I have posted pull request to fix BigInt's formatting with writef(ln) <- formattedWrite(). https://github.com/D-Programming-Language/phobos/pull/230 Kenji Hara
Sep 02 2011
next sibling parent bearophile <bearophileHUGS lycos.com> writes:
Kenji Hara:

 I have posted pull request to fix BigInt's formatting with writef(ln)
 <- formattedWrite().
 https://github.com/D-Programming-Language/phobos/pull/230

You are doing good work! I hope to see your patches in the final release of DMD 2.055! Bye, bearophile
Sep 02 2011
prev sibling next sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
On 09/02/2011 11:15 PM, kenji hara wrote:
 2011/8/31 Jonathan M Davis<jmdavisProg gmx.com>:
 Unfortunately however, the proposal seems to have gone nowhere thus far. Until
 it does, pretty much every object is just going to use toString without
 parameters, and the problems with BigInt's toString remain. However, if the
 proposal actually gets implemented, then the issue should then be able to be
 sorted out. Objects would have writeTo and toString would presumably be
 deprecated.

I have posted pull request to fix BigInt's formatting with writef(ln) <- formattedWrite(). https://github.com/D-Programming-Language/phobos/pull/230 Kenji Hara

Thank you very much! That is really useful.
Sep 02 2011
prev sibling parent Paul D. Anderson <paul.d.removethis.anderson comcast.andthis.net> writes:
kenji hara Wrote:

 2011/8/31 Jonathan M Davis <jmdavisProg gmx.com>:
 Unfortunately however, the proposal seems to have gone nowhere thus far. Until
 it does, pretty much every object is just going to use toString without
 parameters, and the problems with BigInt's toString remain. However, if the
 proposal actually gets implemented, then the issue should then be able to be
 sorted out. Objects would have writeTo and toString would presumably be
 deprecated.

I have posted pull request to fix BigInt's formatting with writef(ln) <- formattedWrite(). https://github.com/D-Programming-Language/phobos/pull/230 Kenji Hara

There are problems with opCmp as well. The "<" and ">" operators won't compile if either argument is a const BigInt, so a lot of otherwise unnecessary copying is required. You can see this in the following functions, which I've had to add to my BigDecimal package:

    private BigInt abs(const BigInt num) {
        BigInt big = copy(num);
        return big < BigInt(0) ? -big : big;
    }

    private BigInt copy(const BigInt num) {
        BigInt big = cast(BigInt)num;
        return big;
    }

    private int sgn(const BigInt num) {
        BigInt zero = BigInt(0);
        BigInt big = copy(num);
        if (big < zero) return -1;
        if (big > zero) return 1;
        return 0;
    }

(I'd be happy to learn there's a better way to implement these.)

Paul
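For readers on a newer toolchain, a sketch of sgn without the copying workaround follows. It assumes a Phobos release in which BigInt's comparison operators are const-qualified and accept built-in integers; that is an assumption about the library version in use, not something established in this thread.

```d
import std.bigint : BigInt;

// Assumes BigInt.opCmp is callable on a const BigInt (recent Phobos).
int sgn(const BigInt num)
{
    if (num < 0) return -1;
    if (num > 0) return 1;
    return 0;
}

void main()
{
    const big = BigInt("-123456789012345678901234567890");
    assert(sgn(big) == -1);
    assert(sgn(BigInt(0)) == 0);
    assert(sgn(BigInt(42)) == 1);
}
```

If the comparisons fail to compile on a const argument, the library in use predates that fix and the copy-based helpers above remain necessary.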
Sep 03 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Saturday, September 03, 2011 17:03:33 Paul D. Anderson wrote:
 kenji hara Wrote:
 2011/8/31 Jonathan M Davis <jmdavisProg gmx.com>:
 Unfortunately however, the proposal seems to have gone nowhere thus
 far. Until it does, pretty much every object is just going to use
 toString without parameters, and the problems with BigInt's
 toString remain. However, if the proposal actually gets
 implemented, then the issue should then be able to be sorted out.
 Objects would have writeTo and toString would presumably be
 deprecated.

I have posted pull request to fix BigInt's formatting with writef(ln) <- formattedWrite(). https://github.com/D-Programming-Language/phobos/pull/230 Kenji Hara

There are problems with opCmp as well. The "<" and ">" operators won't compile if either argument is a const BigInt, so a lot of otherwise unnecessary copying is required. You can see this in the following functions, which I've had to add to my BigDecimal package:

    private BigInt abs(const BigInt num) {
        BigInt big = copy(num);
        return big < BigInt(0) ? -big : big;
    }

    private BigInt copy(const BigInt num) {
        BigInt big = cast(BigInt)num;
        return big;
    }

    private int sgn(const BigInt num) {
        BigInt zero = BigInt(0);
        BigInt big = copy(num);
        if (big < zero) return -1;
        if (big > zero) return 1;
        return 0;
    }

(I'd be happy to learn there's a better way to implement these.)

It's all part of http://d.puremagic.com/issues/show_bug.cgi?id=3659 The compiler is too strict on the signature of various struct functions (e.g. opEquals, opCmp, and toString), and const and immutable aren't dealt with very well. - Jonathan M Davis
Sep 03 2011
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 02 Sep 2011 19:38:23 -0400, Timon Gehr <timon.gehr gmx.ch> wrote:

 On 09/02/2011 07:46 PM, Steven Schveighoffer wrote:
 It's arguable that the value of this interface is very low -- currently
 it enables things like the builtin sort property on arrays (which I
 think should be abolished ASAP), and allows AA's current implementation
 (which does not use templates).

I did know that there was some RTTI for the inefficient built-in sort, but I did not know that xtoString is in that interface. So basically, rethinking struct RTTI and changing the compiler to reflect that is the main thing that makes the DIP unpleasant to implement?

Actually, I think yes, that is the main unpleasantness. I wasn't about to suggest in the DIP that we should abolish even part of the RTTI interface for structs, it seems outside the scope. But now that I think about it, xtoString is probably not used anywhere anymore. I think it used to be used in write* functions when they were not templates. AFAIK, no TypeInfo functions use xtoString, you have to call it directly. The xopCmp, xopEquals, and xtoHash functions all are wrapped by TypeInfo virtual methods. I'll start a new thread to talk about this. This might make the DIP much easier to implement.
 It's actually probably a benefit to stick with char:

 1. That's the default output width for streams
 2. It's the default width for what most people consider strings (in
 fact, the string type).
 3. It's pretty simple to convert char[] to wchar[] or dchar[], without
 incurring much penalty.

 I think the library might be able to, in the future, deal with templated
 writeTo, but there are many things that would need changing.

I guess 'properly' supporting wchar and dchar is not a high priority anyways.

Well, is it more prudent for every printable type to provide a char[], wchar[], and dchar[] version of writeTo, or for the things that call writeTo to provide translations from char[] to wchar[] and dchar[]? In other words, should to!wstring(T) fail if T.writeTo(void delegate(const(wchar)[]) sink, wstring format) is not implemented? It might be that char[] is used, unless the wchar[] or dchar[] version exists, and then it's used. But I think setting the minimum to providing char[] makes type-implementor's job easier. -Steve
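The "implement char only, translate at the call site" option can be sketched as follows. Tag and toWide are hypothetical names for this illustration, and writeTo is the method proposed in the DIP rather than an existing Phobos hook.

```d
import std.array : appender;
import std.conv : to;

// Hypothetical type that only knows how to emit UTF-8 pieces.
struct Tag
{
    string name;

    void writeTo(scope void delegate(const(char)[]) sink, string format = null) const
    {
        sink("<");
        sink(name);
        sink(">");
    }
}

// Hypothetical caller-side helper: collect the char output, then
// transcode it to UTF-16 once at the end.
wstring toWide(T)(T value)
{
    auto buf = appender!string();
    value.writeTo((const(char)[] piece) { buf.put(piece); });
    return buf.data.to!wstring;
}

void main()
{
    assert(toWide(Tag("été")) == "<été>"w);
}
```

The type implementor writes one overload, and the cost of the wider encoding is paid only by callers that ask for it.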
Sep 03 2011
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 8/30/11 8:59 PM, Paul D. Anderson wrote:
 Can someone clarify for me the status and/or direction of string
 formatting in D?

I agree there are major inefficiencies and composability problems caused by a blind toString() that creates a whole new string without any assistance. So we need to fix that.

There are suggestions to add this method to Object:

    void writeTo(void delegate(const(char)[]) sink, string format = null);

Then, the suggestion goes, whether or not we deprecate toString, in the short term it should be implemented in terms of writeTo.

There are a few questions raised by this proposal:

1. Okay, this takes care of streaming text. How about streaming in binary format?

2. Since we have a relatively involved "output to text" routine, how about an "input from text" routine? If writeTo is there, where is readFrom?

Andrei
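A minimal sketch of "toString implemented in terms of writeTo", using a struct instead of Object so that it compiles on its own. Point is a hypothetical type, and writeTo is the proposed method, not something Phobos calls today.

```d
import std.array : appender;
import std.format : format, formattedWrite;

// Hypothetical example type.
struct Point
{
    int x, y;

    // The primary routine: streams pieces to the sink.
    void writeTo(scope void delegate(const(char)[]) sink, string format = null) const
    {
        formattedWrite(sink, "(%d, %d)", x, y);
    }

    // The convenience routine, synthesized from writeTo.
    string toString() const
    {
        auto buf = appender!string();
        writeTo((const(char)[] piece) { buf.put(piece); });
        return buf.data;
    }
}

void main()
{
    assert(Point(3, 4).toString() == "(3, 4)");
    assert(format("%s", Point(3, 4)) == "(3, 4)");
}
```

Since the toString body is identical for every type, it could live in one place (a base class or a mixin) rather than being repeated.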
Sep 03 2011
next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sun, 04 Sep 2011 00:06:47 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 8/30/11 8:59 PM, Paul D. Anderson wrote:
 Can someone clarify for me the status and/or direction of string
 formatting in D?

I agree there are major inefficiencies and composability problems caused by a blind toString() that creates a whole new string without any assistance. So we need to fix that. There are suggestions to add this method to Object: void writeTo(void delegate(const(char)[]) sink, string format = null); Then, the suggestion goes, whether or not we deprecate toString, in the short term it should be implemented in terms of writeTo. There are a few questions raised by this proposal: 1. Okay, this takes care of streaming text. How about streaming in binary format?

toString is a means of communicating the state of an object to a person reading a screen. That's it. It's not meant to be a serializer function. So I guess the answer is, because people cannot read binary (well, some can, but that's just showing off).
 2. Since we have a relatively involved "output to text" routine, how  
 about an "input from text" routine? If writeTo is there, where is  
 readFrom?

It's a good point. But there is no current means to do this. AFAIK, readf only works on primitives, right? Note that the DIP was born out of frustration with the inefficiency of toString. There was no frustration at the inefficiency of, um.. parseString? because it didn't exist.

I don't feel that lack of readFrom necessarily precludes writeTo, because printing objects for debugging is a well-used and frequently needed thing. However, parsing objects from text is not as frequently needed or common. But I also feel that a proposal for readFrom is not precluded by DIP9, and in fact, it's probably very logical to derive such a proposal from DIP9. I think they can be implemented separately.

If I could be so bold as to suggest tying in with my recently revealed stdio overhaul:

    size_t readFrom(const(char)[] data, size_t start); // same as readUntil delegate

readf calls (with a possible translation to char[] data):

    input.readUntil(&obj.readFrom);

-Steve
Sep 03 2011
next sibling parent Jacob Carlborg <doob me.com> writes:
On 2011-09-04 06:24, Steven Schveighoffer wrote:
 On Sun, 04 Sep 2011 00:06:47 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:

 On 8/30/11 8:59 PM, Paul D. Anderson wrote:
 Can someone clarify for me the status and/or direction of string
 formatting in D?

I agree there are major inefficiencies and composability problems caused by a blind toString() that creates a whole new string without any assistance. So we need to fix that. There are suggestions to add this method to Object: void writeTo(void delegate(const(char)[]) sink, string format = null); Then, the suggestion goes, whether or not we deprecate toString, in the short term it should be implemented in terms of writeTo. There are a few questions raised by this proposal: 1. Okay, this takes care of streaming text. How about streaming in binary format?

toString is a means of communicating the state of an object to a person reading a screen. That's it. It's not meant to be a serializer function. So I guess the answer is, because people cannot read binary (well, some can, but that's just showing off).
 2. Since we have a relatively involved "output to text" routine, how
 about an "input from text" routine? If writeTo is there, where is
 readFrom?

It's a good point. But there is no current means to do this. AFAIK, readf only works on primitives, right? Note that the DIP was born out of frustration with the inefficiency of toString. There was no frustration at the inefficiency of, um.. parseString? because it didn't exist. I don't feel that lack of readFrom necessarily precludes writeTo, because printing objects for debugging is a well-used and frequently needed thing. However, parsing objects from text is not as frequently needed or common. But I also feel that a proposal for readFrom is not precluded by DIP9, and in fact, it's probably very logical to derive such a proposal from DIP9. I think they can be implemented separately. If I could be so bold as to suggest tying in with my recently revealed stdio overhaul: size_t readFrom(const(char)[] data, size_t start); // same as readUntil delegate readf calls (with a possible translation to char[] data): input.readUntil(&obj.readFrom); -Steve

This sounds more like something for a serialization library. -- /Jacob Carlborg
Sep 04 2011
prev sibling parent reply travert phare.normalesup.org (Christophe) writes:
 size_t readFrom(const(char)[] data, size_t start); // same as 
 readUntil delegate

What happens if the buffer data gets exhausted? The function calling readFrom has no way to know how many characters to put into data to allow the read. What is the point of start?

We could use a delegate to return new characters:

void readFrom(const(char)[] delegate(size_t) stream, in char[] format = null);

-format is the usual format specifier.
-stream is a delegate that takes a size_t argument, discards that many characters from its internal buffer, and returns data to read from. The returned data may have any length, but must be empty only when the end of all the data to be read is reached. stream may overwrite previously returned data.

Examples of suitable delegates for stream:

| const(char)[] delegate(size_t) myStringStream(string str)
| {
|   return (size_t n) { str = str[n..$]; return str; };
| }

| const(char)[] delegate(size_t) myFileStream(File file, size_t size)
| {
|   char[] chunk = new char[size];
|   int i=0;
|   chunk = file.rawRead(chunk); // Bug?: file is read in binary mode...
|   return (size_t n)
|     {
|       i += n;
|       if (i>=chunk.length) chunk = file.rawRead(chunk);
|       return chunk[i..$];
|     };
| }

The readFrom method should look like this:

| // read data from buffer until a whitespace is found and put it in
| // string s
| void readFrom(ref string s, const(char)[] delegate(size_t) buffer,
|               in char[] format = null)
| {
|   s = "";
|   int r = 0; // number of characters read
|   auto buf = buffer(0); // ask for some data to read.
|   // readFrom can throw a ReadException:
|   if (!buf.length) { throw new ReadException(); }
|
|   while (!(buf[r] == ' ' || buf[r] == '\t' || buf[r] == '\n'))
|     {
|       ++r;
|       if (r == buf.length)
|         {
|           s ~= buf;
|           buf = buffer(r); // discard what was read and ask for more
|           r = 0;
|           if (!buf.length) // end of stream.
|             return;
|         }
|     }
|   s ~= buf[0..r];
|   buffer(r); // do not forget to tell the stream how much you read
|              // from it
|   return;
| }

But implementation of readFrom will be made easier by the following functions:

void read(T...)(const(char)[] delegate(size_t), ref T);
void readf(T...)(const(char)[] delegate(size_t), in char[] format, ref T);

Example:

| struct Point
| {
|   int[3] data;
|
|   void readFrom(const(char)[] delegate(size_t) stream,
|                 in char[] format = null)
|   {
|     readf(stream, "[%s, %s, %s]", data[0], data[1], data[2]);
|   }
| }

Note: One could make a similar signature for writeTo to be more consistent. I have no idea whether this would be more efficient than the currently proposed writeTo.

void writeTo(char[] delegate(size_t) stream, in char[] format = null);

Note: I replaced "string format=null" with "in char[] format = null" to be consistent with the current stdio.readf.

What are your thoughts about this?

-- Christophe Travert
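The peek-and-discard protocol described in this post can be tried out with a small self-contained sketch. The names Stream, stringStream and readWord are made up for illustration and are not part of Phobos or of any proposal; the only assumption is the contract stated above (stream(n) discards n characters and returns what is left, empty meaning end of input).

```d
alias Stream = const(char)[] delegate(size_t);

// Stream over an in-memory string: discard n characters, return the rest.
Stream stringStream(string str)
{
    return delegate const(char)[] (size_t n) { str = str[n .. $]; return str; };
}

// Read one whitespace-delimited word and consume exactly the characters used.
string readWord(Stream stream)
{
    char[] word;
    auto buf = stream(0);          // peek without discarding anything
    size_t r = 0;                  // characters examined in the current chunk
    while (buf.length)
    {
        if (r == buf.length)
        {
            word ~= buf;           // whole chunk belongs to the word
            buf = stream(r);       // discard it and ask for more
            r = 0;
            continue;
        }
        if (buf[r] == ' ' || buf[r] == '\t' || buf[r] == '\n')
            break;                 // delimiter stays in the stream
        ++r;
    }
    if (buf.length)
    {
        word ~= buf[0 .. r];
        stream(r);                 // tell the stream how much was read
    }
    return word.idup;
}

void main()
{
    auto s = stringStream("hello world");
    assert(readWord(s) == "hello");
    assert(s(0) == " world");      // the delimiter was not consumed
}
```

The reader only ever sees slices owned by the stream, so the string-backed stream above allocates nothing per call; only the word being built is copied.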
Sep 06 2011
next sibling parent reply travert phare.normalesup.org (Christophe) writes:
"Steven Schveighoffer", in message (digitalmars.D:143998), wrote:
 void readFrom(const(char)[] delegate(size_t) stream,
               in char[] format = null);

 -format is the usual format specifier.
 -stream is a delegate that takes a size_t argument, discards as many
 characters from its internal buffer, and returns data to read from.
 The returned data has any length, but must be empty only when the end of
 all the data to be read is reached. stream may overwrite previously
 returned data.

So essentially, stream "peeks" at buffered data, and also discards data you deem "consumed"? Note that with the current stdio package, you can only peek at one character.

Wouldn't your life be easier if you could? :P

Well, I thought there would be some internal buffer in the read functions of stdio, and in scanf (although it is not accessible). Maybe that's why it is so slow. Anyway, stream is allowed to return a one-character const(char)[], although it might not be optimal at all. If the "stream" comes from stdin, either chars are peeked one by one and no changes are to be made to stdin, or all stdin functions must use the "stream" or at least the same buffer. The same can be said of std.stdio.File, if we want to make all File instances compatible with this way of reading. The readFrom API could be changed to use peek/get delegates instead of stream, but wouldn't that be quite a loss of power?
 | const(char)[] delegate(size_t) myFileStream(File file, size_t size)
 | {
 |   char[] chunk = new char[size];
 |   int i=0;
 |   chunk = file.rawRead(chunk); //  Bug?: file is read in binary mode...
 |   return (size_t n)
 |     {
 |       i += n;
 |       if (i>=chunk.length) chunk = file.rawRead(chunk);
 |       return chunk[i..$];
 |     };
 | }

This doesn't work. What happens to unconsumed data from chunk? You can only put one char back on the stream.

Nothing should be read from the file you put in myFileStream except by the stream itself. Why put characters back, then? Anyway, this is just a (bad) example of what is legal for a stream parameter. You probably don't want to allocate a File instance on the heap like I just did. -- Christophe
Sep 06 2011
parent reply travert phare.normalesup.org (Christophe) writes:
I've had a look at readUntil API, and it's not completely clear. Is the 
delegate supposed to remember what it has read and interpreted so far, 
or does it have to start from scratch each time ? Where could I see an 
implementation of a delegate suitable for readUntil ?

Basically, in both your API and mine, a stream gives more characters
to a readFrom method for as long as it asks for more. What I am
not sure about is whether readFrom is supposed to build the read object,
as in my API, or whether the object is supposed to be built afterwards
from the string returned by readUntil.

I think the main difference is that your API is written from the stream's
point of view, whereas my API is written from the point of view of the
object being read, which will make implementing readFrom easier for
the users, who will not have to worry about their delegate being called
multiple times.

If I have more time, I may look deeper into Phobos stdin and your stdin
proposal, but I'm not sure I can afford that...

In the meantime, I hope I gave you nice ideas to improve your own
proposal. Here are some more...

I will sum up the different ways to deal with buffering, which apply to
either your readUntil API or my proposed API:

1/ _use only peek_
-the API is written to peek only one character at a time. You
definitely lose the possibility for a stream to give a char[] directly
to the parsing function, even for streams that are not files...

2/ _use C for low level stdin_
-the default stream derived from stdin or from a file peeks only one
character at a time. Everything works fine with C functions.
-you can still explicitly create a stream object from a File to do
double buffering and return several characters, but that makes the File
no longer suitable for C functions, since some unread buffer can be
hidden in the object performing the streaming operations.

3/ _hack into C functions_
-the default stream hacks into FILE* to use its own internal
buffer. This may not be easy to implement, but should be feasible for a
system programmer, shouldn't it?

4/ _WTH, D should not rely on C functions to do all low level jobs_
-the default stream peeks several characters. C functions are broken.
-you can still rewrite C-like functions. For example, scanf could be the
same as readf, but would support 0-terminated strings, and be
implemented as a C-style variadic function (avoiding the multiple template
instantiations which make the generated code so big that Walter refuses
to use it).
-if you need to, you can still instantiate a FILE* that will never be seen
by the D library, and that will work fine with C functions.

5/ _variation on 2 and 4_
- File is still compatible with the current Phobos API, and the default
streaming mode for files only peeks one character at a time.
- Some new struct can perform file operations in a D-like way that is
incompatible with C functions. However, no accessible File object is ever
created for this structure, so no one will mix C and D read/write
functions.

#2 makes a first implementation of the API easy. Nothing must get broken
for it to work, everybody is happy and can start implementing readFrom
without breaking any old code (as long as no other changes are made in
stdio's API). #3 can be implemented later, if it is possible, and all
changes will occur at the library level, so it should not break code
(even if explicit streams that break scanf become useless). #4 is a step
forward to make D a language that does not rely on C anymore. That may or
may not be desirable. Some code will have to change, even if my
proposals to replace scanf should temper that. #5 allows you to do
everything you want the D way, while keeping the old File working.

One last point: any comments about using writeTo with my "stream" API 
like readFrom ?

-- 
Christophe Travert
Sep 06 2011
parent travert phare.normalesup.org (Christophe) writes:
"Steven Schveighoffer", in message (digitalmars.D:144156), wrote:
 Where could I see an
 implementation of a delegate suitable for readUntil ?

In the source code for the revamped stdio. Here is a byChunk range which uses it:

I see. Are you not concerned by the fact that, with this API, the input stream has to perform heap allocation when its internal buffer is full, because the delegate could always ask for some more characters? That rules out a reading mechanism that does not allocate anything on the heap.
 One last point: any comments about using writeTo with my "stream" API
 like readFrom ?

I think this is what writeTo (as proposed) already does.

The proposed writeTo is:

writeTo(void delegate(const(char)[]) sink, in char[] format)

Here, the writeTo method writes the characters into its own buffer, then gives it to sink. A writeTo with a "stream API" would be:

writeTo(char[] delegate(size_t) stream, in char[] format)

Here stream provides a buffer, and writeTo has to use this buffer, then tell stream how much of the buffer it used. I'm not sure mine is better, I'm just asking. -- Christophe
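To make the contrast concrete, here is a minimal sketch with both styles side by side on one hypothetical type. Point and FixedBuffer are invented for this illustration, and the stream contract (stream(n) commits n characters and returns the next free buffer) is an assumption taken from the description above, not an existing Phobos API. The library calls used, formattedWrite and sformat, are from std.format.

```d
import std.array : appender;
import std.format : formattedWrite, sformat;

struct Point
{
    int x, y;

    // Sink style: the object produces pieces of text and pushes them out.
    void writeTo(scope void delegate(const(char)[]) sink) const
    {
        formattedWrite(sink, "(%s, %s)", x, y);
    }

    // Stream style: the caller lends a buffer, the object fills it in place.
    void writeTo(char[] delegate(size_t) stream) const
    {
        auto buf = stream(0);                        // borrow the free space
        auto used = sformat(buf, "(%s, %s)", x, y);  // throws if buf is too small
        stream(used.length);                         // report how much was used
    }
}

// Hypothetical fixed-size target for the stream style.
struct FixedBuffer
{
    char[64] storage;
    size_t filled;

    char[] next(size_t used) { filled += used; return storage[filled .. $]; }
    const(char)[] text() const { return storage[0 .. filled]; }
}

void main()
{
    auto p = Point(1, 2);

    auto app = appender!string();
    p.writeTo((const(char)[] s) { app.put(s); });
    assert(app.data == "(1, 2)");

    FixedBuffer fb;
    p.writeTo(&fb.next);
    assert(fb.text == "(1, 2)");
}
```

In the stream style the object has to cope with a buffer that is too small (here sformat simply throws), which is the kind of extra burden the sink style leaves to the sink.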
Sep 08 2011
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/6/11 7:18 AM, Christophe wrote:
 size_t readFrom(const(char)[] data, size_t start); // same as
 readUntil delegate

What happens if the buffer data get exhausted ? The function calling readFrom has no way to know how many characters to put into data to allow the read. What is the point of start ? We could use a delegate to return new characters: void readFrom(const(char)[] delegate(size_t) stream, in char[] format = null);

This won't work for cases such as "parse digits until a non-digit is found, but don't discard that non-digit". Reading is considerably more difficult than writing. I think it's fair to leave it to more sophisticated APIs than one delegate. Andrei
Sep 06 2011
parent travert phare.normalesup.org (Christophe) writes:
Andrei Alexandrescu, in message (digitalmars.D:144012), wrote:
 On 9/6/11 7:18 AM, Christophe wrote:
 size_t readFrom(const(char)[] data, size_t start); // same as
 readUntil delegate

What happens if the buffer data get exhausted ? The function calling readFrom has no way to know how many characters to put into data to allow the read. What is the point of start ? We could use a delegate to return new characters: void readFrom(const(char)[] delegate(size_t) stream, in char[] format = null);

This won't work for cases such as "parse digits until a non-digit is found, but don't discard that non-digit".

It does, since the characters are only discarded at the next call to stream(n), according to the value n. See my answer to Steve. I first considered a slightly more complicated API:

void readFrom(const(char)[] delegate() stream, void delegate(size_t) nread)
{
  auto buf = stream();
  // .. do things and count the number n of characters read
  nread(n);
}

but:

void readFrom(const(char)[] delegate(size_t) stream)
{
  auto buf = stream(0);
  // .. do things and count the number n of characters read
  stream(n);
}

works about as well, and is IMO simpler.
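As a check on the "parse digits, keep the non-digit" case, here is a small sketch under the same assumed contract (stream(n) discards n characters, then returns the remaining data). stringStream and readDigits are illustrative names only, not library functions.

```d
import std.ascii : isDigit;

alias Stream = const(char)[] delegate(size_t);

// Stream over an in-memory string: discard n characters, return the rest.
Stream stringStream(string str)
{
    return delegate const(char)[] (size_t n) { str = str[n .. $]; return str; };
}

// Parse a run of decimal digits; the first non-digit stays in the stream.
uint readDigits(Stream stream)
{
    uint value = 0;
    auto buf = stream(0);      // peek
    size_t n = 0;              // digits consumed from the current chunk
    while (buf.length)
    {
        if (n == buf.length)   // chunk exhausted: discard it, fetch more
        {
            buf = stream(n);
            n = 0;
            continue;
        }
        if (!isDigit(buf[n]))
            break;             // looked at it, but did not count it
        value = value * 10 + (buf[n] - '0');
        ++n;
    }
    stream(n);                 // discard only the digits
    return value;
}

void main()
{
    auto s = stringStream("123abc");
    assert(readDigits(s) == 123);
    assert(s(0) == "abc");     // 'a' is still available to the next reader
}
```

The non-digit is inspected but never included in the count passed back to the stream, so it remains the first character of the next peek.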
 Reading is considerably more difficult than writing. I think it's fair 
 to leave it to more sophisticated APIs than one delegate.

Maybe. -- Christophe Travert
Sep 06 2011
prev sibling next sibling parent Jacob Carlborg <doob me.com> writes:
On 2011-09-04 06:06, Andrei Alexandrescu wrote:
 On 8/30/11 8:59 PM, Paul D. Anderson wrote:
 Can someone clarify for me the status and/or direction of string
 formatting in D?

I agree there are major inefficiencies and composability problems caused by a blind toString() that creates a whole new string without any assistance. So we need to fix that. There are suggestions to add this method to Object: void writeTo(void delegate(const(char)[]) sink, string format = null); Then, the suggestion goes, whether or not we deprecate toString, in the short term it should be implemented in terms of writeTo.

I see no reason to deprecate toString. toString could just call writeTo and do some standard formatting.
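A minimal sketch of that arrangement, assuming the writeTo signature quoted above; the Temperature class is made up for the example and the format argument is ignored for brevity.

```d
import std.array : appender;
import std.format : formattedWrite;

class Temperature
{
    double celsius;
    this(double c) { celsius = c; }

    // Hypothetical streaming method in the spirit of the proposal.
    void writeTo(scope void delegate(const(char)[]) sink, string format = null)
    {
        formattedWrite(sink, "%.1f C", celsius);
    }

    // toString stays available and simply collects what writeTo emits.
    override string toString()
    {
        auto buf = appender!string();
        writeTo((const(char)[] s) { buf.put(s); });
        return buf.data;
    }
}

void main()
{
    assert(new Temperature(21.5).toString() == "21.5 C");
}
```

Callers that only need a string keep using toString, while callers that already have an output sink can skip the intermediate allocation by calling writeTo directly.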
 There are a few questions raised by this proposal:

 1. Okay, this takes care of streaming text. How about streaming in
 binary format?

 2. Since we have a relatively involved "output to text" routine, how
 about an "input from text" routine? If writeTo is there, where is readFrom?


 Andrei

-- /Jacob Carlborg
Sep 04 2011
prev sibling next sibling parent "Marco Leise" <Marco.Leise gmx.de> writes:
On 04.09.2011 at 06:06, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:

 1. Okay, this takes care of streaming text. How about streaming in  
 binary format?

Doesn't that come down to using a serialization API like Orange? - The text format protocols I used all worked with primitive types and have their own structuring syntax (xml, json, proprietary formats) - If I wanted to save an object in binary, I'd need the serialization library to take care of internal pointers as well. Again it needs some higher level logic and introspection to get the whole pointer graph safely into a binary blob. - When working with MPEG-2 data, I could have needed some help to convert from file endian-ness to host endian-ness. I was using Delphi there. What I want to say is that I know these two use cases for writeTo() in binary form: Either it is complex 1:1 serialization of D objects and structs or it is for reading and writing portable file formats that often need data conversion even for primitive types.
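For the second use case, the per-field conversion is small. The sketch below decodes a 32-bit big-endian field with std.bitmanip; readBigEndianU32 is a hypothetical helper written for this illustration, not something from the thread.

```d
import std.bitmanip : bigEndianToNative, nativeToBigEndian;

// Decode a 32-bit big-endian field, as found in many portable file formats.
uint readBigEndianU32(const(ubyte)[] bytes)
{
    ubyte[4] raw = bytes[0 .. 4];   // copy into a fixed-size array
    return bigEndianToNative!uint(raw);
}

void main()
{
    assert(readBigEndianU32([0x00, 0x00, 0x01, 0xB3]) == 0x1B3);
    // Round trip back through the opposite conversion.
    assert(bigEndianToNative!uint(nativeToBigEndian(0x1B3u)) == 0x1B3);
}
```

This only covers primitive fields; laying out whole records or following pointers is where a serialization library would take over.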
 2. Since we have a relatively involved "output to text" routine, how  
 about an "input from text" routine? If writeTo is there, where is  
 readFrom?

Are you thinking of replicating C++ istream >> functionality here that works with friend functions to augment istream with routines to read complex data types?
Sep 04 2011
prev sibling next sibling parent "Robert Jacques" <sandford jhu.edu> writes:
On Sun, 04 Sep 2011 00:06:47 -0400, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:

 On 8/30/11 8:59 PM, Paul D. Anderson wrote:
 Can someone clarify for me the status and/or direction of string
 formatting in D?

I agree there are major inefficiencies and composability problems caused by a blind toString() that creates a whole new string without any assistance. So we need to fix that. There are suggestions to add this method to Object: void writeTo(void delegate(const(char)[]) sink, string format = null); Then, the suggestion goes, whether or not we deprecate toString, in the short term it should be implemented in terms of writeTo. There are a few questions raised by this proposal: 1. Okay, this takes care of streaming text. How about streaming in binary format? 2. Since we have a relatively involved "output to text" routine, how about an "input from text" routine? If writeTo is there, where is readFrom? Andrei

I'd like to point out that parsing a format string for every object/variable is very inefficient. I'd recommend having the virtual writeTo function accept FormatSpec, like the formatValue routines, and then make the writeTo which takes a format string be final. i.e.:

void writeTo(void delegate(const(char)[]) sink, ref FormatSpec!char format);

final void writeTo(void delegate(const(char)[]) sink, string format = null)
{
  auto spec = FormatSpec!char(format);
  writeTo(sink, spec);
}
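One way this split could look in practice is sketched below. The Celsius class is invented for the example, and singleSpec from std.format is used here to parse the string, which is a substitution on my part: it accepts exactly one format specifier such as "%x" and throws otherwise.

```d
import std.array : appender;
import std.format : FormatSpec, formatValue, singleSpec;

class Celsius
{
    int degrees;
    this(int d) { degrees = d; }

    // Virtual overload: works on an already parsed spec.
    void writeTo(scope void delegate(const(char)[]) sink,
                 ref const FormatSpec!char spec)
    {
        formatValue(sink, degrees, spec);
    }

    // Final convenience overload: parse the string once, then forward.
    final void writeTo(scope void delegate(const(char)[]) sink,
                       string format = "%s")
    {
        auto spec = singleSpec(format);
        writeTo(sink, spec);
    }
}

void main()
{
    auto c = new Celsius(255);
    auto buf = appender!string();
    c.writeTo((const(char)[] s) { buf.put(s); }, "%x");
    assert(buf.data == "ff");
}
```

A container printing many elements could parse the spec once and call the virtual overload for each element, which is the saving the post is after.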
Sep 04 2011
prev sibling next sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
On 09/05/2011 04:20 AM, kenji hara wrote:
 2011/9/4 Andrei Alexandrescu<SeeWebsiteForEmail erdani.org>:
 There are suggestions to add this method to Object:

 void writeTo(void delegate(const(char)[]) sink, string format = null);

I think const void toString(scope void delegate(const(char)[]) sink, string format = null); is better than that, even if it is different from DIP9. That is already used in std.bigint and std.complex, and std.format already supports it. Kenji Hara

Yes, but imho the function name does not document really well what it does.
Sep 04 2011
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/4/11 22:20 EDT, kenji hara wrote:
 2011/9/4 Andrei Alexandrescu<SeeWebsiteForEmail erdani.org>:
 There are suggestions to add this method to Object:

 void writeTo(void delegate(const(char)[]) sink, string format = null);

I think const void toString(scope void delegate(const(char)[]) sink, string format = null); is better than that, even if it is different from DIP9. That is already used in std.bigint and std.complex, and std.format already supports it. Kenji Hara

Works for me. Walter? Andrei
Sep 04 2011
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 9/4/2011 7:34 PM, Andrei Alexandrescu wrote:
 On 9/4/11 22:20 EDT, kenji hara wrote:
 2011/9/4 Andrei Alexandrescu<SeeWebsiteForEmail erdani.org>:
 There are suggestions to add this method to Object:

 void writeTo(void delegate(const(char)[]) sink, string format = null);

I think const void toString(scope void delegate(const(char)[]) sink, string format = null); is better than that, even if it is different from DIP9. That is already used in std.bigint and std.complex, and std.format already supports it. Kenji Hara

Works for me. Walter?

It'll break every D program.
Sep 04 2011
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/5/11 12:22 AM, Walter Bright wrote:
 On 9/4/2011 7:34 PM, Andrei Alexandrescu wrote:
 On 9/4/11 22:20 EDT, kenji hara wrote:
 2011/9/4 Andrei Alexandrescu<SeeWebsiteForEmail erdani.org>:
 There are suggestions to add this method to Object:

 void writeTo(void delegate(const(char)[]) sink, string format = null);

I think const void toString(scope void delegate(const(char)[]) sink, string format = null); is better than that, even if it is different from DIP9. That is already used in std.bigint and std.complex, and std.format already supports it. Kenji Hara

Works for me. Walter?

It'll break every D program.

Probably you and I have a different thing in mind. I'm thinking of adding that alongside the existing toString. Thinking more about it, I fear that ascribing the two overloads the same name will cause problems when e.g. a class overrides one overload thus hiding the other. So we should look for a different name. Which? Andrei
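The hiding concern can be illustrated with a small sketch. Base and Keeps are hypothetical classes, and describe stands in for toString so the example does not depend on Object. In D, a derived class that declares a function named describe hides the inherited overloads of that name unless they are re-introduced with an alias; depending on the compiler version, leaving the alias out is either rejected outright or makes the other overload uncallable through the derived type.

```d
class Base
{
    string describe() { return "Base"; }

    // Sink overload, implemented in terms of the string overload.
    void describe(scope void delegate(const(char)[]) sink) { sink(describe()); }
}

class Keeps : Base
{
    // Re-expose the inherited overload set; without this line the sink
    // overload would be hidden by the override below.
    alias describe = Base.describe;

    override string describe() { return "Keeps"; }
}

void main()
{
    auto k = new Keeps;
    string got;
    k.describe((const(char)[] s) { got = s.idup; });
    assert(got == "Keeps");
}
```

Requiring that alias in every class that customizes only one of the two forms is the cost of sharing a name, which is one argument for giving the sink-based method a name of its own.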
Sep 05 2011
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 09/05/2011 03:33 PM, Andrei Alexandrescu wrote:
 On 9/5/11 12:22 AM, Walter Bright wrote:
 On 9/4/2011 7:34 PM, Andrei Alexandrescu wrote:
 On 9/4/11 22:20 EDT, kenji hara wrote:
 2011/9/4 Andrei Alexandrescu<SeeWebsiteForEmail erdani.org>:
 There are suggestions to add this method to Object:

 void writeTo(void delegate(const(char)[]) sink, string format = null);

I think const void toString(scope void delegate(const(char)[]) sink, string format = null); is better than that, even if it is different from DIP9. That is already used in std.bigint and std.complex, and std.format already supports it. Kenji Hara

Works for me. Walter?

It'll break every D program.

Probably you and I have a different thing in mind. I'm thinking of adding that alongside the existing toString. Thinking more about it, I fear that ascribing the two overloads the same name will cause problems when e.g. a class overrides one overload thus hiding the other. So we should look for a different name. Which?

I think writeTo is a suitable name.
Sep 05 2011
prev sibling next sibling parent David Nadlinger <see klickverbot.at> writes:
On 9/5/11 4:20 AM, kenji hara wrote:
 I think const void toString(scope void delegate(const(char)[]) sink,
 string format = null); is better than that, even if it is
 different from DIP9.
 That is already used in std.bigint, std.complex, and std.format
 already support it.

Requiring only a scoped delegate certainly makes sense, but I'm not too sure about the name – to me, toString() suggests a function returning a string, not void. David
Sep 04 2011
prev sibling next sibling parent reply Sean Kelly <sean invisibleduck.org> writes:
On Sep 3, 2011, at 9:06 PM, Andrei Alexandrescu wrote:

 On 8/30/11 8:59 PM, Paul D. Anderson wrote:
 Can someone clarify for me the status and/or direction of string
 formatting in D?

 I agree there are major inefficiencies and composability problems
 caused by a blind toString() that creates a whole new string without any
 assistance. So we need to fix that.

 There are suggestions to add this method to Object:

 void writeTo(void delegate(const(char)[]) sink, string format = null);

 Then, the suggestion goes, whether or not we deprecate toString, in
 the short term it should be implemented in terms of writeTo.

 There are a few questions raised by this proposal:

 1. Okay, this takes care of streaming text. How about streaming in
 binary format?

 2. Since we have a relatively involved "output to text" routine, how
 about an "input from text" routine? If writeTo is there, where is
 readFrom?

Right. Which is why I've suggested in the past that we may want to use the serialization calls for toString.
Sep 05 2011
parent reply Jacob Carlborg <doob me.com> writes:
On 2011-09-05 20:37, Marco Leise wrote:
 On 05.09.2011 at 19:51, Sean Kelly <sean invisibleduck.org> wrote:
 Right. Which is why I've suggested in the past that we may want to use
 the serialization calls for toString.

I'm highly skeptical to say the least :). I know there are languages that serialize solely through text representations of the data, like JavaScript, but I've yet to see this mix work in a systems language. What serialization calls do you refer to? - Marco

If we ever get a serialization package in Phobos, Orange for example: http://www.dsource.org/projects/orange -- /Jacob Carlborg
Sep 06 2011
parent Jacob Carlborg <doob me.com> writes:
On 2011-09-06 18:15, Marco Leise wrote:
 Ok I get the picture, but the details are vague.

 - How are pointers printed? As a hex value or as the data they point to
 (flat toString vs. deep toString). A serialization API typically follows
 class references and pointers.

If the pointer points to a value that has been or will later be serialized as well, it will just print it as a reference. If the pointed-to value is not serialized, it will print the pointed-to data.
 - What do you do with classes that in Java don't inherit the
 Serializable interface. Thread.toString() for example should - in my
 eyes - print the thread id or pointer and the thread name if available,
 maybe also the thread group.

In my Orange "Serializable" isn't needed. It will try to serialize everything unless otherwise told, i.e. there's a NonSerialized mixin. But for Thread.toString() you would most likely not use the serialization library.
 And that's why I keep repeating that toString() is different from
 serialization. It can _assist_ if you know you just want to print all
 members of a struct in their default representation (which is what you
 often want), but not replace it. Maybe that is what Sean meant to say,
 but I wanted to clarify that.

I think that's what Sean is trying to say. -- /Jacob Carlborg
Sep 07 2011
prev sibling next sibling parent "Marco Leise" <Marco.Leise gmx.de> writes:
On 05.09.2011 at 19:51, Sean Kelly <sean invisibleduck.org> wrote:

 On Sep 3, 2011, at 9:06 PM, Andrei Alexandrescu wrote:

 On 8/30/11 8:59 PM, Paul D. Anderson wrote:
 Can someone clarify for me the status and/or direction of string
 formatting in D?

I agree there are major inefficiencies and composability problems caused by a blind toString() that creates a whole new string without any assistance. So we need to fix that. There are suggestions to add this method to Object: void writeTo(void delegate(const(char)[]) sink, string format = null); Then, the suggestion goes, whether or not we deprecate toString, in the short term it should be implemented in terms of writeTo. There are a few questions raised by this proposal: 1. Okay, this takes care of streaming text. How about streaming in binary format? 2. Since we have a relatively involved "output to text" routine, how about an "input from text" routine? If writeTo is there, where is readFrom?

Right. Which is why I've suggested in the past that we may want to use the serialization calls for toString.

I'm highly skeptical to say the least :). I know there are languages that serialize solely through text representations of the data, like JavaScript, but I've yet to see this mix work in a systems language. What serialization calls do you refer to? - Marco
Sep 05 2011
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sun, 04 Sep 2011 13:51:52 -0400, Robert Jacques <sandford jhu.edu>  
wrote:

 On Sun, 04 Sep 2011 00:06:47 -0400, Andrei Alexandrescu  
 <SeeWebsiteForEmail erdani.org> wrote:

 On 8/30/11 8:59 PM, Paul D. Anderson wrote:
 Can someone clarify for me the status and/or direction of string
 formatting in D?

I agree there are major inefficiencies and composability problems caused by a blind toString() that creates a whole new string without any assistance. So we need to fix that. There are suggestions to add this method to Object: void writeTo(void delegate(const(char)[]) sink, string format = null); Then, the suggestion goes, whether or not we deprecate toString, in the short term it should be implemented in terms of writeTo. There are a few questions raised by this proposal: 1. Okay, this takes care of streaming text. How about streaming in binary format? 2. Since we have a relatively involved "output to text" routine, how about an "input from text" routine? If writeTo is there, where is readFrom? Andrei

I'd like to point out that parsing a format string for every object/variable is very inefficient. I'd recommend having the virtual writeTo function accept FormatSpec, like the formatValue routines, and then make the writeTo which takes a format string be final. i.e.:

void writeTo(void delegate(const(char)[]) sink, ref FormatSpec!char format);

final void writeTo(void delegate(const(char)[]) sink, string format = null)
{
  auto spec = FormatSpec!char(format);
  writeTo(sink, spec);
}

Hm... I haven't delved (yet) into the specifics of how std.format works, but it seems like it uses this notion.

One thing became apparent from a reply to Timon early on in this thread. Say you have a struct like this:

struct S
{
  int x, y, z;
  void writeTo(scope Sink s, const(char)[] format = null)
  {
    formattedWrite(s, "(%d,%d,%d)", x, y, z);
  }
}

How would you, say, format the output to be hexadecimal? I'd expect you'd just pass "%x" into writeTo, but formattedWrite would have to be split into 3 calls, unless you wanted to heap-allocate a new string. I suppose you could allocate a stack buffer to hold the whole format string, but it gets a bit tenuous. I don't even know if FormatSpec would fix this.

We may need a more capable formatting facility, which can do loops, or some new way to do the formatting. Actually, can the format string be a range? I guess not, since that would require making writeTo a template. Unless that range is some sort of processor that you always use (like FormatSpec, but able to add extra functionality, like looping).

One issue with your idea is that all derived classes will have to alias in the overload.

-Steve
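On the hexadecimal question, one possible answer is sketched below: parse the spec once and reuse it for every field with formatValue, so no combined format string is ever built. This is only an illustration under the assumption that writeTo receives a FormatSpec as Robert suggested; it does split the output into several calls, as anticipated above.

```d
import std.array : appender;
import std.format : FormatSpec, formatValue, singleSpec;

struct S
{
    int x, y, z;

    // The same parsed spec is applied to each field in turn.
    void writeTo(scope void delegate(const(char)[]) sink,
                 ref const FormatSpec!char spec) const
    {
        sink("(");
        formatValue(sink, x, spec);
        sink(",");
        formatValue(sink, y, spec);
        sink(",");
        formatValue(sink, z, spec);
        sink(")");
    }
}

void main()
{
    auto hex = singleSpec("%x");
    auto buf = appender!string();
    S(10, 255, 16).writeTo((const(char)[] s) { buf.put(s); }, hex);
    assert(buf.data == "(a,ff,10)");
}
```

The punctuation is fixed in code here; letting the caller control separators as well would need the richer, loop-capable format facility mentioned above.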
Sep 06 2011
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 06 Sep 2011 08:18:15 -0400, Christophe  
<travert phare.normalesup.org> wrote:

 size_t readFrom(const(char)[] data, size_t start); // same as
 readUntil delegate

What happens if the buffer data get exhausted ? The function calling readFrom has no way to know how many characters to put into data to allow the read. What is the point of start ?

This is probably clearer if you read the documentation for readUntil: http://erdani.com/d/new-stdio/phobos-prerelease/std_stdio.html#readUntil I don't know if it's a good idea to tie a possible readFrom to an unreleased (and quite frankly, not much liked) proposal, but it was just a thought.
 We could use a delegate to return new characters:

 void readFrom(const(char)[] delegate(size_t) stream,
               in char[] format = null);

 -format is the usual format specifier.
 -stream is a delegate that takes a size_t argument, discards as many
 characters from its internal buffer, and returns data to read from.
 The returned data has any length, but must be empty only when the end of
 all the data to be read is reached. stream may overwrite previously
 returned data.

So essentially, stream "peeks" at buffered data, and also discards data you deem "consumed"? Note that with the current stdio package, you can only peek at one character.
 Examples of suitable delegates for stream:
 | const(char)[] delegate(size_t) myStringStream(string str)
 | {
 |   return (size_t n) { str = str[n..$]; return str; };
 | }

 | const(char)[] delegate(size_t) myFileStream(File file, size_t size)
 | {
 |   char[] chunk = new char[size];
 |   int i=0;
 |   chunk = file.rawRead(chunk); //  Bug?: file is read in binary mode...
 |   return (size_t n)
 |     {
 |       i += n;
 |       if (i>=chunk.length) chunk = file.rawRead(chunk);
 |       return chunk[i..$];
 |     };
 | }

This doesn't work. What happens to unconsumed data from chunk? You can only put one char back on the stream. -Steve
Sep 06 2011
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 06 Sep 2011 10:07:23 -0400, Christophe
<travert phare.normalesup.org> wrote:

 "Steven Schveighoffer", in message (digitalmars.D:143998), wrote:
 void readFrom(const(char)[] delegate(size_t) stream,
               in char[] format = null);

 -format is the usual format specifier.
 -stream is a delegate that takes a size_t argument, discards as many
 characters from its internal buffer, and returns data to read from.
 The returned data has any length, but must be empty only when the end
 of all the data to be read is reached. stream may overwrite previously
 returned data.

So essentially, stream "peeks" at buffered data, and also discards data you deem "consumed"? Note that with the current stdio package, you can only peek at one character.

Wouldn't your life be easier if you could ? :P

I'd love to, which is why I wrote the revamped stdio ;)
 Well, I thought there were be some internal buffer in the read functions
 of stdio, and in scanf (although it is not accessible). Maybe that's why
 it is so slow. Anyway, stream is allowed to return a one-character
 const(char)[], although it might not be optimal at all.

Yes, if you look at the input range given to formattedRead in std.stdio.File, it's a one-char-at-a-time range. It works by calling fgetc, then immediately putting the character back using ungetc.
 If the "stream" comes from stdin, either chars are peeked one by one and
 no changes are to be made to stdin, or all stdin functions must use the
 "stream" or at least the same buffer.
 The same can be said to std.stdio.File, if we want to make all File
 instances compatible with this way of reading.
 The readFrom API could be changed to use peek/get delegate instead of
 stream, but wouldn't that be such a loss of power ?

That means double-buffering. So FILE * will be buffering the data for you, then you will also buffer the data in File so you can have access to it. Plus, that makes File incompatible with C functions (e.g. fscanf), since those functions will be unaware of your "unconsumed" buffer.
 | const(char)[] delegate(size_t) myFileStream(File file, size_t size)
 | {
 |   char[] chunk = new char[size];
 |   int i=0;
 |   chunk = file.rawRead(chunk); //  Bug?: file is read in binary mode...
 |   return (size_t n)
 |     {
 |       i += n;
 |       if (i>=chunk.length) chunk = file.rawRead(chunk);
 |       return chunk[i..$];
 |     };
 | }

This doesn't work. What happens to unconsumed data from chunk? You can only put one char back on the stream.

 Nothing should be read from file you put in myFileStream if not by the
 stream itself. Why putting characters back then ?

rawRead removes the characters from the stream. No API exists that allows you to peek at more than one character for FILE *. This is part of the reason why I've been working on a new stdio -- there is no good direct buffer access. -Steve
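The one-character limit discussed here comes straight from the C API: a character can be looked at by reading it with fgetc and pushing it back with ungetc, and standard C only guarantees one character of pushback. A minimal sketch follows; peekChar is a hypothetical helper and tmpfile is used only to get a scratch stream.

```d
import core.stdc.stdio : EOF, FILE, fclose, fgetc, fputs, rewind, tmpfile, ungetc;

// Look at the next character of a C stream without consuming it.
// Returns EOF at end of input.
int peekChar(FILE* fp)
{
    int c = fgetc(fp);
    if (c != EOF)
        ungetc(c, fp);   // only a single pushed-back character is guaranteed
    return c;
}

void main()
{
    FILE* fp = tmpfile();
    assert(fp !is null);
    fputs("ab", fp);
    rewind(fp);

    assert(peekChar(fp) == 'a');
    assert(peekChar(fp) == 'a');   // peeking twice does not advance
    assert(fgetc(fp) == 'a');      // a real read consumes it
    assert(peekChar(fp) == 'b');
    fclose(fp);
}
```

Anything that wants to look further ahead than one character has to keep its own buffer, which is exactly the double-buffering problem described above.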
Sep 06 2011
prev sibling parent "Marco Leise" <Marco.Leise gmx.de> writes:
On 06.09.2011 at 11:12, Jacob Carlborg <doob me.com> wrote:

 On 2011-09-05 20:37, Marco Leise wrote:
 On 05.09.2011 at 19:51, Sean Kelly <sean invisibleduck.org> wrote:
 Right. Which is why I've suggested in the past that we may want to use
 the serialization calls for toString.

I'm highly skeptical to say the least :). I know there are languages that serialize solely through text representations of the data, like JavaScript, but I've yet to see this mix work in a systems language. What serialization calls do you refer to? - Marco

If we ever get a serialization package in Phobos, Orange for example: http://www.dsource.org/projects/orange

Ok, I get the picture, but the details are vague.

- How are pointers printed? As a hex value or as the data they point to (flat toString vs. deep toString)? A serialization API typically follows class references and pointers.
- What do you do with classes that in Java don't implement the Serializable interface? Thread.toString(), for example, should - in my eyes - print the thread id or pointer and the thread name if available, maybe also the thread group.

And that's why I keep repeating that toString() is different from serialization. It can _assist_ if you know you just want to print all members of a struct in their default representation (which is what you often want), but not replace it. Maybe that is what Sean meant to say, but I wanted to clarify that.
Sep 06 2011
prev sibling next sibling parent "Simen Kjaeraas" <simen.kjaras gmail.com> writes:
On Fri, 02 Sep 2011 19:46:28 +0200, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:

 I think the library might be able to, in the future, deal with templated  
 writeTo, but there are many things that would need changing.

This would require compiler magic, not just library features. Templates need to be instantiated, which the library can only do explicitly. -- Simen
Sep 04 2011
prev sibling next sibling parent kenji hara <k.hara.pg gmail.com> writes:
2011/9/4 Andrei Alexandrescu <SeeWebsiteForEmail erdani.org>:
 There are suggestions to add this method to Object:

 void writeTo(void delegate(const(char)[]) sink, string format = null);

I think

const void toString(scope void delegate(const(char)[]) sink, string format = null);

is better than that, even if it is different from DIP9. It is already used in std.bigint and std.complex, and std.format already supports it.

Kenji Hara
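To make the shape of the sink-taking signature concrete, here is a minimal sketch. It assumes a present-day Phobos rather than the 2011 library, and the Fraction type with its fields is hypothetical, introduced only for illustration. The idea is that toString never builds a string itself; it pushes pieces into whatever sink the caller supplies.

```d
import std.conv : to;
import std.format : format, formattedWrite;

// Hypothetical example type, not part of Phobos.
struct Fraction
{
    long num, den;

    // Sink-taking toString: output is written piecewise into sink.
    void toString(scope void delegate(const(char)[]) sink) const
    {
        formattedWrite(sink, "%d/%d", num, den);
    }
}

void main()
{
    auto f = Fraction(3, 4);

    // std.format and std.conv both pick up the sink overload.
    assert(format("%s", f) == "3/4");
    assert(f.to!string == "3/4");

    // The same method can feed any sink, here one that only counts characters.
    size_t count;
    f.toString((const(char)[] chunk) { count += chunk.length; });
    assert(count == 3);
}
```

The counting sink shows the design point of the proposal: the caller decides whether the output is stored, printed, or merely measured.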
Sep 04 2011
prev sibling next sibling parent reply kenji hara <k.hara.pg gmail.com> writes:
I have already posted some pull requests around formatting.

#126    Improve std.format.formatValue (-> already merged)
#230    Issue 6448 - writef("%05d", BigInt) problem (-> already merged)
#231    Issue 6595 - std.string.format() and sformat() are obsolete
#235    Change toString signature taking sink
#236    to!SomeString should use formatValue

After they are merged, we can use const void toString(scope void delegate(const(char)[]) sink, ...) everywhere.

If you need custom formatting with class/struct, you can define
toString taking sink.

import std.format, std.range;

class UserClass {
    string name;
    double value;

    // constructor, needed to write new UserClass("name", 1.0)
    this(string name, double value) {
        this.name = name;
        this.value = value;
    }

    // taking sink and formatStr version
    const void toString(scope void delegate(const(char)[]) sink, string formatStr) {
        formattedWrite(sink, "{%s %s}", name, value);
    }
    // taking sink and FormatSpec!char version, more efficient than above
    const void toString(scope void delegate(const(char)[]) sink, FormatSpec!char f) {
        std.range.put(sink, '{');
        std.range.put(sink, name);
        std.range.put(sink, ' ');
        // pass the spec through to the 'value' field, so %s, %g, %a ... are supported
        formatValue(sink, value, f);
        std.range.put(sink, '}');
    }
}

And if you need heap-allocated formatting, you can write the following:

import std.conv, std.string;

auto obj = new UserClass("name", 1.0);
// both use Appender + formatValue internally; %s prints the double 1.0 as "1"
assert(std.conv.to!string(obj) == "{name 1}");
assert(std.string.format("%s", obj) == "{name 1}");

And if you really need to format into a stack-allocated buffer:

char[20] buf;
// When buf.length is insufficient, FormatError is thrown
char[] result = std.string.sformat(buf[], "%s", obj);
assert(result == "{name 1}");
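For readers trying this against a current Phobos, the following is a hedged sketch rather than the code from the pull requests. It assumes that sformat is importable from std.format, and the Reading struct is a made-up stand-in for UserClass. It shows the FormatSpec overload forwarding the caller's spec to a field, and formatting into a stack buffer.

```d
import std.format : FormatSpec, format, formatValue, sformat;

// Hypothetical stand-in for UserClass.
struct Reading
{
    string name;
    double value;

    // The caller's format spec is passed through to the 'value' field.
    void toString(scope void delegate(const(char)[]) sink,
                  FormatSpec!char spec) const
    {
        sink("{");
        sink(name);
        sink(" ");
        formatValue(sink, value, spec);
        sink("}");
    }
}

void main()
{
    auto r = Reading("name", 1.5);

    // Heap-allocated results.
    assert(format("%s", r) == "{name 1.5}");
    assert(format("%.2f", r) == "{name 1.50}");

    // Formatting into a caller-provided stack buffer.
    char[32] buf;
    char[] result = sformat(buf[], "%s", r);
    assert(result == "{name 1.5}");
}
```

Because the spec is forwarded, "%.2f" applied to the whole object ends up controlling how the double inside it is printed.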

Kenji Hara
Sep 04 2011
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 09/05/2011 04:30 AM, kenji hara wrote:
 I have already posted some pull requests around formatting.

 #126    Improve std.format.formatValue (->  already merged)
 #230    Issue 6448 - writef("%05d", BigInt) problem (->  already merged)
 #231    Issue 6595 - std.string.format() and sformat() are obsolete
 #235    Change toString signature taking sink
 #236    to!SomeString should use formatValue

 After merging them, we can use const void toString(scope void
 delegate(const(char)[]) sink, ...) at all.

Great. Thank you.
 If you need custom formatting with class/struct, you can define
 toString taking sink.

 class UserClass {
      string name;
      double value;

      // taking sink and formatStr version
      const void toString(scope void delegate(const(char)[]) sink,
 string formatStr) {
          formattedWrite(sink, "{%s %s}", name, value)
      }
      // taking sink and FormatSpec!char version, more efficiently than above
      const void toString(scope void delegate(const(char)[]) sink,
 FormatSpec!char f) {
          std.range.put(sink, '{');
          std.range.put(sink, name);
          std.range.put(sink, ' ');
          formatValue(sink, value, f);
              // To through spec to 'value' field formatting, then
 support %s, %g, %a ...
          std.range.put(sink, '}');
      }
 }

 And if you need heapfied formatting, you can write like follows:

 auto obj = new UserClass("name", 1.0);
 assert(std.conv.to!string(obj) == "{name 1.0}");  // used Appender +
 formatValue internally

appender is slower than direct appending unless you are dealing with quite long arrays. I am not sure it is a good fit for this case, because the strings returned are usually quite short.
 assert(std.string.format("%s", obj) == "{name 1.0}");  // ditto

 , and if you really need formatting into stack-allocated buffer:

 char[20] buf;
 char[] result = std.string.sformat(sink[], "%s", obj);  // When
 buf.length is insufficient, FormatError is thrown

I think throwing an Error might be overkill, an Exception should suffice.
 assert(result == "{name, 1.0}");

 Kenji Hara

Sep 04 2011
prev sibling next sibling parent kenji hara <k.hara.pg gmail.com> writes:
2011/9/5 Timon Gehr <timon.gehr gmx.ch>:
 On 09/05/2011 04:30 AM, kenji hara wrote:
 char[20] buf;
 char[] result = std.string.sformat(sink[], "%s", obj);  // When
 buf.length is insufficient, FormatError is thrown

I think throwing an Error might be overkill, an Exception should suffice.

I noticed the same issue, so I'm working on a fix. Please wait a while.

Kenji
Sep 04 2011
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 06 Sep 2011 19:46:35 -0400, Christophe  
<travert phare.normalesup.org> wrote:

 I've had a look at readUntil API, and it's not completely clear. Is the
 delegate supposed to remember what it has read and interpreted so far,
 or does it have to start from scratch each time ?

The start is an index at which new data was added. The deal is, the stream continually appends more data to the array until the delegate is satisfied. The start helps keep some context of "how much of this haven't I seen before?" So depending on how your delegate is implemented, you can avoid reading anything before start if you wish. However, you still have to take into account the data prior to start when returning how much data was processed.

I can see where this scheme has its downsides for parsing that needs to keep state. It might be aggravating or even impossible to do this when your delegate has to exit when not enough data is present. However, the default buffer size is something like 10 pages. So the likelihood that you have to return "get me more data" is pretty low, and even if you do, restarting the parsing would be a rare occurrence.

So I agree, a delegate *callable* by the "readFrom" function would be preferable and easier to deal with than using readUntil.
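The role of start can be sketched with a toy driver. Everything here is an assumption made for illustration: the free function readUntil below is a hypothetical stand-in that works on char data fed from an array of chunks, not the real stream from the revamped stdio, and returning size_t.max is taken to mean "get me more data". The delegate scans only the bytes from start onward, yet reports the processed length counted from the beginning of data.

```d
// Hypothetical driver, not the real stream: appends one chunk at a time to a
// buffer and calls process(data, start), where start marks the new data.
const(char)[] readUntil(string[] chunks,
    scope size_t delegate(const(char)[] data, size_t start) process)
{
    const(char)[] buffer;
    foreach (chunk; chunks)
    {
        immutable start = buffer.length;
        buffer ~= chunk;
        immutable used = process(buffer, start);
        if (used != size_t.max)        // delegate is satisfied
            return buffer[0 .. used];
    }
    // End of input: call once more with no new data (data.length == start).
    immutable atEof = process(buffer, buffer.length);
    return buffer[0 .. (atEof == size_t.max ? buffer.length : atEof)];
}

void main()
{
    size_t scanned;  // how many bytes the delegate actually inspected

    size_t untilNewline(const(char)[] data, size_t start)
    {
        foreach (i; start .. data.length)   // skip what was seen before
        {
            ++scanned;
            if (data[i] == '\n')
                return i + 1;               // counted from the start of data
        }
        // No newline yet: take everything at end of input, else ask for more.
        return data.length == start ? data.length : size_t.max;
    }

    auto line = readUntil(["ab", "cd\ne", "f"], &untilNewline);
    assert(line == "abcd\n");
    assert(scanned == 5);   // each byte was inspected once, none rescanned
}
```

A delegate that ignored start would rescan "ab" on the second call; it would still be correct, only slower.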
 Where could I see an
 implementation of a delegate suitable for readUntil ?

In the source code for the revamped stdio. Here is a byChunk range which uses it: https://github.com/schveiguy/phobos/blob/ceb4ec43057d18d42371128a614e81dbec45a5f6/std/stdio.d#L1665
 Basically, in both your and my API, a stream is giving some more
 characters to a readFrom method, as long as it asks for more. What I am
 not sure is if readFrom is supposed to build the read object like in my
 API , or if it is supposed to be built after with the string returned by
 readUntil.

It should be processed while the delegate is called for checking if readUntil should be stopped. In other words, the data returned by readUntil will be ignored.
 I think the main difference is that your API is written from the stream
 point of view, whereas my API is written from the point of view of the
 object being read, which will make implementation of readFrom easier by
 the users, who will not have to worry about their delegate being called
 multiple time.

 If I have more time, I may look deeper into Phobos stdin and your stdin
 proposal, but I'm not sure I should afford that...
 In the mean time, I hope I gave you nice ideas to improve your own
 proposal. Here are some more...

Yes, I'm thinking readFrom probably instead of being a readUntil delegate itself, should just accept a DInput (or whatever it gets renamed to). Then it has the choice of running the show, or just using readUntil.
 I will sum up the different ways to deal with buffering and any one of
 your API for readUntil, and my proposed API:

 1/ _use only peek_
 -the API is written to peek only one character at a time. You
 definitely lose the possibility for a stream to give a char[] directly
 to the parsing function, even for streams that are not files...

I plan in the next iteration of my revamped stdio to implement a peek function. It's actually pretty simple to implement in terms of readUntil:

const(ubyte)[] peek(size_t nbytes)
{
    const(ubyte)[] retval;
    size_t stopCond(const(ubyte)[] data, size_t start)
    {
        retval = data;
        if(data.length == start)
            return 0; // EOF
        return data.length >= nbytes ? 0 : size_t.max;
    }
    readUntil(&stopCond);
    return retval.length > nbytes ? retval[0..nbytes] : retval;
}
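To see peek run, it can be placed on a small in-memory mock. This is only a sketch under stated assumptions: MockInput, its fields, and its fill helper are hypothetical, and the readUntil loop is a guess at the semantics described in this thread (the delegate returns size_t.max to request more data, otherwise the number of bytes to consume). It is not the implementation from the revamped stdio.

```d
import std.algorithm.comparison : min;
import std.string : representation;

// Hypothetical in-memory stand-in for the proposed input stream.
struct MockInput
{
    const(ubyte)[] source;  // data the "device" has not delivered yet
    const(ubyte)[] buffer;  // delivered but unconsumed data
    size_t chunkSize = 4;   // bytes delivered per simulated read

    // Moves up to chunkSize bytes from source into buffer.
    private void fill()
    {
        immutable n = min(chunkSize, source.length);
        buffer ~= source[0 .. n];
        source = source[n .. $];
    }

    // Assumed contract: keep appending data until process stops returning
    // size_t.max; its result is the number of bytes consumed.
    const(ubyte)[] readUntil(
        scope size_t delegate(const(ubyte)[] data, size_t start) process)
    {
        if (buffer.length == 0)
            fill();
        size_t start = 0;
        for (;;)
        {
            immutable used = process(buffer, start);
            if (used != size_t.max)
            {
                auto result = buffer[0 .. used];
                buffer = buffer[used .. $];
                return result;
            }
            start = buffer.length;  // everything so far has been seen
            fill();                 // at EOF this adds nothing
        }
    }

    // peek as posted above: look at up to nbytes without consuming them.
    const(ubyte)[] peek(size_t nbytes)
    {
        const(ubyte)[] retval;
        size_t stopCond(const(ubyte)[] data, size_t start)
        {
            retval = data;
            if (data.length == start)
                return 0; // EOF
            return data.length >= nbytes ? 0 : size_t.max;
        }
        readUntil(&stopCond);
        return retval.length > nbytes ? retval[0 .. nbytes] : retval;
    }
}

void main()
{
    auto input = MockInput("hello world".representation);
    assert(input.peek(6) == "hello ".representation);
    assert(input.peek(2) == "he".representation);   // nothing was consumed
    assert(input.peek(100).length == 11);           // EOF: whatever exists
}
```

Note that a delegate that keeps returning size_t.max at end of input would loop forever in this mock, so the EOF check in stopCond is essential.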
 2/ _use c for low level stdin_
 -the default stream derived from stdin or from a file peeks only one
 character at a time. Everything works fine with c functions.
 -you can still explicitly create a stream object from a File to make
 double buffering and return several characters, but that makes the File
 no longer suitable for c functions, since some unread buffer can be
 hidden in the object performing the streaming operations.

If you are going this route, I think you're better off to use a rewritten buffering scheme. You've already lost the only reason to use C stdio to begin with -- compatibility with C functions.
 3/ _hack into c functions_
 -the default stream stream hacks into FILE* to use it's own internal
 buffer. This may not be easy to implement, but should be feasible by a
 system programmer, shouldn't it ?

Yes and no. There are issues:

- What if the implementation is opaque?
- What if you run out of buffer?
- What if the implementation is open-source, but uses static functions?

There are also other issues with FILE * not related to this discussion which make it a good idea to avoid.
 4/ _WTH, d should not rely on c functions to do all low level jobs_
 -the default stream peeks several characters. c functions are broken.
 -you can still rewrite c-like functions. For example, scanf could be the
 same as readf, but would support 0-terminated strings, and be
 implemented as a c-style variadic function (avoiding multiple template
 instanciation which make the generated code so big Walter refuses to
 use it).
 -if you need, you can still instanciate a FILE* that will never be seen
 by the d library, and that will work fine with c functions.

 5/ _variation on 2 and 4_
 - File are still compatible with current Phobos API, and the default
 streaming mode for file only peek one caracter at a time.
 - Some new struct can perform file operations in a d-like way that is
 incompatible with c function. However, no accessible File object is ever
 created for this structure, so no one will mix c and d read/write
 function.

This is somewhat what my new strategy is. Except File will seamlessly support both the existing phobos implementation and my new implementation. I'll be outlining how it works once I've settled on the API (and I'll probably have implementation ready too).
 One last point: any comments about using writeTo with my "stream" API
 like readFrom ?

I think this is what writeTo (as proposed) already does. -Steve
Sep 08 2011
prev sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 08 Sep 2011 16:19:28 -0400, Christophe
<travert phare.normalesup.org> wrote:

 "Steven Schveighoffer", in message (digitalmars.D:144156), wrote:
 Where could I see an
 implementation of a delegate suitable for readUntil ?

 In the source code for the revamped stdio. Here is a byChunk range which
 uses it:

 I see. Are you not concerned by the fact that with this API, the input
 stream has to perform heap allocation when its internal buffer is full,
 because the delegate could always ask for some more characters? That
 prevents the possibility to make a reading mechanism that does not
 allocate anything on the stack.

No. The buffer then becomes that much bigger, and less likely to be increased. In other words, the buffer "adjusts" itself to the largest size needed, then becomes stable. This is over the lifetime of the input stream, not just for this parse.
 One last point: any comments about using writeTo with my "stream" API
 like readFrom ?

I think this is what writeTo (as proposed) already does.

 The proposed writeTo is:
   writeTo(void delegate(const(char)[]) sink, in char[] format).
 Here, the writeTo method writes the characters in its own buffer, then
 gives it to sink.

 A writeTo with a "stream API" would be:
   writeTo(char[] delegate(size_t) stream, in char[] format).
 Here stream provides a buffer, and writeTo has to use this buffer, then
 tell how much buffer it used to stream.

 I'm not sure mine is better, I'm just asking.

Oh, ok. I don't know how the performance would differ. It's an interesting proposition, since it has the potential to save on copying data. However, it does require that the stream allocate and maintain a heap-allocated buffer. There are some cases where such buffering is overkill. The writeTo method for a type might have a much better idea of how much data is required (or even an upper limit) and can allocate all the buffer required on the stack.

-Steve
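The two shapes being compared can be put side by side in a small sketch. All names here are hypothetical (Id, writeToSink, writeToStream), the format argument is left out for brevity, and reporting the used length through the return value is an assumption, since the thread does not say how writeTo would tell the stream how much it wrote. The code assumes a current Phobos where sformat is importable from std.format.

```d
import std.format : sformat;

// Hypothetical example type.
struct Id
{
    uint value;

    // Sink style: the type formats into its own stack buffer, sized from its
    // own knowledge of the upper limit, then hands the slice to sink.
    void writeToSink(scope void delegate(const(char)[]) sink) const
    {
        char[16] tmp;   // '#' plus at most 10 digits of a uint
        sink(sformat(tmp[], "#%s", value));
    }

    // Stream style: the stream lends a buffer of the requested size and the
    // type writes into it directly. The used length is returned (assumption).
    size_t writeToStream(scope char[] delegate(size_t) stream) const
    {
        char[] dest = stream(16);
        return sformat(dest, "#%s", value).length;
    }
}

void main()
{
    auto id = Id(42);

    // Sink style: the receiver copies whatever it is given.
    char[] collected;
    id.writeToSink((const(char)[] s) { collected ~= s; });
    assert(collected == "#42");

    // Stream style: the stream must own a buffer it can lend out.
    auto storage = new char[64];
    immutable used = id.writeToStream((size_t n) => storage[0 .. n]);
    assert(storage[0 .. used] == "#42");
}
```

The sketch mirrors the trade-off described above: the stream style avoids one copy but forces the stream to keep a buffer around, while the sink style lets the type size a short-lived stack buffer itself.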
Sep 08 2011